linux:slurm

Differences

This shows you the differences between two versions of the page.

linux:slurm [2023/02/21 10:02] a.boev
linux:slurm [2023/12/20 18:47] (current) – [Squeue tips] admin
Line 1:
 ====== SLURM (Simple Linux Utility for Resource Management) ======
  
 +===== Basic commands =====
  
-To see the details of all the nodes you can use:
 +  - //**sbatch** run_script// is used to submit a job script for later execution. The script will typically contain one or more srun commands to launch parallel tasks.
 +  - //**scancel** job_id// is used to cancel a pending or running job or job step. It can also be used to send an arbitrary signal to all processes associated with a running job or job step. 
 +  - //**squeue**// reports the state of jobs or job steps. It has a wide variety of filtering, sorting, and formatting options. By default, it reports the running jobs in priority order and then the pending jobs in priority order. 
 + 
 + 
 +For more information, see the [[https://slurm.schedmd.com/quickstart.html|Slurm Quick Start User Guide]].
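As a rough illustration of how the three commands fit together, the sketch below writes a minimal job script and submits it. The file name run_script.sh, the job name demo, and the resource values are placeholders chosen for this example, not settings taken from this cluster. The Slurm calls are skipped on machines where sbatch is not installed.

```shell
# Write a minimal job script (run_script.sh is a hypothetical name).
cat > run_script.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=demo
#SBATCH --ntasks=1
#SBATCH --time=00:05:00
srun hostname
EOF

# Submit only when Slurm is available on this machine.
if command -v sbatch >/dev/null 2>&1; then
    sbatch run_script.sh      # prints "Submitted batch job <job_id>"
    squeue -u "$USER"         # list only your own jobs
    # scancel <job_id>        # cancel using the ID printed by sbatch
fi
```

The scancel line is left commented out because the job ID is only known after sbatch has printed it.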
 + 
 +===== Squeue tips ===== 
 + 
 +=== How to see the path to a job folder ===
 +Add this alias to your ~/.bashrc:
 +<code> 
 +alias qp="squeue -o '%o' | awk -F / '{\$(NF--)} {gsub(\" \",FS)};  \$0=\"cd \"\$0 '" 
 +</code> 
 + 
 + 
 + 
 +Output 
 + 
 +<code> 
 +cd /home/a.dembitskiy/project_template/NaGPO4F/phonons_minimum_121 
 +cd /home/a.boev/vasp/surseg_tem//polaron_seg//LCO.104.7.is.Ti.o_coord.1ULC_g 
 +</code> 
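The same idea can be sketched as a small shell function instead of a quoted alias, which avoids the nested escaping. This is an alternative under stated assumptions, not the alias used on this wiki: the function name job_dirs is hypothetical, and it assumes each input line is a plain script path without arguments, as printed by squeue with the %o format.

```shell
# Turn job script paths (one per line on stdin) into ready-to-paste
# "cd" commands by stripping the last path component.
job_dirs() {
    while IFS= read -r cmd; do
        # Skip empty lines.
        [ -n "$cmd" ] || continue
        printf 'cd %s\n' "$(dirname "$cmd")"
    done
}

# Usage on a cluster (-h suppresses the squeue header line):
#   squeue -h -o '%o' | job_dirs
```

Passing -h to squeue keeps the header out of the input, so no stray cd line is produced for it.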
 + 
 +=== How to see the details of all the nodes you can use ===
 <code>
 scontrol show node
 </code>
  
-To see jobs' info including number of nodes and cores you can use the format mark %C, for instance:
 +Output
 + 
 +<code> 
 +NodeName=node-amg01 Arch=x86_64 CoresPerSocket=8  
 +   CPUAlloc=4 CPUTot=16 CPULoad=9.63 
 +   AvailableFeatures=CEST,sm,e5-2630,haswell,hdd 
 +   ActiveFeatures=CEST,sm,e5-2630,haswell,hdd 
 +   Gres=(null) 
 +   NodeAddr=node-amg01 NodeHostName=node-amg01 Version=18.08 
 +   OS=Linux 3.10.0-862.14.4.el7.x86_64 #1 SMP Fri Sep 21 09:07:21 UTC 2018  
 +   RealMemory=122880 AllocMem=8192 FreeMem=119484 Sockets=2 Boards=1 
 +   State=MIXED ThreadsPerCore=1 TmpDisk=0 Weight=50 Owner=N/A MCS_label=N/
 +   Partitions=AMG,AMG-medium,AMG-long,AMG-short  
 +   BootTime=2019-10-08T13:00:43 SlurmdStartTime=2021-02-15T16:33:51 
 +   CfgTRES=cpu=16,mem=120G,billing=16 
 +   AllocTRES=cpu=4,mem=8G 
 +   CapWatts=n/
 +   CurrentWatts=0 LowestJoules=0 ConsumedJoules=0 
 +   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/
 +</code> 
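The full node listing is long, so it can help to condense it to one line per node. The sketch below is one possible way to do that, assuming the key=value layout shown in the output above. The function name node_cpu_summary is hypothetical, and only the NodeName, CPUAlloc, and CPUTot keys are read.

```shell
# Print "<node> <allocated>/<total>" CPUs for each node found in
# `scontrol show node` text read from stdin.
node_cpu_summary() {
    awk '
        # Print the record collected so far, if any.
        function emit() {
            if (name != "") printf "%s %s/%s\n", name, alloc, tot
        }
        {
            for (i = 1; i <= NF; i++) {
                n = index($i, "=")
                if (n == 0) continue          # token is not key=value
                key = substr($i, 1, n - 1)
                val = substr($i, n + 1)
                if (key == "NodeName") { emit(); name = val; alloc = ""; tot = "" }
                else if (key == "CPUAlloc") alloc = val
                else if (key == "CPUTot")   tot = val
            }
        }
        END { emit() }
    '
}

# Usage on a cluster:
#   scontrol show node | node_cpu_summary
```

A new record starts at each NodeName key, so the function does not depend on blank lines between nodes.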
 + 
 +=== How to see job info, including the number of nodes and cores ===
 + 
 +Add the %D (node count) and %C (CPU count) format specifiers to the squeue output format, for instance:
 <code>
 squeue -o"%.7i %.9P %.8j %.8u %.2t %.10M %.6D %C"
 </code>
 +
 +Output
 +
 <code>
   JOBID PARTITION     NAME     USER ST       TIME  NODES CPUS
linux/slurm.1676962947.txt.gz · Last modified: 2023/02/21 10:02 by a.boev
