===== Scheduling a Job using the SLURM job scheduler =====

{{ ::introtoslurm.pdf | Intro to SLURM slides }}

The Computer Science Department uses a "job scheduler" called [[https://en.wikipedia.org/wiki/Slurm_Workload_Manager|SLURM]]. The purpose of a job scheduler is to allocate computational resources (servers) to users who submit "jobs" to a queue. The job scheduler looks at the requirements stated in the job's script and allocates the job a server (or servers) matching those requirements. For example, if the job script specifies that a job needs 192GB of memory, the job scheduler will find a server with at least that much memory free.

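As a minimal illustration (the resource values and the program name below are placeholders, not site defaults), such a requirement is stated with ''%%#SBATCH%%'' directives at the top of the job script:

<code bash>
#!/bin/bash
#SBATCH --mem=192G         # request 192GB of memory on the allocated node
#SBATCH --time=01:00:00    # request a one hour time limit

./my_program               # placeholder for the actual program to run
</code>
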
==== Using SLURM ====

**[[https://slurm.schedmd.com/pdfs/summary.pdf|Slurm Commands Cheat Sheet]]**

==== Information Gathering ====

To view information about compute nodes in the SLURM system, use the command ''%%sinfo%%''.

<code>
[pgh5a@portal04 ~]$ sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
main*        up   infinite      3  drain falcon[3-5]
main*        up   infinite     27   idle cortado[01-10],falcon[1-2,6-10],lynx[08-12],slurm[1-5]
gpu          up   infinite      4    mix ai[02-03,05],lynx07
gpu          up   infinite      1  alloc ai04
gpu          up   infinite     12   idle ai[01,06],lynx[01-06],ristretto[01-04]
</code>

With ''%%sinfo%%'' we can see a listing of the job queues, or "partitions", and the nodes associated with each partition. A partition is a grouping of nodes; for example, our //main// partition is a group of all general purpose nodes, and the //gpu// partition is a group of nodes that each contain GPUs. A node can be listed in two or more partitions.

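To narrow the view, ''%%sinfo%%'' also accepts standard filtering options; for example (output omitted, and will vary with the current cluster state):

<code>
sinfo -p gpu     # list only the nodes in the gpu partition
sinfo -N -l      # one line per node, with CPU and memory details
</code>
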
To view jobs running on the queue, we can use the command ''%%squeue%%''. Say we have submitted one job to the main partition; running ''%%squeue%%'' will then look like this:

<code>
pgh5a@portal01 ~ $ squeue
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
            467039      main    my_job    pgh5a  R      0:06      1 artemis1
</code>

Now that a node has been allocated, that node (''%%artemis1%%'') shows as //alloc// in ''%%sinfo%%'':

<code>
pgh5a@portal01 ~ $ sinfo
PARTITION     AVAIL  TIMELIMIT  NODES  STATE NODELIST
main*            up   infinite     37   idle hermes[1-4],artemis[2-7],slurm[1-5],nibbler[1-4],trillian[1-3],granger[1-6],granger[7-8],ai0[1-6]
main*            up   infinite      1  alloc artemis1
qdata            up   infinite      8   idle qdata[1-8]
qdata-preempt    up   infinite      8   idle qdata[1-8]
falcon           up   infinite     10   idle falcon[1-10]
intel            up   infinite     24   idle artemis7,slurm[1-5],granger[1-6],granger[7-8],nibbler[1-4],ai0[1-6]
amd              up   infinite     13   idle hermes[1-4],artemis[1-6],trillian[1-3]
</code>

==== Jobs ====

To use SLURM resources, you must submit your jobs (program/script/etc.) to the SLURM controller. The controller sends your job to a compute node (or nodes) for execution and returns the results when the job finishes.

Users can submit SLURM jobs from ''%%portal.cs.virginia.edu%%''. From a shell, you can submit jobs using the commands [[https://slurm.schedmd.com/srun.html|srun]] or [[https://slurm.schedmd.com/sbatch.html|sbatch]]. Let's look at a very simple example script and ''%%sbatch%%'' command.

Here is our script; all it does is print the hostname of the server running it. We add ''%%#SBATCH%%'' directives to the script to set the various SLURM options.

<code bash>
#!/bin/bash

#SBATCH --job-name="Slurm Simple Test Job"      # Name of the job as it appears in squeue
#
#SBATCH --mail-type=ALL                         # Email on all job state changes (begin, end, fail, ...)
#SBATCH --mail-user=pgh5a@virginia.edu          # Address to send job notifications to
#
#SBATCH --error="my_job.err"                    # Where to write stderr
#SBATCH --output="my_job.output"                # Where to write stdout
#SBATCH --nodelist=slurm1                       # Request a specific node (slurm1)

hostname
</code>

Let's put this in a directory called ''%%slurm-test%%'' in our home directory. We run the script with ''%%sbatch%%'', and the results will be written to the file we specified with ''%%--output%%''. If no output file is specified, output is saved to a file named after the job ID (''%%slurm-<jobid>.out%%'').

<code>
pgh5a@portal01 ~ $ cd slurm-test/
pgh5a@portal01 ~/slurm-test $ chmod +x test.sh
pgh5a@portal01 ~/slurm-test $ sbatch test.sh
Submitted batch job 466977
pgh5a@portal01 ~/slurm-test $ ls
my_job.err  my_job.output  test.sh
pgh5a@portal01 ~/slurm-test $ cat my_job.output
slurm1
</code>
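
If you want a separate output file per job without editing the script each time, SLURM filename patterns can be used in ''%%--output%%'' and ''%%--error%%''; a short sketch (file names are arbitrary):

<code bash>
#SBATCH --output="my_job-%j.output"   # %j is replaced with the job ID, e.g. my_job-466977.output
#SBATCH --error="my_job-%j.err"
</code>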

Here is a similar example using ''%%srun%%'' to run on multiple nodes:

<code>
pgh5a@portal01 ~ $ srun -w slurm[1-5] -N5 hostname
slurm4
slurm1
slurm2
slurm3
slurm5
</code>
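
The same multi-node request can be made from a batch script by asking for several nodes and launching the command with ''%%srun%%'' inside the script. A minimal sketch (node and task counts are only an example):

<code bash>
#!/bin/bash
#SBATCH --job-name="multi-node-test"
#SBATCH --nodes=5                # request five nodes
#SBATCH --ntasks-per-node=1      # run one task on each node

srun hostname                    # srun launches the command once per task
</code>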

==== Terminating Jobs ====

Please keep track of the jobs you start and make sure that they finish executing. If your job does not exit gracefully, it will continue running on the server, taking up resources and preventing others from running their jobs.

To cancel a running job, use the ''%%scancel [jobid]%%'' command:

<code>
abc1de@portal01 ~ $ squeue
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
            467039      main    sleep    abc1de  R       0:06      1 artemis1           <--  Running job
abc1de@portal01 ~ $ scancel 467039
</code>
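
''%%scancel%%'' can also cancel jobs in bulk; for example, to cancel all of your own jobs (substitute your own user ID):

<code>
scancel -u abc1de
</code>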

==== Partitions ====

Slurm refers to job queues as //partitions//. Each partition can have its own constraints, such as the compute nodes it contains, a maximum runtime, and resource limits. There is a ''%%main%%'' partition, which makes use of all non-GPU compute nodes. To see a partition's configured limits, see the sketch below.

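A quick way to inspect those limits is ''%%scontrol%%'' (shown here for //main//; the fields in the output depend on the site configuration):

<code>
scontrol show partition main
</code>
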
The partition is selected with ''%%-p partname%%'' or ''%%--partition partname%%''.

To specify a partition in an ''%%sbatch%%'' script:

<code>
#SBATCH --partition=gpu
</code>

Or from the command line with ''%%srun%%'':

<code>
-p gpu
</code>

An example running the command ''%%hostname%%'' on the //main// partition; this will run on any node in the partition:

<code>
srun -p main hostname
</code>

==== GPUs ====

Slurm handles GPUs and other non-CPU computing resources using what are called Generic RESources ([[https://slurm.schedmd.com/gres.html|GRES]]). To use the GPU(s) on a system through Slurm, with either ''%%sbatch%%'' or ''%%srun%%'', you must request them with the ''%%--gres%%'' option: ''%%--gres=%%'' followed by the resource type (here ''%%gpu%%''), a ''%%:%%'', and the quantity of resources.

Say we want to use 4 GPUs on a system; we would use the following ''%%sbatch%%'' option:

<code>
#SBATCH --gres=gpu:4
</code>

Or from the command line:

<code>
--gres=gpu:4
</code>
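
Putting the pieces together, a minimal GPU job script might look like the following sketch (the job name and the use of ''%%nvidia-smi%%'' as the payload are only for illustration):

<code bash>
#!/bin/bash
#SBATCH --job-name="gpu-test"
#SBATCH --partition=gpu          # run on a node in the gpu partition
#SBATCH --gres=gpu:4             # request four GPUs on that node
#SBATCH --output="gpu_test.out"

nvidia-smi                       # print information about the GPUs visible to the job
</code>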

==== Direct login to servers (without a job script) ====

You can use ''%%srun%%'' to log in directly to a server controlled by the SLURM job scheduler. This can be useful for debugging as well as for running your applications without a job script. Directly logging in also reserves the node for your exclusive use.

To spawn a shell, we must pass the ''%%--pty%%'' option to ''%%srun%%'' so output is directed to a pseudo-terminal:

<code>
abc1de@portal ~$ srun -w cortado04 --pty bash -i -l -
abc1de@cortado04 ~$ hostname
cortado04
abc1de@cortado04 ~$
</code>

The ''%%-w%%'' argument selects the server to log in to. The ''%%-i%%'' argument tells ''%%bash%%'' to run as an interactive shell, and the ''%%-l%%'' argument makes it a login shell; these, together with the final ''%%-%%'', are important for resetting environment variables that could otherwise cause issues with [[linux_environment_modules|Environment Modules]].

If a node is in a partition other than the default "main" partition (for example, the "gpu" partition), then you must specify the partition in your command, for example:

<code>
pgh5a@portal ~$ srun -w lynx05 -p gpu --pty bash -i -l -
</code>

==== Reservations ====

Reservations for specific resources or nodes can be made by submitting a request to <cshelpdesk@virginia.edu>. For more information about using reservations, see the main article on [[compute_slurm_reservations|SLURM Reservations]].
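
Once a reservation has been granted, jobs can be submitted against it with the ''%%--reservation%%'' option (the reservation name below is just a placeholder):

<code bash>
#SBATCH --reservation=my_reservation
</code>

The same ''%%--reservation=my_reservation%%'' option can also be passed on the ''%%srun%%'' or ''%%sbatch%%'' command line.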

==== Note on Modules in Slurm ====

Because ''%%sbatch%%'' spawns a non-login bash session, some init files in ''%%/etc/profile.d%%'' are not loaded. This prevents the initialization of the [[linux_environment_modules|Environment Modules]] system and will keep you from loading software modules.

To fix this, simply include the following line in your sbatch scripts:

<code bash>
source /etc/profile.d/modules.sh
</code>
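
For example, a batch script that loads software through Environment Modules might look like this sketch (the module name ''%%python3%%'' is only a placeholder; use ''%%module avail%%'' to see what is actually installed):

<code bash>
#!/bin/bash
#SBATCH --job-name="module-test"
#SBATCH --output="module_test.out"

source /etc/profile.d/modules.sh   # initialize the Environment Modules system
module load python3                # placeholder module name
python3 --version
</code>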
  