==== Scheduling a Job using the SLURM job scheduler ====
  
{{ ::introtoslurm.pdf | Intro to SLURM slides }}
The Computer Science Department uses a "job scheduler" called [[https://en.wikipedia.org/wiki/Slurm_Workload_Manager|SLURM]].  The purpose of a job scheduler is to allocate computational resources (servers) to users who submit "jobs" to a queue. The job scheduler looks at the requirements stated in the job's script and allocates to the job a server (or servers) which matches the requirements specified in the job script. For example, if the job script specifies that a job needs 192GB of memory, the job scheduler will find a server with at least that much memory free.
  
The job scheduler supports a direct login option (see below) that allows direct interactive logins to servers controlled by the scheduler, without the need for a job script.

As of 10-Jan-2022, the SLURM job scheduler was updated to the latest revision, which enforces memory limits. As a result, jobs that exceed their requested memory size will be terminated by the scheduler.

As of 02-Apr-2022, the time limit enforcement policy within SLURM has changed. All jobs submitted with time limits are extended 60 minutes past the user-submitted time limit. For example, if a user submits a job with a time limit of 10 minutes using the parameter "-t 10", SLURM will kill the job after 70 minutes.
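As a minimal sketch of how these limits are requested (the values and the memory size shown are illustrative, not recommendations), memory and time limits can be set with ''%%#SBATCH%%'' directives in the job script or as command-line options such as ''%%-t 10%%'':

<code bash>
#!/bin/bash
# sketch: request 16 GB of memory and a 10-minute time limit (illustrative values)
#SBATCH --mem=16G
#SBATCH --time=10

hostname
</code>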

==== Using SLURM ====
**[[https://slurm.schedmd.com/pdfs/summary.pdf|Slurm Commands Cheat Sheet]]**

The SLURM commands below are ONLY available on the portal cluster of servers. They are not installed on the gpusrv* servers or on the SLURM-controlled nodes themselves.
  
==== Information Gathering ====
  
To view information about compute nodes in the SLURM system, use the command ''%%sinfo%%''.
  
<code>
[abc1de@portal04 ~]$ sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
main*        up   infinite      3  drain falcon[3-5]
</code>

To see the jobs currently in the queue, use the command ''%%squeue%%'':
  
<code>
abc1de@portal01 ~ $ squeue
             JOBID PARTITION     NAME     USER    ST     TIME  NODES NODELIST(REASON)
            467039      main    my_job    abc1de  R      0:06      1 artemis1
</code>
  
The ''%%sinfo%%'' command also shows which nodes are idle and available to accept jobs:
  
<code>
abc1de@portal01 ~ $ sinfo
PARTITION     AVAIL  TIMELIMIT  NODES  STATE NODELIST
main*            up   infinite     37   idle hermes[1-4],artemis[2-7],slurm[1-5],nibbler[1-4],trillian[1-3],granger[1-6],granger[7-8],ai0[1-6]
</code>
==== Jobs ====
  
To use SLURM resources, you must submit your jobs (program/script/etc.) to the SLURM controller.  The controller will then send your job to compute nodes for execution, after which your results will be returned. There is also a //direct login option// (see below) that doesn't require a job script.
  
Users can submit SLURM jobs from ''%%portal.cs.virginia.edu%%''.  You can submit jobs using the commands [[https://slurm.schedmd.com/srun.html|srun]] or [[https://slurm.schedmd.com/sbatch.html|sbatch]].  Let's look at a very simple example script and ''%%sbatch%%'' command.
  
Here is our script; all it does is print the hostname of the server running the script.  We must add ''%%SBATCH%%'' options to our script to handle various SLURM options.
<code bash>
#!/bin/bash
# --- This job will be run on any available node
# and simply output the node's hostname to my_job.output
#SBATCH --job-name="Slurm Simple Test Job"
#SBATCH --error="my_job.err"
#SBATCH --output="my_job.output"
echo "$HOSTNAME"
</code>
  
We run the script with ''%%sbatch%%'' and the results will be put in the file we specified with ''%%--output%%''.  If no output file is specified, output will be saved to a file named after the SLURM job id.
  
<code>
[abc1de@portal04 ~]$ sbatch slurm.test
Submitted batch job 640768
[abc1de@portal04 ~]$ more my_job.output
cortado06
</code>
  
You can also run a command on several specific nodes at once by giving ''%%srun%%'' a node list (''%%-w%%'') and a node count (''%%-N%%''):
  
<code>
abc1de@portal01 ~ $ srun -w slurm[1-5] -N5 hostname
slurm4
slurm1
</code>
  
==== Direct login to servers (without a job script) ====

You can use ''%%srun%%'' to log in directly to a server controlled by the SLURM job scheduler.  This can be useful for debugging purposes as well as for running your applications without using a job script.

We must pass the ''%%--pty%%'' option to ''%%srun%%'' so output is directed to a pseudo-terminal.

For example, to open a direct login job on the node "cortado04", use:

<code>
abc1de@portal ~$ srun -w cortado04 --pty bash -i -l -
abc1de@cortado04 ~$ hostname
cortado04
abc1de@cortado04 ~$
</code>

The ''%%-w%%'' argument selects the server to log in to. The ''%%-i%%'' argument tells ''%%bash%%'' to run as an interactive shell.  The ''%%-l%%'' argument instructs bash that this is a login shell; this, along with the final ''%%-%%'', is important to reset environment variables that might otherwise cause issues when using [[linux_environment_modules|Environment Modules]].

If a node is in a partition (see below for partition information) other than the default "main" partition (for example, the "gpu" partition), then you //must// specify the partition in your command, for example:
<code>
abc1de@portal ~$ srun -w lynx05 -p gpu --pty bash -i -l -
</code>

If you are using a reservation, you must specify the ''%%--reservation=<reservationname>%%'' option to srun.
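
For example (a sketch; "myreservation" is a placeholder for an actual reservation name):
<code>
abc1de@portal ~$ srun --reservation=myreservation -w cortado04 --pty bash -i -l -
</code>
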
==== Terminating Jobs ====
  
To terminate a running job, use the ''%%scancel%%'' command with the job id shown by ''%%squeue%%''.
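
A minimal example, cancelling the job id shown in the ''%%squeue%%'' output above:
<code>
abc1de@portal01 ~ $ scancel 467039
</code>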
  
The default signal sent to a running job is SIGTERM (terminate). If you wish to send a different signal to the job's processes (for example, a SIGKILL, which is often needed if a SIGTERM doesn't terminate the process), use the ''%%--signal%%'' argument to scancel, for example:
<code>
abc1de@portal01 ~ $ scancel --signal=KILL 467039
</code>


==== Queues/Partitions ====

Slurm refers to job queues as //partitions//. We group similar systems into separate queues. For example, there is a "main" queue for general-purpose systems and a "gpu" queue for systems with GPUs. These queues can have unique constraints such as compute nodes, max runtime, resource limits, etc.

If no partition is specified in your job script or on the ''%%srun%%'' command line, the job will go to the default partition, "main".
  
The "main" and "gpu" partitions have a time limit set, so jobs will terminate after a specified number of days, as shown in the output of the ''%%sinfo%%'' command. However, there are two additional partitions, "nolim" and "gnolim", that have no time limit, so jobs will run to completion without any set termination time.
  
Partition is indicated by ''%%-p partname%%'' or ''%%--partition partname%%''.
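
As a sketch (the partition choice and "myjob.sh" script name are illustrative), the partition can be given either on the command line or inside a batch script:
<code>
# on the command line ("myjob.sh" is a placeholder script name)
abc1de@portal01 ~ $ sbatch -p main myjob.sh

# or inside the job script
#SBATCH --partition=main
</code>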
  
==== Long Running Jobs ====

If a job is expected to run longer than the default time limit for a given partition, two other partitions with unlimited runtime, ''%%nolim%%'' and ''%%gnolim%%'', are available for jobs with long runtimes.

The partition ''%%nolim%%'' is for long-running jobs that do not require a GPU, and ''%%gnolim%%'' is for long-running jobs that do require a GPU.

To utilize these partitions, simply specify the name of the partition in ''%%srun%%'' or ''%%sbatch%%''.
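
For example (a sketch; the script names are placeholders):
<code>
abc1de@portal01 ~ $ sbatch -p nolim long_cpu_job.sh
abc1de@portal01 ~ $ sbatch -p gnolim --gres=gpu:1 long_gpu_job.sh
</code>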

==== Using GPUs ====
  
Slurm handles GPUs and other non-CPU computing resources using what it calls [[https://slurm.schedmd.com/gres.html|GRES]] (Generic Resources).  To use the GPU(s) on a system through Slurm, with either ''%%sbatch%%'' or ''%%srun%%'', you must request them with the ''%%--gres%%'' option: specify the resource name followed by ''%%:%%'' and the quantity of resources, for example ''%%--gres=gpu:1%%''.
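
A minimal sketch (the partition name and GPU counts are illustrative):
<code>
# an interactive session with one GPU on the "gpu" partition
abc1de@portal01 ~ $ srun -p gpu --gres=gpu:1 --pty bash -i -l -

# or, in a batch script, request two GPUs
#SBATCH --partition=gpu
#SBATCH --gres=gpu:2
</code>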
  
==== Reservations ====
  
Reservations for specific resources or nodes can be made by submitting a request to <cshelpdesk@virginia.edu>.  For more information about using reservations, see the main article on [[compute_slurm_reservations|SLURM Reservations]].
  
==== Job Accounting ====

The SLURM scheduler implements SLURM's accounting features, so users can run the ''%%sacct%%'' command to view job accounting information such as job ids, job names, the partition a job ran on, allocated CPUs, job state, and exit codes. Numerous other options are supported; type ''%%man sacct%%'' on portal to see them all.
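
For example, to view the accounting record for the job submitted earlier (using the job id from the ''%%sbatch%%'' example above):
<code>
abc1de@portal01 ~ $ sacct -j 640768 --format=JobID,JobName,Partition,AllocCPUS,State,ExitCode
</code>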
  
==== Note on Modules in Slurm ====
  
Due to the way sbatch spawns a bash session (non-login session), some init files are not loaded from ''%%/etc/profile.d%%''.  This prevents the initialization of the [[linux_environment_modules|Environment Modules]] system and will prevent you from loading software modules.
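
One possible workaround (a sketch, not an official recommendation; "somemodule" is a placeholder module name) is to have the job script start a login shell so the profile scripts in ''%%/etc/profile.d%%'' are sourced:
<code bash>
#!/bin/bash -l
# the -l makes the script run as a login shell, so /etc/profile.d
# is sourced and the Environment Modules system is initialized
#SBATCH --job-name="modules-example"

module load somemodule   # "somemodule" is a placeholder module name
</code>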