SLURM Reservations
Getting a Reservation
Reservations for SLURM nodes/resources can be requested by submitting a ticket by email to cshelpdesk@virginia.edu. We can grant reservations for a specific amount of computing resources (cores, memory, GPU count, etc.) or for an entire compute node or node group (ex. “cortado01” or “cortado[01-05]”).
In your request for a reservation, please be specific about:
- What servers are needed (ex. “cortado01-05”)
- The start and end time of the reservation (ex. today to Dec 31st)
- Who can use the reservation (a list of user IDs)
Note: a node must be free of running jobs before a reservation can be created. If you are requesting a reservation on a node where one of your own jobs is still running, we cannot create the reservation until that job completes, so it is best to cancel the job ('scancel') before requesting the reservation.
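As a sketch, that check-then-cancel step might look like the following. The job ID 123456 is hypothetical (substitute the ID shown by squeue for your own job), and the commands are guarded so the snippet does nothing on a machine without SLURM installed:

```shell
# Sketch: clear your own jobs from a node before requesting a reservation.
# "123456" is a hypothetical job ID; substitute the one squeue reports.
# Guarded so the SLURM commands are skipped where SLURM is not installed.
if command -v squeue >/dev/null 2>&1; then
  squeue -u "$USER"      # list your queued and running jobs
  # scancel 123456       # cancel the job occupying the node to be reserved
  status="checked"
else
  status="no-slurm"
fi
```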
Note: nodes may be reserved for a maximum of fourteen days, and reservations may be extended in one-week increments.
Using Reservations When Submitting Jobs
If you have been granted a reservation, you will receive a reservation tag: a short string that is required to submit jobs under the reservation. This string must be included in any srun or sbatch submission via the flag --reservation=....
For example, if your tag is “abc1de_4”, then you would submit jobs using the flag --reservation=abc1de_4
This can be done at the command line:
[abc1de@portal03 ~]$ sbatch --reservation=abc1de_4 slurm.sh
...
[abc1de@portal03 ~]$ srun --reservation=abc1de_4 slurm.sh
Or you can include the flag in the header of your sbatch file:
[abc1de@portal03 ~]$ cat reservation.sh
#!/bin/bash
#SBATCH --job-name="Reservation Example"
#
#SBATCH --error="stderr.txt"
#SBATCH --output="stdout.txt"
#
#SBATCH --reservation=abc1de_4   # <- Include reservation tag
...
[abc1de@portal03 ~]$ sbatch reservation.sh
Listing Reservations
You can see a listing of all active reservations by using the scontrol command:
[abc1de@portal03 ~]$ scontrol show res
ReservationName=abc1de_4 StartTime=2019-06-25T00:00:00 EndTime=2019-07-02T16:00:00
   Nodes=slurm1 NodeCnt=1 CoreCnt=12 Features=(null) PartitionName=(null) Flags=SPEC_NODES
   TRES=cpu=24
   Users=abc1de Accounts=(null) Licenses=(null) State=ACTIVE BurstBuffer=(null) Watts=n/a
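When many reservations are active, you can narrow the listing to a single reservation by naming it; “abc1de_4” below is the example tag used throughout this page. This is a sketch assuming the standard scontrol syntax, guarded so the snippet does nothing on a machine without SLURM installed:

```shell
# Show a single reservation by name instead of the full listing.
# "abc1de_4" is the example reservation tag from this page.
# Guarded so the SLURM command is skipped where SLURM is not installed.
if command -v scontrol >/dev/null 2>&1; then
  scontrol show reservation abc1de_4
  status="queried"
else
  status="no-slurm"
fi
```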