SLURM provides “reservations” of compute nodes. A reservation sets aside one or more nodes for your exclusive use. If you need a node free of other users’ jobs, or need it for an extended period of time, we can reserve the node (or nodes) for your use only.
Getting a Reservation
Reservations for nodes can be requested by sending an email to email@example.com. We can reserve an entire compute node or node group (ex. “cortado01” or “cortado[01-05]”).
In your request for a reservation, please be specific about:
- What servers are needed (ex. “cortado01-05”)
- The start and end time of the reservation (ex. today to Dec 31st)
- Who can use the reservation (a list of user IDs)
Note: the requested node must be free of any running jobs before a reservation can take effect. If you request a reservation on a node where one of your jobs is still running, we will create the reservation, but it will not take effect until that job completes. Please cancel ('scancel') your job before requesting a reservation.
Note: nodes may be reserved for a maximum of fourteen days, and extended in one-week increments. Reservations that are not being used (no jobs running) will be removed.
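Before emailing a request, you can confirm the node is free of your own jobs using standard SLURM commands (the node name and job ID below are illustrative, not values from this page):

```shell
# List your jobs on the node you want reserved (node name is an example)
squeue -u "$USER" -w cortado01

# Cancel a lingering job by its job ID so the reservation can take effect
scancel 123456
```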
Using reservations when submitting jobs
If you have been granted a reservation you will receive a reservation tag, a short string that is required to submit jobs. This string must be included in any srun or sbatch submission via the --reservation flag. For example, if your tag is “abc1de_4”, then you would submit jobs using the flag --reservation=abc1de_4.
This can be done at the command line:
[abc1de@portal03 ~]$ sbatch --reservation=abc1de_4 slurm.sh
...
[abc1de@portal03 ~]$ srun --reservation=abc1de_4 slurm.sh
Or you can include the flag in the header of your sbatch file:
[abc1de@portal03 ~]$ cat reservation.sh
#!/bin/bash
#SBATCH --job-name="Reservation Example"
#
#SBATCH --error="stderr.txt"
#SBATCH --output="stdout.txt"
#
#SBATCH --reservation=abc1de_4    <- Include reservation tag
...
[abc1de@portal03 ~]$ sbatch reservation.sh
You can see a listing of all active reservations by using the scontrol show reservation command:
[abc1de@portal03 ~]$ scontrol show reservation
ReservationName=abc1de_4 StartTime=2019-06-25T00:00:00 EndTime=2019-07-02T16:00:00
   Nodes=slurm1 NodeCnt=1 CoreCnt=12 Features=(null) PartitionName=(null) Flags=SPEC_NODES
   TRES=cpu=24
   Users=abc1de Accounts=(null) Licenses=(null) State=ACTIVE BurstBuffer=(null) Watts=n/a
Avoiding Default Cluster Limits with Reservations
The CS cluster has a default SLURM QoS named csdefault, which limits the number of concurrent jobs a user may run. Jobs submitted to a reservation may bypass this limit by using the csresnolim QoS: add the parameter -q csresnolim or --qos=csresnolim to your srun commands or sbatch scripts. This QoS requires that a reservation be specified, that is, you must also include --reservation=abc1de_4 in your srun commands or sbatch scripts.
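Putting the pieces together, a submission script that uses both the reservation tag and the no-limit QoS might look like the following sketch (the tag abc1de_4 is the example tag from above; the job name and output filenames are illustrative):

```shell
#!/bin/bash
#SBATCH --job-name="Reservation QoS Example"
#SBATCH --error="stderr.txt"
#SBATCH --output="stdout.txt"
#
# Use the reservation tag you were given...
#SBATCH --reservation=abc1de_4
# ...and the QoS that lifts the csdefault concurrent-job limit.
# This QoS only works when a reservation is also specified.
#SBATCH --qos=csresnolim

hostname
```

Submit it as usual with sbatch; the job will run on your reserved node without counting against the csdefault limit.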