CS Workshop: Software
Welcome! This workshop familiarizes new users with the software in the CS environment. It is not comprehensive; it is a quick introduction to the software that is available once you have been granted a CS account.
0: Software in CS
Several layers of software are available across CS servers, so you generally do not need to install software yourself.
On Managed systems (clusters such as portal, gpusrv, and the SLURM nodes), software is made available through Software Modules.
On Self-Managed systems, software modules may or may not be available depending on the requested configuration, and software may be local to the self-managed server(s).
Container software is also supported. The supported container software is Apptainer. For further reading about containers in CS, see our (Apptainer wiki page).
Docker software is not supported. Security vulnerabilities were found for this software and thus Docker was removed and replaced with Apptainer.
Virtual environments such as conda are also supported. The software used to support conda environments is Miniforge, which is made available as a module.
Software may be built from source within a project directory or home directory. In such cases, however, consider using an Apptainer container instead for increased portability.
0.1: Software Requests
If software is needed but is not listed, please send an email to cshelpdesk@virginia.edu with the following information:
- In the subject line, write: Software Module Request: <software name and version>
- In the body of the email, include information about the requested software
- Links to software documentation page(s) if available
- Version(s) required
1: Environment Variables
Understanding the module system first requires a fundamental Linux concept: environment variables.
Each shell (or terminal session) has its own set of variables that the shell, and the programs it starts, can read. One example is PATH, which lists the directories that are searched for executable files; this is what lets a command such as ~$ ls run without typing its full path. Other examples are $HOME, which is the path to your home directory, and $USER, which is set to your userid.
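As a minimal sketch of the per-shell behavior described above, the following shows that a plain shell variable stays in the current shell, while an exported variable is passed on to child processes. The names workshop_local and WORKSHOP_DEMO are made up for this illustration and are not part of the CS environment.

```shell
# Hypothetical variable names, used only for this illustration.
workshop_local="only in this shell"          # plain shell variable, not exported
export WORKSHOP_DEMO="visible to children"   # environment variable, inherited by child processes

# Start a child shell and ask it what it can see.
child_local=$(sh -c 'printf "%s" "${workshop_local:-unset}"')
child_exported=$(sh -c 'printf "%s" "${WORKSHOP_DEMO:-unset}"')

echo "child sees workshop_local: $child_local"
echo "child sees WORKSHOP_DEMO: $child_exported"
```

The child shell should report workshop_local as unset and print the value of WORKSHOP_DEMO. This is why the module system works by changing exported variables such as PATH: programs started from the shell inherit them.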
1.1: View Individual Environment Variables
To view details about specific environment variables, replace <variable name> with the name of the environment variable
~$ echo $<variable name>
~$ echo $PATH
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
1.2: To View All Environment Variables
To view all currently set environment variables
~$ printenv
Note, the output can be long, so it is recommended to filter it by piping to grep. Replace <search string> with the text to search for. The -i option makes the search case-insensitive
~$ printenv | grep -i "<search string>"
~$ printenv | grep -i "path"
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
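Because PATH is a single colon-separated string, it can be hard to read. The sketch below is one way to print each entry on its own line and to check which file the shell would actually run for a command. It assumes only standard tools (printf, tr, and the shell's command -v), and uses printenv purely as an example command.

```shell
# Print each PATH entry on its own line; the shell searches them from top to bottom.
path_entries=$(printf '%s\n' "$PATH" | tr ':' '\n')
printf '%s\n' "$path_entries"

# Show the full path of the file the shell would run for printenv.
printenv_location=$(command -v printenv)
echo "printenv resolves to: $printenv_location"
```

Running the same two steps before and after loading a module is a simple way to see what the module changed.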
2: Software Modules
Software Modules are individual pieces of software, or bundles of software, that can be dynamically loaded and unloaded, and used across the CS computing environment.
For example, this allows for loading specific pieces of software needed to run or build a program, such as a specific version of GCC.
Software modules are synchronized across managed systems. That is, the module system you have when logged into portal or NoMachine is the same one you have when using the gpusrv cluster or the SLURM cluster.
For more information about software modules, see our (Software Module wiki page)
2.1: How Modules Work
When a software module is loaded, environment variables such as PATH are updated to include path(s) to additional executable(s) and libraries. Unloading module(s) will remove these additional path(s).
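The effect on PATH can be imitated by hand. The following is only a sketch of the idea, not how the module system is implemented: the directory /opt/demo-module/bin is a made-up location, and a real module may also change other variables (for example library search paths).

```shell
# Hypothetical install location, for illustration only; real modules use
# directories chosen by the module system.
demo_bin="/opt/demo-module/bin"
original_path="$PATH"

# "Load": prepend the directory so its executables are found first.
PATH="$demo_bin:$PATH"
loaded_path="$PATH"
echo "after load: $loaded_path"

# "Unload": strip the directory that was prepended above.
PATH="${PATH#"$demo_bin":}"
echo "after unload: $PATH"
```

After the last step PATH is back to its original value, which mirrors what unloading a module does for the paths that module added.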
2.2: Toolchains
A toolchain can be thought of as the bundle of software used to compile a software module. For example, most software is compiled using a specific version of GCC, such as gcc/11.4.0. GCC is considered a toolchain, which is then used to compile other software such as Clang, Emacs, LLVM, and others.
Some modules are not compiled with a toolchain. Instead, they are compiled against the packages installed locally on the system. Examples are the apptainer and cuda modules.
For any module that was compiled using a toolchain, you must load the respective toolchain before you can load the software module.
2.3: How to Load Modules
Software module(s) can be loaded using the following command. Replace <name> with the name of the software, and replace <version> with the desired version
Note, if you do not include /<version>, the default version will be loaded
~$ module load <name>/<version>
For example, to load the default module for python, first load the default module for gcc
~$ module load gcc
~$ module load python
~$ python3
>>> print("Hello, world!")
Hello, world!
Note, modules are loaded in the order listed.
The shorthand command for the module load command is ml
~$ ml gcc python
Note, the toolchain used to compile the module for python was gcc, and thus must be loaded first
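Because the toolchain must be loaded before the modules built with it, a script may want to stop at the first module that fails to load. The helper below is a hedged sketch of that idea. load_in_order is a hypothetical function name, not part of the module system, and the sketch assumes that module load returns a non-zero exit status on failure.

```shell
# Hypothetical helper: load modules in the given order, stopping at the first failure.
load_in_order() {
    # The module command only exists where the module system is set up.
    if ! command -v module >/dev/null 2>&1; then
        echo "module command not found; run this on a system with software modules" >&2
        return 2
    fi
    for name in "$@"; do
        # Assumption: module load returns non-zero when a module cannot be loaded.
        if ! module load "$name"; then
            echo "failed to load $name" >&2
            return 1
        fi
    done
}

# Toolchain first, then the module compiled with it.
if load_in_order gcc python; then
    python3 --version
fi
```

If a load fails, module spider <name> can be used to check which toolchain the module expects.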
2.4: How to Unload Modules
To unload a specific module that has already been loaded
~$ module unload <name>/<version>
To unload ALL loaded modules
~$ module purge
2.5: List Loaded Modules
To list all currently loaded modules
~$ module list
2.6: How to Find Modules
To show all currently available modules, execute the following command.
Note, after loading a toolchain, the output of this command changes because more modules become available
~$ module avail
To search for a module, replace <name> with the name of the software to look for
~$ module spider <name>
This will search for the provided module by <name>, using a case-insensitive search
If a toolchain is required to load a specific software module, this command will also indicate which toolchain to load first.
2.7: How to Use Modules in SLURM
As previously mentioned, the software module system is synchronized across systems. Hence, for all SLURM jobs, the module commands used are the same as what is used in a terminal.
For example, observe the following SBATCH script, which loads two modules for running a python program
#!/bin/bash
#SBATCH -n 1
#SBATCH -t 00:02:00
#SBATCH -p cpu
module purge
module load gcc python
python3 my_python_script.py
2.8: How to Use Container Modules
There are some Apptainer containers that are available as modules.
To view these, the following commands can be run
~$ module load apptainer
~$ module avail
A specific container module can then be loaded and used. Instructions are printed after the container module is loaded
~$ module load mips-gcc
To execute the default application inside the container, run:
apptainer run $CONTAINERDIR/mips-gcc-10.3.0.sif
~$ apptainer run $CONTAINERDIR/mips-gcc-10.3.0.sif
Apptainer> mips-linux-gnu-gcc --version
mips-linux-gnu-gcc (Ubuntu 10.3.0-1ubuntu1) 10.3.0
2.9: Best Practices for Modules
Before loading modules, it is generally recommended to clear all loaded modules to ensure that only the modules you need are loaded
~$ module purge
Note for SLURM, if you load a module BEFORE submitting a job, the module will still be loaded when your job runs unless otherwise specified.
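One way to notice modules that are still loaded before submitting a job is to inspect the LOADEDMODULES environment variable. The sketch below assumes the CS module system maintains LOADEDMODULES as a colon-separated list, as Lmod does. count_loaded_modules is a hypothetical helper name.

```shell
# Hypothetical helper: count the entries in LOADEDMODULES.
# Assumption: the module system keeps LOADEDMODULES as a colon-separated list.
count_loaded_modules() {
    if [ -z "${LOADEDMODULES:-}" ]; then
        echo 0
    else
        # One entry per line, then count the non-empty lines.
        printf '%s\n' "$LOADEDMODULES" | tr ':' '\n' | grep -c .
    fi
}

loaded_count=$(count_loaded_modules)
if [ "$loaded_count" -gt 0 ]; then
    echo "Warning: $loaded_count module(s) loaded; run 'module purge' before sbatch if they are not needed."
else
    echo "No modules loaded."
fi
```

Starting the job script itself with module purge, as in the SBATCH example above, gives the same protection regardless of what was loaded in the submitting shell.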
3: Containers
The supported container software in CS is Apptainer. Docker is no longer supported in the CS environment.
Advantages of Apptainer
- Containers run as a user process
- Can run pre-built Docker containers with only a few exceptions
- Can convert Docker containers to Apptainer containers with ease
- Containers are stored as a file and can be used across all systems without local installations
- Easily run containers within the SLURM cluster or on self-managed machines
For more information on Apptainer, including building containers, see our (Apptainer wiki page).
4: Virtual Python Environments
Virtual Python environments such as conda and venv can be created using the software module system. This is done via the miniforge software module.
For example, to create a conda environment
~$ module load miniforge
~$ conda create --name <environment name>
~$ conda activate <environment name>
For more information, see our (Virtual Environment wiki page).