- 1 Getting Started
- 2 Troubleshooting
- 3 Advanced Topics
- 4 Links
CUDA support includes three main components:
- Nvidia driver with CUDA support
- CUDA Toolkit
- CUDA SDK with code samples
We now provide support for these tools for lab, research group and grad desktop machines, under both Windows and Linux. We administer CUDA differently depending on the machine - please find the case that matches yours below. If you have CUDA support questions that aren't addressed here, please email root.
You should find CUDA installed under C:\CUDA. If that directory doesn't exist, or it doesn't contain a subdirectory called NVIDIA_CUDA_SDK, check under C:\Program Files\CUDA and copy the appropriate files/directories to C:\. This is needed because you are not allowed to modify any files in Program Files.
Once you've passed that step, open up C:\CUDA\NVIDIA_CUDA_SDK\projects in Windows Explorer. Go into any of the project directories, and double-click the Visual Studio solution file, which should open up that project in Visual Studio. Try to compile the project. The output at the bottom of the screen will tell you whether the compile was successful or not.
Any of the SDK example projects should compile out of the box. If not, contact your TA; either the machine is not set up correctly, or something went wrong along the way.
If you get a successful compile, try to run your binary in debug mode (which is the default for Visual Studio). For most projects, a window will pop up with some output (hopefully telling you that everything is fine) and prompting you to hit ENTER to exit. Do so.
Now you can start your own project. Close the example project that you've been working on. Create another directory with a new name under projects. Copy all the files in the project directory you have been working in up to now to that new directory.
Open up the project in the new directory. You should probably rename the project inside Visual Studio to the same name you gave to the directory to avoid future confusion.
Now you can start modifying the code, and see if anything still works after you have done so. :-)
The Nvidia driver and CUDA toolkit are already installed; the toolkit is available at /usr/local/cuda. To install a 32-bit copy of the SDK in your home directory, follow the instructions under "Installing the CUDA SDK and code samples" below; the same procedure applies to the 002a machines.
To run CUDA programs on the GPU, you will need a CUDA-capable Nvidia graphics card - a G8X series or greater. These are still relatively rare within the department, so if you're not sure whether your machine is CUDA-ready, it probably isn't. When in doubt, feel free to send mail to root, and we'll take a look at your card and let you know whether it'll work.
Regardless of whether your card supports CUDA, you may install the CUDA Toolkit and SDK on your machine and do development work there. You can even run your programs on the emulator that comes with the CUDA Toolkit, but the emulator is incredibly slow by comparison, so we don't recommend it for anything but lightweight testing.
If you are an administrator on your machine, you can download and install the driver (if applicable; see note above about supported cards), Toolkit and SDK.
If you have a card that supports CUDA, send email to root with your hostname, and we'll give you sudo access for installing and configuring the NVIDIA driver and CUDA Toolkit. Otherwise, you can download the Toolkit and SDK and install them into /usr/local or your home directory.
Professor Skadron's research machines
To set up CUDA tools on a departmental Linux machine, first send mail to firstname.lastname@example.org requesting that the host be added to the CUDA Cfengine group and the main departmental /etc/sudoers file. If you are new to the group, you will also need to request to be added to the CUDA_USERS group of /etc/sudoers.
Once those changes are made on the server, the host will pick up CUDA-related configuration within an hour of being up on the network and booted into Linux. A /cuda directory will be automatically created on the host, which contains the installers for the driver, toolkit and SDK (Note: due to version changes, your /cuda directory may not be identical to this listing).
root@skadrondell3:~# ls /cuda
NVIDIA_CUDA_sdk_2.0beta2_linux.run
NVIDIA_CUDA_Toolkit_2.0beta2_Ubuntu7.10_x86_64.run
NVIDIA_CUDA_Toolkit_2.0_ubuntu7.10_x86.run
NVIDIA-Linux-x86-177.67-pkg1.run
NVIDIA-Linux-x86_64-177.13-pkg2.run
NVIDIA-Linux-x86_64-177.67-pkg2.run
Members of the CUDA_USERS sudo group have the permission to run everything in /cuda, along with several system utilities required for the driver installation.
Installing the Nvidia CUDA driver
First, shut down the X server; the Nvidia driver installer requires that X not be running.
sudo killall gdm
Next, find within the /cuda directory the driver installer that matches the architecture of your host - they start with "NVIDIA-Linux". Run "uname -a" if you're unsure of the architecture; x86 machines will list "i686" and x86_64 machines will list "x86_64". As of September 2008, the current version of the Nvidia driver is 177.67. Via sudo, run the installer that matches the host architecture.
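The architecture check can also be scripted. The following is a minimal sketch, assuming the installer names follow the pattern in the /cuda listing above (NVIDIA-Linux-x86-... for 32-bit, NVIDIA-Linux-x86_64-... for 64-bit). The function name driver_pattern is hypothetical and not part of any Nvidia tool.

```shell
# Map a machine type (as printed by "uname -m") to the glob that matches
# the driver installers for that architecture in /cuda.
# driver_pattern is a hypothetical helper name used only for illustration.
driver_pattern() {
    case "$1" in
        x86_64) echo 'NVIDIA-Linux-x86_64-*.run' ;;
        i?86)   echo 'NVIDIA-Linux-x86-*.run' ;;
        *)      echo "unsupported architecture: $1" >&2; return 1 ;;
    esac
}

# List the candidate installers for this host, if /cuda is present.
pattern=$(driver_pattern "$(uname -m)" 2>/dev/null) || pattern=''
if [ -n "$pattern" ] && [ -d /cuda ]; then
    # $pattern is left unquoted on purpose so the shell expands the glob.
    ls /cuda/$pattern 2>/dev/null || echo "no matching installer in /cuda"
fi
```

If more than one version is listed, pick the newest one and run it via sudo as described above.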
Select the defaults, and answer yes to the question "Install NVIDIA's 32-bit compatibility OpenGL libraries?" At the end of the installer, you will be asked whether to overwrite your X server settings file (xorg.conf); do so.
Once the driver installer completes, restart the X server:
sudo /etc/init.d/gdm start
Installing the CUDA Toolkit
Within the /cuda directory, find the toolkit installer that matches the architecture of your host, and invoke it with sudo. Versions 2.0 and earlier start with "NVIDIA_CUDA_Toolkit"; later versions start with "cudatoolkit".
Accept the defaults. At the end of the installation, you'll see the following notices:
* Please make sure your PATH includes /usr/local/cuda/bin
* Please make sure your LD_LIBRARY_PATH includes /usr/local/cuda/lib
Do so - to change them permanently, edit your .profile to include the following lines:
PATH=/usr/local/cuda/bin:$PATH
LD_LIBRARY_PATH=/usr/local/cuda/lib:$LD_LIBRARY_PATH
export PATH LD_LIBRARY_PATH
Installing the CUDA SDK and code samples
In order to give each user the ability to build into and modify code samples provided by the SDK, we recommend installing a copy in each user's home directory for each architecture used for CUDA work. Run it as yourself (without using sudo):
cd ~
/cuda/NVIDIA_CUDA_sdk_2.0beta2_linux.run
The installer will prompt for the toolkit location: /usr/local/cuda. It will also ask where to install the SDK; if you'll be using both 32-bit and 64-bit CUDA machines, you'll need to maintain two copies of the SDK, so we recommend naming them by architecture (e.g. ~/NVIDIA_CUDA_SDK-x86 and ~/NVIDIA_CUDA_SDK-x86_64).
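If you keep two copies of the SDK, a few lines in your .profile can select the right one on each host. This is a minimal sketch, assuming the two directory names recommended above; the function name sdk_dir_for and the variable CUDA_SDK are hypothetical and are not used by the SDK itself.

```shell
# Print the SDK directory for a given machine type (as printed by "uname -m").
# sdk_dir_for and CUDA_SDK are hypothetical names used only for illustration.
sdk_dir_for() {
    case "$1" in
        x86_64) echo "$HOME/NVIDIA_CUDA_SDK-x86_64" ;;
        i?86)   echo "$HOME/NVIDIA_CUDA_SDK-x86" ;;
        *)      return 1 ;;
    esac
}

# Export CUDA_SDK only when the host architecture is one of the two above.
CUDA_SDK=$(sdk_dir_for "$(uname -m)") && export CUDA_SDK
```

With this in place, `cd "$CUDA_SDK"` takes you to the copy that matches the machine you are logged in to.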
Run "make" within the "common" directory of the SDK(s) you install in your home directory. Among other things, this builds libcutil.a into the "lib" directory where you installed the SDK and will allow you to build code examples.
: /af13/jlg9n/NVIDIA_CUDA_SDK ; cd common
: /af13/jlg9n/NVIDIA_CUDA_SDK/common ; make
To test the SDK, build the "scan" example:
: /af13/jlg9n/NVIDIA_CUDA_SDK ; cd projects/scan
: /af13/jlg9n/NVIDIA_CUDA_SDK/projects/scan ; make
Then, cd into the root of the SDK(s) you've installed and run the scan test:
: /af13/jlg9n/NVIDIA_CUDA_SDK ; bin/linux/release/scan
You should see all three scan tests pass:
scan_naive: Test PASSED
scan_workefficient: Test PASSED
scan_best: Test PASSED
If not, send email to root with the output of scan and the path to the copy of the SDK you're using.
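The pass/fail check can be automated by counting the "Test PASSED" lines. The sketch below assumes the output format shown above; count_passed is a hypothetical helper name, and a saved copy of the expected output stands in for a real run of scan.

```shell
# Count the "Test PASSED" lines read from standard input.
# count_passed is a hypothetical helper name used only for illustration.
count_passed() {
    grep -c 'Test PASSED'
}

# Against the real binary, run from the root of the SDK:
#   bin/linux/release/scan | count_passed
# Here a copy of the expected output is used instead.
sample='scan_naive: Test PASSED
scan_workefficient: Test PASSED
scan_best: Test PASSED'

passed=$(printf '%s\n' "$sample" | count_passed)
if [ "$passed" -eq 3 ]; then
    echo "all three scan tests passed"
else
    echo "only $passed of 3 scan tests passed" >&2
fi
```

If the count is lower than three, keep the full output of scan so you can include it in your email to root.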
As new versions of the driver, toolkit and SDK become available, systems staff can easily add them to the /cuda directory on all of the boxes - just send email to root with the pointer to the new installers, and we'll copy them into the master directory.
You may need to tweak your Xorg configuration to set up the display settings you prefer; the CUDA_USERS sudo group is authorized to run "/usr/bin/nvidia-xsettings", "/usr/bin/nvidia-settings", and "/usr/bin/sudoedit /etc/X11/xorg.conf" via sudo.
For your convenience, the CUDA_USERS sudo group is also authorized to run "/sbin/shutdown" and "/sbin/reboot" for remote boot management.
Problem: When executing a CUDA program under Linux, you get the following error:
error while loading shared libraries: libcudart.so.2: cannot open shared object file: No such file or directory
Solution: You need to add the path to the CUDA libraries to your $LD_LIBRARY_PATH environment variable.
Explanation: When a CUDA program is executed, it needs to dynamically link to the CUDA runtime libraries. By default, these libraries are located in the /usr/local/cuda/lib directory. When searching for these libraries, the operating system looks in directories specified in the $LD_LIBRARY_PATH environment variable. If the CUDA library directory is not specified here, the program will fail with the error shown above.
There are two solutions (these assume that you are using the bash shell, which is the default CS Department shell):
1. Run the following command:
This change is not persistent and will need to be re-run each time you log in.
2. Edit your .profile file (located at ~/.profile). Find the line that sets the $LD_LIBRARY_PATH variable, which should look similar to the following:
Modify that line to add the path to the CUDA libraries:
After editing the file, you either need to log out and log back in or run the following command:
This solution is persistent and only needs to be performed once.
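A minimal sketch of both options, assuming the toolkit's default library directory /usr/local/cuda/lib named in the explanation above. The line already present in your .profile will differ from host to host, so only the result of the edit is shown. The variable name CUDA_LIB is hypothetical.

```shell
# Option 1: add the CUDA library directory for the current session only.
# The ${VAR:+...} form avoids a trailing colon when the variable is empty.
CUDA_LIB=/usr/local/cuda/lib
export LD_LIBRARY_PATH="$CUDA_LIB${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# Option 2: put the two lines above in ~/.profile, then reload it with:
#   . ~/.profile

# Confirm that the directory is now on the library search path.
case ":$LD_LIBRARY_PATH:" in
    *":$CUDA_LIB:"*) echo "CUDA libraries are on LD_LIBRARY_PATH" ;;
    *)               echo "CUDA libraries are missing from LD_LIBRARY_PATH" >&2 ;;
esac
```

The final check is safe to run at any time, for example right before launching a CUDA program that failed with the libcudart error.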
By default, CUDA only supports single-precision floating point arithmetic. You must explicitly enable double-precision support.
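As a rough sketch, assuming a toolkit of the 2.x generation: double precision there requires a GPU of compute capability 1.3 and the nvcc option -arch=sm_13, without which doubles are demoted to floats. Supported targets change between toolkit releases, so check the nvcc documentation for your version. The file names my_kernel.cu and my_kernel are hypothetical.

```shell
# Hypothetical source and output names, used only for illustration.
SRC=my_kernel.cu
OUT=my_kernel

# Target compute capability 1.3, the first to support double precision.
# This value is an assumption tied to 2.x-era toolkits.
NVCC_FLAGS="-arch=sm_13"

if command -v nvcc >/dev/null 2>&1 && [ -f "$SRC" ]; then
    nvcc $NVCC_FLAGS -o "$OUT" "$SRC"
else
    # Without nvcc or the source file, just show the command.
    echo "would run: nvcc $NVCC_FLAGS -o $OUT $SRC"
fi
```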
Choosing among multiple GPUs
When running a CUDA program on a computer with multiple GPUs, your program may not be executed on the most powerful GPU. To maximize performance, you should modify your program to automatically choose the best GPU.
Measuring kernel runtime
A naive approach to measuring the runtime of a CUDA kernel may produce wildly inaccurate results. Make sure that you measure runtime correctly.