I’m going to walk you through a basic install and configuration of a development system for CUDA and OpenACC GPU programming. This is not a detailed how-to, but if you have some Linux admin skills it will be a reasonable guide to get you started. We’ll do a basic NVIDIA GPU programming setup including CentOS 6.5, the CUDA development environment and a PGI compiler setup with OpenACC. The most interesting part may be the OpenACC setup. OpenACC is a relatively new option for GPU programming that allows for a directive (pragma) based coding model. I’m personally really interested in that and I’ve always liked the PGI compilers!
The setup outlined here is basically what we do at Puget Systems for a “Tesla StarterKit” which is a nice bundle under $5000 including a Peak Mini with a Tesla K20 and the PGI Accelerator compiler bundle including a 1 year support subscription.
We’ll start with what I consider to be the standard HPC development system install, CentOS 6.5 with the “Software Development Workstation” install group. You could use one of the other NVIDIA CUDA and PGI supported distributions and versions but my setup notes would need to be modified in that case.
Go to the CentOS site, find a mirror near you and download the DVD1 .iso. For the install you can use your method of choice: DVD, USB, network, whatever you would normally do for an install.
You will want to have a “developer” install, so in CentOS I usually choose the “Software Development Workstation” install option. I like to select “Customize Now” and add the KDE desktop at this point too. I use the Gnome desktop on CentOS but I like some of the KDE applications, especially “Konsole”, so it’s convenient to just add this in during install. Tweak the install to your taste. After your install and basic setup, bring everything up to date (as root) with,
yum update
EPEL repo install and other dependencies
You will want to install dkms so the NVIDIA drivers can rebuild automatically if you do any kernel updates. This is in the EPEL repo (along with a lot of other good stuff).
Note: I’ve used wget to download files but you can certainly just use firefox instead.
wget http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
yum install epel*
yum install dkms
The OS install above will take care of most dependencies, but one that will come up when you start testing CUDA is the OpenGL utility toolkit (GLUT). Let’s add that now,
yum install freeglut freeglut-devel
CUDA 6.0 install
Install the latest CUDA repo. This will give you the CUDA base that you need and will also install and set up the NVIDIA graphics driver. It takes care of things like blacklisting the nouveau driver for you! It’s the easiest way to install the NVIDIA graphics driver on CentOS. This will install a bunch of dependencies, build the kernel modules and take care of the hassle of getting the driver set up right. (The following CUDA repo was fresh as of this writing; get the latest one!)
wget http://developer.download.nvidia.com/compute/cuda/repos/rhel6/x86_64/cuda-repo-rhel6-6.0-37.x86_64.rpm
yum install cuda-repo*
yum install cuda
shutdown -r now
After the reboot you should have your NVIDIA drivers rebuilt against your current Linux kernel and the setup for CUDA 6.0 should be installed in /usr/local/. You can open nvidia-settings to check things out with the driver and then do a check of the CUDA setup with (from your regular login account),
rsync -av /usr/local/cuda/samples .
cd samples
make
That will build the example programs. Run some of them to confirm things are right and to get some instant CUDA joy.
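If a sample builds but fails to run, the first thing I would rule out is the driver. Here is a minimal sketch that checks which kernel modules are loaded by reading /proc/modules. The helper name module_loaded is my own invention, and the optional second argument exists only so the function can be pointed at a different file for testing. After the reboot you would expect nvidia to be loaded and nouveau not.

```shell
#!/bin/sh
# Sketch: check whether a kernel module is loaded.
# module_loaded is a hypothetical helper, not a system command.
# Usage: module_loaded NAME [FILE]   (FILE defaults to /proc/modules)
module_loaded() {
    # The first field of each line in /proc/modules is the module name.
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

for m in nvidia nouveau; do
    if module_loaded "$m" 2>/dev/null; then
        echo "$m: loaded"
    else
        echo "$m: not loaded"
    fi
done
```

For a closer look at the GPU itself, nvidia-smi (installed along with the driver) is the next place to go.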
Now, to finish the CUDA setup, we can do a little system configuration to put cuda/bin on the default path and add a conf file to get cuda/lib64 on the library path. (this may help if you had some of the sample jobs fail :-)
As root, create a file /etc/profile.d/cuda.sh and put the following lines in it;
PATH=$PATH:/usr/local/cuda/bin
export PATH
Then create a file /etc/ld.so.conf.d/cuda.conf and put the following line in it,
/usr/local/cuda/lib64
Run ldconfig as root to refresh the linker cache. Now on next login (or when you open a new shell) you will have the CUDA compiler and tools on your path and the CUDA libraries will be available.
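One small wrinkle with the profile script: if it gets sourced more than once in a session, the same directory is appended to PATH each time. A guarded variant might look like the sketch below. path_append is a helper name I made up for illustration; nothing in CUDA or CentOS provides it.

```shell
# Sketch of a guarded /etc/profile.d/cuda.sh.
# path_append is a hypothetical helper: it adds a directory to PATH
# only when that directory is not already listed.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;                # already present, nothing to do
        *) PATH="$PATH:$1" ;;
    esac
}

path_append /usr/local/cuda/bin
path_append /usr/local/cuda/bin     # the second call is a no-op
export PATH
```

Wrapping both sides in colons lets the pattern match a whole PATH entry rather than a substring of a longer directory name.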
That should get you started for the CUDA setup. Be sure to read the excellent documentation in /usr/local/cuda/doc
CUDA 5.5 Install
You will probably want to install the CUDA 5.5 toolkit alongside the latest release, since version 6 introduced many changes that existing code may have trouble with.
Download the .run script this time instead of the rpm file since it will ask us more questions about what components to install.
Make the script executable, and start it,
wget http://developer.download.nvidia.com/compute/cuda/5_5/rel/installers/cuda_5.5.22_linux_64.run
chmod 755 cuda_5.5.22_linux_64.run
./cuda_5.5.22_linux_64.run
After accepting the license agreement you may get a warning about an unsupported platform; that’s OK. Next you will be asked if you want to install the graphics driver. Say no, since we already have that taken care of. Say yes to “install the Toolkit?”; the default location, /usr/local/cuda-5.5, is good. If you want to install the 5.5 samples you will probably want to change the install location to something more reasonable like /usr/local/cuda-5.5/samples.
During the install the run script will overwrite the /usr/local/cuda symbolic link that was created during the version 6 install, pointing it at 5.5 instead. You can change that back with,
rm /usr/local/cuda
ln -s /usr/local/cuda-6.0 /usr/local/cuda
Note that we set the system PATH and library path to point to bin and lib64 under /usr/local/cuda, so you can change the default version just by changing the symbolic link if you like.
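If you expect to flip between versions often, the rm and ln pair can be wrapped up. The sketch below is one way to do it; use_cuda and its optional prefix argument are hypothetical names of mine, and the demo at the bottom runs against a scratch directory rather than /usr/local so it is safe to try as a regular user.

```shell
#!/bin/sh
# Sketch: repoint the "cuda" symlink at one of the versioned toolkit directories.
# use_cuda is a hypothetical helper. Usage: use_cuda VERSION [PREFIX]
# PREFIX defaults to /usr/local, where the installs above put cuda-5.5 and cuda-6.0.
use_cuda() (
    version="$1"
    prefix="${2:-/usr/local}"
    target="$prefix/cuda-$version"
    if [ ! -d "$target" ]; then
        echo "no such toolkit directory: $target" >&2
        exit 1
    fi
    # -s: symbolic link, -f: replace an existing link, -n: do not follow the old link
    ln -sfn "$target" "$prefix/cuda"
    echo "$prefix/cuda -> $target"
)

# Dry run in a scratch directory so nothing under /usr/local is touched.
scratch=$(mktemp -d)
mkdir "$scratch/cuda-5.5" "$scratch/cuda-6.0"
use_cuda 6.0 "$scratch"
use_cuda 5.5 "$scratch"
rm -rf "$scratch"
```

On the real system you would run use_cuda 5.5 or use_cuda 6.0 as root. The function body is a subshell, so the exit on a missing directory does not close your login shell.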
Now on to the PGI compilers and OpenACC!
The Portland Group a.k.a. PGI
PGI has been building high quality optimizing compilers since the 90s. Their Fortran compilers in particular have always been well regarded and I’ve used them for many projects in the past. In recent years they have been doing a lot of work building tools for GPU computing; CUDA Fortran is a nice example. Their latest work for GPU computing has been the development and implementation of OpenACC for directive based programming, similar to what you would do with OpenMP for multi-threaded development.
I think OpenACC has a huge potential to bring new developers to GPU programming. It’s at a much higher level than working directly with CUDA or OpenCL and coupled with GPU optimized library calls should make porting efforts and new development more productive. Low level work may still be needed for the most time critical code segments but this should make the overall process of getting code running on the GPU a more pleasant experience.
PGI Accelerator™ with OpenACC Install
You might want to go read some of the information and check out resources on http://www.pgroup.com/resources/accel.htm to get a feel for what you are dealing with.
To get started with the install you will need to create an account. Then download the install tar file. You can start by doing the install and running with a trial license.
The tar file doesn’t extract to its own directory, so make a directory for the extraction, move the file in there and then un-tar it.
mkdir pgi
mv pgilinux-2014-146-x86_64.tar.gz pgi/
cd pgi
tar xzf pgilinux-2014-146-x86_64.tar.gz
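You can check ahead of time whether any tar file is going to scatter files into the current directory. The sketch below lists the distinct top-level names in a gzipped archive; tar_top_level is a helper name I made up, and the demo builds a throwaway archive with two placeholder files so it runs without the PGI download.

```shell
#!/bin/sh
# Sketch: list the distinct top-level names inside a gzipped tar file.
# tar_top_level is a hypothetical helper. More than one name in the output
# means the archive will unpack straight into the current directory.
tar_top_level() {
    tar tzf "$1" | sed -e 's|^\./||' -e '/^$/d' | cut -d/ -f1 | sort -u
}

# Self-contained demo with a throwaway archive holding two loose files.
scratch=$(mktemp -d)
mkdir "$scratch/src"
touch "$scratch/src/install" "$scratch/src/readme"
tar czf "$scratch/flat.tar.gz" -C "$scratch/src" install readme
tar_top_level "$scratch/flat.tar.gz"
rm -rf "$scratch"
```

Against the real download that would be tar_top_level pgilinux-2014-146-x86_64.tar.gz. A single name in the output means the archive brings its own directory and the mkdir step is optional.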
You may want to look at the install document http://www.pgroup.com/doc/pgiinstall.pdf before you go much further …
PGI uses FlexNet i.e. FlexLM license management even on standalone, single user, single machine installs. You will need to understand that this means that a license server will be running on your system. It’s not a problem but some users may not be used to this type of a license setup.
If you are ready for the install then, as root, run the install script from the directory where you extracted the tar file,
./install
When the install script asks if you want to install CUDA components say “no” if you have followed my advice and already installed CUDA 5.5 and 6.0. [ It’s actually OK to let the install script add the CUDA components since it just installs the CUDA “Toolkit” for versions 5.5 and 6.0 under the PGI install directory. However, I prefer to have a full default CUDA setup. We’ll add some symbolic links later to keep the compiler happy…]
When you are asked;
1 Generate a license key for this computer
2 Configure and start a license server on this computer
3 All of the above
4 I'm not sure (quit now and re-run this script later,)
What do you want to do?
Pick 3, "All of the above". You will be asked to enter the login id and password that you used when you created an account to do the download. This will start everything up for the license server and let you know if there are any problems (there shouldn’t be any). Alternatively, you could set the license up later following the instructions in the install manual.
Install and start the license server.
cp /opt/pgi/linux86-64/2014/bin/lmgrd.rc /etc/init.d/
chkconfig lmgrd.rc on
service lmgrd.rc start
The license server will now start automatically on boot.
Now we can set up the system environment so that the compilers and libraries will be accessible by default.
Create a file /etc/profile.d/pgi.sh with the following lines in it.
PGI=/opt/pgi
PATH=/opt/pgi/linux86-64/2014/bin:$PATH
MANPATH=$MANPATH:/opt/pgi/linux86-64/2014/man
LM_LICENSE_FILE=$LM_LICENSE_FILE:/opt/pgi/license.dat
export PGI PATH MANPATH LM_LICENSE_FILE
Now the next time you log in (or start up a new shell) the compilers should be on your path.
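A quick way to confirm that both profile scripts took effect is to ask the shell where it finds each tool. This is only a sketch: report_tools is a hypothetical helper of mine, and the list of names (nvcc from CUDA, pgcc and pgfortran from PGI) is just what I would expect to see after the steps above.

```shell
#!/bin/sh
# Sketch: report which of the expected tools are visible on PATH.
# report_tools is a hypothetical helper; pass it any command names you care about.
report_tools() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: $(command -v "$tool")"
        else
            echo "$tool: not found"
        fi
    done
}

report_tools nvcc pgcc pgfortran
```

Anything reported as "not found" usually means the matching file in /etc/profile.d has not been sourced in the current shell yet. Once the PGI tools are on the path, pgaccelinfo should also be available to show which GPUs the compilers can see.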
Symbolic links to the CUDA install
If you are building code that needs access to the CUDA binaries and libraries, the compiler will, by default, expect to find them in the /opt/pgi/linux86-64/2014/cuda/5.5 and /opt/pgi/linux86-64/2014/cuda/6.0 directories. Since we have a default CUDA install in /usr/local it’s convenient to create symbolic links so the compiler can easily find what it needs.
ln -s /usr/local/cuda-5.5 /opt/pgi/linux86-64/2014/cuda/5.5
ln -s /usr/local/cuda-6.0 /opt/pgi/linux86-64/2014/cuda/6.0
That should do it! Let’s try it out!
Let’s do a quick compile and run of some OpenACC code to be sure our environment and setup are working correctly.
If you haven’t logged out and back in, you may need to source the PGI environment script we set up earlier.
source /etc/profile.d/pgi.sh
cp /opt/pgi/linux86-64/2014/examples/OpenACC/samples/acc_f1/acc_f1.f90 .
pgfortran -acc -Minfo acc_f1.f90 -o testacc
./testacc
If you see something like the following in your terminal ...
kinghorn@dbk pgi-install$ pgfortran -acc -Minfo acc_f1.f90 -o testacc
main:
     28, Generating present_or_copyin(a(1:n))
         Generating present_or_copyout(r(1:n))
         Generating Tesla code
     29, Loop is parallelizable
         Accelerator kernel generated
         29, !$acc loop gang, vector(128) ! blockidx%x threadidx%x
kinghorn@dbk pgi-install$ ./testacc
 100000 iterations completed
 Test PASSED
...then you are good to go!
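If you want to repeat this check from a script, say after a kernel or driver update, you can look for the "Test PASSED" line instead of eyeballing the output. A minimal sketch, where run_acc_test is a hypothetical helper and echo stands in for the real binary so the example runs anywhere:

```shell
#!/bin/sh
# Sketch: run a command and check its output for the "Test PASSED" marker
# that the acc_f1 sample prints. run_acc_test is a hypothetical helper.
run_acc_test() {
    if "$@" 2>&1 | grep -q "Test PASSED"; then
        echo "OK: $*"
    else
        echo "FAILED: $*"
        return 1
    fi
}

# Stand-in for the real binary; on the GPU system this would be: run_acc_test ./testacc
run_acc_test echo "Test PASSED"
```

The check relies on the output text rather than the exit status, since I am only assuming what the sample prints, not what it returns.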
Happy computing! --dbk