Install CUDA and PGI Accelerator with OpenACC

I’m going to walk you through a basic install and configuration of a development system for CUDA and OpenACC GPU programming. This is not a detailed how-to, but if you have some Linux admin skills it will be a reasonable guide to get you started. We’ll do a basic NVIDIA GPU programming setup including CentOS 6.5, the CUDA development environment and a PGI compiler setup with OpenACC. The most interesting part may be the OpenACC setup. OpenACC is a relatively new option for GPU programming and allows for a directive (pragma) based coding model. I’m personally really interested in that and I’ve always liked the PGI compilers!

The setup outlined here is basically what we do at Puget Systems for a “Tesla StarterKit” which is a nice bundle under $5000 including a Peak Mini with a Tesla K20 and the PGI Accelerator compiler bundle including a 1 year support subscription.

OS install

We’ll start with what I consider to be the standard HPC development system install, CentOS 6.5 with the “Software Development Workstation” install group. You could use one of the other NVIDIA CUDA and PGI supported distributions and versions but my setup notes would need to be modified in that case.

Go to the CentOS site, find a mirror near you and download the DVD1 .iso. For the install you can use your method of choice: DVD, USB, network, whatever you would normally do for an install.

You will want to have a “developer” install, so in CentOS I usually choose the “Software Development Workstation” install option. I like to select “Customize Now” and add the KDE desktop at this point too. I use the Gnome desktop on CentOS but I like some of the KDE applications, especially “Konsole”, so it’s convenient to just add this in during install. Tweak the install to your taste. After your install and basic setup bring everything up to date with,

yum update 

EPEL repo install and other dependencies

You will want to install dkms so the NVIDIA drivers can rebuild automatically if you do any kernel updates. This is in the EPEL repo (along with a lot of other good stuff).

Note: I’ve used wget to download files but you can certainly just use firefox instead.

yum install epel*
yum install dkms

The OS install above will take care of most dependencies, but one that will come up when you start testing CUDA is the OpenGL utility toolkit (freeglut). Let’s add that now:

yum install freeglut  freeglut-devel 

CUDA 6.0 install

Install the latest CUDA repo. This will give you the CUDA base that you need and will also install and set up the NVIDIA graphics driver. It takes care of things like blacklisting the nouveau driver for you! It’s the easiest way to install the NVIDIA graphics driver on CentOS. This will install a bunch of dependencies, build the kernel modules and take care of the hassle of getting the driver set up right. (The following CUDA repo was fresh as of this writing; get the latest one!)

yum install cuda-repo*
yum install cuda
shutdown -r now

After the reboot you should have your NVIDIA drivers rebuilt against your current Linux kernel, and CUDA 6.0 should be installed in /usr/local/. You can open nvidia-settings to check things out with the driver and then do a check of the CUDA setup with (from your regular login account),

rsync -av /usr/local/cuda/samples .
cd samples
make

That will build the example programs. Run some of them to confirm things are right and to get some instant CUDA joy.

Now, to finish the CUDA setup, we can do a little system configuration to put cuda/bin on the default path and add a conf file to get cuda/lib64 on the library path. (This may help if you had some of the sample jobs fail 🙂)

As root, create a file /etc/profile.d/ and put the following lines in it;

export PATH

Then create a file /etc/ and put the following line in it,


Then run,


Now on next login (or when you open a new shell) you will have the CUDA compiler and tools on your path and the CUDA libraries will be available.
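
A quick way to confirm that the new environment took effect is to ask the shell where a tool resolves. The following is only a sketch: report_tool is a hypothetical helper name, not something CUDA provides, and it assumes nvcc is the tool you want to check.

```shell
#!/bin/sh
# report_tool is a hypothetical helper: it prints where a command resolves on
# the current PATH and returns non-zero if the command cannot be found.
report_tool() {
    if tool_path=$(command -v "$1" 2>/dev/null); then
        echo "$1: $tool_path"
    else
        echo "$1: not found on PATH"
        return 1
    fi
}

# Example (run in a fresh shell after the configuration above):
#   report_tool nvcc && nvcc --version
```

If the tool is reported as not found, the profile script has probably not been read yet in that shell.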


That should get you started for the CUDA setup. Be sure to read the excellent documentation in /usr/local/cuda/doc


CUDA 5.5 Install

You will probably want to install the CUDA 5.5 toolkit alongside the latest release, since there were many changes in version 6 that existing code may have trouble with.

Download the .run script this time instead of the rpm file since it will ask us more questions about what components to install.

Make the script executable, and start it,

wget .
chmod 755

After accepting the license agreement you may get a warning about an unsupported platform; that’s OK. Next you will be asked if you want to install the graphics driver. Say no, since we already have that taken care of. Say yes to “install the Toolkit?”; the default location in /usr/local/cuda-5.5 is good. If you want to install the 5.5 samples you will probably want to change the install location to something more reasonable like /usr/local/cuda-5.5/samples

During the install the run script will overwrite the /usr/local/cuda symbolic link that was created during the version 6 install. You can change that back with,

rm /usr/local/cuda
ln -s /usr/local/cuda-6.0 /usr/local/cuda

Note that we set the system environment variables to point to bin and lib64 in the /usr/local/cuda directory, so you can change the default version by changing the symbolic link if you like.
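
Switching the default version comes down to repointing one link. Here is a minimal sketch of that idea, run against a scratch directory so nothing under /usr/local is touched. The name point_default is hypothetical, and on a real system the root would be /usr/local and the command would have to be run as root.

```shell
#!/bin/sh
# point_default is a hypothetical helper: it repoints <root>/cuda at
# <root>/cuda-<version>, refusing if that toolkit directory does not exist.
point_default() {
    root=$1
    version=$2
    if [ ! -d "$root/cuda-$version" ]; then
        echo "no such toolkit: $root/cuda-$version" >&2
        return 1
    fi
    # -n keeps ln from descending into the directory the old link points at;
    # -f replaces the existing link.
    ln -sfn "$root/cuda-$version" "$root/cuda"
}

# Demonstration in a scratch directory standing in for /usr/local.
demo=$(mktemp -d)
mkdir "$demo/cuda-5.5" "$demo/cuda-6.0"
point_default "$demo" 6.0
point_default "$demo" 5.5
readlink "$demo/cuda"    # prints the scratch path ending in /cuda-5.5
```

Using ln -sfn avoids the separate rm step, at the cost of silently replacing whatever link was there before.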

Now on to the PGI compilers and OpenACC!

The Portland Group a.k.a. PGI

PGI has been building high quality optimizing compilers since the ’90s. Their Fortran compilers in particular have always been well regarded and I’ve used them for many projects in the past. In recent years they have been doing a lot of work building tools for GPU computing; CUDA Fortran is a nice example. Their latest work for GPU computing has been the development and implementation of OpenACC for directive based programming, similar to what you would do with OpenMP for multi-threaded development.


The OpenACC standard is being developed by NVIDIA/PGI, Cray and CAPS and is designed to simplify parallel programming on CPU/GPU systems.

I think OpenACC has a huge potential to bring new developers to GPU programming. It’s at a much higher level than working directly with CUDA or OpenCL and coupled with GPU optimized library calls should make porting efforts and new development more productive. Low level work may still be needed for the most time critical code segments but this should make the overall process of getting code running on the GPU a more pleasant experience.

PGI Accelerator™ with OpenACC Install

You might want to go read some of the information and check out resources on to get a feel for what you are dealing with.

To get started with the install you will need to create an account. Then download the install tar file. You can start by doing the install and running with a trial license.

The tar file doesn’t extract to its own directory, so make a directory for the extraction, move the tar file in there and then un-tar it.

mkdir pgi
mv pgilinux-2014-146-x86_64.tar.gz pgi/
cd pgi
tar xzf pgilinux-2014-146-x86_64.tar.gz
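
If you are ever unsure whether an archive unpacks into its own directory, you can list it before extracting. This is a small sketch of that check; top_level_entries is a hypothetical helper name, and it assumes a gzip-compressed tar file.

```shell
#!/bin/sh
# top_level_entries is a hypothetical helper: it lists the distinct top-level
# names inside a gzip-compressed tar archive without extracting anything.
top_level_entries() {
    tar tzf "$1" | sed 's|^\./||' | cut -d/ -f1 | grep -v '^$' | sort -u
}

# Example: more than one line of output means the archive unpacks straight
# into the current directory, so extract it inside a directory of its own.
#   top_level_entries pgilinux-2014-146-x86_64.tar.gz
```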

You may want to look at the install document before you go much further …

PGI uses FlexNet (i.e. FlexLM) license management, even on standalone, single-user, single-machine installs. This means that a license server will be running on your system. It’s not a problem, but some users may not be used to this type of license setup.

If you are ready for the install, then,


When the install script asks if you want to install CUDA components, say “no” if you have followed my advice and already installed CUDA 5.5 and 6.0. [ It’s actually OK to let the install script add the CUDA components since it just installs the CUDA “Toolkit” for version 5.5 and 6.0 under the PGI install directory. However, I prefer to have a full default CUDA setup. We’ll add some symbolic links later to keep the compiler happy…]

When you are asked;

1  Generate a license key for this computer
2  Configure and start a license server on this computer
3  All of the above
4  I'm not sure (quit now and re-run this script later,)

What do you want to do? 

pick 3, "All of the above". You will be asked to enter the login id and password that you used when you created an account to do the download. This will start everything up for the license server and let you know if there are any problems (there shouldn’t be any). Alternatively, you could set the license up later by following the instructions in the install manual.

Install and start the license server.

cp /opt/pgi/linux86-64/2014/bin/lmgrd.rc /etc/init.d/
chkconfig lmgrd.rc on
service lmgrd.rc start

The license server will now start automatically on boot.

Now we can set up the system environment so that the compilers and libraries will be accessible by default.

Create a file /etc/profile.d/ with the following lines in it.


Now the next time you login (or start up a new shell) the compilers should be on your path.

Symbolic links to the CUDA install

If you are building code that needs access to the CUDA binaries and libraries the compiler will, by default, expect to find them in /opt/pgi/linux86-64/2014/cuda/5.5 (6.0) directories. Since we have a default CUDA install in /usr/local it’s convenient to create symbolic links so the compiler can easily find what it needs.

ln -s /usr/local/cuda-5.5  /opt/pgi/linux86-64/2014/cuda/5.5
ln -s /usr/local/cuda-6.0  /opt/pgi/linux86-64/2014/cuda/6.0

That should do it! Let’s try it out!

Let’s do a quick compile and run of some OpenACC code to be sure our environment and setup are working right.

If you haven’t logged out and back in, you may need to source the PGI environment variables script we set up earlier.

source /etc/profile.d/

cp /opt/pgi/linux86-64/2014/examples/OpenACC/samples/acc_f1/acc_f1.f90 .

pgfortran -acc -Minfo acc_f1.f90 -o testacc

If you see something like the following in your terminal …

kinghorn@dbk pgi-install$ pgfortran -acc -Minfo acc_f1.f90 -o testacc
     28, Generating present_or_copyin(a(1:n))
         Generating present_or_copyout(r(1:n))
         Generating Tesla code
     29, Loop is parallelizable
         Accelerator kernel generated
         29, !$acc loop gang, vector(128) ! blockidx%x threadidx%x

kinghorn@dbk pgi-install$ ./testacc 
       100000 iterations completed


…then you are good to go!
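
If you want to script that check rather than read the output by eye, one option is to search the compiler feedback for the “Accelerator kernel generated” line shown above. This is only a sketch: kernel_generated is a hypothetical helper name, and stderr is merged into the pipe on the assumption that the -Minfo feedback may be written there.

```shell
#!/bin/sh
# kernel_generated is a hypothetical helper: it reads compiler feedback on
# stdin and succeeds only if an accelerator kernel was reported.
kernel_generated() {
    grep -q "Accelerator kernel generated"
}

# Example, merging stderr into the pipe in case the feedback is written there:
#   pgfortran -acc -Minfo acc_f1.f90 -o testacc 2>&1 | kernel_generated \
#       && echo "OpenACC kernel was generated"
```

This only tells you that the compiler produced GPU code; running ./testacc is still the real test.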

Happy computing! –dbk
