Read this article at https://www.pugetsystems.com/guides/825
Dr Donald Kinghorn (Scientific Computing Advisor)

Install Ubuntu 16.04 or 14.04 and CUDA 8 and 7.5 for NVIDIA Pascal GPU

Written on August 29, 2016 by Dr Donald Kinghorn

In this post I will walk you through setting up a CUDA dev environment on Ubuntu 16.04 (or 14.04). We will install both CUDA 8.0 and 7.5 and go through all of the tricks you need to get a working setup. There are enough tricks that hopefully this page will have a long life because some of them seem to be timeless, unfortunately! This install method will work with the latest NVIDIA Pascal GPUs.

This is more up-to-date and tested than what I had described a few months back about getting CUDA working on Ubuntu 16.04. This procedure will work with the NVIDIA Pascal cards and will also work with Ubuntu 14.04.

Note: The following CUDA setup is reasonable for a development environment. For a production environment you might want to rethink a few of the things I've done!

The focus in this post is on the software install and I will try to do the install in a way that is mostly hardware independent. The system I'm using as I write this is,

Peak Tower Single
MB: ASUS X99-E WS
CPU: Intel Core-i7 6950X 8-core @ 3.2GHz (3.5GHz All-Core-Turbo)
Memory: 64 GB DDR4 2133MHz Reg ECC
PCIe: (4) X16-X16 v3
GPU: (4) NVIDIA Titan X Pascal

Install Ubuntu 16.04 server

Start with a fresh server install. It's quick and will usually install without trouble even on bleeding-edge hardware.

You may want/need to add some boot time kernel options while doing the install. The systemd config in Ubuntu 16.04 is doing ugly things with network interfaces on some multi-nic boards and the motherboard I'm using doesn't like power management on the PCIe bus.

When the grub screen comes up to boot into the install you can hit "e" to edit the boot line. Look for the line that starts with "linux" and ends with "quiet ---". You can remove the "quiet ---" part (so you can see all the system messages during boot) and then add any kernel options that you want. For this motherboard I add,

net.ifnames=0 biosdevname=0 pcie_aspm=off
That leaves the network interface names as eth0, eth1 etc. and leaves the PCIe bus in the "performance" power state.

The above boot options are not necessarily needed. You need to know the quirks of your hardware and how it interacts with your install.
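
Once a system has booted, one way to confirm that options like these actually reached the kernel is to look at /proc/cmdline. The following is only a minimal sketch: has_kernel_opt is a hypothetical helper name, and the three options checked are the ones used for this particular motherboard.

```shell
# Report whether a kernel option appears in a kernel command line string.
# has_kernel_opt is a hypothetical helper name, not a standard tool.
has_kernel_opt() {
    # $1 = full command line, $2 = option to look for (whole-word match)
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# /proc/cmdline holds the options the running kernel was booted with.
cmdline=$(cat /proc/cmdline 2>/dev/null || true)
for opt in net.ifnames=0 biosdevname=0 pcie_aspm=off; do
    if has_kernel_opt "$cmdline" "$opt"; then
        echo "$opt: present"
    else
        echo "$opt: missing"
    fi
done
```

If you did not add any of these options, every line will simply report "missing", which is fine on hardware that does not need them.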

Do the base install following the prompts with choices you want.

Reboot

If you added any kernel options during the install boot you will want to interrupt the boot process by hitting "e" and edit the Linux line again. Note: it will look different this time since this is your actual install and not the installer boot. We can get the rest of the install complete before we need to do another reboot.

Updates, desktop environment, extra packages and grub update

Pretty much everything from here on will need to be done as root so add sudo to the beginning of the commands or just sudo -s to get a root shell.

Do updates.

apt-get update
apt-get dist-upgrade

Install your desktop environment

The easiest way to take care of adding a desktop GUI is to run the "tasksel" command. If you run it without any arguments it will give you a very nice menu with many options for different desktop setups and other good stuff. It's a handy tool.

You can run tasksel as follows to add the default Ubuntu desktop without looking at the menu.

tasksel install ubuntu-desktop

Add extra programs

You now have your base desktop install so you might want to add a few extras at this point. I usually add the following.

apt-get install build-essential emacs dkms synaptic ssh

Grub update

If you are using any kernel options on startup you should probably take care of that now before the next reboot. Use your editor of choice and do something like the following.

edit the file /etc/default/grub 
the server install will have an empty option line like this,

GRUB_CMDLINE_LINUX_DEFAULT=""

For the motherboard I'm using I would change this to,

GRUB_CMDLINE_LINUX_DEFAULT="net.ifnames=0 biosdevname=0 pcie_aspm=off"

Note that a "normal" desktop install would also have "quiet splash" 
in this line to hide all of the boot messages during startup 
... sometimes I like to see them!

After editing that file update grub with (surprise)

update-grub
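
If you would rather script the edit than open an editor, something along these lines should work. This is a sketch under assumptions: set_grub_cmdline is a hypothetical helper, it assumes the file already contains a GRUB_CMDLINE_LINUX_DEFAULT line, and it assumes the options contain no "|" or "&" characters (which would confuse sed). It is shown here against a scratch file rather than the real /etc/default/grub.

```shell
# set_grub_cmdline is a hypothetical helper: it rewrites the
# GRUB_CMDLINE_LINUX_DEFAULT line in the given file, keeping a .bak copy.
set_grub_cmdline() {
    # $1 = path to grub defaults file, $2 = kernel options
    sed -i.bak "s|^GRUB_CMDLINE_LINUX_DEFAULT=.*|GRUB_CMDLINE_LINUX_DEFAULT=\"$2\"|" "$1"
}

# Try it on a scratch file first rather than on /etc/default/grub.
scratch=$(mktemp)
printf 'GRUB_DEFAULT=0\nGRUB_CMDLINE_LINUX_DEFAULT=""\n' > "$scratch"
set_grub_cmdline "$scratch" "net.ifnames=0 biosdevname=0 pcie_aspm=off"

# Show the rewritten line.
grep '^GRUB_CMDLINE_LINUX_DEFAULT=' "$scratch"
```

On the real system you would point the helper at /etc/default/grub (as root) and then run update-grub as described above.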

Install the NVIDIA display driver

I've been using the well maintained "graphics-drivers" ppa for adding the NVIDIA display drivers. These have been up-to-date and well packaged. Using this will give you a convenient update path for new drivers. So far I haven't had any trouble with new drivers rebuilding against kernel source using dkms.

add-apt-repository ppa:graphics-drivers/ppa
apt-get update
apt-get install nvidia-367

CUDA install, setup and fixes

Dependencies

We are doing a manual CUDA toolkit install since we want both version 7.5 and 8.0rc (and since the packaged .deb and .rpm files are basically broken right now!)

I did a "dry-run" CUDA install from the deb files to pull a list of system packages that would get installed as dependencies (outside of the CUDA repo). You may or may not need these, but it is what the old (working) deb install from the CUDA repo would have pulled in. I put them in a file called cuda-deps... sorry for the long scroll line but I didn't want any line breaks in there in case you want to copy that to a file.

cat cuda-deps 
ca-certificates-java default-jre default-jre-headless fonts-dejavu-extra freeglut3 freeglut3-dev java-common libatk-wrapper-java libatk-wrapper-java-jni  libdrm-dev libgl1-mesa-dev libglu1-mesa-dev libgnomevfs2-0 libgnomevfs2-common libice-dev libpthread-stubs0-dev libsctp1 libsm-dev libx11-dev libx11-doc libx11-xcb-dev libxau-dev libxcb-dri2-0-dev libxcb-dri3-dev libxcb-glx0-dev libxcb-present-dev libxcb-randr0-dev libxcb-render0-dev libxcb-shape0-dev libxcb-sync-dev libxcb-xfixes0-dev libxcb1-dev libxdamage-dev libxdmcp-dev libxext-dev libxfixes-dev libxi-dev libxmu-dev libxmu-headers libxshmfence-dev libxt-dev libxxf86vm-dev lksctp-tools mesa-common-dev  x11proto-core-dev x11proto-damage-dev  x11proto-dri2-dev x11proto-fixes-dev x11proto-gl-dev x11proto-input-dev x11proto-kb-dev x11proto-xext-dev x11proto-xf86vidmode-dev xorg-sgml-doctools xtrans-dev libgles2-mesa-dev

If you put those package names in a file called cuda-deps you can do the following to install them easily,

cat cuda-deps | xargs sudo apt-get -y install
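
Before handing a long list like that to apt-get it can be worth a quick sanity check of the file. The sketch below uses a hypothetical helper, list_deps, and a short stand-in file (the real cuda-deps is much longer) to print one package per line, count them, and show any duplicated names.

```shell
# list_deps is a hypothetical helper: print one package name per line
# from a whitespace-separated list file such as cuda-deps.
list_deps() {
    tr -s '[:space:]' '\n' < "$1" | sed '/^$/d'
}

# Small stand-in file for illustration; the real cuda-deps is much longer.
deps_file=$(mktemp)
printf 'freeglut3 freeglut3-dev  libx11-dev\nlibxmu-dev\n' > "$deps_file"

list_deps "$deps_file" | wc -l            # how many packages are listed
list_deps "$deps_file" | sort | uniq -d   # any name listed more than once
```

apt-get also has a simulate mode (apt-get -s install ...) if you want to see what would be installed without changing the system.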

CUDA toolkit installs

Download the ".run" install files from NVIDIA (you will need to be registered as a developer to get the 8.0rc version)

You want to run these install scripts and NOT install the bundled display drivers!

You can run the scripts and answer the prompts or you can do,

./cuda_7.5.18_linux.run --help
to see the script options. Then, if you trust me, you can do the following,

chmod 755 cuda_*
./cuda_7.5.18_linux.run --silent --toolkit --samples --samplespath=/usr/local/cuda-7.5/samples --override
./cuda_8.0.27_linux.run --silent --toolkit --samples --samplespath=/usr/local/cuda-8.0/samples --override

That will give you both CUDA toolkit versions with the sample code directories where they belong.

There will be a symbolic link at /usr/local/cuda pointing to /usr/local/cuda-8.0. You can change this link to point to the 7.5 version like this,

sudo rm /usr/local/cuda
sudo ln -s /usr/local/cuda-7.5 /usr/local/cuda

I like doing the version switching this way because then I can set the system up to expect the toolkit at /usr/local/cuda regardless of which version is actually linked there.
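
If you switch versions often, the two commands can be wrapped in a small function. This is just a sketch: switch_cuda is a hypothetical name, and the optional second argument (an install prefix, defaulting to /usr/local) is there only so the function can be tried out in a scratch directory.

```shell
# switch_cuda is a hypothetical helper: repoint <prefix>/cuda at
# <prefix>/cuda-<version>. The prefix defaults to /usr/local.
switch_cuda() {
    version=$1
    prefix=${2:-/usr/local}
    if [ ! -d "$prefix/cuda-$version" ]; then
        echo "no toolkit found at $prefix/cuda-$version" >&2
        return 1
    fi
    # -s symbolic, -f replace an existing link, -n treat the existing
    # link as a file instead of following it into the old directory
    ln -sfn "$prefix/cuda-$version" "$prefix/cuda"
}

# Demonstration in a scratch directory instead of /usr/local.
demo=$(mktemp -d)
mkdir "$demo/cuda-7.5" "$demo/cuda-8.0"
switch_cuda 8.0 "$demo"
switch_cuda 7.5 "$demo"
readlink "$demo/cuda"    # shows where the link points now
```

On the real system it would be run as root, for example "switch_cuda 7.5", with the default prefix.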

System CUDA environment

I like to have base development system tools like CUDA on the default bin and lib path so I create the following files,

/etc/profile.d/cuda.sh

export PATH=$PATH:/usr/local/cuda/bin
export CUDADIR=/usr/local/cuda
export GLPATH=/usr/lib

and for libs,

/etc/ld.so.conf.d/cuda.conf

/usr/local/cuda/lib64

Run "ldconfig" after adding that last file.
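
One possible variation on the cuda.sh file above guards against adding the same directory to PATH more than once, which can happen if the file gets sourced again from a nested shell. This is a sketch, not what the setup above requires; path_append is a hypothetical helper name.

```shell
# path_append is a hypothetical helper: add a directory to PATH only if
# it is not already there, so re-sourcing the file does not grow PATH.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present, nothing to do
        *) PATH="$PATH:$1" ;;
    esac
}

path_append /usr/local/cuda/bin
export PATH
export CUDADIR=/usr/local/cuda
export GLPATH=/usr/lib
```

After logging in again, "nvcc --version" should find the compiler through the new PATH entry, and "ldconfig -p | grep libcudart" should list the CUDA runtime library if the ld.so.conf.d file was picked up.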

Fix "broken" stuff

The default gcc compiler version for Ubuntu 16.04 is now 5.4 and the CUDA configurations were not tested against that so they will error out when you try to build any code. The easiest thing to do is just comment out the error line in the appropriate header file. You should be aware of what compiler version you are using when you build code and realize that NVIDIA may not have tested everything against that! This is a "development" setup not a "production" setup!

The file you want to edit is "host_config.h" in the toolkit "include" directory. All you really need to do (as a hack) is to add // at the beginning of the error line to comment it out.

Here's a couple of little sed lines to do that for you.

sed -i '/unsupported GNU version/ s/^/\/\//' /usr/local/cuda-7.5/include/host_config.h
sed -i '/unsupported GNU version/ s/^/\/\//' /usr/local/cuda-8.0/include/host_config.h
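
To see what that sed expression does before letting it loose on the real header, you can try it on a scratch file. The file contents below are illustrative stand-ins, not copied from the actual host_config.h.

```shell
# Scratch file standing in for host_config.h. The lines below are
# illustrative, not copied from the real header.
hdr=$(mktemp)
cat > "$hdr" <<'EOF'
#if __GNUC__ > 4
#error -- unsupported GNU version! gcc versions later than 4.9 are not supported!
#endif
EOF

# Same sed expression as above: only on lines matching the pattern,
# insert "//" at the start of the line.
sed -i '/unsupported GNU version/ s/^/\/\//' "$hdr"
cat "$hdr"
```

Only the #error line gets the leading //. Note that running the sed line a second time would add another // to the same line, which is harmless but untidy.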

The other thing that is broken is that many of the sample source files have hard wired display driver versions. If you want to build the samples for testing then you will want to fix this.

The following "find" and "sed" lines will fix this for you.

find /usr/local/cuda-7.5/samples -type f -exec sed -i 's/nvidia-3../nvidia-367/g' {} +
find /usr/local/cuda-8.0/samples -type f -exec sed -i 's/nvidia-3../nvidia-367/g' {} +
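
Here is the same find and sed pattern run against a scratch directory so you can see the effect first. The file names and contents are made up for illustration; the grep line is an extra step (not part of the fix itself) that previews which files mention a driver version.

```shell
# Scratch tree standing in for a samples directory. The file names and
# contents are illustrative only.
samples=$(mktemp -d)
mkdir -p "$samples/common"
printf 'UBUNTU_PKG_NAME = "nvidia-352"\n' > "$samples/common/findgllib.mk"
printf 'no driver reference here\n' > "$samples/common/readme.txt"

# Preview which files mention a hard-wired 3xx driver version.
grep -rl 'nvidia-3[0-9][0-9]' "$samples"

# Same find/sed pattern as above, pointed at the scratch tree.
find "$samples" -type f -exec sed -i 's/nvidia-3../nvidia-367/g' {} +
cat "$samples/common/findgllib.mk"
```

The ".." in the sed pattern matches any two characters after "nvidia-3", so it is a little looser than the grep preview, which only matches digits.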

REBOOT

That's it! You have been doing all of this from the base server install command console, now reboot to your desktop environment and your CUDA setup should be ready to go!
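
After the reboot, a quick check that the main pieces are on the PATH might look like the following. It is a minimal sketch; check_tool is a hypothetical helper and it only tests that the commands can be found, not that the GPUs are working.

```shell
# check_tool is a hypothetical helper: report whether a command is on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: not found"
    fi
}

check_tool nvidia-smi   # installed with the display driver
check_tool nvcc         # CUDA compiler, expected in /usr/local/cuda/bin
```

If both are found, running "nvidia-smi" should list your GPUs and "nvcc --version" should report whichever toolkit version /usr/local/cuda currently points to.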

Happy computing! --dbk

Tags: NVIDIA, CUDA, Pascal GPU, Linux