Install TensorFlow with GPU Support the Easy Way on Ubuntu 18.04 (without installing CUDA)

TensorFlow is a very important Machine/Deep Learning framework and Ubuntu Linux is a great workstation platform for this type of work. If you want to set up a workstation using Ubuntu 18.04 with CUDA GPU acceleration support for TensorFlow then this guide will hopefully help you get your machine learning environment up and running without a lot of trouble. And, you don’t have to do a CUDA install!

This guide is for Ubuntu 18.04 but I will also be doing a similar post using the latest Windows 10 build.

Ubuntu 18.04 is out and in my opinion it is a big improvement over 16.04. 18.04 is the latest LTS (Long Term Support) build of Ubuntu. It will become the standard base platform for a lot of projects. There is usually some lag time before packages and projects move to a new base platform like Ubuntu 18.04. However, at this point nearly all of the projects that I care about are already supported on 18.04.

I said “nearly all” …! Right now I have Ubuntu 18.04 running supported versions of Docker, NVIDIA-docker v2, Virtualbox, Anaconda Python, etc. There is only one package that I generally install that is not (officially) supported on 18.04 yet. That one package is NVIDIA CUDA. I had waited to write anything about Ubuntu 18.04 until CUDA 9.2 was released because I was sure it would have install support for 18.04. Well, guess what, it doesn’t. For Ubuntu the recent 9.2 CUDA release only has installer support for 16.04 and 17.10! I was really surprised to see that. 16.04 makes sense but 17.10 is a short term intermediate release and it is similar enough to 18.04 that I don’t understand why 18.04 didn’t happen. There may be something broken that they just decided to wait to fix rather than delay the 9.2 release any further. That would be understandable and reasonable.

I will do a detailed post on how to do an Ubuntu 18.04 install including an unofficial CUDA 9.2 install. In this post I am assuming you have successfully installed Ubuntu 18.04. If that is not the case then you may want to wait for my detailed install post.

If you are not doing CUDA development work then you may not need to install CUDA anyway. The focus here is to get a good GPU accelerated TensorFlow work environment up and running without a lot of fuss.

I’m adding a note here about some issues that have come up in the comments. If you see the following error when you try to run a TF or Keras job, it’s because your NVIDIA display driver is not new enough for the new TF and Keras builds on the Anaconda cloud, i.e. they are linking against CUDA libs that need the nvidia-396 runtime and you have the nvidia-390 runtime installed.
Internal: cudaGetDevice() failed. Status: CUDA driver version is insufficient for CUDA runtime version

You should update your NVIDIA driver! Use either the “Software and Updates” GUI tool under the “Driver” section or update the driver manually from the command line like,
sudo apt purge 'nvidia*'
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt install nvidia-driver-430

Check the latest version here,
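
If you want to see which driver is currently loaded before or after updating, a short script along these lines can help. It is only a sketch: it assumes nvidia-smi is on your PATH, the MIN_DRIVER threshold of 396 is taken from the error discussion above rather than from any official compatibility table, and the function names are made up for illustration.

```python
import shutil
import subprocess

# Assumed threshold: the note above says the newer builds need the nvidia-396 runtime.
MIN_DRIVER = 396


def driver_major(version_text):
    """Return the major driver number from a string like '430.26'."""
    return int(version_text.strip().split(".")[0])


def installed_driver():
    """Ask nvidia-smi for the driver version; return None if it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True,
    )
    if result.returncode != 0 or not result.stdout.strip():
        return None
    # With several GPUs there is one line per GPU; the driver is the same for all.
    return result.stdout.strip().splitlines()[0]


if __name__ == "__main__":
    version = installed_driver()
    if version is None:
        print("nvidia-smi not found or no driver loaded")
    elif driver_major(version) < MIN_DRIVER:
        print("Driver", version, "is older than", MIN_DRIVER, "- update it")
    else:
        print("Driver", version, "looks new enough")
```

If the reported major version is below the threshold, the “driver version is insufficient” error above is the likely result.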

Best wishes –Don

Python environment setup with Anaconda Python

I highly recommend you use Anaconda Python. If you need some arguments for using Python take a look at my post Should You Learn to Program with Python. For arguments on why you should use the Anaconda Python distribution see, How to Install Anaconda Python and First Steps for Linux and Windows.

Anaconda is focused toward data-science and machine learning. It installs cleanly on your system in a single directory so it doesn’t make a mess in your system’s application and library directories. It is also performance optimized and links important numerical packages like numpy to Intel’s MKL. Most importantly for this post, it includes easily installed modules for TensorFlow that include the CUDA dependencies!

Install Anaconda Python

  • You will be asked to accept a license agreement and then questioned about the install location. By default it will install at the top of your home directory under anaconda3. I recommend that you use that. [ If you ever want to get rid of it or reinstall you can just remove that directory.]
  • Next it will ask if you want to append the Anaconda executable directory to your PATH environment variable in .bashrc. I recommend that you do that, but remember that you did. It will add something like the following at the end of your .bashrc file,
# added by Anaconda3 installer
export PATH="/home/dbk/anaconda3/bin:$PATH"
  • Then “re-source” your .bashrc file to execute that export. [ It will happen automatically on subsequent login. ]
source ~/.bashrc
  • Next you will be asked if you want to install Microsoft VSCode. VSCode is a really good editor and it is available for free on Windows, Linux and MacOS. However, rather than installing it from here, if you are interested in trying it out I would recommend that you go to the VSCode website and check it out. If you think you want to try it then go ahead and download it and install it yourself. I usually use the Atom editor which also runs on Windows, Linux and MacOS. If you are checking out editors I recommend you try both of these as well as Sublime Text. They are all great editors!
  • Check your install. If you have sourced your .bashrc file and your PATH is correct you should see something like,
python --version

Python 3.6.4 :: Anaconda, Inc.
  • Update your base Anaconda packages. (conda is a powerful package and environment management tool for Anaconda and it’s not restricted to use with just Python)
conda update conda
conda update anaconda
conda update python
conda update --all

That should bring your entire base Anaconda install up to the latest packages.

There is a GUI for Anaconda called anaconda-navigator. I personally find it distracting/confusing/annoying and prefer using conda from the command-line. Your taste may differ! … and my opinion is subject to change if they keep improving it.

Create a Python “virtual environment” for TensorFlow using conda

You should set up an environment for TensorFlow separate from your base Anaconda environment. This keeps your base clean and will give TensorFlow a space for all of its dependencies. It is in general good practice to keep separate environments for projects especially when they have special package dependencies.

There are many possible options when creating an environment with conda including adding packages with specific version numbers and specific Python base versions. This is sometimes useful if you want fine control and it also helps with version dependency resolution. Here we will keep it simple and just create a named environment and then activate that environment and install the packages we want inside of that.

From a command line do,

conda create --name tf-gpu

I named the environment ‘tf-gpu’ but you can use any name you want.

Now activate the environment, (I’ll show my full terminal prompt and output instead of just the commands)

dbk@i9:~$ source activate tf-gpu
(tf-gpu) dbk@i9:~$

You can see that my shell prompt is now preceded by the name of the environment.

Note: newer versions of ‘conda’ use a different syntax, “conda activate tf-gpu” and “conda deactivate” (deactivate does not take an environment name).
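
If you are ever unsure which environment a shell or script is running in, a few lines of Python can tell you. This is a minimal sketch with a made-up helper name; it relies on the CONDA_DEFAULT_ENV variable that conda sets on activation.

```python
import os
import sys


def active_conda_env(environ=None):
    """Return the name of the active conda environment, or None if there is none."""
    environ = os.environ if environ is None else environ
    # conda sets CONDA_DEFAULT_ENV when an environment is activated.
    return environ.get("CONDA_DEFAULT_ENV")


if __name__ == "__main__":
    print("active environment:", active_conda_env())
    # sys.prefix shows the directory the running interpreter belongs to.
    print("interpreter prefix:", sys.prefix)
```

Keep in mind that an environment created with no packages has no Python of its own yet, so until one is installed into it (the TensorFlow install in the next section pulls one in) the interpreter prefix may still point at the base Anaconda directory.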

Install TensorFlow from the Anaconda Cloud Repositories

The TensorFlow documentation is in general very good but the install documentation does not present a very good way to get a setup working on a workstation.

Do not follow the install documentation from the TensorFlow site! If you do you will have a painful time getting things working and you will have a nearly impossible to maintain install setup.

There is no good reason to do an (old) CUDA install and a pip install when you are using Anaconda Python. There is an up-to-date official Anaconda package for TensorFlow with GPU acceleration that includes all of the needed CUDA dependencies and it is well optimized for performance.

Let’s install TensorFlow with GPU acceleration and all of the dependencies.

(tf-gpu) dbk@i9:~$ conda install tensorflow-gpu

That’s it! That’s all you need to do!

Just running that one short command above gave the following list of packages to be installed. They are installed and isolated in the “tf-gpu” environment we created. There is no nasty mess on your system!

I’ve cut some of the packages out and just left the “most interesting” ones in this output listing.

The following NEW packages will be INSTALLED:
    cudatoolkit:       9.0-h13b8566_0         
    cudnn:             7.1.2-cuda9.0_0        
    cupti:             9.0.176-0              
    intel-openmp:      2018.0.0-8             
    mkl:               2018.0.2-1             
    mkl_fft:           1.0.1-py36h3010b51_0   
    mkl_random:        1.0.1-py36h629b387_0   

    libgcc-ng:         7.2.0-hdf63c60_3       
    libgfortran-ng:    7.2.0-hdf63c60_3       
    libprotobuf:       3.5.2-h6f1eeef_0       
    libstdcxx-ng:      7.2.0-hdf63c60_3       
    numpy:             1.14.3-py36hcd700cb_1  
    numpy-base:        1.14.3-py36h9be14a7_1  
    protobuf:          3.5.2-py36hf484d3e_0   
    python:            3.6.5-hc3d631a_2       
    tensorboard:       1.8.0-py36hf484d3e_0   
    tensorflow:        1.8.0-hb11d968_0       
    tensorflow-base:   1.8.0-py36hc1a7637_0   
    tensorflow-gpu:    1.8.0-h7b35bdc_0       

You now have GPU accelerated TensorFlow 1.8, CUDA 9.0, cuDNN 7.1, Intel’s MKL libraries (that are linked into numpy) and TensorBoard. Nice!

Note: newer ‘tensorflow-gpu’ packages will install later versions than the ones listed here.
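
Before going further it is worth confirming that TensorFlow can actually see the GPU. The following is a sketch, not an official recipe: visible_gpus is a made-up helper, and it uses the internal device_lib module because that one is present in both TF 1.x and 2.x builds. Run it with the tf-gpu environment activated.

```python
def visible_gpus():
    """Return the names of GPU devices TensorFlow can see, or None without TensorFlow."""
    try:
        # device_lib is an internal module, but it is present in TF 1.x and 2.x.
        from tensorflow.python.client import device_lib
    except ImportError:
        return None
    return [d.name for d in device_lib.list_local_devices()
            if d.device_type == "GPU"]


if __name__ == "__main__":
    gpus = visible_gpus()
    if gpus is None:
        print("TensorFlow is not installed in this environment")
    elif gpus:
        print("TensorFlow sees:", gpus)
    else:
        print("TensorFlow is installed but found no GPU")
```

An empty list usually points back to the display driver issue described in the note near the top of this post.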

Create a Jupyter Notebook Kernel for the TensorFlow Environment

You can work with an editor and the command line and you often want to do that, but, Jupyter notebooks are great for doing machine learning development work. In order to get Jupyter notebook to work the way you want with this new TensorFlow environment you will need to add a “kernel” for it.

With your tf-gpu environment activated do,

(tf-gpu) dbk@i9:~$ conda install ipykernel jupyter

Now create the Jupyter kernel,

(tf-gpu) dbk@i9:~$ python -m ipykernel install --user --name tf-gpu --display-name "TensorFlow-GPU"

With this “tf-gpu” kernel installed, when you open a Jupyter notebook you will now have an option to start a new notebook with this kernel.
Jupyter kernel for TF
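
If the kernel does not show up, or it starts the wrong Python, you can look at the kernel.json file that the ipykernel command wrote. The sketch below assumes the usual per-user location on Linux (~/.local/share/jupyter/kernels/tf-gpu); the helper names are made up for illustration.

```python
import json
import os


def parse_kernel_spec(text):
    """Return (display_name, interpreter) from the contents of a kernel.json file."""
    spec = json.loads(text)
    # argv[0] is the Python executable Jupyter launches for this kernel.
    return spec["display_name"], spec["argv"][0]


def kernel_info(kernel_dir):
    """Read kernel.json from a kernel directory and parse it."""
    with open(os.path.join(kernel_dir, "kernel.json")) as f:
        return parse_kernel_spec(f.read())


if __name__ == "__main__":
    # Assumed default location for `--user` kernels on Linux.
    path = os.path.expanduser("~/.local/share/jupyter/kernels/tf-gpu")
    if os.path.isdir(path):
        print(kernel_info(path))
    else:
        print("no kernel found at", path)
```

The interpreter path it prints should sit inside your tf-gpu environment directory, not the base Anaconda install.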

An Example using Keras with TensorFlow Backend

Note: I have a newer post that might be better to follow than this example. How to Install TensorFlow with GPU Support on Windows 10 (Without Installing CUDA) UPDATED! Yes, even though that is a Win10 install everything after getting Anaconda Python working is pretty much the same on Windows and Linux!

In order to check everything out let’s set up LeNet-5 using Keras (with our TensorFlow backend) using a Jupyter notebook with our “TensorFlow-GPU” kernel. We’ll train the model on the MNIST digits data-set.

Install Keras

With the tf-gpu environment activated do,

(tf-gpu) dbk@i9:~$ conda install keras-gpu

You now have Keras installed utilizing your GPU accelerated TensorFlow. It is that easy!

Note: the newer ‘tensorflow-gpu’ includes Keras so you don’t need to do a separate install.

Launch a Jupyter Notebook

With the tf-gpu environment activated start Jupyter,

(tf-gpu) dbk@i9:~$ jupyter notebook

From the ‘New’ drop-down menu select the ‘TensorFlow-GPU’ kernel that you added (as seen in the image in the last section). You can now start writing code!

MNIST example

Following are Python snippets you can copy into cells in your Jupyter notebook to set up and train LeNet-5 with MNIST digits data.

Import dependencies

import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.layers import Flatten,  MaxPooling2D, Conv2D
from keras.callbacks import TensorBoard

Load and process the MNIST data

(X_train,y_train), (X_test, y_test) = mnist.load_data()

X_train = X_train.reshape(60000,28,28,1).astype('float32')
X_test = X_test.reshape(10000,28,28,1).astype('float32')

X_train /= 255
X_test /= 255

n_classes = 10
y_train = keras.utils.to_categorical(y_train, n_classes)
y_test = keras.utils.to_categorical(y_test, n_classes)
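
To make the two preprocessing steps concrete, here is a plain-Python sketch of what they do to the data. The helper names are made up for illustration; the notebook code above does the same thing on whole NumPy arrays, with keras.utils.to_categorical handling the labels.

```python
def scale_pixels(pixels):
    """Map 8-bit pixel values (0..255) to floats in the range 0.0..1.0."""
    return [p / 255 for p in pixels]


def one_hot(labels, n_classes):
    """Turn integer class labels into one-hot rows, as to_categorical does."""
    return [[1.0 if i == label else 0.0 for i in range(n_classes)]
            for label in labels]


print(scale_pixels([0, 51, 255]))  # [0.0, 0.2, 1.0]
print(one_hot([3, 0], 4))          # [[0.0, 0.0, 0.0, 1.0], [1.0, 0.0, 0.0, 0.0]]
```

The one-hot rows are what the softmax output layer and the categorical cross-entropy loss expect as targets.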

Create the LeNet-5 neural network architecture

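model = Sequential()
model.add(Conv2D(32, kernel_size=(3,3), activation='relu', input_shape=(28,28,1)) )
model.add(Conv2D(64, kernel_size=(3,3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.25))  # dropout rate is an assumed value
model.add(Flatten())  # flatten the feature maps before the dense layers
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))  # dropout rate is an assumed value
model.add(Dense(n_classes, activation='softmax'))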

Compile the model

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

Set log data to feed to TensorBoard for visual analysis

tensor_board = TensorBoard('./logs/LeNet-MNIST-1')

Train the model

model.fit(X_train, y_train, batch_size=128, epochs=15, verbose=1,
          validation_data=(X_test,y_test), callbacks=[tensor_board])

The results

After running that training for 15 epochs the last epoch gave,

Epoch 15/15
60000/60000 [==============================] - 5s 83us/step - loss: 0.0188 - acc: 0.9939 - val_loss: 0.0303 - val_acc: 0.9917

Not bad! Training accuracy 99.39% and Validation accuracy 99.17%

Look at the job run with TensorBoard

Start TensorBoard

(tf-gpu) dbk@i9:~$ tensorboard --logdir=./logs --port 6006

It will give you an address similar to http://i9:6006. Open that in your browser and you will be greeted with (the wonderful) TensorBoard. These are the plots it had for that job run,
TensorBoard output

That was a model with 1.2 million trainable parameters and a dataset with 60,000 images. It took 1 minute and 9 seconds utilizing the NVIDIA GeForce 1080Ti in my system!
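
The 1.2 million figure can be checked by hand. The sketch below assumes the layer stack is two unpadded 3x3 convolutions (32 and 64 filters), a 2x2 max pooling step, a flatten, and then Dense layers of 128 and 10 units; pooling, flatten and dropout layers add no parameters of their own. The helper names are made up for illustration.

```python
def conv2d_params(kernel, in_channels, filters):
    """Weights plus one bias per filter for a square-kernel Conv2D layer."""
    return kernel * kernel * in_channels * filters + filters


def dense_params(inputs, units):
    """Weights plus one bias per unit for a Dense layer."""
    return inputs * units + units


# Spatial size: 28 -> 26 -> 24 after two unpadded 3x3 convolutions,
# then 12 after the assumed 2x2 max pooling.
side = (28 - 2 - 2) // 2
flat = side * side * 64           # values fed to the first Dense layer

total = (conv2d_params(3, 1, 32)
         + conv2d_params(3, 32, 64)
         + dense_params(flat, 128)
         + dense_params(128, 10))
print(total)  # 1199882
```

Almost all of the parameters sit in the first Dense layer, which is why the pooling step before it matters so much for model size.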

Happy computing! –dbk