Install TensorFlow with GPU Support on Windows 10 (without a full CUDA install)

Table of Contents

I have written a new post that does the TensorFlow install without requiring any CUDA install. It uses only Anaconda Python packages, including all CUDA and cuDNN dependencies.

The Best Way to Install TensorFlow with GPU Support on Windows 10 (Without Installing CUDA)

I recommend you use the new guide. However, you may still find the present post interesting to see how I handled the CUDA dependencies with DLL’s and PATH.

Python environment setup with Anaconda Python
* Install Anaconda Python
Create a Python “virtual environment” for TensorFlow using conda
* “activate” the environment
Install TensorFlow-GPU from the Anaconda Community Repositories
“Interlude” — Install CUDA 9.0 and cuDNN 7.0 libraries (DLL’s) for TensorFlow
* Update your NVIDIA display driver
* Download CUDA 9.0 (and Patch-2)
* Create a “personal” lib directory
* Start the CUDA install …
Create a Jupyter Notebook Kernel for the TensorFlow Environment
An Example using Keras with TensorFlow Backend
Look at the job run with TensorBoard

In this post I’ll walk you through the best way I have found so far to get a good TensorFlow work environment on Windows 10 including GPU acceleration. I’ll go through how to install just the needed libraries (DLL’s) from CUDA 9.0 and cuDNN 7.0 to support TensorFlow 1.8. I’ll also go through setting up Anaconda Python, creating an environment for TensorFlow, and making that environment available for use with Jupyter notebook. As a “non-trivial” example of using this setup we’ll go through training LeNet-5 with Keras using TensorFlow with GPU acceleration. We’ll get a setup that is 18 times faster than using the CPU alone.

TensorFlow is a very important Machine/Deep Learning framework. If you want to set up a workstation using Windows 10 with CUDA GPU acceleration support for TensorFlow then this guide will hopefully help you get your machine learning environment up and running without a lot of trouble.

Last week I wrote a post titled, Install TensorFlow with GPU Support the Easy Way on Ubuntu 18.04 (without installing CUDA).

The title for this post was supposed to be Install TensorFlow with GPU Support the Easy Way on Windows 10 (without installing CUDA).

Unfortunately, I had to drop “the Easy Way” and “(without installing CUDA)” for the Windows 10 version.

The Ubuntu 18.04 setup was greatly simplified by using an (official) Anaconda Python module that contained all of the CUDA library dependencies for the current version of TensorFlow. I was going to deliberately try to keep these two posts as similar as possible. However, for Windows there seems to be no alternative to installing the CUDA libraries.

Nearly all Machine Learning/AI frameworks and projects are being developed on Linux. You also see developers using MacOS, in particular with MacBooks for their personal computing platform. MacOS and Linux are UNIX-like at their core and easily inter-operate. Macs have the huge disadvantage that they don’t have the possibility of GPU acceleration with CUDA. What if you have a nice workstation or gaming laptop with a powerful NVIDIA GPU in it and you want to do Machine Learning? Linux is most often the officially supported platform for this kind of work but … What about Windows?

You CAN use Windows! Windows 10 is fine. If you want/need interoperability then you can easily enable WSL (Windows Subsystem for Linux). [ I really like WSL and I highly recommend you check it out, but you won’t need it for what we do in this post. ] Even though most ML/AI framework and application development is being done with Linux, Windows 10 is gaining more official support and has good community support. It’s a good platform and I can certainly understand why people who use it for their day-to-day work and recreation would want to use it. It sometimes requires a little more work to get things working with Windows, which is the case in this post, but it does work and the performance and work environment are very good.

The focus here is to get a good GPU accelerated TensorFlow (with Keras and Jupyter) work environment up and running for Windows 10 without making a mess on your system. We will need to install (non-current) CUDA 9.0 and cuDNN-7 libraries for TensorFlow 1.8 but I’ll do this in a fairly self-contained way and will only install the needed libraries (DLL’s).

Python environment setup with Anaconda Python

I highly recommend you use Anaconda Python. If you need some arguments for using Python take a look at my post Should You Learn to Program with Python. For arguments on why you should use the Anaconda Python distribution see, How to Install Anaconda Python and First Steps for Linux and Windows.

Anaconda is focused toward data-science and machine learning. It installs cleanly on your system in a single directory so it doesn’t make a mess in your system’s application and library directories. It is also performance optimized and links important numerical packages like numpy to Intel’s MKL.

Python is a great “OS equalizer”! Once you have a Python environment setup it will be mostly the same on Linux, MacOS and Windows.

Install Anaconda Python

1) Download and check the installer

Go to the Anaconda downloads page https://www.anaconda.com/downloads and get the Python 3.6 version.
It’s good to check the file hash to be sure you got a good copy.
Open Powershell and cd to the directory where you downloaded the Anaconda installer exe file. In my case that is the Downloads directory.

cd Downloads

Then run

 Get-FileHash .\Anaconda3-5.2.0-Windows-x86_64.exe -Algorithm SHA256

Look up the published hash for the file you downloaded and check that it matches.
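If you would rather do the comparison in Python than by eye, here is a minimal sketch using only the standard library. The function names are made up for illustration, and the hash you compare against has to come from the Anaconda site; nothing here knows the real value.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, published_hex):
    """Compare ignoring case and whitespace (Get-FileHash prints upper-case hex)."""
    return sha256_of(path) == published_hex.strip().lower()


# Usage (hypothetical values, paste the hash published for your installer):
# matches_published_hash(r".\Anaconda3-5.2.0-Windows-x86_64.exe", "<published hash>")
```

Reading in chunks keeps memory use small, which matters a little for an installer that is several hundred megabytes.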

Run the installer

Since you have Powershell open in the directory with the Anaconda installer exe file you can start it by just typing its name (type A and hit tab to expand the name) and hitting return.

.\Anaconda3-5.2.0-Windows-x86_64.exe

The installer GUI should now be running.

You will be asked to accept a license agreement …
“Select Install Type” I recommend you choose “Just Me” since this is part of your personal development environment.
“Choose Install Location” I recommend you keep the default which is at the top level of your user directory.
“Advanced Installation Options”

My recommendation is to check both boxes. Make Anaconda Python 3 your default Python. And, as a developer you really should be aware of your PATH environment variable. So yes, go ahead and let the installer add the Anaconda bin directory to your PATH. If you haven’t looked at your environment variables in a while you should have a look. Do a search from the Windows menu for “environment variables”. You should find a settings panel that will show your account environment and system-wide environment. After the Anaconda install you will see that its application and library directories have been prepended to your user PATH. We’ll look at it, and modify it, after installing the CUDA libraries.

VSCode?

Next you will be asked if you want to install Microsoft VSCode. VSCode is a really good editor and it is available for free on Windows, Linux and MacOS. However, if you are interested in trying it out I would recommend that you go to the VSCode website and check it out first. If you think you want to try it, then go ahead and download it and install it yourself. I like VSCode but I usually use the Atom editor which also runs on Windows, Linux and MacOS. If you are checking out editors I recommend you try both of these as well as Sublime Text. They are all great editors!

Check your install

If you still have Powershell open you will need to close it and restart it so that it will re-read your environment variables and pick up your PATH which now includes the Anaconda Python directories. With Powershell reopened you can check that you now have Anaconda Python 3 as your default Python.

python --version

Python 3.6.5 :: Anaconda custom (64-bit)
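You can ask the same question from inside Python, which also tells you which executable is actually being run. This is only a sketch: the function name is made up, and looking for "conda" in the version string or executable path is a heuristic I am assuming, not an official check.

```python
import sys


def describe_interpreter():
    """Report which Python is running and whether it looks like an Anaconda build."""
    executable = sys.executable or ""
    return {
        "executable": executable,
        "version": sys.version.split()[0],
        # Heuristic (assumption): Anaconda builds of this era mention
        # "Anaconda" in sys.version and live under an Anaconda directory.
        "looks_like_anaconda": "conda" in (sys.version + executable).lower(),
    }


for key, value in describe_interpreter().items():
    print(key, "=", value)
```

If the executable path is not under your Anaconda directory, your PATH is picking up a different Python first.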

Update your base Anaconda packages

conda is a powerful package and environment management tool for Anaconda. We’ll use conda from Powershell to update our base Python install. Run the following commands. It may take some time to do this since there are a lot of modules to update.

conda update conda
conda update anaconda
conda update python
conda update --all

That should bring your entire base Anaconda install up to the latest packages. (Anaconda 5.2 had just been released when I wrote this and nearly everything was fully up-to-date.)

Anaconda Navigator

There is a GUI for Anaconda called anaconda-navigator. I personally find it distracting/confusing/annoying and prefer using conda from the command-line. Your taste may differ! … and my opinion is subject to change if they keep improving it. If you are new to Anaconda then I recommend you read up on conda even (or especially!) if you are thinking about using the “navigator” GUI.

Create a Python “virtual environment” for TensorFlow using `conda`

You should set up an environment for TensorFlow separate from your base Anaconda environment. This keeps your base clean and will give TensorFlow a space for all of its dependencies. It is in general good practice to keep separate environments for projects, especially when they have special package dependencies.

There are many possible options when creating an environment with conda including adding packages with specific version numbers and specific Python base versions. This is sometimes useful if you want fine control and it also helps with version dependencies resolution. Here we will keep it simple and just create a named environment and then activate that environment and install the packages we want inside of that.

From a command line do,

conda create --name tf-gpu

I named the environment ‘tf-gpu’ but you can use any name you want.

“activate” the environment

Now activate the environment, (I’ll show my full terminal prompt and output instead of just the commands)

Note: for some reason Powershell will not run the “activate” script! You will need to start a “CMD” shell to do this. You can start a CMD shell from Powershell (notice how the “PS” that was at the beginning of the Powershell prompt disappears). Having to switch to CMD is an annoyance but you can easily switch back and forth in a Powershell window.

PS C:\Users\don> cmd
Microsoft Windows [Version 10.0.16299.461]
(c) 2017 Microsoft Corporation. All rights reserved.

C:\Users\don> activate tf-gpu

(tf-gpu) C:\Users\don>

You can see that my CMD shell prompt is now preceded by the name of the environment (tf-gpu). Any conda package (or pip) installs will now be local to this environment.
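If you want a script or notebook to confirm it is running inside the right environment, one option is to read the CONDA_DEFAULT_ENV variable that conda sets on activation. A minimal sketch, with helper names that are made up for illustration:

```python
import os


def active_conda_env(environ=None):
    """Return the name of the active conda environment, or None if none is set."""
    environ = os.environ if environ is None else environ
    return environ.get("CONDA_DEFAULT_ENV")


def in_env(name, environ=None):
    """True when the active conda environment has the given name."""
    return active_conda_env(environ) == name


print("active environment:", active_conda_env())
print("inside tf-gpu:", in_env("tf-gpu"))
```

The optional environ argument is there only so the check can be exercised without actually activating anything.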

Install TensorFlow-GPU from the Anaconda Community Repositories

My preference would be to install the “official” Anaconda maintained TensorFlow-GPU package like I did for Ubuntu 18.04. Unfortunately, the Anaconda maintained Windows version is way out-of-date (version 1.1). There is a current CPU-only version for Windows but we want GPU acceleration.

A search for “tensorflow” on the Anaconda Cloud will list the available packages from Anaconda and the community. There is a package “aaronzs / tensorflow-gpu 1.8.0” listed near the top that has builds for Linux and Windows. This is the only up-to-date package I know of that is working correctly with Windows 10. This package was built by, and is being nicely maintained by, Aaron Sun. You can check out his GitHub page for the project.

Note: I tried using a pip install from PyPI of the Google package but failed to get it working correctly together with Keras, which I also want to install (and will use in a later example).

Let’s install TensorFlow with GPU acceleration in this conda environment. [ We will take care of the CUDA dependencies in the next section. ]

(tf-gpu) C:\Users\don>conda install -c aaronzs tensorflow-gpu

At this point if you start Python and import tensorflow it will fail to load because of missing DLL’s … let’s fix that …

“Interlude” — Install CUDA 9.0 and cuDNN 7.0 libraries (DLL’s) for TensorFlow

The current CUDA version is 9.2 but we will need 9.0 for the current version of TensorFlow.

We will only install the libraries (DLL’s), NOT the full CUDA Toolkit or the drivers that come with it!

If you have CUDA 9.2 already installed or plan to install it what we are doing will hopefully not interfere with that. [… but you will need to be mindful of your PATH]

Update your NVIDIA display driver

First, if you are doing this then I assume you have a modern NVIDIA GPU in your system! I am doing this on a laptop with a GTX 1070. I have just updated my display driver to the latest available from the NVIDIA Drivers site. I am running driver version 397.93.

Download CUDA 9.0 (and Patch-2)

You will have to get the CUDA 9.0 install files from the CUDA Toolkit Archives. There is a “Base Installer” and two Patch files. Get the Base Installer file and the Patch 2 file. You don’t need Patch 1.

Create a “personal” lib directory

Create a directory in your user directory for the CUDA and cuDNN libraries. I created a “lib” directory and “CUDA9.0” under that,

C:\Users\don\lib\CUDA9.0

Start the CUDA install …

On the “Install options” panel select “Custom (Advanced)”.

Then on the “Custom installation options” panel un-select everything except “Runtime — Libraries”,

[Image: CUDA lib select]

In the panel that asks for the install location “Browse” to the directory you created for the libraries and select that. Then finish the install.

You should now have a “bin” directory in that install directory. That is where all of the DLL’s live that TensorFlow is going to need.

[Image: CUDA bin dir]

We have more stuff to put in there …

Do the same thing we did above with the Base Installer but this time using the “Patch 2 installer”. Its name should be something like cuda_9.0.176.2_windows.exe. You only want the “Runtime” part of that install. It will update cublas64_90.dll and nvblas64_90.dll. Be sure to have it install in the same directory as you used for the base install.

Install cuDNN 7.0

We have most of the needed libraries but still need to get the CUDA Deep Neural Network library, cuDNN v 7.0.

You will have to have an NVIDIA developer account to get it! Go to https://developer.nvidia.com/rdp/form/cudnn-download-survey and either log in or “Join”. After you log in it will bounce you back to the developer home page. To save you the trouble of trying to find the cuDNN page, it is at https://developer.nvidia.com/cudnn and once you get there you will need to click a license agreement box before you see the download page. Here you will need to go to “Archived cuDNN Releases”. Look for “Download cuDNN v7.0.5 (Dec 5, 2017), for CUDA 9.0” and then download the Windows 10 file.

That will be a “zip” file called cudnn-9.0-windows10-x64-v7.zip. Open that file and go to “cuda\bin”. There you will find cudnn64_7.dll. Copy that file to the bin directory that has all of your other CUDA DLL’s. In my case that is C:\Users\don\lib\CUDA9.0\bin
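The copy step can also be scripted. The sketch below pulls a single member out of a zip archive and writes it into a target directory using only the standard library. The function name is made up, and the member path "cuda/bin/cudnn64_7.dll" is my assumption about how the "cuda\bin" folder described above is stored inside the archive.

```python
import shutil
import zipfile
from pathlib import Path, PurePosixPath


def copy_member_from_zip(zip_path, member, dest_dir):
    """Copy one file out of a zip archive into dest_dir, dropping its folder prefix."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    # Zip member names always use forward slashes, whatever the OS.
    dest = dest_dir / PurePosixPath(member).name
    with zipfile.ZipFile(zip_path) as zf:
        with zf.open(member) as src, open(dest, "wb") as dst:
            shutil.copyfileobj(src, dst)
    return dest


# Usage with the names from this post (adjust the paths to your own system):
# copy_member_from_zip(
#     r"C:\Users\don\Downloads\cudnn-9.0-windows10-x64-v7.zip",
#     "cuda/bin/cudnn64_7.dll",          # assumed member path inside the zip
#     r"C:\Users\don\lib\CUDA9.0\bin",
# )
```

zf.open raises KeyError if the member name is wrong, so a typo fails loudly instead of silently copying nothing.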

Fix your PATH environment variable.

Part of the TensorFlow install instructions say,

“Ensure that you append the relevant Cuda pathnames to the %PATH% environment variable as described in the NVIDIA documentation.”

Yes, now we will take a look at our environment variables. Open “Control Panel”, go to “System”, select the “Advanced” tab and click the “Environment Variables” button.

[Image: PATH before]

Two things to notice here: you can see the directories that Anaconda added to my User PATH, and there are now System variables for CUDA_PATH and CUDA_PATH_v9_0. If you had another CUDA install you may see something different for those CUDA paths. You should be OK to change these back to what they should be if they got munged from doing the old CUDA 9.0 library install. The only thing that should matter for our TensorFlow install is what we add next.

Select your User PATH line and then click on “Edit…”. Add a “New” line with the path to the directory where your CUDA and cuDNN libraries are located.

[Image: PATH add]

Click OK and then close that panel. We can now test to see if TensorFlow-GPU is working.
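Before starting TensorFlow it can be worth confirming that the DLL’s really are reachable through PATH. Here is a minimal sketch that walks the PATH entries and reports where each file is found. The three file names are just the ones mentioned in this post, not a complete list of what TensorFlow loads, and the function name is made up. Run it from a freshly opened shell so the new PATH is in effect.

```python
import os
from pathlib import Path

# DLL names mentioned in this post; TensorFlow needs others as well.
DLLS_FROM_THIS_POST = ("cublas64_90.dll", "nvblas64_90.dll", "cudnn64_7.dll")


def locate_on_path(filenames, path_value=None):
    """Map each file name to the first PATH directory containing it (or None)."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    entries = [e for e in path_value.split(os.pathsep) if e]
    found = {}
    for name in filenames:
        found[name] = None
        for entry in entries:
            if (Path(entry) / name).is_file():
                found[name] = str(Path(entry))
                break
    return found


for dll, where in locate_on_path(DLLS_FROM_THIS_POST).items():
    print(dll, "->", where or "NOT FOUND on PATH")
```

Reporting the first match mirrors how Windows searches PATH in order, which is useful if you also have a CUDA 9.2 install with its own bin directory.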

Check That TensorFlow is working with your GPU

Close any Powershell or CMD shells you had open and reopen one. You need to do that so that your new PATH settings get read in. You can use a CMD shell to activate your tf-gpu environment, start Python and run the following lines,

>>> import tensorflow as tf
>>> hello = tf.constant('Hello, TensorFlow!')
>>> sess = tf.Session()
>>> print(sess.run(hello))

My session including the output looked like this, (there was a long delay during this “first run” session startup )

PS C:\Users\don> cmd
Microsoft Windows [Version 10.0.16299.461]
(c) 2017 Microsoft Corporation. All rights reserved.

C:\Users\don> activate tf-gpu

(tf-gpu) C:\Users\don>python
Python 3.6.5 |Anaconda custom (64-bit)| (default, Mar 29 2018, 13:32:41) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> hello = tf.constant('Hello, TensorFlow')
>>> sess = tf.Session()
2018-06-01 16:37:57.666250: I T:\src\github\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2018-06-01 16:37:57.967130: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1356] Found device 0 with properties:
name: GeForce GTX 1070 major: 6 minor: 1 memoryClockRate(GHz): 1.645
pciBusID: 0000:01:00.0
totalMemory: 8.00GiB freeMemory: 6.62GiB
2018-06-01 16:37:57.975868: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1435] Adding visible gpu devices: 0
2018-06-01 16:40:10.162112: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-06-01 16:40:10.168554: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:929]      0
2018-06-01 16:40:10.171214: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:942] 0:   N
2018-06-01 16:40:10.174162: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6400 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1070, pci bus id: 0000:01:00.0, compute capability: 6.1)
>>> print(sess.run(hello))
b'Hello, TensorFlow'
>>>

Yea! PATH’s are correct and everything is working. You can see that it has GPU support.
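If you would rather get a yes/no answer than read through the log lines, one option is to ask TensorFlow for its device list. This is a sketch, not part of the original session: it uses device_lib.list_local_devices() from tensorflow.python.client, the function name is made up, and it returns None when TensorFlow cannot be imported at all (which is also what happens when the DLL’s are missing).

```python
def visible_gpu_names():
    """Return the GPU device names TensorFlow reports, or None if the import fails."""
    try:
        # Importing inside the function keeps an import failure contained here.
        from tensorflow.python.client import device_lib
    except ImportError:
        return None
    return [
        device.name
        for device in device_lib.list_local_devices()
        if device.device_type == "GPU"
    ]


gpus = visible_gpu_names()
if gpus is None:
    print("TensorFlow could not be imported; check the environment and PATH.")
elif not gpus:
    print("TensorFlow imported, but it reports no GPU devices.")
else:
    print("GPU devices:", gpus)
```

An empty list means TensorFlow loaded but only sees the CPU, which usually points at the driver or the CUDA libraries, not the Python environment.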

Next we will do something a little more useful and fun with Keras, after we configure Jupyter notebook to use our ‘tf-gpu’ environment.

Create a Jupyter Notebook Kernel for the TensorFlow Environment

You can work with an editor and the command line, and you often want to do that, but Jupyter notebooks are great for doing machine learning development work. In order to get Jupyter notebook to work the way you want with this new TensorFlow environment you will need to add a “kernel” for it.

With your tf-gpu environment activated do,

(tf-gpu) C:\Users\don>conda install ipykernel

Now create the Jupyter kernel,

(tf-gpu) C:\Users\don>python -m ipykernel install --user --name tf-gpu --display-name "TensorFlow-GPU"

With this “tf-gpu” kernel installed, when you start Jupyter notebook you will now have an option to open a new notebook using this kernel.

[Image: Jupyter kernel for TF]
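A Jupyter kernel is described by a small kernel.json file, so you can check what the install command registered by reading it. The sketch below does that with the standard library. The function name is made up, and the location I use for a --user install on Windows (%APPDATA%\jupyter\kernels\tf-gpu) is an assumption; running jupyter kernelspec list will show the real directory on your machine.

```python
import json
import os
from pathlib import Path


def read_kernel_spec(kernel_dir):
    """Return (display_name, argv) from the kernel.json in a kernel directory."""
    with open(Path(kernel_dir) / "kernel.json", encoding="utf-8") as fh:
        spec = json.load(fh)
    return spec.get("display_name"), spec.get("argv", [])


if __name__ == "__main__":
    appdata = os.environ.get("APPDATA")  # only set on Windows
    if appdata:
        # Assumed location for a --user kernel named tf-gpu.
        kernel_dir = Path(appdata) / "jupyter" / "kernels" / "tf-gpu"
        if (kernel_dir / "kernel.json").is_file():
            name, argv = read_kernel_spec(kernel_dir)
            print("display name:", name)
            print("python used :", argv[0] if argv else None)
```

The first argv entry is the Python executable the kernel launches, so it should point into your tf-gpu environment and not the base install.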

An Example using Keras with TensorFlow Backend

In order to check everything out let’s set up LeNet-5 using Keras (with our TensorFlow backend) using a Jupyter notebook with our “TensorFlow-GPU” kernel. We’ll train the model on the MNIST digits data-set and then open TensorBoard to look at some plots of the job run.

Install Keras

With the tf-gpu environment activated do,

(tf-gpu) C:\Users\don\projects>conda install keras-gpu

You now have Keras installed utilizing your GPU accelerated TensorFlow.

Launch a Jupyter Notebook

With the tf-gpu environment activated start Jupyter,

(tf-gpu) C:\Users\don>jupyter notebook

From the ‘New’ drop-down menu select the ‘TensorFlow-GPU’ kernel that you added (as seen in the image in the last section). You can now start writing code!

MNIST example

Following are Python snippets you can copy into cells in your Jupyter notebook to set up and train LeNet-5 with MNIST digits data.

Import dependencies

import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.layers import Flatten,  MaxPooling2D, Conv2D
from keras.callbacks import TensorBoard

Load and process the MNIST data

(X_train,y_train), (X_test, y_test) = mnist.load_data()

X_train = X_train.reshape(60000,28,28,1).astype('float32')
X_test = X_test.reshape(10000,28,28,1).astype('float32')

X_train /= 255
X_test /= 255

n_classes = 10
y_train = keras.utils.to_categorical(y_train, n_classes)
y_test = keras.utils.to_categorical(y_test, n_classes)
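To make the last step concrete: to_categorical turns each digit label into a one-hot row so that it lines up with the 10-way softmax output of the network. The pure-Python function below is only an illustration of that encoding (it is not the Keras implementation, and the name one_hot is made up).

```python
def one_hot(labels, n_classes):
    """Encode integer labels as one-hot rows, e.g. 3 -> [0, 0, 0, 1, 0, ...]."""
    rows = []
    for label in labels:
        row = [0.0] * n_classes
        row[label] = 1.0  # a single 1.0 at the index of the class
        rows.append(row)
    return rows


for digit, row in zip([5, 0], one_hot([5, 0], 10)):
    print(digit, row)
```

With categorical_crossentropy as the loss, the model’s predicted probabilities are compared against these rows directly.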

Create the LeNet-5 neural network architecture

model = Sequential()
model.add(Conv2D(32, kernel_size=(3,3), activation='relu', input_shape=(28,28,1)) )
model.add(Conv2D(64, kernel_size=(3,3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.25))
model.add(Flatten())          
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(n_classes, activation='softmax'))

Compile the model

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

Set log data to feed to TensorBoard for visual analysis

tensor_board = TensorBoard('./logs/LeNet-MNIST-1')

Train the model

model.fit(X_train, y_train, batch_size=128, epochs=15, verbose=1,
          validation_data=(X_test,y_test), callbacks=[tensor_board])

The results

After running that training for 15 epochs the last epoch gave,

Epoch 15/15
60000/60000 [==============================] - 6s 102us/step - loss: 0.0192 - acc: 0.9936 - val_loss: 0.0290 - val_acc: 0.9914

Not bad! Training accuracy 99.36% and Validation accuracy 99.14%

Look at the job run with TensorBoard

You will need “bleach” for TensorBoard so install it first,

(tf-gpu) C:\Users\don>conda install bleach

Start TensorBoard

 (tf-gpu) C:\Users\don\projects>tensorboard --logdir=./logs --port 6006

It will give you an address similar to http://stratw:6006. Open that in your browser and you will be greeted with (the wonderful) TensorBoard. These are the plots it had for that job run,

[Image: TensorBoard output]
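If TensorBoard comes up empty, the usual cause is that --logdir does not point at the directory the callback wrote to. TensorBoard reads event files whose names start with events.out.tfevents, so a quick sketch like this (function name made up) can confirm they exist under ./logs:

```python
from pathlib import Path


def find_event_files(logdir):
    """Return all TensorBoard event files found anywhere under logdir."""
    root = Path(logdir)
    if not root.is_dir():
        return []
    return sorted(p for p in root.rglob("events.out.tfevents.*") if p.is_file())


for path in find_event_files("./logs"):
    print(path)
```

Run it from the same directory you start tensorboard in, since ./logs is relative to the current working directory.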

That was a model with 1.2 million trainable parameters and a dataset with 60,000 images. It took 1 minute and 26 seconds utilizing the NVIDIA GeForce GTX 1070 in my laptop system! For reference it took 26 minutes using all cores at 100% of the Intel 6700HQ CPU in that system. That’s an 18-fold speedup on the GPU!
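Both numbers can be checked by hand from the model definition and the timings above. The sketch below counts weights plus biases per layer, assuming the Keras default of no padding on the Conv2D layers (so 28x28 shrinks to 26x26, then 24x24, then 12x12 after pooling). The helper names are made up for illustration.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * filters + filters


def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return inputs * units + units


conv1 = conv2d_params(3, 3, 1, 32)    # 28x28x1  -> 26x26x32
conv2 = conv2d_params(3, 3, 32, 64)   # 26x26x32 -> 24x24x64
flat = 12 * 12 * 64                   # after 2x2 max pooling, then Flatten
dense1 = dense_params(flat, 128)
dense2 = dense_params(128, 10)
total = conv1 + conv2 + dense1 + dense2
print(total)  # 1199882, i.e. about 1.2 million

gpu_seconds = 1 * 60 + 26   # 1 minute 26 seconds
cpu_seconds = 26 * 60       # 26 minutes
print(round(cpu_seconds / gpu_seconds, 1))  # 18.1
```

Almost all of the parameters sit in the first Dense layer; the pooling and dropout layers have none.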

Happy computing! –dbk
