Introduction
In June of 2018 I wrote a post titled The Best Way to Install TensorFlow with GPU Support on Windows 10 (Without Installing CUDA). That post has served many individuals as a guide for getting a good GPU-accelerated TensorFlow work environment running on Windows 10 without needless installation complexity. It's very satisfying to me personally to have been able to help so many people get started with TensorFlow on Windows 10! However, that guide is nearly a year old now and has needed an update for some time. I've been promising to do this in my comment replies, so here it is.
This post will guide you through a relatively simple setup for a good GPU accelerated work environment with TensorFlow (with Keras and Jupyter notebook) on Windows 10. You will not need to install CUDA for this!
I'll walk you through the best way I have found so far to get a good TensorFlow work environment on Windows 10 including GPU acceleration. This will be a complete working environment including,
- System preparation and NVIDIA driver update
- Anaconda Python installation
- Creating an environment for your TensorFlow configuration using "conda"
- Installing the latest stable build of TensorFlow in that environment
- Setting up Jupyter Notebook to work with your new "env"
- An example deep learning problem using TensorFlow with GPU acceleration, Keras, Jupyter Notebook, and TensorBoard visualization.
Let's do it.
Step 1) System Preparation – NVIDIA Driver Update and checking your PATH variable (Possible “Gotchas”)
This is a step that was left out of the original post and the issues presented here were the source of most difficulties that people had with the old post. The current state of your Windows 10 configuration may cause difficulties. I'll try to give guidance on things to look out for.
The primary testing for this post is on a fresh install of Windows 10 Home "October 2018 Update" on older hardware (Intel Core i7 4770 + NVIDIA GTX 980 GPU). This turns out to be a good test system because it would have failed with the old guide without the information in this step.
Check your NVIDIA Driver
This is important and I'll show you why.
Don't assume Microsoft gave you the latest NVIDIA driver! Check it and update if there is a newer version.
Right-click on your desktop and then select "NVIDIA Control Panel".
You can see that my fresh install of Windows 10 gave me a version 388 driver. That is way too old! Now click on "System Information" and then the "Components" panel. The next image shows why that 388 driver won't work with the newest TensorFlow,
The CUDA "runtime" is part of the NVIDIA driver. The CUDA runtime version has to support the version of CUDA you are using for any special software like TensorFlow that will be linking to other CUDA libraries (DLLs). As of this writing TensorFlow (v1.13) is linking to CUDA 10.0. The runtime has to be as new as, or newer than, the extra CUDA libraries you need.
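The "as new or newer" rule can be sketched in a few lines of Python. This is only an illustration, not part of the install: the helper names are hypothetical, and the 411.31 minimum is an assumption based on NVIDIA's CUDA 10.0 release notes for Windows, so check the notes for the CUDA version you actually need. The `nvidia-smi` query only works if that tool is installed and on your PATH.

```python
import subprocess

# Assumed minimum Windows driver for the CUDA 10.0 runtime (verify in NVIDIA's release notes).
MIN_DRIVER_FOR_CUDA_10_0 = "411.31"


def parse_version(text):
    """Turn a version string like '388.13' into a tuple of ints: (388, 13)."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_is_new_enough(driver_version, minimum=MIN_DRIVER_FOR_CUDA_10_0):
    """Return True if driver_version is the same as, or newer than, minimum."""
    return parse_version(driver_version) >= parse_version(minimum)


def query_driver_version():
    """Ask nvidia-smi for the installed driver version; None if it cannot be run."""
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )
    except OSError:
        return None
    if result.returncode != 0 or not result.stdout.strip():
        return None
    # One line per GPU; they share one driver, so the first line is enough.
    return result.stdout.strip().splitlines()[0].strip()


installed = query_driver_version()
if installed is None:
    print("Could not run nvidia-smi; check the NVIDIA Control Panel instead.")
else:
    verdict = "new enough" if driver_is_new_enough(installed) else "too old"
    print("Driver", installed, "is", verdict, "for CUDA 10.0")
```

With the version 388 driver from the fresh Windows install, `driver_is_new_enough` would return `False`, which matches what the control panel shows.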
Update the NVIDIA Display Driver
Even if you think you have the latest NVIDIA driver, check to be sure.
Go to [https://www.nvidia.com/Download/index.aspx] and enter the information for your GPU. Then click "search" to go to the download page,
It doesn't matter too much what GPU you put in on the search page; the latest driver supports cards all the way back to the 600 series.
Download and install the driver following the prompts.
Note: I used the "Standard" driver. If you are using an install that was done by Dell, HP, etc., they may have put their own OEM version on your system. If the standard driver doesn't work, try the "DCH" driver. Also, NVIDIA now has 2 drivers because some video processing applications were not working right. I used the "Game Ready Driver". After all, it's "Workstation by day, Battle-station by night". Right?
Check your PATH environment variable
This may not be something you think about very often, but it's a good idea to know the state of your PATH environment variable. Why? Development tools will often alter your PATH variable. If you are trying to run some code and getting errors that some library or executable cannot be found, or just having strange problems that don't seem to make sense, then your system may be grabbing something by looking at your PATH and finding a version that you are not expecting.
If you answer yes to any of the following then you should really look at your PATH,
- Have you installed Visual Studio?
- Did you install some version of CUDA?
- Have you installed Python.org Python?
- Have you tried a "pip" install of TensorFlow?
You may be reading this because you tried and failed to install TensorFlow following Google's instructions. If you feel that you made a mess on your system then you can try to do some clean-up by uninstalling what you did. But you may not have to clean up. Try to do what I suggest for the TensorFlow install. However, first look at your PATH so you know its state in case you run into strange errors.
Go to the "Start menu" and start typing PATH Variable; you should get a search result for the control panel "System Properties" advanced panel.
Click on "Environment Variables"
The PATH on my testing system is short because I haven't installed anything that would modify it.
If you have a long string then there is a great "Edit.." panel that will show you each entry and allow you to move things up or down and delete or add new entries.
The main idea to keep in mind is that when your system searches for an executable or library it will start by looking in the current directory (folder) and then go through the directories listed in your PATH. Windows builds that list from the System PATH entries followed by your User PATH entries. It keeps going until it finds the first thing that satisfies what you asked for (or fails) … but it might not be the thing you want it to find. It takes the first thing it finds. If you have folder entries in your PATH that have different versions of an executable or DLL with the same name, you can move the entry for the one you want toward the beginning of your PATH so it's found first.
Be very careful with your PATH. Don't make changes unless you know what you are doing. It should mostly be something that you are aware of for trouble-shooting.
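If you want to see which copy of a file would win, a read-only check like the following can help. It is a minimal sketch that only walks the PATH entries in order (it does not model the current-directory lookup), and `find_on_path` is a hypothetical helper name, not something that ships with Windows or Anaconda.

```python
import os


def find_on_path(filename, path=None):
    """Return every PATH directory that contains filename, in search order."""
    if path is None:
        path = os.environ.get("PATH", "")
    hits = []
    for directory in path.split(os.pathsep):
        # Skip empty entries left by stray separators.
        if directory and os.path.isfile(os.path.join(directory, filename)):
            hits.append(directory)
    return hits


# The first directory printed is the one the system would use.
for directory in find_on_path("python.exe"):
    print(directory)
```

If more than one directory is printed, you have several copies on your PATH and the order of the entries decides which one runs.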
A special note for laptops
If you have a laptop with an NVIDIA GPU (like a nice gaming laptop) then you should succeed with the instructions in this post. However, one unique problem on laptops is that you will likely have power saving control that switches your display driver back to the CPU's integrated display. A current Windows 10 setup on your laptop along with the latest driver should automatically switch your display to the NVIDIA driver when you start TensorFlow (same as starting up a game) but, if you have trouble that looks like TensorFlow is not finding your GPU then you may need to manually switch your display. You will likely find options by right clicking on your desktop.
Step 2) Python Environment Setup with Anaconda Python
I highly recommend Anaconda Python. If you need some arguments for using Python take a look at my post Should You Learn to Program with Python. For arguments on why you should use the Anaconda Python distribution see, How to Install Anaconda Python and First Steps for Linux and Windows. Another reason for using Anaconda Python in the context of installing GPU accelerated TensorFlow is that by doing so you will not have to do a CUDA install on your system.
Anaconda is focused on data science, machine learning, and scientific computing. It installs cleanly on your system in a single directory so it doesn't make a mess in your system's application and library directories. It is also performance optimized for important numerical packages like numpy, scipy, etc.
Download and Install Anaconda Python
- Go to the Anaconda downloads page https://www.anaconda.com/distribution and get the 64-Bit Python 3.7 (or newer) version.
You can download and "Run" at the same time or download to your machine and double click on the "exe" file to start the installer.
- You will be asked to accept a license agreement …
- "Select Install Type" I recommend you choose "Just Me" since this is part of your personal development environment.
- "Choose Install Location" I recommend you keep the default which is at the top level of your user directory.
- "Advanced Installation Options"
"Register Anaconda as my default Python 3.7" is recommended. "Add Anaconda to my PATH environment variable" is OK to select. However, you don't really need to do that. If you use the GUI, Anaconda Navigator, the (DOS) shell or the PowerShell link in the Anaconda folder on your start menu they will temporarily set the proper PATH environment for you without making a "permanent" change to your PATH variable. For this install I will leave it un-checked.
My personal preference is to "Add Anaconda to my PATH" because I want it to be found whenever I use Python.
Note: This version of the Anaconda distribution supports "Python environments" in PowerShell which is my personal preferred way to work with "conda" on Windows.
Check and Update your Anaconda Python Install
Go to the "Start menu" find the "Anaconda3" item and then click on the "Anaconda Powershell Prompt",
With "Anaconda Powershell" opened do a quick check to see that you now have Anaconda3 Python 3.7 as your default Python.
(base) PS>python
Python 3.7.3 (default, Mar 27 2019, 17:13:21) [MSC v.1915 64 bit (AMD64)] :: Anaconda, Inc. on win32
Type "help", "copyright", "credits" or "license" for more information.
>>>
Type `exit()` (or CTRL-Z followed by Enter, which is the Windows equivalent of CTRL-D) to exit the Python prompt.
Update your base Anaconda packages
`conda` is a powerful package and environment management tool for Anaconda. We'll use `conda` from Powershell to update the base Python install. Run the following commands. It may take some time to do this since there may be a lot of modules to update.
conda update conda
conda update anaconda
conda update python
conda update --all
That should bring your entire base Anaconda install up to the latest packages. (Everything may already be up to date.)
Anaconda Navigator
There is a GUI for Anaconda called `anaconda-navigator`. I personally find it distracting/confusing/annoying and prefer using `conda` from the command-line. Your taste may differ! … and my opinion is subject to change if they keep improving it. If you are new to Anaconda then I highly recommend that you read up on `conda` even (or especially!) if you are thinking about using the "Navigator" GUI.
Step 3) Create a Python “virtual environment” for TensorFlow using conda
You should set up an environment for TensorFlow separate from your base Anaconda Python environment. This keeps your base clean and will give TensorFlow a space for all of its dependencies. It is in general good practice to keep separate environments for projects, especially when they have special package dependencies. Think of it as a separate "name-space" for your project.
There are many possible options when creating an environment with conda including adding packages with specific version numbers and specific Python base versions. This is sometimes useful if you want fine control and it also helps with version dependency resolution. Here we will keep it simple and just create a named environment, then activate that environment and install the packages we want inside of that.
- From the "Anaconda Powershell Prompt" command line do,
conda create --name tf-gpu
I named the environment 'tf-gpu' but you can use any name you want. For example you could add the version number.
NOTE: avoid using spaces in names! Python will not handle that well and you could get strange errors. "-" and "_" are fine. (Python programmers often use underscores.)
- Now exit from the PowerShell you are using and then open a new one before you activate the new "env". This is an annoying quirk, but PowerShell will not re-read its environment until you restart it. If you activate the new "env" before you restart you will not be able to do any package installs because the needed utilities will not be on the path in the current shell until after a restart.
- "activate" the environment, (I'll show my full Powershell prompt and output instead of just the commands)
(base) PS C:\Users\don> conda info --envs
# conda environments:
#
base                  *  C:\Users\don\Anaconda3
tf-gpu                   C:\Users\don\Anaconda3\envs\tf-gpu
(base) PS C:\Users\don> conda activate tf-gpu
(tf-gpu) PS C:\Users\don>
The `conda info --envs` command shows the "envs" you have available.
After doing `conda activate tf-gpu` you can see that the prompt is now preceded by the name of the environment `(tf-gpu)`. Any conda package installs will now be local to this environment.
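Besides looking at the prompt, you can confirm the active environment from inside Python. The following is a small sketch that relies on the `CONDA_DEFAULT_ENV` variable that `conda activate` sets in the shell; the function name is hypothetical.

```python
import os
import sys


def active_conda_env(environ=None):
    """Return the environment name conda recorded for this shell, or None."""
    if environ is None:
        environ = os.environ
    return environ.get("CONDA_DEFAULT_ENV")


print("Active conda env:", active_conda_env())
# sys.prefix is the directory of the interpreter that is actually running.
print("Interpreter prefix:", sys.prefix)
```

After `conda activate tf-gpu` the first line should name `tf-gpu`, and the prefix should point into the `envs` folder shown by `conda info --envs`.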
Step 4) Install TensorFlow-GPU from the Anaconda Cloud Repositories
There is an "official" Anaconda maintained TensorFlow-GPU package for Windows 10!
A search for "tensorflow" on the Anaconda Cloud will list the available packages from Anaconda and the community. There is a package "anaconda / tensorflow-gpu 1.13.1" listed near the top that has builds for Linux and Windows. This is what we will be installing from the commands below.
This command will install the latest stable version of TensorFlow with GPU acceleration in this conda environment. (It will be the latest version maintained by the Anaconda team and may lag by a few weeks from any fresh release from Google.)
(tf-gpu) C:\Users\don> conda install tensorflow-gpu
That's it! You now have TensorFlow with NVIDIA CUDA GPU support!
This includes TensorFlow, Keras, TensorBoard, the CUDA 10.0 toolkit, and cuDNN 7.3, along with all of the dependencies. It's all in your new "tf-gpu" env, ready to use and isolated from other envs or packages on your system.
Step 5) Simple check to see that TensorFlow is working with your GPU
You can use the PowerShell window in which you activated the tf-gpu env and installed TensorFlow, or open a new one and do `conda activate tf-gpu`.
With your tf-gpu env active type the following,
python
Your prompt will change to the Python interpreter prompt. This will be a simple test, and we'll use a nice feature of recent TensorFlow releases, eager execution.
>>> import tensorflow as tf
>>> tf.enable_eager_execution()
>>> print( tf.constant('Hello from TensorFlow ' + tf.__version__) )
(that is 2 underscores before and after "version")
My session including the output looked like this,
(base) PS>conda activate tf-gpu
(tf-gpu) PS>python
Python 3.7.3 (default, Mar 27 2019, 17:13:21) [MSC v.1915 64 bit (AMD64)] :: Anaconda, Inc. on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> tf.enable_eager_execution()
>>> print( tf.constant( 'Hellow from TensorFlow ' + tf.__version__ ) )
2019-04-24 18:08:58.248433: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
2019-04-24 18:08:58.488035: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties:
name: GeForce GTX 980 major: 5 minor: 2 memoryClockRate(GHz): 1.2785
pciBusID: 0000:01:00.0
totalMemory: 4.00GiB freeMemory: 3.30GiB
2019-04-24 18:08:58.496081: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2019-04-24 18:08:58.947914: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-04-24 18:08:58.951226: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0
2019-04-24 18:08:58.953130: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0: N
2019-04-24 18:08:58.955149: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3005 MB memory) -> physical GPU (device: 0, name: GeForce GTX 980, pci bus id: 0000:01:00.0, compute capability: 5.2)
tf.Tensor(b'Hellow from TensorFlow 1.13.1', shape=(), dtype=string)
>>>
When you first run TensorFlow it outputs a bunch of information about the execution environment it is in. You can see that it found the GTX 980 in this system and added it as an execution device.
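If you would rather not read through the log lines, TensorFlow can be asked for its device list directly. This is a minimal sketch, assuming the `device_lib.list_local_devices()` helper from `tensorflow.python.client` is available in your TensorFlow build; `gpu_names` and `local_gpus` are hypothetical names used only for this illustration.

```python
def gpu_names(devices):
    """Pick out the names of GPU devices from a list of device descriptions."""
    return [d.name for d in devices if d.device_type == "GPU"]


def local_gpus():
    """List the GPUs TensorFlow can see, or None if TensorFlow is not importable."""
    try:
        from tensorflow.python.client import device_lib
    except ImportError:
        return None
    return gpu_names(device_lib.list_local_devices())


found = local_gpus()
if found is None:
    print("TensorFlow is not installed in this environment.")
elif found:
    print("TensorFlow sees these GPUs:", found)
else:
    print("TensorFlow did not find a GPU.")
```

An empty list with the tf-gpu env active is a hint to go back and re-check the driver version from Step 1.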
Next we will do something a little more useful and fun with Keras, after we configure Jupyter notebook to use our 'tf-gpu' environment.
Step 6) Create a Jupyter Notebook Kernel for the TensorFlow Environment
You can work with an editor and the command line and you often want to do that but, Jupyter notebooks are great for doing machine learning development work. In order to get Jupyter notebook to work the way you want with this new TensorFlow environment you will need to add a "kernel" for it.
With your tf-gpu environment activated do,
conda install ipykernel jupyter
Note: I installed both ipykernel and jupyter above since jupyter was not installed by default when we created the tf-gpu env. jupyter is installed by default in the (base) env.
Now create the Jupyter kernel,
python -m ipykernel install --user --name tf-gpu --display-name "TensorFlow-GPU-1.13"
You can set the "display-name" to anything you like. I included the version number here.
With this "tf-gpu" kernel installed, when you start Jupyter notebook you will now have an option to open a new notebook using this kernel.
Start a Jupyter notebook,
jupyter notebook
Look at the "New" menu,
Note: If you start a jupyter notebook from the (base) env you will see "TensorFlow-GPU-1.13" option but you will not be able to import tensorflow in that notebook because TensorFlow is only installed in the "tf-gpu" env. [You could have installed into your (base) env but, I recommend that you keep separate env's.]
Step 7) An Example Convolution Neural Network training using Keras with TensorFlow
In order to check everything out, let's set up the classic neural network LeNet-5 with Keras in a Jupyter notebook using our "TensorFlow-GPU-1.13" kernel. We'll train the model on the MNIST digits data-set and then use TensorBoard to look at some plots of the job run.
You do not need to install Keras or TensorBoard separately since they are now included with the TensorFlow install.
Activate your "tf-gpu" env
Launch "Anaconda Powershell" and then do,
conda activate tf-gpu
Create a working directory (and log directory for TensorBoard)
I like to have a directory called "projects" in my user home directory. In the projects directory I create directories for things I'm working on. Of course, you can organize your work however you like. … But I do highly recommend that you learn to use the command-line if you are not familiar with working like that. You can thank me later!
In PowerShell the following commands are useful for managing directories,
To see what directory you are in,
pwd
(if you just opened "Anaconda Powershell" you should be in your "user home directory")
To create a new directory (and additional subdirectories all at once)
Note: when you are working with "code" I highly recommend that you **do not use spaces in directory or file names**.
# in the new version 1.14 you no longer need to create the logs directory for TensorBoard
# It is still good to create a working directory
# mkdir projects/tf-gpu-MNIST/logs
mkdir projects/tf-gpu-MNIST
That one command above gives you a work directory, "tf-gpu-MNIST", inside "projects". (The older, commented-out form of the command also created a "logs" subdirectory.)
Note: In PowerShell you can use "/" or "\" to separate directories. (It has many commands that would be the same in Linux and you can use those alternatively to "DOS" like commands.)
To change directory use "cd"
cd projects/tf-gpu-MNIST
(For completeness) To delete a directory you can use the ` rmdir` command
IMPORTANT!
***********************************************************
The older version (1.13.1) was able to use UNIX like file paths on Windows but it looks like version 1.14 does not! You need to change this,
tensor_board = tf.keras.callbacks.TensorBoard('./logs/LeNet-MNIST-1')
to this,
tensor_board = tf.keras.callbacks.TensorBoard('.\logs\LeNet-MNIST-1')
I also noticed that you no longer need to create the directory beforehand, i.e. if the directory .\logs\LeNet-MNIST-1 doesn't exist when you start the job run it will be created automatically.
*************************************************************
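One way to sidestep the separator question is to let Python build the path for whatever platform it is running on. This is a sketch using the standard library's `os.path.join`; the `run_name` and `log_dir` variable names are just for illustration.

```python
import os

run_name = "LeNet-MNIST-1"

# os.path.join inserts the separator for the current OS:
# a backslash on Windows, a forward slash on Linux and macOS.
log_dir = os.path.join(".", "logs", run_name)
print(log_dir)

# The result can then be passed to the callback, for example:
#   tensor_board = tf.keras.callbacks.TensorBoard(log_dir)
```

If you do type a Windows path by hand, a raw string such as `r'.\logs\LeNet-MNIST-1'` keeps Python from trying to interpret the backslashes as escape sequences.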
Launch a Jupyter Notebook
After "cd"-ing into your working directory and with the tf-gpu environment activated, start a Jupyter notebook,
jupyter notebook
From the 'New' drop-down menu select the 'TensorFlow-GPU-1.13' kernel that you added (as seen in the image in the last section). You can now start writing code!
MNIST hand written digits example
The following "code blocks" can be treated as Jupyter notebook "Cells". You can type them in (recommended for practice) or copy and paste. To execute the code in a cell use `Shift-Return`.
We will setup and train LeNet-5 with the MNIST handwritten digits data.
Import TensorFlow
import tensorflow as tf
Load and process the MNIST data
mnist = tf.keras.datasets.mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
# reshape and rescale data for the CNN
train_images = train_images.reshape(60000, 28, 28, 1)
test_images = test_images.reshape(10000, 28, 28, 1)
train_images, test_images = train_images/255, test_images/255
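To see what the reshape and rescale steps do without downloading MNIST, here is a small sketch on made-up data. It assumes NumPy is installed (it comes with the TensorFlow install), and the random array is only a stand-in for real images.

```python
import numpy as np

# Synthetic stand-in for five MNIST images: 28x28 grids of uint8 pixels in [0, 255].
fake_images = np.random.randint(0, 256, size=(5, 28, 28), dtype=np.uint8)

# Add the trailing "channels" axis that Conv2D expects (1 channel = grayscale).
batch = fake_images.reshape(5, 28, 28, 1)

# Dividing by 255 converts the pixels to floating point in the range [0, 1].
scaled = batch / 255

print(batch.shape, scaled.dtype, scaled.min(), scaled.max())
```

The real data goes through the same two steps, just with 60000 and 10000 in place of 5.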
Create the LeNet-5 convolution neural network architecture
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (3,3), activation='relu', input_shape=(28,28,1)),
    tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Dropout(0.25),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(10, activation='softmax')
])
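The size of this model can be worked out by hand, which is a useful sanity check against what `model.summary()` reports. The sketch below assumes the Keras defaults used above (no padding, so each 3x3 convolution trims 2 pixels from each dimension, and one bias per filter or unit); the helper function names are hypothetical.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * filters + filters


def dense_params(inputs, units):
    """One weight per input-unit pair plus one bias per unit."""
    return inputs * units + units


# Input is 28x28x1.
conv1 = conv2d_params(3, 3, 1, 32)     # output 26x26x32
conv2 = conv2d_params(3, 3, 32, 64)    # output 24x24x64
flat = 12 * 12 * 64                    # 2x2 max pooling gives 12x12x64, then Flatten
dense1 = dense_params(flat, 128)
dense2 = dense_params(128, 10)

total = conv1 + conv2 + dense1 + dense2
print("Trainable parameters:", total)
```

Nearly all of the roughly 1.2 million parameters sit in the first Dense layer; the pooling and dropout layers add none.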
Compile the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
Set log data to feed to TensorBoard for visual analysis
tensor_board = tf.keras.callbacks.TensorBoard('./logs/LeNet-MNIST-1')
Train the model (with timing)
import time
start_time=time.time()
model.fit(train_images, train_labels, batch_size=128, epochs=15, verbose=1,
          validation_data=(test_images, test_labels), callbacks=[tensor_board])
print('Training took {} seconds'.format(time.time()-start_time))
The results
After running that training for 15 epochs, the first and last epochs gave,
Train on 60000 samples, validate on 10000 samples
Epoch 1/15
60000/60000 [==============================] - 6s 105us/sample - loss: 0.2400 - acc: 0.9276 - val_loss: 0.0515 - val_acc: 0.9820
...
...
Epoch 15/15
60000/60000 [==============================] - 5s 84us/sample - loss: 0.0184 - acc: 0.9937 - val_loss: 0.0288 - val_acc: 0.9913
Training took 79.47694969177246 seconds
Not bad! Training accuracy 99.37% and Validation accuracy 99.13%
It took about 80 seconds on my old Intel i7-4770 box with an NVIDIA GTX 980 GPU (it's about 17 times slower on the CPU).
Look at the job run with TensorBoard
Open another "Anaconda Powershell" and activate your tf-gpu env, and "cd" to your working directory,
conda activate tf-gpu
cd projects/tf-gpu-MNIST
Then startup TensorBoard
tensorboard --logdir=./logs --port 6006
It will give you a local web address with the name of your computer (like the lovely name I got from this test Win10 install)
Open that address in your browser and you will be greeted with (the wonderful) TensorBoard. These are the plots it had for that job run,
Note: on Chrome I had to use localhost:6006 instead of the address returned from Tensorboard
Note: For a long training job you can run TensorBoard on a log file during the training. It will monitor the log file and let you refresh the plots as it progresses.
Conclusion
That MNIST digits training example was a model with 1.2 million trainable parameters and a dataset with 60,000 images. **It took 80 seconds utilizing the NVIDIA GTX 980 on my old test system! For reference it took 1345 seconds using all cores at 100% on the Intel i7-4770 CPU in that machine. That's a 17-fold speedup on the GPU. That's why you use GPUs for this stuff!**
Note: I used the same procedure for doing the CPU version. I created a new "env" naming it "tf-CPU" and installed the CPU only version of TensorFlow i.e. `conda install tensorflow` without the "-gpu" part. I then ran the same Jupyter notebook using a "kernel" created for that env.
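The speedup figure is just the ratio of the two wall-clock times. Here is the arithmetic as a short sketch, using the timings reported in this post and assuming the CPU run used the same 15 epochs (it was the same notebook). The rates are wall-clock figures that include the validation pass after each epoch.

```python
gpu_seconds = 79.47694969177246   # GPU run reported in this post
cpu_seconds = 1345                # CPU run reported in this post
images_per_epoch = 60000
epochs = 15

speedup = cpu_seconds / gpu_seconds
gpu_rate = images_per_epoch * epochs / gpu_seconds
cpu_rate = images_per_epoch * epochs / cpu_seconds

print("Speedup: {:.1f}x".format(speedup))
print("GPU: {:.0f} training images per second".format(gpu_rate))
print("CPU: {:.0f} training images per second".format(cpu_rate))
```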
I sincerely hope this guide helps get you up-and-running with TensorFlow. Feel free to add comments if you have any trouble. Either I or someone else in the community will likely be able to help you!
Happy computing! –dbk