Table of Contents
- Step-by-Step Instructions
- 1) If needed, follow my guide to install and configure Docker
- 2) Download the sources for TensorFlow
- 3) Setup the docker container build directory
- 4) Get the Anaconda3 install shell archive file and the bazel 0.11.1 deb file
- 5) Create files called cuda.sh, cuda.conf in the dockerfile directory
- 6) Create the Dockerfile to build the container
- 7) Create the container
- 8) Start the container and bind the directory with the source tree
- 9) Configure TensorFlow build
- 10) Build TensorFlow
- 11) Create the pip package
- 12) Install the pip package
I’ve been working a lot with TensorFlow lately. The Google team and outside contributors do a very good job of providing binary packages and fresh docker images for TensorFlow, but they are not really built the way I would prefer. They are, understandably, somewhat conservative with the builds. The new 1.7 release only includes CPU vector unit optimizations up to AVX, which goes back to 2011. The Python in the official builds is "Python.org 3.5" and I prefer Anaconda 3.6. The GPU support is provided by CUDA 9.0 instead of the current 9.1.
Last week I went through how to do a custom build of TensorFlow 1.7 for CPU using a build environment inside a docker container. That build included links to MKL-ML (Intel Math Kernel Library – MKL-DNN). It was also built with Anaconda Python 3.6, which is what I prefer to use for Python. The result was a TensorFlow build that showed a 2.5-fold speedup over the official build. Now I’ll go through the same basic process but this time include GPU acceleration with the current CUDA 9.1 and cuDNN 7. We’ll make a new docker container with all of the dependencies and configuration needed to do the build. That will avoid having to clutter the host system with the needed build environment.
The GPU acceleration in this build will not show a dramatic improvement like the CPU build did because the CUDA performance from 9.0 is already very good. Moving to 9.1 just makes the build current and a better fit for my development environment. There will be a big improvement in CPU performance, and it will be built against the Python that I actually use, i.e. Anaconda 3.6.
I will go through the same step-by-step instructions as I did in the CPU build but will include the necessary changes and details to get a GPU accelerated build. I recommend that you read through the CPU build post before you try this.
Note: I did have some difficulties with this build. It looks as though the Bazel build tool (or its configuration) had some problems. It would "forget" about library paths at various points during linking. I had to do a few "hacks" to get it to work right. Hopefully this will save you some grief if you are running into the same kinds of issues.
Note: Another bazel problem! While I was working on this post bazel was updated from 0.11.1 to 0.12.0 and that broke the TensorFlow build (bazel is also a Google project). In this post I explicitly use bazel 0.11.1 and have instructions on how to get and install it.
1) If needed, follow my guide to install and configure Docker
- How-To Setup NVIDIA Docker and NGC Registry on your Workstation – Part 1 Introduction and Base System Setup
- How-To Setup NVIDIA Docker and NGC Registry on your Workstation – Part 2 Docker and NVIDIA-Docker-v2
- How-To Setup NVIDIA Docker and NGC Registry on your Workstation – Part 3 Setup User-Namespaces
- How-To Setup NVIDIA Docker and NGC Registry on your Workstation – Part 4 Accessing the NGC Registry
- How-To Setup NVIDIA Docker and NGC Registry on your Workstation – Part 5 Docker Performance and Resource Tuning
2) Download the sources for TensorFlow
- Make a directory to do the build,
mkdir TF-build-gpu
cd TF-build-gpu
- Get the TensorFlow source tree and "checkout" the branch you want,
git clone https://github.com/tensorflow/tensorflow
cd tensorflow/
git checkout r1.7
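Before going further it can be worth confirming that the source tree really is on the branch you asked for. A minimal sketch: `current_branch` is a hypothetical helper name (not part of git or TensorFlow), and it assumes a git new enough to support the `-C` option.

```shell
# Hypothetical helper: print the branch checked out in a source tree.
# With no argument it looks at the current directory.
current_branch() {
    git -C "${1:-.}" rev-parse --abbrev-ref HEAD
}
```

Run from inside the tensorflow directory, `current_branch` should print r1.7 after the checkout above.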
3) Setup the docker container build directory
- From the TF-build-gpu directory create a directory for your Dockerfile and some other files we will copy into the container.
mkdir dockerfile
cd dockerfile
4) Get the Anaconda3 install shell archive file and the bazel 0.11.1 deb file
- Go to https://www.anaconda.com/download/#linux and download the Python 3.6 shell archive for Anaconda. Put this file in the dockerfile directory.
- Get the deb file for version 0.11.1 of bazel (version 0.12 is broken for TensorFlow 1.7!) https://github.com/bazelbuild/bazel/releases/download/0.11.1/bazel_0.11.1-linux-x86_64.deb
5) Create files called cuda.sh, cuda.conf in the dockerfile directory
These files will be copied into the docker image we will be constructing.
cuda.sh will set the PATH environment for CUDA in the container.
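The contents of cuda.sh are not shown here. A minimal sketch of such a file, assuming the CUDA binaries (nvcc and friends) sit in /usr/local/cuda/bin, which matches the default toolkit location that configure offers later:

```shell
# cuda.sh (sketch, not the original file)
# Assumption: the CUDA toolkit binaries are in /usr/local/cuda/bin.
# Files in /etc/profile.d/ are sourced by login shells, so this extends PATH there.
export PATH="$PATH:/usr/local/cuda/bin"
```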
cuda.conf will add the CUDA libraries to the default library path. The "stubs" directory is one of the hacks that I needed to keep the library paths working during this build.
/usr/local/cuda/lib64
/usr/local/cuda/extras/CUPTI/lib64
/usr/local/cuda/targets/x86_64-linux/lib/stubs
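Since the image build copies four files out of this directory, a quick pre-flight check can save a failed build. This is only a sketch: `check_build_inputs` is a hypothetical helper name, and the file names are the ones used in this post.

```shell
# Hypothetical pre-flight check: confirm the files that get copied into the image exist.
# Usage: check_build_inputs [directory]   (defaults to the current directory)
check_build_inputs() {
    dir="${1:-.}"
    missing=0
    for pattern in 'Anaconda3*.sh' 'bazel_0.11.1-linux-x86_64.deb' 'cuda.sh' 'cuda.conf'; do
        # $pattern is left unquoted on purpose so the Anaconda3 wildcard expands.
        set -- "$dir"/$pattern
        if [ ! -e "$1" ]; then
            echo "missing: $dir/$pattern" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Calling `check_build_inputs` from inside the dockerfile directory returns 0 when everything is in place and lists anything that is missing on stderr otherwise.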
6) Create the Dockerfile to build the container
- Put the following in a file named Dockerfile in the dockerfile directory (note the capital "D" in the file name)
# Dockerfile to setup a build environment for TensorFlow
# using Intel MKL and Anaconda3 Python
# GPU support with CUDA 9.1 and cudnn7.1

FROM nvidia/cuda:9.1-cudnn7-devel-ubuntu16.04

MAINTAINER nobody, not even me

# Add a few needed packages to the base Ubuntu 16.04
# OK, maybe *you* don't need emacs :-)
RUN \
    apt-get update && apt-get install -y \
    build-essential \
    curl \
    emacs-nox \
    git \
    openjdk-8-jdk \
    && rm -rf /var/lib/apt/lists/*

# Use version 0.11.1 bazel! install from the deb file.
COPY bazel_0.11.1-linux-x86_64.deb /root/
RUN \
    cd /root; dpkg -i bazel_0.11.1-linux-x86_64.deb && \
    rm -f bazel_0.11.1-linux-x86_64.deb

# Copy in and install Anaconda3 from the shell archive
# Anaconda3-5.1.0-Linux-x86_64.sh
COPY Anaconda3* /root/
RUN \
    cd /root; chmod 755 Anaconda3*.sh && \
    ./Anaconda3*.sh -b && \
    echo 'export PATH="$HOME/anaconda3/bin:$PATH"' >> .bashrc && \
    rm -f Anaconda3*.sh

# Copy in the CUDA configuration files
COPY cuda.sh /etc/profile.d/
COPY cuda.conf /etc/ld.so.conf.d/

# That's it! That should be enough to do a TensorFlow 1.7 GPU build
# using CUDA 9.1 Anaconda Python 3.6 Intel MKL with gcc 5.4
This Dockerfile will,
- use the official NVIDIA CUDA 9.1 image on an Ubuntu 16.04 base
- install some needed packages
- install bazel 0.11.1 from the deb file
- install Anaconda3 Python
- add the configuration files needed for CUDA 9.1
7) Create the container
docker build -t tf-build-1.7-gpu .
That will create the container image we will do the TensorFlow build in. This is a large image! It will take a while to build and install everything.
8) Start the container and bind the directory with the source tree
docker run --runtime=nvidia --rm -it -v $HOME/projects/TF-build-gpu:/root/TF-build-gpu tf-build-1.7-gpu
That will start the container. Note that I have my directory for the build in $HOME/projects/TF-build-gpu and that is being bound into the container at /root/TF-build-gpu.
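Your source tree probably lives somewhere else, so it can help to keep the host path in one variable. A minimal sketch: `TF_BUILD_DIR` and `start_build_container` are hypothetical names, and the docker options are exactly the ones from the command above.

```shell
# Hypothetical wrapper around the docker run command above.
# TF_BUILD_DIR is an illustrative variable; it defaults to the path used in this post.
TF_BUILD_DIR="${TF_BUILD_DIR:-$HOME/projects/TF-build-gpu}"

start_build_container() {
    # Bind the host build directory to /root/TF-build-gpu inside the container.
    docker run --runtime=nvidia --rm -it \
        -v "$TF_BUILD_DIR":/root/TF-build-gpu \
        tf-build-1.7-gpu
}
```

Set `TF_BUILD_DIR` to wherever you created TF-build-gpu before sourcing this, then run `start_build_container`.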
9) Configure TensorFlow build
Now that you are in the container,
cd /root/TF-build-gpu/tensorflow/
./configure
./configure will ask a lot of questions. It should see Anaconda Python 3.6 as the system Python and use that. You will probably want to answer "No" to most of the questions. Answer "Yes" to GPU support since we set up CUDA in this container. I set the CUDA version to 9.1 and included compute capabilities from 5.2 to 7.0. That includes GPUs from Maxwell to the current Volta (V100, Titan V). You could add support for older cards. You can find a list of compute capabilities on the CUDA Wikipedia page.
Here are my answers to configure,
root@dc40c84fcef1:~/TF-build/tensorflow# ./configure
/root/anaconda3/bin/python
.
Extracting Bazel installation...
You have bazel 0.11.1 installed.
Please specify the location of python. [Default is /root/anaconda3/bin/python]:
Found possible Python library paths:
  /root/anaconda3/lib/python3.6/site-packages
Please input the desired Python library path to use. Default is [/root/anaconda3/lib/python3.6/site-packages]
Do you wish to build TensorFlow with jemalloc as malloc support? [Y/n]: y
jemalloc as malloc support will be enabled for TensorFlow.
Do you wish to build TensorFlow with Google Cloud Platform support? [Y/n]: n
No Google Cloud Platform support will be enabled for TensorFlow.
Do you wish to build TensorFlow with Hadoop File System support? [Y/n]: n
No Hadoop File System support will be enabled for TensorFlow.
Do you wish to build TensorFlow with Amazon S3 File System support? [Y/n]: n
No Amazon S3 File System support will be enabled for TensorFlow.
Do you wish to build TensorFlow with Apache Kafka Platform support? [y/N]: n
No Apache Kafka Platform support will be enabled for TensorFlow.
Do you wish to build TensorFlow with XLA JIT support? [y/N]: n
No XLA JIT support will be enabled for TensorFlow.
Do you wish to build TensorFlow with GDR support? [y/N]: n
No GDR support will be enabled for TensorFlow.
Do you wish to build TensorFlow with VERBS support? [y/N]: n
No VERBS support will be enabled for TensorFlow.
Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: n
No OpenCL SYCL support will be enabled for TensorFlow.
Do you wish to build TensorFlow with CUDA support? [y/N]: y
CUDA support will be enabled for TensorFlow.
Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to default to CUDA 9.0]: 9.1
Please specify the location where CUDA 9.1 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7.0]:7.1
Please specify the location where cuDNN 7 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Do you wish to build TensorFlow with TensorRT support? [y/N]: n
No TensorRT support will be enabled for TensorFlow.
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 3.5,5.2]5.2,6.0,6.1,7.0
Do you want to use clang as CUDA compiler? [y/N]: n
nvcc will be used as CUDA compiler.
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]:
Do you wish to build TensorFlow with MPI support? [y/N]: n
No MPI support will be enabled for TensorFlow.
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]:
Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: n
Not configuring the WORKSPACE for Android builds.
Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See tools/bazel.rc for more details.
    --config=mkl          # Build with MKL support.
    --config=monolithic   # Config for mostly static monolithic build.
Configuration finished
I kept the default optimization flag, -march=native, since on my machine that will give me AVX512 and FMA3.
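If you expect to repeat the build, some of these answers can be preset in the environment before running ./configure. This is a sketch under an assumption: the variable names below are the ones I believe configure.py reads in the r1.x source tree, so check them against configure.py in your checkout before relying on them. The values mirror my answers above.

```shell
# Preset the CUDA-related configure answers (sketch).
# Assumption: these variable names match what configure.py in your checkout reads.
export PYTHON_BIN_PATH=/root/anaconda3/bin/python
export TF_NEED_CUDA=1
export TF_CUDA_VERSION=9.1
export CUDA_TOOLKIT_PATH=/usr/local/cuda
export TF_CUDNN_VERSION=7.1
export CUDNN_INSTALL_PATH=/usr/local/cuda
export TF_CUDA_COMPUTE_CAPABILITIES=5.2,6.0,6.1,7.0
export CC_OPT_FLAGS="-march=native"
```

After exporting these, run ./configure as before. Any question without a matching variable will still be asked interactively.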
10) Build TensorFlow
After you are finished with configure you can do the build. I used,
bazel build --config=opt --config=mkl --config=cuda --action_env PATH="$PATH" //tensorflow/tools/pip_package:build_pip_package
Note that in addition to "opt" and "cuda" I used --config=mkl. That will cause the build to link in the Intel MKL-ML libs. Those libs are now included in the TensorFlow source tree. (Thank you Intel for making those important libraries available to the public.)
Also note, I had to add --action_env PATH="$PATH" because bazel sometimes forgets its environment!
It will take some time to build since TensorFlow is a big package! I was greeted with the wonderful message,
INFO: Elapsed time: 891.344s, Critical Path: 501.24s
INFO: Build completed successfully, 6853 total actions
11) Create the pip package
After your build finishes you will want to create the pip package,
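A minimal sketch of that step, assuming the standard build_pip_package script that the bazel target above produces, and an output directory of ../tensorflow_pkg (inferred from where the whl file ends up in the next paragraph). `PKG_DIR` and `BUILDER` are just illustrative variable names.

```shell
# Run from /root/TF-build-gpu/tensorflow after the bazel build has finished.
# Assumption: the package goes one level up, into TF-build-gpu/tensorflow_pkg.
PKG_DIR="../tensorflow_pkg"
BUILDER="bazel-bin/tensorflow/tools/pip_package/build_pip_package"

if [ -x "$BUILDER" ]; then
    # The script takes the output directory as its argument.
    "$BUILDER" "$PKG_DIR"
    ls -l "$PKG_DIR"
else
    echo "build_pip_package not found; run the bazel build first" >&2
fi
```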
You should now have a "whl" file in your TF-build-gpu/tensorflow_pkg directory. You can install that pip package in a conda environment on your local machine if you have Anaconda installed there. This is what I was planning on for the build. I also want this pip package for use in other Docker containers.
12) Install the pip package
I’ll test in the current Docker container. First create a conda env.
conda create -n tftest
source activate tftest
pip install tensorflow_pkg/tensorflow-1.7.0-cp36-cp36m-linux_x86_64.whl
The first thing I want to see is the linked-in libraries. I used "ldd" to check that,
ldd ~/anaconda3/lib/python3.6/site-packages/tensorflow/libtensorflow_framework.so
    linux-vdso.so.1 => (0x00007fff5e9fe000)
    libcublas.so.9.1 => /usr/local/cuda-9.1/targets/x86_64-linux/lib/libcublas.so.9.1 (0x00007f7687f0c000)
    libcuda.so.1 => /usr/lib/x86_64-linux-gnu/libcuda.so.1 (0x00007f768736c000)
    libcudnn.so.7 => /usr/lib/x86_64-linux-gnu/libcudnn.so.7 (0x00007f7672b43000)
    libcufft.so.9.1 => /usr/local/cuda-9.1/targets/x86_64-linux/lib/libcufft.so.9.1 (0x00007f766b656000)
    libcurand.so.9.1 => /usr/local/cuda-9.1/targets/x86_64-linux/lib/libcurand.so.9.1 (0x00007f76676d3000)
    libcudart.so.9.1 => /usr/local/cuda-9.1/targets/x86_64-linux/lib/libcudart.so.9.1 (0x00007f7667465000)
    libmklml_intel.so => /root/anaconda3/lib/python3.6/site-packages/tensorflow/../_solib_local/_U_S_Sthird_Uparty_Smkl_Cintel_Ubinary_Ublob___Uexternal_Smkl_Slib/libmklml_intel.so (0x00007f765e622000)
    libiomp5.so => /root/anaconda3/lib/python3.6/site-packages/tensorflow/../_solib_local/_U_S_Sthird_Uparty_Smkl_Cintel_Ubinary_Ublob___Uexternal_Smkl_Slib/libiomp5.so (0x00007f765e27e000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f765e07a000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f765dd71000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f765db54000)
    libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f765d7d2000)
    libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f765d5bc000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f765d1f2000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f768caa4000)
    librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f765cfea000)
    libnvidia-fatbinaryloader.so.390.48 => /usr/lib/x86_64-linux-gnu/libnvidia-fatbinaryloader.so.390.48 (0x00007f765cd9e000)
Success! Linked to all of the libraries I wanted.
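The ldd listing is long, so a small filter makes the interesting lines easier to spot. A minimal sketch: `gpu_mkl_libs` is a hypothetical helper name, and the pattern only covers the CUDA, cuDNN and MKL libraries that appear in the output above.

```shell
# Hypothetical helper: keep only the CUDA, cuDNN and MKL lines from ldd output.
# Reads ldd output on stdin.
gpu_mkl_libs() {
    grep -E 'libcu(blas|da|dnn|fft|rand|dart)|libmklml|libiomp5'
}
```

Use it as `ldd ~/anaconda3/lib/python3.6/site-packages/tensorflow/libtensorflow_framework.so | gpu_mkl_libs`.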
Happy computing! –dbk