HPC Posts Archive

How to install CUDA 9.2 on Ubuntu 18.04

Posted on June 15, 2018 by Dr. Donald Kinghorn

If you want to use Ubuntu 18.04 with a CUDA install, this post should help you get that working.

The Best Way To Install Ubuntu 18.04 with NVIDIA Drivers and any Desktop Flavor

Posted on June 8, 2018 by Dr. Donald Kinghorn

In this post I’ll go over the details of installing Ubuntu 18.04, including the NVIDIA display driver and any one of the available desktop environments. I’ll do this starting from a base server install. I’ll cover a few possible pitfalls and end with a short discussion of the new netplan configuration tool for Ubuntu networking.

Install TensorFlow with GPU Support on Windows 10 (without a full CUDA install)

Posted on June 4, 2018 by Dr. Donald Kinghorn

In this post I’ll walk you through the best way I have found so far to get a good TensorFlow work environment on Windows 10, including GPU acceleration. I’ll go through how to install just the needed libraries (DLLs) from CUDA 9.0 and cuDNN 7.0 to support TensorFlow 1.8. I’ll also go through setting up Anaconda Python, creating an environment for TensorFlow, and making that environment available for use with Jupyter notebook. As a “non-trivial” example of using this setup we’ll go through training LeNet-5 with Keras using TensorFlow with GPU acceleration. We’ll get a setup that is 18 times faster than using the CPU alone.

Install TensorFlow with GPU Support the Easy Way on Ubuntu 18.04 (without installing CUDA)

Posted on May 25, 2018 by Dr. Donald Kinghorn

TensorFlow is a very important Machine/Deep Learning framework and Ubuntu Linux is a great workstation platform for this type of work. If you want to set up a workstation using Ubuntu 18.04 with CUDA GPU acceleration support for TensorFlow, this guide should help you get your machine learning environment up and running without a lot of trouble. And you don’t have to do a CUDA install!

PCIe X16 vs X8 with 4 x Titan V GPUs for Machine Learning

Posted on May 21, 2018 by Dr. Donald Kinghorn

One of the questions I get asked frequently is “how much difference does PCIe X16 vs PCIe X8 really make?” Well, I got some testing done using 4 Titan V GPUs in a machine that supports 4 X16 cards. I ran several TensorFlow jobs with the GPUs at both X16 and X8. Read on to see how it went.
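
For context on that question, here is a minimal sketch of the theoretical link bandwidth involved. It assumes PCIe 3.0 signalling (8 GT/s per lane with 128b/130b encoding); the function name is made up for illustration and nothing here comes from the post's measurements. Real transfers achieve less than these peak figures.

```python
# Theoretical one-direction PCIe bandwidth. Assumption: PCIe 3.0 signalling.
GT_PER_S_PER_LANE = 8.0          # PCIe 3.0 raw rate, gigatransfers/s per lane
ENCODING_EFFICIENCY = 128 / 130  # 128b/130b line encoding overhead


def pcie3_bandwidth_gb_s(lanes: int) -> float:
    """Peak one-direction bandwidth in GB/s for a PCIe 3.0 link (hypothetical helper)."""
    # Each transfer carries one bit per lane; divide by 8 to convert bits to bytes.
    return GT_PER_S_PER_LANE * ENCODING_EFFICIENCY * lanes / 8


for lanes in (8, 16):
    print(f"x{lanes}: {pcie3_bandwidth_gb_s(lanes):.2f} GB/s")
```

An X8 link has exactly half the peak bandwidth of an X16 link. Whether that matters for a given job depends on how much time the job spends moving data between host and GPU, which is what the testing in the post looks at.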

Microsoft Build 2018 — impressions

Posted on May 11, 2018 by Dr. Donald Kinghorn

I attended the Microsoft Build 2018 developers conference this week and really enjoyed it. I wanted to share my “big picture” feelings about it and some of the things that stood out to me. I’m not going to give you a “reporter’s” view or repeat press-release items. This is just my personal impression of the conference.

Multi-GPU scaling with Titan V and TensorFlow on a 4 GPU Workstation

Posted on May 4, 2018 by Dr. Donald Kinghorn

I have been qualifying a 4 GPU workstation for Machine Learning and HPC use. The last confirmation testing I wanted to do was running TensorFlow benchmarks on 4 NVIDIA Titan V GPUs. I have that system up and running and the multi-GPU scaling looks very good.
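
One common way to quantify multi-GPU scaling is speedup relative to a single GPU, plus parallel efficiency (speedup divided by GPU count). The sketch below shows that arithmetic. The function name is hypothetical and the throughput numbers are placeholders for illustration only, not results from the post.

```python
def scaling_report(throughputs):
    """Map each GPU count to (speedup, efficiency).

    throughputs: dict of {gpu_count: images_per_second}; must include key 1.
    """
    base = throughputs[1]
    report = {}
    for n, rate in sorted(throughputs.items()):
        speedup = rate / base             # relative to the single-GPU run
        report[n] = (speedup, speedup / n)  # efficiency of 1.0 is perfect scaling
    return report


# Placeholder throughputs, NOT measured values.
example = {1: 100.0, 2: 190.0, 4: 360.0}
for n, (speedup, efficiency) in scaling_report(example).items():
    print(f"{n} GPU(s): speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```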

GPU Memory Size and Deep Learning Performance (batch size) 12GB vs 32GB — 1080Ti vs Titan V vs GV100

Posted on April 27, 2018 by Dr. Donald Kinghorn

Batch size is an important hyper-parameter for Deep Learning model training. When using GPU accelerated frameworks for your models, the amount of memory available on the GPU is a limiting factor. In this post I look at the effect of setting the batch size for a few CNNs running with TensorFlow on the 1080Ti, the Titan V with 12GB of memory, and the GV100 with 32GB of memory.
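
To see why GPU memory caps the batch size, here is a rough sketch that assumes memory use is a fixed cost (weights, optimizer state, framework overhead) plus a per-sample cost that grows linearly with the batch. That linear model is a simplification, and the `fixed_gb` and `per_sample_mb` values are hypothetical, not figures from the post.

```python
def max_batch_size(gpu_mem_gb, fixed_gb, per_sample_mb):
    """Largest batch that fits when memory = fixed cost + batch * per-sample cost."""
    available_mb = (gpu_mem_gb - fixed_gb) * 1024
    # Floor to a whole number of samples; clamp at 0 if the fixed cost alone does not fit.
    return max(0, int(available_mb // per_sample_mb))


# Hypothetical costs: 2 GB fixed, 80 MB of activations per sample.
for mem_gb in (12, 32):
    batch = max_batch_size(mem_gb, fixed_gb=2.0, per_sample_mb=80.0)
    print(f"{mem_gb} GB GPU: max batch size about {batch}")
```

Under this model the usable batch size grows with the memory left after the fixed cost, so a card with more memory can fit a disproportionately larger batch when the fixed cost is large.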

NVIDIA Titan V plus Tensor-cores Considerations and Testing of FP16 for Deep Learning

Posted on April 20, 2018 by Dr. Donald Kinghorn

Tensor-cores are one of the compelling new features of the NVIDIA Volta architecture. In this post I discuss some thoughts on mixed precision and FP16 related to Tensor-cores. I have some performance results for large convolutional neural network training that make a good argument for trying to use them. Performance looks very good.
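
One reason FP16 needs care is its narrow range: very small values underflow to zero. The sketch below illustrates this with Python's standard `struct` module, which supports the IEEE 754 half-precision format via the `e` code. It also shows loss scaling, a common mixed-precision workaround that is named here as general background and is not taken from the post. The helper name and the example values are assumptions chosen for illustration.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


tiny_gradient = 1e-8   # illustrative value below FP16's smallest subnormal (about 6e-8)
loss_scale = 1024.0    # illustrative power-of-two scale factor

# Stored directly in FP16, the value underflows to 0.0 and the update is lost.
print("unscaled in FP16:", to_fp16(tiny_gradient))

# Scaling before the FP16 conversion keeps it representable; unscale in higher precision.
scaled = to_fp16(tiny_gradient * loss_scale)
print("recovered after unscaling:", scaled / loss_scale)
```

The recovered value is close to, but not exactly, the original, because FP16 has only 10 explicit mantissa bits and even fewer significant bits for subnormal values.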

Build TensorFlow-GPU with CUDA 9.1 MKL and Anaconda Python 3.6 using a Docker Container

Posted on April 12, 2018 by Dr. Donald Kinghorn

Building TensorFlow from source is challenging, but the end result can be a version tailored to your needs. This post provides step-by-step instructions for building TensorFlow 1.7 linked with Anaconda3 Python, CUDA 9.1, cuDNN 7.1, and Intel MKL-ML. I do the build in a Docker container and show how the container is generated from a Dockerfile.
