NVIDIA has released the Titan Xp, an update to the Titan X Pascal (both use the Pascal GPU core). They also recently released the GTX 1080Ti, which proved to be every bit as good as the Titan X Pascal but at a much lower price. The new Titan Xp does offer better performance and is currently their fastest GeForce card. How much faster? I decided to find out by running a large Deep Learning image classification job to see how it performs for GPU accelerated Machine Learning.
The Titan Xp offers 10-20% performance gain over the Titan X Pascal and the GTX1080Ti for training a large Deep Neural Network.
Visually the only difference between the Titan X and Titan Xp is the lack of DVI on the Xp!
The details about the test system and how the jobs were set up follow the results. The primary results are for training a Deep Neural Network (GoogLeNet) for image classification with a 1.3 million image dataset from ImageNet. I also have comparative nbody benchmark performance for several cards.
I have included results from a couple of older posts for comparison.
NVIDIA GTX 1080Ti Performance for Machine Learning — as Good as TitanX?
GoogLeNet model training with Caffe on 1.3 million image dataset for 30 epochs
| GPU(s) | Model training runtime | ~ GPU(s) cost ($) |
|--------|------------------------|-------------------|
| (1) GTX 1070 | 32hr | 400 |
| (2) GTX 1070 | 19hr 32min | 800 |
| (4) GTX 1070 | 12hr 43min | 1600 |
| (1) GTX 1080Ti | 19hr 43min | 700 |
| (2) GTX 1080Ti | 13hr 12min | 1400 |
| (4) GTX 1080Ti | 7hr 43min | 2800 |
| (1) Titan X Pascal | 19hr 34min | 1400 |
| (2) Titan X Pascal | 13hr 21min | 2800 |
| (4) Titan X Pascal | 8hr 1min | 5600 |
| (1) Titan Xp | 17hr 33min | 1400 |
| (2) Titan Xp | 10hr 40min | 2800 |
The Titan Xp offers a 10-20% performance gain over the Titan X Pascal and the GTX 1080Ti, but at twice the cost of the GTX 1080Ti.

The (1) and (2) GTX 1070 runs and the (1) Titan Xp run were done with an image batch size of 64; all others used an image batch size of 128.

It's not unusual to see fluctuations in run times on the order of 30 minutes.
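To put numbers on that claim, here is a quick calculation of the single-GPU speedups implied by the runtimes in the table (a small Python sketch; the times are the table values converted to minutes, and keep in mind the (1) Titan Xp run used a different batch size):

```python
# Sanity check of the speedup figures implied by the single-GPU
# runtimes in the table above (times converted to minutes).
# Note: the (1) Titan Xp run used batch size 64 vs 128 for the others.

def minutes(hours, mins=0):
    return hours * 60 + mins

single_gpu = {
    "GTX 1080Ti": minutes(19, 43),
    "Titan X Pascal": minutes(19, 34),
    "Titan Xp": minutes(17, 33),
}

xp = single_gpu["Titan Xp"]
for card, runtime in single_gpu.items():
    if card != "Titan Xp":
        # Relative speedup = (other runtime / Titan Xp runtime) - 1
        print(f"Titan Xp is {runtime / xp - 1:.1%} faster than {card}")
```

That works out to roughly 11-12% on a single card, consistent with the 10-20% claim.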
The next table shows the results of nbody -benchmark -numbodies=256000 (nbody is from the CUDA samples source code).
GTX 1070, 1080Ti, Titan X Pascal and Titan Xp nbody Benchmark
| GPU(s) | nbody GFLOP/s |
|--------|---------------|
| (1) GTX 1070 | 4137 |
| (1) GTX 1080Ti | 7514 |
| (1) Titan X Pascal | 7524 |
| (1) Titan Xp | 7904 |
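For anyone who wants to reproduce these numbers, the benchmark is built from the CUDA samples. A sketch of the steps, assuming a default CUDA 8.0 toolkit install (the sample-install script name and paths may differ on your system, and a CUDA-capable GPU is required, so this is for reference only):

```shell
# Copy the CUDA 8.0 samples to a writable location, build nbody, and run
# the same benchmark used for the table above. Paths assume a default
# CUDA 8.0 toolkit install -- adjust for your setup.
cuda-install-samples-8.0.sh ~/cuda-samples
cd ~/cuda-samples/NVIDIA_CUDA-8.0_Samples/5_Simulations/nbody
make
./nbody -benchmark -numbodies=256000
```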
Video cards used for testing (data from nvidia-smi):

| Card | CUDA cores | GPU clock (MHz) | Memory clock (MHz)* | Application clock (MHz)* | FB memory (MiB) |
|------|------------|-----------------|---------------------|--------------------------|-----------------|
| TITAN X Pascal | 3584 | 1911 | 5005 | 1417 | 12186 |

\* Clocks can vary by manufacturer and are not always displayed by nvidia-smi.
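If you want to pull the same data for your own card, nvidia-smi can query it directly. The field names below are listed by nvidia-smi --help-query-gpu; this of course requires an NVIDIA GPU and driver, so it is shown for reference only:

```shell
# Query max clocks, application clocks, and framebuffer memory as CSV.
# Field names per `nvidia-smi --help-query-gpu`.
nvidia-smi --query-gpu=name,clocks.max.graphics,clocks.max.memory,clocks.applications.graphics,memory.total --format=csv
```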
The testing was done on my test-bench layout of our Peak Single (DIGITS GPU Workstation) recommended system for DIGITS/Caffe.
The Peak Single (“DIGITS” GPU Workstation)
CPU: Intel Core i7 6850K 6-core @ 3.6GHz (3.7GHz All-Core-Turbo)
Memory: 128 GB DDR4 2133MHz Reg ECC
PCIe: (4) X16-X16 v3
Motherboard: ASUS X99-E-10G WS
Heavy compute on GeForce cards can shorten their lifetime! I believe it is perfectly fine to use these cards but keep in mind that you may fry one now and then!
The OS I used for this testing was Ubuntu 16.04.2, installed with the Docker and NVIDIA-Docker workstation configuration I've been working on. See these posts for information about that:
- Docker and NVIDIA-docker on your workstation: Motivation
- Docker and NVIDIA-docker on your workstation: Installation
- Docker and NVIDIA-docker on your workstation: Setup User Namespaces
- Docker and NVIDIA-docker on your workstation: Using Graphical Applications
- Docker and Nvidia-Docker on your workstation: Common Docker Commands Tutorial
Following is a list of the software in the nvidia/digits Docker image used in the testing.
- Ubuntu 14.04
- CUDA 8.0.61
- DIGITS 5.0.0
- caffe-nv (0.15.13-3ubuntu14.04+cuda8.0), cuDNN 5
The host environment was:
- Ubuntu 16.04
- Docker version 17.03.0-ce
- NVIDIA-Docker version 1.0.1
- NVIDIA display driver 375.39
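For reference, a container like the one above can be started with nvidia-docker 1.x along these lines (the port mapping and data-volume path are illustrative, not taken from my setup, and the command requires the NVIDIA driver and nvidia-docker on the host):

```shell
# Launch the DIGITS container in the background and expose its web UI.
# -v mounts a host directory for datasets; adjust the path to your setup.
nvidia-docker run --name digits -d -p 5000:5000 \
    -v /path/to/data:/data nvidia/digits
```

Once running, the DIGITS web interface is available at http://localhost:5000.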
Test job image dataset
I used the training image set from
IMAGENET Large Scale Visual Recognition Challenge 2012 (ILSVRC2012)
I only used the training set images from the "challenge". All 138GB of them! I used the tools in DIGITS to partition this set into a training set and validation set and then used the GoogLeNet 22-layer network.
- Training set — 960893 images
- Validation set — 320274 images
- Model — GoogLeNet
- Duration — 30 Epochs
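DIGITS handles the training/validation partitioning itself when you point it at the image folder; the split above works out to roughly 75/25. As an illustration only (the helper below is a hypothetical sketch, not DIGITS code), such a split looks like:

```python
# Hypothetical sketch of a 75/25 train/validation split over a list of
# image paths, roughly what DIGITS does internally when building a dataset.
import random

def split_dataset(image_paths, val_fraction=0.25, seed=0):
    """Shuffle and partition image paths into (train, validation) lists."""
    rng = random.Random(seed)          # fixed seed for a reproducible split
    paths = list(image_paths)
    rng.shuffle(paths)
    n_val = int(len(paths) * val_fraction)
    return paths[n_val:], paths[:n_val]

# Example with placeholder names (the real set has ~1.28M images):
train, val = split_dataset([f"img_{i}.jpg" for i in range(1000)])
print(len(train), len(val))
```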
Many of the images in the IMAGENET collection are copyrighted. This means that usage and distribution is somewhat restricted. One of the things listed in the conditions for download is this,
“You will NOT distribute the above URL(s)”
So, I won't. Please see the IMAGENET site for information on obtaining datasets.
Olga Russakovsky*, Jia Deng*, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C. Berg and Li Fei-Fei. (* = equal contribution) ImageNet Large Scale Visual Recognition Challenge. arXiv:1409.0575, 2014.
The NVIDIA Titan Xp is a great card for GPU accelerated machine learning workloads and offers a noticeable improvement over the Titan X Pascal card that it replaces. However, for these workloads running on a workstation, the GTX 1080Ti offers much better value. There is also a compelling argument for the GTX 1070, given the respectable performance it delivers at its price.
Happy computing –dbk