Dr Donald Kinghorn (HPC and Scientific Computing)

TitanXp vs GTX1080Ti for Machine Learning

Written on April 14, 2017 by Dr Donald Kinghorn

NVIDIA has released the Titan Xp, an update to the Titan X Pascal (both use the Pascal GPU core). They also recently released the GTX 1080Ti, which proved to be every bit as good as the Titan X Pascal but at a much lower price. The new Titan Xp does offer better performance and is currently their fastest GeForce card. How much faster? I decided to find out by running a large Deep Learning image classification job to see how it performs for GPU-accelerated Machine Learning.

The Titan Xp offers a 10-20% performance gain over the Titan X Pascal and the GTX 1080Ti for training a large Deep Neural Network.

Titan X vs Xp
Visually the only difference between the Titan X and Titan Xp is the lack of a DVI port on the Xp!


Results

The details about the test system and how the jobs were set up follow the results. The primary results are for training a Deep Neural Network (GoogLeNet) for image classification with a 1.3-million-image dataset from ImageNet. I also have comparative nbody benchmark performance for several cards.

I have included results from a couple of older posts for comparison.


GoogLeNet model training with Caffe on a 1.3-million-image dataset for 30 epochs

GPU(s)    Model training runtime    ~ GPU(s) cost ($)
(1) GTX 1070 32hr 400
(2) GTX 1070 19hr 32min 800
(4) GTX 1070 12hr 43min 1600
(1) GTX 1080Ti 19hr 43min 700
(2) GTX 1080Ti 13hr 12min 1400
(4) GTX 1080Ti 7hr 43min 2800
(1) Titan X Pascal 19hr 34min 1400
(2) Titan X Pascal 13hr 21min 2800
(4) Titan X Pascal 8hr 1min 5600
(1) Titan Xp 17hr 33min 1400
(2) Titan Xp 10hr 40min 2800

Notes:

  • The Titan Xp offers a 10-20% performance gain over the Titan X Pascal and the GTX 1080Ti, but at twice the price of the GTX 1080Ti

  • The (1) and (2) GTX 1070 and the (1) Titan Xp job runs were done with an image batch size of 64; all others used an image batch size of 128

  • It's not unusual to see fluctuations in run times on the order of 30 minutes.
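The headline 10-20% figure can be checked directly from the single-GPU runtimes in the table above. A quick Python sanity check (the runtimes are the reported values, converted to minutes):

```python
# Speedup of the Titan Xp over the other cards, computed from the
# single-GPU training runtimes reported in the table above.

def minutes(hours, mins):
    """Convert an hr/min runtime to total minutes."""
    return hours * 60 + mins

runtimes = {
    "GTX 1080Ti":     minutes(19, 43),   # 19hr 43min
    "Titan X Pascal": minutes(19, 34),   # 19hr 34min
    "Titan Xp":       minutes(17, 33),   # 17hr 33min
}

xp = runtimes["Titan Xp"]
for card in ("GTX 1080Ti", "Titan X Pascal"):
    gain = (runtimes[card] / xp - 1) * 100
    print(f"Titan Xp vs {card}: {gain:.1f}% faster")
```

For the single-GPU runs the gap works out to roughly 11-12%; keep in mind the Titan Xp single-GPU run used the smaller batch size, and runtimes fluctuate, so treat these as ballpark numbers.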

The next table shows the results of nbody -benchmark -numbodies=256000 (nbody from the CUDA samples source code).


GTX 1070, 1080Ti, Titan X Pascal and Titan Xp nbody Benchmark

GPU    nbody GFLOP/s
(1) GTX 1070 4137 GFLOP/s
(1) GTX 1080Ti 7514 GFLOP/s
(1) Titan X Pascal 7524 GFLOP/s
(1) Titan Xp 7904 GFLOP/s
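If you want to collect these numbers yourself, a small wrapper like the following works. This is a sketch: it assumes a compiled nbody binary from the CUDA samples is on your PATH, and that its benchmark output includes the usual "single-precision GFLOP/s" summary line.

```python
import re
import subprocess

def parse_gflops(output: str) -> float:
    """Pull the single-precision GFLOP/s figure out of nbody's output."""
    match = re.search(r"=\s*([\d.]+)\s+single-precision GFLOP/s", output)
    if match is None:
        raise ValueError("no GFLOP/s line found in nbody output")
    return float(match.group(1))

def run_nbody(numbodies: int = 256000) -> float:
    """Run `nbody -benchmark -numbodies=N` and return its GFLOP/s result."""
    result = subprocess.run(
        ["nbody", "-benchmark", f"-numbodies={numbodies}"],
        capture_output=True, text=True, check=True,
    )
    return parse_gflops(result.stdout)
```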

Details

Video cards used for testing (data from nvidia-smi):

Card CUDA cores GPU clock MHz Memory clock MHz* Application clock MHz* FB Memory MiB
GTX 1070 1920 1506 4004 1506 8110
TITAN X Pascal 3584 1911 5005 1417 12186
GTX 1080Ti 3584 1911 5508 N/A 11172
TITAN Xp 3840 1911 5705 1430 12186

* Clocks can vary by manufacturer and are not always displayed by nvidia-smi
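The table above was read off nvidia-smi's full report; its CSV query interface is a more script-friendly way to pull the same fields. A sketch (the query fields here are standard `--query-gpu` properties, and `nounits` strips the MHz/MiB suffixes):

```python
import csv
import io
import subprocess

QUERY = "name,clocks.max.graphics,clocks.max.memory,memory.total"

def parse_gpu_csv(text: str) -> list:
    """Parse nvidia-smi's csv,noheader,nounits output into dicts."""
    fields = QUERY.split(",")
    return [dict(zip(fields, (v.strip() for v in row)))
            for row in csv.reader(io.StringIO(text))]

def gpu_inventory() -> list:
    """Query one row per installed GPU."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_csv(out.stdout)
```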

Test System

The testing was done on my test-bench layout of our Peak Single (DIGITS GPU Workstation) recommended system for DIGITS/Caffe.

The Peak Single ("DIGITS" GPU Workstation)

  • CPU: Intel Core i7 6850K 6-core @ 3.6GHz (3.7GHz All-Core-Turbo)

  • Memory: 128 GB DDR4 2133MHz Reg ECC

  • PCIe: (4) X16-X16 v3

  • Motherboard: ASUS X99-E-10G WS

    Caveat:
    Heavy compute on GeForce cards can shorten their lifetime! I believe it is perfectly fine to use these cards but keep in mind that you may fry one now and then!

Software

The OS I used for this testing was Ubuntu 16.04.2, installed with the Docker and NVIDIA-Docker workstation configuration I've been working on. See these posts for information about that setup.

The software used for the testing came from the nvidia/digits Docker image.


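As a sketch of how launching that container looks (assumptions: Docker and nvidia-docker are installed, the image is the public nvidia/digits image, and DIGITS serves its web UI on the container's port 5000):

```python
import subprocess

def digits_run_cmd(host_port: int = 5000) -> list:
    """Build the nvidia-docker command that launches the DIGITS container."""
    return ["nvidia-docker", "run", "-d", "--name", "digits",
            "-p", f"{host_port}:5000", "nvidia/digits"]

def start_digits(host_port: int = 5000) -> None:
    """Pull the image and start DIGITS in the background."""
    subprocess.run(["docker", "pull", "nvidia/digits"], check=True)
    subprocess.run(digits_run_cmd(host_port), check=True)
```

With the container up, the DIGITS web interface is then reachable on the chosen host port.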
Test job image dataset

I used the training image set from the IMAGENET Large Scale Visual Recognition Challenge 2012 (ILSVRC2012). I only used the training set images from the "challenge" -- all 138GB of them! I used the tools in DIGITS to partition this set into a training set and a validation set, and then trained the GoogLeNet 22-layer network.

  • Training set -- 960893 images
  • Validation set -- 320274 images
  • Model -- GoogLeNet
  • Duration -- 30 Epochs
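Those image counts work out to a 75/25 train/validation partition (which, if I recall correctly, matches the default validation percentage in DIGITS). A quick check:

```python
# Sanity check on the dataset partition, from the image counts listed above.
train_images = 960893
val_images = 320274
total = train_images + val_images

print(f"total images: {total:,}")
print(f"train: {train_images / total:.1%}  validation: {val_images / total:.1%}")
```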

Many of the images in the IMAGENET collection are copyrighted, which means usage and distribution are somewhat restricted. One of the conditions listed for download is this:
"You will NOT distribute the above URL(s)"
So, I won't. Please see the IMAGENET site for information on obtaining the datasets.

Citation
    Olga Russakovsky*, Jia Deng*, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C. Berg and Li Fei-Fei. (* = equal contribution) ImageNet Large Scale Visual Recognition Challenge. arXiv:1409.0575, 2014.

Conclusions

The NVIDIA Titan Xp is a great card for GPU-accelerated machine learning workloads and offers a noticeable improvement over the Titan X Pascal card that it replaces. However, for these workloads running on a workstation, the GTX 1080Ti offers a much better value. There is also a compelling argument for the GTX 1070, since it too is an excellent value given the respectable performance it is capable of.

Happy computing --dbk

Tags: Titan Xp, Titan X, GTX1080Ti, Machine Learning, GPU compute
Andre Garkauskas

"Heavy compute on GeForce cards can shorten their lifetime! I believe it is perfectly fine to use these cards but keep in mind that you may fry one now and then!" - Let me ask you: Have you ever fried a Geforce GTX before? Just curious!

Posted on 2017-04-20 01:05:46