In this post I compile NAMD from source for good performance on modern GPU-accelerated workstation hardware. A custom NAMD build from source gives a moderate but meaningful boost in performance, which matters because large simulations over many time-steps can run for days or weeks. I wanted to do some custom NAMD builds to ensure that modern workstation hardware was being well utilized. I include results for the STMV benchmark showing the custom build performance boost, using NVIDIA 1080Ti and Titan V GPUs, as well as an “experimental” build using an Ubuntu 18.04 base.
One of the questions I get asked frequently is “how much difference does PCIe X16 vs PCIe X8 really make?” Well, I got some testing done using 4 Titan V GPUs in a machine that can run 4 cards at X16. I ran several jobs with TensorFlow with the GPUs at both X16 and X8. Read on to see how it went.
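For a sense of scale, here is a rough back-of-the-envelope sketch of the theoretical gap between the two link widths. It assumes PCIe 3.0 signaling (8 GT/s per lane with 128b/130b encoding), and the 256 MiB transfer size is a made-up illustration, not a measurement. Real throughput will be lower than these peak figures, and how much the gap matters depends on how often a job actually moves data across the bus.

```python
# Theoretical PCIe 3.0 bandwidth per direction (assumption: Gen3 link).
GT_PER_S_PER_LANE = 8e9        # 8 GT/s per lane for PCIe 3.0
ENCODING_EFFICIENCY = 128 / 130  # 128b/130b line encoding


def pcie3_bandwidth_gbs(lanes: int) -> float:
    """Peak one-direction bandwidth in GB/s for a PCIe 3.0 link."""
    bits_per_s = lanes * GT_PER_S_PER_LANE * ENCODING_EFFICIENCY
    return bits_per_s / 8 / 1e9


def transfer_ms(num_bytes: int, lanes: int) -> float:
    """Ideal time in milliseconds to move num_bytes over the link."""
    return num_bytes / (pcie3_bandwidth_gbs(lanes) * 1e9) * 1e3


# Hypothetical payload: 256 MiB of input data per training step.
payload = 256 * 1024 ** 2

for lanes in (16, 8):
    print(f"X{lanes}: {pcie3_bandwidth_gbs(lanes):.2f} GB/s peak, "
          f"{transfer_ms(payload, lanes):.1f} ms for 256 MiB")
```

The peak numbers differ by exactly a factor of two, but a job that overlaps transfers with compute may see far less than a 2x slowdown at X8.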
I have been qualifying a 4 GPU workstation for Machine Learning and HPC use. The last confirmation testing I wanted to do was running TensorFlow benchmarks on 4 NVIDIA Titan V GPUs. I have that system up and running and the multi-GPU scaling looks very good.
Batch size is an important hyper-parameter for Deep Learning model training. When using GPU-accelerated frameworks for your models, the amount of memory available on the GPU is a limiting factor. In this post I look at the effect of setting the batch size for a few CNNs running with TensorFlow on a 1080Ti with 11GB memory, a Titan V with 12GB memory, and a GV100 with 32GB memory.
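To illustrate why GPU memory caps the batch size, here is a minimal sketch that estimates the largest power-of-two batch that fits on a card. The function name `largest_batch`, the 2 GiB fixed overhead (weights, optimizer state, workspace), and the 150 MiB of activations per sample are all hypothetical placeholders, not values measured for any of the models in the post. In practice you would find the limit by trial on the actual model.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def largest_batch(gpu_mem_gib: float, fixed_gib: float,
                  per_sample_mib: float) -> int:
    """Largest power-of-two batch whose estimated footprint fits in memory.

    Assumes memory use is a fixed overhead plus a constant cost per sample,
    which is only a rough approximation of real framework behavior.
    """
    budget = (gpu_mem_gib - fixed_gib) * GIB
    per_sample = per_sample_mib * MIB
    if budget < per_sample:
        return 0  # not even a single sample fits
    batch = 1
    while batch * 2 * per_sample <= budget:
        batch *= 2
    return batch


# Hypothetical overhead and per-sample cost, applied to two card sizes.
for name, mem_gib in [("Titan V (12GB)", 12), ("GV100 (32GB)", 32)]:
    print(name, largest_batch(mem_gib, fixed_gib=2.0, per_sample_mib=150.0))
```

Under these made-up numbers the 32GB card fits a batch twice as large as the 12GB card, which is the kind of headroom the larger memory buys.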
Tensor-cores are one of the compelling new features of the NVIDIA Volta architecture. In this post I discuss some thoughts on mixed precision and FP16 as they relate to Tensor-cores. I have some performance results for large convolutional neural network training that make a good argument for trying to use them. Performance looks very good.
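One reason "mixed" precision matters is that FP16 has only 11 bits of significand, so a running sum kept in FP16 stops growing once the increments fall below its rounding step. The sketch below shows this with plain Python, using the `struct` module's half-precision format to round values to FP16. The helper names are my own, and a Python float (FP64) stands in here for the wider accumulator that mixed-precision training keeps alongside FP16 values.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def accumulate_ones(n: int, fp16_accumulator: bool) -> float:
    """Add 1.0 to a running total n times, optionally rounding to FP16."""
    total = 0.0
    for _ in range(n):
        total = total + 1.0
        if fp16_accumulator:
            total = to_fp16(total)  # simulate storing the sum in FP16
    return total


fp16_sum = accumulate_ones(4096, fp16_accumulator=True)
wide_sum = accumulate_ones(4096, fp16_accumulator=False)

print("FP16 accumulator:", fp16_sum)   # stalls at 2048.0
print("wide accumulator:", wide_sum)   # reaches 4096.0
```

Above 2048 the spacing between adjacent FP16 values is 2, so adding 1.0 rounds straight back to 2048 and the sum never advances. Keeping the accumulation in a wider type avoids this while the inputs stay in FP16.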
NVIDIA announced availability of the Titan V card Friday December 8th. We had a couple in hand for testing on Monday December 11th, nice! I ran through many of the machine learning and simulation testing problems that I have done on Titan cards in the past. Results are not the near doubling in performance of past generations… but read on.