We have a new collection of GPU accelerated Molecular Dynamics benchmark packages put together for GROMACS, NAMD 2, and NAMD 3-alpha10. (The benchmark packages will be available to the public soon.) In this post we present results for 3 applications (GROMACS, NAMD 2, and NAMD 3-alpha10), 8 MD simulations, and 12 different NVIDIA GPUs, for 96 results in total.
NVIDIA Enroot has a unique feature, "Enroot Bundles", that lets you easily create an executable, self-contained, single-file package with a container image AND the runtime to start it up! The resulting container package will run itself on a system with or without Enroot installed on it!
Enroot is a simple and modern way to run "docker" or OCI containers. It provides an unprivileged user "sandbox" that integrates easily with a "normal" end user workflow. I like it for running development environments and especially for running NVIDIA NGC containers. In this post I'll go through steps for installing enroot and some simple usage examples including running NVIDIA NGC containers.
WSL2 offers improved performance over version 1 by providing more direct access to the host hardware drivers. Recent "Insider Dev Channel" builds of Win10 even allow WSL2 Linux applications to access the Windows NVIDIA display driver for GPU computing! The performance improvements with WSL2 are largely because this version runs as a privileged virtual machine on top of MS Hyper-V. This means that at least low level support for the Hyper-V virtualization layer needs to be enabled to use it. In particular, the Windows feature "VirtualMachinePlatform" must be enabled for WSL2. We tested to see if enabling it had any negative impact on application performance.
Ubuntu 19.04 will be released soon so I decided to see if CUDA 10.1 could be installed on it. Yes, it can, and it seems to work fine. In this post I walk through the install and show that docker and nvidia-docker also work. I ran TensorFlow 2.0-alpha on Ubuntu 19.04 beta.
NAMD Custom Build for Better Performance on your Modern GPU Accelerated Workstation -- Ubuntu 16.04, 18.04, CentOS 7
Written on July 20, 2018 by Dr Donald Kinghorn
In this post I will be compiling NAMD from source for good performance on modern GPU accelerated Workstation hardware. Doing a custom NAMD build from source code gives a moderate but significant boost in performance. This can be important considering that large simulations over many time-steps can run for days or weeks. I wanted to do some custom NAMD builds to ensure that modern Workstation hardware was being well utilized. I include some results for the STMV benchmark showing the custom build performance boost, using NVIDIA 1080Ti and Titan V GPUs as well as an "experimental" build using an Ubuntu 18.04 base.
One of the questions I get asked frequently is "how much difference does PCIe X16 vs PCIe X8 really make?" Well, I got some testing done using 4 Titan V GPUs in a machine that will do 4 X16 cards. I ran several jobs with TensorFlow with the GPUs at both X16 and X8. Read on to see how it went.
I attended the Microsoft Build 2018 developers conference this week and really enjoyed it. I wanted to share my "big picture" feelings about it and some of the things that stood out to me. I'm not going to give you a "reporter's" view or repeat press-release items. This is just my personal impression of the conference.
I have been qualifying a 4 GPU workstation for Machine Learning and HPC use. The last confirmation testing I wanted to do was running it with TensorFlow benchmarks on 4 NVIDIA Titan V GPUs. I have that system up and running and the multi-GPU scaling looks very good.
GPU Memory Size and Deep Learning Performance (batch size) 12GB vs 32GB -- 1080Ti vs Titan V vs GV100
Written on April 27, 2018 by Dr Donald Kinghorn
Batch size is an important hyper-parameter for Deep Learning model training. When using GPU accelerated frameworks for your models, the amount of memory available on the GPU is a limiting factor. In this post I look at the effect of setting the batch size for a few CNNs running with TensorFlow on 1080Ti and Titan V with 12GB memory, and GV100 with 32GB memory.
Tensor-cores are one of the compelling new features of the NVIDIA Volta architecture. In this post I discuss some thoughts on mixed precision and FP16 related to Tensor-cores. I have some performance results for large convolutional neural network training that make a good argument for trying to use them. Performance looks very good.
NVIDIA's GPU Technology Conference (GTC) is probably my all-time favorite conference. It's an interesting blend of "Scientific Research meeting" and Trade-Show. It's put on by a hardware vendor but still feels like a scientific meeting. It's not just a "Kool-Aid" fest! In this post I present some of my thoughts about this year's conference.
TensorFlow is a very powerful numerical computing framework. However, like any large research level program it can be challenging to install and configure. In this post I'll try to give some guidance on relatively easy ways to get started with TensorFlow. I'll only look at relatively simple "CPU only" installs with "standard" Python and Anaconda Python in this post. (I also have a quick test with Intel Python.)
TensorFlow is on its way to becoming the "standard" framework for machine learning. There are many reasons for that, and it is not just for machine learning! In this post I'll give a descriptive introduction to TensorFlow. This is the first post in a series on how to work with TensorFlow. Hopefully after reading this you will have a better understanding of the What? and Why? of TensorFlow.
This post will look at the molecular dynamics program, NAMD. NAMD has good GPU acceleration but is heavily dependent on CPU performance as well. It achieves best performance when there is a proper balance between CPU and GPU. The system under test has 2 Xeon 8180 28-core CPUs. That's the current top of the line Intel processor. We'll see how many GPUs we can add to those Xeon 8180 CPUs to get optimal CPU/GPU compute balance with NAMD.
TensorFlow Scaling on 8 1080Ti GPUs - Billion Words Benchmark with LSTM on a Docker Workstation Configuration
Written on March 2, 2018 by Dr Donald Kinghorn
In this post I present some Multi-GPU scaling tests running TensorFlow on a very nice system with 8 1080Ti GPUs. I use the Docker Workstation setup that I have recently written about. The job I ran for this testing was the "Billion Words Benchmark" using an LSTM model. Results were very good and better than expected.
The Intel CPU flaw and the Meltdown and Spectre security exploits are causing a lot of concern. There is a possibility of application slowdown from the kernel patches to mitigate the exploits. This slowdown is a particular concern for GPU accelerated applications because of the system calls they require for moving data between CPU and GPU memory space. I did some testing on a couple of large TensorFlow and Caffe machine learning jobs along with the creation of an LMDB database from 1.3 million images.
In this post I'll be working up, analyzing, visualizing, and doing Gradient Descent for Linear Regression. It's a Jupyter notebook with all the code for plots and functions in Python, available on my GitHub account.
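For readers who want a feel for the idea before opening the notebook, here is a minimal sketch of batch gradient descent for a one-variable linear model. It is not the code from the notebook: the function name `gradient_descent`, the learning rate, the step count, and the synthetic data are all assumptions chosen for illustration.

```python
def gradient_descent(xs, ys, lr=0.1, steps=2000):
    """Fit y ~ a*x + b by batch gradient descent on the mean squared error.

    Hypothetical helper for illustration; lr and steps are arbitrary choices.
    """
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Residuals of the current model on every data point
        resid = [a * x + b - y for x, y in zip(xs, ys)]
        # Gradient of (1/n) * sum(resid^2) with respect to a and b
        grad_a = (2.0 / n) * sum(r * x for r, x in zip(resid, xs))
        grad_b = (2.0 / n) * sum(resid)
        # Step downhill along the gradient
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b


# Synthetic, noise-free data on the line y = 3x + 2 (made up for this sketch)
xs = [i / 49 for i in range(50)]
ys = [3.0 * x + 2.0 for x in xs]

slope, intercept = gradient_descent(xs, ys)
print(f"slope={slope:.4f} intercept={intercept:.4f}")
```

With real data such as house prices, the features would normally be scaled first so that a single learning rate works for every parameter.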
In Part 3 of this series on Linear Regression I will go into more detail about the Model and Cost function, including several graphs that will hopefully give insight into their nature and serve as a reference for developing algorithms in the next post.
In Part 2 of this series on Linear Regression I will pull a data-set of house sale prices and "features" from Kaggle and explore the data in a Jupyter notebook with pandas and seaborn. We will extract a good subset of data to use for our example analysis of the linear regression algorithms.
Linear regression could possibly be considered the "Hello World" problem of Machine Learning. Its implementation touches on many of the fundamental ideas and problems in this field. I'll give you some guidance for understanding and implementing this fundamental idea.
This is the start of a series of posts on Machine Learning and Data Science. I'll be exploring the algorithms and tools of Machine Learning and Data Science. There will be tutorials, guides, how-tos, reviews and "real world" applications. The posts will be done using Jupyter notebooks and the notebooks will be available on GitHub.
I've been doing this series of posts about setting up Docker for your desktop system, so why not literally add containers to your desktop! The way we have Docker configured, containers are the same as other applications you run. In this post I'll show you how to add icons and menu items to launch containers.
Docker can be complex, but for use on a single-user workstation you can get a lot done with just a few commands. This post will go through some commands to manage your images and containers. We will also go through the process of building a docker image for CUDA development that includes OpenGL support.
A few weeks ago I wrote a blog post titled "Should You Learn to Program with Python". If you read that and decided the answer is yes, then this post is for you.