Read this article at https://www.pugetsystems.com/guides/1303
Dr Donald Kinghorn (Scientific Computing Advisor)

AMD Threadripper 2990WX 32-core vs Intel Xeon-W 2175 14-core - Linpack NAMD and Kernel Build Time

Written on December 6, 2018 by Dr Donald Kinghorn

I recently wrote a post about building and running HPL Linpack on the AMD Threadripper 2990WX - a "How-To". Most of the time I had with the processor went into getting that working. However, I did run a few other test jobs that I thought the 2990WX would do well with, and I compared those results against my personal workstation with a Xeon-W 2175. In this post I share those test runs with you. It's not thorough testing by any means, but it was interesting, and the results surprised me a couple of times.
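
The kernel build test in the title is easy to reproduce yourself. As a rough sketch (my own illustration, not the article's exact procedure), you can time a parallel kernel compile from Python, assuming you are sitting in an already-configured kernel source tree:

```python
import os
import subprocess
import time

# Time a parallel Linux kernel build. Assumes the working directory is an
# already-configured kernel source tree; details beyond that are up to you.
jobs = os.cpu_count()  # one make job per logical core

start = time.perf_counter()
subprocess.run(["make", f"-j{jobs}"], check=True)
elapsed = time.perf_counter() - start

print(f"make -j{jobs} finished in {elapsed:.1f} s")
```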


Read this article at https://www.pugetsystems.com/guides/1291
Dr Donald Kinghorn (Scientific Computing Advisor)

How to Run an Optimized HPL Linpack Benchmark on AMD Ryzen Threadripper -- 2990WX 32-core Performance

Written on November 30, 2018 by Dr Donald Kinghorn

The AMD Ryzen Threadripper 2990WX with 32 cores is an intriguing processor. I've been asked about its performance for numerical computing and decided to find out how well it would do with my favorite benchmark, "High Performance Linpack" (HPL) - the benchmark used to rank supercomputers on the Top500 list. It is not always simple to run this test since it can require building a few libraries from source, including the all-important BLAS library, which AMD has optimized in their BLIS package. I give you a complete How-To guide for getting this running to see what the 2990WX is capable of.
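
HPL performance is dominated by double-precision matrix multiply (DGEMM). Before working through the full HPL build, a quick numpy sketch like the following (my own sanity check, not part of the article's procedure) gives a rough feel for the DGEMM throughput of whatever BLAS your Python is linked against:

```python
import time
import numpy as np

# Rough DGEMM throughput estimate -- the kernel that dominates HPL.
# Uses whatever BLAS numpy links against (BLIS, MKL, OpenBLAS, ...).
n = 8000
a = np.random.rand(n, n)
b = np.random.rand(n, n)

start = time.perf_counter()
c = a @ b
elapsed = time.perf_counter() - start

print(f"{2.0 * n**3 / elapsed / 1e9:.1f} GFLOPS")  # 2n^3 FLOPs per product
```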


Read this article at https://www.pugetsystems.com/guides/1267
Dr Donald Kinghorn (Scientific Computing Advisor)

RTX 2080Ti with NVLINK - TensorFlow Performance (Includes Comparison with GTX 1080Ti, RTX 2070, 2080, 2080Ti and Titan V)

Written on October 26, 2018 by Dr Donald Kinghorn

More Machine Learning testing with TensorFlow on the NVIDIA RTX GPUs. This post adds dual RTX 2080 Tis with NVLINK and the RTX 2070 to the other testing I've recently done. Performance in TensorFlow with 2 RTX 2080 Tis is very good! Also, the NVLINK bridge with 2 RTX 2080 Tis gives a bidirectional bandwidth of nearly 100 GB/sec!


Read this article at https://www.pugetsystems.com/guides/1262
Dr Donald Kinghorn (Scientific Computing Advisor)

NVLINK on RTX 2080 TensorFlow and Peer-to-Peer Performance with Linux

Written on October 16, 2018 by Dr Donald Kinghorn

NVLINK is one of the more interesting features of NVIDIA's new RTX GPUs. In this post I'll take a look at the performance of NVLINK between 2 RTX 2080 GPUs, along with a comparison against the single-GPU testing I've recently done. The testing will be a simple look at the raw peer-to-peer data transfer performance and a couple of TensorFlow job runs with and without NVLINK.
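
For a quick look at raw transfer rates between two cards, a minimal PyTorch sketch along these lines works (my own illustration, not the article's test code; with an NVLINK bridge and peer access enabled the copy runs over NVLINK, otherwise it falls back to PCIe):

```python
import time
import torch

# Device-to-device copy rate between two GPUs. With an NVLINK bridge and
# peer access enabled the copy runs over NVLINK; otherwise it uses PCIe.
assert torch.cuda.device_count() >= 2
src = torch.randn(64 * 1024 * 1024, device="cuda:0")  # 256 MB of float32
src.to("cuda:1")  # warm-up copy

torch.cuda.synchronize(0)
torch.cuda.synchronize(1)
start = time.perf_counter()
dst = src.to("cuda:1")
torch.cuda.synchronize(0)
torch.cuda.synchronize(1)
print(f"{0.25 / (time.perf_counter() - start):.1f} GB/s")
```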


Read this article at https://www.pugetsystems.com/guides/1247
Dr Donald Kinghorn (Scientific Computing Advisor)

NVIDIA RTX 2080 Ti vs 2080 vs 1080 Ti vs Titan V, TensorFlow Performance with CUDA 10.0

Written on October 3, 2018 by Dr Donald Kinghorn

Are the NVIDIA RTX 2080 and 2080 Ti good for machine learning? Yes, they are great! The RTX 2080 Ti rivals the Titan V for performance with TensorFlow. The RTX 2080 seems to perform as well as the GTX 1080 Ti (although the RTX 2080 only has 8GB of memory). I've done some testing using **TensorFlow 1.10** built against **CUDA 10.0** running on **Ubuntu 18.04** with the **NVIDIA 410.48 driver**.
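
The article's numbers come from real CNN training benchmarks; as a toy stand-in showing the TF 1.x session API of that era (my illustration, not the article's benchmark code), a single large matmul already exercises the GPU:

```python
import time
import tensorflow as tf  # assumes the TF 1.x API, matching the post's era

# Sanity check that TensorFlow sees the GPU, plus a toy FP32 matmul timing.
# This is a stand-in illustration, not the CNN benchmarks from the article.
print("GPU available:", tf.test.is_gpu_available())

n = 8000
with tf.device("/gpu:0"):
    a = tf.random_normal((n, n))
    b = tf.random_normal((n, n))
    c = tf.matmul(a, b)

with tf.Session() as sess:
    sess.run(c)  # warm-up run
    start = time.perf_counter()
    sess.run(c)
    print(f"{2.0 * n**3 / (time.perf_counter() - start) / 1e12:.2f} TFLOPS")
```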


Read this article at https://www.pugetsystems.com/guides/1236
Dr Donald Kinghorn (Scientific Computing Advisor)

How To Install CUDA 10 (together with 9.2) on Ubuntu 18.04 with support for NVIDIA 20XX Turing GPUs

Written on September 27, 2018 by Dr Donald Kinghorn

NVIDIA recently released version 10.0 of CUDA. This is an upgrade from the 9.x series and has support for the new Turing GPU architecture. This CUDA version has full support for Ubuntu 18.04 as well as 16.04 and 14.04. The CUDA 10.0 release is bundled with the new 410.x display driver for Linux, which will be needed for the 20xx Turing GPUs. If you are doing development work with CUDA or running packages that require the CUDA toolkit, then you will probably want to upgrade to this. I'll go through how to install CUDA 10.0 either by itself or alongside an existing CUDA 9.2 install.
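
After the install (or when juggling 9.2 and 10.0 side by side), it's handy to confirm which CUDA runtime a process actually picks up. A small ctypes sketch of my own, assuming libcudart.so is resolvable through your ldconfig or LD_LIBRARY_PATH setup:

```python
import ctypes

# Check which CUDA runtime a process picks up after the install. Assumes
# libcudart.so is resolvable (via ldconfig or LD_LIBRARY_PATH).
cudart = ctypes.CDLL("libcudart.so")
ver = ctypes.c_int()
cudart.cudaRuntimeGetVersion(ctypes.byref(ver))

# Encoded as 1000*major + 10*minor: 10000 -> 10.0, 9020 -> 9.2
print(f"CUDA runtime {ver.value // 1000}.{(ver.value % 1000) // 10}")
```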


Read this article at https://www.pugetsystems.com/guides/1230
Dr Donald Kinghorn (Scientific Computing Advisor)

PyTorch for Scientific Computing - Quantum Mechanics Example Part 4) Full Code Optimizations -- 16000 times faster on a Titan V GPU

Written on September 14, 2018 by Dr Donald Kinghorn

This post covers the code optimizations that produce the 16000 times speedup in the scientific computing with PyTorch Quantum Mechanics example. The following quote says a lot, "The big magic is that on the Titan V GPU, with batched tensor algorithms, those million terms are all computed in the same time it would take to compute 1!!!"


Read this article at https://www.pugetsystems.com/guides/1225
Dr Donald Kinghorn (Scientific Computing Advisor)

PyTorch for Scientific Computing - Quantum Mechanics Example Part 3) Code Optimizations - Batched Matrix Operations, Cholesky Decomposition and Inverse

Written on August 31, 2018 by Dr Donald Kinghorn

An amazing result in this testing is that "batched" code ran in constant time on the GPU. That means that doing the Cholesky decomposition on 1 million matrices took the same amount of time as it did with 10 matrices! In this post we start looking at performance optimization for the Quantum Mechanics problem/code presented in the first 2 posts. This is the start of making good on the promise to get the code running over 15,000 times faster! I still find the speedup hard to believe, but it turns out little things can make a big difference.
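
To see the batched idea in isolation, here is a minimal sketch using the modern API name (the article predates the torch.linalg namespace, so this is the current equivalent rather than necessarily the call the post used):

```python
import time
import torch

# One batched call factors a whole batch of SPD matrices on the GPU.
batch, n = 1_000_000, 8
G = torch.randn(batch, n, n, device="cuda")
A = G @ G.transpose(-1, -2) + n * torch.eye(n, device="cuda")  # SPD batch

torch.cuda.synchronize()
start = time.perf_counter()
L = torch.linalg.cholesky(A)  # one call factors all one million matrices
torch.cuda.synchronize()
print(f"{time.perf_counter() - start:.3f} s for {batch:,} factorizations")
```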


Read this article at https://www.pugetsystems.com/guides/1222
Dr Donald Kinghorn (Scientific Computing Advisor)

PyTorch for Scientific Computing - Quantum Mechanics Example Part 2) Program Before Code Optimizations

Written on August 16, 2018 by Dr Donald Kinghorn

This is the second post on using PyTorch for scientific computing, working through an example from Quantum Mechanics. In this post we go through the formulas that need to be coded, write them up in PyTorch, and give everything a test.


Read this article at https://www.pugetsystems.com/guides/1207
Dr Donald Kinghorn (Scientific Computing Advisor)

Doing Quantum Mechanics with a Machine Learning Framework: PyTorch and Correlated Gaussian Wavefunctions: Part 1) Introduction

Written on July 31, 2018 by Dr Donald Kinghorn

A Quantum Mechanics problem coded up in PyTorch?! Sure! Why not? I'll explain just enough of the Quantum Mechanics and Mathematics to make the problem and solution (kind of) understandable. The focus is on how easy it is to implement in PyTorch. This first post will give some explanation of the problem and do some testing of a couple of the formulas that will need to be coded up.
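
One building block that gives a flavor of why PyTorch fits this kind of problem: keeping a matrix symmetric positive definite via an A = L Lᵀ parameterization and letting autograd handle the derivatives. A toy stand-in of my own, not the article's actual formulas:

```python
import torch

# Toy stand-in (not the article's formulas): keep a matrix symmetric
# positive definite by construction with A = L @ L.T, then let autograd
# differentiate a scalar built from it with respect to the raw parameters.
n = 4
L = torch.randn(n, n).tril()   # lower-triangular parameter matrix
L.requires_grad_(True)

A = L @ L.T                    # SPD (almost surely, for random L)
f = torch.logdet(A)            # a scalar quantity depending on A
f.backward()                   # d f / d L via autograd

print(f.item())
print(L.grad)
```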


Read this article at https://www.pugetsystems.com/guides/1196
Dr Donald Kinghorn (Scientific Computing Advisor)

NAMD Custom Build for Better Performance on your Modern GPU Accelerated Workstation -- Ubuntu 16.04, 18.04, CentOS 7

Written on July 20, 2018 by Dr Donald Kinghorn

In this post I will be compiling NAMD from source for good performance on modern GPU accelerated Workstation hardware. Doing a custom NAMD build from source code gives a moderate but significant boost in performance. This can be important considering that large simulations over many time-steps can run for days or weeks. I wanted to do some custom NAMD builds to ensure that modern Workstation hardware was being well utilized. I include some results for the STMV benchmark showing the custom build performance boost. I've included some results using NVIDIA 1080 Ti and Titan V GPUs as well as an "experimental" build using an Ubuntu 18.04 base.


Read this article at https://www.pugetsystems.com/guides/1193
Dr Donald Kinghorn (Scientific Computing Advisor)

Why You Should Consider PyTorch (includes Install and a few examples)

Written on July 13, 2018 by Dr Donald Kinghorn

PyTorch is a relatively new ML/AI framework. It combines some great features of other packages and has a very "Pythonic" feel. It has excellent, easy-to-use CUDA GPU acceleration. It is fun to use and easy to learn. Read on for some reasons you might want to consider trying it. I've got some unique example code you might find interesting too.
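
A tiny taste of that "Pythonic" feel (my own minimal example): tensors behave like numpy arrays, and moving the same code to the GPU is a one-method-call change:

```python
import math
import torch

# Tensors work like numpy arrays; the GPU move is a one-line change.
x = torch.linspace(0, 1, steps=1000)
y = torch.sin(2 * math.pi * x)

if torch.cuda.is_available():
    x, y = x.cuda(), y.cuda()  # same operations, now GPU-resident

print((x * y).sum().item())
```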


Read this article at https://www.pugetsystems.com/guides/1191
Dr Donald Kinghorn (Scientific Computing Advisor)

Easy Image Bounding Box Annotation with a Simple Mod to VGG Image Annotator

Written on June 29, 2018 by Dr Donald Kinghorn

In this post I go through a simple modification to the VGG Image Annotator that adds easy-to-use buttons for adding labels to image object bounding-boxes. It is a very fast way to do what could otherwise be a tedious machine learning data preparation task.


Read this article at https://www.pugetsystems.com/guides/1187
Dr Donald Kinghorn (Scientific Computing Advisor)

The Best Way to Install TensorFlow with GPU Support on Windows 10 (Without Installing CUDA)

Written on June 21, 2018 by Dr Donald Kinghorn

In this post I'll walk you through the best way I have found so far to get a good TensorFlow work environment on Windows 10, including GPU acceleration. YOU WILL NOT HAVE TO INSTALL CUDA! I'll also go through setting up Anaconda Python, creating an environment for TensorFlow, and making that environment available in Jupyter notebooks. As a "non-trivial" example of using this setup, we'll go through training LeNet-5 with Keras using TensorFlow with GPU acceleration. We'll get a setup that is 18 times faster than using the CPU alone.
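
The trick that avoids a CUDA install is that Anaconda's tensorflow-gpu package pulls in the CUDA and cuDNN runtime libraries as conda dependencies. Once the environment is built, a quick check (a standard TF 1.x call, though the snippet itself is mine) confirms the GPU is visible:

```python
# After the conda environment is set up (the tensorflow-gpu package brings
# its CUDA/cuDNN runtime dependencies with it), check that the GPU shows up.
from tensorflow.python.client import device_lib

for dev in device_lib.list_local_devices():
    print(dev.name, dev.device_type)  # expect a /device:GPU:0 entry
```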


Read this article at https://www.pugetsystems.com/guides/1184
Dr Donald Kinghorn (Scientific Computing Advisor)

How to install CUDA 9.2 on Ubuntu 18.04

Written on June 15, 2018 by Dr Donald Kinghorn

If you want to use Ubuntu 18.04 and also need a CUDA install, this post should help you get that working.


Read this article at https://www.pugetsystems.com/guides/1178
Dr Donald Kinghorn (Scientific Computing Advisor)

The Best Way To Install Ubuntu 18.04 with NVIDIA Drivers and any Desktop Flavor

Written on June 8, 2018 by Dr Donald Kinghorn

In this post I'll be going over the details of installing Ubuntu 18.04, including the NVIDIA display driver and any one of the available desktop environments. I'll do this starting from a base server install. I'll go over a few possible pitfalls and end with a short discussion of the new netplan configuration tool for Ubuntu networking.


Read this article at https://www.pugetsystems.com/guides/1172
Dr Donald Kinghorn (Scientific Computing Advisor)

Install TensorFlow with GPU Support on Windows 10 (without a full CUDA install)

Written on June 4, 2018 by Dr Donald Kinghorn

In this post I'll walk you through the best way I have found so far to get a good TensorFlow work environment on Windows 10, including GPU acceleration. I'll go through how to install just the needed libraries (DLLs) from CUDA 9.0 and cuDNN 7.0 to support TensorFlow 1.8. I'll also go through setting up Anaconda Python, creating an environment for TensorFlow, and making that environment available in Jupyter notebooks. As a "non-trivial" example of using this setup, we'll go through training LeNet-5 with Keras using TensorFlow with GPU acceleration. We'll get a setup that is 18 times faster than using the CPU alone.


Read this article at https://www.pugetsystems.com/guides/1170
Dr Donald Kinghorn (Scientific Computing Advisor)

Install TensorFlow with GPU Support the Easy Way on Ubuntu 18.04 (without installing CUDA)

Written on May 25, 2018 by Dr Donald Kinghorn

TensorFlow is a very important Machine/Deep Learning framework and Ubuntu Linux is a great workstation platform for this type of work. If you want to set up a workstation using Ubuntu 18.04 with CUDA GPU acceleration support for TensorFlow, then this guide will hopefully help you get your machine learning environment up and running without a lot of trouble. And, you don't have to do a CUDA install!


Read this article at https://www.pugetsystems.com/guides/1167
Dr Donald Kinghorn (Scientific Computing Advisor)

PCIe X16 vs X8 with 4 x Titan V GPUs for Machine Learning

Written on May 21, 2018 by Dr Donald Kinghorn

One of the questions I get asked frequently is "how much difference does PCIe X16 vs PCIe X8 really make?" Well, I got some testing done using 4 Titan V GPUs in a machine that will run 4 cards at X16. I ran several jobs with TensorFlow with the GPUs at both X16 and X8. Read on to see how it went.
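
The transfers that PCIe lane width actually affects are host-to-device copies. A rough way to probe that yourself (my own sketch, separate from the article's TensorFlow runs) is timing a pinned-memory copy in PyTorch:

```python
import time
import torch

# Rough host-to-device bandwidth probe -- the kind of transfer that PCIe
# X16 vs X8 affects. Pinned host memory gives the copy its best case.
host = torch.empty(256 * 1024 * 1024, dtype=torch.uint8).pin_memory()  # 256 MB
dev = torch.empty_like(host, device="cuda")

dev.copy_(host)  # warm-up
torch.cuda.synchronize()
start = time.perf_counter()
dev.copy_(host, non_blocking=True)
torch.cuda.synchronize()
print(f"{0.25 / (time.perf_counter() - start):.1f} GB/s")
```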


Read this article at https://www.pugetsystems.com/guides/1157
Dr Donald Kinghorn (Scientific Computing Advisor)

Microsoft Build 2018 -- impressions

Written on May 11, 2018 by Dr Donald Kinghorn

I attended the Microsoft Build 2018 developers conference this week and really enjoyed it. I wanted to share my "big picture" feelings about it and some of the things that stood out to me. I'm not going to give you a "reporter's" view or repeat press-release items. This is just my personal impression of the conference.


Read this article at https://www.pugetsystems.com/guides/1152
Dr Donald Kinghorn (Scientific Computing Advisor)

Multi-GPU scaling with Titan V and TensorFlow on a 4 GPU Workstation

Written on May 4, 2018 by Dr Donald Kinghorn

I have been qualifying a 4 GPU workstation for Machine Learning and HPC use. The last confirmation testing I wanted to do was running TensorFlow benchmarks on 4 NVIDIA Titan V GPUs. I have that system up and running, and the multi-GPU scaling looks very good.


Read this article at https://www.pugetsystems.com/guides/1146
Dr Donald Kinghorn (Scientific Computing Advisor)

GPU Memory Size and Deep Learning Performance (batch size) 12GB vs 32GB -- 1080Ti vs Titan V vs GV100

Written on April 27, 2018 by Dr Donald Kinghorn

Batch size is an important hyper-parameter for Deep Learning model training. When using GPU accelerated frameworks for your models, the amount of memory available on the GPU is a limiting factor. In this post I look at the effect of setting the batch size for a few CNNs running with TensorFlow on the 1080 Ti and Titan V with 12GB of memory, and the GV100 with 32GB.
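
A crude but practical way to find the largest batch your card can hold is to keep doubling the batch size until the framework throws an out-of-memory error. A PyTorch sketch of the idea (placeholder model and input shape of my choosing, not the article's CNNs):

```python
import torch

# Keep doubling the batch size until the GPU runs out of memory.
model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 64, 3), torch.nn.AdaptiveAvgPool2d(1),
    torch.nn.Flatten(), torch.nn.Linear(64, 1000)).cuda()

batch = 16
while True:
    try:
        x = torch.randn(batch, 3, 224, 224, device="cuda")
        model(x).sum().backward()
        print(f"batch size {batch} fits")
        batch *= 2
    except RuntimeError as e:  # CUDA OOM surfaces as a RuntimeError
        if "out of memory" not in str(e):
            raise
        print(f"batch size {batch} is too large")
        break
```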


Read this article at https://www.pugetsystems.com/guides/1141
Dr Donald Kinghorn (Scientific Computing Advisor)

NVIDIA Titan V plus Tensor-cores Considerations and Testing of FP16 for Deep Learning

Written on April 20, 2018 by Dr Donald Kinghorn

Tensor-cores are one of the compelling new features of the NVIDIA Volta architecture. In this post I discuss some thoughts on mixed precision and FP16 as they relate to Tensor-cores. I have some performance results for large convolutional neural network training that make a good argument for trying to use them. Performance looks very good.
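
A quick way to see the FP16 advantage outside of a full training run is to time the same large matmul in both precisions (my own sketch; on Volta, half-precision GEMMs are eligible for Tensor-core execution through cuBLAS):

```python
import time
import torch

# Time one large matmul in FP32 and FP16 on the GPU.
n = 8192
a32 = torch.randn(n, n, device="cuda")
b32 = torch.randn(n, n, device="cuda")
a16, b16 = a32.half(), b32.half()

def bench(a, b):
    a @ b  # warm-up
    torch.cuda.synchronize()
    start = time.perf_counter()
    a @ b
    torch.cuda.synchronize()
    return 2.0 * n**3 / (time.perf_counter() - start) / 1e12

print(f"FP32: {bench(a32, b32):.1f} TFLOPS   FP16: {bench(a16, b16):.1f} TFLOPS")
```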


Read this article at https://www.pugetsystems.com/guides/1134
Dr Donald Kinghorn (Scientific Computing Advisor)

Build TensorFlow-GPU with CUDA 9.1 MKL and Anaconda Python 3.6 using a Docker Container

Written on April 12, 2018 by Dr Donald Kinghorn

Building TensorFlow from source is challenging, but the end result can be a version tailored to your needs. This post will provide step-by-step instructions for building TensorFlow 1.7 linked with Anaconda3 Python, CUDA 9.1, cuDNN 7.1, and Intel MKL-ML. I do the build in a Docker container and show how the container is generated from a Dockerfile.


Read this article at https://www.pugetsystems.com/guides/1133
Dr Donald Kinghorn (Scientific Computing Advisor)

Build TensorFlow-CPU with MKL and Anaconda Python 3.6 using a Docker Container

Written on April 6, 2018 by Dr Donald Kinghorn

In this post I go through how to use Docker to create a container with all of the libraries and tools needed to compile TensorFlow 1.7. The build will include links to Intel MKL-ML (Intel's math kernel library plus extensions for Machine Learning) and optimizations for AVX512.