AMD 3900X (Brief) Compute Performance Linpack and NAMD

I was able to spend a little time with an AMD Ryzen 3900X. Of course the first thing I wanted to know was the double precision floating point performance. My two favorite applications for a “first look” at a new processor are Linpack and NAMD. The Ryzen 3900X is a pretty impressive processor!

Numerical Computing Performance of 3 Intel 8-core CPUs – i9 9900K vs i7 9800X vs Xeon 2145W

In this post I’ll take a brief look at the numerical computing performance of three very capable 8-core processors: the i9 9900K, i7 9800X and Xeon 2145W. All three are great CPUs, but there are some significant differences that can cause confusion. I’ll discuss these differences and see how the processors stack up when running Linpack and NAMD molecular dynamics simulations.

P2P peer-to-peer on NVIDIA RTX 2080Ti vs GTX 1080Ti GPUs

There has been some concern about Peer-to-Peer (P2P) on the NVIDIA RTX Turing GPUs. P2P is not available over PCIe as it has been in past cards. It is available with very good performance when using NVLINK with 2 cards. I did some testing to see how the performance compared between the GTX 1080Ti and RTX 2080Ti. There were some interesting results!

AMD Threadripper and (1-4) NVIDIA 2080Ti and 2070 for NAMD Molecular Dynamics

In my recent testing with the AMD Threadripper 2990WX I was impressed by the CPU-based performance with the molecular dynamics program NAMD. NAMD makes a good benchmark for looking at CPU/GPU performance since it requires a balance between the two and is usually limited by the CPU. After some discussions I decided it would be good to look at multi-GPU performance with NAMD on Threadripper.

AMD Threadripper 2990WX 32-core vs Intel Xeon-W 2175 14-core – Linpack NAMD and Kernel Build Time

I recently wrote a “How-To” post about building and running HPL Linpack on the AMD Threadripper 2990WX. Most of the time I had with the processor went into getting that to work. However, I did run a few other test jobs that I thought the 2990WX would do well with. I compared those against my personal workstation with a Xeon-W 2175. In this post I share those test runs with you. It’s not thorough testing by any means, but it was interesting and I was surprised a couple of times by the results.

How to Run an Optimized HPL Linpack Benchmark on AMD Ryzen Threadripper — 2990WX 32-core Performance

The AMD Ryzen Threadripper 2990WX with 32 cores is an intriguing processor. I’ve been asked about its performance for numerical computing and decided to find out how well it would do with my favorite benchmark, the “High Performance Linpack” benchmark. This is used to rank supercomputers on the Top500 list. It is not always simple to run this test since it can require building a few libraries from source. This includes the all-important BLAS library, which AMD has optimized in their BLIS package. I give you a complete How-To guide for getting this running to see what the 2990WX is capable of.

RTX 2080Ti with NVLINK – TensorFlow Performance (Includes Comparison with GTX 1080Ti, RTX 2070, 2080, 2080Ti and Titan V)

More Machine Learning testing with TensorFlow on the NVIDIA RTX GPUs. This post adds dual RTX 2080 Ti with NVLINK and the RTX 2070 along with the other testing I’ve recently done. Performance in TensorFlow with 2 RTX 2080 Ti cards is very good! Also, the NVLINK bridge with 2 RTX 2080 Ti cards gives a bidirectional bandwidth of nearly 100 GB/sec!

NVLINK on RTX 2080 TensorFlow and Peer-to-Peer Performance with Linux

NVLINK is one of the more interesting features of NVIDIA’s new RTX GPUs. In this post I’ll take a look at the performance of NVLINK between 2 RTX 2080 GPUs along with a comparison against the single GPU testing I’ve recently done. The testing will be a simple look at the raw peer-to-peer data transfer performance and a couple of TensorFlow job runs with and without NVLINK.

NVIDIA RTX 2080 Ti vs 2080 vs 1080 Ti vs Titan V, TensorFlow Performance with CUDA 10.0

Are the NVIDIA RTX 2080 and 2080Ti good for machine learning?
Yes, they are great! The RTX 2080 Ti rivals the Titan V for performance with TensorFlow. The RTX 2080 seems to perform as well as the GTX 1080 Ti (although the RTX 2080 only has 8GB of memory). I’ve done some testing using **TensorFlow 1.10** built against **CUDA 10.0** running on **Ubuntu 18.04** with the **NVIDIA 410.48 driver**.

NAMD Custom Build for Better Performance on your Modern GPU Accelerated Workstation — Ubuntu 16.04, 18.04, CentOS 7

In this post I will be compiling NAMD from source for good performance on modern GPU accelerated Workstation hardware. Doing a custom NAMD build from source code gives a moderate but significant boost in performance. This can be important considering that large simulations over many time-steps can run for days or weeks. I wanted to do some custom NAMD builds to ensure that modern Workstation hardware was being well utilized. I include some results for the STMV benchmark showing the custom build performance boost, using NVIDIA 1080Ti and Titan V GPUs, as well as an “experimental” build using an Ubuntu 18.04 base.