
https://www.pugetsystems.com

Read this article at https://www.pugetsystems.com/guides/1321
Dr Donald Kinghorn (Scientific Computing Advisor )

AMD Threadripper and (1-4) NVIDIA 2080Ti and 2070 for NAMD Molecular Dynamics

Written on December 14, 2018 by Dr Donald Kinghorn


In my recent testing with the AMD Threadripper 2990WX I was impressed by the CPU-based performance with the molecular dynamics program NAMD. Of course adding NVIDIA GPU's to the system gives a dramatic improvement since NAMD has good GPU acceleration. I like NAMD for many reasons and one of them is that it makes a pretty good benchmark for looking at CPU/GPU performance. NAMD requires a balance between CPU and GPU for the best results. It is also not very sensitive to speedup from AVX vector units. NAMD generally scales well with lots of cores (or lots of cluster nodes). After some discussions I decided it would be good to look at multi-GPU performance with NAMD on Threadripper. The assumption was that there would be enough cores to keep up with NVIDIA's powerful new GPU's.

My last post, "AMD Threadripper 2990WX 32-core vs Intel Xeon-W 2175 14-core - Linpack NAMD and Kernel Build Time", is good background for the present post and has some interesting comparisons with an Intel 14-core Xeon-W system.

I spent a long afternoon on the same basic system I used in the last post. I was able to get a little testing done with the 24-core Threadripper 2970WX but most of the results are utilizing the 2990WX 32-core processor.

I had 2 "side fan" cooled NVIDIA RTX 2070 GPU's. It is not practical to use more than 2 of these types of cards in a system because of thermal throttling issues (very bad); see "NVIDIA Dual-Fan GeForce RTX Coolers Ruining Multi-GPU Performance". A couple of days after doing the testing we got in our first batch of RTX 2070's with blower fans! You should be able to configure systems with these now.

We did have blower fan versions of the RTX 2080Ti so I was able to test with 1 to 4 of these great cards.


Test systems: AMD 2990WX and Intel Xeon-W 2175

The AMD Threadripper system I used was a test-bed build with the following main components:

AMD Hardware

  • AMD Ryzen Threadripper 2990WX 32-Core @ 3.00GHz (4.2GHz Turbo)
  • AMD Ryzen Threadripper 2970WX 24-Core @ 3.00GHz (4.0GHz Turbo)
  • Gigabyte X399 AORUS XTREME-CF Motherboard
  • 128GB DDR4 2666 MHz memory
  • Samsung 970 PRO 512GB M.2 SSD
  • NVIDIA RTX 2070
  • NVIDIA RTX 2080Ti

Software


Testing Results

When I sat down in front of the system it had a TR 2970WX 24-core processor in it, so I did a few job runs with that before I swapped in the 2990WX. The first jobs I ran were CPU only. The results were very satisfying in that the scaling with increasing number of threads was very uniform. It is interesting that NAMD performance improved uniformly with SMT "hyper-threads". This is not always the case and you often see that only "real" cores improve performance.

CPU results

The graph shows how well the SMT threads worked with NAMD. Note that Lower is Better! What is being reported is the default NAMD performance output in day/ns, i.e. days needed to do 1 nanosecond of simulation. Yes, this is a very compute intensive task! Big jobs can run for weeks or months. My job runs were for 500 time steps of the simulation.
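To make the day/ns unit concrete, here is a minimal Python sketch of how a per-step wall time turns into days per nanosecond of simulated time. The function name, the 0.1 s/step timing, and the 2 fs time step are assumptions chosen for illustration; they are not values from these job runs.

```python
SECONDS_PER_DAY = 86_400
FEMTOSECONDS_PER_NS = 1_000_000


def days_per_ns(seconds_per_step: float, timestep_fs: float) -> float:
    """Wall-clock days needed to simulate 1 ns, given the cost of one step.

    seconds_per_step: average wall time for one MD time step.
    timestep_fs: simulated time covered by one step, in femtoseconds.
    """
    if seconds_per_step <= 0 or timestep_fs <= 0:
        raise ValueError("both arguments must be positive")
    steps_per_ns = FEMTOSECONDS_PER_NS / timestep_fs
    return seconds_per_step * steps_per_ns / SECONDS_PER_DAY


if __name__ == "__main__":
    # Hypothetical inputs: 0.1 s per step with a 2 fs time step.
    print(f"{days_per_ns(0.1, 2.0):.3f} day/ns")
```

With a fixed time step, day/ns is just the per-step cost rescaled, which is why a short 500-step run is enough to estimate it.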

TR NAMD CPU

That is very good CPU performance for that job run! In an older post, NAMD Performance on Xeon-Scalable 8180 and 8 GTX 1080Ti GPUs, using a dual Xeon 8180 system with a total of 56 CPU cores I had a result of 2.93 day/ns with 32 cores. Those processors cost over $10,000 each. So the 32-core Threadripper is a bargain by comparison. [Using all 56 cores on that Intel system I got 1.68 day/ns]. Note: if you look at that older post you will see that I took the inverse of the normal NAMD output and reported ns/day. Keep that in mind if you make a comparison. (sorry about that)
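If you do compare against the older post, the conversion between the two conventions is a simple reciprocal. A small Python sketch using the two dual Xeon 8180 figures quoted above (the function name and dictionary labels are my own):

```python
def ns_per_day(day_per_ns: float) -> float:
    """Convert NAMD's default day/ns (lower is better) to ns/day (higher is better)."""
    if day_per_ns <= 0:
        raise ValueError("day/ns must be positive")
    return 1.0 / day_per_ns


# day/ns values quoted in the text for the dual Xeon 8180 system.
xeon_results = {
    "dual Xeon 8180, 32 cores": 2.93,
    "dual Xeon 8180, 56 cores": 1.68,
}

if __name__ == "__main__":
    for label, value in xeon_results.items():
        print(f"{label}: {value} day/ns = {ns_per_day(value):.3f} ns/day")
```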

GPU accelerated results

The first thing I should say about the GPU results is that, even with the good performance from the 32 cores of the 2990WX, it's just not enough to keep up with more than 1 or 2 of the new NVIDIA RTX GPU's. Going from the worst GPU result (one RTX 2070) to the best (four RTX 2080Ti's) is a speedup of only 1.6.

I'm not saying these results are bad! They are actually very good and they clearly show how much performance gain there is from adding even a "modest" GPU like the RTX 2070 which gives a speedup of nearly 5 over the CPU only result. However, by the time you have added 2 of the RTX 2070's or 2080Ti's you are being limited by the CPU.

In the older post I mentioned above, the dual Xeon 8180's provided enough CPU capability to get 0.438 day/ns with 1 GTX 1080Ti and using 2 1080Ti's gave 0.248 day/ns. Additional GPU's only made a small performance improvement over that, again being limited by CPU. (I tested with up to 8 GPU's).
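Because day/ns is a "lower is better" number, speedup is the baseline divided by the new result. The Python sketch below works that out for the two dual Xeon 8180 figures just quoted; the helper names are hypothetical, and "scaling efficiency" here simply means speedup divided by the factor by which the GPU count grew.

```python
def speedup(baseline_day_per_ns: float, new_day_per_ns: float) -> float:
    """Speedup for a lower-is-better metric: baseline time over new time."""
    if baseline_day_per_ns <= 0 or new_day_per_ns <= 0:
        raise ValueError("day/ns values must be positive")
    return baseline_day_per_ns / new_day_per_ns


def scaling_efficiency(speedup_value: float, resource_ratio: float) -> float:
    """Fraction of ideal scaling achieved (1.0 means perfect scaling)."""
    if resource_ratio <= 0:
        raise ValueError("resource_ratio must be positive")
    return speedup_value / resource_ratio


# day/ns values quoted in the text for the dual Xeon 8180 system.
ONE_GTX_1080TI = 0.438
TWO_GTX_1080TI = 0.248

if __name__ == "__main__":
    s = speedup(ONE_GTX_1080TI, TWO_GTX_1080TI)
    e = scaling_efficiency(s, resource_ratio=2)
    print(f"speedup from 1 to 2 GPUs: {s:.2f}x, efficiency: {e:.0%}")
```

The same two helpers can be applied to any pair of rows in the results table to see where adding GPU's stops paying off.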

Another thing to note in these results is the effect of the SMT "hyper-threads". With the CPU only runs there was a nice improvement with more SMT threads. When the GPU's were added the results were not as predictable. With more than 1 GPU it seemed that the SMT threads were a detriment to performance.

The following table has all of the results of the testing.

TR NAMD CPU GPU

In this chart you can see that there is not much performance difference for many of the configurations. Also note that there can be significant performance variation between job runs. I only did two job runs on each test configuration and took the best one. It is clear that the TR 2990WX is providing more CPU performance than what is balanced with 1 RTX 2070. Adding a second RTX 2070 or 1-2 RTX 2080Ti's provided more GPU performance than the CPU could effectively keep up with.

The following chart gives an easier to see, more uniform, performance scaling as the system specs are improved.

TR NAMD CPU GPU

Recommendation

No matter what CPU you have in your system, if you are running NAMD then adding an NVIDIA GPU will be a significant performance boost. Hopefully this post shows that, and also makes clear the need for significant CPU performance to efficiently balance with modern GPU's. My recommendation for an AMD CPU based NAMD system would be the TR 2990WX and either 2 RTX 2070's or 1 RTX 2080Ti.

I will be doing a more comprehensive test with many GPU's for jobs including NAMD (but with more focus on Machine Learning/AI). That will be using the new Intel Core-X processors. In general I personally prefer an Intel CPU with AVX512 vector units for the basis of any scientific workstation. However, the high core count AMD Threadripper did really well in this NAMD testing ... but see the note below ...

As a last note, I had to cut my testing short because the system failed after a normal OS update that I did in preparation to install CUDA for CPU-GPU memory bandwidth testing. I had 4 RTX 2080Ti's in the system and had booted to that with no problems. After a simple "apt-get upgrade" the system would no longer get to the boot prompt. I didn't have the time to try to find the problem.

Happy computing! --dbk

Tags: Threadripper, Ryzen, 2990WX, NAMD, HPC, Linux
el farmacéutico

Hello!
I am a complete noob when it comes to hardware for molecular modeling, and I was wondering how AMD graphics cards perform for these kinds of tasks. I know AMD cards were preferred over NVIDIA for cryptomining because they were supposedly better suited for calculations, but I have never seen a benchmark using AMD cards for molecular modeling.
Would you mind explaining to me why AMD cards are not used for this? And if they are suited, are they better or worse than NVIDIA?
I thank you kindly for taking your time (and money) to do these tests, it has been extremely useful!

Posted on 2018-12-26 22:19:09
Alexey Trubitsyn

I was not the one who was asked, but nevertheless I hope this will help you. AMD cards are not used so widely for molecular modeling due to historical reasons: NVIDIA was the first to deal with GPU computing. People first tried to use triangles and textures to do scientific computations on a GPU. NVIDIA supported that approach and developed the first library that made it easier to write this kind of software. CUDA was very restricted at first, but due to the lack of competitors it got widely adopted.
The question "are they better or worse", as it always does with such questions, boils down to details like:
What exactly do you need in your project?
Performance? Both sides compete with each other from time to time. AMD cards may be slightly more efficient in terms of computation per $USD.
Community support? CUDA has been experimented with widely by academic researchers over the years. I have been seeing many supercomputing centres installing NVIDIA. Though OpenCL has been catching up recently.
Programming productivity? NVIDIA CUDA programming is relatively simpler as it only needs to support its own GPU. The Unified Memory also might be a big deal for certain people.
Profiling and debugging capability? Both have their own software tools, which are about equal as far as I'm concerned.
Stable driver support? I have been using the NVIDIA driver on the Linux platform with no major problems whatsoever. I had some compatibility difficulties with AMD drivers on Linux with various hardware.
Vendor independence? The main great feature of OpenCL is heterogeneous computing: the same code can be launched on GPU, CPU, etc. With an AMD card you'll be using OpenCL, which can run on NVIDIA cards also, but you will certainly face performance issues during such a transition. Take a look at this paper for details: https://www.spiedigitallibr...
P.S. If you are new to the field I would strongly recommend NVIDIA + CUDA for your tasks, mainly due to tons of tutorials online, a bigger community and the ease of getting your system up and running. Best of luck in your research!

Posted on 2019-01-13 17:14:47
Donald Kinghorn

Thank you Alexey! I had forgotten to add this post to my comment monitoring!

Posted on 2019-01-14 19:29:56
Fernando Bachega

Hi Dr Donald Kinghorn. Thanks for such a nice review.

I'm a NAMD user and considering buying the following CPU + GPU:

AMD Ryzen 7 2700X with Wraith Prism Cooler, Octa Core, Cache 20MB, 3.7GHz (Max Turbo 4.35GHz) AM4 - YD270XBGAFBOX

VGA EVGA NVIDIA GeForce GTX 1080 FTW 8GB, GDDR5, 256 Bits - 08G-P4-6286-KR

Do you think it's a good setup for someone with a limited budget?

Thanks a lot for your attention.

Posted on 2019-01-05 02:10:12
TA Nie

Not the writer, but that will be a decent machine and get the job done for sure.

Posted on 2019-01-14 17:52:04
Donald Kinghorn

Thank you for responding to the question ... I do agree ... and I have added myself to the comment notification list now :-)

Posted on 2019-01-14 19:32:00