HPC Posts Archive

HPL Amdahl's Law scaling chart of results in GFLOPS for TrPRO 7995WX, 7985WX, Tr 7980X and Xeon w9-3495X

AMD Zen4 Threadripper PRO vs Intel Xeon-w9 For Science and Engineering

Posted on March 7, 2024 by Dr. Donald Kinghorn

The performance improvement with the new Zen4 TrPRO over the Zen3 TrPRO is very impressive!
My first recommendation for a Scientific and Engineering workstation CPU would now be the AMD Zen4 architecture as either Zen4 Threadripper PRO or Zen4 EPYC for multi-socket systems.

Benchmarking with TensorRT-LLM

Posted on February 16, 2024 by Jon Allman

Evaluating the speed of GeForce RTX 40-Series GPUs using NVIDIA’s TensorRT-LLM tool for benchmarking GPU inference performance.

Experiences with Multi-GPU Stable Diffusion Training

Posted on January 29, 2024 by Jon Allman

Results and thoughts with regard to testing a variety of Stable Diffusion training methods using multiple GPUs.

Problems With RTX4090 MultiGPU and AMD vs Intel vs RTX6000Ada or RTX3090

Posted on February 15, 2023 by Dr. Donald Kinghorn

I was prompted to do some testing by a commenter on one of my recent posts. They had concerns about problems with dual NVIDIA RTX4090s on AMD Threadripper Pro platforms. I ran some applications to reproduce the reported problems and tried to dig deeper into the issues with more extensive testing. The table included in the post tells all!

Ryzen 7950x Zen4 AVX512 Performance With AMD AOCCv4 HPL HPCG HPL-MxP

Posted on January 20, 2023 by Dr. Donald Kinghorn

This post is a first look at the performance of the Ryzen 9 7950X CPU using the latest AMD compiler release, which supports the Zen4 architecture including AVX512 vector instructions. Performance is tested using the standard HPC benchmarks: HPL (High Performance Linpack), HPCG (High Performance Conjugate Gradient), and the newer HPC Top500 benchmark, HPL-MxP (formerly HPL-AI).

Molecular Dynamics Benchmarks GPU Roundup GROMACS NAMD2 NAMD 3alpha on 12 GPUs

Posted on May 9, 2022 by Dr. Donald Kinghorn

We have a new collection of GPU accelerated Molecular Dynamics benchmark packages put together for GROMACS, NAMD 2, and NAMD 3-alpha10. (The benchmark packages will be available to the public soon.) In this post we present results for:
– 3 applications: GROMACS, NAMD 2 and NAMD 3alpha10,
– 8 MD simulations,
– 12 different NVIDIA GPUs,
– 96 total results.

Intel Ice Lake Xeon-W vs AMD TR Pro Compute Performance (HPL, HPCG, NAMD, Numpy)

Posted on July 29, 2021 by Dr. Donald Kinghorn

The single-socket version of Intel's third-generation Xeon SP is out: the Ice Lake Xeon W 33xx. This is a much better platform, with faster, large-capacity 8-channel memory and PCIe v4 with plenty of lanes. The new Intel platform is very much like the AMD Threadripper Pro (the single-socket version of EPYC Rome), so this is the obvious comparison to make. Read on to see how the numerical computing testing went!

NVIDIA 3080Ti Compute Performance ML/AI HPC

Posted on June 18, 2021 by Dr. Donald Kinghorn

For computing tasks like Machine Learning and some Scientific computing, the RTX3080Ti is an alternative to the RTX3090 when 12GB of GDDR6X is sufficient (compared to the 24GB available on the RTX3090). 12GB is in line with former NVIDIA GPUs that were “work horses” for ML/AI, like the wonderful 2080Ti.

Outstanding Performance of NVIDIA A100 PCIe on HPL, HPL-AI, HPCG Benchmarks

Posted on May 21, 2021 by Dr. Donald Kinghorn

The NVIDIA A100 (Compute) GPU is an extraordinary computing device. It’s not just for ML/AI types of workloads. General scientific computing tasks requiring high performance numerical linear algebra run exceptionally well on the A100.

Intel Rocket Lake Compute Performance Results HPL HPCG NAMD and Numpy

Posted on March 31, 2021 by Dr. Donald Kinghorn

The new Intel Rocket Lake CPUs have been officially released. There were numerous posts and reviews before the official release date of March 30, 2021, but I haven’t seen anything about the numerical compute performance. I’ve had access to a Core-i9 11900KF 8-core CPU and have compared it with my own AMD 5800X system.
