NVIDIA GTC 2023 was outstanding! To say that about a virtual conference tells you how much I value it. This post is largely a catalog of the talks I found interesting along with titles that I think will be interesting to a larger audience and my colleagues at Puget Systems.
Ryzen 7950x Zen4 AVX512 Performance With AMD AOCCv4 HPL HPCG HPL-MxP
This post is a first look at the performance of the Ryzen 9 7950X CPU using the latest AMD compiler release, which supports the Zen 4 architecture including AVX512 vector instructions. Performance is tested using the standard HPC benchmarks HPL (High Performance Linpack) and HPCG (High Performance Conjugate Gradient), plus the newer HPC Top500 benchmark, HPL-MxP (formerly HPL-AI).
NVIDIA RTX4090 ML-AI and Scientific Computing Performance (Preliminary)
This post presents preliminary ML-AI and Scientific application performance results comparing NVIDIA RTX 4090 and RTX 3090 GPUs. These are early results using the NVIDIA CUDA 11.8 driver.
AMD Ryzen 7950X Scientific Computing Performance – 7 Optimized Applications
This post presents scientific application performance testing on the new AMD Ryzen 7950X. I am impressed! Seven applications that are heavy parallel numerical compute workloads were tested. The 7950X outperformed the Ryzen 5950X by as much as 25-40%. For some of the applications it provided nearly 50% of the performance of the much larger and more expensive Threadripper Pro 5995WX 64-core processor. That’s remarkable for a $700 CPU! The Ryzen 7950X is not in the same platform class as the Threadripper Pro, but it is a respectable, budget-friendly numerical computing processor.
WSL2 vs Linux (HPL HPCG NAMD)
We’ve been curious about the performance of WSL for scientific applications and decided to do a few relevant benchmarks. This is also a teaser for some hardware-specific optimized application containerization that I’ve been working on!
Intel Ice Lake Xeon-W vs AMD TR Pro Compute Performance (HPL, HPCG, NAMD, Numpy)
The single-socket version of Intel's third-generation Xeon SP is out: the Ice Lake Xeon W 33xx. This is a much better platform, with faster, large-capacity 8-channel memory and PCIe v4 with plenty of lanes. The new Intel platform is very much like the AMD Threadripper Pro (the single-socket version of EPYC Rome), so this is the obvious comparison to make. Read on to see how the numerical computing testing went!
Self Contained Executable Containers Using Enroot Bundles
NVIDIA Enroot has a unique feature, “Enroot Bundles”, that lets you easily create an executable, self-contained, single-file package with a container image AND the runtime to start it up! The resulting container package will run itself on a system with or without Enroot installed!
NVIDIA 3080Ti Compute Performance ML/AI HPC
For computing tasks like Machine Learning and some Scientific computing, the RTX 3080 Ti is an alternative to the RTX 3090 when 12GB of GDDR6X is sufficient (compared to the 24GB available on the RTX 3090). 12GB is in line with former NVIDIA GPUs that were “work horses” for ML/AI, like the wonderful 2080Ti.
Outstanding Performance of NVIDIA A100 PCIe on HPL, HPL-AI, HPCG Benchmarks
The NVIDIA A100 (Compute) GPU is an extraordinary computing device. It’s not just for ML/AI types of workloads. General scientific computing tasks requiring high performance numerical linear algebra run exceptionally well on the A100.
Run “Docker” Containers with NVIDIA Enroot
Enroot is a simple and modern way to run “docker” or OCI containers. It provides an unprivileged user “sandbox” that integrates easily with a “normal” end user workflow. I like it for running development environments and especially for running NVIDIA NGC containers. In this post I’ll go through the steps for installing Enroot and some simple usage examples, including running NVIDIA NGC containers.