This is just a short post to announce a more usable version of the NVIDIA GPU powerlimit setup script that I released a few months ago. This update to version 0.2 uses an interactive mode to set GPU powerlimits and optionally setup a systemd unit file to set these limits on subsequent reboots.
This post presents testing data showing that power-limit reduction on NVIDIA GPUs have give significant benefits for both high wattage and lower wattage GPUs. Power-limit vs Performance data is presented for 1-4 A5000 and 1-4 RTX3090 GPUs.
In this post I am referencing a Bash shell script I recently put together for setting up automatic NVIDIA GPU power-limit lowering at system boot. This allows a reliable way to configure and maintain multi-GPU systems for stable operation under heavy load.
I have updated my TensorFlow performance testing. This post contains up-to-date versions of all of my testing software and includes results for 1 to 4 RTX and GTX GPU’s. It gives a good comparative overview of most of the GPU’s that are useful in a workstation intended for machine learning and AI development work.
I have been qualifying a 4 GPU workstation for Machine Learning and HPC use. The last confirmation testing I wanted to do was running it with TensorFlow benchmarks on 4 NVIDIA Titan V GPU’s. I have that systems up and running and the multi-GPU scaling looks very good.