Accelerated Parallel Computing
with NVIDIA Tesla and GPU Compute
Peak puts the highest possible compute performance into the hands of developers, scientists, and engineers to advance computing-enabled discovery and help solve the world's most challenging computational problems.
Puget Systems has over 21 years of experience designing and building high-quality, high-performance PCs. Our emphasis has always been on reliability, high performance, and quiet operation. We bring this experience to the HPC sector with our Peak family of workstations and servers. Because we do our own in-house testing, we do not blindly follow the industry -- we help lead it. The products below are starting points that we feel cover some of the most compelling areas where we can contribute to the HPC community. Do you have a project that needs some serious compute power, and you don't know where to turn? Let us help. It's what we do!
Minimum noise and maximum performance, reliability and usability. Puget Peak is an evolutionary step from our custom systems experience. Genesis post-production performance, Summit server stability, Serenity silent design, Obsidian reliability and even the diminutive Echo have influenced Peak.
TeraFLOPS. With Intel Xeon CPUs and Intel MKL, or the well-established CUDA platform and libraries, applications have tremendous potential to leverage the computing power of both the CPU and the GPU.
Ready for use. Peak systems are installed, configured and tested under load before they ship and will (optionally) arrive with the setup and tools you need to get started. Our CentOS setup will provide a configuration that can be the basis of your working environment.
Part of what makes our cooling both effective and quiet is that we specifically target the hot spots of each system. We place fans only where they are needed and only when they are needed. We then verify the final configuration with extensive testing, full load stress testing, and thermal imaging to ensure excellent cooling.
We know that these PCs are intended for heavy, long-duration workloads. Designing them for long life under 24/7 load is our primary design goal. Through targeted cooling and high-quality thermal solutions, we are able to achieve an excellent low noise level while maintaining the cooling necessary for long-term high load. Even better, since we implement a custom cooling plan for each order, if you would like us to tune more aggressively in either direction (towards even quieter operation, or more extreme cooling), all you have to do is let us know!
I recently wrote a post introducing Intel oneAPI that included a simple installation guide for the Base Toolkit. In that post I promised a follow-up about the oneAPI AI Analytics Toolkit. This is it! I'll describe what it is and give recommendations for installing and setting up the AI toolkits using conda with Anaconda Python.
Intel oneAPI is a massive collection of very high quality developer tools, and it's free to use! In this post I'll give you a little background on what oneAPI is and my recommendations for installing and setting it up so you can start exploring the collection of toolkits.
In this post I will show you how to install NVIDIA's build of TensorFlow 1.15 into an Anaconda Python conda environment. This is the same TensorFlow 1.15 that you would have in the NGC docker container, but no docker install required and no local system CUDA install needed either.
This is a follow-up post to "Quad RTX3090 GPU Wattage Limited "MaxQ" TensorFlow Performance". This post will show you a way to have GPU power limits set automatically at boot by using a simple script and a systemd service unit file.
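As a rough illustration of the idea, the Python sketch below only renders the text of a oneshot systemd service unit that would run a power-limit script at boot. It is not the configuration from the post: the script path `/usr/local/sbin/nv-power-limit.sh`, the unit description, and the function name are all hypothetical. `Type=oneshot` and `WantedBy=multi-user.target` are standard systemd directives.

```python
# Illustrative sketch: build the text of a systemd service unit that runs a
# GPU power-limit script once at boot. Nothing is written to disk here.

def render_power_limit_unit(script_path, description="Set NVIDIA GPU power limits"):
    """Return unit-file text for a oneshot service that executes script_path."""
    lines = [
        "[Unit]",
        f"Description={description}",
        "",
        "[Service]",
        "Type=oneshot",              # run the script once, then exit
        f"ExecStart={script_path}",  # hypothetical path to the power-limit script
        "",
        "[Install]",
        "WantedBy=multi-user.target",  # start during normal multi-user boot
        "",
    ]
    return "\n".join(lines)


# Hypothetical script location, chosen only for this example.
unit_text = render_power_limit_unit("/usr/local/sbin/nv-power-limit.sh")
print(unit_text)
```

A unit like this would typically be saved under `/etc/systemd/system/` and enabled with `systemctl enable`, both of which require root privileges.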
Can you run 4 RTX3090s in a system under heavy compute load? Yes, by using nvidia-smi I was able to reduce the power limit on 4 GPUs from 350W to 280W and achieve over 95% of maximum performance. The total power load "at the wall" was reasonable for a single power supply and a modest US residential 110V, 15A power line.
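The arithmetic behind those figures can be sketched in a few lines of Python. This is a GPU-only estimate: it ignores the CPU, drives, and power-supply losses, so it is not the measured "at the wall" load. The function names are hypothetical; `-i` (GPU index) and `-pl` (power limit in watts) are existing nvidia-smi options, and the commands are only built here, not run.

```python
# GPU-only power budget for the quad RTX3090 example (illustrative sketch).
GPU_COUNT = 4
STOCK_LIMIT_W = 350    # default power limit quoted above
CAPPED_LIMIT_W = 280   # reduced power limit quoted above
CIRCUIT_W = 110 * 15   # nominal capacity of a 110V, 15A line, in watts


def gpu_power_budget(gpu_count, watts_per_gpu):
    """Total GPU power draw if every card runs at its power limit."""
    return gpu_count * watts_per_gpu


def power_limit_commands(gpu_count, watts):
    """Build (but do not execute) one nvidia-smi command per GPU."""
    return [["nvidia-smi", "-i", str(i), "-pl", str(watts)]
            for i in range(gpu_count)]


stock_total = gpu_power_budget(GPU_COUNT, STOCK_LIMIT_W)
capped_total = gpu_power_budget(GPU_COUNT, CAPPED_LIMIT_W)
commands = power_limit_commands(GPU_COUNT, CAPPED_LIMIT_W)

print(f"GPUs at stock limit:  {stock_total} W")
print(f"GPUs at capped limit: {capped_total} W")
print(f"Nominal circuit capacity: {CIRCUIT_W} W")
for cmd in commands:
    print(" ".join(cmd))
```

Changing the power limit requires administrator privileges, and the setting does not persist across a reboot.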
The GeForce RTX3070 has been released. The RTX3070 is loaded with 8GB of memory, making it less suited for compute tasks than the 3080 and 3090 GPUs. We have some preliminary results for TensorFlow, NAMD and HPCG.
When you install Miniconda3 or Anaconda3 on Windows it adds a PowerShell shortcut that has the necessary environment setup and initialization for conda. It's listed in the Windows menu as "Anaconda Powershell Prompt (Anaconda3)". However, this opens a separate/detached PowerShell instance and it would be nice to have this as an optional shell from Windows Terminal! In this post we will add that functionality as a new shell option in Windows Terminal.
The second new NVIDIA RTX30 series card, the GeForce RTX3090, has been released. The RTX3090 is loaded with 24GB of memory, making it a good replacement for the RTX Titan... at significantly less cost! The performance for Machine Learning and Molecular Dynamics on the RTX3090 is quite good, as expected.
The much anticipated NVIDIA GeForce RTX3080 has been released. How good is it with TensorFlow for machine learning? How about molecular dynamics with NAMD? I've got some preliminary numbers for you!
WSL2 offers improved performance over version 1 by providing more direct access to the host hardware drivers. Recent "Insider Dev Channel" builds of Win10 even allow WSL2 Linux applications to access the Windows NVIDIA display driver for GPU computing! The performance improvements with WSL2 are largely because this version runs as a privileged virtual machine on top of MS Hyper-V. This means that at least low-level support for the Hyper-V virtualization layer needs to be enabled to use it. In particular, the Windows feature "VirtualMachinePlatform" must be enabled for WSL2. We tested to see if there was any negative application performance impact.