


Accelerated Parallel Computing
with NVIDIA Tesla and GPU Compute

Peak delivers the highest possible compute performance into the hands of developers, scientists, and engineers to advance computing-enabled discovery and the solution of the world's most challenging computational problems.


Puget Systems has over 18 years of experience designing and building high-quality, high-performance PCs. Our emphasis has always been on reliability, high performance, and quiet operation. We bring this experience to the HPC sector with our Peak family of workstations and servers. Through in-house testing we do not blindly follow the industry -- we help lead it. The products below are starting points that we feel cover some of the most compelling areas where we can contribute to the HPC community. Do you have a project that needs some serious compute power, and you don't know where to turn? Let us help -- it's what we do!


Dr. Donald Kinghorn
Scientific Advisor for Puget Systems

Dr. Kinghorn has a 20+ year history with scientific and high performance computing and holds a BA in Mathematics/Chemistry and a PhD in Theoretical Chemistry. If you are looking for an HPC configuration, check out his HPC Blog.


Puget Peak Mini



Payments starting at $104/month

A compact, efficient, portable developer workstation.

Puget Peak Single Xeon Tower



Payments starting at $133/month

A powerful enterprise-class tower developer workstation with support for four NVIDIA Titan GPUs.

Puget Peak Dual Xeon Tower



Payments starting at $198/month

A powerful enterprise-class tower developer workstation with support for dual NVIDIA Tesla GPU cards.

Puget Peak Quad Xeon Tower



Payments starting at $594/month

A quad-socket Xeon E7 tower for maximum processing power in a single box.

Puget Peak 1U



Payments starting at $195/month

A powerful, enterprise-class 1U rackmount server with Intel Xeon processors and up to 4 NVIDIA Tesla GPU cards.

Puget Peak 4U



Payments starting at $352/month

A powerful, enterprise-class 4U rackmount server with dual Intel Xeon processors and up to 8 NVIDIA Tesla GPU cards.


Minimal noise with maximum performance, reliability, and usability. Puget Peak is an evolutionary step built on our custom-systems experience: the post-production performance of Genesis, the server stability of Summit, the silent design of Serenity, the reliability of Obsidian, and even the diminutive Echo have all influenced Peak.


TeraFLOPS. Using Intel Xeon CPUs with the Intel MKL library, or the well-established CUDA platform and libraries, there is tremendous potential for applications that leverage the computing power of both the CPU and the GPU.


Ready for use. Peak systems are installed, configured, and tested under load before they ship, and will (optionally) arrive with the setup and tools you need to get started. Our CentOS setup provides a configuration that can serve as the basis of your working environment.

Part of what makes our cooling both effective and quiet is that we specifically target the hot spots of each system. We place fans only where they are needed and only when they are needed. We then verify the final configuration with extensive testing, full load stress testing, and thermal imaging to ensure excellent cooling.

Example of Puget Systems targeted cooling

Without targeted cooling

With targeted cooling

We know that these PCs are intended for heavy, long duration workloads. We have designed them for long life with 24/7 load, and that is our primary design goal. Through targeted cooling and high quality thermal solutions, we are able to achieve an excellent low noise level while maintaining the cooling necessary for long term high load. Even better, since we are implementing a custom cooling plan for each order, if you have a preference of whether you'd like us to tune more aggressively in either direction (towards even quieter operation, or more extreme cooling), all you have to do is let us know!

Recommended Reading

Read this article at https://www.pugetsystems.com/guides/1303
Dr Donald Kinghorn (Scientific Computing Advisor)

AMD Threadripper 2990WX 32-core vs Intel Xeon-W 2175 14-core - Linpack NAMD and Kernel Build Time

Written on 12/06/2018 by Dr Donald Kinghorn

I recently wrote a post about building and running HPL Linpack on the AMD Threadripper 2990WX -- a "How-To". Most of the time I had with the processor went into getting that to work. However, I did run a few other test jobs that I thought the 2990WX would do well with, and compared the results against my personal workstation with a Xeon-W 2175. In this post I share those test runs with you. It's not thorough testing by any means, but it was interesting, and I was surprised a couple of times by the results.

Read this article at https://www.pugetsystems.com/guides/1291

How to Run an Optimized HPL Linpack Benchmark on AMD Ryzen Threadripper -- 2990WX 32-core Performance

Written on 11/30/2018 by Dr Donald Kinghorn

The AMD Ryzen Threadripper 2990WX with 32 cores is an intriguing processor. I've been asked about its performance for numerical computing and decided to find out how well it would do with my favorite benchmark, "High Performance Linpack" (HPL) -- the benchmark used to rank supercomputers on the Top500 list. It is not always simple to run this test since it can require building a few libraries from source, including the all-important BLAS library, which AMD has optimized in their BLIS package. I give you a complete how-to guide for getting this running to see what the 2990WX is capable of.
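For a rough sense of what the HPL score measures, the same flop accounting can be applied to a toy dense solve in NumPy. This is only an illustrative sketch, not HPL itself (HPL uses a distributed, blocked LU with its own tuning parameters); the flop count below is the standard estimate for an n x n LU solve that HPL also uses.

```python
import time
import numpy as np

n = 2000
rng = np.random.default_rng(0)
A = rng.standard_normal((n, n))
b = rng.standard_normal(n)

t0 = time.perf_counter()
x = np.linalg.solve(A, b)          # LU factorization + triangular solves
elapsed = time.perf_counter() - t0

# standard flop estimate for a dense n x n solve: 2/3 n^3 + 2 n^2
flops = (2.0 / 3.0) * n**3 + 2.0 * n**2
print(f"{flops / elapsed / 1e9:.1f} GFLOPS")

# sanity check: the computed x should actually satisfy A x = b
residual = np.linalg.norm(A @ x - b) / np.linalg.norm(b)
assert residual < 1e-8
```

With an MKL- or BLIS-backed NumPy this little solve already exercises the same optimized BLAS that a real HPL run depends on.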

Read this article at https://www.pugetsystems.com/guides/1267

RTX 2080Ti with NVLINK - TensorFlow Performance (Includes Comparison with GTX 1080Ti, RTX 2070, 2080, 2080Ti and Titan V)

Written on 10/26/2018 by Dr Donald Kinghorn

More machine learning testing with TensorFlow on the NVIDIA RTX GPUs. This post adds dual RTX 2080 Ti with NVLINK and the RTX 2070 to the other testing I've recently done. Performance in TensorFlow with two RTX 2080 Ti GPUs is very good! Also, the NVLINK bridge with two RTX 2080 Ti GPUs gives a bidirectional bandwidth of nearly 100 GB/sec!

Read this article at https://www.pugetsystems.com/guides/1262

NVLINK on RTX 2080 TensorFlow and Peer-to-Peer Performance with Linux

Written on 10/16/2018 by Dr Donald Kinghorn

NVLINK is one of the more interesting features of NVIDIA's new RTX GPUs. In this post I'll take a look at the performance of NVLINK between 2 RTX 2080 GPUs, along with a comparison against the single-GPU testing I've recently done. The testing is a simple look at raw peer-to-peer data transfer performance and a couple of TensorFlow job runs with and without NVLINK.

Read this article at https://www.pugetsystems.com/guides/1247

NVIDIA RTX 2080 Ti vs 2080 vs 1080 Ti vs Titan V, TensorFlow Performance with CUDA 10.0

Written on 10/03/2018 by Dr Donald Kinghorn

Are the NVIDIA RTX 2080 and 2080 Ti good for machine learning? Yes, they are great! The RTX 2080 Ti rivals the Titan V for performance with TensorFlow. The RTX 2080 seems to perform as well as the GTX 1080 Ti (although the RTX 2080 only has 8GB of memory). I've done some testing using TensorFlow 1.10 built against CUDA 10.0, running on Ubuntu 18.04 with the NVIDIA 410.48 driver.

Read this article at https://www.pugetsystems.com/guides/1236

How To Install CUDA 10 (together with 9.2) on Ubuntu 18.04 with support for NVIDIA 20XX Turing GPUs

Written on 09/27/2018 by Dr Donald Kinghorn

NVIDIA recently released version 10.0 of CUDA. This is an upgrade from the 9.x series and has support for the new Turing GPU architecture. This CUDA version has full support for Ubuntu 18.04 as well as 16.04 and 14.04. The CUDA 10.0 release is bundled with the new 410.x display driver for Linux, which is needed for the 20xx Turing GPUs. If you are doing development work with CUDA or running packages that require the CUDA toolkit installed, then you will probably want to upgrade. I'll go through how to install CUDA 10.0 either by itself or alongside an existing CUDA 9.2 install.
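Side-by-side CUDA installs conventionally live in versioned directories under /usr/local, with a symlink selecting the active one. The lines below are a generic sketch of that convention, not the exact steps from the post:

```shell
# both toolkits installed side by side:
#   /usr/local/cuda-9.2    /usr/local/cuda-10.0

# point the conventional symlink at CUDA 10.0
sudo ln -sfn /usr/local/cuda-10.0 /usr/local/cuda

# put the active toolkit on your paths (e.g. in ~/.bashrc)
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH

# confirm which compiler you now get
nvcc --version
```

Switching back to 9.2 is just re-pointing the symlink, which is what makes keeping both versions around practical.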

Read this article at https://www.pugetsystems.com/guides/1230

PyTorch for Scientific Computing - Quantum Mechanics Example Part 4) Full Code Optimizations -- 16000 times faster on a Titan V GPU

Written on 09/14/2018 by Dr Donald Kinghorn

This is the 16,000-times-speedup code-optimization installment of the scientific computing with PyTorch Quantum Mechanics example. The following quote says a lot: "The big magic is that on the Titan V GPU, with batched tensor algorithms, those million terms are all computed in the same time it would take to compute 1!!!"

Read this article at https://www.pugetsystems.com/guides/1225

PyTorch for Scientific Computing - Quantum Mechanics Example Part 3) Code Optimizations - Batched Matrix Operations, Cholesky Decomposition and Inverse

Written on 08/31/2018 by Dr Donald Kinghorn

An amazing result from this testing is that the "batched" code ran in constant time on the GPU: doing the Cholesky decomposition on 1 million matrices took the same amount of time as it did for 10 matrices! In this post we start looking at performance optimization for the Quantum Mechanics problem/code presented in the first two posts. This is the start of the promised speedup of over 15,000 times! I still find the speedup hard to believe, but it turns out little things can make a big difference.
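The batching pattern behind that result is simply "one call over a stack of matrices instead of a Python loop." Here is a minimal NumPy sketch of the idea; PyTorch exposes the same batched Cholesky, and on a GPU the whole stack is factored in parallel, which is where the constant-time behavior comes from:

```python
import numpy as np

rng = np.random.default_rng(0)

# build a batch of 10,000 small symmetric positive definite matrices
batch, n = 10_000, 8
M = rng.standard_normal((batch, n, n))
A = M @ M.transpose(0, 2, 1) + n * np.eye(n)   # each A[i] is SPD

# one call factors the entire stack -- no Python loop over matrices
L = np.linalg.cholesky(A)                      # L has shape (10000, 8, 8)

# verify the factorization for the whole batch: L @ L^T == A
err = np.abs(L @ L.transpose(0, 2, 1) - A).max()
assert err < 1e-8
```

On a CPU this mainly saves Python-loop overhead; on a GPU the batched kernel keeps thousands of small factorizations in flight at once.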

Read this article at https://www.pugetsystems.com/guides/1222

PyTorch for Scientific Computing - Quantum Mechanics Example Part 2) Program Before Code Optimizations

Written on 08/16/2018 by Dr Donald Kinghorn

This is the second post on using PyTorch for scientific computing, working an example from Quantum Mechanics. In this post we go through the formulas that need to be coded, write them up in PyTorch, and give everything a test.

Read this article at https://www.pugetsystems.com/guides/1207

Doing Quantum Mechanics with a Machine Learning Framework: PyTorch and Correlated Gaussian Wavefunctions: Part 1) Introduction

Written on 07/31/2018 by Dr Donald Kinghorn

A Quantum Mechanics problem coded up in PyTorch?! Sure! Why not? I'll explain just enough of the Quantum Mechanics and Mathematics to make the problem and solution (kind of) understandable. The focus is on how easy it is to implement in PyTorch. This first post will give some explanation of the problem and do some testing of a couple of the formulas that will need to be coded up.
