https://www.pugetsystems.com


Read this article at https://www.pugetsystems.com/guides/820
Dr Donald Kinghorn (Scientific Computing Advisor)

Intel Xeon E5 v4 Broadwell Buyers Guide (Parallel Performance)

Written on July 1, 2016 by Dr Donald Kinghorn

Intel's Xeon E5 v4 processors are available and there are lots of them! The changes from the v3 Haswell are mostly small clock changes and increases in core count. You can now get an E5-2699v4 with 22 cores. In a dual-socket system that's 44 cores to work with. If the programs you want to run scale well with thread count then that could be a great processor for you. However, if your parallel scaling is not near linear then it may not be the best value. We have a dynamic chart of performance based on Amdahl's Law that may help you decide which processor is best for your uses.
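To make the Amdahl's Law point concrete, here is a minimal sketch of the textbook formula, speedup = 1 / ((1 - P) + P / N), where P is the fraction of run time that parallelizes and N is the core count. This is not the code behind the article's chart; the function name and the sample fractions are illustrative assumptions, and it ignores clock speed differences between processors.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal Amdahl's Law speedup over one core (hypothetical helper)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Sample parallel fractions are assumptions chosen for illustration only.
for fraction in (0.90, 0.99, 1.00):
    for cores in (8, 22, 44):
        speedup = amdahl_speedup(fraction, cores)
        print(f"P={fraction:.2f}  cores={cores:2d}  speedup={speedup:6.2f}")
```

With 90% of the run time parallel, 44 cores give a speedup of only about 8.3, which is why a lower core count part can be the better value when scaling is far from linear.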


Read this article at https://www.pugetsystems.com/guides/815
Dr Donald Kinghorn (Scientific Computing Advisor)

NAMD Molecular Dynamics Performance on NVIDIA GTX 1080 and 1070 GPU

Written on June 23, 2016 by Dr Donald Kinghorn

The new NVIDIA GeForce GTX 1080 and GTX 1070 GPUs are out and I've received a lot of questions about NAMD performance. The short answer is -- performance is great! I've got some numbers to back that up below. We've got new Broadwell Xeon and Core-i7 CPUs thrown into the mix too. The new hardware refresh gives a nice step up in performance.


Read this article at https://www.pugetsystems.com/guides/803
Dr Donald Kinghorn (Scientific Computing Advisor)

GTX 1080 CUDA performance on Linux (Ubuntu 16.04) preliminary results (nbody and NAMD)

Written on May 27, 2016 by Dr Donald Kinghorn

I just got an NVIDIA GTX 1080 for testing. I hacked up an install with Ubuntu 16.04 and CUDA 7.5 along with a beta display driver that works! The first run of nbody after compiling the CUDA samples gave 5816 GFLOP/s! A GTX 980 on the same system does 2572 GFLOP/s. However, it's not all good news ...


Read this article at https://www.pugetsystems.com/guides/801
Dr Donald Kinghorn (Scientific Computing Advisor)

Intel Broadwell Xeon E5 2600v4 performance test

Written on May 18, 2016 by Dr Donald Kinghorn

The Intel Xeon E5 2600 v4 Broadwell processors are finally available. My first Linpack testing with an E5-2687W v4 shows a greater than 35% performance increase over the v3 Haswell version! And, it's the same price as the v3 version! It's significantly better than expected.


Read this article at https://www.pugetsystems.com/guides/775
Dr Donald Kinghorn (Scientific Computing Advisor)

NVIDIA CUDA with Ubuntu 16.04 beta on a laptop (if you just cannot wait)

Written on March 8, 2016 by Dr Donald Kinghorn

I was preparing a Puget Systems Traverse Skylake based laptop for GPU accelerated molecular dynamics demos at the upcoming ACS meeting and decided to see if I could get Ubuntu 16.04 beta working with NVIDIA CUDA 7.5. It worked!


Read this article at https://www.pugetsystems.com/guides/733
Dr Donald Kinghorn (Scientific Computing Advisor)

Windows 10 with Xeon Phi

Written on November 13, 2015 by Dr Donald Kinghorn

Can you use an Intel Xeon Phi with Windows 10? Yes, you can. However, just because you can do something doesn't mean that you should do it! I did a setup and a little testing mainly just to see if it would work -- it does!


Read this article at https://www.pugetsystems.com/guides/719
Dr Donald Kinghorn (Scientific Computing Advisor)

Molecular Dynamics Performance on GPU Workstations -- NAMD

Written on October 27, 2015 by Dr Donald Kinghorn

Molecular dynamics programs can achieve very good performance on modern GPU-accelerated workstations, giving job performance that was achievable only on CPU compute clusters a few years ago. The group at UIUC working on NAMD were early pioneers in using GPUs for compute acceleration, and NAMD gets very good acceleration from NVIDIA CUDA. We show you how good that performance is on modern NVIDIA GPUs.


Read this article at https://www.pugetsystems.com/guides/682
Dr Donald Kinghorn (Scientific Computing Advisor)

OpenACC for free! -- NVIDIA OpenACC Toolkit

Written on July 14, 2015 by Dr Donald Kinghorn

NVIDIA and PGI are offering "PGI Accelerator with OpenACC" free to academia (or 90 day trial for commercial users) under the banner "NVIDIA OpenACC Toolkit". It's about time!


Read this article at https://www.pugetsystems.com/guides/670
Dr Donald Kinghorn (Scientific Computing Advisor)

Xeon Phi 5110p and Free Intel Parallel Studio Cluster Edition

Written on June 22, 2015 by Dr Donald Kinghorn

Another amazing deal on Xeon Phi from Intel! This time you can get a 90% discount on a Phi 5110p and get the Intel Parallel Studio Cluster edition with a 1 year license for free.


Read this article at https://www.pugetsystems.com/guides/659
Dr Donald Kinghorn (Scientific Computing Advisor)

GTX 980 Ti Linux CUDA performance vs Titan X and GTX 980

Written on June 12, 2015 by Dr Donald Kinghorn

NVIDIA has just launched the GTX 980 Ti and I got to run some benchmarks on one. How is the Linux CUDA performance? Almost as good as the Titan X! This is another great card from NVIDIA for single precision compute loads. We've got some numbers to show it.


Read this article at https://www.pugetsystems.com/guides/654
Dr Donald Kinghorn (Scientific Computing Advisor)

Install NVIDIA CUDA on Fedora 22 with gcc 5.1

Written on May 19, 2015 by Dr Donald Kinghorn

Fedora 22 is full of new goodness like kernel 4.0 and gcc 5.1 and yes, you can install and run CUDA on it! It's not officially supported but I did manage to get it working!


Read this article at https://www.pugetsystems.com/guides/652
Dr Donald Kinghorn (Scientific Computing Advisor)

5 Ways of Parallel Programming

Written on May 12, 2015 by Dr Donald Kinghorn

Modern computing hardware is all about parallelism. This is because we essentially hit the wall several years ago on increasing core clock frequency to speed up serial code execution. The transistor count has continued to follow Moore's Law (doubling every 1.5-2 years) but these transistors have mostly gone into multiple cores, vector units, memory controllers, etc. on a single die. To utilize this hardware, software needs to be written to take advantage of it, i.e. you have to go parallel.


Read this article at https://www.pugetsystems.com/guides/642
Dr Donald Kinghorn (Scientific Computing Advisor)

GTC 2015 Deep Learning and OpenPOWER

Written on April 6, 2015 by Dr Donald Kinghorn

Another great GTC meeting. NVIDIA does this right! The most interesting aspects for me this year were the talks on "Deep Learning" (Artificial Neural Networks) and OpenPOWER. I have some observations and links to recordings of the keynotes and talks. Enjoy!


Read this article at https://www.pugetsystems.com/guides/629
Dr Donald Kinghorn (Scientific Computing Advisor)

NVIDIA CUDA GPU computing on a (modern) laptop

Written on March 13, 2015 by Dr Donald Kinghorn

Modern high-end laptops can be treated as desktop system replacements, so it's expected that people will want to try to do some serious computing on them. Doing GPU accelerated computing on a laptop is possible and performance can be surprisingly good with a high-end NVIDIA GPU. [I'm looking at GTX 980m and 970m.] However, first you have to get it to work! Optimus technology can present serious problems to someone who wants to run a Linux based CUDA laptop computing platform. Read on to see what worked.


Read this article at https://www.pugetsystems.com/guides/626
Dr Donald Kinghorn (Scientific Computing Advisor)

Intel vs NVIDIA, IBM, Mellanox, AMD and everybody!

Written on March 2, 2015 by Dr Donald Kinghorn

The next 18 months are going to see more shakeup and factioning in the computing world than we have seen in over a decade. Intel is pulling more and more of the compute architecture onto a single piece of silicon and tightly integrating the whole hardware stack. That's good and bad. It may let them achieve better performance. However, this is going to leave users with a choice of "all Intel" or something else entirely. And, the "something else" is starting to seriously take shape.


Read this article at https://www.pugetsystems.com/guides/614
Dr Donald Kinghorn (Scientific Computing Advisor)

Xeon Phi 31s1p Cooling and Motherboards

Written on December 18, 2014 by Dr Donald Kinghorn

OK, you got one of the Intel "fire sale / crazy Eddie sale" Xeon Phi 31s1p cards ... now what? I'll give you some tips on how to get this thing working!


Read this article at https://www.pugetsystems.com/guides/599
Dr Donald Kinghorn (Scientific Computing Advisor)

Intel Xeon E5 v3 Haswell-EP Buyers Guide

Written on October 3, 2014 by Dr Donald Kinghorn

The new Xeon E5 v3 Haswell processors are here, all 30+ of them! There is a bewildering variety of clock speeds, core counts, and power usage. There are processors in the new v3 family ranging from the single socket E5-1620v3 with 4 cores at 3.5 GHz to the dual socket E5-2699v3 with 18 cores at 2.3 GHz. How do you make a choice for a new system?! How do these new processors perform when your program's parallel scaling is less than perfect?


Read this article at https://www.pugetsystems.com/guides/595
Dr Donald Kinghorn (Scientific Computing Advisor)

Xeon E5 v3 Haswell-EP Performance -- Linpack

Written on September 8, 2014 by Dr Donald Kinghorn

The Intel Xeon E5 v3 Haswell EP processors are here. The floating point performance on these new processors is outstanding. We run a Linpack benchmark on a dual Xeon E5-2687W v3 system and show how it stacks up against several processors.


Read this article at https://www.pugetsystems.com/guides/596
Dr Donald Kinghorn (Scientific Computing Advisor)

Memory Performance for Intel Xeon Haswell-EP DDR4

Written on September 8, 2014 by Dr Donald Kinghorn

Memory bandwidth is often an important factor for compute or data intensive workloads. The STREAM benchmark has been used for many years as a measure of this bandwidth. We present STREAM results for the new Xeon E5 v3 Haswell processor with DDR4 memory and compare this with a Xeon E5 v2 Ivy Bridge system.


Read this article at https://www.pugetsystems.com/guides/594
Dr Donald Kinghorn (Scientific Computing Advisor)

Linpack performance Haswell E (Core i7 5960X and 5930K)

Written on August 29, 2014 by Dr Donald Kinghorn

The new Intel desktop Core i7 processors are out, Haswell E! We look at how the Core i7 5960X and 5930K stack up with some other processors for numerical computing with the Intel optimized MKL Linpack benchmark.


Read this article at https://www.pugetsystems.com/guides/593
Dr Donald Kinghorn (Scientific Computing Advisor)

LAMMPS Optimized for Intel on Quad Socket Xeon

Written on August 27, 2014 by Dr Donald Kinghorn

LAMMPS is a molecular dynamics program capable of running very large (billions of atoms) dynamics simulations. It is modular, with many contributed packages that add extra potential energy functions, atom types, etc. A recently added package, USER-INTEL, brings some nice code optimizations for Intel Xeon hardware. We grabbed the latest source code, did a build with this new code, fired it up on our quad Xeon test system, and got very good performance.


Read this article at https://www.pugetsystems.com/guides/587
Dr Donald Kinghorn (Scientific Computing Advisor)

OpenFOAM performance on Quad socket Xeon and Opteron

Written on August 5, 2014 by Dr Donald Kinghorn

OpenFOAM is a collection of programs and libraries for computational fluid dynamics (CFD) and general dynamical modelling with many solver types. It can give linear scaling and excellent parallel performance on quad-socket many-core systems. Read on to see performance on a 40-core Xeon and 48-core Opteron system.


Read this article at https://www.pugetsystems.com/guides/584
Dr Donald Kinghorn (Scientific Computing Advisor)

Why quad Xeon? 95% of peak LINPACK on 40 cores!

Written on July 29, 2014 by Dr Donald Kinghorn

I've been doing application performance testing on our quad socket systems and I am especially liking the quad Xeon box on our test bench. I realized that I haven't published any LINPACK performance numbers for this system (that's my favorite benchmark). I'll show the results for the Intel optimized multi-threaded binary that is included with Intel MKL and do a compile from source using OpenMPI. It turns out that both OpenMP threads and MPI processes give outstanding, near theoretical peak performance. Building from source hopefully shows that it's not just Intel "magic" that leads to this performance ... although I guess it really is.
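For readers unfamiliar with "percent of peak", the usual back-of-envelope estimate multiplies sockets, cores per socket, clock rate and floating point operations per cycle per core, then divides the measured LINPACK result by that figure. The sketch below follows that arithmetic. Only the 4 sockets and 40 total cores come from the summary above; the clock rate, FLOPs per cycle and measured result are hypothetical placeholders, not the article's numbers.

```python
def theoretical_peak_gflops(sockets: int, cores_per_socket: int,
                            clock_ghz: float, flops_per_cycle: int) -> float:
    """Nominal double precision peak in GFLOP/s (hypothetical helper)."""
    return sockets * cores_per_socket * clock_ghz * flops_per_cycle


def fraction_of_peak(measured_gflops: float, peak_gflops: float) -> float:
    """Measured benchmark result as a fraction of the nominal peak."""
    return measured_gflops / peak_gflops


# 4 sockets x 10 cores matches the 40-core quad Xeon mentioned above.
# The 2.0 GHz clock and 8 FLOPs/cycle are assumptions for illustration.
peak = theoretical_peak_gflops(sockets=4, cores_per_socket=10,
                               clock_ghz=2.0, flops_per_cycle=8)

# The measured value is a made-up placeholder, not a published result.
example_measured = 600.0
print(f"peak = {peak:.1f} GFLOP/s, "
      f"fraction = {fraction_of_peak(example_measured, peak):.3f}")
```

Real systems complicate this estimate, since sustained clocks under heavy vector load often differ from the nominal clock, so treat the result as a rough reference point.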


Read this article at https://www.pugetsystems.com/guides/579
Dr Donald Kinghorn (Scientific Computing Advisor)

POV-ray on Quad Xeon and Opteron

Written on July 14, 2014 by Dr Donald Kinghorn

POV-ray is an open source ray tracing package with a long history. It has been a favorite system performance testing package since its inception because of the heavy load it places on the CPU. It has had an SMP parallel implementation since the mid-2000s and is often used as a multi-core CPU parallel performance benchmark on both Linux and Windows. So let's try it on our quad socket many-core systems!


Read this article at https://www.pugetsystems.com/guides/578
Dr Donald Kinghorn (Scientific Computing Advisor)

Hyper-Threading may be Killing your Parallel Performance

Written on July 2, 2014 by Dr Donald Kinghorn

Hyper-Threading, hyperthreading, or just HT for short, has been around on Intel processors for over a decade and it still confuses people. I'm not going to do much to help with the confusion. I just want to point out an example from some testing I was doing recently with the ray-tracing application POV-ray that surprised me. Hyper-threading dramatically lowered the performance on a multi-core test system running Windows when running POV-ray in parallel.