Posted on March 2, 2015 by Dr Donald Kinghorn
The next 18 months are going to see more shakeup and fragmentation in the computing world than we have seen in over a decade. Intel is pulling more and more of the compute architecture onto a single piece of silicon and tightly integrating the whole hardware stack. That's good and bad. It may let them achieve better performance. However, it is going to leave users with a choice of "all Intel" or something else entirely. And the "something else" is starting to seriously take shape.
I sat on a chair made for a kindergartner in the back of a dark auditorium, waiting for my daughter to perform at her school Christmas program. You could almost feel the nervous energy coming from the children, and especially from parents like me who were not sure if their child had remembered to bring their sheet music, instrument and every part of their costume, including the reindeer antlers.
OK, you got one of the Intel "fire sale / crazy Eddie sale" Xeon Phi 31s1p cards ... now what? I'll give you some tips on how to get this thing working!
As part of my job at Puget Systems, I speak with many of our customers at various stages of ownership that range from about a week to a couple of years. These customers often share feedback that we use to improve our products and services. Occasionally customers share what they wish they had done differently when they were configuring their computers. I share this information with our sales team, and figured it might be helpful to those of you considering a new computer today. So in the vein of "If I could do it all over again..." here are a number of items our customers would change if they could turn back time:
The new Xeon E5 v3 Haswell processors are here, all 30+ of them! There is a bewildering variety of clock speeds, core counts, and power usage. The processors in the new v3 family range from the single socket E5-1620v3 with 4 cores at 3.5 GHz to the dual socket E5-2699v3 with 18 cores at 2.3 GHz. How do you make a choice for a new system?! How do these new processors perform when your program's parallel scaling is less than perfect?
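To make the clock speed versus core count trade-off concrete, here is a minimal Python sketch. It is not taken from the post: it assumes performance scales linearly with clock speed and follows Amdahl's Law for the parallel part, and it ignores turbo boost, memory bandwidth and multi-socket effects. The function names and the parallel fractions used below are illustrative assumptions; only the core counts and clock speeds come from the text above.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Ideal speedup over one core when only part of the work runs in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def relative_throughput(clock_ghz, cores, parallel_fraction):
    """Rough figure of merit: clock speed times Amdahl speedup.

    Assumption: per-core performance is proportional to clock speed.
    """
    return clock_ghz * amdahl_speedup(parallel_fraction, cores)


if __name__ == "__main__":
    # Core counts and clocks quoted above, for a single processor of each type.
    processors = {
        "E5-1620v3 (4 cores, 3.5 GHz)": (3.5, 4),
        "E5-2699v3 (18 cores, 2.3 GHz)": (2.3, 18),
    }
    # Hypothetical parallel fractions, chosen only for illustration.
    for p in (0.5, 0.9, 0.99):
        print(f"parallel fraction {p}:")
        for name, (clock, cores) in processors.items():
            score = relative_throughput(clock, cores, p)
            print(f"  {name}: {score:.2f}")
```

Under these assumptions the 4-core part comes out ahead when only half of the work is parallel, while the 18-core part wins once the parallel fraction reaches 0.9. Real results depend on the workload, so treat this as a way to frame the question rather than a prediction.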
Sales Consultant Jeff Stubbers recently took home an Asus 4K monitor for personal use, and he liked it so much that he wrote a blog post about it.
The Intel Xeon E5 v3 Haswell EP processors are here. The floating point performance on these new processors is outstanding. We run a Linpack benchmark on a dual Xeon E5-2687W v3 system and show how it stacks up against several processors.
Memory bandwidth is often an important factor for compute or data intensive workloads. The STREAM benchmark has been used for many years as a measure of this bandwidth. We present STREAM results for the new Xeon E5 v3 Haswell processor with DDR4 memory and compare them with a Xeon E5 v2 Ivy Bridge system.
Posted on August 29, 2014 by Dr Donald Kinghorn
The new Intel desktop Core i7 processors are out, Haswell E! We look at how the Core i7 5960X and 5930K stack up with some other processors for numerical computing with the Intel optimized MKL Linpack benchmark.
LAMMPS is a molecular dynamics program capable of running very large (billions of atoms) dynamics simulations. It is modular, with many contributed packages that add extra potential energy functions, atom types, etc. A recently added package, USER-INTEL, brings some nice code optimizations for Intel Xeon hardware. We grabbed the latest source code, did a build with this new code, fired it up on our quad Xeon test system and got very good performance.
I'd never used a Dremel before. But I'd have to learn if I wanted a PC that stood out from all the nondescript beige boxes my friends owned. So I spent the afternoon tracing the pattern on the side panel of my Lian-Li aluminum case using a stencil I'd found online. Had YouTube been around at the time, I would have searched for a Dremel tutorial, but it would be a few more years before it existed.
Posted on August 5, 2014 by Dr Donald Kinghorn
OpenFOAM is a collection of programs and libraries for computational fluid dynamics (CFD) and general dynamical modelling with many solver types. It can give linear scaling and excellent parallel performance on quad socket many-core systems. Read on to see performance on 40-core Xeon and 48-core Opteron systems.
I've been doing application performance testing on our quad socket systems and I am especially liking the quad Xeon box on our test bench. I realized that I haven't published any LINPACK performance numbers for this system (that's my favorite benchmark). I'll show the results for the Intel optimized multi-threaded binary that is included with Intel MKL and do a compile from source using OpenMPI. It turns out that both OpenMP threads and MPI processes give outstanding, near theoretical peak performance. Building from source hopefully shows that it's not just Intel "magic" that leads to this performance ... although I guess it really is.
POV-ray is an open source ray tracing package with a long history. It has been a favorite system performance testing package since its inception because of the heavy load it places on the CPU. It has had an SMP parallel implementation since the mid 2000s and is often used as a multi-core CPU parallel performance benchmark on both Linux and Windows. So let's try it on our quad socket many-core systems!
Posted on July 2, 2014 by Dr Donald Kinghorn
Hyper-Threading, hyperthreading, or just HT for short, has been around on Intel processors for over a decade and it still confuses people. I'm not going to do much to help with the confusion. I just want to point out an example from some testing I was doing recently with the ray-tracing application POV-ray that surprised me. Hyper-threading dramatically lowered the performance on a multi-core test system running Windows when running POV-ray in parallel.
I'm going to walk you through a basic install and configuration for a development system to do CUDA and OpenACC GPU programming. This is not a detailed how-to, but if you have some Linux admin skills it will be a reasonable guide to get you started. We'll do a basic NVIDIA GPU programming setup including CentOS 6.5, the CUDA development environment and a PGI compiler setup with OpenACC. The most interesting part may be the OpenACC setup. OpenACC is a relatively new option for GPU programming and allows for a directive (pragma) based coding model.
We take a look at Quad Xeon and Quad Opteron performance and parallel scaling with Zemax OpticStudio including an analysis using Amdahl's Law. Based on this analysis we then make performance predictions for other processors.
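As a rough illustration of this kind of analysis (not the post's actual method or data), the Python sketch below inverts Amdahl's Law to estimate a parallel fraction from one measured speedup and then uses that fraction to predict the speedup at a different core count. The function names are assumptions made for this example, and a real analysis would normally fit several measurements rather than a single point.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Amdahl's Law: S(n) = 1 / ((1 - P) + P / n)."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)


def estimate_parallel_fraction(cores, measured_speedup):
    """Solve Amdahl's Law for P given one measured speedup on `cores` cores.

    From 1/S = 1 - P * (1 - 1/n) it follows that P = (1 - 1/S) / (1 - 1/n).
    Requires cores > 1.
    """
    if cores <= 1:
        raise ValueError("need a measurement on more than one core")
    return (1.0 - 1.0 / measured_speedup) / (1.0 - 1.0 / cores)


def predict_speedup(cores_measured, measured_speedup, cores_target):
    """Predict the speedup on `cores_target` cores from a single measurement."""
    p = estimate_parallel_fraction(cores_measured, measured_speedup)
    return amdahl_speedup(p, cores_target)
```

For example, with a hypothetical (not measured) speedup of 8x on 10 cores, `estimate_parallel_fraction(10, 8.0)` gives the fraction of the run that behaves as parallel, and `predict_speedup(10, 8.0, 40)` estimates what a 40-core system could deliver if the same fraction holds.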
Several times a year my father would score Utah Jazz tickets, and being the oldest son meant I was the one to accompany him to Salt Lake City to watch the games at the old Salt Palace arena. I sat next to my father for the hour-long drive from our home in northern Utah and knew we were getting close when I could see the arena that looked like a large wedding cake. For the next two hours I'd cheer on the Jazz against rivals such as the Portland Trail Blazers or the Seattle Sonics. The Jazz were my team and my loyalty knew no bounds. I wore Jazz jerseys, collected player cards, and could tell you how many assists John Stockton needed to overtake Magic Johnson as the all-time assists leader.
NVIDIA Tesla K20 plus PGI Accelerator compilers with OpenACC, in a package deal with a system. Yes, it's official. If you've wanted to do some development work with OpenACC on Tesla, this is a nice way to get started: a heavily discounted K20 and PGI compiler package preloaded on a Peak Mini.
Here's a quick look at CUDA performance on the NVIDIA Jetson Tegra K1 developer board.
I recently returned from Las Vegas where I attended an Intel partner event. Over the course of three days, I was able to listen to many speakers give us their predictions for the future of computing. We were presented with demos of fancy all-in-one PCs, sleek new laptops as well as beefy workstations powered by quad Xeons. If Intel was building chips for it, we saw it or heard about it.
This weekend a few of us from Puget Systems made the trip to Bellingham, WA for LinuxFest 2014. Two days of total immersion into the world of Linux and open source. Having recently made the plunge by setting up a native install of Ubuntu on my primary work machine, I thought this would be a great event to soak in the culture and goings-on of the free and open-source software (FOSS) community.
The annual Northwest pilgrimage for the Linux faithful to the Bellingham Technical College in Bellingham, WA is nearly upon us! Puget Systems is donating a great machine to the raffle, a Serenity mini with a commemorative case etching!
Where is NVIDIA heading with High Performance Computing hardware? Ever since Intel announced Xeon Phi Knights Landing as a stand-alone processor integrated at the board level as a full compute unit, I've been wondering what NVIDIA would do along these lines. It just makes sense that they would do something similar since getting the GPU off of the PCIe bus and tightly integrated with plentiful system memory would be a huge step forward for usability and performance. Here's my guess about where NVIDIA is heading.
I had the pleasure of attending the NVIDIA GPU Technology Conference (GTC) last week. Wonderful conference! If you have any doubts about the quality of the conference, you are in luck. They have most of the content online, so you can check it out yourself ...