In the late 90s I had the opportunity to take a factory tour of the Porsche plant in Stuttgart, Germany. I watched as engineers assembled engines by hand. The only automation I noticed was the robotic carts that delivered parts to each work station. Our tour guide pointed out that each Porsche was built to order and that a number of models had long waiting lists. But it was an area near the end of the tour, just off the main assembly line, that stood out to me that day. In this area were maybe a dozen or so women stitching together what looked to be large swaths of leather or canvas. In the plant of such a high-performance car company, this particular area felt antiquated. Another man in our tour group asked the tour guide why those women were not using commercial stitching machines.
I sat on a chair made for a kindergartner in the back of a dark auditorium waiting for my daughter to perform at her school Christmas program. You could almost feel the nervous energy coming from the children and especially from parents like me, who were not sure if their child had remembered to bring their sheet music, instrument and every part of their costume, including the reindeer antlers.
OK, you got one of the Intel "fire sale / crazy Eddie sale" Xeon Phi 31s1p cards ... now what? I'll give you some tips on how to get this thing working!
As part of my job at Puget Systems, I speak with many of our customers at various stages of ownership that range from about a week to a couple of years. These customers often share feedback that we use to improve our products and services. Occasionally customers share what they wish they had done differently when they were configuring their computers. I share this information with our sales team, and figured it might be helpful to those of you considering a new computer today. So in the vein of "If I could do it all over again..." here are a number of items our customers would change if they could turn back time:
If you have done a fresh install of CentOS 6.6 or "updated" to it from a 6.5 install and you are setting up NVIDIA CUDA 6.5, you may be having trouble with a failed build of the nvidia-uvm kernel module. Read on for a fix ...
A few months ago my car wouldn't start. I narrowed the problem down to the starter motor. After doing a little research online, I decided I could perform the repair myself. I ordered the motor and expected the replacement to take a couple of hours. If you've ever replaced a starter motor, you know that getting to the starter is often the most time-consuming part of the project. It didn't take long to realize the tools I had on hand were not tailored for the job. I'm a lot more comfortable around computers than I am around cars. But I figured with detailed instructions in hand, I'd have my car up and running soon. That wasn't the case.
I love apps that save time, even just a few seconds on each use. Most of my day is spent writing, so any tool that allows me to keep my focus on that activity earns a spot on my computer. Over the years, I've tested dozens of utilities that promised to save time, and I've found that very few have lived up to that promise. Many are too complex, require too much administration or just don't work the way they should. But a few apps have withstood the test of time. These are the apps I use multiple times a day. A few of these I use a dozen or more times a day. The attribute they have in common is that they save time.
The new Xeon E5 v3 Haswell processors are here, all 30+ of them! There is a bewildering variety of clock speeds, core counts, and power usage. The new v3 family ranges from the single socket E5-1620 v3 with 4 cores at 3.5 GHz to the dual socket E5-2699 v3 with 18 cores at 2.3 GHz. How do you make a choice for a new system?! How do these new processors perform when your program's parallel scaling is less than perfect?
I recently had two experiences while shopping for groceries that I want to share. I do most of the grocery shopping for our family in the evenings when the crowds are lighter and the kids are in bed. I decided to try the largest grocery store in the area. Inside is a deli, bank, pharmacy and coffee shop. This store is open 24 hours. I entered the store around 9 pm, grabbed a cart and made my way down the aisles. I was especially impressed with the bakery, but when I got to the produce area, I noticed most sections were covered with large tarps. It felt like a game of hide-and-seek trying to find the gala apples and seedless grapes, but I managed to find what I came for and headed towards the checkout stands.
Sales Consultant Jeff Stubbers recently took home an Asus 4K monitor for personal use, and he liked it so much that he wrote a blog post about it.
On the drive from the kids' school to our home, we pass through a field of black lava formations on the outskirts of Santa Clara, UT. My daughter asked why the lava was black, and before I could say anything my son said, "The lava turns into obsidian when it comes in contact with water." Where did he learn that? Minecraft.
The Intel Xeon E5 v3 Haswell EP processors are here. The floating point performance on these new processors is outstanding. We run a Linpack benchmark on a dual Xeon E5-2687W v3 system and show how it stacks up against several processors.
Memory bandwidth is often an important factor for compute- or data-intensive workloads. The STREAM benchmark has been used for many years as a measure of this bandwidth. We present STREAM results for the new Xeon E5 v3 Haswell processor with DDR4 memory and compare them with a Xeon E5 v2 Ivy Bridge system.
Posted on August 29, 2014 by Dr Donald Kinghorn
The new Intel desktop Core i7 processors are out, Haswell E! We look at how the Core i7 5960X and 5930K stack up with some other processors for numerical computing with the Intel optimized MKL Linpack benchmark.
LAMMPS is a molecular dynamics program capable of running very large (billions of atoms) dynamics simulations. It is modular, with many contributed packages that add extra potential energy functions, atom types, etc. A package, USER-INTEL, was recently added that brings some nice code optimizations for Intel Xeon hardware. We grabbed the latest source code, did a build with this new code, fired it up on our quad Xeon test system and got very good performance.
I'd never used a Dremel before. But I'd have to learn if I wanted a PC that stood out from all the nondescript beige boxes my friends owned. So I spent the afternoon tracing the pattern on the side panel of my Lian-Li aluminum case using a stencil I'd found online. Had YouTube been around at the time, I would have searched for a Dremel tutorial, but it would be a few more years before it existed.
Posted on August 5, 2014 by Dr Donald Kinghorn
OpenFOAM is a collection of programs and libraries for computational fluid dynamics (CFD) and general dynamical modelling with many solver types. It can give linear scaling and excellent parallel performance on quad-socket many-core systems. Read on to see performance on a 40-core Xeon and a 48-core Opteron system.
I've been doing application performance testing on our quad socket systems and I am especially liking the quad Xeon box on our test bench. I realized that I haven't published any LINPACK performance numbers for this system (that's my favorite benchmark). I'll show the results for the Intel optimized multi-threaded binary that is included with Intel MKL and do a compile from source using OpenMPI. It turns out that both OpenMP threads and MPI processes give outstanding, near theoretical peak performance. Building from source hopefully shows that it's not just Intel "magic" that leads to this performance ... although I guess it really is.
To the best of my knowledge, it's been at least six years since we've written about life behind-the-scenes here at Puget Systems. So we're going to kick off a whole new generation of newsletters - focused more on the people and less the technology - as we dive into this summer season of 2014. I hope you enjoy this little glimpse of what working at Puget Systems is really like, day to day.
The first computer I purchased arrived at my home with two operating systems: DOS and Windows 3.1. Most full-fledged programs ran in DOS, including nearly every game in the early 1990s. Besides pool, the game I played most during my college years was called Links Golf, which ran in DOS. Without Links I'm convinced my GPA would be at least a half grade higher. I offset my Links addiction by installing WordPerfect for DOS, which allowed me to write reports from home instead of the school's computer lab.
POV-ray is an open source ray tracing package with a long history. It has been a favorite system performance testing package since its inception because of the heavy load it places on the CPU. It has had an SMP parallel implementation since the mid 2000s and is often used as a multi-core CPU parallel performance benchmark on both Linux and Windows. So let's try it on our quad-socket many-core systems!
Posted on July 2, 2014 by Dr Donald Kinghorn
Hyper-Threading, hyperthreading, or just HT for short, has been around on Intel processors for over a decade and it still confuses people. I'm not going to do much to help with the confusion. I just want to point out an example from some testing I was doing recently with the ray-tracing application POV-ray that surprised me. Hyper-threading dramatically lowered the performance on a multi-core test system running Windows when running POV-ray in parallel.
I'm going to walk you through a basic install and configuration for a development system to do CUDA and OpenACC GPU programming. This is not a detailed how-to, but if you have some Linux admin skills it will be a reasonable guide to get you started. We'll do a basic NVIDIA GPU programming setup including CentOS 6.5, the CUDA development environment and a PGI compiler setup with OpenACC. The most interesting part may be the OpenACC setup. OpenACC is a relatively new option for GPU programming and allows for a directive (pragma) based coding model.
We take a look at Quad Xeon and Quad Opteron performance and parallel scaling with Zemax OpticStudio including an analysis using Amdahl's Law. Based on this analysis we then make performance predictions for other processors.
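For readers who have not met Amdahl's Law, here is a minimal Python sketch of the standard formula, speedup = 1 / ((1 - P) + P / n), where P is the fraction of the runtime that runs in parallel and n is the core count. The function names and the parallel fraction of 0.95 are hypothetical, chosen only for illustration; they are not measurements or code from the OpticStudio post.

```python
# Minimal sketch of Amdahl's Law. The parallel fraction used below is a
# made-up illustration value, not a measurement from the post.

def amdahl_speedup(n_cores: int, parallel_fraction: float) -> float:
    """Ideal speedup on n_cores when parallel_fraction of the runtime scales perfectly."""
    if n_cores < 1:
        raise ValueError("n_cores must be at least 1")
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_cores)


def max_speedup(parallel_fraction: float) -> float:
    """Upper bound on speedup as the core count grows without limit."""
    if parallel_fraction >= 1.0:
        return float("inf")
    return 1.0 / (1.0 - parallel_fraction)


if __name__ == "__main__":
    p = 0.95  # hypothetical parallel fraction
    for cores in (1, 10, 20, 40, 48):
        print(f"{cores:3d} cores: {amdahl_speedup(cores, p):5.2f}x")
    print(f"limit: {max_speedup(p):.1f}x")
```

In practice the parallel fraction is usually estimated by fitting this curve to measured run times at several core counts, and the fitted value is then used to predict speedup on processors with other core counts.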
Several times a year my father would score Utah Jazz tickets, and being the oldest son meant I was the one to accompany him to Salt Lake City to watch the games at the old Salt Palace arena. I sat next to my father for the hour-long drive from our home in northern Utah and knew we were getting close when I could see the arena that looked like a large wedding cake. For the next two hours I'd cheer on the Jazz against their rivals such as the Portland Trail Blazers or the Seattle Sonics. The Jazz were my team and my loyalty knew no bounds. I wore Jazz jerseys, collected player cards, and could tell you how many assists John Stockton needed to overtake Magic Johnson as the all-time assists leader.