https://www.pugetsystems.com

Read this article at https://www.pugetsystems.com/guides/594
Dr Donald Kinghorn (Scientific Computing Advisor)

Linpack performance Haswell E (Core i7 5960X and 5930K)

Written on August 29, 2014 by Dr Donald Kinghorn

The new Intel “Haswell E” desktop processors are out, and yes, they are really good! Along with the new processors we have new motherboards, chipsets, and DDR4 memory: lots of good stuff. Matt has some very good articles up on all of it. You should have a look…

I’m mostly interested in numerical performance and potential for use with scientific computing. I’ll be doing new hardware testing over the next couple of weeks so expect to see some interesting posts (assuming you get excited by numbers the way I do :-).

Intel has also just released a new version of their excellent compilers and tools, Parallel Studio XE 2015, so I’ll be giving the new compilers a workout too.

The first thing I usually run on new Intel hardware is the Linpack parallel benchmark optimized with the MKL numerical libraries. This will usually get you close to theoretical peak double precision numerical performance. I’ve got some numbers for the new Core i7 5960X and 5930K processors. I’ll show these in a chart with some other processors for comparison.

The Core i7 5960X outperforms the fastest and highest core-count Sandy Bridge dual Xeon system! Not bad for a “desktop” processor!

Note: Theoretical peak is complicated on the new generations of processors but it is basically

CPU GHz * number of cores * vector ops (AVX) * special instructions (FMA3)

So, for the Core i7 5960X, we expect a theoretical peak of

3.0GHz * 8-cores * 8 DP-vector ops * 2 from FMA3  = 384 GFLOPS  

(384 billion double precision floating point operations per second). The Linpack run we did reached 354 GFLOPS, about 92% of that … VERY GOOD!
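For anyone who wants to redo this arithmetic for another chip, here is a minimal sketch of the same calculation. The function name and its parameters are made up for illustration, and the 354 GFLOPS figure is the measured Core i7 5960X result from the chart in this post.

```python
def theoretical_peak_gflops(ghz, cores, flops_per_cycle):
    """Peak DP GFLOPS = clock (GHz) * cores * DP flops per cycle per core."""
    return ghz * cores * flops_per_cycle

# Core i7 5960X: 8 DP vector ops * 2 from FMA3 = 16 flops per cycle per core
peak = theoretical_peak_gflops(3.0, 8, 8 * 2)

measured = 354.0  # Linpack result for the 5960X from the chart in this post
efficiency = measured / peak

print(f"peak = {peak:.0f} GFLOPS, measured = {measured:.0f} GFLOPS, "
      f"efficiency = {efficiency:.1%}")
```

The clock used here is the listed base clock, so treat the efficiency as a rough guide: turbo behavior under a heavy AVX load can move the real clock away from that number.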

The following chart has Linpack numbers from various systems and various versions of the Intel compilers that I’ve used, so the numbers should be taken as relative rather than absolute. (I would take them with a ±5% grain of salt, if that expression means anything to you!)

Linpack benchmark using the Intel MKL optimizations

 
Processor                    Brief Spec                 Linpack (GFLOPS)
Core i7 5960X (Haswell E)    8 cores @ 3.0GHz AVX2      354
Dual Xeon E5 2687W           16 cores @ 3.2GHz AVX      345
Core i7 5930K (Haswell E)    6 cores @ 3.5GHz AVX2      289
Dual Xeon E5 2650            16 cores @ 2.0GHz AVX      262
Core i7 4770K (Haswell)      4 cores @ 3.5GHz AVX2      182
Xeon E3 1245v3 (Haswell)     4 cores @ 3.4GHz AVX2      170
Core i7 4960X (Ivy Bridge)   6 cores @ 3.6GHz AVX       165
Core i5 3570 (Ivy Bridge)    4 cores @ 3.4GHz AVX       105
Core i7 920                  4 cores @ 2.66GHz SSE4.2   40
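One way to read the chart is to divide each result by cores and clock, which gives the achieved double precision flops per core per cycle. The sketch below does that with the numbers copied from the chart. The helper name is hypothetical, and the clocks are the listed ones, so turbo will skew some rows (a chip running above its listed clock will look better than it should).

```python
# (processor, cores, listed clock in GHz, Linpack GFLOPS), copied from the chart
results = [
    ("Core i7 5960X (Haswell E)", 8, 3.0, 354),
    ("Dual Xeon E5 2687W", 16, 3.2, 345),
    ("Core i7 5930K (Haswell E)", 6, 3.5, 289),
    ("Dual Xeon E5 2650", 16, 2.0, 262),
    ("Core i7 4770K (Haswell)", 4, 3.5, 182),
    ("Xeon E3 1245v3 (Haswell)", 4, 3.4, 170),
    ("Core i7 4960X (Ivy Bridge)", 6, 3.6, 165),
    ("Core i5 3570 (Ivy Bridge)", 4, 3.4, 105),
    ("Core i7 920", 4, 2.66, 40),
]

def flops_per_core_cycle(cores, ghz, gflops):
    """Achieved DP flops per core per clock cycle (hypothetical helper)."""
    return gflops / (cores * ghz)

normalized = {name: flops_per_core_cycle(c, g, f) for name, c, g, f in results}

for name, value in normalized.items():
    print(f"{name:28s} {value:5.2f}")
```

Normalized this way, the rows fall into three groups that line up with the instruction set in the Brief Spec column: the AVX2 parts land well above the AVX parts, and the SSE4.2 part sits below both.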

Happy computing! --dbk

Tags: Haswell E, Linpack, HPC, benchmark
Darin

Very nice. I do wonder about the power/cost analysis, might this be a better choice in certain situations...

Posted on 2014-08-30 16:09:18
decapattack

Wow. Still so far away from video cards. One Radeon R7 370 costs around US$149.00 and it can output 1.61 TFLOPS.

Posted on 2015-07-03 12:54:12
Robert Balthier Roosa

Sorry for necro-ing, but this benchmark was in double precision. The R7 370 can do 124 GFLOPS in double precision.

Posted on 2016-12-20 20:56:30
Arthur Spears

I would love to know how to run this benchmark to see the real world differences in my overclocked i7-980x. I use Intel's BurnTest to find the limit of stability and temperatures when I overclock, but it returns the same GFLOPS value as listed by Intel for max single core performance, so it must not be testing FLOPS. It must be doing something else just to make the CPU run so hot, hotter than any other benchmarking software out there. Stock turbo is 3.8GHz and I'm running stable at 187x24 for 4.48GHz, should definitely see an improvement over stock max single core performance. If I'm not getting any real performance gain, I should ramp this down a bit. I'm a gamer so I'm just trying to limit any bottleneck there might be between the GPU and CPU.

Posted on 2015-10-23 21:37:04
Donald Kinghorn

Hey Arthur, good question! Linpack is a great benchmark in my opinion. It's the traditional benchmark for the Top 500 supercomputer list (though that will be supplemented with another measure soon). It's usually the first thing I run on new hardware. I personally install the Intel compilers when I do testing and have the benchmark binary from the MKL library. However, you can grab the binaries for the benchmark alone from

https://software.intel.com/...

There are builds there for Linux and Windows. We run the Windows build as part of our QC process here on every machine that goes out. [Be warned that Linpack puts a heavy load on the CPUs, so you should watch your temperatures.]

Best regards -Don

Posted on 2015-10-23 23:30:44
Arthur Spears

Hey thanks! And the online chat guy wasn't sure there'd be a quick reply.

Posted on 2015-10-24 01:07:33
Arthur Spears

Best score on my i7-980x overclocked at 4.48ghz was 88.9GFLOPS.

Posted on 2015-10-24 01:52:31

Yeah, that sounds about like what I would expect. What's really limiting you is that CPU lacking AVX extensions. In its day that used to be a great CPU, and that's quite a high overclock on it. If it's stable and cool while running Linpack that's pretty sweet :)

Posted on 2015-10-24 02:12:46
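To put a number on the reply above, here is a rough sketch using the same peak formula as the article. It rests on assumptions worth stating: the i7-980X has 6 cores, an SSE core is taken to do 4 double precision flops per cycle (a 2-wide add plus a 2-wide multiply), plain AVX is taken as 8, and AVX2 with FMA3 as the 16 used in the article. The table and function names are made up for illustration.

```python
# Assumed peak DP flops per cycle per core for each instruction set level
FLOPS_PER_CYCLE = {"SSE": 4, "AVX": 8, "AVX2+FMA3": 16}

def peak_gflops(ghz, cores, isa):
    """Theoretical peak in GFLOPS for a given clock, core count, and ISA level."""
    return ghz * cores * FLOPS_PER_CYCLE[isa]

# i7-980X: 6 cores, overclocked to 4.48 GHz, SSE only
sse_peak = peak_gflops(4.48, 6, "SSE")
reported = 88.9  # best score quoted in the comment above
fraction = reported / sse_peak

print(f"SSE peak {sse_peak:.1f} GFLOPS, reported {reported} GFLOPS "
      f"({fraction:.0%} of peak)")

# Hypothetical: same clock and core count, but with wider vector units
for isa in ("AVX", "AVX2+FMA3"):
    print(f"{isa:10s} peak at the same clock: {peak_gflops(4.48, 6, isa):.1f} GFLOPS")
```

Under these assumptions the overclocked chip is already running at a healthy fraction of what SSE allows, so more clock speed buys little compared with moving to a wider vector unit.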
Arthur Spears

I never fully realized how much of an improvement the AVX extensions are over SSE. And the posted benchmarks above are a definite indicator that I need to upgrade. What's after Skylake? What's their next iteration of an enthusiast processor? Maybe it's X99 or bust for a while.

Posted on 2015-10-24 06:33:44

Skylake is the current (very recently released) mainstream CPU platform. The enthusiast platform is a bit different, though, and currently operating on an architecture called Haswell-E. It is due to get an update to Broadwell-E in the next few months, I believe, so if you aren't in a rush you might wait for that.

Posted on 2015-10-25 00:47:35
Jacob Klein

Just wanted to quickly chime in with my strategy on stress testing my CPU overclocks.

My recommendation:
- Get the latest LinPack from Intel: https://software.intel.com/...
- Get the latest LinX from this thread: http://www.xtremesystems.or... Note: Be sure to scan for viruses, as this is a user-compiled binary, though I haven't had any problems.
- In case the latest LinX doesn't have the most up-to-date LinPack files (which you want for maximum stress), simply replace the following 4 files:
-- 32-bit\libiomp5md.dll (be sure to use the 32-bit version from LinPack)
-- 32-bit\linpack_xeon32.exe
-- 64-bit\libiomp5md.dll (be sure to use the 64-bit version from LinPack)
-- 64-bit\linpack_xeon64.exe
- Then test away!

As previously stated, this will put more thermal stress on your CPU than it has ever seen. Be on the watch for unsafe temperatures (like saturating around TjMax), and bail out of the test if your cooling isn't up to snuff, in which case you need to tone down your clocks.

And be sure to set the problem size to around 2 GB less than your RAM amount, or around 90-95% of it. I have 64 GB of RAM, so I set the problem size to 88000, resulting in a 59.4 GB problem size.

And set the Run dropdown to 500, and let it run overnight. If it can survive this test overnight with no problems, then your CPU overclock is pretty solid. Additional testing can be done using Prime95, even concurrently with LinX if you dare :) If you have GPUs, stress them at the same time, too, for an ultimate test of all hardware, including power supply!

Have fun! Be safe! I use BOINC with all equipment overclocked, while still assuring complete stability, by doing the tests I mentioned.
You should consider signing up for some BOINC projects, too!

Posted on 2017-03-14 03:04:36
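The sizing rule in the comment above can be sketched in a few lines. Linpack solves a dense N x N system in double precision, so the matrix alone takes 8 * N * N bytes. The function names, the 90% default, and the rounding down to a multiple of 8 are all assumptions made for illustration. The memory figure a tool such as LinX reports may differ from this simple estimate, since it can include extra workspace or use different units.

```python
import math

BYTES_PER_DOUBLE = 8
GIB = 2**30

def matrix_bytes(n):
    """Memory for the dense n x n double precision matrix alone."""
    return BYTES_PER_DOUBLE * n * n

def problem_size_for(ram_bytes, fraction=0.90, multiple=8):
    """Largest n, rounded down to a multiple, whose matrix fits in fraction * RAM.

    The default fraction and the rounding multiple are illustrative choices.
    """
    budget = int(ram_bytes * fraction)
    n = math.isqrt(budget // BYTES_PER_DOUBLE)
    return n - n % multiple

# Matrix footprint for the problem size quoted in the comment
print(f"N = 88000 needs {matrix_bytes(88000) / GIB:.1f} GiB for the matrix")

# A problem size targeting about 90% of 64 GiB of RAM
suggested = problem_size_for(64 * GIB)
print(f"suggested N for 64 GiB at 90%: {suggested}")
```

Leave enough headroom for the operating system and anything else running. If the machine starts swapping, the reported GFLOPS will collapse and the test stops telling you much about the CPU.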
Simon Waterholes

You know it's funny, when I first started in computers the ultimate computer was made by Cray and, if memory serves me, it performed at about 1.9 GFLOPS. People spent $10,000.00 for 1 minute of time on that machine. I now use the i7-5930K in my desktop at stock speed running at 289 GFLOPS. By this standard, in just 15 to 20 years whatever passes for a desktop will be over 30 PFLOPS and the RAM will be measured in terabytes. With that kind of power one could logically expect to have a meaningful relationship with your computer. The groundwork for that sort of thing is being put in place now with Google and Siri. The microcomputer isn't even a toddler yet in the grand scheme.

Posted on 2015-11-22 09:48:45
wsxedc

You should state what problem size (or memory usage, because these two values are equivalent) you used. This greatly impacts performance; without it, the results are worthless for comparing CPUs that are in a similar performance range.

Posted on 2017-03-07 21:01:19