Skylake-S i7 6700K and i5 6600K for compute? maybe?

The first Intel Skylake processors are out: the Core i7 6700K and i5 6600K. My first question: “are they good for compute?” The short answer is “kind of, probably…”. In the serious computing crowd, everyone has been looking forward to Skylake because it was expected to implement the eagerly awaited AVX-512 instructions. That is a vector unit with twice the width of the existing AVX2 implementation (512 bits versus 256). Well, the consumer “Core” version is not going to get that good stuff. We’ll have to wait for the Xeon part for that. That’s OK. Intel seems to be moving toward more differentiation between the consumer “Core” line of processors and the “data center” Xeon hardware. I think going forward the decision for compute will be easier: just go with the Xeon part.
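To put "twice the width" into numbers, here is a rough back-of-the-envelope sketch of theoretical peak double-precision throughput. It is an estimate under stated assumptions, not a measurement: the function name `peak_gflops` is made up for illustration, it assumes two FMA units per core and a clock that holds at the base frequency under vector load, and the 512-bit line is hypothetical since these consumer parts do not have AVX-512.

```python
def peak_gflops(cores, ghz, vector_bits, fma_units=2, element_bits=64):
    """Theoretical peak double-precision GFLOPS (a ceiling, not a benchmark).

    Assumptions: every core issues `fma_units` fused multiply-adds per
    cycle, and the clock stays at `ghz` under full vector load.
    """
    lanes = vector_bits // element_bits       # doubles per vector register
    flops_per_cycle = lanes * 2 * fma_units   # one FMA = 1 multiply + 1 add
    return cores * ghz * flops_per_cycle


# i5 6600K at its 3.5 GHz base clock with 256-bit AVX2
avx2 = peak_gflops(cores=4, ghz=3.5, vector_bits=256)

# Hypothetical: the same core count and clock with 512-bit vectors
avx512 = peak_gflops(cores=4, ghz=3.5, vector_bits=512)

print(f"AVX2 peak:    {avx2:.1f} GFLOPS")
print(f"AVX-512 peak: {avx512:.1f} GFLOPS ({avx512 / avx2:.0f}x)")
```

Measured Linpack results always land below this kind of ceiling, so it is mainly useful as a sanity check on benchmark numbers.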

One thing to keep in mind about Skylake: it is not just a new processor, it’s a new platform. It’s on a new socket, with a new chipset, new motherboards, and new power management. It uses DDR4, and the new motherboards bring good stuff like USB 3.1. So, yes, Skylake is a significant upgrade and yes, it is a great processor for a new system.

Check out the Puget Labs article “Haswell vs. Skylake-S: i7 4790K vs i7 6700K”.

Skylake is interesting, and it’s a bit of a mystery at this point in time! I haven’t been able to find any good information on the architecture, and I was hoping that this morning (5th of Aug) the good folks over at AnandTech would have one of their deep dives and explain all. Unfortunately it looks like they haven’t been able to find good information either! We’ll have to wait a few weeks until the Intel Developer Forum, IDF, before we get the nitty-gritty details from Intel.

I have done a little informal testing with the new i7 and i5 processors. (Mostly on the i5 since the guys in production are busy beating up the i7 with overclocking!) My favorite compute benchmark, Linpack, is a bit of a disappointment on Skylake. It’s puzzling really since I would have expected it to be at least as good as Haswell … It looks like it’s not!

I have also been doing some systems testing with the molecular dynamics software NAMD lately so I fired up a couple of test jobs to see if some real code would give better performance than Linpack … it does!

Here are some quick numbers for you to ponder.

Note: I’m running Ubuntu 15.04 on the Skylake test systems and the i7 4790K, and CentOS 7 on the i7 4770.

Intel(R) Optimized LINPACK Benchmark data

Intel(R) Core(TM) i5-6600K CPU @ 3.50GHz

Number of CPUs: 1
Number of cores: 4
Number of threads: 4


Maximum memory requested that can be used=16200901024, at the size=45000


=================== Timing linear equation system solver ===================


Size   LDA        Align. Time(s)        GFlops   Residual         Residual(norm) Check
30000  30000  1          102.415        175.7722 6.018069e-10 2.372329e-02   pass
35000  35000  1          166.508        171.6785 8.098306e-10 2.350815e-02   pass
40000  40000  1          244.490        174.5257 1.081908e-09 2.406201e-02   pass

Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz (Hyperthreading disabled)

Number of CPUs: 1
Number of cores: 4
Number of threads: 4


Maximum memory requested that can be used=16200901024, at the size=45000


=================== Timing linear equation system solver ===================


Size   LDA        Align. Time(s)        GFlops   Residual         Residual(norm) Check
30000  30000  1          97.893         183.8918 5.274344e-10 2.079152e-02   pass
35000  35000  1          155.150        184.2465 7.677294e-10 2.228602e-02   pass
40000  40000  1          230.872        184.8207 9.530465e-10 2.119608e-02   pass

Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz

Number of CPUs: 1
Number of cores: 4
Number of threads: 8


Maximum memory requested that can be used=16200901024, at the size=45000


=================== Timing linear equation system solver ===================


Size   LDA        Align. Time(s)        GFlops   Residual         Residual(norm) Check
30000  30000  1          89.853         200.3483 6.018069e-10 2.372329e-02   pass
35000  35000  1          145.296        196.7411 8.098306e-10 2.350815e-02   pass
40000  40000  1          213.034        200.2961 1.081908e-09 2.406201e-02   pass

Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz

Number of CPUs: 1
Number of cores: 4
Number of threads: 8


Maximum memory requested that can be used=16200901024, at the size=45000


=================== Timing linear equation system solver ===================


Size   LDA        Align. Time(s)        GFlops   Residual         Residual(norm) Check
30000  30000  1          78.790         228.4792 6.018069e-10 2.372329e-02   pass
35000  35000  1          126.072        226.7424 8.098306e-10 2.350815e-02   pass
40000  40000  1          182.586        233.6979 1.081908e-09 2.406201e-02   pass
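To make the four Linpack runs easier to compare, here is a small sketch that averages the GFlops column from each run above and reports the relative difference between each Skylake part and its Haswell counterpart. The pairing (i5 6600K against i7 4770, i7 6700K against i7 4790K) and the helper names are my own choices for illustration, and keep in mind that the systems differ in clock speed, operating system and Hyperthreading settings.

```python
from statistics import mean

# GFlops values copied from the Linpack output above (sizes 30000, 35000, 40000)
gflops = {
    "i5-6600K": [175.7722, 171.6785, 174.5257],
    "i7-4770":  [183.8918, 184.2465, 184.8207],
    "i7-6700K": [200.3483, 196.7411, 200.2961],
    "i7-4790K": [228.4792, 226.7424, 233.6979],
}


def percent_change(new, old):
    """Relative change from old to new, in percent (negative means slower)."""
    return 100.0 * (new - old) / old


def compare(skylake, haswell):
    """Percent change in mean GFlops, Skylake relative to Haswell."""
    return percent_change(mean(gflops[skylake]), mean(gflops[haswell]))


for sky, has in [("i5-6600K", "i7-4770"), ("i7-6700K", "i7-4790K")]:
    print(f"{sky} vs {has}: {compare(sky, has):+.1f}% mean GFlops")
```

On these numbers the Skylake parts come out roughly 6% and 13% behind, respectively.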

NAMD

Intel(R) Core(TM) i5-6600K CPU @ 3.50GHz

kinghorn@skylake:~/projects/NAMD/test-jobs/f1atpase$ ../../NAMD_2.10_Linux-x86_64-multicore/namd2 +p4 f1atpase.namd

Info: Benchmark time: 4 CPUs 0.521322 s/step 6.03382 days/ns
Program finished after 269.163978 seconds.

Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz

[kinghorn@i7 f1atpase]$ ../../NAMD_2.10_Linux-x86_64-multicore/namd2 +p4 f1atpase.namd
Info: Benchmark time: 4 CPUs 0.606377 s/step 7.01825 days/ns
Program finished after 313.891379 seconds.
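NAMD reports both seconds per step and days per nanosecond of simulated time. The sketch below shows how the two relate and how much faster the Skylake run was. The 1 fs timestep is not stated in the output; I am inferring it because it reproduces the reported days/ns values, and the function name is just for illustration.

```python
SECONDS_PER_DAY = 86_400


def days_per_ns(seconds_per_step, timestep_fs=1.0):
    """Convert wall-clock seconds per MD step into days per simulated ns.

    timestep_fs is an assumption (1 fs) inferred from the NAMD output above.
    """
    steps_per_ns = 1_000_000 / timestep_fs   # 1 ns = 1,000,000 fs
    return seconds_per_step * steps_per_ns / SECONDS_PER_DAY


skylake_s_per_step = 0.521322   # i5 6600K, from the output above
haswell_s_per_step = 0.606377   # i7 4770, from the output above

print(f"i5 6600K: {days_per_ns(skylake_s_per_step):.5f} days/ns")
print(f"i7 4770:  {days_per_ns(haswell_s_per_step):.5f} days/ns")

# Speedup is the ratio of times per step (higher means Skylake is faster)
speedup = haswell_s_per_step / skylake_s_per_step
print(f"Skylake speedup: {speedup:.2f}x")
```

That works out to a speedup of about 1.16x for the i5 6600K over the i7 4770 on this job.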

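The optimized Linpack benchmark was noticeably worse on Skylake: the i5 6600K came in roughly 4 to 7% below the i7 4770, and the i7 6700K roughly 12 to 14% below the i7 4790K. The molecular dynamics code NAMD, on the other hand, did significantly better on Skylake: the i5 6600K needed about 14% less time per step than the i7 4770 (0.521 vs 0.606 s/step). I think the better NAMD number is more indicative of how the processor is going to perform in general. I would like to understand why the Linpack numbers are low though. That is still puzzling me!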


Happy computing! –dbk