The first Intel Skylake processors are out: the Core i7 6700K and Core i5 6600K. My first question is “are they good for compute?” The short answer is “kind of, probably…”. The serious computing crowd has been looking forward to Skylake because it was expected to implement the eagerly awaited AVX-512 instructions, which use vector registers twice as wide as those of the existing AVX2 implementation (512 bits versus 256 bits). Well, the consumer “Core” version is not going to get that good stuff. We’ll have to wait for the Xeon part for that. That’s OK. Intel seems to be moving toward more differentiation between the consumer “Core” line of processors and the “data center” Xeon hardware. I think going forward the decision will be easier: just go with the Xeon part.
One thing to keep in mind about Skylake: it is not just a new processor, it’s a new platform. It’s on a new socket, with a new chipset, new motherboards, and new power management. It uses DDR4, and the new motherboards have new good stuff like USB 3.1. So yes, Skylake is a significant upgrade, and yes, it is a great processor for a new system.
Check out the Puget Labs article “Haswell vs. Skylake-S: i7 4790K vs i7 6700K”.
Skylake is interesting, and it’s a bit of a mystery at this point in time! I haven’t been able to find any good information on the architecture, and I was hoping that this morning (5th of Aug) the good folks over at Anandtech would have one of their in-depth deep dives and explain all. Unfortunately, it looks like they haven’t been able to find good information either! We’ll have to wait a few weeks until the Intel Developer Forum (IDF) before we get the nitty-gritty details from Intel.
I have done a little informal testing with the new i7 and i5 processors. (Mostly on the i5 since the guys in production are busy beating up the i7 with overclocking!) My favorite compute benchmark, Linpack, is a bit of a disappointment on Skylake. It’s puzzling really since I would have expected it to be at least as good as Haswell … It looks like it’s not!
I have also been doing some systems testing with the molecular dynamics software NAMD lately so I fired up a couple of test jobs to see if some real code would give better performance than Linpack … it does!
Here are some quick numbers for you to ponder.
Note: I’m running Ubuntu 15.04 on the Skylake test systems and the i7 4790K, and CentOS 7 on the i7 4770.
Intel(R) Optimized LINPACK Benchmark data
Intel(R) Core(TM) i5-6600K CPU @ 3.50GHz
Number of CPUs: 1
Number of cores: 4
Number of threads: 4
Maximum memory requested that can be used=16200901024, at the size=45000
=================== Timing linear equation system solver ===================
Size   LDA    Align.  Time(s)  GFlops    Residual      Residual(norm)  Check
30000  30000  1       102.415  175.7722  6.018069e-10  2.372329e-02    pass
35000  35000  1       166.508  171.6785  8.098306e-10  2.350815e-02    pass
40000  40000  1       244.490  174.5257  1.081908e-09  2.406201e-02    pass
Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz (Hyperthreading disabled)
Number of CPUs: 1
Number of cores: 4
Number of threads: 4
Maximum memory requested that can be used=16200901024, at the size=45000
=================== Timing linear equation system solver ===================
Size   LDA    Align.  Time(s)  GFlops    Residual      Residual(norm)  Check
30000  30000  1       97.893   183.8918  5.274344e-10  2.079152e-02    pass
35000  35000  1       155.150  184.2465  7.677294e-10  2.228602e-02    pass
40000  40000  1       230.872  184.8207  9.530465e-10  2.119608e-02    pass
Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
Number of CPUs: 1
Number of cores: 4
Number of threads: 8
Maximum memory requested that can be used=16200901024, at the size=45000
=================== Timing linear equation system solver ===================
Size   LDA    Align.  Time(s)  GFlops    Residual      Residual(norm)  Check
30000  30000  1       89.853   200.3483  6.018069e-10  2.372329e-02    pass
35000  35000  1       145.296  196.7411  8.098306e-10  2.350815e-02    pass
40000  40000  1       213.034  200.2961  1.081908e-09  2.406201e-02    pass
Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz
Number of CPUs: 1
Number of cores: 4
Number of threads: 8
Maximum memory requested that can be used=16200901024, at the size=45000
=================== Timing linear equation system solver ===================
Size   LDA    Align.  Time(s)  GFlops    Residual      Residual(norm)  Check
30000  30000  1       78.790   228.4792  6.018069e-10  2.372329e-02    pass
35000  35000  1       126.072  226.7424  8.098306e-10  2.350815e-02    pass
40000  40000  1       182.586  233.6979  1.081908e-09  2.406201e-02    pass
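To put rough numbers on the Linpack comparison, here is a minimal Python sketch that averages the GFlops over the three problem sizes and reports the relative difference. The GFlops values are copied from the output above. The pairing of each Skylake part with the Haswell part of similar clock speed, and the helper names `mean` and `percent_change`, are assumptions made for illustration and not part of the benchmark itself.

```python
# GFlops for problem sizes 30000, 35000, 40000, copied from the Linpack output.
linpack_gflops = {
    "i5-6600K": [175.7722, 171.6785, 174.5257],
    "i7-4770": [183.8918, 184.2465, 184.8207],
    "i7-6700K": [200.3483, 196.7411, 200.2961],
    "i7-4790K": [228.4792, 226.7424, 233.6979],
}


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def percent_change(new, old):
    """Relative change of `new` versus `old`, in percent."""
    return 100.0 * (new - old) / old


# Assumed pairing: each Skylake chip against the Haswell chip closest in clock.
pairs = [("i5-6600K", "i7-4770"), ("i7-6700K", "i7-4790K")]

for skylake, haswell in pairs:
    s = mean(linpack_gflops[skylake])
    h = mean(linpack_gflops[haswell])
    print(f"{skylake}: {s:.1f} GFlops vs {haswell}: {h:.1f} GFlops "
          f"({percent_change(s, h):+.1f}%)")
```

Keep in mind that the pairs are not perfectly matched: the i7 4770 ran on a different OS, and the two four-thread runs differ slightly in base clock, so the percentages are a guide and not a controlled measurement.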
NAMD
Intel(R) Core(TM) i5-6600K CPU @ 3.50GHz
kinghorn@skylake:~/projects/NAMD/test-jobs/f1atpase$ ../../NAMD_2.10_Linux-x86_64-multicore/namd2 +p4 f1atpase.namd
Info: Benchmark time: 4 CPUs 0.521322 s/step 6.03382 days/ns
Program finished after 269.163978 seconds.
Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
[kinghorn@i7 f1atpase]$ ../../NAMD_2.10_Linux-x86_64-multicore/namd2 +p4 f1atpase.namd
Info: Benchmark time: 4 CPUs 0.606377 s/step 7.01825 days/ns
Program finished after 313.891379 seconds.
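The two figures on NAMD's "Benchmark time" line are related by simple arithmetic, which the following Python sketch reproduces. It assumes 1,000,000 steps per nanosecond (a 1 fs timestep); that value is inferred from the reported numbers and not read from the job's configuration file. The function name `days_per_ns` is made up for this example.

```python
SECONDS_PER_DAY = 86_400

# Assumption: 1 fs timestep, i.e. 1,000,000 steps per simulated nanosecond.
# This value is inferred because it reproduces the days/ns that NAMD printed.
STEPS_PER_NS = 1_000_000

# Seconds per step from the "Benchmark time" lines above.
namd_s_per_step = {"i5-6600K": 0.521322, "i7-4770": 0.606377}


def days_per_ns(s_per_step, steps_per_ns=STEPS_PER_NS):
    """Wall-clock days needed to simulate one nanosecond."""
    return s_per_step * steps_per_ns / SECONDS_PER_DAY


for cpu, s_per_step in namd_s_per_step.items():
    print(f"{cpu}: {s_per_step} s/step -> {days_per_ns(s_per_step):.5f} days/ns")

# Ratio of time per step: values above 1 mean the i5-6600K is faster.
speedup = namd_s_per_step["i7-4770"] / namd_s_per_step["i5-6600K"]
print(f"i5-6600K speedup over i7-4770: {speedup:.2f}x")
```

Comparing s/step directly is probably the fairest single number here, since the total "Program finished" time also includes startup work outside the timed steps.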
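The optimized Linpack benchmark was significantly worse on Skylake (roughly 5% to 6% lower GFlops for the i5 6600K than for the i7 4770, and about 13% lower for the i7 6700K than for the i7 4790K), but the molecular dynamics code NAMD did significantly better: the i5 6600K took about 14% less time per step than the i7 4770. I think the better NAMD number is more indicative of how the processor is going to perform in general. I would like to understand why the Linpack numbers are low, though. That is still puzzling me!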
Happy computing! –dbk