I was doing some GPU compute testing the other day on a nice setup with a new Ivy Bridge-E Core i7-4960X (Extreme Edition) processor, and I decided to take a break and see what the CPU would do with my favorite benchmark, Linpack. This is a pretty nice 6-core desktop processor, so I thought it would be interesting to see how it would do against my humble Haswell Core i7-4770.
…hint for the impatient, Haswell wins!
Keep in mind this is just a quick comparison running the Linpack benchmark from the Intel MKL library. I was not intending to do any kind of thorough testing. The system configurations just happened to be what I had set up at the time.
System configurations — briefly
The processor is really the only relevant component since the jobs I ran fit in 16GB.
| | Ivy Bridge-E | Haswell |
|---|---|---|
| Motherboard | ASUS Rampage IV Gene X79 | ASUS Gryphon Z87 |
| CPU | Intel Core i7-4960X | Intel Core i7-4770 |
| Memory | 16GB DDR3 1600 | 32GB DDR3 1600 |
| OS | Fedora 19 | Fedora 20 |
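As a rough check on the "fits in 16GB" remark, here is a minimal sketch of the memory a dense double-precision Linpack problem needs. It assumes storage is dominated by the LDA x N matrix of 8-byte doubles and ignores the right-hand side and any workspace; the function name `matrix_bytes` is just for illustration.

```python
# Rough memory footprint of a dense double-precision Linpack problem.
# Assumption: the LDA x N matrix of 8-byte doubles dominates;
# the right-hand side vector and workspace are ignored.

GIB = 2**30  # bytes in one GiB


def matrix_bytes(n, lda):
    """Bytes needed for a matrix with n columns and leading dimension lda."""
    return lda * n * 8


# Problem sizes taken from the benchmark runs in this post.
for n, lda in [(35000, 35000), (45000, 45000)]:
    size = matrix_bytes(n, lda)
    print(f"N={n}: {size / 1e9:.2f} GB ({size / GIB:.2f} GiB)")
```

The largest job (N=45000) works out to about 16.2 billion bytes, which lines up with the "Maximum memory requested" figure in the benchmark output further down.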
There are lots of differences between these two processors, and some may be more important than others depending on what you are trying to do. For example, the current Haswell has 16 PCIe lanes while the Ivy Bridge-E has 40!
Notable Differences in CPU Architecture
| | Ivy Bridge-E | Haswell |
|---|---|---|
| Name 🙂 | Intel Core i7-4960X | Intel Core i7-4770 |
| # of Cores | 6 | 4 |
| Clock Speed | 3.6GHz | 3.4GHz |
| Max Turbo Frequency | 4GHz | 3.9GHz |
| Cache | 15MB | 8MB |
| Instruction Set Extensions | SSE4.2, AVX | SSE4.2, AVX 2.0 (FMA3) |
Surprising(?) Result
The Haswell processor is just wonderful for numerical linear-algebra type of compute tasks!
| | Ivy Bridge-E | Haswell |
|---|---|---|
| # of Real Cores | 6 | 4 |
| # of Threads (*) | 12 | 8 |
| Approx. Price | $1059.00 | $312.00 |
| Linpack Performance at size=35000 | 155.3 GFLOPS | 182.8 GFLOPS |
(*) I left Hyper-Threading on during these tests but it contributes nothing to this benchmark. It actually slows things down a bit, but you knew that …
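To put the price difference in perspective, here is a quick sketch of GFLOPS per dollar using only the numbers from the table above. The prices are the approximate ones quoted there, and the dictionary layout is just an illustrative choice.

```python
# GFLOPS per dollar from the results table (size=35000 run).
# Prices are the approximate figures quoted in this post.

results = {
    "i7-4960X (Ivy Bridge-E)": {"gflops": 155.3, "price_usd": 1059.00},
    "i7-4770 (Haswell)": {"gflops": 182.8, "price_usd": 312.00},
}


def gflops_per_dollar(entry):
    """Measured Linpack GFLOPS divided by approximate CPU price."""
    return entry["gflops"] / entry["price_usd"]


for name, entry in results.items():
    print(f"{name}: {gflops_per_dollar(entry):.3f} GFLOPS per dollar")
```

This only counts the CPU price, so it says nothing about platform cost (motherboard, memory) or about features like PCIe lanes that this benchmark does not exercise.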
It's interesting to me to see how much difference AVX2 makes with the Haswell. However, don't read too much into this result. The i7-4960X is a great processor and it has some nice features that Haswell lacks at this point. … but, for my favorite benchmark Haswell rocks!
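A back-of-the-envelope peak calculation helps show where the AVX2 advantage comes from. This is a sketch under stated assumptions: it uses the base clock speeds from the table above (not turbo), 8 double-precision FLOPs per cycle per core for AVX (a 4-wide add plus a 4-wide multiply), and 16 for AVX2 with FMA (two 4-wide fused multiply-adds). The real clock under load may differ, so treat the efficiency figures as rough.

```python
# Theoretical double-precision peak = cores x clock (GHz) x FLOPs/cycle/core.
# Assumptions: base clocks (not turbo); 8 DP FLOPs/cycle for AVX,
# 16 DP FLOPs/cycle for AVX2 with FMA.


def peak_gflops(cores, ghz, flops_per_cycle):
    """Theoretical peak in GFLOPS for the given core count and clock."""
    return cores * ghz * flops_per_cycle


cpus = {
    "i7-4960X": {"cores": 6, "ghz": 3.6, "flops_per_cycle": 8, "measured": 155.3},
    "i7-4770": {"cores": 4, "ghz": 3.4, "flops_per_cycle": 16, "measured": 182.8},
}

for name, c in cpus.items():
    peak = peak_gflops(c["cores"], c["ghz"], c["flops_per_cycle"])
    ratio = c["measured"] / peak
    print(f"{name}: peak {peak:.1f} GFLOPS, "
          f"measured {c['measured']:.1f} GFLOPS ({ratio:.0%} of peak)")
```

With these assumptions the base-clock peaks come out to roughly 172.8 GFLOPS for the i7-4960X and 217.6 GFLOPS for the i7-4770, so the four Haswell cores have a higher ceiling than the six Ivy Bridge-E cores for this kind of work.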
I've included the terminal output for the job runs. It's kind of interesting to see how the processors do with smaller jobs, etc.
Intel MKL Linpack Benchmark Core i7-4960X Output
    Intel(R) Optimized LINPACK Benchmark data

    Current date/time: Sat Feb 22 19:24:53 2014

    CPU frequency: 3.601 GHz
    Number of CPUs: 1
    Number of cores: 6
    Number of threads: 12

    Parameters are set to:

    Number of tests: 15
    Number of equations to solve (problem size) : 1000 2000 5000 10000 15000 18000 20000 22000 25000 26000 27000 30000 35000 40000 45000
    Leading dimension of array                  : 1000 2000 5008 10000 15000 18008 20016 22008 25000 26000 27000 30000 35000 40000 45000
    Number of trials to run                     : 4 2 2 2 2 2 2 2 2 2 1 1 1 1 1
    Data alignment value (in Kbytes)            : 4 4 4 4 4 4 4 4 4 4 4 1 1 1 1

    Maximum memory requested that can be used=16200901024, at the size=45000

    =================== Timing linear equation system solver ===================

    Size   LDA    Align. Time(s)    GFlops    Residual     Residual(norm) Check
    1000   1000   4      0.011      60.5545   1.031675e-12 3.518276e-02   pass
    1000   1000   4      0.011      62.9133   1.031675e-12 3.518276e-02   pass
    1000   1000   4      0.010      64.6291   1.031675e-12 3.518276e-02   pass
    1000   1000   4      0.010      64.1308   1.031675e-12 3.518276e-02   pass
    2000   2000   4      0.073      73.3250   4.382272e-12 3.812040e-02   pass
    2000   2000   4      0.073      73.6438   4.382272e-12 3.812040e-02   pass
    5000   5008   4      0.741      112.5025  2.581643e-11 3.599893e-02   pass
    5000   5008   4      0.742      112.3135  2.581643e-11 3.599893e-02   pass
    10000  10000  4      5.021      132.8041  8.700884e-11 3.068020e-02   pass
    10000  10000  4      5.011      133.0682  8.700884e-11 3.068020e-02   pass
    15000  15000  4      15.177     148.2760  2.225641e-10 3.505422e-02   pass
    15000  15000  4      15.232     147.7491  2.225641e-10 3.505422e-02   pass
    18000  18008  4      25.853     150.4133  2.894987e-10 3.170367e-02   pass
    18000  18008  4      25.854     150.4091  2.894987e-10 3.170367e-02   pass
    20000  20016  4      35.233     151.3944  4.097986e-10 3.627616e-02   pass
    20000  20016  4      35.217     151.4647  4.097986e-10 3.627616e-02   pass
    22000  22008  4      46.440     152.8784  4.548092e-10 3.331299e-02   pass
    22000  22008  4      46.429     152.9131  4.548092e-10 3.331299e-02   pass
    25000  25000  4      67.938     153.3455  6.089565e-10 3.462917e-02   pass
    25000  25000  4      67.910     153.4068  6.089565e-10 3.462917e-02   pass
    26000  26000  4      76.098     153.9946  6.669421e-10 3.506981e-02   pass
    26000  26000  4      76.081     154.0296  6.669421e-10 3.506981e-02   pass
    27000  27000  4      84.994     154.4037  6.672171e-10 3.253690e-02   pass
    30000  30000  1      116.300    154.7882  8.421348e-10 3.319704e-02   pass
    35000  35000  1      184.086    155.2852  1.085509e-09 3.151068e-02   pass
    40000  40000  1      277.433    153.8026  1.466774e-09 3.262155e-02   pass
    45000  45000  1      424.129    143.2444  1.711494e-09 3.011194e-02   pass

    Performance Summary (GFlops)

    Size   LDA    Align. Average   Maximal
    1000   1000   4      63.0569   64.6291
    2000   2000   4      73.4844   73.6438
    5000   5008   4      112.4080  112.5025
    10000  10000  4      132.9362  133.0682
    15000  15000  4      148.0125  148.2760
    18000  18008  4      150.4112  150.4133
    20000  20016  4      151.4295  151.4647
    22000  22008  4      152.8958  152.9131
    25000  25000  4      153.3761  153.4068
    26000  26000  4      154.0121  154.0296
    27000  27000  4      154.4037  154.4037
    30000  30000  1      154.7882  154.7882
    35000  35000  1      155.2852  155.2852
    40000  40000  1      153.8026  153.8026
    45000  45000  1      143.2444  143.2444

    Residual checks PASSED

    End of tests

    Done: Sat Feb 22 20:02:53 PST 2014
Intel MKL Linpack Benchmark Core i7-4770 Output
    Intel(R) Optimized LINPACK Benchmark data

    Current date/time: Sat Feb 22 19:26:26 2014

    CPU frequency: 3.897 GHz
    Number of CPUs: 1
    Number of cores: 4
    Number of threads: 8

    Parameters are set to:

    Number of tests: 15
    Number of equations to solve (problem size) : 1000 2000 5000 10000 15000 18000 20000 22000 25000 26000 27000 30000 35000 40000 45000
    Leading dimension of array                  : 1000 2000 5008 10000 15000 18008 20016 22008 25000 26000 27000 30000 35000 40000 45000
    Number of trials to run                     : 4 2 2 2 2 2 2 2 2 2 1 1 1 1 1
    Data alignment value (in Kbytes)            : 4 4 4 4 4 4 4 4 4 4 4 1 1 1 1

    Maximum memory requested that can be used=16200901024, at the size=45000

    =================== Timing linear equation system solver ===================

    Size   LDA    Align. Time(s)    GFlops    Residual     Residual(norm) Check
    1000   1000   4      0.011      60.9602   1.194739e-12 4.074366e-02   pass
    1000   1000   4      0.009      73.9902   1.194739e-12 4.074366e-02   pass
    1000   1000   4      0.009      77.3679   1.194739e-12 4.074366e-02   pass
    1000   1000   4      0.009      77.1606   1.194739e-12 4.074366e-02   pass
    2000   2000   4      0.063      84.1916   4.536926e-12 3.946570e-02   pass
    2000   2000   4      0.063      84.4081   4.536926e-12 3.946570e-02   pass
    5000   5008   4      0.694      120.1819  2.471656e-11 3.446525e-02   pass
    5000   5008   4      0.655      127.3289  2.471656e-11 3.446525e-02   pass
    10000  10000  4      4.021      165.8472  9.436774e-11 3.327502e-02   pass
    10000  10000  4      4.071      163.8038  9.436774e-11 3.327502e-02   pass
    15000  15000  4      12.967     173.5532  2.169435e-10 3.416896e-02   pass
    15000  15000  4      12.963     173.6057  2.169435e-10 3.416896e-02   pass
    18000  18008  4      21.790     178.4627  2.645608e-10 2.897266e-02   pass
    18000  18008  4      21.800     178.3777  2.645608e-10 2.897266e-02   pass
    20000  20016  4      29.777     179.1361  3.504283e-10 3.102058e-02   pass
    20000  20016  4      29.840     178.7565  3.504283e-10 3.102058e-02   pass
    22000  22008  4      39.711     178.7811  4.267059e-10 3.125453e-02   pass
    22000  22008  4      39.694     178.8594  4.267059e-10 3.125453e-02   pass
    25000  25000  4      58.599     177.7833  5.194889e-10 2.954147e-02   pass
    25000  25000  4      58.609     177.7516  5.194889e-10 2.954147e-02   pass
    26000  26000  4      65.728     178.2908  6.593495e-10 3.467057e-02   pass
    26000  26000  4      65.860     177.9332  6.593495e-10 3.467057e-02   pass
    27000  27000  4      72.205     181.7522  6.135402e-10 2.991934e-02   pass
    30000  30000  1      98.935     181.9566  7.133177e-10 2.811906e-02   pass
    35000  35000  1      156.340    182.8432  1.085449e-09 3.150893e-02   pass
    40000  40000  1      234.554    181.9194  1.338275e-09 2.976370e-02   pass
    45000  45000  1      338.077    179.7047  1.782676e-09 3.136431e-02   pass

    Performance Summary (GFlops)

    Size   LDA    Align. Average   Maximal
    1000   1000   4      72.3697   77.3679
    2000   2000   4      84.2999   84.4081
    5000   5008   4      123.7554  127.3289
    10000  10000  4      164.8255  165.8472
    15000  15000  4      173.5795  173.6057
    18000  18008  4      178.4202  178.4627
    20000  20016  4      178.9463  179.1361
    22000  22008  4      178.8202  178.8594
    25000  25000  4      177.7674  177.7833
    26000  26000  4      178.1120  178.2908
    27000  27000  4      181.7522  181.7522
    30000  30000  1      181.9566  181.9566
    35000  35000  1      182.8432  182.8432
    40000  40000  1      181.9194  181.9194
    45000  45000  1      179.7047  179.7047

    Residual checks PASSED

    End of tests

    Done: Sat Feb 22 19:59:03 PST 2014
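If you want to see how the gap changes with problem size, here is a small sketch that compares the "Maximal" column of the two performance summaries above. The numbers are copied by hand from the output; the variable names are just for illustration.

```python
# Haswell vs. Ivy Bridge-E speedup per problem size, using the "Maximal"
# GFlops column from the two Performance Summary tables in this post.

sizes = [1000, 2000, 5000, 10000, 15000, 18000, 20000, 22000,
         25000, 26000, 27000, 30000, 35000, 40000, 45000]

ivy_bridge_e = [64.6291, 73.6438, 112.5025, 133.0682, 148.2760,
                150.4133, 151.4647, 152.9131, 153.4068, 154.0296,
                154.4037, 154.7882, 155.2852, 153.8026, 143.2444]

haswell = [77.3679, 84.4081, 127.3289, 165.8472, 173.6057,
           178.4627, 179.1361, 178.8594, 177.7833, 178.2908,
           181.7522, 181.9566, 182.8432, 181.9194, 179.7047]

# Ratio > 1 means the Haswell run was faster at that size.
speedup = {n: h / i for n, i, h in zip(sizes, ivy_bridge_e, haswell)}

for n, ratio in speedup.items():
    print(f"N={n:>5}: Haswell / Ivy Bridge-E = {ratio:.2f}x")
```

Keep in mind that most sizes were run only once or twice, so small differences between neighboring sizes are likely just noise.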
Happy Computing! –dbk