Intel vs NVIDIA, IBM, Mellanox, AMD and everybody!

The next 18 months are going to see more shakeup and factioning in the computing world than we have seen in over a decade. Intel is pulling more and more of the compute architecture onto a single piece of silicon and tightly integrating the whole hardware stack. That's good and bad. It may let them achieve better performance. However, this is going to leave users with a choice of “all Intel” or something else entirely. And, the “something else” is starting to seriously take shape.

Intel dominates the CPU market. The last 20 years of x86 competition have left AMD a shell of its former self, and the RISC-based architectures, along with Intel’s own Itanium, are mostly gone. AMD’s introduction of the 64-bit Opteron (amd64, a.k.a. x86_64, and, ahh, now intel64 :-) and many-core (Magny-Cours) processors forced Intel to get aggressive with processor design (and business tactics) to play catch-up. They caught up! Now the “CPU” belongs to Intel (well, at least the x86 architecture does). In my opinion, the upcoming quad-socket Xeon E5 v3 processors will put the final nail in the coffin for AMD as far as serious computing goes.

[ The Intel Quad Xeon E5 v3 should be showing up in a couple of months and will make for wonderful many-core, big-memory SMP compute platforms. I already really like the Quad v2 Xeon and would recommend it over a small cluster for a lot of workloads. ]

OK, that was the old rivalry. Now it’s a whole new ballgame!

Why is it “everybody vs Intel”? Well, Intel has very strong CPU offerings, from tiny to formidable. They have the Xeon Phi co-processor to compete with GPUs and FPGAs. And, since their acquisition of QLogic’s InfiniBand assets a couple of years ago, combined with their own in-house work, they now have strong InfiniBand network fabric technology, “Intel True Scale Fabric”. I’m sure Mellanox is not too happy about that, since they are the largest provider of high performance network fabric! Of course, Intel makes motherboards and such too. So what? They are integrating all of this into a tightly coupled, Intel-only platform! The Knights Landing Xeon Phi coming out later this year will be the first offering showing some of this integration, i.e. “Omni Scale Fabric”. It won’t be the last; expect to see more stuff rolled into the “CPU” too.

Everybody else

The biggest rival to Intel is surely NVIDIA. For years Intel has wanted a discrete graphics processor and NVIDIA has wanted a CPU, but business tactics and lawyers and such have prevented this from happening. Intel finally kind of gave up on a discrete GPU and put that effort into the Xeon Phi co-processor. It’s not a graphics processor, but it does offer competition to NVIDIA’s strong GPU computing offerings. NVIDIA got its own CPU in the form of their Tegra ARM design. They were looking into beefing up ARM for compute integrated with their GPU compute accelerators. However, they really want a lot more than that, and they are going to get it!

Welcome IBM and OpenPOWER!

I have a warm spot in my heart for the IBM POWER architecture. Some of the first serious computing I did was on an IBM RS/6000, POWER1. POWER is IBM’s RISC-based processor architecture and has a long history of stellar performance. Four of the top 10 systems on the Nov. 2014 Top500 supercomputer list are built on IBM POWER architecture. [ 2 are on Intel Xeon + NVIDIA Tesla, 2 on Intel Xeon + Xeon Phi, 1 on SPARC64 and 1 on AMD Opteron + NVIDIA Tesla ]

IBM has opened up the POWER architecture design for licensing, similar to how ARM is done. The OpenPOWER Foundation is providing a rallying point to foster an ecosystem around POWER.

What does that mean? It means that companies other than IBM are going to be able to manufacture and sell POWER-based systems and make their own architectural design decisions. What it “really” means is that Intel is going to have some serious competition again! I believe the community at large will do a much better job of proliferating POWER than IBM ever did.

NVIDIA and IBM already have a 325 million dollar contract in hand to deliver two energy-efficient, 150 petaFLOP-class systems to the US DOE. These will be POWER9 plus NVIDIA Tesla, integrated at the board level using NVIDIA’s NVLink, with Mellanox network fabric. They should be the fastest systems in the world. And it’s great motivation to move the whole OpenPOWER ecosystem forward. The Portland Group (PGI) is putting a lot of effort into the compiler side. The ecosystem is coming together. Another announcement I consider significant is that Rackspace has joined OpenPOWER and is planning on building their own servers to run OpenStack (cloud stuff). They will be leveraging the Open Compute open hardware design for base systems and using the POWER architecture.

There are significant players in OpenPOWER. The ones I’m most interested in as having a direct impact on future HPC are IBM, NVIDIA/PGI, Mellanox, Ubuntu, Google, Tyan, Rackspace, Bull and DataDirect Networks. There are several FPGA companies in OpenPOWER too, and that could mean very specialized, problem-specific hardware.

It is all VERY interesting! What does this mean right now? Not much. However, by the end of 2015 we will be looking at perhaps some very uncomfortable but exciting changes in the computing landscape.

Happy computing! –dbk