Windows 10 with Xeon Phi

Can you use an Intel Xeon Phi with Windows 10? Yes, you can. However, just because you can do something doesn’t mean that you should do it!

You can take that last phrase as a complete and total disclaimer. Using the Xeon Phi with Windows 10 is not supported by anyone, including Microsoft, Intel, Puget Systems, and myself. ’Nuf said …

Given the above discouragement, why, then, would I try to do this? Answer: Because it’s there. I did manage to get everything working without much trouble, including a Windows native application that was offloading to the Phi (Linpack in “automatic offload mode”).

Here’s the screenshot to prove it worked.

 

This really started when Intel launched their “deal” on a passively cooled Xeon Phi 5110p bundled with a 1-year trial license for their (wonderful) compiler tool suite, Parallel Studio XE 2016 Cluster Edition. Intel was basically saying: if you buy one of our old Phi cards that we have lying around at a 90% discount, we will throw in a year of our full compiler suite so you can try to use it. That’s basically $6000 worth of good stuff for a couple hundred dollars. AND, this was not a Linux-only deal! They would let you get the Windows version of Parallel Studio. That’s significant because there is no way that I know of to get the Intel dev tools free for Windows. (Linux programmers working on open-source projects can license the Intel compiler tools for free.) This all happened right about the time Windows 10 was released. Someone asked, “Can you use the Phis on Win 10?” … “I don’t know, maybe I’ll try it.” A couple of months went by and then I had a free afternoon, so … why not try it?

Reasons to try this:

  1. The Intel deal on the Phi and compilers is hard to resist, especially for Windows users.
  2. The Xeon Phi is just a really interesting device (This is “old” “Knights Corner” stuff).
  3. Having use of the Intel compilers for free makes it worth the effort.
  4. Microsoft has a free “community” edition of Visual Studio now. (you need that installed before you can really use the Intel compilers).
  5. All of the software versions are compatible right now except the Intel drivers (MPSS) for the Phi on Win 10 (but the Win 8.1 drivers work).
  6. If you actually try to write some code or port a project to Phi and do the work needed to optimize for it you will have learned a LOT about modern parallel code development. This is really part of Intel’s “Code Modernization” project.
  7. Plain old curiosity … or insanity (the Phi is an embedded Linux device and it’s much easier to work with under Linux).

Reasons not to try this:

  1. The Knights Corner Phi is soon to be replaced by the new Knights Landing devices, which will include an add-in PCIe version and, more interestingly, a standalone processor that doesn’t require a host system to run. It will be significantly different from the current Phi.
  2. You have to have both Linux and Windows system administration skills to do the setup and understand the Phi card.
  3. The Xeon Phi 5110p is a passively cooled card! That means you have to have it in a server chassis designed to handle passive cards, or you have to create a cooling system using high-pressure fans and mounts that will force enough air through the card to keep it from overheating!
  4. Look at my old blog posts about this stuff. In particular: “Top 5 Xeon Phi Misconceptions” and “Intel Xeon Phi with Windows”!

I am not going to give you a detailed how-to. However, this was actually pretty easy, so if you think you want to try this, just go for it. The following steps are what I did.

Important! Your motherboard has to have “Large BAR support”, a.k.a. “Above 4G Decoding”, in order to use the Xeon Phi! I’m making the rather large assumption that if you are thinking about trying this, you have a Phi card (properly cooled) and a system that will work with it. I was using an ASUS X99-E WS motherboard with “Above 4G Decoding” enabled.

Windows 10

I started from a fresh install of Windows 10. Don’t try doing the Phi setup on a system with important information and applications on it. You could accidentally clobber something you care about! Tip: After you do your Windows 10 install, be sure to go through the (appalling) privacy settings and adjust things to your taste. My advice is “just say NO!”. Add whatever you like to your Windows install (I always add Firefox before I do anything else). You will probably want to install a decent Windows native file editor; something like Notepad++ is pretty handy.

Cygwin

In order to access the Phi you will need to have an ssh client (which Windows still doesn’t have natively after all these years!). You can install PuTTY for this, but I highly recommend that you just install Cygwin so you have a full Linux-like environment available (including an X server), and you can use the Cygwin BASH shell or install one of the nicer terminal applications. I installed a full MATE environment! It amazes me that you can do this. Cygwin is really a great project. Get the most recent Cygwin setup-x86_64.exe from cygwin.com and run it. By default you will get a reasonable POSIX-compliant environment, but there are lots of things you can add. If you don’t know what everything is then be conservative; you can always run the setup program again and add more stuff. I like to add at least emacs, openssh, ping, rsync, and bash-completion. You will need to create ssh keys to access the Phi from your user account. Run ssh-keygen from the Cygwin terminal to generate the key pair. The MPSS install found my key pair and set up access to the Phi automatically. Create these keys before you install MPSS.
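As a minimal sketch of the key step, the following creates a key pair from the Cygwin shell only if one is not already there. The RSA key type, the 4096-bit size, the empty passphrase, and the default ~/.ssh/id_rsa location are my assumptions for illustration; I have not confirmed which key files the MPSS installer looks for, so adjust to taste.

```shell
# Sketch: create an ssh key pair before installing MPSS (run in the Cygwin shell).
# Assumptions: RSA key, default file name id_rsa, empty passphrase.
KEY_DIR="${KEY_DIR:-$HOME/.ssh}"
KEY_FILE="$KEY_DIR/id_rsa"

mkdir -p "$KEY_DIR"
chmod 700 "$KEY_DIR"

if [ -f "$KEY_FILE" ]; then
    echo "Key already exists: $KEY_FILE"
else
    # -N "" sets an empty passphrase, -f sets the output file
    ssh-keygen -q -t rsa -b 4096 -N "" -f "$KEY_FILE"
    echo "Created $KEY_FILE and $KEY_FILE.pub"
fi
```

Because of the existence check, running this a second time leaves an existing key untouched.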

Intel MPSS

After your OS install and Cygwin setup you can get the latest release of Intel MPSS. This is the driver and tools install for the Xeon Phi. Windows drivers are available and are up-to-date. When I did this there was a fresh release of MPSS for Windows, 3.6 (https://software.intel.com/en-us/articles/intel-manycore-platform-software-stack-mpss). The listed support is for Windows 7 – 8.1 and Server 2008 and 2012. No Windows 10 support was listed at the time of this writing. However, the package installed and everything seems to be working fine on Windows 10. Just follow the instructions during the install and be sure to read the install readme document and the user’s guide.

You will need to do one un-Windows-like thing during the setup: create a new group for people using the Phi. Start mmc from the Win search box and select “Local Users and Groups”, click the Groups folder, and select Action -> New Group. Create MICUSERS and then add your user account and Administrator to that group.

Note: You can run the Phi (a.k.a. mic) control application, micctrl, from a DOS prompt or a Cygwin shell.

Note: The MPSS install creates a virtual network to the Phi over the PCIe bus. Unfortunately, Intel used a very common subnet for this, 192.168.1.0, which conflicted with the LAN subnet I was on while I was working on this, so I did have to change that network during the setup configuration.
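If you want to check for this clash before you start, here is a rough sketch. The helper name subnet24_conflict is made up for illustration, and it assumes a 255.255.255.0 netmask on both networks, which may not match your LAN.

```shell
# Sketch: does a LAN address share a /24 with the MPSS default network?
# subnet24_conflict is a hypothetical helper; a 255.255.255.0 mask is assumed.
subnet24_conflict() {
    local lan_ip="$1" mic_net="${2:-192.168.1.0}"
    # Compare the first three octets by stripping the last one from each address.
    if [ "${lan_ip%.*}" = "${mic_net%.*}" ]; then
        echo "conflict"
    else
        echo "ok"
    fi
}

subnet24_conflict 192.168.1.37   # a host on a 192.168.1.x LAN
subnet24_conflict 10.0.0.12      # a host on a different network
```

The first call reports a conflict and the second does not. Pass a second argument to test against whatever network you end up choosing for the Phi.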

After the MPSS install your Win 10 system should properly recognize the Phi. You should be able to boot the Phi (mic) and start micsmc to monitor usage and temperature, and running micinfo should show you all the details about your card. You should also now be able to log in to the Phi from the Cygwin shell using your ssh keys. If you can do all of this then you have Windows 10 working with a Xeon Phi.
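A quick way to see whether the tools mentioned here are visible from the Cygwin shell is sketched below. It assumes the MPSS installer put its programs on your PATH, which I have not verified for every install option; tool_status is a made-up helper name.

```shell
# Sketch: check that the MPSS tools named above are reachable from this shell.
# Assumption: the MPSS installer added its bin directory to PATH.
tool_status() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
    fi
}

for tool in micctrl micinfo micsmc ssh; do
    tool_status "$tool"
done

# With the card booted, these should then work (not run here):
#   micinfo             # card details
#   ssh mic0 uname -a   # log in over the virtual network using your key
```

If a tool shows up as missing, check your PATH before assuming the install failed.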

Microsoft Visual Studio

Now, if you want to do anything interesting with the Phi you need to have the Intel compilers installed. AND, the Intel compiler install assumes you have Microsoft Visual Studio installed. Fortunately, Visual Studio 2015 and Intel Parallel Studio 2016 work together. Even better, Microsoft now has a “Community” version of Visual Studio for free. (That really is significant in my opinion. It’s about time! Now they just need to get their own ssh server/client and it will seem like a real OS 🙂 [that’s a blatant jab for sure but, really, I like Microsoft and I think it’s wonderful the way the company is moving these days].)

Get (at least) the Community version of Visual Studio 2015 and install it. Do a “custom” install and be sure you install C/C++ stuff (and anything else you want).

Intel Parallel Studio XE 2016

With the MPSS installed, the Phi recognized and working properly, and Visual Studio installed, you can now install the Intel compilers and start working with the Phi. The Intel Parallel Studio install will find the Phi (mic) and configure itself to work with it (including rebooting the Phi during the setup). To start with you can grab the most recent Windows version of Parallel Studio and install it with a trial license. The installer is simple to use and will do the right thing for integrating with Visual Studio. It is worth your time to look at some of the tools Intel bundles; they are really nice! If you are working on parallel optimization of your code these tools can be a big help.

Running some test jobs on the Phi

OK, I am mostly going to leave this as an exercise for the reader 🙂 With the above setup done you can start playing with the Phi. The first thing I did was copy the Intel MKL benchmark directory to my Cygwin home directory and my Windows home directory. That benchmark directory contains binary builds of the linpack benchmark.

	C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2016.0.110\windows\mkl\benchmark

The first test was to run linpack natively on the Phi. I did this from the Cygwin bash shell. You have to copy all the *mic files and the mic OpenMP library to the card. From the benchmark/linpack directory,

	scp *mic mic0:~/
	scp "/cygdrive/c/Program Files (x86)/Common Files/Intel/Shared Libraries/compiler/lib/intel64_win_mic/libiomp5.so" mic0:~/

Then log in to the Phi and run linpack,

	ssh mic0
	export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:.
	./runme_mic

That gave the expected performance for the card of around 710 GFLOP/s.

Note that all you have done at this point is run the Linux linpack binary on the Phi. The real test comes next.

The simple test that is most interesting is running the linpack benchmark natively on Windows with “automatic offload” to the Phi. This means that you are running a Windows application that is using the Phi. To do that, the program has to have instructions telling the executable that a code segment needs to be compiled for the Phi and run on it during program execution. That’s basically what a Windows application using the Phi needs to do. Hopefully that makes sense!
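For MKL-based programs like this benchmark, automatic offload is, as far as I know, switched on through environment variables, with the MKL library deciding at run time what to send to the Phi. The sketch below is for the Cygwin shell. MKL_MIC_ENABLE and OFFLOAD_REPORT are the variable names as I understand them from the Intel MKL documentation, and my_mkl_app.exe is a hypothetical placeholder. I have not checked what the bundled batch file sets, so treat this as an illustration only.

```shell
# Sketch: enable MKL automatic offload for one run (Cygwin bash).
export MKL_MIC_ENABLE=1   # ask MKL to offload supported routines to the Phi
export OFFLOAD_REPORT=2   # print a report of what was offloaded

echo "MKL_MIC_ENABLE=$MKL_MIC_ENABLE OFFLOAD_REPORT=$OFFLOAD_REPORT"

# Hypothetical MKL-linked Windows program; replace with your own binary.
# ./my_mkl_app.exe
```

From a DOS prompt the equivalent would be `set MKL_MIC_ENABLE=1` before starting the program.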

To do this test, open a DOS shell (or Cygwin bash), cd into the benchmark\linpack directory, and run the .bat file runme_xeon64_ao.bat. That will create a file called win_xeon64_ao.txt that will have the output from the Windows linpack executable run with some of the work offloaded to the Phi. I got 816 GFLOP/s from that run! Not bad.

One serious problem occurred during this offload test. One of the job runs “blew up”, see below,

	Size   LDA    Align. Time(s)    GFlops   Residual      Residual(norm)  Check
	
	40960  41022  4      63.663     719.6706 9.964507e-010 2.115631e-002   pass
	43008  43072  4      236.986    223.8018 1.040247e+006 2.001728e+013   FAIL
	45056  45120  4      74.771     815.5667 1.398059e-009 2.454023e-002   pass

FAIL! That residual error is really bad! The calculation failed badly on one of the problem sizes during the job run. Everything else was fine. I don’t know what caused this failure.
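Since a bad run like this is easy to miss in a long output file, here is a small sketch that scans Linpack result rows, recomputes GFlops from the problem size and time using the usual Linpack operation count (2/3·n³ + 2·n²), and exits with a non-zero status if any row says FAIL. The function name check_linpack is made up, and the column layout is assumed to match the rows shown above.

```shell
# Sketch: summarize Linpack result rows and flag any FAIL (bash + awk).
# Assumed columns: Size LDA Align. Time(s) GFlops Residual Residual(norm) Check
check_linpack() {
    awk '
        { sub(/\r$/, "") }   # tolerate Windows line endings
        # Result rows start with a numeric size and end with pass or FAIL.
        $1 ~ /^[0-9]+$/ && ($NF == "pass" || $NF == "FAIL") {
            n = $1; t = $4
            # Usual Linpack operation count: 2/3*n^3 + 2*n^2
            gflops = (2.0 * n * n * n / 3.0 + 2.0 * n * n) / t / 1e9
            printf "%d reported=%.1f recomputed=%.1f %s\n", n, $5, gflops, $NF
            if ($NF == "FAIL") bad++
        }
        END { exit (bad > 0) }   # non-zero exit status if any row failed
    '
}

# The three rows from the run above; a real run would use
#   check_linpack < win_xeon64_ao.txt
check_linpack <<'EOF' || echo "at least one problem size FAILED"
40960  41022  4      63.663     719.6706 9.964507e-010 2.115631e-002   pass
43008  43072  4      236.986    223.8018 1.040247e+006 2.001728e+013   FAIL
45056  45120  4      74.771     815.5667 1.398059e-009 2.454023e-002   pass
EOF
```

For these three rows the recomputed numbers agree with the reported ones to one decimal place, so the slow 43008 row really did take that long; the check only tells you that it failed, not why.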

So there you have it. Yes, the Xeon Phi does work with Windows 10. Should you do that? Well, I recommend sticking with Linux for this kind of thing. At least for now! Who knows what the new Knights Landing Xeon Phi will bring!

Happy computing –dbk