
https://www.pugetsystems.com

Read this article at https://www.pugetsystems.com/guides/1688

Unsupported: How to Make Dual NVLink Work on Windows 10

Written on March 4, 2020 by William George

Introduction

With the current Turing-based generation of GeForce and Quadro cards, NVIDIA offers a method of physically connecting pairs of cards to enable direct communication between them. This facilitates SLI for gaming and similar applications, and can allow direct memory access between the cards for scientific computing and rendering (if the software supports it). NVIDIA calls these physical connectors "NVLink Bridges", and outside of gaming (where SLI is the more focused-on term) they use the NVLink name to refer to both the technology and the connection itself. I have written a lot about NVLink in the past, including how to enable and test it in Windows, as well as which bridges will work with which cards.

But what if you want to have more than two video cards? In my early testing I included a setup with four cards in two NVLinked pairs - and it worked just fine. Ever since, I had assumed this was the way things were supposed to work, and we even sold such configurations on occasion. Then, recently, an order came through for a setup with four GeForce RTX 2080 Ti cards, and it wasn't working as expected. I got involved because of my past experience with testing this stuff, but the latest NVIDIA drivers just would not cooperate at all. We got it working on an older driver revision, but then found that Windows would immediately update the driver... and while we had some leads on preventing that, it wasn't a setup we could stand behind for a customer to use in the field.

In light of that experience, I've taken time in the last couple of weeks to go through and test several different NVIDIA drivers on two configurations: four GeForce RTX 2080 Ti cards as well as four Quadro RTX 6000s, both set up in two physically bridged pairs. In this article I will chronicle what I found worked, what didn't work or behaved oddly, and where we are at with this as a company as a result.

Test Hardware

Here is the hardware platform I used for testing, built around a rare motherboard with enough PCI-Express slots to actually set this up:

Test Platform
CPU | Intel Xeon W-3245
CPU Cooler | Stock Intel Xeon SP 92mm Cooler
Motherboard | ASUS PRO WS C621-64L SAGE/10G
RAM | 6x DDR4-2666 16GB ECC (96GB total)
Video Card | 4x ASUS GeForce RTX 2080 Ti 11GB; 4x PNY Quadro RTX 6000 24GB; 2x NVIDIA Quadro 2-slot NVLink Bridge
Hard Drive | Samsung 960 Pro 1TB
Software | Windows 10 Pro 64-bit; NVLink Test Utility

Four NVIDIA Quadro RTX 6000 video cards in two NVLink pairs

That's a lot of horsepower!

Test Methodology

My test process was fairly simple:

  • Perform a clean installation of the NVIDIA driver to be examined (using DDU if needed to properly clean up first)
  • After installation, reboot the system
  • Open NVIDIA Control Panel and check what the default SLI configuration was after driver installation
  • If not already in SLI, attempt to switch the configuration to that mode and note what cards it showed would be linked
  • Apply settings
  • See if the links shown in NVIDIA Control Panel matched what was shown before applying
  • Run our NVLink test utility and record the result
  • Disable SLI and proceed on to the next driver

The expected behavior, for a driver which should allow the system to use both pairs of cards in SLI / NVLink, is for NVIDIA Control Panel to show the four cards being in two SLI pairs, as in the screenshot below:

Two pairs of GeForce RTX 2080 Ti cards in SLI / NVLink in Windows 10 Pro


Something else worth noting on that screenshot is that one card in each pair has a monitor connected to it. That is required in Windows for a pair of cards to be put into SLI, so a setup like this requires either two monitors, two separate connections to a single monitor, or one real monitor plus a dummy dongle that fakes the presence of another monitor.
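The expected layout and the monitor requirement described above can be expressed as a small check. This is only a sketch: the function name, the GPU index numbering, and the idea of typing in what NVIDIA Control Panel shows by hand are all assumptions made for illustration, not part of any NVIDIA tool or of our test utility.

```python
def check_dual_pairs(sli_groups, bridged_pairs, cards_with_display):
    """Compare the SLI groups shown in the control panel to the physical bridges.

    All three arguments are collections of GPU indices entered by hand
    (hypothetical numbering, e.g. 0-3 for a four-card system).
    Returns a list of problems; an empty list means the expected layout.
    """
    problems = []
    reported = {frozenset(group) for group in sli_groups}
    expected = {frozenset(pair) for pair in bridged_pairs}

    # Every bridged pair should appear as its own SLI group.
    for pair in sorted(expected - reported, key=sorted):
        problems.append(f"bridged pair {sorted(pair)} is not shown as an SLI group")

    # Any other grouping (a triplet, an unbridged pair) is unexpected.
    for group in sorted(reported - expected, key=sorted):
        problems.append(f"SLI group {sorted(group)} does not match a bridged pair")

    # Windows needs a display on one card of each pair.
    displays = set(cards_with_display)
    for pair in sorted(expected, key=sorted):
        if not pair & displays:
            problems.append(f"pair {sorted(pair)} has no display attached")

    return problems
```

For example, with bridges on cards 0+1 and 2+3 and a display on cards 0 and 2, `check_dual_pairs([(0, 1), (2, 3)], [(0, 1), (2, 3)], [0, 2])` returns an empty list, while a setup with only one pair enabled and only one display reports both the missing pair and the missing display.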

Test Results - GeForce

Here is a table showing the driver versions I tested and their behavior. To save space, I have grouped together sequences of drivers that functioned the same:

Driver Versions | Dual NVLink? | Description
411.70, 417.01 | Non-functional | These older drivers install just fine, and look like they will allow both pairs of cards to enter into SLI / NVLink, but when you actually click "Apply" they give warnings about programs running in the background. After those are closed, and you click "Continue", the process appears to work for a moment but then reverts back to being disabled.
419.35, 419.67, 430.86 | Functional | These are the last GameReady driver before the Creator / Studio drivers started to come out, the lone "Creator" driver, and the first "Studio" driver (respectively) and they all work as expected. In fact, SLI is enabled by default when they are installed, and NVLink tests show it is working immediately.
431.70, 431.86, 441.12 | Non-functional | These Studio drivers default to one pair of cards being in SLI (and functioning in NVLink) upon installation, and no amount of adjusting settings was able to improve that. I could never get both pairs to even look like they were going to be enabled, though sometimes it would switch which of the two pairs of cards was in SLI.
441.66, 442.19 | Non-functional | These were the most recent Studio drivers when I was testing, and in both cases I had to physically remove the NVLink bridges in order to get the drivers to even install properly. If that wasn't done, upon rebooting after driver installation, one or two of the four cards would show up with errors in Device Manager and the NVIDIA control panel would not run. Even using DDU did not help. Only removing the bridges, then installing drivers and rebooting, then finally shutting down and putting the bridges back on would allow the drivers to function properly. Even once all that was done, only one pair of cards would go into SLI / NVLink at a time.

I was disappointed to find that for several driver revisions this feature has not worked, though it did when these cards were first launched (I have records of the older drivers working, even though they now have trouble with background processes in Windows) and even up through the first Studio driver release it was perfectly functional. One of my coworkers reached out to NVIDIA with this information, to see what they had to say, and we got word back that this feature is not supported on GeForce cards, and the fact that it worked in the past was unintentional. After all the testing this did not surprise me, though again it did disappoint, and it can somewhat be inferred from the fact that no 2-slot NVLink bridges exist that are GeForce branded. To do this testing at all, we had to use Quadro branded bridges - as listed in the Test Hardware section, and as shown in the image below.

Four Asus GeForce RTX 2080 Ti blower-style video cards in two NVLink pairs

One of the cards had a defective LED, but worked fine otherwise

Test Results - Quadro

This brings up a great question, though: does this feature work on Quadro cards? After all, there are 2-slot bridges that NVIDIA sells for them, thus enabling (physically, at least) such a setup without having to go outside of their official branding. To find out, I did similar testing with a handful of Quadro driver releases - sticking mostly to the latest version of each driver family (the first three digits of the driver version). The results are shown in the same style of table:

Driver Version | Dual NVLink? | Description
412.40 | Functional | Dual NVLink could be enabled, and worked, but there was a warning when enabling it (as we saw with the earliest GeForce drivers). Unlike those GeForce drivers, though, this one was able to get the Quadro cards into SLI / NVLink and the performance was as expected.
426.32 | Functional | Dual NVLink worked perfectly! It didn't default to being in SLI / NVLink immediately after driver installation, as some of the GeForce drivers did, but it showed the correct card pairings when switching to SLI mode and then gave the correct bandwidth when tested.
431.98 | Functional | When attempting to enable SLI in this driver, it defaulted to showing all four cards connected in "4-way SLI" - which is the only time I saw such behavior with any of the GeForce or Quadro RTX series cards / drivers. After applying that setting, and running our NVLink test, the script didn't properly identify which pairs of cards were in NVLink together - but the actual bandwidth results did show both pairs of cards having the proper communication speeds for NVLink. I think the script got confused because even the cards that were not physically bridged were still shown as having P2P access to each other, just with lower bandwidth (they must have been communicating over PCI-E). One advantage of this configuration, however, was that it did not require two monitor connections in order for both pairs of SLI to be enabled: because it was all one big, happy SLI family, a single monitor connection was sufficient to allow the full setup! This was definitely the weirdest result out of all my testing, but technically it worked.
442.50 | Non-functional | This is the latest driver at this time, released at the end of February 2020. Unfortunately, dual NVLink did not work with this driver. Only one pair of cards would go into SLI, and often not even a pair that was connected via an NVLink bridge! I tried repeatedly, and sometimes three cards would go into a single SLI triplet... but never two pairs or all four. This odd behavior confused our test program, but measured bandwidth was very low - showing that NVLink was not working.
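The 431.98 result suggests that judging NVLink by measured bandwidth is more reliable than judging it by whether P2P access is reported, since unbridged cards also reported P2P access in that test. Here is a rough sketch of that idea. It is not our NVLink test utility: the function names are made up, the cutoff is an assumption you would need to pick for your own hardware, and the bandwidth figures in the sample are placeholders rather than measurements from this testing.

```python
def classify_links(bandwidth_gbps, nvlink_threshold_gbps):
    """Split GPU pairs into NVLink-speed links and slower links.

    bandwidth_gbps maps (gpu_a, gpu_b) tuples to a measured P2P bandwidth.
    nvlink_threshold_gbps is an assumed cutoff, not an NVIDIA-defined value.
    """
    fast, slow = set(), set()
    for (gpu_a, gpu_b), gbps in bandwidth_gbps.items():
        pair = frozenset((gpu_a, gpu_b))
        if gbps >= nvlink_threshold_gbps:
            fast.add(pair)
        else:
            slow.add(pair)
    return fast, slow


def dual_nvlink_ok(bandwidth_gbps, bridged_pairs, nvlink_threshold_gbps):
    """True only if exactly the physically bridged pairs run at NVLink speed."""
    fast, _ = classify_links(bandwidth_gbps, nvlink_threshold_gbps)
    return fast == {frozenset(pair) for pair in bridged_pairs}


# Placeholder numbers for illustration only (NOT measured results).
# Every pair reports P2P access, but only the bridged pairs are fast.
SAMPLE_FOUR_WAY = {
    (0, 1): 45.0, (2, 3): 45.0,
    (0, 2): 10.0, (0, 3): 10.0, (1, 2): 10.0, (1, 3): 10.0,
}
BRIDGED = [(0, 1), (2, 3)]
```

With these placeholder values and an assumed cutoff of 25 GB/s, `dual_nvlink_ok(SAMPLE_FOUR_WAY, BRIDGED, 25.0)` is true even though all six pairs have P2P access, which mirrors the "4-way SLI" case. If every pair came back slow, as with 442.50, the same check would fail.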

Until that last driver test, I had high hopes that this type of configuration would be properly supported by NVIDIA on Windows... but it looks like that is not the case. We haven't yet gotten official word from NVIDIA about whether this is supposed to work or not, but I am guessing not given that the latest drivers don't behave (plus the odd results from the version before that). So where does that leave us?

Conclusion

Can you make two pairs of NVLink cards work in Windows 10? Yes, by using older drivers... but Windows may update them at any time, potentially messing up the configuration. You also wouldn't be able to utilize improved performance or added features from newer drivers, and there is no guarantee that future versions of Windows will work with the older drivers. All in all, not a great solution.
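For anyone who wants the results above in a form they can script against, here is a small lookup sketch. The driver versions and outcomes are copied from the two results tables in this article; the names `TESTED`, `dual_nvlink_result`, and `working_drivers` are made up for this example, and the data only reflects the specific cards and motherboard I tested.

```python
# Dual NVLink results from this article's testing (True = functional).
# Applies only to the tested hardware: 4x GeForce RTX 2080 Ti and
# 4x Quadro RTX 6000, each in two bridged pairs, on Windows 10 Pro.
TESTED = {
    "GeForce": {
        "411.70": False, "417.01": False,
        "419.35": True, "419.67": True, "430.86": True,
        "431.70": False, "431.86": False, "441.12": False,
        "441.66": False, "442.19": False,
    },
    "Quadro": {
        "412.40": True, "426.32": True, "431.98": True,
        "442.50": False,
    },
}


def dual_nvlink_result(family, version):
    """Return True/False for a tested driver, or None if it was not tested."""
    return TESTED.get(family, {}).get(version)


def working_drivers(family):
    """List the tested driver versions where dual NVLink was functional."""
    return sorted(v for v, ok in TESTED.get(family, {}).items() if ok)
```

A result of `None` means I did not test that driver, not that it fails, so treat anything outside the tables as unknown.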

Does Puget Systems Offer Dual Pairs of NVLinked Cards in Windows?

No. Because the latest drivers do not behave with this sort of configuration - and NVIDIA has said they don't support it - we can no longer offer dual pairs of NVLinked graphics cards in Windows. One pair is doable, with the proper motherboard, chassis, and power supply... and we do offer configurations with three or four video cards, just not in NVLink.

I should also note that both GeForce and Quadro cards have worked in NVLink just fine under Linux, and don't even need any special work done to enable that (like putting the cards in SLI, which is required in Windows). This could always change if NVIDIA alters the behavior of their Linux driver, but for now that is the way to go if you absolutely must have this setup.

Tags: NVLink, NVIDIA, GeForce, Quadro, SLI, Windows 10, Video Card, GPU
Luca Pupulin

Hi William,
great and interesting article as usual...
just two questions...

Why use four 2080 Ti cards in NVLink mode instead of two Titans (also in NVLink)?

I didn't quite understand whether you enabled SLI with the Quadro cards; it shouldn't be necessary, am I correct?

Cheers,
Luca

Posted on 2020-03-05 17:35:01

Four 2080 Ti cards together have more horsepower than two Titans, so there are many cases (like GPU based rendering) where you might want to use them and also have access to NVLink. Moreover, the Titan RTX cards are only available with dual fan configurations, which means they are not ideal for more than two cards in a system (so I did not bother testing them in quad configurations, like I did with the 2080 Ti and RTX 6000).

I am trying to remember if I ever tried running my NVLink test utility on the Quadro cards with them *not* in SLI... and I cannot remember now. But as far as I can recall, from all of my prior work with NVLink in Windows, SLI was required on both Quadro and GeForce to enable that feature. The only exception was back with the older Quadro GP100 and GV100, which required a different trick to enable NVLink (and a third video card as well, to handle display output):

https://www.pugetsystems.co...

Posted on 2020-03-13 18:03:37
Turing

https://www.nvidia.com/en-u...
https://nvidia.custhelp.com...

Do not use older drivers; they have known security vulnerabilities that are fixed in the latest drivers.

Posted on 2020-03-06 08:11:34

Yeah, normally it is desirable to use the latest drivers for a number of reasons - security not the least of those! - but this is a weird case where a feature that used to work no longer does, so some people may have to balance security against their need for that feature. Thankfully, the two driver vulnerabilities listed on that link you sent both require an attacker to have local system access... and if you've got someone running software on your computer already (either locally or remote controlled) you are already in a heap of trouble :)

Posted on 2020-03-13 17:59:55
MahaVakyas

Interesting article to say the least. Wonder why Nvidia just killed 3 and 4 way SLI? Can the drivers be "edited" to "force" 4 Way SLI/NVLink to work? There's a guy on Guru3D Forums who has 4 Titan Voltas running in SLI and he is not using any bridge (software SLI) - I think he says he used the Nvidia auto diff utility and edited the drivers but no idea if it works properly.

Posted on 2020-04-05 02:32:01
Vedran Klemen

Which Quadro dual-slot NVLink bridge would work for two 2070 Super cards? Thanks.

Posted on 2020-05-04 15:51:27

I haven't specifically tried it, but it looks like the 2070 Super uses the standard size NVLink connector - so the Quadro RTX 6000 / 8000 bridges should fit it, but not the RTX 5000 bridges (which are an odd, smaller size).

Posted on 2020-05-04 20:52:20