Read this article at https://www.pugetsystems.com/guides/1253

NVLink on NVIDIA GeForce RTX 2080 & 2080 Ti in Windows 10

Written on October 5, 2018 by William George


When NVIDIA announced the GeForce RTX product line in August 2018, one of the things they pointed out was that the old SLI connector used for linking multiple video cards had been dropped. Instead, RTX 2080 and 2080 Ti cards would use the NVLink connector found on the high-end Quadro GP100 and GV100 cards. This caused much excitement, since one of the features of NVLink on Quadros is the ability to combine the video memory on both cards and share it between them. This is extremely helpful in applications that can be memory-limited, like GPU-based rendering, and having it available on GeForce cards seemed like a great boon. Afterward, though, NVIDIA spoke of it only in terms like "SLI over NVLink" - leading many to surmise that the GeForce RTX cards would not support the full NVLink feature set, and thus might not be able to pool memory at all. To clear this up, we decided to investigate...

What is NVLink?

At its core, NVLink is a high-speed interconnect designed to allow multiple video cards (GPUs) to communicate directly with each other - rather than having to send data over the slower PCI-Express bus. It debuted on the Quadro GP100 and has been featured on a few other professional NVIDIA cards like the Quadro GV100 and Tesla V100.

What Can NVLink on Quadro Cards Do?

As originally implemented on the Quadro GP100, NVLink allows bi-directional communication between two identical video cards - including access to the other card's memory buffer. With proper software support, this allows GPUs in such configurations to tackle larger projects than they could alone, or even in groups without NVLink capabilities. It required specific driver setup, though.

What Are the Requirements to Use NVLink on Quadros?

Special setup is necessary to use NVLink on Quadro GP100 and GV100 cards. Two NVLink bridges are required to connect them, and a third video card is needed to handle actual display output. Linked GPUs are then put in TCC mode, which turns off their outputs (hence the third card). Application-level support is also needed to enable memory pooling.

TCC Mode Being Enabled on Quadro GP100 Video Cards

This is how TCC is enabled on Quadro GP100s via the command line in Windows 10.
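For reference, TCC is toggled with the `nvidia-smi` driver-model switch, run from an elevated command prompt (a reboot is needed before it takes effect). The sketch below builds those commands in Python and only executes them if `nvidia-smi` is actually present on the system; the GPU indices shown are illustrative:

```python
import shutil
import subprocess

def tcc_commands(gpu_indices):
    """Build the nvidia-smi invocations that switch each GPU to the
    TCC driver model (Windows only; requires an elevated prompt)."""
    return [["nvidia-smi", "-i", str(i), "-dm", "TCC"] for i in gpu_indices]

# Indices 0 and 1 are illustrative - check `nvidia-smi -L` for your system
for cmd in tcc_commands([0, 1]):
    print(" ".join(cmd))
    if shutil.which("nvidia-smi"):  # only run where the tool actually exists
        subprocess.run(cmd, check=False)
```

Switching back is the same command with `-dm WDDM`.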

Do GeForce RTX 2080 and 2080 Ti Video Cards Have NVLink Connectors?

Technically, yes: there is a single NVLink connector on both the RTX 2080 and 2080 Ti cards (compared to two on the Quadro GP100 and GV100). If you look closely, though, you will see that the connectors on the RTX cards face the opposite direction of those on the Quadro cards. Check out the pictures below:

NVIDIA GeForce RTX 2080 and Quadro GP100 Side by Side

NVIDIA GeForce RTX 2080 and Quadro GP100 NVLink Connector Comparison

Are the GeForce RTX and Quadro GP100 / GV100 NVLink Bridges the Same?

No, there are several differences between the NVLink bridges sold for the GeForce RTX cards and older ones built for Quadro GP100 and GV100 GPUs. For example, they differ in both appearance and size - with the Quadro bridges designed to connect adjacent cards while the GeForce RTX bridges require leaving a slot or two between connected video cards.

NVIDIA Quadro NVLink Bridge vs GeForce RTX NVLink Bridge (View From Top)

NVIDIA Quadro NVLink Bridge vs GeForce RTX NVLink Bridge (View From Bottom)

Are GeForce RTX and Quadro GP100 NVLink Bridges Interchangeable?

In our testing, the GP100 bridges physically fit but would not work on GeForce RTX 2080s. The GeForce bridge did work on a pair of Quadro GP100 cards, with some caveats. Due to its larger size, only one GeForce bridge could be installed on the pair of GP100s - meaning only half the potential bandwidth was available between them.

Dual NVIDIA Quadro GP100 Cards with Dual Quadro NVLink Bridges Installed

Dual NVIDIA Quadro GP100 Cards with Single GeForce RTX NVLink Bridge Installed

Dual NVIDIA GeForce RTX 2080 Cards with a Quadro NVLink Bridge Installed - Which Does Not Function

Dual NVIDIA GeForce RTX 2080 Cards with a GeForce RTX NVLink Bridge Installed

Are NVLink Bridges for Quadro GP100 and GV100 Cards the Same?

No. While we don't have any GV100 era NVLink bridges here to test, we know that they are the same size as those for the GP100 but are colored differently and sold separately by NVIDIA. Other sources are also reporting that they may work with the new RTX series video cards, but we cannot confirm that.

A Pair of Quadro GP100 Era NVLink Bridges (Silver)

A Pair of Quadro GV100 Era NVLink Bridges (Gold)

Is NVLink Setup on the GeForce RTX 2080 the Same as Quadro GP100?

After testing many different combinations of cards and NVLink bridges, we were unable to find any way to turn on TCC mode for the GeForce RTX cards. That means they cannot be set up for "peer-to-peer" communication using the same method as the GP100 and GV100 cards, and attempts to test NVLink using the 'simpleP2P.exe' CUDA sample program failed.

Chart of NVIDIA Quadro GP100 and GeForce RTX 2080 NVLink Configurations and Capabilities

The chart above shows the results we found when using different combinations of video cards and NVLink bridges, including which combinations supported SLI and whether TCC could be enabled. Click to expand and see additional notes about each configuration.

Dual Quadro GP100 Video Cards Without NVLink Bridge in Peer-to-Peer Bandwidth Test

Dual Quadro GP100 Video Cards With Single GeForce RTX NVLink Bridge in Peer-to-Peer Bandwidth Test

Dual Quadro GP100 Video Cards With Dual Quadro NVLink Bridges in Peer-to-Peer Bandwidth Test

Dual GeForce RTX 2080 Video Cards With NVLink Bridge Failing Peer-to-Peer Bandwidth Test

These screenshots from the Windows command line show peer-to-peer bandwidth across cards with different types of NVLink bridges installed. The first three are pairs of GP100s with no bridge, the GeForce RTX bridge, and then dual Quadro bridges - while the last screenshot shows that the RTX 2080 cards did not support P2P communication in this test at all, regardless of what bridge was installed.

GeForce RTX 2080 Video Cards Do Not Support TCC Mode

TCC mode cannot be enabled on the GeForce RTX 2080 video cards in Windows.

How To Configure NVLink on GeForce RTX 2080 and 2080 Ti in Windows 10

Setting up NVLink on the new GeForce RTX cards is much simpler: there is no TCC mode to enable and no third graphics card needed to handle video output. All you need to do is mount a compatible NVLink bridge, install the latest drivers, and enable SLI mode in the NVIDIA Control Panel.

NVIDIA Control Panel Screenshot Showing SLI Enabled on GeForce RTX 2080 Video Cards

It is not obvious that the steps above enable NVLink, as that is not mentioned anywhere in the NVIDIA Control Panel that we could see. The 'simpleP2P.exe' test we ran before also didn't detect it, likely because TCC mode is not being enabled in this process. However, another P2P bandwidth test from CUDA 10 did show the NVLink connection working properly and with the bandwidth expected for a pair of RTX 2080 cards (~25GB/s each direction):

NVIDIA P2P Bandwidth Test Showing NVLink Working on a Pair of GeForce RTX 2080 Video Cards

How to Verify NVLink Functionality in Windows 10

There isn't an easy way to tell whether NVLink is working in the NVIDIA Control Panel, but NVIDIA does supply some sample CUDA code that can check for peer-to-peer communication. We have compiled the sample test we used above, and created a simple GUI for running it and viewing the result. You can download those utilities here.
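If you run NVIDIA's p2pBandwidthLatencyTest sample yourself, the thing to look for is the jump in off-diagonal (GPU-to-GPU) bandwidth between the P2P-disabled and P2P-enabled matrices. Here is a minimal sketch of that check in Python, using figures similar to what we measured (~5.8 GB/s over PCIe vs ~24 GB/s per direction over NVLink on a pair of RTX 2080s); the 2x ratio threshold is our own rule of thumb, not an NVIDIA specification:

```python
def nvlink_active(p2p_disabled, p2p_enabled, ratio=2.0):
    """Heuristic: the off-diagonal (device-to-device) bandwidth should jump
    well past the PCIe fallback once NVLink peer access is enabled."""
    return p2p_enabled[0][1] > ratio * p2p_disabled[0][1]

# 2x2 bandwidth matrices (GB/s) resembling our RTX 2080 measurements:
# diagonal = local memory bandwidth, off-diagonal = GPU-to-GPU transfers
disabled = [[389.1, 5.8], [5.8, 389.4]]
enabled = [[386.6, 24.2], [24.2, 389.8]]
print(nvlink_active(disabled, enabled))  # True - NVLink is carrying P2P traffic
```

If enabling P2P leaves the off-diagonal numbers at PCIe levels, the bridge is not being used.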

Do GeForce RTX Cards Support Memory Pooling in Windows?

Not directly. While NVLink can be enabled and peer-to-peer communication is functional, accessing memory across video cards depends on software support. If an application is written to be aware of NVLink and take advantage of that feature, then two GeForce RTX cards (or any others that support NVLink) could work together on a larger data set than they could individually.
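At the CUDA level, this kind of application support is generally built on the peer-to-peer API. The following is a rough sketch of that mechanism - enable peer access between two devices, then copy a buffer directly from one GPU's memory to the other's - and not the actual implementation used by any particular renderer:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int canAccess01 = 0, canAccess10 = 0;
    // Ask the driver whether device 0 can map device 1's memory, and vice versa
    cudaDeviceCanAccessPeer(&canAccess01, 0, 1);
    cudaDeviceCanAccessPeer(&canAccess10, 1, 0);
    if (!canAccess01 || !canAccess10) {
        printf("Peer access not available between GPU 0 and GPU 1\n");
        return 1;
    }

    // Enable bidirectional peer access
    cudaSetDevice(0);
    cudaDeviceEnablePeerAccess(1, 0);
    cudaSetDevice(1);
    cudaDeviceEnablePeerAccess(0, 0);

    // Allocate a buffer on each GPU; with peer access enabled, a kernel on
    // GPU 0 could also dereference d1 directly (at NVLink latency)
    const size_t bytes = 64 << 20;  // 64 MB
    float *d0 = nullptr, *d1 = nullptr;
    cudaSetDevice(0); cudaMalloc(&d0, bytes);
    cudaSetDevice(1); cudaMalloc(&d1, bytes);

    // Direct GPU-to-GPU copy; over NVLink this avoids a round trip
    // through system memory on the PCIe bus
    cudaMemcpyPeer(d0, 0, d1, 1, bytes);

    cudaFree(d1);
    cudaSetDevice(0);
    cudaFree(d0);
    printf("Peer copy completed\n");
    return 0;
}
```

The application still has to decide which data lives on which card and tolerate the extra latency of remote accesses - which is exactly why pooling requires developer effort rather than "just working."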

What Benefits Does NVLink on GeForce RTX Cards Provide?

While memory pooling may not 'just work' automatically, it can be utilized if software developers choose to do so. Support is not widespread currently, but Chaos Group has it functioning in their V-Ray rendering engine. Just like the new RT and Tensor cores in the RTX cards, we will have to wait and see how developers utilize NVLink.

What About SLI Over NVLink on GeForce RTX Cards?

While memory pooling may require special software support, the single NVLink on the RTX 2080 and dual links on the 2080 Ti are still far faster than the old SLI interconnect. That seems to be the main focus for these gaming-oriented cards: implementing SLI over a faster NVLink connection. That goal is already accomplished, as shown in benchmarks elsewhere.

Will GeForce RTX Cards Gain More NVLink Functionality in the Future?

Future application and driver updates will change the situation on a program-by-program basis, as software developers learn to take advantage of NVLink. Additionally, the 2.5 Geeks Webcast interviewed an NVIDIA engineer who indicated that NVLink capabilities on these cards will be exposed via DirectX APIs - which may be different from the CUDA-based P2P code we tested here.

Does NVLink Work on GeForce RTX Cards in Linux?

My colleague Dr. Don Kinghorn conducted similar tests in Ubuntu 18.04, and he found that peer-to-peer communication over NVLink did work on RTX 2080 cards in that operating system. This functionality in Linux does not appear to depend on TCC or SLI, so with that hurdle removed the hardware link itself seems to work properly.
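On Linux, the link state can also be confirmed directly from the capability report printed by `nvidia-smi nvlink -c` (the format shown in Dr. Kinghorn's results in the comments). As a hedged example, this small Python helper - our own illustration, not an NVIDIA tool - parses that report and extracts whether P2P is supported per link:

```python
import re

def parse_nvlink_caps(report: str) -> dict:
    """Parse 'Link N, <capability>: true/false' lines from
    `nvidia-smi nvlink -c` output into {link: {capability: bool}}."""
    caps = {}
    for line in report.splitlines():
        m = re.match(r"\s*Link (\d+), (.+): (true|false)", line)
        if m:
            link, name, value = int(m.group(1)), m.group(2), m.group(3) == "true"
            caps.setdefault(link, {})[name] = value
    return caps

# Abbreviated sample in the same format as real `nvidia-smi nvlink -c` output
sample = """\
GPU 0: GeForce RTX 2080 (UUID: GPU-xxxx)
Link 0, P2P is supported: true
Link 0, Access to system memory supported: true
Link 0, SLI is supported: true
"""
caps = parse_nvlink_caps(sample)
print(caps[0]["P2P is supported"])  # True
```

On a 2080 Ti you would expect the same capabilities reported for two links (Link 0 and Link 1) rather than one.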

Tags: NVIDIA, GeForce, RTX, 2080, 2080 Ti, NVLink, SLI, Bridge, Quadro, GP100, GPU, Memory, Pooling
Avatar Padi

Memory pooling is possible for GeForce RTX according to NVIDIA's Director of Technical Marketing, Tom Petersen, during the HotHardware 2.5 Geeks podcast:

"Petersen explained that this would not be the case for GeForce RTX cards. The NVLink interface would allow such a use case, but developers would need to build their software around that function. “While it's true this is a memory to memory link; I don't think of it as magically doubling the frame buffer. It's more nuanced than that today,” said Petersen. “It's going to take time for people to understand how people think of mGPU setup and maybe they will look at new techniques. NVLink is laying a foundation for future mGPU setup.”

edit: link fixed

Posted on 2018-10-06 08:17:01

The link in your comment seems to have been cut off, but I found the podcast episode you are referring to. Do you happen to know what time stamp this particular quote is from? I'd like to go through and listen to the context around it, but I was hoping to avoid listening to the whole hour-long podcast :-)

Posted on 2018-10-06 14:59:27
Avatar ryan o'connor


Posted on 2018-10-07 04:01:34

Yeah, talk of NVLink starts just before the 38:00 mark and goes until about 46:30. I ended up watching all of it earlier today, but thank you for the direct link :)

I am going to listen to just that ~8 minute portion again tomorrow, and then write some thoughts.

Posted on 2018-10-07 07:12:46
Avatar ryan o'connor

No problem! Interested to hear what you think

Posted on 2018-10-07 18:30:36

Okay, here is the section that I think bears most closely on what our article above covers. It goes from about 41:56 to 44:15 in the video above and addresses two questions:

Interviewer: "NVIDIA collective communications library, the NCCL library, for developing atop NVLink, will GeForce users get access to that for playing with communications and buffers?"

NVIDIA Engineer: "I expect that the answer is 'yes' to that. So NVLink is a software-visible capability, and it's gonna be exposed primarily through the DX [DirectX] APIs. I'm not sure exactly... NCCL, I'm not super familiar with that, but the DX APIs will expose NVLink."

Interviewer: "I had a question, generally speaking, in terms of when you were talking about "hey, what's in your frame buffer?" - in the way I understand the way NVLink works in machine learning and supercomputers, you know, high performance computing - you now have, let's say in the case of two 8GB frame buffer cards, you now have a contiguous 16GB frame buffer. Is that too simplified, simplifying it too much?"

NVIDIA Engineer: "I think that sets the wrong expectation, right? When people say that, they're trying to say 'I can now game with 16GB textures.' And it's really that style of memory scaling will require app work, right? It's not just gonna magically share that memory. Now it's true that you could set it up to do that, right? You could set up a memory map so that, you know, effectively it looked like a giant frame buffer - but it would be a terrible performance thing. Because the game would really need to know that there is latency to access that second chunk of memory, and it's not at all the same. So think of it as: it is true that this is a memory-to-memory kind of link, but I don't just think of it as magically doubling the frame buffer. It's much more nuanced than that today, and it's going to really take time for people to understand 'hey, NVLink is changing the way I should think about my multi-GPU setup and, effectively, maybe I should start looking at new techniques,' right? And that's why we did NVLink. NVLink is not really to make SLI a little bit better, it's to lay a foundation for the future of multi-GPU."

So it sounds to me like what is going on, for these GeForce cards, is that they are going to expose NVLink capabilities in a different way than Quadro cards have. That makes sense, in a way, since GeForce cards are aimed at a different audience (mainstream, largely gamers) and need to be accessible to game developers in ways that they are already somewhat familiar with. However, if NVIDIA only allow access to NVLink on GeForce cards through DirectX APIs then that may interfere with using it in applications that are more focused on GPU computation.

I think I will add one more section onto the article above, talking about how just because the traditional way to test NVLink GPU communication doesn't work on the GeForce cards does not mean they will never be able to work together in a similar way. We are, of course, very early in the release of this RTX / Turing GPU generation - and both other APIs / approaches to the issue as well as future driver updates could change the situation :)

Posted on 2018-10-08 18:39:50
Avatar Padi

Amazing summary. Thank you for taking the effort to transcribe it. This was the way I understood it when OTOY talked about their implementation. It does not automagically double the VRAM, and there will be a speed hit every time assets need to be exchanged over NVLink. The hope is that this penalty is much smaller than going out-of-core to system memory to fetch assets which don't fit in the 11 GB of VRAM but might fit in a 22 GB pool of VRAM.

The good news is that GPU render engines are already actively working on using the new API as described in the post I linked on an older article here:


The odds are good that we will benefit from the GeForce NVLink as well and the 2080 Ti will have better bandwidth compared to the 2080 cards.

Posted on 2018-10-08 19:01:55

Between the potential of NVLink and RT cores, I think there will be a lot of growth room for GPU rendering on this generation of cards. I am excited to see where it goes, and to test Octane, Redshift, and V-Ray as they release updates that utilize Turing's capabilities. It may also be interesting to replicate the testing above once we have a pair of RTX 2080 Ti cards (we have only one at the moment) to see if they report a different number of links than the vanilla 2080 cards.

Posted on 2018-10-08 19:21:38
Avatar Padi

Sorry for the cut-off link. You will find the context in the article. I did listen to the interview a month ago but I don't remember specific timestamps:


Posted on 2018-10-07 05:53:00

Thank you for sharing that! I am going to re-listen to the applicable part of the interview tomorrow, and write some of my thoughts on it here in the comments.

Posted on 2018-10-07 07:13:58

I just posted a reply above, to Ryan O'Connor, addressing the video interview you brought up.

Posted on 2018-10-08 18:40:19
Avatar Michael

Any chance you can test this on linux, where TCC mode is not an issue?

Posted on 2018-10-07 03:41:59

That may be a little bit outside my area of expertise, but it would certainly be interesting to see if there is any different behavior on Linux.

Posted on 2018-10-07 07:54:54
Avatar Donald Kinghorn

Hi Michael, William asked if I could comment ... I've just done a bunch of NVLINK testing in Linux (Ubuntu 18.04, CUDA 10.0, and driver 410). It looks like full NVLINK, but with a bit lower performance than you would see on the V100 server hardware. I'll have a full post up at https://www.pugetsystems.co... in a couple of days. I'll be looking at TensorFlow performance along with general performance like the following testing... the following is 2 x RTX 2080 Founders Edition cards:

kinghorn@i9:~/projects/samples-10.0/bin/x86_64/linux/release$ nvidia-smi nvlink -c
GPU 0: GeForce RTX 2080 (UUID: GPU-2cac9708-1ed8-0312-ada8-ce3fb52a556c)
Link 0, P2P is supported: true
Link 0, Access to system memory supported: true
Link 0, P2P atomics supported: true
Link 0, System memory atomics supported: true
Link 0, SLI is supported: true
Link 0, Link is supported: false

cudaMemcpyPeer / cudaMemcpy between GPU0 and GPU1: 22.53GB/s

Unidirectional P2P=Disabled Bandwidth Matrix (GB/s)
D\D 0 1
0 389.09 5.82
1 5.82 389.35
Unidirectional P2P=Enabled Bandwidth (P2P Writes) Matrix (GB/s)
D\D 0 1
0 386.63 24.23
1 24.23 389.76
Bidirectional P2P=Disabled Bandwidth Matrix (GB/s)
D\D 0 1
0 386.41 11.59
1 11.57 391.01
Bidirectional P2P=Enabled Bandwidth Matrix (GB/s)
D\D 0 1
0 382.58 48.37
1 47.95 390.62

Posted on 2018-10-11 15:26:25

Thank you for doing that testing, Don! It is looking like the issue with NVLink on these GeForce RTX cards is purely because NVIDIA is not allowing TCC mode in the current Windows drivers. I will update the article text (and maybe the title too) to better reflect that.

Posted on 2018-10-11 16:58:52
Avatar Michael

Great, that's terrific news. The 2080 Ti, as a TU102 chip, should support twice the bandwidth of the 2080. I'm curious whether this results in training speedups with memory pooling. Will look forward to your writeup.

Posted on 2018-10-12 18:54:46

I couldn't test P2P bandwidth on Windows, of course, but I was able to see that the 2080 Ti cards have two links available - compared to just one on the vanilla 2080 (and none on the upcoming 2070s, as I understand it). So assuming you are using an OS and software setup that works properly with NVLink, then a pair of 2080 Ti should indeed have double the P2P bandwidth of the 2080 :)

Posted on 2018-10-12 18:57:26
Avatar Lee aste

So can't I use a GP100 or GV100 NVLink bridge for two 2080 Tis?
I use a mATX motherboard, so to do SLI I need a 2-slot NVLink bridge... but there is no 2-slot NVLink bridge except the Quadro NVLink bridges.
Is there a way?

Posted on 2018-10-08 10:26:42

The Quadro bridge did not work on GeForce RTX cards for us, so I would not expect it to work for you either. Moreover, I would be concerned about using two of these dual-fan cards right next to each other. The heatsink configuration on the NVIDIA Founders Edition cards in this generation is not built for having cards next to each other without at least one slot in-between for airflow. I think that may be why they don't offer a 2-slot NVLink SLI Bridge.

Posted on 2018-10-08 18:11:30

I did some testing under Win10 1809 and 416.16 drivers, and during my single-application monitoring of VRAM usage I hit 11.7GB - 700MB over (keep in mind this is a single app, not combined OS + app). This was an "SLI" aware app that does indeed use both GPUs, with a supporting NVIDIA profile under DX11. If the 700MB was swapping to main system RAM then I would have expected to see a sharp decline in FPS, but no such decline happened at the point the app exceeded 11GB usage; FPS was very consistent. So in my real-world test case, and not the "discussion" case, it seems that memory pooling is happening. In my case the application was a flight simulator (Lockheed Martin's Prepar3D V4.x). I can probably run more tests by increasing the shadow map size to sharpen shadow quality, as this will use more VRAM and should push me further past 11GB.

Posted on 2018-10-14 16:59:55

Thank you for sharing your experience! If all that is needed is enabling SLI, in order to have memory pooling, that would be nice... but it is definitely a change from how NVLink and memory pooling worked in the past (on Quadro cards). I hope NVIDIA puts out some more official information about this, and it would be nice if they also put more details in their control panel - especially showing memory pooling and usage.

Posted on 2018-10-15 19:22:49
Avatar -V-

VRay apparently got it to work.

Posted on 2018-10-15 01:26:31

Chaos Group (V-Ray) and OTOY (OctaneRender) have both talked about it, but I haven't seen anything published with detailed information directly showing NVLink at work on GeForce RTX cards in either of those rendering engines. I would love to know more about what they have actually tested and how they got memory pooling working, if indeed they have. It would also be great if they would update their benchmarks to utilize it - both V-Ray Benchmark and OctaneBench are lagging behind their latest releases :(

Posted on 2018-10-15 19:11:22
Avatar nejck

You guys should be aware that you probably need to enable "SLI" in order for the NVLink to work on the RTX series. Memory pooling also works if implemented in the application. I'd recommend taking a look at this post:

Posted on 2018-10-15 05:58:16

That is really weird - they used bridges from Quadro cards and it worked for them, when that definitely did not work for us (not even SLI mode was available when trying to use those bridges).

Hmm... GV100 bridges? We used ones from the GP100. Maybe the bridges themselves have been updated over the Quadro GP100 -> GV100 generation change? The coloring is different - the bridges in that Facebook post look golden in color, rather than silver like the Quadro GP100 bridges we have.

It is good to see that they are using blower-style RTX cards, though - looks like the same Asus series we tested recently, but they have the 2080 Ti variants (lucky!).

I still am unsure how software like this is functioning with P2P over NVLink without being able to put the cards into TCC mode... but maybe memory pooling in this generation somehow doesn't need that? I'll play with this some more when I have a chance.

Posted on 2018-10-15 19:20:26
Avatar Nejc

Here we go, they got it working on the regular RTX NVLink bridges :) https://www.chaosgroup.com/...

Would be fun to see if it works on the Quadro bridges too I guess. Thanks for testing btw :)

Posted on 2018-10-19 12:45:01

Thank you for posting that! It lines up with some things one of the other guys here (Matt) learned while talking with NVIDIA reps at Adobe Max this week. I am going to test two RTX 2080 cards again, without a third card this time, and make sure they are in SLI mode. That *might* enable P2P functionality in the test we used above, or it may be that the type of testing we were trying to do just won't play nicely with the new RTX cards no matter what. It does look like some of NVIDIA's own tools, even in CUDA 10, are not reporting things properly (like the incorrect VRAM usage mentioned in the link you posted)... which may well be part of why NVLink didn't work as we expected it to.

Posted on 2018-10-19 17:14:01
Avatar Eths

I noticed that for Geforce RTX there are only 3 and 4 slot connectors, but on the Quadro RTX pages there are 2 slot connectors too. I noticed that there was no mention of the new Quadro RTX NVLink here, I am interested if that 2 slot bridge for Quadro RTX would work on Geforce RTX (for space constrained systems, less optimal thermals is tolerable). I saw a similar question below but I think older Quadro NVLinks were tested since it was not specified.

Posted on 2018-11-06 05:43:04

They're not yet available to purchase and test, but I fully expect the new Quadro RTX NVLink bridges should work on the GeForce cards. The bridges from the old Quadro GP100 didn't work, but I have the feeling that's because it was an older generation of the technology. We don't have any to test here, but other sources online have indicated that bridges from the Quadro GV100 do work with the GeForce RTX cards, which if true is a good sign for future Quadro bridges working as well.

Posted on 2018-11-06 05:45:58
Avatar Eths

Thank you for the information.

Posted on 2018-11-06 05:53:32

We got the Quadro RTX NVLink bridges in last week, and they work just fine on the GeForce RTX cards. If you are going to use a 2-slot bridge, though, make sure to get single-fan video cards that exhaust heat out the back. The multi-fan models do not do well stacked next to each other, and pump a lot of heat back into the computer.

Posted on 2019-01-09 17:20:49
Avatar Eths

Thank you for the information. As for the heat, I recently got four Zotac blower cards and out of the box they would overheat. With a tiny modification to the backplates they are fine though. More information here, I hope links are ok to post (no ads on my site) https://codingbyexample.com...

Posted on 2019-01-09 22:43:50

Yeah, if the fan isn't somewhat recessed within the overall heatsink / shroud and the cards have backplates it can prevent sufficient intake airflow for proper cooling. Removing the backplate, as you did, is certainly an option - though it is worth noting that it may void the manufacturer warranty on some video cards. We try to use cards which have enough space between them in stock configurations to be cooled well, but making folks aware of this issue is a good idea! Thank you for your post, and yes - the link is just fine in this case (we do block spam, but not useful links).

Posted on 2019-01-09 22:51:45
Avatar Eths

Yes, we had to take a gamble on cards due to limited availability in Japan at the time. I would be interested if you opened up sales to Japan or had a company that you have worked with here.

Posted on 2019-01-09 23:16:53
Avatar Vedran Klemen

Hello, is this due to a smaller motherboard?

Posted on 2019-09-01 16:27:35
Avatar omsrisagar

Now that they are available at https://www.nvidia.com/en-u..., did you get a chance to test them? Thank you.

Posted on 2018-12-25 19:43:03

Not yet, but we have placed an order for one of each size - so we should have them sometime in early January.

Posted on 2018-12-27 17:28:33
Avatar omsrisagar

Thank you William, I appreciate your reply!

Posted on 2018-12-27 19:32:32

We got Quadro RTX NVLink bridges in last week, both 2- and 3-slot sizes, and they work just fine on the GeForce RTX cards. We still need to get a pair of Quadro RTX cards (we have one, but need a second) in order to test whether the GeForce branded bridges will work on them or not. Once we have the full set of data, we will likely publish a brief overview article charting the compatibility.

Posted on 2019-01-09 17:19:35
Avatar omsrisagar

Thank you William for informing me, I appreciate it :) It's great to know that Quadro RTX bridges work fine with GeForce RTX cards. Even I too ordered Quadro RTX bridges and have them installed with my RTX 2080 Ti cards. I am yet to test them though. This is great news!

Posted on 2019-01-11 23:07:12
Avatar Dan O (Visual)

Probably a dumb question, and I think the answer is no, but I'm having a hard time researching this topic. This is the most informative article yet, and for that I thank you. Can you use an RTX NVLink to link an RTX 2080 Ti to an RTX 2080 non-Ti?

Posted on 2019-01-11 21:27:21

In theory, NVLink (and SLI before it) is only supposed to be used to bridge identical video cards. In practice... I'm not actually sure. I may test that out, just for kicks, though in the real world I would avoid it (even if it works) because of potential complications arising from different GPUs being tied together.

Posted on 2019-01-11 21:29:46

I just tried it, and the NVIDIA Control Panel does not show the option to enable SLI (which is required in order to use NVLink functionality, at least on Windows) when using mixed GPUs. In this case, I was trying a RTX 2080 Ti + RTX 2080... but I suspect the same would happen with any mis-matched combination of video cards.

Posted on 2019-01-11 23:02:10
Avatar Dan O (Visual)

Thanks a ton man, i appreciate your response. Good to have a guy to get ahold of for help!

Posted on 2019-01-13 07:45:52
Avatar Dan O (Visual)

Hey william, so as far as gaming goes, with the Nvlink do games still need to support SLI? Again Im just learning about SLi, appreciate your patience and knowledge.

Posted on 2019-01-14 18:44:31

I don't think anything has changed in this regard. NVLink is faster than older SLI bridges, but aside from that I expect it to behave the same - at least for the purpose of games. If I am correct, then some games will benefit from SLI and some will not, depending on how they are designed... and within those that do benefit, there can be a wide range of how much gain is seen from a second card.

It is also possible that, at some point, game developers might begin using NVLink for other things besides just SLI. I have not heard of anyone doing that yet, and it is enough of a niche that it may not happen for a long time (or never), but this technology does have the potential to do more than just pure SLI if a program is written to take advantage of it.

Posted on 2019-01-14 19:17:17
Avatar Dan O (Visual)

That's so interesting, wish it was just as simple as NVLink and the system pools the resources for any GPU-related tasks! Coming from a naive gamer lol

Posted on 2019-01-15 16:54:42

It would indeed be nice, but all that NVLink itself does is provide a fast, bi-directional connection between two video cards - allowing for faster communication and avoiding the need to go through the PCI-E bus. SLI on top of that enables cooperation between the video cards on drawing frames for display in games or other 3D applications, if they support / can benefit from it. Beyond that, as I understand it, anything else a developer wants to use the NVLink connection for has to be specially coded into an application.

Posted on 2019-01-15 17:17:46
Avatar Dan O (Visual)

I returned my 2 Founders Edition 2080s before receiving my NVLink because I was told a lot of games wouldn't support it. I then bought the MSI RTX 2080 Ti Gaming Trio, but further down the road I may grab another if it's supported by all games.

Posted on 2019-01-14 18:45:57
Avatar Dan O (Visual)

From what I originally thought, NVLink doesn't run one card as master and the other as slave. But I'm unsure if that is only the GP100. I have the ROG Strix 4-slot NVLink; is the difference the master/slave config? Or only 40GB/s vs 100GB/s?

Posted on 2019-01-14 18:49:09
Avatar Juan Nunez

I doubt anyone has tried yet, but there is precedent: SLI has been enabled between two non-identical cards before. You will need to "hack" the NVIDIA driver, and this hack may no longer work (or may need to be modified/updated) on newer drivers and/or with newer GPUs.

Google Search "Different SLI" - Again, it's a hack; so no support from NVIDIA and mileage may vary.

Posted on 2019-03-12 16:19:31
Avatar Juan Nunez

How did you turn on NVLINK on an Ubuntu system? Or does it "just work" if the NVLINK Bridge is connected to the two GeForce RTXs?
* On Windows to "enable NVLINK" you have to turn on SLI.

Posted on 2019-03-11 21:57:15

I haven't done any work with NVLink on Linux myself, but from talking with Dr. Don Kinghorn (and reading his HPC Blog posts) I believe it "just works" under Linux. By that I mean that the high-speed connection between cards is functional without any additional work (once the bridge and drivers are installed). That does not necessarily mean that GPU application performance will automatically improve, though!

Posted on 2019-03-11 23:02:07
Juan Nunez

Thank you William! That is what I gathered as well, but I wanted to confirm since Dr. Kinghorn did not specifically say either way, and in the Windows world one must deliberately "turn it on". Granted, two different worlds, but there's precedent. Cheers.

Posted on 2019-03-11 23:11:11
Donald Kinghorn

I did reply on my post that you commented on, but I copied it here too as a reference for others ...

On Linux it "just works" ... any software that is peer-to-peer aware will use it by default. For the RTX 20xx cards and Titan if the NVLINK bridge is not there then things fall back to memcpy to buffers in CPU memory space. On the RTX Quadro cards the fallback is to p2p over the PCIe bus ... like the older GTX cards.

I am writing up a post where I look at 4 GPUs, and have been testing with 2 NVLINK bridges on 4 cards. That "just works" on Linux too. On Windows you have to have a display connected to one card of each pair and enable SLI.
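On Linux you can confirm the links came up with `nvidia-smi nvlink --status`. As a quick sketch, here is how the per-link lines of that kind of output could be parsed; the sample text is illustrative only, not captured from real hardware:

```python
import re

# Parse per-link speeds out of nvidia-smi-style NVLink status text.
# SAMPLE is an illustrative stand-in for `nvidia-smi nvlink --status` output.
SAMPLE = """\
GPU 0: GeForce RTX 2080 Ti
         Link 0: 25.781 GB/s
         Link 1: 25.781 GB/s
"""

def link_speeds(text):
    """Return the GB/s figure for every active link found in the text."""
    return [float(m) for m in re.findall(r"Link \d+: ([\d.]+) GB/s", text)]

speeds = link_speeds(SAMPLE)
print(len(speeds), sum(speeds))  # → 2 51.562
```

If the bridge is missing or the links are down, the per-link lines simply won't be there, so an empty list is a reasonable "no NVLink" signal.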

Posted on 2019-03-12 17:51:48

Does the 2080 Ti with NVLink run on Win 7 at all? No one has mentioned it - the reason I'm on 7 is the lower VRAM overhead - basically Win 10 eats a gig or two more of VRAM, which I could use for rendering...

Posted on 2019-03-31 07:52:25

I'm sorry, but I don't have any data on Windows 7. We haven't used that OS in a while here :(

Posted on 2019-04-01 16:23:28
Alexander Stohr

When you mention "Tesla V100", there are two options:
* the SXM2 module, which comes with full NVLink support - but it definitely won't fit in any standard or high-end gaming PC
* the PCI-E plug-in board, which comes with no NVLink at all - but then it's of little relevance here, and makes little sense to mention at the very top.
Neither of the two boards uses bridges like the ones shown in your article.

Posted on 2019-08-08 14:36:34

We don't have any Tesla V100 cards (or the smaller modules) here, but I was going off of NVIDIA's literature from the time that extolled the virtues of NVLink and the more advanced NVSwitch and spoke of them being technologies on the V100 - as they did here: https://www.nvidia.com/en-u....

Looking at pictures of the V100 PCI-E accelerators, it appears that the NVLink edge connectors are there on the PCB... but that the heatsink and backplate do not have the required cutouts to allow them to be used - an odd choice, it seems to me, for such a high-end / high-cost GPU.

Posted on 2019-08-08 16:18:41
Alexander Stohr

How about that for "silver vs. gold":
* Quadro GP100 => NVLink 1.0 with 20 GT/s
* Quadro GV100 => NVLink 2.0 with 25 GT/s

Posted on 2019-08-08 15:03:25

That would have been an accurate bit of text to include as well, though including a data transfer speed rating can confuse things a bit. Those are the ratings per link, but each connector actually supports up to two links, so the GP100 and GV100 cards have four functioning links for ~80 or 100 GT/s, respectively (and a little lower than that if you want to count GB/s instead).
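To make that link arithmetic concrete, a quick sketch - the link counts and per-link rates are the figures quoted in this thread, not an official spec table:

```python
# Aggregate NVLink rate = per-link rate x links per connector x connectors.
# Figures are the ones quoted in this thread, not an official spec table.

def aggregate_gts(per_link_gts, links_per_connector=2, connectors=2):
    """Total GT/s across all functioning links on a card."""
    return per_link_gts * links_per_connector * connectors

print(aggregate_gts(20))  # Quadro GP100, NVLink 1.0 → 80
print(aggregate_gts(25))  # Quadro GV100, NVLink 2.0 → 100
```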

Posted on 2019-08-08 16:22:28
Alexander Stohr

Using a single (non-split) link means transmitting 8 bits in parallel in one direction, so a spec value of e.g. 25 GT/s corresponds to 25 GByte/s of raw data transfer capability (in the considered direction). Bundling any number of links multiplies the data rate by the respective factor.

That is all just peak theoretical raw bandwidth, so its value can be calculated. In the real world, on real systems, the truly available data rates can and should be measured, because they are definitely lower. Factors like RAM type and speed, protocol overhead, interrupt latency and many more can have a noticeable impact. Having both values side by side shows how effective a specific setup is in practice; that allows a better understanding of systems, and helps when comparing, fixing/improving, or verifying them.

Other than that, anyone spending big money on two such identical high-end cards should not take the risk of re-using a spare device from an older setup that is not certain to be compatible.
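That peak-vs-measured comparison can be sketched in a couple of lines; the "measured" figure below is a made-up placeholder for illustration, not a benchmark result:

```python
# Peak raw rate per link: 8 bits move in parallel per transfer, so one
# transfer carries 1 byte and N GT/s ≈ N GB/s in one direction.
# The "measured" figure used below is a made-up placeholder, not a benchmark.

def peak_gbs(per_link_gts, bits_parallel=8):
    return per_link_gts * bits_parallel / 8  # 1 byte per transfer

def efficiency(measured_gbs, per_link_gts):
    """Fraction of the theoretical peak actually achieved."""
    return measured_gbs / peak_gbs(per_link_gts)

print(peak_gbs(25))          # → 25.0
print(efficiency(22.0, 25))  # → 0.88
```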

Posted on 2019-08-28 09:38:22
Alexander Stohr

"The GeForce bridge did work on a pair of Quadro GP100 cards"

Since the bridges are fairly passive devices (probably all signals are routed 1:1, though I'm not totally sure of that) with some trace-length matching on the PCB, the chance of an electrically correct match was quite good. In setups where the bridge carries power to drive a few LEDs, the chances of a fit might be a bit lower. I would not recommend it - given the risk of genuinely "fatal" consequences for your system - unless you have checked it very thoroughly. And even then the signal paths may not be tuned that well, including poor shielding from neighboring signals and from/to the environment (=> EMC).

Posted on 2019-08-08 15:10:26

If you go into Single Card mode... is it ALWAYS running at 8x or does it go 16x until you drop it back into SLI?

Posted on 2019-10-24 18:18:49
Donald Kinghorn

Hi MTG, I'm not completely sure what you are asking, but this might clear things up a bit ... maybe :-) I'm a "compute guy" so I don't know much about SLI ...

X8 and X16 (data) layout and switching is a function of motherboard design. There are only so many lanes available, so when you add or remove a device that needs them (GPUs), most modern motherboards will automatically redistribute the lanes ... but it depends on the motherboard logic.

Now, here's why it doesn't matter too much. For nearly everything you do, it doesn't make a "noticeable" difference whether you are on X8 or X16. Here's why: in the early days of GPU computing, we had to hand-code double buffering to hide latency and bandwidth limits. That means staging a buffer on the CPU side and on the GPU side of the PCIe bus. As instructions and data are consumed on the GPU, it simultaneously issues a fetch to the other buffer, which is being refilled at both ends at the same time. There is usually enough going on in the GPU cores that the buffers do not get exhausted. The end result is that the lag with X8 vs X16 is not noticeable to the end user. If the buffers were not there, it would be a huge difference!
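The double-buffering idea can be sketched in a few lines; this is a toy model using plain Python threads and lists (no GPU or CUDA involved), just to show refill overlapping with consumption:

```python
import threading, queue

# Toy model of double buffering: a "host" thread refills the next buffer
# while the "GPU" consumes the current one, so the consumer rarely stalls.
# Names and the sum() "work" are illustrative, not NVIDIA API calls.

def run_pipeline(chunks):
    filled = queue.Queue(maxsize=2)   # at most two buffers in flight

    def host_side():
        for chunk in chunks:          # stage data into the next free slot
            filled.put(chunk)
        filled.put(None)              # sentinel: no more data

    consumed = []
    t = threading.Thread(target=host_side)
    t.start()
    while (buf := filled.get()) is not None:
        consumed.append(sum(buf))     # "GPU" works on the current buffer
    t.join()
    return consumed

print(run_pipeline([[1, 2], [3, 4]]))  # → [3, 7]
```

The `maxsize=2` queue is the whole trick: one buffer is always free for refill while the other is being consumed.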

That double buffering trick is now mostly automatic with modern code libraries and frameworks. It even works for multiple GPUs. In this case a method called "pinning" is used: blocks of memory on the CPU side are allocated to match the memory addresses on the GPU side, and are reserved for memory staging for each GPU. [It's good to have twice as much CPU memory as you have total GPU memory.]

That is all CPU-GPU communication. GPU-to-GPU communication is trickier. A lot of code will just route through CPU memory space for this. That gets buffered too, but it can significantly slow things down, because GPU-to-GPU communication may happen often and you can get stalls. Methods were developed to do this "peer-to-peer" communication directly over the PCIe bus, card-to-card, avoiding CPU space. For this, any difference in bus performance will have an impact. AND the GeForce RTX GPUs do not support peer-to-peer over PCIe! They have to go through CPU memory space. The older 10xx cards did do P2P over PCIe, and the new RTX Quadros do as well. So on the GeForce RTX cards, having the NVLINK is a big plus! (in theory) NVLINK just blows away the PCIe bus for performance. (SLI helps with this too, a little, but it is not a full replacement for P2P.) ... but the impact is smaller than you might think!
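A back-of-the-envelope model shows why the missing P2P path matters; the bandwidth numbers here are illustrative assumptions, not measurements from these cards:

```python
# Rough model of GPU-to-GPU transfer time. Routing through CPU memory
# costs two PCIe hops (GPU→CPU, then CPU→GPU), while NVLink peer-to-peer
# is a single direct hop. Bandwidths are illustrative assumptions.

def transfer_ms(size_gb, bandwidth_gbs, hops=1):
    return size_gb / bandwidth_gbs * hops * 1000.0

PCIE_X16_GBS = 16.0  # assumed PCIe 3.0 x16 peak, one direction
NVLINK_GBS = 50.0    # assumed RTX NVLink bridge peak, one direction

via_cpu = transfer_ms(1.0, PCIE_X16_GBS, hops=2)    # GPU→CPU→GPU
via_nvlink = transfer_ms(1.0, NVLINK_GBS, hops=1)   # direct peer-to-peer

print(round(via_cpu), round(via_nvlink))  # → 125 20
```

As Don notes, buffering hides much of this in practice, so the real-world gap is smaller than the raw hop count suggests.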

Bottom line: newer hardware is insanely good, and programming methods (frameworks) accommodate differences without much performance loss. The end result is that things which look like they should have a large impact separately actually have only a small impact overall.

Cheers Mate!

Posted on 2019-10-25 16:04:25
Tiago Sousa

Does the NVLink bridge work with just 2 RTX 2080 Tis? I mean without a third card...
Thank you!

Posted on 2020-01-17 17:07:23

NVLink only works between pairs of cards, so you can have two cards in an NVLink pair, or four cards in two separate pairs. A third card is not necessary, nor would it be part of the NVLink setup if you had a pair of bridged cards. Moreover, it has been a while since I played around with this on Windows - but I *think* having a third GeForce card may actually make it *harder* to get a system to behave properly with NVLink in Windows.

Posted on 2020-01-17 17:55:27
Tiago Sousa

Thank you very much for your clear answer! Would it be possible to have your email contact, so I can ask you some other questions I have regarding my rendering rig?
It would be very appreciated!
Thank you once again William.

Posted on 2020-01-17 23:52:16

I'm not sure how much help I can be, but william [at] pugetsystems [dot] com will get to me :)

Posted on 2020-01-17 23:56:10
Josh

For triple-SLI setups you used a triple bridge, for the RTX 2070 and above. Same for Quadro, in case anyone gets confused by this picture.

For 4-way Quadro NVLink, from my understanding, the 1st (main) card connects to the 2nd card via the left link, the 2nd card's 2nd link connects to the 3rd card's 2nd link, and the left-side link on the 3rd card connects down to the 4th card to complete the quad.

For the 2070 and above you would use a 4-way SLI NVLink bridge. Email NVIDIA - they will post a link to buy them.

Posted on 2020-01-28 12:36:49
Donald Kinghorn

SLI is not NVLINK. On the RTX cards there is only 1 connector; they can only be connected in pairs. The older high-end Pascal and Volta Titan cards had 2 connectors. The GTX 9xx cards had SLI connectors that could be connected 2-, 3-, and 4-way. RTX NVLINK bridges come in 2-, 3-, and 4-slot spacings.

Posted on 2020-01-28 15:24:58
k8s_1

What about 4-GPU bridges like the one used in the DGX-1 workstation?

Posted on 2020-04-02 11:34:24
Donald Kinghorn

Those are waterblocked Tesla V100 GPUs, i.e. Volta-based ... If you look at the top of those cards you see 2 connectors.

Posted on 2020-04-02 13:02:32
k8s_1

Indeed, but that's beside the point - the DGX-1 station has NVLink for 4 PCIe cards.

Posted on 2020-04-16 21:54:51
Donald Kinghorn

Yup :-) Each V100 has 2 NVLINK connectors, so you can chain 4 of them together; each Turing-based RTX card has 1 NVLINK connector, so you can only connect them in pairs. The Volta cards do have 2 connectors, so for actively cooled workstation cards there are the Quadro GV100 and the Titan V (I have a couple of Titan Vs to use, and love them!). The DGX Station is a beautiful system, and the liquid-cooled setup for those V100s is very nicely done. But at $50K it's outside my personal budget :-)
Take care --Don

Posted on 2020-04-17 14:31:45
ᚢᚦᛁᚾ

Hi. Maybe a little too late to comment on this article, but I have recently updated to Windows 20H2 and for some reason, no matter what NVIDIA driver I try, I can't enable NVLink - it results in a black screen that does not recover even with a driver restart.

The problem is that I am not sure whether I should attribute this issue to the driver or the Windows update. I also recently waterblocked these GPUs, so I am not sure if it could be a physical issue, even if that doesn't make too much sense considering the cards are working normally within Windows.

Posted on 2020-11-03 23:43:18
Donald Kinghorn

I wish I could offer better advice, but I'm really not sure what is causing that. If your GPUs are working OK without the NVLINK bridge then it could indeed be an issue caused by the update. I saw a post in my news feed yesterday titled something like "How to fix black screen in Windows 10" - you might want to search for that and see if it offers any hints.

If you are doing compute with the cards then you will still have buffered GPU-GPU communication via memcpy. It will slow down some code, but probably not by more than a few percent. Best wishes --Don

Posted on 2020-11-04 01:13:54
Adam Hendry

Can you please provide another "NVLinkTest.exe" built for CUDA 11, and/or provide the source code so users can rebuild the tester for new toolkits? The current build only searches for CUDA 10.

Posted on 2021-08-20 18:48:29