
https://www.pugetsystems.com

Read this article at https://www.pugetsystems.com/guides/1030

Core i7 7820X vs Core i9 7900X: Do PCI-E Lanes Matter For GPU Rendering?

Written on September 11, 2017 by William George

Introduction

PCI-Express is the primary connection for adding expansion cards - from powerful video cards to simple USB controllers - to a computer. It has been updated several times since its release in 2004, with PCI-E 3.0 being the current version, and it is also available in several slot sizes. These are described in terms of "lanes", with x1, x4, x8, and x16 being the common sizes found today.


Each generation and each slot / lane size brings with it more bandwidth for the expansion card to communicate with the rest of the computer, but sometimes cards which are capable of running at a higher speed (like x16) may run at a slower speed (x8 for example) because of other limitations within the computer. The question that can naturally arise from this situation is simple: does that reduction in speed actually lead to lower performance? And if so, how much is lost?
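To put rough numbers on how bandwidth grows with generation and lane count, here is a minimal Python sketch. The per-lane signaling rates and encoding schemes are the published PCI-E figures for generations 1.0 through 3.0, not values from this article, and the results are theoretical one-direction maximums that ignore packet overhead. The names `GENERATIONS`, `WIDTHS`, and `link_bandwidth_gb_per_s` are made up for this illustration.

```python
# Theoretical PCI-E bandwidth per direction, by generation and lane count.
# Assumption: raw signaling rate times line-encoding efficiency, with no
# allowance for packet/protocol overhead (real throughput is lower).

# generation -> (signaling rate in GT/s per lane, encoding efficiency)
GENERATIONS = {
    "1.0": (2.5, 8 / 10),      # 8b/10b encoding
    "2.0": (5.0, 8 / 10),      # 8b/10b encoding
    "3.0": (8.0, 128 / 130),   # 128b/130b encoding
}

# The common slot sizes named in the article.
WIDTHS = (1, 4, 8, 16)


def link_bandwidth_gb_per_s(generation: str, lanes: int) -> float:
    """Return theoretical one-direction bandwidth in GB/s for a link."""
    rate_gt_s, efficiency = GENERATIONS[generation]
    # One transfer carries one bit per lane; divide by 8 for bytes.
    return rate_gt_s * efficiency / 8 * lanes


if __name__ == "__main__":
    for gen in GENERATIONS:
        row = ", ".join(
            f"x{w}: {link_bandwidth_gb_per_s(gen, w):6.2f} GB/s" for w in WIDTHS
        )
        print(f"PCI-E {gen}  {row}")
```

Under these assumptions a PCI-E 3.0 x16 link tops out a little under 16 GB/s in each direction, and an x8 link at half of that.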

The answer to this question can differ from one type of expansion card to another, and indeed from one application to another. Today we are looking specifically at the impact of dropping from x16 to x8 speed on video cards, and in particular what impact that has on GPU-accelerated rendering.

Test Setup

To answer this question, we are looking at the new X299 chipset and Skylake X processors from Intel. This is an ideal test case because different CPUs in this series support different numbers of PCI-Express lanes. The Core i7 7800X and 7820X both have 28 lanes, while the Core i9 7900X and higher models have 44 lanes. With a single video card both can provide the normal x16 speed, but when you move to 2 or 3 video cards, that drops - especially on the processors that only have 28 lanes.

When the Core i7 7820X is supporting two video cards, one can run at x16 but the other is limited to x8. This uses 24 lanes of the 28 that processor has available, since it cannot provide the 32 lanes that would be needed for two cards to both operate at x16. If three video cards are used, then they all run at x8 (again, using 24 total lanes between them).
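The lane arithmetic can be sketched in a few lines of Python. This is a simplified model, not how a motherboard actually negotiates links: it assumes each video card runs at either x16 or x8 and that cards are stepped down to x8 one at a time until the total fits within the CPU's lanes. The function name `allocate_lanes` is hypothetical, and real slot wiring varies from board to board.

```python
from typing import List


def allocate_lanes(cpu_lanes: int, num_cards: int) -> List[int]:
    """Return per-card link widths (x16 or x8) that fit within cpu_lanes.

    Simplified model: start every card at x16, then step cards down to x8,
    last card first, until the total fits. Raises ValueError if even x8 for
    every card exceeds the lane budget.
    """
    widths = [16] * num_cards
    for i in reversed(range(num_cards)):
        if sum(widths) <= cpu_lanes:
            break
        widths[i] = 8
    if sum(widths) > cpu_lanes:
        raise ValueError(
            f"{num_cards} cards need at least {8 * num_cards} lanes at x8, "
            f"but only {cpu_lanes} are available"
        )
    return widths


if __name__ == "__main__":
    for lanes in (28, 44):
        for cards in (1, 2, 3):
            print(f"{lanes} lanes, {cards} card(s): {allocate_lanes(lanes, cards)}")
```

With 28 lanes this model gives x16, then x16 / x8, then x8 / x8 / x8, matching the 7820X behavior described above. With 44 lanes it gives x16 / x16 for two cards and x16 / x16 / x8 for three.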

On the other hand, the Core i9 7900X with 44 lanes can support two video cards at x16, and when three are used it keeps two of them at x16 and runs the third at a slower x8 speed. If there is any performance benefit from video cards running at x16 instead of x8, it should show itself when two or three video cards are in use on these differing CPUs. Here is the hardware we tested to find out:

The three rendering engines we included cover a wide portion of the GPU-accelerated rendering market, and should be a good sample size. If the results from all three of these agree, then they should be applicable across most similar applications as well.

Benchmark Results - OctaneRender

First up, here are the results from running OctaneBench across 1, 2, and 3 video cards on each CPU:

OctaneBench 3.06.2 PCI-E x8 vs x16 Comparison

There is almost no difference in performance showing here. None is to be expected with a single video card, since both CPUs can run one at a full x16, but the fact that the gap is less than 1% (well within the margin of error) indicates that CPU speed itself is not affecting performance either. That is helpful: otherwise CPU speed would be another variable that could explain any differences that show up, and now we don't need to worry about it.

In the 2- and 3-card comparisons, there is again no substantial difference between the two CPUs and the way they allocate PCI-Express lanes. The biggest gap is at 2 video cards, with the 7820X based system coming in 1.2% slower, but I think that is still well within the margin of error. Certainly it would be negligible in real world usage.

Benchmark Results - FurryBall

Next up is FurryBall RT, which has a built-in benchmark. It provides three results for different aspects of the rendering process: Ambient Occlusion, Direct, and Indirect. We have combined all three into a single graph to keep this article from getting too long:

FurryBall RT Benchmark PCI-E x8 vs x16 Comparison

The results here are technically not as close as they were with Octane, with up to a 3.2% difference, but that difference goes in favor of different systems depending on the test. Because neither configuration is consistently faster, this again appears to be within the margin of error.

Benchmark Results - V-Ray

And finally, we look at the GPU portion of V-Ray Benchmark. The main V-Ray application has moved on to version 3.6, so this test is a bit dated now, but it isolates GPU rendering performance well (it also has a CPU option, which was turned off for this article):

V-Ray Benchmark 3.57.01 PCI-E x8 vs x16 Comparison

Here there is absolutely no difference between the two CPUs, and thus no difference between x8 and x16 speeds for the video cards. This may be due to the lower precision (just two significant digits in the test results) but it does line up with the findings from the other two benchmarks.

Conclusion

As shown across the results of all three tests, there is no impact from x8 vs x16 lane configurations for GPU rendering. This is good news, as it means that less expensive CPUs can be used even in multi-GPU rendering workstations. In other articles we have found that there is minimal difference between chipsets / motherboard platforms for GPU rendering, so for these applications you can focus on what really does matter: getting the most - and most powerful - video cards you can afford.

Tags: PCI, Express, PCI-E, Lanes, Scaling, x8, x16, Motherboard, CPU, Core, X, i7, i9, Skylake, Video, Card, GPU, Rendering
Sid

Thank you for providing this information!

Posted on 2017-09-22 10:05:10
بهاء التميمي

Thanks for this information.
For years I thought the number of lanes had a strong effect on performance.
So, the question:
Why are they manufactured with different numbers of lanes?

Posted on 2017-10-01 17:07:56

The PCI-E lane width - x1, x4, x8, or x16 - determines how much data can be sent, per second, to and from an expansion card. There are certainly situations where that is an important factor. For example, 10GbE network adapters usually use a x4 (or larger) connection because in order to sustain 10Gb per second of data transfer a x1 connection would not be big enough.
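The 10GbE example can be checked with a short Python sketch. It assumes a PCI-E 3.0 link (8 GT/s per lane with 128b/130b encoding), ignores packet overhead, and only considers the common slot sizes of x1, x4, x8, and x16. The names `COMMON_WIDTHS` and `smallest_width_for` are hypothetical.

```python
# Assumption: PCI-E 3.0, theoretical per-lane rate with 128b/130b encoding.
PCIE3_GBIT_PER_LANE = 8.0 * 128 / 130  # about 7.88 Gb/s per lane, per direction

# Common slot sizes; narrower or in-between widths are not considered here.
COMMON_WIDTHS = (1, 4, 8, 16)


def smallest_width_for(required_gbit_s: float) -> int:
    """Return the smallest common lane width that can sustain the given rate."""
    for width in COMMON_WIDTHS:
        if width * PCIE3_GBIT_PER_LANE >= required_gbit_s:
            return width
    raise ValueError(f"{required_gbit_s} Gb/s exceeds an x16 PCI-E 3.0 link")


if __name__ == "__main__":
    for name, rate in (("1GbE", 1.0), ("10GbE", 10.0)):
        print(f"{name}: needs at least x{smallest_width_for(rate)}")
```

A single PCI-E 3.0 lane carries just under 8 Gb/s in this model, which is why a 10 Gb/s adapter cannot be fully fed by an x1 slot and lands on x4 as the next common size.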

Video cards tend to default to the largest size (x16) in order to provide the maximum amount of bandwidth possible. However, it turns out that for many applications that is overkill. We've known that to be the case with video games for some time, which is why it's okay to use dual video cards with each one at x8 speed (though I would also argue that dual GPUs are overkill for most gaming situations... but that is another story).

What is less well known is how the PCI-E speed impacts other applications that make heavy use of the video card. It turns out that for GPU-based rendering, the lane size - at least between x16 and x8 - doesn't really matter. This helps us know what sort of configurations to recommend to customers doing this sort of work. Would x4 or x1 have an impact? Very possibly, but we haven't tested that far down - and given the way motherboards are designed, I don't think there would be a practical reason to do so.

Are there other applications where GPUs do need more bandwidth? Probably - but it comes down to how the video card is being used. The places where PCI-E lane speeds are going to have a big impact are situations where a lot of data is being sent back and forth between the card and the rest of the computer. In the case of rendering, data is sent out to the card about what the scene is... and then the card processes that and sends back updates on what the image should look like as the rendering process unfolds. That is not a lot of information, which is likely why x8 vs x16 has no real impact here. But if you had an application that was constantly sending tons of new data to the video card, and the card was sending back results just as fast, that would be much more likely to show a difference with lower PCI-E lane counts.
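A back-of-the-envelope sketch in Python shows why a one-time scene upload is unlikely to matter. The scene size and render time below are hypothetical placeholders, not measurements from these tests, and the bandwidth is the theoretical PCI-E 3.0 figure with no protocol overhead. The function name `upload_seconds` is made up for this example.

```python
# Assumption: PCI-E 3.0 theoretical rate, 8 GT/s with 128b/130b encoding.
PCIE3_GB_PER_S_PER_LANE = 8.0 * (128 / 130) / 8  # about 0.98 GB/s per lane


def upload_seconds(scene_gb: float, lanes: int) -> float:
    """Estimate seconds needed to copy scene_gb of data over a PCI-E 3.0 link."""
    return scene_gb / (lanes * PCIE3_GB_PER_S_PER_LANE)


if __name__ == "__main__":
    scene_gb = 4.0          # hypothetical scene size, not from the article
    render_seconds = 300.0  # hypothetical render time, not from the article

    for lanes in (16, 8):
        t = upload_seconds(scene_gb, lanes)
        share = 100 * t / (t + render_seconds)
        print(f"x{lanes}: upload {t:.2f} s, {share:.2f}% of total job time")
```

With these placeholder numbers the upload takes roughly a quarter of a second at x16 and roughly half a second at x8, both tiny next to a render that runs for minutes. An application that streams data to and from the card continuously would not have that cushion.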

Posted on 2017-10-02 17:12:31
Mike Ligocki

Does that imply that applications such as 3D renderers or After Effects that use CUDA would benefit from wider lanes since they are passing data back and forth at (preferably) high speeds?

Posted on 2018-11-30 16:03:59

I highly doubt AE would be impacted even by going from x16 to x4 since there is almost no performance difference between something like a GTX 1060 and a Titan V. It is just so much more CPU limited even if you try to load up on accelerated effects.

In this article, William is covering rendering in raytracing/3D rendering engines like OctaneRender. Rendering of video is completely different - I wish there was a better way for us to differentiate the two, but it seems like people in both types of software like to just use the term "rendering".

Posted on 2018-11-30 16:13:44
Salah Rabe

Good to know... it's excellent, my 7820X performs the same as the 7900X in video games. I was afraid that "only" 28 CPU lanes would perform worse than 44 when I add a second Gigabyte O/C GTX 1080. But in fact, it shows no difference. The 7900X costs twice as much as the 7820X, but its performance is far from double. I still think I made the right choice with this CPU; its cores/performance/price ratio is really better than that of its brother, the 7900X.

Posted on 2017-12-24 10:40:17
Ryan

Sure, in gaming, the CPU plays a smaller role than the GPU, which is why AMD CPUs are usually adequate for most gaming needs. Where that 7900X shines is compiling very large programs/data sets, as an example, or simply running multiple instances of programs that continually calculate data.

Posted on 2017-12-29 06:24:52
jorgehe1988

Can the 7820X handle 4 GPUs?

Posted on 2017-12-28 07:48:55

The motherboards we had available to test with these CPUs could only accommodate up to three GPUs physically. We are all waiting for X299 motherboards with space for four double-wide expansion cards, and when those become available - if they pass our testing - we will probably try this sort of test again to see if the 7820X will work or not.

Posted on 2017-12-28 07:56:24
jorgehe1988

I have the 7820X and four GTX 1070 GPUs, and I thought the Asus ROG RAMPAGE VI APEX could handle all of them, because the 7820X has 28 PCI-E lanes and four cards at x4 each use only 16 lanes. With x4 for each GPU, the render speed is the same.

Posted on 2017-12-28 08:11:28
Trevor Ketch

Thank you for the info, as most of us are not well padded to discover this information ourselves.

Posted on 2018-04-06 17:19:06
Ron

I have just about the exact same set of parts on my lab desk as I type. My only difference is I have an ASUS Prime X299 Deluxe MB and an EVGA GTX 1070 SC Black Edition video card. I am going with the i9 CPU. But the memory and HDD are the same, and I am also putting in a 4TB HDD for storage, plus the standard BD writer and multi SD card reader. All of which will be water cooled in the Thermaltake case, whenever it decides to make an appearance. Thanks for confirming my build; I just stumbled upon this URL today and was shocked at the similarities.

Posted on 2018-06-01 15:08:23
Lex -

However, the GTX 1080 series doesn't get anywhere near saturating the bandwidth of the PCI-E x16 3.0 bus, but the Titan V and the Titan RTX are different animals altogether.

Most of the communication happens across the SLI or X-Fire bridges, so that doesn't affect the bandwidth of the cards that much. The only real difference on the older cards is the fact that you can load textures and other crap into the card nearly twice as fast on the x16 vs. the x8 lane configuration. Once the game or application starts, the vast majority of communication is the same across the cards over the PCI-E bus. Very few apps/games send different information to each card. You'll notice this with a big drop in performance, despite having a bridge installed and enabled.

Much of what you see in increases are due to one chip being slightly more efficient than the other in microcode and other factors not actually relating to the PCI-E lane availability. Some motherboards allow you to select how many PCI-E lanes are active (for troubleshooting processes). If you use the same amount of lanes regardless of how many are available per CPU, you'll get a better measurement of how good a said CPU is.

Posted on 2018-12-18 06:54:30

In the type of application we were looking at here, GPU-based rendering, there is actually very little communication needed between the cards. Historically, SLI and Crossfire have not been used (and in fact have been avoided) for these programs, though that may change with adoption of NVLink on the new RTX series video cards. The bigger deal with PCI-E lanes, as you noted, is how quickly you can load texture and other scene data into the video card - and then how quickly you can get the finished frame data back from the video card when it is done processing.

We have done additional testing since this article (like this: https://www.pugetsystems.co... ) and generally found that x8 vs x16 has very little impact on performance. If you go further, to x4 or x1, the differences begin to be more noticeable. There might also be bigger differences when working with more complex scenes than what are used in the benchmarks we have available.

Posted on 2018-12-18 17:21:39

disqus_KQbFSURpzi 5+

Posted on 2019-01-01 06:05:40
Motion Graphic

Thank you for your info

My question: on X299, the number of M.2 drives doesn't seem to affect the PCI-E lanes, judging by the picture I uploaded from this link:

https://uploads.disquscdn.c...

https://www.gigabyte.com/Fi...

But Linus said that on X399 or X299 every M.2 drive takes 4 PCI-E lanes:

https://uploads.disquscdn.c...

See this link, at 2:34:

https://www.youtube.com/wat...

I need to add three M.2 NVMe drives to both PC systems (X299 and X399) without affecting the two GPUs running at PCI-E x16 in each system.

And if I need to add another graphics card to the X299 system, is x16 or x8 best for the primary 4K preview monitor? I have two liquid-cooled GPUs at x16 for V-Ray RT production rendering.

Posted on 2019-01-10 18:46:53

You should be able to get away with two x16 GPUs and three x4 M.2 drives on either X299 or X399 - both ought to have sufficient PCI-E lanes, if the motherboard is laid out correctly. A third video card will complicate things, though... you'd probably need to end up going x16 / x8 / x8 for the three GPUs. That is likely okay, though, since x8 is usually not all that much slower than x16 for GPU based rendering (and most other GPU tasks as well).
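The lane budget behind that answer can be sketched in Python. This is a worst-case simplification: it assumes every M.2 drive takes four lanes directly from the CPU, whereas on some boards one or more M.2 slots are wired to the chipset instead. It uses the 44-lane X299 CPUs covered in the article, and the function names are hypothetical.

```python
from typing import Sequence

CPU_LANES = 44  # Core i9 7900X and higher, per the article


def lanes_needed(gpu_widths: Sequence[int], m2_drives: int, lanes_per_m2: int = 4) -> int:
    """Total lanes used if every GPU and M.2 drive hangs off the CPU."""
    return sum(gpu_widths) + m2_drives * lanes_per_m2


def fits(gpu_widths: Sequence[int], m2_drives: int, cpu_lanes: int = CPU_LANES) -> bool:
    """True if the configuration fits within the CPU's lane budget."""
    return lanes_needed(gpu_widths, m2_drives) <= cpu_lanes


if __name__ == "__main__":
    for gpus in ([16, 16], [16, 16, 8], [16, 8, 8]):
        total = lanes_needed(gpus, m2_drives=3)
        verdict = "fits" if fits(gpus, 3) else "does not fit"
        print(f"GPUs {gpus} + 3 M.2 drives: {total} lanes, {verdict} in {CPU_LANES}")
```

Under that assumption, two x16 GPUs plus three x4 drives come to exactly 44 lanes. Adding a third GPU at x16 / x16 / x8 would need 52, while x16 / x8 / x8 brings the total back to 44.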

Posted on 2019-01-10 21:46:39
Motion Graphic

Thank you very much, I appreciate your help.

But do the three M.2 drives each take 4 PCI-E lanes, meaning 12 lanes in total, or do they share one set of 4 lanes?

According to the Gigabyte link below:

https://www.gigabyte.com/Fi...

The number of M.2 drives didn't affect the lanes: with three GPUs there are still two at x16 and one at x8, plus three M.2 drives.
Is that correct, based on the Gigabyte link, for a 44-lane X299 CPU?

Posted on 2019-01-11 13:21:51

Which motherboard are you looking at? When I used that configurator to set up 3 GPUs and 3 M.2 drives on the X299 Designare EX (the board we use here) it showed the PCI-E slots at x8 / x16 / x8 with all three M.2 slots in use.

Posted on 2019-01-14 20:02:23