Read this article at https://www.pugetsystems.com/guides/1181

Redshift 2.6.11 Multi-GPU Performance Scaling

Written on June 18, 2018 by William George

Introduction

Redshift is a production-quality, GPU-accelerated renderer. Traditionally this type of rendering was done on CPUs, but graphics processors (GPUs) are ideal for highly parallel tasks like this - and it is easier to fit multiple video cards in a single computer, to boost performance, than multiple CPUs.

Speaking of multiple cards, how well does rendering speed scale across multiple GPUs in Redshift? Are there diminishing returns as more cards are added? We are putting Redshift 2.6.11 to the test, looking at scaling from one to four video cards in a single workstation.

Test Setup

To see how increasing the number of video cards in a system affects performance in Redshift, we ran the benchmark included in the demo version of Redshift 2.6.11 with 1, 2, 3, and 4 NVIDIA GeForce GTX 1080 Ti video cards. This benchmark uses all available GPUs to render a single, still image. For animations, there are also methods to assign a different frame to each video card - which may be more efficient in some situations, but is outside the scope of the benchmarking tool Redshift provides.

On the hardware side, we wanted to use a high clock speed processor so that the video cards could really shine. We also needed a platform that would support as many video cards as possible in a large tower workstation. Given that combination of goals, the configuration which made the most sense was Intel's Xeon W - specifically, the W-2125 processor on a Gigabyte MW51-HP0 board. That provided the right PCI-Express slot layout for up to four GPUs, and the Xeon W-2125 runs fast: 4.0GHz base and up to 4.5GHz turbo.

If you would like full details on the hardware configuration we tested on, the complete system specifications are linked in the original article.

Benchmark Results

Here are the Redshift 2.6.11 benchmark render times with 1, 2, 3, and 4 GeForce GTX 1080 Ti 11GB graphics cards:

[Chart: Redshift Benchmark GeForce GTX 1080 Ti GPU Performance Scaling from 1 to 4 Video Cards]

Or another way to look at it, here is how adding video cards increased rendering performance - shown as a percentage compared to the speed of a single card:

[Chart: Redshift Benchmark GeForce GTX 1080 Ti Performance Scaling as Percentage]

Analysis

As demonstrated above, video card performance in Redshift scales very well as additional cards are added. It isn't quite perfect, or linear, scaling - there is some level of diminishing returns - but it is still more than enough to justify their use in multi-GPU workstations.
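To make the "not quite linear" point concrete, scaling efficiency can be computed directly from per-configuration render times. The times in this sketch are hypothetical placeholders, not our measured results:

```python
# Compute multi-GPU speedup and scaling efficiency from render times.
# These render times are hypothetical placeholders, not measured values.
render_times = {1: 900.0, 2: 465.0, 3: 320.0, 4: 250.0}  # seconds (hypothetical)

baseline = render_times[1]
for gpus, seconds in render_times.items():
    speedup = baseline / seconds
    efficiency = speedup / gpus * 100  # 100% would be perfect linear scaling
    print(f"{gpus} GPU(s): {speedup:.2f}x speedup, {efficiency:.0f}% efficiency")
```

With numbers like these, each added card still contributes a large fraction of a full card's worth of performance, which is the pattern the charts above show.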

Conclusion

Performance in Redshift scales very well across multiple GPUs - but that statement can lead to incorrect conclusions. Doubling the number of video cards in a system almost doubles performance, but does *not* double the price of the computer. Much of a workstation stays the same as more video cards are added, so the percentage increase in price for an additional card is usually less than the percentage increase in Redshift performance you get in return. Looking at the total price of a system, a few lower-cost cards can often outpace one or even two top-end GPUs - so multiple video cards are the way to go for the best value in Redshift.

Recommended Systems for Redshift

Tags: Multi, GPU, Scaling, Rendering, Redshift, Benchmark, NVIDIA, GeForce, 1080 Ti, Performance, Intel, Xeon W, Video, Card

Thanks for this bench, Will. I have two questions.
First: Is this scalability maintained independently of the model? Are 2x 1060 ti getting the same percentage per unit?
Second: Is it possible to multi-GPU render with 2+ different models? 1x 1080 Ti + 1x 1070 Ti + 1x 1050, for example? As far as I know, SLI has nothing to do with multi-GPU rendering; is that right? Do you know if there's any downside in doing so?

Well, a little more than 2.

Posted on 2018-06-22 19:28:42

You are correct that SLI has nothing to do with GPU based rendering, along with most other non-gaming GPU applications.

Mixing GPU models should work just fine, but note that you will be limited on scene size and complexity by the lowest amount of RAM on any of the cards used (the memory amounts do not add together or average - the lowest amount limits the system as a whole).
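That memory constraint is easy to express in code. A minimal sketch, with a hypothetical card list (the specific models and VRAM amounts here are just examples):

```python
# In a mixed-GPU Redshift setup, usable VRAM for a scene is capped by the
# smallest card rather than the sum. The card list here is hypothetical.
vram_gb = {"GTX 1080 Ti": 11, "GTX 1070 Ti": 8, "GTX 1050": 2}

scene_limit = min(vram_gb.values())  # the 2 GB card constrains everything
total = sum(vram_gb.values())        # 21 GB total, but NOT usable as one pool
print(f"Scene VRAM limit: {scene_limit} GB (not the {total} GB total)")
```

In other words, adding a small-memory card to a big-memory system speeds up rendering but shrinks the largest scene you can fit.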

We've actually gotten that question a surprising number of times, though, so I might go ahead and do some testing on a mixed-card setup. I don't have a 1050 handy, but I could probably put together a 1080 Ti + 1070 Ti + 1060 or something along those lines.

And as for scaling, I don't have access to a full four of any other GPU models right now... but based on what I've seen in other GPU renderers I would expect scaling to be roughly the same (in terms of % increase with each card added) no matter the model of card used.

Posted on 2018-06-22 19:34:10

Okay, the 1080 Ti cards were in use - but here is what I got with a 1070 Ti + 1060: 546 seconds.

If it had been two 1070 Ti cards, I would have expected around 463 seconds based on testing we've done here ( http://puget.systems/go/148263 ) combined with the scaling seen in the article above. For two 1060s, I would have expected around 683 seconds. The result above from combining the two falls between those, so I'd say that mixing GPUs - at least within the same generation - works as expected.
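One way to sanity-check a mixed-card result like that: if each card's rendering throughput (the reciprocal of its single-card render time) simply adds, the ideal combined time is the harmonic combination of the individual times. A rough sketch, using hypothetical single-card times rather than our measured ones:

```python
# Predict an ideal mixed-GPU render time by summing per-card throughput.
# The single-card times below are hypothetical, not measured values.
t_card_a = 880.0   # seconds for one card alone (hypothetical, 1070 Ti-class)
t_card_b = 1300.0  # seconds for one card alone (hypothetical, 1060-class)

# Throughputs (renders per second) add; invert the sum for the combined time.
combined = 1.0 / (1.0 / t_card_a + 1.0 / t_card_b)
print(f"Ideal combined render time: {combined:.0f} seconds")
# Real results will come in a bit slower due to multi-GPU overhead.
```

The measured 546-second result landing between the two-of-each predictions is consistent with this throughput-adding model, minus a little scaling overhead.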

Posted on 2018-06-22 20:19:11

Thanks, Will. Being limited by the lowest amount of RAM is a big deal.

Yeah, seems like mixing two different models works as expected. By the way, can I have two different models, using one (e.g. 1070ti) for display/viewport and the second one (e.g. 1080ti) exclusively to render, without it being limited by the lowest memory? Would this setup be interesting to make faster renders?

Posted on 2018-06-23 16:02:28

There wouldn't be any performance benefit to leaving one of the cards for display only - it would be just like rendering on a system with one fewer video card. It *might* be beneficial if you wanted to keep working in other 3D-heavy applications while the rendering was taking place, I suppose.

Posted on 2018-06-25 16:08:23

Yeah, that's the idea. But maybe it's better to just stick with 2 GPUs.

Posted on 2018-06-25 16:11:40

Is there a link anywhere to testing with multiple GPUs and CPUs? I wonder what the ramifications are of:
1. GPUs on one CPU, storage on the other (PCIe M.2, ioFX, etc.)
2. 1 GPU per processor, PCIe storage RAIDed across processors
3. GPU on each processor, storage on a separate M.2 lane (not sure how that impacts lane communication)

I have an Asus Z9 PE-D8, which actually has an evil secret: the main GPU slot is PCIe 2.0 only, so I can't put both GPUs on one processor like I wanted. It got me thinking about the best setup for multiple GPUs. Maybe it doesn't make a difference...

Posted on 2018-12-09 10:04:16

Because Redshift is a GPU based renderer, we haven't tested it much on dual-CPU systems. However, from my recent work I believe Redshift generally does better with a high clock speed CPU - and dual processor systems don't generally offer the highest clock speeds, so I don't think that would be an ideal platform unless you have need for a lot of CPU cores in other programs.

You mentioned that your main GPU is on a PCI-E gen 2 slot, but that shouldn't be a big deal. We did testing with PCI-E 3.0 vs 2.0, as well as lane width scaling, and found there was almost no difference in Redshift between 2.0 and 3.0. As you lose slot bandwidth there is a little bit of performance degradation, but it isn't bad at x8 (vs the normal x16) and even x4 is decent... just avoid using x1 slots, as they are far, far slower:

https://www.pugetsystems.co...

I also don't think the location of the storage would matter much, at least not with the benchmarking we are doing.

Posted on 2018-12-10 17:14:43
jhsu

I'm thinking about putting together a 6-GPU render rig for Redshift. Seeing your test results for FurryBall, I noticed the CPU has little effect on performance:
https://www.pugetsystems.com/labs/articles/FurryBall-GPU-Rendering-Platform-Comparison-Skylake-X-Xeon-W-and-Threadripper-1024/

I'm wondering if it's the same with Redshift, and if so, would I be able to run 6 GPUs at x8 PCIe speed on an older X99 + i7-4930K system with good performance, or should I be looking at something more up to date, like an X299 Sage + i7-9800X?

Posted on 2019-02-21 20:11:07

Older platforms (chipset + CPU combinations) will almost always have fewer PCI-Express lanes than newer ones, and if you go back far enough may have slower ones as well (PCI-E gen 2 vs 3). And then of course you have older CPUs, which are generally lower in clock speed and with fewer cores. That can have an impact on GPU rendering when there isn't enough CPU power to keep up with what the video cards are doing, but it varies from one render engine to the next.

Six GPUs is also tricky, since it means either a non-standard size motherboard or single-slot video cards... both of which have disadvantages. It is hard to make a specific recommendation without knowing more about your exact situation, budget, and goals - but in general, I would stick with a modern CPU and motherboard if at all possible. I would also recommend at minimum one CPU core per video card (ideally a couple extra) and system RAM that is at least double the total amount of GPU memory across all the video cards you are using.
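Those rules of thumb are easy to encode as a quick sizing check. All the numbers in this sketch are illustrative examples, not a spec:

```python
# Sanity-check a render node against the rules of thumb above: at least one
# CPU core per GPU (ideally a couple extra), and system RAM at least double
# the total GPU memory. All numbers here are illustrative examples.
num_gpus = 6
vram_per_gpu_gb = 8   # e.g. a 1070-class card (illustrative)

min_cpu_cores = num_gpus + 2                      # one per GPU plus a couple spare
min_system_ram_gb = 2 * num_gpus * vram_per_gpu_gb

print(f"Suggested minimum CPU cores: {min_cpu_cores}")
print(f"Suggested minimum system RAM: {min_system_ram_gb} GB")
```

For a six-card 8GB build, that works out to roughly 8 cores and 96GB of system RAM as a starting point.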

I'm also curious, personally, as to what GPUs you plan to use - and why six of them? :)

Posted on 2019-02-21 20:49:41
jhsu

I have an offer to purchase 6x 1070s for under $300 each. It's such a good deal I want to weigh my options. I already have a Threadripper 1950X with 64GB RAM and a 1070 + 1080 Ti. It runs great as a workstation, so I wanted to see if it were possible to optimize my budget for a GPU-only render rig.

How does this motherboard look:
Asus X99E WS (LGA 2011-3)
https://www.asus.com/Mother...

Paired with an i7-4930K (LGA 2011)
https://ark.intel.com/produ...

The processor has 40 PCIe lanes, and I'm assuming the motherboard uses a PLX chip to enable seven slots @ x16/x8/x8/x8/x8/x8/x8

If I get those used, I can justify springing $800 on 128GB ram. But I don't know if that would be a noticeable hit to render performance.

Posted on 2019-02-21 22:46:52

Hmm, yes - that motherboard does use PLX (if my memory is correct) to split the lanes up along those lines. However, I have a couple of concerns about that:

1) The GTX 1070 is normally a dual-width video card. Do you know which specific manufacturer and model these are? I would worry they won't fit in the way you hope.

2) The 4930K is old enough now that I haven't tested with it in... a long time. I am not sure how much that CPU might hold back the rest of this system.

3) The 1070 is a good card on its own, but GPU rendering tech is starting to move beyond it. For example, the next version of OctaneRender is going to support the new RT cores in the GeForce RTX line of cards, and with that enabled it is looking like a 2-3x performance increase over the previous generation. So pretty soon, two or three RTX 2070 cards would be able to give the same level of performance (roughly speaking) as six GTX 1070s in that engine. I am not sure if or when Redshift will add similar support, but if they do, then being on the older generation of technology might not be as cost-effective as you are hoping.

Have you looked at what the cost would be to, instead, upgrade your Threadripper system to have 2-3 of the newer RTX series video cards?

Posted on 2019-02-21 22:57:38
jhsu

That is a very good point. I was planning on using risers and attaching the cards to a rack. Though adding more cards to my current system faces the same issue: there's just not enough space between the slots, given how thick the gaming cards already installed are. I would have to use risers and remount the cards if I wanted to add more GPUs either way.

Budgeting everything out, the 6x 1070 system would be ~$3k for ~800ob (OctaneBench). The equivalent 4x RTX 2070 would be ~$2200 for ~840ob, but that's just the cards, and I wouldn't be able to add 4 GPUs to my current system, so I would still have to assemble a whole new rig with newer components :/
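For comparison's sake, the cost-per-OctaneBench-point math from those rough figures works out like this (using the approximate numbers quoted above):

```python
# Cost per OctaneBench point for the two proposed builds, using the rough
# figures quoted above (~$3000 / ~800 ob vs ~$2200 / ~840 ob, cards only).
builds = {
    "6x GTX 1070 (whole rig)":  (3000, 800),
    "4x RTX 2070 (cards only)": (2200, 840),
}

for name, (cost_usd, octanebench) in builds.items():
    print(f"{name}: ${cost_usd / octanebench:.2f} per ob point")
```

Even before counting the rest of the new rig, the RTX build comes out meaningfully cheaper per benchmark point.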

I'll have to mull it over. Thanks for the input!

Posted on 2019-02-22 00:06:57

Out of curiosity, what motherboard, chassis, and power supply do you have? Many X399 (Threadripper) boards can support 3-4 GPUs, you may just need to be careful about selecting models that have the proper sort of cooling layout.

Posted on 2019-02-22 16:48:00
jhsu

I have a Threadripper 1950X on a Taichi X399 board, an 850W PSU, and 32GB RAM in a Corsair 300R. I modded the case from an old build, so I'm not partial to keeping it. I would consider having some kind of open-air rig and connecting extra cards with risers.

My motherboard's PCIe slots run x16/x8/x16/x8. Now I'm wondering if it's possible to use something like this to split the ports?
Supermicro RSC-R2UT-2E8R 2-port Riser Card - 2 x PCI Express x8 2U Chasis
https://www.compsource.com/...

Posted on 2019-02-23 23:16:16
jhsu

Ha! I'm really tempted to try an open-air system just to have an excuse to use something like this:
https://amfeltec.com/produc...

Posted on 2019-02-23 23:20:08

I don't have any experience with PCI-E expanders like that, but it might work. Personally, though, with your motherboard I would strongly consider just going with four GPUs stacked together in a nice chassis. You'd need to upgrade from the 300R, and you would also need a bigger power supply, but with those two changes you could switch to four rear-exhausting GeForce RTX series cards. If you can get the GPUs for ~$2200, then I bet you could get a chassis and PSU as well and keep the whole thing under $3000.

Posted on 2019-02-25 17:14:19
jhsu

Yea, I think you're right. Looking further into that splitter, it seems it's only compatible with a limited set of motherboards that support PCIe bifurcation, and it also only runs at PCIe 2.0.

I guess I have to throw in the towel on trying to support more than 4 GPUs on my system. Can't say I didn't try, though :)

I'm sure your time is very valuable, I really appreciate you taking the time to help me figure it out!

Posted on 2019-02-26 05:17:56

You are very welcome! I hope that whatever config you settle on serves you well :)

Posted on 2019-02-26 16:47:28