Read this article at https://www.pugetsystems.com/guides/1181

Redshift 2.6.11 Multi-GPU Performance Scaling

Written on June 18, 2018 by William George


Redshift is a production-quality, GPU-accelerated renderer. Traditionally this type of rendering was done on CPUs, but graphics processors (GPUs) are ideal for highly parallel tasks like this - and it is easier to fit multiple video cards in a single computer, to boost performance, than multiple CPUs.

Speaking of multiple cards, how well does rendering speed scale across multiple GPUs in Redshift? Are there diminishing returns as more cards are added? We are putting Redshift 2.6.11 to the test, looking at scaling from one to four video cards in a single workstation.

Test Setup

To see how increasing the number of video cards in a system affects performance in Redshift, we ran the benchmark included in the demo version of Redshift 2.6.11 with 1, 2, 3, and 4 NVIDIA GeForce GTX 1080 Ti video cards. This benchmark uses all available GPUs to render a single, still image. For animations, there are also methods to assign a different frame to each video card - which may be more efficient in some situations, but is outside the scope of the benchmarking tool Redshift provides.

On the hardware side, we wanted to use a high clock speed processor so that the video cards could really shine. We also needed a platform that would support as many video cards as possible in a large tower workstation. Given that combination of goals, the configuration which made the most sense was Intel's Xeon W - specifically, the W-2125 processor on a Gigabyte MW51-HP0 board. That provided the right PCI-Express slot layout for up to four GPUs, and the Xeon W-2125 runs fast: 4.0GHz base and up to 4.5GHz turbo.

Full details on the hardware configuration we tested are available on the article page linked above.

Benchmark Results

Here are the Redshift 2.6.11 benchmark render times with 1, 2, 3, and 4 GeForce GTX 1080 Ti 11GB graphics cards:

Redshift Benchmark GeForce GTX 1080 Ti GPU Performance Scaling from 1 to 4 Video Cards

To look at it another way, here is how adding video cards increased rendering performance, shown as a percentage of single-card speed:

Redshift Benchmark GeForce GTX 1080 Ti Performance Scaling as Percentage


As demonstrated above, rendering performance in Redshift scales very well as additional video cards are added. It isn't quite perfect, linear scaling - there are some diminishing returns - but it is still more than enough to justify multi-GPU workstations.
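The scaling behavior described above can be sketched numerically. The render times below are illustrative placeholders (not the measured results from the chart), just to show how speedup and per-card efficiency are computed:

```python
# Scaling efficiency when rendering a single still frame across multiple GPUs.
# The times below are hypothetical, for illustration only.
def scaling_stats(times_by_gpu_count):
    """times_by_gpu_count: {num_gpus: render_seconds} -> {num_gpus: (speedup, efficiency)}."""
    base = times_by_gpu_count[1]
    stats = {}
    for n, t in sorted(times_by_gpu_count.items()):
        speedup = base / t        # relative performance vs a single card
        efficiency = speedup / n  # 1.0 would be perfect linear scaling
        stats[n] = (round(speedup, 2), round(efficiency, 2))
    return stats

example = {1: 600.0, 2: 310.0, 3: 212.0, 4: 164.0}  # hypothetical seconds
print(scaling_stats(example))
```

With numbers like these, efficiency drifts down slightly with each added card (0.97, 0.94, 0.91), which is the mild diminishing-returns pattern the chart shows.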


Performance in Redshift scales very well across multiple GPUs - but that statement can lead to incorrect conclusions. Doubling the number of video cards in a system almost doubles performance, but does *not* double the price of the computer. Much of a workstation stays the same even as more video cards are added, so the percentage increase in price for an additional card is usually less than the percentage increase in Redshift performance you will end up getting. When looking at the total price of a system, a few lower-cost cards can often outpace one or even two top-end GPUs - so multiple video cards are the way to go for the best value in Redshift.
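As a rough sketch of that value argument - with entirely made-up prices, and the kind of near-linear speedups discussed above - performance per dollar tends to rise as cards are added to a fixed-cost base system:

```python
# Price-per-performance when adding identical cards to a fixed base system.
# Prices and speedup factors are hypothetical placeholders.
def perf_per_dollar(base_price, card_price, speedup_per_count):
    """Map {num_cards: speedup} -> {num_cards: performance per dollar spent}."""
    return {n: round(s / (base_price + n * card_price), 5)
            for n, s in speedup_per_count.items()}

speedups = {1: 1.0, 2: 1.94, 3: 2.83, 4: 3.66}  # hypothetical scaling factors
print(perf_per_dollar(base_price=2000, card_price=700, speedup_per_count=speedups))
```

Even with sub-linear GPU scaling, the fixed cost of the rest of the workstation means each added card improves the whole-system value in this toy model.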

Redshift Workstations

Puget Systems offers a range of powerful and reliable systems that are tailor-made for your unique workflow.

Configure a System!

Labs Consultation Service

Our Labs team is available to provide in-depth hardware recommendations based on your workflow.

Find Out More!
Tags: Multi, GPU, Scaling, Rendering, Redshift, Benchmark, NVIDIA, GeForce, 1080 Ti, Performance, Intel, Xeon W, Video, Card

Thanks for this bench, Will. I have two questions.
First: Is this scalability maintained independently of the model? Are 2x 1060 ti getting the same percentage per unit?
Second: Is it possible to do multi-GPU rendering with 2+ different models? 1x 1080 Ti + 1x 1070 Ti + 1x 1050, for example? As far as I know, SLI has nothing to do with multi-GPU rendering; is that right? Do you know if there is any downside to doing so?

Well, a little more than 2.

Posted on 2018-06-22 19:28:42

You are correct that SLI has nothing to do with GPU based rendering, along with most other non-gaming GPU applications.

Mixing GPU models should work just fine, but note that you will be limited on scene size and complexity by the lowest amount of RAM on any of the cards used (the memory amounts do not add together or average - the lowest amount limits the system as a whole).
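That constraint is simple but easy to overlook, so here is a one-line sketch of it (card memory sizes are hypothetical examples):

```python
# Effective working VRAM in a mixed-GPU Redshift setup: the smallest card caps
# scene size for all GPUs - memory does not pool or average across cards.
def effective_vram(vram_per_card_gb):
    return min(vram_per_card_gb)

# e.g. a 1080 Ti (11GB) + 1070 Ti (8GB) + 1050 (2GB) mix
print(effective_vram([11, 8, 2]))
```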

We've actually gotten that question a surprising number of times, though, so I might go ahead and do some testing on a mixed-card setup. I don't have a 1050 handy, but I could probably put together a 1080 Ti + 1070 Ti + 1060 or something along those lines.

And as for scaling, I don't have access to a full four of any other GPU models right now... but based on what I've seen in other GPU renderers I would expect scaling to be roughly the same (in terms of % increase with each card added) no matter the model of card used.

Posted on 2018-06-22 19:34:10

Okay, the 1080 Ti cards were in use - but here is what I got with a 1070 Ti + 1060: 546 seconds.

If it had been two 1070 Ti cards, I would have expected around 463 seconds based on testing we've done here ( http://puget.systems/go/148263 ) combined with the scaling seen in the article above. For two 1060s, I would have expected around 683 seconds. The result above from combining the two falls between those, so I'd say that mixing GPUs - at least within the same generation - works as expected.
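One way to sanity-check a mixed-GPU result like this is a simple rate-additive model: each card contributes its single-card rendering rate. The single-card times below are back-of-envelope estimates derived from the two-card figures quoted above, assuming near-perfect scaling - not measured values:

```python
# Rate-additive estimate of a mixed-GPU render time: total rate is the sum of
# each card's single-card rate, so the combined time is the harmonic combination.
def mixed_render_time(single_card_times):
    total_rate = sum(1.0 / t for t in single_card_times)
    return 1.0 / total_rate

# ~2x463s for a lone 1070 Ti, ~2x683s for a lone 1060 (rough estimates)
est = mixed_render_time([926.0, 1366.0])
print(round(est))
```

That lands within a few percent of the 546-second result above, which suggests the cards really are each contributing their full individual speed.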

Posted on 2018-06-22 20:19:11

Thanks, Will. Being limited by the lowest amount of RAM is a big deal.

Yeah, seems like mixing two different models works as expected. By the way, can I have two different models, using one (e.g. 1070ti) for display/viewport and the second one (e.g. 1080ti) exclusively to render, without it being limited by the lowest memory? Would this setup be interesting to make faster renders?

Posted on 2018-06-23 16:02:28

There wouldn't be any performance benefit to leaving one of the cards for display only - it would be just like rendering on a system with one fewer video card. It *might* be beneficial if you wanted to keep working in other 3D-heavy applications while the rendering was taking place, I suppose.

Posted on 2018-06-25 16:08:23

Yeah, that's the idea. But maybe it's better to just stick with 2 gpus.

Posted on 2018-06-25 16:11:40

Is there a link anywhere to testing with multiple GPUs and CPUs? I wonder what the ramifications are of:
1. GPUs on one CPU, storage on the other (PCIe M.2, ioFX, etc.)
2. 1 GPU per processor, PCIe storage RAIDed across processors
3. GPU on each processor, storage on a separate M.2 lane (not sure how that impacts lane communication)

I have an Asus Z9 PE-D8, which actually has an evil secret and the main GPU slot is PCIe-2 only, so I can't put both GPUs on one processor like I wanted. It got me thinking about the best set up for multiple GPUs. Maybe it doesn't make a difference...

Posted on 2018-12-09 10:04:16

Because Redshift is a GPU based renderer, we haven't tested it much on dual-CPU systems. However, from my recent work I believe Redshift generally does better with a high clock speed CPU - and dual processor systems don't generally offer the highest clock speeds, so I don't think that would be an ideal platform unless you have need for a lot of CPU cores in other programs.

You mentioned that your main GPU is on a PCI-E gen 2 slot, but that shouldn't be a big deal. We did testing with PCI-E 3.0 vs 2.0, as well as lane width scaling, and found there was almost no difference in Redshift between 2.0 and 3.0. As you lose slot bandwidth there is a little bit of performance degradation, but it isn't bad at x8 (vs the normal x16) and even x4 is decent... just avoid using x1 slots, as they are far, far slower.


I also don't think the location of the storage would matter much, at least not with the benchmarking we are doing.

Posted on 2018-12-10 17:14:43

I'm thinking about putting together a 6 GPU render rig for Redshift. Seeing your test results for FurryBall, I noticed the CPU has little effect on performance.

I'm wondering if it's the same with Redshift, and if so, would I be able to run 6 gpu's at 8x pcie speed on an older X99 + i7-4930K system with good performance, or should I be looking at something more up to date, like an X299 Sage + i7-9800X?

Posted on 2019-02-21 20:11:07

Older platforms (chipset + CPU combinations) will almost always have fewer PCI-Express lanes than newer ones, and if you go back far enough may have slower ones as well (PCI-E gen 2 vs 3). And then of course you have older CPUs, which are generally lower in clock speed and with fewer cores. That can have an impact on GPU rendering when there isn't enough CPU power to keep up with what the video cards are doing, but it varies from one render engine to the next.

Six GPUs is also tricky, since it means either a non-standard size motherboard or single-slot video cards... both of which have disadvantages. It is hard to make a specific recommendation without knowing more about your exact situation, budget, and goals - but in general, I would stick with a modern CPU and motherboard if at all possible. I would also recommend at minimum one CPU core per video card (ideally a couple extra) and system RAM that is at least double the total amount of GPU memory across all the video cards you are using.
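Those rules of thumb are easy to check against a parts list. A quick sketch (the numbers below describe a hypothetical six-GPU build, not a recommendation):

```python
# Sanity check for the rules of thumb above: at least one CPU core per GPU
# (a couple spare is better) and system RAM >= 2x total GPU memory.
def check_build(cpu_cores, system_ram_gb, gpu_vram_gb):
    """gpu_vram_gb: list of per-card VRAM sizes in GB."""
    total_vram = sum(gpu_vram_gb)
    return {
        "enough_cores": cpu_cores >= len(gpu_vram_gb),
        "enough_ram": system_ram_gb >= 2 * total_vram,
    }

# Hypothetical rig: 8 cores, 128GB RAM, six 8GB cards (48GB VRAM total)
print(check_build(8, 128, [8] * 6))
```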

I'm also curious, personally, as to what GPUs you plan to use - and why six of them? :)

Posted on 2019-02-21 20:49:41

I have an offer to purchase 6x 1070's for under $300 each. It's such a good deal I want to weigh my options. I already have a Threadripper 1950x with 64GB ram and 1070+1080ti. It runs great as a workstation so I wanted to see if it were possible to optimize my budget for a gpu only render rig.

How does this motherboard look:
Asus X99E WS (LGA 2011-3)

Paired with an i7-4930K (LGA 2011)

The processor has 40 PCIe lanes, and I'm assuming the motherboard uses a PLX chip to enable seven slots at x16/x8/x8/x8/x8/x8/x8.

If I get those used, I can justify springing $800 on 128GB ram. But I don't know if that would be a noticeable hit to render performance.

Posted on 2019-02-21 22:46:52

Hmm, yes - that motherboard does use PLX (if my memory is correct) to split the lanes up along those lines. However, I have a couple of concerns about that:

1) The GTX 1070 is normally a dual-width video card. Do you know which specific manufacturer and model these are? I would worry they won't fit in the way you hope.

2) The 4930K is old enough now that I haven't tested with it in... a long time. I am not sure how much that CPU might hold back the rest of this system.

3) The 1070 is a good card on its own, but GPU rendering tech is starting to move beyond it. For example, the next version of OctaneRender is going to support the new RT cores in the GeForce RTX line of cards, and with that enabled it is looking like a 2-3x performance increase over the previous generation. So pretty soon, two or three RTX 2070 cards would be able to give roughly the same level of performance as six GTX 1070s in that engine. I am not sure if or when Redshift will add similar support, but if they do, then being on the older generation of technology might not be as cost-effective as you are hoping.

Have you looked at what the cost would be to, instead, upgrade your Threadripper system to have 2-3 of the newer RTX series video cards?

Posted on 2019-02-21 22:57:38

That is a very good point. I was planning on using risers and attaching the cards to a rack. Though adding more cards to my current system faces the same issue: there's just not enough space between the slots, given how thick the gaming cards already installed are. I would have to use risers and remount the cards if I wanted to add more GPUs either way.

Budgeting everything out, the 6x 1070 system would be ~$3k for ~800 OB (OctaneBench points). The equivalent 4x RTX 2070 would be ~$2200 for ~840 OB, but that's just the cards - and I wouldn't be able to add 4 GPUs to my current system, so I would still have to assemble a whole new rig with newer components :/

I'll have to mull it over. Thanks for the input!

Posted on 2019-02-22 00:06:57

Out of curiosity, what motherboard, chassis, and power supply do you have? Many X399 (Threadripper) boards can support 3-4 GPUs, you may just need to be careful about selecting models that have the proper sort of cooling layout.

Posted on 2019-02-22 16:48:00

I have a Threadripper 1950x on a Taichi x399 board, 850W psu, 32GB RAM in a Corsair 300R. I modded the case from an old build so I'm not partial to keeping it. I would consider having some kind of open air rig and connect extra cards with risers.

My motherboard's PCIe slots run x16/x8/x16/x8. Now I'm wondering if it's possible to use something like this to split the ports?
Supermicro RSC-R2UT-2E8R 2-port riser card - 2x PCI Express x8, 2U chassis

Posted on 2019-02-23 23:16:16

Ha! I'm really tempted to try an open-air system just to have an excuse to use something like this.

Posted on 2019-02-23 23:20:08

I don't have any experience with PCI-E expanders like that, but it might work. Personally, though, with your motherboard I would strongly consider just going with four GPUs stacked together in a nice chassis. You'd need to upgrade from the 300R, and you would also need a bigger power supply, but with those two changes you could switch to four rear-exhausting GeForce RTX series cards. If you can get the GPUs for ~$2200, then I bet you could get a chassis and PSU as well and keep the whole thing under $3000.

Posted on 2019-02-25 17:14:19

Yeah, I think you're right. Looking further into that splitter, it seems it's only compatible with a limited set of motherboards that support PCIe bifurcation, and it also only runs at PCIe 2.0.

I guess I have to throw in the towel with trying to support more than 4 gpus on my system, can't say I didn't try though :)

I'm sure your time is very valuable, I really appreciate you taking the time to help me figure it out!

Posted on 2019-02-26 05:17:56

You are very welcome! I hope that whatever config you settle on serves you well :)

Posted on 2019-02-26 16:47:28

Old thread, but worth asking, I've got a 1070 and a 980ti - practically the same specs other than the 980ti being heavier in power consumption. Is that a setup that could cause any problems?

Posted on 2019-08-06 19:52:11

We actually look at exactly that in another article.


Long story short: mixing GPUs from different generations seems to work just fine in Redshift :)

Posted on 2019-08-06 21:42:54

I've got a new 2080TI and my old GTX 970... but will the GTX 970 just bottleneck the 2080TI? I mean the whole point of the upgrade was due to Redshift crashing with the GTX970 (I believe due to a shortage of vram)...

Posted on 2019-08-08 04:56:24

If the 4GB on that GTX 970 isn't enough for your scenes, you will want to leave it out to avoid similar problems. You could still include it when rendering smaller stuff, where VRAM isn't a problem... or you could later upgrade it to a second RTX series card :)

Posted on 2019-08-08 05:03:06

Great point, thank you for getting back to me!

Posted on 2019-08-12 07:44:40
tim unsworth

Currently rendering out an animation and was looking at speeding up the process / frame times and took a look at the task manager to see what's going on. I'm not that technically minded when it comes to hardware so was wondering how best to do this? Is there a way to manage the way each component uses its memory?
As you can see by the screen grab, I'm running an RTX 2080 and 16GB RAM. I'm considering buying another RTX 2080 and 32GB RAM, but want to get the current system working at its best first. The RTX is generally hovering around 5-15% - clearly it can do more. Is there any way to bring this percentage up to make the most of the RTX and get faster render times?
I'm rendering out C4D scenes (R20) through Redshift.
Would also welcome best settings in c4d/redshift to maximise how it uses the render / gpu hardware.
Many thanks.


Posted on 2019-10-09 22:00:39

The default GPU usage graph in Windows 10's Task Manager shows 3D performance - the kind of load video games generate - but Redshift and other GPU rendering applications don't use that engine. Instead, to get a better feel for usage, go to:

Task Manager
Performance tab
Click on the GPU on the left side (GPU 1, in your case)
Go to the drop-down where it says "3D" at the top of the graph on the right hand side, and select CUDA instead

That should give you a better view of the type of GPU activity Redshift generates. It still may not max out at 100% the whole time while rendering, but hopefully it helps. I believe you can also see GPU memory usage in that view.
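If you prefer a command-line view, the nvidia-smi utility that ships with the NVIDIA driver can poll compute utilization and memory usage directly (assuming a standard driver install, where nvidia-smi is on the PATH):

```shell
# Poll GPU compute utilization and memory once per second via nvidia-smi.
# Press Ctrl+C to stop.
nvidia-smi --query-gpu=index,utilization.gpu,memory.used,memory.total --format=csv -l 1
```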

Posted on 2019-10-09 22:07:50
tim unsworth

You learn something new every day. Thanks William, I'll take a look.

Posted on 2019-10-10 05:56:54

I'm having some problems with a multi-GPU build. I have an ASUS X299 SAGE motherboard with an i7-9800X and 64GB RAM (about to double it), running 2x 8GB RTX 2070s (originally 4). The renders are running very slow - almost double the time compared to an older ASUS P9X79 Pro with an i7-3930K and 64GB RAM running 2 of the RTX 2070 8GB cards.
Everything is running the same version of software and swapping the GPU's makes no change.

Any ideas on what could be causing this?

Posted on 2020-03-10 00:09:54

That seems really strange! You've got plenty of RAM, and that CPU shouldn't be holding anything back (especially compared to the older 3930k). Maybe something with the motherboard? Have you tried switching around which slots the video cards are in, or checking (either within Windows or in the BIOS settings) to see if something is amiss with PCI-E lane speeds? What about GPU driver versions?

Oh - have you tried any other software for comparison? Maybe run OctaneBench as an alternative to Redshift, to see if it shows the same lower performance.

Posted on 2020-03-13 18:08:03

Yeah, it's got me baffled... I've switched around which slots I have the cards in and swapped cards with another computer, updated the BIOS, checked the PCIe config, and updated the drivers to the latest stable version. The big PC is running Windows 10, the other is running 7, but I don't see how that would make a difference...
I'll install and test OctaneBench and see how it performs

Posted on 2020-03-16 01:18:08

I ran OctaneBench and the problem computer got a total score of 440.15, while the other computer got 440.41 - so it looks to be running about average for 2x RTX 2070 Supers.

Posted on 2020-03-16 23:26:07

Okay, if the GPUs are scoring the same in both systems, then the problem likely lies outside the hardware. I know that isn't super helpful, but maybe something is configured differently in software (Redshift, etc.)? I'm not great at troubleshooting issues like this without being physically present at the system :/

Posted on 2020-03-17 05:35:25