Read this article at https://www.pugetsystems.com/guides/1121

DaVinci Resolve 14 GPU Scaling: Core i9 vs Xeon W vs Dual Xeon SP

Written on March 6, 2018 by Matt Bach


Blackmagic's DaVinci Resolve is known for how well it utilizes multiple GPUs to improve performance, but our previous testing found that the scaling was not nearly as dramatic as many claim. However, since that was some of our first in-depth testing of DaVinci Resolve, there is quite a bit we want to expand on that may affect our results.

First, we have recently revamped our DaVinci Resolve testing process to be more in line with realistic workloads. Not only did we add things like OpenFX, we also dramatically increased the number of codecs tested, adding not only ProRes 4444 but also CinemaDNG, ARRIRAW, and different RED compression levels. In addition, we opted to test the RAW footage not only at "Full Res." decode quality but at "Half Res." as well, in case that alters performance.

Second, since our previous GPU scaling testing, NVIDIA has released the Titan V 12GB GPU, which showed some terrific performance gains in DaVinci Resolve when we compared it to a range of GeForce cards a month ago. However, due to the high cost of that card, we will also be including the GTX 1080 Ti, which gives absolutely terrific performance for its cost.

Lastly, we will be looking at three different CPUs and their associated platforms: Core i9, Xeon W, and Dual Xeon SP. This provides a range not only of raw CPU power but also of PCI-E configurations.

Test Hardware & Methodology

To see how DaVinci Resolve scales with multiple GPUs across various platforms, we opted to test 1-4 GTX 1080 Ti and Titan V GPUs with Core i9, Xeon W, and Dual Xeon SP platforms:

All of our test footage is downloaded or transcoded from media that is publicly available. This was done so that anyone can repeat our testing in order to both verify our findings and to see how their current computer stacks up to the latest hardware available. To test each type of footage, we used three different "levels" of grading. The lowest level is simply a basic correction using the color wheels plus 4 Power Window nodes that include motion tracking. The next level up is the same adjustments but with the addition of 3 OpenFX nodes: Lens Flare, Tilt-Shift Blur, and Sharpen. The final level has all of the previous nodes plus one TNR node.

Performance was measured in the Color tab using the built-in FPS counter. After playback was started, we waited 15 seconds for the FPS to stabilize then recorded the lowest FPS number over the next 15 seconds. This method allowed us to achieve highly consistent and replicable results.
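The measurement rule can be sketched in a few lines of Python; the FPS samples below are hypothetical, purely to illustrate the 15-second stabilization period followed by a 15-second minimum-FPS window:

```python
def measure_min_fps(fps_samples, sample_rate_hz=1, stabilize_s=15, window_s=15):
    """Skip the stabilization period, then return the lowest FPS reading
    observed during the measurement window."""
    skip = stabilize_s * sample_rate_hz
    take = window_s * sample_rate_hz
    return min(fps_samples[skip:skip + take])

# Hypothetical readings, one per second: playback ramps up, then holds near 24 FPS.
samples = [10, 15, 20] + [24] * 12 + [24, 23.9, 24, 24, 23.8] + [24] * 10
print(measure_min_fps(samples))  # -> 23.8
```

Recording the minimum (rather than the mean) over the window is what makes the metric sensitive to momentary playback stutters.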

For all the RAW footage we tested (CinemaDNG, ARRIRAW, and RED), we not only tested with the RAW decode quality set to "Full Res." but also at "Half Res." ("Half Res. Good" for the RED footage). Full resolution decoding should show the largest performance delta between the different GPUs, but we also want to see what kind of FPS increase you might see by running at a lower decode resolution with different CPU and GPU combinations.

The footage used in our testing is shown below with links to where you can download it yourself:

Codec | Resolution | FPS | Camera | Clip Name | Source
ProRes 422 HQ | 3840x2160 | 24 fps | Ursa Mini 4K | City Train Station | Blackmagic Design Production Camera 4K Update
ProRes 4444 | 3840x2160 | 59.94 fps | Canon C200 | Untitled00024199 | 4K Shooters Canon C200 Raw Footage Workflow
CinemaDNG | 4608x2592 | 24 fps | Ursa Mini 4K | Interior Office | Blackmagic Design [Direct Download]
ARRIRAW | 6560x3100 | 23.976 fps | ALEXA 65 | A003C025 (Open Gate spherical) | ALEXA Sample Footage
RED | 3840x2160 | 23.976 fps | EPIC DRAGON | A016_C001_02073O_001 | RED Sample R3D Files
RED | 4096x2304 | 29.97 fps | RED ONE MYSTERIUM | A004_C186_011278_001 | RED Sample R3D Files
RED | 6144x3160 | 23.976 fps | EPIC DRAGON | A007_C115_07181B_001 | RED Sample R3D Files
RED | 6144x3077 | 23.976 fps | WEAPON 6K | S005_L001_0220LI_001 | RED Sample R3D Files
RED | 8192x4096 | 23.976 fps | WEAPON 8K S35 | S002_C074_02065Z_001 | RED Sample R3D Files
RED | 8192x4320 | 25 fps | WEAPON 8K S35 | B001_C096_0902AP_001 | RED Sample R3D Files
RED | 8192x4320 | 23.976 fps | EPIC-W 8K S35 | S002_C074_02065Z_001 | RED Sample R3D Files
(transcoded) | 3840x2160 | 29.97 fps | Transcoded from RED A004_C186_011278_001
(transcoded) | 6144x3160 | 23.976 fps | Transcoded from RED A007_C115_07181B_001
(transcoded) | 8192x4320 | 25 fps | Transcoded from RED B001_C096_0902AP_001

While this is by no means every codec available, we do feel it covers a wide range of footage that many users work with on a daily basis. In the future we may cut down on the number of RED clips and replace them with something like XAVC-S or AVCHD, but for now we really wanted to see how the different compression levels impact performance.

4K Media - Live Playback FPS (RAW DATA)

[Charts: live playback FPS for 4K ProRes 422 HQ, 4K ProRes 4444, 4K RED 11:1 (Full Res. and Half Res.), 4K RED 7:1 (Full Res. and Half Res.), and 4K CinemaDNG (Full Res. and Half Res.)]

4K Media - Live Playback FPS (Analysis)

Since our 4K testing alone contains over 600 data points across six different codecs, it can be difficult to pull meaningful conclusions from the data. If you tend to use just one of the codecs we tested, we highly recommend looking at just that data. For a more general take on GPU scaling in DaVinci Resolve on each platform, however, we decided to average the results from each type of media.
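The averaging itself is straightforward; as an illustration only (the numbers below are made up, not our measured results), each GPU count is averaged across the tested codecs:

```python
# Hypothetical minimum-FPS results for one platform, per codec and GPU count.
results = {
    "ProRes 422 HQ": {1: 24.0, 2: 24.0, 3: 24.0, 4: 24.0},
    "RED 7:1":       {1: 18.5, 2: 23.0, 3: 24.0, 4: 24.0},
    "CinemaDNG":     {1: 20.0, 2: 24.0, 3: 24.0, 4: 24.0},
}

def average_by_gpu_count(results):
    """Average the per-codec FPS for each GPU count, rounded for display."""
    gpu_counts = sorted(next(iter(results.values())))
    return {n: round(sum(codec[n] for codec in results.values()) / len(results), 2)
            for n in gpu_counts}

print(average_by_gpu_count(results))  # -> {1: 20.83, 2: 23.67, 3: 24.0, 4: 24.0}
```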

Starting with relatively simple color grading using the color wheels and 4 Power Windows, there is actually very little to talk about. With this level of grading, we were simply running at full FPS (or very near it) with nearly every single GPU and CPU combination we tested. If anything, the only thing to point out is that the Dual Xeon Gold 6148 system under-performed slightly, but this was entirely due to the ProRes 4444 test.

Adding 3 OpenFX effects, we start to see a bit of a difference with more GPUs - although interestingly the CPU itself made very little impact on performance. On the GPU side, we saw a decent performance gain going from one GTX 1080 Ti to two, but minimal gains adding a third and fourth GPU. With the Titan V, there was a very small increase in performance going from one GPU up to two, three, and four GPUs, but the difference was only around 2 FPS in total.

One thing we want to point out is that most of our test media is 24-25 FPS and - with the exception of ProRes 422 HQ - we were able to achieve full playback FPS with just two GTX 1080 Ti GPUs or a single Titan V. The one test that has a higher framerate (ProRes 4444 at 59.94 FPS) actually saw pretty decent scaling all the way up to 4 GPUs. So it isn't really that Resolve doesn't scale, but rather that more than two GPUs is not necessary to achieve 24-25 FPS with this level of grading.

Adding TNR, we start to really see some great GPU scaling since we are not hitting full playback FPS nearly as often. Once again, the CPU itself made very little difference, but in this test we saw decent gains with two and three GTX 1080 Ti cards and even a few more FPS with a fourth card. With the Titan V, however, we did hit a bit of a wall after three GPUs since that was often what was necessary to give full playback FPS. A fourth Titan V was useful in some isolated cases but for most users who work with 4K footage it is likely overkill.

6K Media - Live Playback FPS (RAW DATA)

[Charts: live playback FPS for 6K RED 12:1, 6K RED 7:1, and 6K ARRIRAW, each at Full Res. and Half Res. decode quality]

6K Media - Live Playback FPS (Analysis)

Our 6K testing is not quite as extensive as our 4K testing, but it still contains over 460 data points across four different codecs, which can make it difficult to pull meaningful conclusions. If you tend to use just one of the codecs we tested, we highly recommend looking at just that data, but for a more general take on GPU scaling in DaVinci Resolve on each platform we again decided to average the results from each type of media.

Starting with relatively simple color grading using the color wheels and 4 Power Windows, there is not much to discuss. With this level of grading, we simply were running at full FPS (or very near to it) with every single GPU and CPU combination we tested. The only exception was with 6K RED 7:1 media with "Full Res." decode quality where the Xeon W-2175 oddly saw significantly lower performance than the other two CPUs. We are not sure why this is, but we confirmed the result multiple times and for whatever reason, that CPU simply doesn't perform well with that exact codec, compression, and resolution.

Adding 3 OpenFX effects, we start to see a bit of a difference but interestingly the scaling appears to be worse than what we saw with our 4K test media. Once again, the CPU itself made very little impact on performance except with 6K RED 7:1 where the Xeon W-2175 gave lower than expected results. On the GPU side, we saw a decent performance gain going from one GTX 1080 Ti to two, but almost nothing when adding a third or fourth GPU. With the Titan V, the difference was even less as we saw virtually no benefit from using more than a single GPU.

Just like with the 4K results, this doesn't mean that Resolve doesn't scale well, but rather that we are hitting full FPS with just two GTX 1080 Ti GPUs or a single Titan V. Unlike the 4K testing, however, all of our media is 23.976 FPS so there is no higher framerate footage that might show a larger benefit from having more GPU power.

Adding TNR, we see improved GPU scaling up to three cards, but oddly we saw an overall drop in performance when we added a fourth card. This is a very unexpected result, but was remarkably consistent when using ARRIRAW at either decode quality or RED footage with "Full Res." decode quality.

Honestly, we have no idea why this is happening. At first, we thought it may be due to the PEX chip on the Xeon W system that is used to divide 16 PCIe lanes between the third and fourth GPU (since that CPU doesn't have enough lanes to run all four GPUs at full x16), but the dual Xeon system runs all four GPUs at x16 speeds and saw the exact same performance drop. We also thought it may be from a CPU bottleneck, but we didn't see any significant difference between the two single CPU setups (which should be roughly the same in terms of performance) and the Dual Xeon setup which has much more raw CPU horsepower.

8K Media - Live Playback FPS (RAW DATA)

[Charts: live playback FPS for 8K RED 12:1, 8K RED 9:1, and 8K RED 7:1, each at Full Res. and Half Res. decode quality]

8K Media - Live Playback FPS (Analysis)

Once again, if you tend to use just one of the codecs we tested we highly recommend looking at just that data. However, for a more general take on GPU scaling in DaVinci Resolve on each platform we again decided to average the results from each type of media.

Starting with relatively simple color grading, it may appear that there isn't much to discuss, but there is actually some very important data that isn't displayed well in the averaged chart above. While we hit full FPS with every CPU and GPU combination when using DNxHR HQ or any of the RED footage at "Half Res." decode quality, with "Full Res." (especially at 9:1 and 12:1) the results were... odd. With these, we saw large differences in performance between each CPU and a very consistent drop in performance with more than a single GPU. Unfortunately, there wasn't really a pattern to it. With 8K RED 12:1, the Core i9 7960X performed much better than the other two CPU platforms. However, with 8K RED 9:1 the Dual Xeon 6148 was on top with a single GPU but saw a significant drop in performance as we added more GPUs, to the point that it was worse than the other CPUs by the time we got to four GPUs.

Adding 3 OpenFX effects, the results are again a bit odd even though the averaged chart above doesn't really show it very well. In most cases, there was a benefit to using two GTX 1080 Ti GPUs, but we didn't see much with a third or fourth card. Similarly, the Titan V did great as a single GPU, but there was almost no performance increase from using multiple cards.

Once again, the RED 9:1 and 12:1 with "Full Res." decode was where things got weird. With 8K RED 12:1, the Core i9 7960X was again the best performing CPU and we even saw a performance gain with two GTX 1080 Ti or two Titan V GPUs. However, with 8K RED 9:1 the results were all over the place. The Dual Xeon Gold 6148 in particular was very unexpected. With that CPU, we saw an overall great performance gain going from one to two GPUs, then a moderate drop in performance with three GPUs, followed by a significant drop in performance when a fourth GPU was added.

With TNR added, the results get a bit cleaner, but still not quite what we expected. Once again, with "Full Res." decode quality on the RED footage we saw at best minimal gains with multiple GPUs and often drops in performance as we added cards. If you stick to "Half Res." or use non-RAW media, however, we mostly saw pretty decent gains with up to three GPUs, although there was rarely a benefit to having a fourth card.


We were hoping that our results would end up being relatively straightforward, but unfortunately, reality has a knack for complicating things. However, after deeply analyzing all 1,500+ data points, there are several interesting conclusions we can draw:

1: The CPU/platform makes very little difference

This will of course not hold true if you really skimp on the CPU, but when using high-end models there was surprisingly little difference in terms of playback performance. We did have some odd results here and there (especially with 8K footage), but overall we saw minimal difference between the Core i9 7960X, Xeon W-2175, and the Dual Xeon Gold 6148 CPUs. The Core i9 and Xeon W CPUs should be roughly equal in terms of raw performance, but even with the Dual Xeon (which has much higher raw CPU performance and more PCI-E lanes) we only saw on average a 1-2 FPS benefit at most. Considering the much higher cost of those CPUs, simply using more or higher-end GPUs is likely to be a more effective way to improve performance for most users.

Different CPUs and motherboards may limit the number of GPUs you can use in your system, however, which is an important consideration.

2: RED footage at "Full Res." decode quality is... weird

This was not as much of an issue with 4K RED footage, but with 6K and especially 8K RED footage trying to use "Full Res." decode quality resulted in very odd results. Not only did we simply see lower playback FPS compared to using "Half Res." but in many cases using "Full Res." decode resulted in a performance drop when we increased the number of GPUs. It may be that we are hitting a CPU or storage bottleneck, but given the fact that we saw the same thing with the Dual Xeon CPUs and are using a very fast storage drive (3,500 MB/s read) we think this is more of an issue with DaVinci Resolve itself.

3: More GPUs is NOT always faster

With just basic color grading and 4 Power Windows, even a single GTX 1080 Ti was able to give us full playback FPS in almost every case so there is simply no need for multiple GPUs. Adding OpenFX definitely put more load on the GPU(s) which allowed us to see a benefit from up to two GTX 1080 Ti GPUs, although there was still little benefit to having more than a single Titan V.

Adding TNR was really where we started to see the benefit of multiple GPUs. Discounting some of the weird results with RED footage at "Full Res." decode quality, we saw decent scaling with up to three GTX 1080 Ti GPUs and in some cases even saw a benefit with a fourth GTX 1080 Ti. With the Titan V, we saw the biggest benefit going from one to two GPUs but there was still some benefit to having a third Titan V. Adding a fourth card, however, rarely improved performance even though we were not hitting full playback FPS.

[Chart: DaVinci Resolve 14 GPU Scaling performance benchmark]

So what would we recommend to someone looking for a high-end DaVinci Resolve workstation? For the average professional color grader, the platform (Core i9, Xeon W, Xeon SP) really shouldn't make much of a difference. Because of that, we would recommend using a Core i9 CPU for two reasons. First, it is lower cost than comparable Xeon CPUs, which leaves more of your budget open for GPU performance. Second, it is much more common, which means there should be less in the way of software/hardware bugs or other issues. On the GPU side, we would recommend either a pair of GTX 1080 Ti GPUs or a single Titan V. Two GTX 1080 Tis should be less expensive, but they are a more complicated setup, which makes them more prone to odd performance scaling issues like what we saw with some RED footage at "Full Res." decode quality.

For a best-of-the-best DaVinci Resolve Workstation, we would go with two or maybe three Titan V GPUs. Even just two Titan V GPUs is slightly faster than four GTX 1080 Ti GPUs and even though the Titan V cards should be a bit more expensive, having just two GPUs opens the door for smaller form factor systems or things like multiple Blackmagic Decklink or RAID PCI-E cards. Again, the platform shouldn't make much of a difference here, although if you do opt to use three Titan V cards you could see a small performance gain with either a Xeon W or Dual Xeon SP setup.

DaVinci Resolve Workstations

Puget Systems offers a range of powerful and reliable systems that are tailor-made for your unique workflow.

Configure a System!

Labs Consultation Service

Our Labs team is available to provide in-depth hardware recommendations based on your workflow.

Find Out More!
Tags: DaVinci Resolve, NVIDIA, Titan V, GTX 1080 Ti, Core i9, Xeon W, Dual Xeon SP
Håkon Broder Lund

Interesting read! Best I've ever seen in terms of Resolve performance. Thank you for the deep work

Posted on 2018-03-06 21:37:41

It would be helpful in future tests if you include CPU/GPU load task manager grabs during the test. Red Full Res premium decode weirdness is because it is not using enough threads/cpu as it should in Resolve 14.3/SDK 7.06 - if you try Resolve 14.01/SDK 6.3 it uses all available cpu and no weirdness. See BM forum thread: https://goo.gl/E6G8ux

Posted on 2018-03-12 23:27:34

Good to see that others have the same issue we had, thanks for linking to that thread! As for load graphs in our articles, that is something we've done in the past and use every once in a while, but honestly we have found them to be limited in their usefulness. For some things they can help point out the cause of an issue, but in other cases they actually tend to be misleading and point you in the entirely wrong direction. Especially in an application like Resolve that uses a heavy mix of both the CPU and GPU, you might see low CPU load, but all that really means is that there is a bottleneck or issue somewhere else that is completely unrelated to the CPU. We do use them internally to verify/investigate issues, but in articles like this that already have an almost overwhelming amount of data, we tend not to publish them simply because they can easily be misread.

Posted on 2018-03-13 17:25:41
Mark Fry

So where are we with Resolve 15? One would presume everything has sped up.

Posted on 2018-08-20 18:48:43

I just got back from vacation today, but Resolve 15 is definitely pretty high on my to-do list. I have a few other projects I need to get done first, then that is next up - hopefully I can get testing done in the next few weeks. It will probably just be color grading performance in this round, but in the future I would love to include some benchmarks for Fusion as well now that it is fully integrated. Super interested to see how performance changes. They specifically called out performance improvements for H.264 media (something Resolve has historically had issues with), so that in particular I am really interested to see. "Improved CUDA performance for high resolution clips and timelines" will also be interesting to see if it makes an impact for 6K/8K.

Posted on 2018-08-20 20:26:59

You can also add tests with the new NVIDIA Quadro RTX and RTX 2080Ti cards; o)

Posted on 2018-08-21 12:10:05
Mark Fry

Sounds good Matt. The 8700 overclocked to 5GHz does work amazingly well with a 1080 Ti - I am quite surprised what you can do with this combo. Resolve in this 15th iteration is becoming the de facto standard on PC.
On a side note, one can only picture what real-time simulations will be created with the new hardware just announced. I keep thinking of Avatar, like everyone else I presume.

Posted on 2018-08-21 21:10:26

The main issue with the 8700K is that it only has 16 PCI-E lanes to use for GPUs and Decklink cards. That is totally fine for even moderately difficult grading since it can easily support dual graphics cards, but if you need more it just can't handle it. It surprised me a bit when I looked it up, but the majority of our customers actually want something the 8700K isn't suited for, like 3-4 GPUs or 2 GPUs plus a Decklink card.

That is really the primary reason why we typically recommend one of the X-series or Xeon-based CPUs - they have plenty of PCI-E lanes to use for whatever our customers need, in addition to the extra power that can be relevant in many situations. If Resolve usage really picks up in the hobbyist/YouTube/blog/etc. communities, I could completely see us offering a Resolve workstation based around the 8700K (or whatever "consumer" CPU we have at that point). At the moment, however, the demand we see is simply too low to warrant it being something we offer outside of custom configurations.

Although now that I think about it, I wonder how much of Fusion is multi-threaded. If it is anything like After Effects, it is going to perform much better on a CPU like the 8700K. Guess I'll have to add that to the to-do list!
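The lane budgeting mentioned above can be sketched as a quick back-of-the-envelope check. This is a simplification (real motherboards can split or switch lanes, for instance via a PEX chip), so treat the numbers as illustrative:

```python
def lanes_fit(cpu_gpu_lanes, device_widths):
    """First-approximation check: do the requested PCI-E link widths fit
    within the CPU's GPU-facing lane budget? Real boards can divide lanes
    further (e.g. via a PEX switch), so this is only a rough guide."""
    return sum(device_widths) <= cpu_gpu_lanes

print(lanes_fit(16, [8, 8]))     # two GPUs at x8 on a 16-lane CPU -> True
print(lanes_fit(16, [8, 8, 4]))  # add a Decklink-style x4 card -> False
```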

Posted on 2018-08-21 22:29:50

Can you please also check the new NVLink function on a setup with 2 RTX 2080 Ti? Would be awesome if it can double the performance as it can in some games like Sniper...

Posted on 2018-09-24 02:04:00

The NV-Link on the RTX cards isn't true NV-Link like you would find on the Quadro GV100, it is actually just SLI using the NV-Link connection. SLI isn't used by Resolve or most other similar applications - it generally is reserved for real-time visualization applications like video games or similar programs. So using it in Resolve won't give any higher performance, and in our experience having it enabled is actually more likely to cause stability/performance issues.

Posted on 2018-09-28 16:16:06

This seems like very poor performance. Is there any performance difference using Quadro cards, given their dedicated pro-catered drivers? It just seems odd that something like two Titan Vs are required to get 24fps video playback for just light grading work. TNR I understand as heavy. But three 1080 Tis still getting under 24fps on average for 4K media? That seems unacceptably bad. The price-to-performance ratio there is unjustifiably poor. And the small difference in performance between the 4K average and 8K average seems to suggest that something else is to blame here other than any amount of brute-force GPU power.

Posted on 2018-10-17 16:38:51

Quadros are really about the same performance as GeForce given similar specs. Just as an example, the P6000 tends to perform about the same as a GTX 1080 in most applications. Really, Quadro is used mostly for the higher VRAM capacity and slightly better reliability than raw performance.

I would check out some of our newer articles like https://www.pugetsystems.co... . If I remember right, Resolve 14 had a few odd scaling issues that were fixed in Resolve 15, so those results are going to be more accurate. In Resolve 15 you are going to be fine with a single GPU for light grading work unless you want to grade RED 8K media at full resolution. Adding OpenFX is where you start needing dual cards, however, since those are very heavy on the GPU.

Posted on 2018-10-19 16:25:18

Have to wonder whether part of the gain with adding more GPU boards was gaining more available RAM for the application. Adobe applications can only use 10 cores per CPU, and so adding more CPUs provides more cores that the application can actually use to run code and processes. Does having 22GB VRAM with two GTX 1080 boards, and more total bandwidth from using two PCIe slots, provide the performance boost? Easy to verify by comparing a single RTX 2080 against two GTX 1060 boards.

Posted on 2018-12-05 19:38:15

Unfortunately, VRAM doesn't multiply like that. If you have two 11GB video cards, Resolve can still only load 11GB worth of data onto the cards. The reason is that in order for each GPU to do its share of the processing, it needs access to all of the data, so each card needs to have its own copy. This may change in the future with new technology like NVLink on the NVIDIA RTX cards, but at the moment this is not a feature available in DaVinci Resolve.
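A minimal sketch of that point (illustrative only):

```python
def usable_vram_gb(cards_gb):
    """With each GPU holding its own full copy of the frame data,
    usable capacity is capped by the smallest card, not the sum."""
    return min(cards_gb)

print(usable_vram_gb([11, 11]))  # two 11GB cards -> still 11GB usable
print(usable_vram_gb([12, 8]))   # a mixed pair is limited by the 8GB card
```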

Posted on 2018-12-05 19:40:52

Thanks Matt, this last comment was really helpful. I just received the following information from Blackmagic support: "DaVinci Resolve 14 and onwards prefer a single larger GPU rather than multiple smaller GPUs. If you do wish to use multiple GPUs then we recommend that they are the same make and model. You can use two GTX 1080 Ti GPUs simultaneously. If the GPUs are the same make and model then you can even use the GUI card to perform image processing as well as drive the GUI. Resolve is limited to processing at the speed of the slowest GPU and with the memory of the smallest GPU. As such processing speed is also limited by weaker GPUs."

I'm sure most readers will know the latter. My questions currently:

1. I currently use a regular MSI GTX 1080 8GB, so I can buy another MSI GTX 1080 8GB (same brand, same model) for around 400 Euro, OR I can sell mine (for supposedly 400 Euro) and spend 800 Euro on a single GPU. Up until reading your last post I thought I would go for two GPUs, since I thought VRAM is most important in Resolve, but now I doubt it.

2. I plan to buy a Ryzen 7 2700X that uses the AM4 socket. I read that when using two GPUs the platform only provides x8 lanes per GPU. Will this be limiting when using Resolve with two regular 1080 8GB cards? I also plan on adding a Decklink (once I can afford a better grading display with 10-bit).

3. For gaming I would install an SLI bridge - will this interfere with the use of the two GPUs in Resolve?

Sorry for the long post - I'm thankful for any hints you might have!

Posted on 2019-01-25 21:58:06

Most of their response is right on, although some of it is a bit of a generalization.

1) I think this depends on what you can get for 800 Euro and what you are doing in Resolve. If you are working with 4K footage, I would do dual 1080 cards since you are unlikely to need more than 8GB of VRAM and two 1080 cards will outperform any single Titan card. If you think you might work with 8K footage, however, going to a single GPU that has more VRAM is going to be a very good idea since 8GB of VRAM is likely not going to cut it.

2) Don't worry about x8 vs x16. We've done testing on it and the difference in Resolve is minimal. Sure, x16 would be better, but that means an upgrade to X299/X399 which is going to be very pricey. You would get more performance for your dollar by just upgrading to a pair of Titan cards and staying on that platform instead.

3) Resolve hates SLI. We looked at it recently as part of our NVLINK testing https://www.pugetsystems.co... . The short of it is that if you are in SLI, Resolve only sees and uses a single GPU. So if you want to use SLI for gaming, just make sure you turn it off before starting up Resolve.

Posted on 2019-01-28 17:57:17