V-Ray RT 3.6 Hybrid Mode with AMD Threadripper 1950X and NVIDIA Titan Xp
Written on October 13, 2017 by William George
We recently performed testing on the latest version of V-Ray RT, which introduced a "Hybrid Mode" that utilizes both the CPU and GPUs in a system to provide extremely fast rendering. That test was run on several Core X series processors, formerly codenamed Skylake X, along with up to three GeForce GTX 1080 Ti graphics cards. The results were very interesting, with the addition of the CPU to the rendering pool definitely providing a benefit in terms of speed. However, our current Core X motherboards max out at three GPUs - yet V-Ray RT and similar rendering engines can scale to even more. As of this article's publication, our quad-GPU platform uses AMD's Threadripper processors instead... so I wanted to provide performance data showing how those processors compare to Intel's Core X in this application.
There are a few differences in the test system used here, though, which prevent direct comparisons from being made. Primarily, we did not have access to a full set of four GTX 1080 Ti cards. Those have been in short supply, so I had to use Titan Xp models instead. The Titan Xp is noticeably faster than the 1080 Ti, though also a lot more expensive. That means the results from the Threadripper CPU + 1-4 GPUs here cannot fairly be stacked up against the previous Core X data points. However, we can directly compare the CPU-only test results (when no GPUs were being used). As a bonus, we can also compare the GPU-only results (without the CPUs in the mix) to see how the 1080 Ti and Titan Xp cards stack up.
Before we dive into the test platform, methodology, and results, here is a little background for those who may not have read our previous article. With V-Ray RT 3.6, you can now use both the CPU and GPUs in a single computer! This is called Hybrid Rendering, and promises a free boost to rendering speeds without any additional complexity for users. The way it works is pretty ingenious: the folks at Chaos Group figured out a way to run CUDA code on the CPU. CUDA is the language used to perform general computation on NVIDIA graphics cards, and has been used by V-Ray RT for quite a while - but until now it could only run on GPUs. Being able to run the same code on CPUs as well was originally designed to allow for easier debugging, but it turned out to also be a nice way to get a speed boost when rendering without any additional hardware requirements. Chaos Group posted a great blog post about this, if you want more info.
Test Hardware and Methodology
We used a single AMD Threadripper configuration for the new testing we performed, but since select results from our previous article are also included below, this chart shows the specs of both platforms:
|Threadripper (X399) Test Platform||Skylake X (X299) Comparison Hardware|
|Motherboard:||Gigabyte X399 AORUS Gaming 7 (rev 1.0)||Gigabyte X299 AORUS Gaming 7 (rev 1.0)|
|CPU:||AMD Threadripper 1950X||Intel Core i9 7940X 3.1GHz (4.3/4.4GHz Turbo) 14 Core; Intel Core i9 7960X 2.8GHz (4.2/4.5GHz Turbo) 16 Core; Intel Core i9 7980XE 2.6GHz (4.2/4.4GHz Turbo) 18 Core|
|RAM:||8x Crucial DDR4-2666 16GB (128GB Total)|
|GPU:||4x NVIDIA GeForce Titan Xp 12GB||3x NVIDIA GeForce GTX 1080 Ti 11GB|
|Hard Drive:||Samsung 960 Pro M.2 PCI-E x4 NVMe SSD|
|OS:||Windows 10 Pro 64-bit|
|Software:||V-Ray RT 3.60.03 for 3ds Max 2017|
The test methodology was the same as last time, so that the results would be comparable. To measure performance, we opened a complex indoor scene within 3ds Max 2017, switched the render engine to V-Ray RT, and then rendered it with the default settings.
What changed between runs was the mode selection within V-Ray RT and the CUDA device(s) being used. We ran first with the CPU alone, in both CPU and CUDA modes, and then also with every possible combination of 1-4 video cards.
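To give a sense of how many runs "every possible combination" implies, here is a minimal Python sketch that enumerates GPU selections for a four-card system. The device labels are hypothetical, and the idea of collapsing selections that differ only in which identical secondary cards are used is my assumption for illustration; it is not a list of the exact runs performed.

```python
from itertools import combinations

# Hypothetical labels: one primary card (driving the display) and three secondaries.
GPUS = ["gpu0_primary", "gpu1", "gpu2", "gpu3"]


def gpu_subsets(gpus):
    """Yield every non-empty selection of cards, from 1 card up to all of them."""
    for count in range(1, len(gpus) + 1):
        yield from combinations(gpus, count)


def distinct_layouts(gpus, primary):
    """Collapse selections that differ only in which identical secondary cards are used.

    Each layout is (number of cards, whether the primary card is included).
    """
    layouts = {(len(subset), primary in subset) for subset in gpu_subsets(gpus)}
    return sorted(layouts)


all_subsets = list(gpu_subsets(GPUS))
layouts = distinct_layouts(GPUS, "gpu0_primary")
print(f"{len(all_subsets)} raw selections, {len(layouts)} distinct layouts")
for count, uses_primary in layouts:
    print(f"{count} card(s), primary {'included' if uses_primary else 'excluded'}")
```

With identical cards, the distinction that matters is how many cards are rendering and whether the display card is one of them, which is why the sketch tracks only those two things.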
Results - Threadripper 1950X
First up, we have the full run of GPU combinations using the AMD Threadripper 1950X processor:
Two results are shown for each hardware combination. The blue bar is the total time the rendering process took, including pre-render steps, while the red is the main render phase alone. This lets us observe some interesting things:
- In all of the tests that included one or more GPUs, the time taken by the pre-render steps (the difference between the two results we show) is very consistent. The Threadripper 1950X processor takes 80-100 seconds to perform those functions, regardless of how many GPUs are involved in the final render.
- Running the render on secondary video cards is substantially faster than doing so on the primary video card - the one which is handling display output. We saw this on the Core X system as well.
- The 1950X processor alone, in CUDA mode, is comparable in rendering performance to a single Titan Xp video card (when it is the primary card in a system).
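The pre-render figure mentioned above is simply the total time minus the main render time. As a rough sketch of that arithmetic in Python, the snippet below uses placeholder timings (they are not measurements from this article) and also shows what share of the total the pre-render steps take up. The function names are my own.

```python
def pre_render_seconds(total_seconds: float, render_seconds: float) -> float:
    """Time spent on pre-render steps: total time minus the main render phase."""
    if render_seconds > total_seconds:
        raise ValueError("render phase cannot be longer than the total time")
    return total_seconds - render_seconds


def pre_render_share(total_seconds: float, render_seconds: float) -> float:
    """Fraction of the total time taken up by the pre-render steps."""
    return pre_render_seconds(total_seconds, render_seconds) / total_seconds


# Placeholder timings for illustration only, NOT results from the charts.
example_total = 400.0
example_render = 310.0

overhead = pre_render_seconds(example_total, example_render)
share = pre_render_share(example_total, example_render)
print(f"pre-render: {overhead:.0f} s ({share:.1%} of total)")
```

Because the pre-render time stays roughly fixed while the main render phase shrinks as GPUs are added, that share grows with each additional card.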
Results - Threadripper 1950X vs Core X CPUs
Next, we can compare the CPU-only results from the 1950X test above with our previous data from the Core X series processors:
AMD's top-end Threadripper, highlighted with orange borders, lands right in the middle of the pack of Intel processors. As we have seen with Threadripper in other heavily threaded applications, it does better in terms of price:performance ratio than Intel's models... but Intel does offer more costly CPUs with higher absolute performance. There are also a couple of additional observations worth making:
- The rendering speed difference between CPU mode and CUDA mode is greater on the AMD system than on the Intel processors, but that probably doesn't matter since the ideal software setup on both platforms is CUDA mode (even without using GPUs).
- While I did not include the lower-end 1920X (12 core) Threadripper processor in my testing, I am pretty confident that it would have landed between the Core i7 7820X and Core i9 7900X in terms of speed (closer to the Core i9, at least in CUDA mode). That puts its $799 price point more in line with Intel's options, so it probably isn't as good a value as the 1950X in this application.
Bonus Results - GeForce 1080 Ti vs Titan Xp GPU Comparison
We cannot fairly compare the CPU + GPU results in this test with the prior Core X findings, because of the difference in both CPU platform and the video cards used. However, just as we could look at the CPU-only results, we can also look at the GPU-only data points to see how the GTX 1080 Ti compares to its bigger sibling: the Titan Xp. We will use the main rendering times only here, so that the impact of different CPUs on the pre-render steps does not affect the GPU comparisons unfairly.
The faster Titan Xp wins every time, of course, but the margin between the cards is right around 10% most of the time. The biggest outlier was when a single, secondary card was used... but even then, it only reached about a 15% difference. In some of the other tests it dips under 10% as well, so that is a pretty safe average. And while we only have test results with up to 3 GPUs for the 1080 Ti, we can use that average to project roughly how they would perform with more cards. Those estimates are highlighted with orange borders in the chart above.
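For anyone who wants to reproduce that kind of projection, here is a minimal Python sketch. It assumes the roughly 10% margin is applied as extra render time on top of a measured Titan Xp result; that interpretation, the function name, and the sample input are my assumptions, and the input is a placeholder rather than a number from the chart.

```python
def project_1080ti_seconds(titan_xp_seconds: float, margin: float = 0.10) -> float:
    """Estimate a GTX 1080 Ti render time from a Titan Xp render time.

    Assumes the 1080 Ti takes `margin` (about 10%) longer than the Titan Xp.
    """
    if titan_xp_seconds < 0 or margin < 0:
        raise ValueError("time and margin must be non-negative")
    return titan_xp_seconds * (1.0 + margin)


# Placeholder Titan Xp time for illustration only, NOT a measured result.
titan_xp_time = 100.0
print(f"projected 1080 Ti time: {project_1080ti_seconds(titan_xp_time):.0f} s")
```

This is only a rough estimate, since the measured gap ranged from under 10% to about 15% depending on the configuration.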
So, given the price difference ($300-600, as availability fluctuates) between the 1080 Ti and Titan Xp, is the performance increase worthwhile? That probably depends on how many GPUs you are getting. With a single card, in the grand scheme of a several-thousand-dollar workstation, it may well make sense to spend about 10% more and get a roughly similar increase. However, in such situations it may be even better to go with two more modest cards (1070s or 1080s). Likewise, as you go up to 2, 3, or 4 GPUs, the added cost of having them all be Titan Xp models instead of the 1080 Ti gets much bigger. If you want the absolute best rendering speeds in V-Ray RT (or a similar GPU based rendering engine) then it could be worthwhile, but the 1080 Ti comes very close for a much lower price.
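To put numbers on how that added cost grows, here is a short Python sketch that multiplies the $300-600 per-card price difference quoted above by the number of cards. The function name is my own, and it assumes every card in the system carries the same premium.

```python
def added_cost_range(num_gpus: int, low: int = 300, high: int = 600) -> tuple[int, int]:
    """Extra cost (in dollars) of choosing Titan Xp cards over 1080 Ti cards.

    `low` and `high` are the per-card price difference range from the article.
    """
    if num_gpus < 1:
        raise ValueError("need at least one GPU")
    return num_gpus * low, num_gpus * high


for gpus in range(1, 5):
    cost_low, cost_high = added_cost_range(gpus)
    print(f"{gpus} GPU(s): ${cost_low}-{cost_high} extra for Titan Xp")
```

The speed gain stays at around 10% no matter how many cards are installed, while the premium scales linearly, reaching $1200-2400 for a quad-GPU system.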
As we saw with the Core X processors, this new Hybrid Rendering mode in V-Ray 3.6 is quite beneficial. You will already have a CPU in any workstation, so no additional hardware investment is required. Simply adding your processor to the list of CUDA devices used when rendering in V-Ray RT 3.6 will increase performance, even if you already have multiple GPUs at your disposal. The speed-up you get will depend on what CPU you have, though, so if you are buying or building a new workstation for V-Ray RT it is now worth considering a more powerful CPU than you might have in the past. We have recently updated our V-Ray recommended systems accordingly, offering Core X options on the 1-2 GPU compact system, Threadripper on the 1-4 GPU tower, and Xeon options for those wanting dual CPUs.