Octane Render GPU Performance Comparison
Written on April 28, 2016 by Matt Bach
According to OTOY, Octane Render is the "world's first and fastest GPU-accelerated, unbiased, physically correct renderer". While comparing the speed of different rendering engines is often an apples-to-oranges comparison, one thing we can test is how well Octane Render is able to utilize different models and quantities of video cards.
In this article, we will be benchmarking various NVIDIA GeForce and Quadro cards using OTOY's OctaneBench. We will be specifically looking at how much of a performance gain you may see both with a more expensive model and with multiple video cards. We will also be testing on two different platforms - one dual Xeon system and one Core i7 system - to see if the CPU and motherboard have any impact on performance. This should give us a clear idea of how well Octane Render is able to utilize different video cards and whether the platform itself has any impact on performance.
If you would rather simply view our conclusions, feel free to jump ahead to the conclusion section.
Since the performance of a video card can often depend somewhat on the motherboard's chipset and CPU used, we will be performing our testing across two different platforms. Our main test platform will be based around a pair of Xeon E5-2687W V3 CPUs while our second platform will be based around the Intel Core i7 6700K. The dual Xeon system will be able to provide a huge amount of CPU power and will allow us to test up to four cards at full PCI-E 3.0 x16 speeds. The Core i7 system, however, will be limited to two cards at PCI-E 3.0 x8 speeds.
Basic specifications for both machines are below:
|Motherboard:||Asus Z10PE-D8 WS||Asus Z170-A|
|CPU:||2x Intel Xeon E5-2687W V3 3.1GHz Ten Core||Intel Core i7 6700K 4.0GHz Quad Core|
|RAM:||8x Kingston DDR4-2133 8GB ECC Reg.||4x Crucial DDR4-2133 8GB|
|Hard Drive:||Samsung 850 Pro 512GB SATA 6Gb/s SSD|
|OS:||Windows 10 Pro 64-bit|
|PSU:||EVGA SuperNOVA 1600W P2 Power Supply|
For our test video cards, we used the following models:
|Test Video Cards|
|1-2x NVIDIA Quadro K2200 4GB|
|1-2x NVIDIA Quadro M4000 8GB|
|1-4x NVIDIA GeForce GTX 970 4GB|
|1-2x NVIDIA GeForce GTX 980 4GB|
|1-2x NVIDIA GeForce GTX 980 Ti 6GB|
|1-4x NVIDIA GeForce GTX Titan X 12GB|
You will notice that we are primarily focusing on GeForce and only testing a couple of Quadro cards. This is because there is rarely a need to use Quadro for rendering, but we wanted to include a couple of Quadro cards as comparison points to make sure there are no surprises with Octane Render.
One thing to make special note of is that OctaneBench by default gives a total score based on the performance measured relative to a GTX 980 video card. Since we will be testing a large number of video cards ourselves, we are not going to use this relative score and will instead calculate our own weighted score based on the Ms/s (mega samples per second) result for each test. We applied the exact same weighting system that OctaneBench uses to calculate the normal score, but keeping the results absolute rather than relative to a GTX 980 should give us slightly more accurate comparisons. The only downside is that our results will not be directly comparable to the results uploaded on the OctaneBench Results page.
One thing we want to make very clear is that our testing is only 100% accurate for the files and settings used in OctaneBench. While this should give us a good baseline for how well Octane Render runs on different video cards, you may see slightly different results with your own scenes.
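As a rough illustration of what an absolute weighted score could look like, here is a minimal sketch. The actual OctaneBench weights are not given in this article, so the test names and weights below are hypothetical placeholders, and the weighted-average form is an assumption rather than the exact formula we used.

```python
def weighted_score(results_ms_per_s, weights):
    """Combine per-test Ms/s results into one absolute weighted score.

    Assumption: the score is a weighted average of the raw Ms/s values.
    The real OctaneBench weighting is not published in this article.
    """
    missing = set(weights) - set(results_ms_per_s)
    if missing:
        raise ValueError(f"no result for tests: {sorted(missing)}")
    total_weight = sum(weights.values())
    weighted_sum = sum(results_ms_per_s[name] * weights[name] for name in weights)
    return weighted_sum / total_weight


# Hypothetical test names, results, and weights, for illustration only.
example_results = {"scene_a": 40.0, "scene_b": 20.0}
example_weights = {"scene_a": 1.0, "scene_b": 1.0}
print(weighted_score(example_results, example_weights))
```

Because the inputs stay in Ms/s, a score computed this way is not divided by a reference card's result, which is why it cannot be compared directly with the relative scores OctaneBench reports.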
PCI-E 3.0 x8 vs x16
Before we get too far into our testing, the first thing we want to do is to determine if there is any performance difference between running a video card at x8 or x16 speeds with Octane Render. While PCI-E 3.0 x16 technically has twice the bandwidth of PCI-E 3.0 x8, it is very rare for a program to fully saturate even PCI-E 3.0 x8 so we do not expect to see a difference in performance. If there is a performance difference, however, we want to know that before we get into the majority of our testing as it could change some of our testing methodology.
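For reference, the theoretical bandwidth gap between the two link widths can be worked out from the PCI-E 3.0 signaling rate (8 GT/s per lane with 128b/130b encoding). The sketch below is only that arithmetic; the function name is our own, and real-world throughput will be lower because of protocol overhead.

```python
# PCI-E 3.0 signaling parameters (per lane, per direction).
PCIE3_TRANSFERS_PER_SECOND = 8e9      # 8 GT/s
PCIE3_ENCODING_EFFICIENCY = 128 / 130  # 128b/130b line encoding


def pcie3_bandwidth_gb_per_s(lanes):
    """Theoretical one-direction PCI-E 3.0 bandwidth in GB/s for a lane count."""
    bits_per_second = PCIE3_TRANSFERS_PER_SECOND * PCIE3_ENCODING_EFFICIENCY * lanes
    return bits_per_second / 8 / 1e9  # bits -> bytes -> gigabytes


for lanes in (8, 16):
    print(f"x{lanes}: {pcie3_bandwidth_gb_per_s(lanes):.2f} GB/s")
```

This works out to a little under 8 GB/s for x8 and a little under 16 GB/s for x16 in each direction.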
We performed this test on our dual Xeon system as it allowed us to test two cards at full PCI-E 3.0 x16 speeds. To force them to run at x8 speeds we simply covered half of the pins on each video card with an insulating material (a post-it note).
As you can see from the charts above, with the fastest NVIDIA GPU available today (a GeForce GTX Titan X) we did in fact see a small difference between x8 and x16. Oddly, however, the difference was actually in favor of x8 by 1-4%. This is a very strange result, but we verified it with multiple benchmark runs. Even though we used the exact same video cards in the same slots in the same system, simply forcing the cards into x8 mode - for whatever reason - resulted in slightly higher performance.
While our benchmark showed faster performance in x8 mode, there is no reason we can think of to explain this odd behaviour. Really, what we would advise you take from this test is that you do not have to worry about whether your motherboard will be able to run a video card(s) at PCI-E 3.0 x8 or x16 speeds for Octane Render - it simply doesn't matter.
GPU Performance with 2x Xeon E5-2687W V3
In the charts above, note that we only have hard results for three and four video cards with the GTX 970 and GTX Titan X. We were a bit limited on the cards we had available, but we were able to do our triple and quad GPU testing using both the fastest and the slowest GeForce video cards. Using the performance measured from those cards, we were able to calculate the approximate amount of performance gain (or speedup) you would see with three or four cards. From this, we can estimate the performance of up to four cards for the other models.
The first thing to notice is that there is clearly no advantage to using Quadro over GeForce. In fact, the Quadro M4000 was about 30% slower than the GTX 970 even though it costs about three times as much.
Second, when it comes to using multiple GPUs, Octane Render scales extremely well. While it varied a bit from card to card, on average going from one card to two resulted in a 1.98x increase in performance. Going from one card to three resulted in a 2.97x increase, and going from one card to four resulted in a 3.99x increase in performance. This is about as close to perfect scaling as you can get in the real world! In fact, this is more than enough of a gain that the strategy of using multiple, more affordable cards should give you much better performance than fewer, more expensive cards. More information on this in the conclusion section.
Lastly, while most of the time upgrading the GPU model gives a decent increase in performance, there is very little difference between the GTX 980 Ti and the GTX Titan X (only about 3%). Because of this, we would typically only recommend the GTX Titan X over the GTX 980 Ti if you need the extra VRAM available on the Titan X or if you absolutely need the fastest performance possible regardless of the additional cost.
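To put those averages in terms of scaling efficiency (speedup divided by number of cards), and to show how a multi-card score might be estimated from a single-card result, here is a small sketch. The speedup figures are the averages quoted above; the function names and the example single-card score are illustrative assumptions only.

```python
# Average speedups relative to one card, as reported in this article.
AVERAGE_SPEEDUP = {1: 1.00, 2: 1.98, 3: 2.97, 4: 3.99}


def scaling_efficiency(gpu_count):
    """Fraction of ideal (perfectly linear) scaling achieved with gpu_count cards."""
    return AVERAGE_SPEEDUP[gpu_count] / gpu_count


def estimate_score(single_card_score, gpu_count):
    """Estimate a multi-card score from a single-card score and the average speedup."""
    return single_card_score * AVERAGE_SPEEDUP[gpu_count]


for count in (2, 3, 4):
    print(f"{count} cards: {scaling_efficiency(count):.2%} of ideal scaling")

# Hypothetical single-card score in Ms/s, for illustration only.
print(estimate_score(10.0, 4))
```

Every configuration lands at roughly 99% or better of ideal scaling, which is why estimating three- and four-card results for the models we could not test directly is a reasonable approximation.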
GPU Performance with Intel Core i7 6700K
Somewhat surprisingly, the results on the Intel Core i7 6700K system are almost identical to what we saw on the dual Xeon system. Typically, software performs better either on a system with a higher frequency (like our Core i7 system) or on one with a large number of cores (like our dual Xeon system), but Octane Render is actually very platform-independent. In fact, the biggest difference in performance we saw between the two platforms was just 1.5% - and it wasn't even consistent which platform was faster!
All this really means is that if the performance of Octane Render is your primary concern, you can largely ignore the CPU and motherboard. As long as you can run all the GPUs at PCI-E 3.0 x8 speeds or faster (since we saw no difference between x16 and x8), the CPU should make little difference in render times. Of course, a faster CPU should let you do things like open scenes faster so you shouldn't skimp too much on the CPU but you certainly don't need a dual Xeon setup just for Octane Render.
What all our testing comes down to is that you should use GeForce if possible (although mixing Quadro and GeForce should work fine if you need a primary Quadro card for tasks besides rendering) and prioritize having multiple video cards before worrying about the individual performance of each card.
Unlike most GPU-accelerated programs (even other GPU-based rendering engines like Iray), the performance gain you should see with multiple video cards is almost perfectly linear. Going from one card to two should result in about half the render time, one card to three should cut it to a third, and going from one card to four should result in about a quarter of the render time.
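As a quick sketch of that rule of thumb, the function below estimates render time under the simplifying assumption of perfectly linear scaling. The 600-second single-card render time is a made-up example, not a measured result.

```python
def estimated_render_time(single_gpu_seconds, gpu_count):
    """Estimate render time with gpu_count identical cards.

    Assumption: perfectly linear scaling, which slightly overstates the
    gain compared to the near-linear scaling measured in our testing.
    """
    if gpu_count < 1:
        raise ValueError("gpu_count must be at least 1")
    return single_gpu_seconds / gpu_count


# Hypothetical scene that takes 600 seconds on a single card.
for count in (1, 2, 3, 4):
    print(f"{count} card(s): about {estimated_render_time(600.0, count):.0f} seconds")
```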
To give you an idea of what cards you should consider at different budgets, we put together a small chart showing the best choice for a system that can handle one, two, three, or four video cards:
Best GPU choice for Octane Render
|Single GPU:||GTX Titan X 12GB
|Dual GPU:||2x GTX 980 4GB
|2x GTX 980 Ti 6GB
|2x GTX Titan X 12GB
|Triple GPU:||3x GTX 970 4GB (111.68 Ms/s)
|3x GTX 980 4GB
|3x GTX 980 Ti 6GB
|3x GTX Titan X 12GB
|Quad GPU:||-||4x GTX 970 4GB
|4x GTX 980 4GB
|4x GTX 980 Ti 6GB
|4x GTX Titan X 12GB
In the chart above, you can see that going with a more expensive model of video card is nowhere near as efficient a way to gain performance as simply going with a higher number of cards. It is not always possible to install more than one or two GPUs in your system, but if you are able to, you could potentially see a huge increase in performance. For example, if you have about $1500 to spend on video cards you would have the choice between two GTX 980 Ti 6GB cards or four GTX 970 4GB cards. The cost is almost identical, but the four GTX 970s will be about 30% faster. That is a free 30% increase in performance for absolutely no difference in cost!
Of course, raw performance is often not the only consideration. Accommodating more GPUs may require a larger power supply, may not allow for additional PCI-E cards like sound cards or WiFi to be used, and requires a physically larger chassis. In addition, if your renders require a large amount of VRAM (video card memory), you may need to go with a GTX 980 Ti 6GB or GTX Titan X 12GB just for the additional VRAM. That may mean you will have to give up some raw rendering performance, but it would ensure that you are able to complete the render in the first place.