
Octane Render GPU Performance Comparison

Written on April 28, 2016 by Matt Bach
Table of Contents:
  1. Introduction
  2. Test Setup
  3. PCI-E 3.0 x8 vs x16
  4. GPU Performance with 2x Xeon E5-2687W V3
  5. GPU Performance with Intel Core i7 6700K
  6. Conclusion
  7. Recommended Reading
  8. Recommended Systems for GPU-based Rendering

Introduction

According to OTOY, Octane Render is the "world's first and fastest GPU-accelerated, unbiased, physically correct renderer". While comparing the speed of different rendering engines is often an apples-to-oranges comparison, one thing we can test is how well Octane Render is able to utilize different models and quantities of video cards.

In this article, we will be benchmarking various NVIDIA GeForce and Quadro cards using OTOY's OctaneBench. We will be looking specifically at how much of a performance gain you may see both with a more expensive model and with multiple video cards. We will also be testing on two different platforms - one dual Xeon system and one Core i7 system - to see if the CPU and motherboard have any impact on performance.

If you would rather simply view our conclusions, feel free to jump ahead to the conclusion section.

Test Setup

Since the performance of a video card can often depend somewhat on the motherboard's chipset and CPU used, we will be performing our testing across two different platforms. Our main test platform will be based around a pair of Xeon E5-2687W V3 CPUs while our second platform will be based around the Intel Core i7 6700K. The dual Xeon system will be able to provide a huge amount of CPU power and will allow us to test up to four cards at full PCI-E 3.0 x16 speeds. The Core i7 system, however, will be limited to two cards at PCI-E 3.0 x8 speeds.

Basic specifications for both machines are below:

For our test video cards, we used the following models:

You will notice that we are primarily focusing on GeForce and only testing a couple of Quadro cards. This is because there is rarely a need to use Quadro for rendering, but we wanted to include a couple of Quadro cards as comparison points to make sure there are no surprises with Octane Render.

One thing to make special note of is that, by default, OctaneBench gives a total score based on the performance measured relative to a GTX 980 video card. Since we will be testing a large number of video cards ourselves, we are not going to use this relative score and will instead calculate our own weighted score based on the Ms/s (megasamples per second) result for each test. We applied the exact same weighting system that OctaneBench uses to calculate its normal score, but keeping the results absolute rather than relative to a GTX 980 should give us slightly more accurate comparisons. The only downside is that our results will not be directly comparable to those uploaded on the OctaneBench Results page.
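The scoring approach described above can be sketched in a few lines. Note that this is only an illustrative sketch - the scene names and weights below are placeholders, not OctaneBench's actual values:

```python
# Illustrative sketch of a weighted absolute score: combine per-scene Ms/s
# results using per-scene weights, without normalizing against a GTX 980.
# Scene names, results, and weights are placeholders, not real OctaneBench data.

def weighted_score(results, weights):
    """Weighted average of raw Ms/s results across benchmark scenes."""
    total_weight = sum(weights.values())
    return sum(results[scene] * weights[scene] for scene in results) / total_weight

# Hypothetical Ms/s results for a single card across four scenes:
results = {"scene_a": 12.4, "scene_b": 18.1, "scene_c": 15.3, "scene_d": 22.0}
weights = {"scene_a": 1.0, "scene_b": 1.0, "scene_c": 1.0, "scene_d": 1.0}

print(round(weighted_score(results, weights), 2))  # equal weights -> plain average
```

With non-uniform weights, scenes that are weighted more heavily pull the score toward their Ms/s result, which is exactly how the relative OctaneBench score behaves - we simply skip the final division by the GTX 980 baseline.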

One thing we want to make very clear is that our testing is only 100% accurate for the files and settings used in OctaneBench. While this should give us a good baseline for how well Octane Render runs on different video cards, you may see slightly different results with your own scenes.

About our testing: We rely on our customers and the community at large to point out anything we may have missed in our testing. If there is some critical part of Octane Render you think we skipped in our testing, please let us know in the comments at the bottom of the page. Especially if you are able to provide a file that we can integrate into our testing, we really want to hear your feedback!

PCI-E 3.0 x8 vs x16

Before we get too far into our testing, the first thing we want to do is to determine if there is any performance difference between running a video card at x8 or x16 speeds with Octane Render. While PCI-E 3.0 x16 technically has twice the bandwidth of PCI-E 3.0 x8, it is very rare for a program to fully saturate even PCI-E 3.0 x8 so we do not expect to see a difference in performance. If there is a performance difference, however, we want to know that before we get into the majority of our testing as it could change some of our testing methodology.
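For context on the bandwidth involved: PCI-E 3.0 runs at 8 GT/s per lane with 128b/130b encoding, so the theoretical per-direction bandwidth works out as below. This is a back-of-the-envelope calculation, not a measurement:

```python
# Theoretical per-direction PCI-E 3.0 bandwidth: 8 GT/s per lane with
# 128b/130b line encoding, one bit transferred per lane per transfer.
TRANSFER_RATE_GT = 8.0           # giga-transfers per second, per lane
ENCODING_EFFICIENCY = 128 / 130  # 128b/130b encoding overhead
BITS_PER_BYTE = 8

per_lane_gb_s = TRANSFER_RATE_GT * ENCODING_EFFICIENCY / BITS_PER_BYTE
for lanes in (8, 16):
    print(f"x{lanes}: {per_lane_gb_s * lanes:.2f} GB/s")
```

That works out to roughly 7.9 GB/s for x8 and 15.8 GB/s for x16 - and since Octane only needs to transfer scene data to the card before rendering starts, even x8 is far more bandwidth than it typically uses.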

We performed this test on our dual Xeon system as it allowed us to test two cards at full PCI-E 3.0 x16 speeds. To force them to run at x8 speeds we simply covered half of the pins on each video card with an insulating material (a post-it note).

As you can see from the charts above, with the fastest NVIDIA GPU available today (a GeForce GTX Titan X) we did in fact see a small difference between x8 and x16. Oddly, however, the difference was actually in favor of x8 by 1-4%. This is a very strange result, but we verified it with multiple benchmark runs. Even though we used the exact same video cards in the same slots in the same system, simply forcing the cards into x8 mode - for whatever reason - resulted in slightly higher performance.

While our benchmark showed faster performance in x8 mode, we cannot think of any reason that would explain this odd behavior. The main takeaway from this test is that you do not need to worry about whether your motherboard runs your video card(s) at PCI-E 3.0 x8 or x16 speeds for Octane Render - it simply doesn't matter.

GPU Performance with 2x Xeon E5-2687W V3

In the charts above, note that we only have hard results for three and four video cards with the GTX 970 and GTX Titan X. We were a bit limited on the cards we had available, but we were able to do our triple and quad GPU testing using both the fastest and the slowest GeForce video cards. Using the performance measured from those cards, we were able to calculate the approximate amount of performance gain (or speedup) you would see with three or four cards. From this, we can estimate the performance of up to four cards for the other models.

The first thing to notice is that there is clearly no advantage to using Quadro over GeForce. In fact, the Quadro M4000 was about 30% slower than the GTX 970 even though it costs about three times as much.

Second, when it comes to using multiple GPUs, Octane Render scales extremely well. While it varied a bit from card to card, on average going from one card to two resulted in a 1.98x increase in performance, going from one card to three in a 2.97x increase, and going from one card to four in a 3.99x increase. This is about as close to perfect scaling as you can get in the real world! In fact, the gain is large enough that using multiple, more affordable cards should give you much better performance than fewer, more expensive cards. More on this in the conclusion section.
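To put those numbers in perspective, scaling efficiency is simply the measured speedup divided by the number of cards. A quick sketch using the averages quoted above:

```python
# Average multi-GPU speedups from our OctaneBench testing.
# Efficiency = speedup / card count; 100% would be perfect linear scaling.
speedups = {1: 1.00, 2: 1.98, 3: 2.97, 4: 3.99}

for cards, speedup in speedups.items():
    efficiency = speedup / cards
    print(f"{cards} GPU(s): {speedup:.2f}x speedup, {efficiency:.1%} efficiency")
```

Every configuration lands at 99% efficiency or better, which is why adding cards is such an effective upgrade path for Octane.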

Lastly, while upgrading the GPU model usually gives a decent increase in performance, there is very little difference between the GTX 980 Ti and the GTX Titan X (only about 3%). Because of this, we would typically only recommend the GTX Titan X over the GTX 980 Ti if you need the extra VRAM available on the Titan X or if you absolutely need the fastest performance possible regardless of the additional cost.

GPU Performance with Intel Core i7 6700K

Somewhat surprisingly, the results on the Intel Core i7 6700K system are almost identical to what we saw on the dual Xeon system. Typically, software performs better either on a system with a higher frequency (like our Core i7 system) or on one with a large number of cores (like our dual Xeon system), but Octane Render is actually very platform-independent. In fact, the biggest difference in performance we saw between the two platforms was just 1.5% - and it wasn't even consistent which platform was faster!

All this really means is that if the performance of Octane Render is your primary concern, you can largely ignore the CPU and motherboard. As long as you can run all the GPUs at PCI-E 3.0 x8 speeds or faster (since we saw no difference between x16 and x8), the CPU should make little difference in render times. Of course, a faster CPU should let you do things like open scenes faster so you shouldn't skimp too much on the CPU but you certainly don't need a dual Xeon setup just for Octane Render.

Conclusion

What all our testing comes down to is that you should use GeForce if possible (although mixing Quadro and GeForce should work fine if you need a primary Quadro card for other tasks besides rendering) and to prioritize having multiple video cards before worrying about the individual performance of each card.

Unlike most GPU-accelerated programs (even other GPU-based rendering engines like Iray), the performance gain you should see with multiple video cards is almost perfectly linear. Going from one card to two should result in about half the render time, one card to three should cut it to a third, and going from one card to four should result in about a quarter of the render time.

To give you an idea of what cards you should consider at different budgets, we put together a small chart showing the best choice for a system that can handle one, two, three, or four video cards:

Best GPU choice for Octane Render

Single GPU:
  $1000 - GTX Titan X 12GB (58.55 Ms/s)

Dual GPU:
  $1000 - 2x GTX 980 4GB (88.21 Ms/s)
  $1500 - 2x GTX 980 Ti 6GB (113.26 Ms/s)
  $2000 - 2x GTX Titan X 12GB (115.66 Ms/s)

Triple GPU:
  $1000 - 3x GTX 970 4GB (111.68 Ms/s)
  $1500 - 3x GTX 980 4GB (132.3 Ms/s)
  $2000 - 3x GTX 980 Ti 6GB (169.3 Ms/s)
  $3000 - 3x GTX Titan X 12GB (174.09 Ms/s)

Quad GPU:
  $1500 - 4x GTX 970 4GB (148.17 Ms/s)
  $2000 - 4x GTX 980 4GB (177.5 Ms/s)
  $3000 - 4x GTX 980 Ti 6GB (227.1 Ms/s)
  $4000 - 4x GTX Titan X 12GB (235.91 Ms/s)

In the chart above, you can see that upgrading to a more expensive model of video card is nowhere near as cost-efficient as simply using a higher number of cards. It is not always possible to install more than one or two GPUs in your system, but if you are able to, you could see a huge increase in performance. For example, if you have about $1500 to spend on video cards, you have the choice between two GTX 980 Ti 6GB cards or four GTX 970 4GB cards. The cost is almost identical, but the four GTX 970s will be about 30% faster. That is a free 30% increase in performance for essentially no difference in cost!
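That comparison can be checked directly from the weighted Ms/s scores in the chart above:

```python
# Comparing the two ~$1500 options from the chart using their weighted
# Ms/s scores: two GTX 980 Ti cards vs four GTX 970 cards.
options = {
    "2x GTX 980 Ti 6GB": 113.26,  # Ms/s
    "4x GTX 970 4GB": 148.17,     # Ms/s
}

faster = max(options, key=options.get)
slower = min(options, key=options.get)
gain = options[faster] / options[slower] - 1
print(f"{faster} is {gain:.0%} faster than {slower}")
```

The same arithmetic applies at the other budget points - whenever two configurations cost about the same, the one with more (cheaper) cards wins on raw Ms/s.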

Of course, raw performance is often not the only consideration. Accommodating more GPUs may require a larger power supply, may not allow for additional PCI-E cards like sound cards or WiFi to be used, and requires a physically larger chassis. In addition, if your renders require a large amount of VRAM (video card memory), you may need to go with a GTX 980 Ti 6GB or GTX Titan X 12GB just for the additional VRAM. That may mean you will have to give up some raw rendering performance, but it would ensure that you are able to complete the render in the first place.

Recommended Reading

If you are configuring a system for rendering with a GPU-based engine, we have a number of articles regarding the hardware requirements for various rendering engines that you may be interested in:

Recommended Hardware for GPU-Based Rendering
Summary of what you need to know when choosing hardware for a GPU-based rendering workstation.

Octane Render GPU Performance Comparison
How well does Octane perform with different models and numbers of GPUs?

NVIDIA Iray GPU Performance Comparison
How well does Iray perform with different models and numbers of GPUs?

NVIDIA Iray CPU Scaling
Does having more CPU cores give you more performance in Iray?

 

Recommended Systems for GPU-based Rendering

Also great for:

  • Redshift
  • Solidworks Visualize
  • Furryball 
  • Arion
  • Blender - Cycles
  • Any other GPU-based rendering engine!

Dual GPU

Purchase

Compact workstation

  • Intel Core i7 CPU
    (up to 10 cores)
  • Supports up to 256GB of RAM
  • Up to two NVIDIA GeForce/Quadro video cards

Quad GPU

Purchase

Maximum Rendering Performance

  • Intel Core i7 CPU
    (up to 10 cores)
  • Supports up to 512GB of RAM
  • Up to four NVIDIA GeForce/Quadro video cards

Tags: Octane, Render, GPU, Video Card
'Ainoa Manuia

Great article, I see the price recommendations are for production house setups. Can't wait till you guys do an AMD Zen review

Posted on 2016-06-02 07:54:55
Andrew

I keep seeing x8 and x16 PCI-E graphics comparisons - what about Skylake and Haswell-E PCI-E SSD speeds? The DMI 3.0 limitation of up to 4GB is worth trying with the latest Intel 3608

Posted on 2016-06-02 15:16:18
Bradley Bird

This article doesn't mention SLI at all. Are we assuming these are running in 2/3/4-way SLI? Also, the CPU seems like overkill in the Xeon setup. Is that purely to accommodate the number of PCI-E lanes needed to run four cards at x8? Did you do any testing at x1? What are you using for a video card to run the system?

Posted on 2016-06-13 23:24:59

These GPUs were run as individual cards, not in SLI. SLI is not needed for things like GPU rendering and enabling it shouldn't really do anything for performance. As for the CPU choice, the dual Xeon setup was intended to be overkill because we wanted to see if there was any difference in performance between a more standard CPU (the Core i7 6700K) or a powerful dual CPU setup (dual Xeon E5-2687W v3). What we found was that there was virtually no difference in performance which means that Octane is almost entirely CPU-independent when rendering.

The dual Xeons did allow us to test up to four cards at PCI-E 3.0 x16, but since our testing also showed no difference between running the cards in x8 or x16, that isn't really a requirement for four GPU setups. The Core i7 6700K doesn't have enough PCI-E lanes to accommodate four GPUs, but something like a Core i7-6850K/6900K/6950X should give you that capability (provided the motherboard can handle it) if you want a four GPU setup for Octane.

We didn't test the cards at x1 - it is extremely rare to try to run a GPU at that speed and isn't something I would recommend. From other testing we have done in the past, you can sometimes get away with as low as PCI-E 3.0 x4 without any performance loss, but I would expect massive drops in performance if you tried to run at x1. Really, x8 is the lowest I would recommend running a video card at.

Posted on 2016-06-13 23:54:13
Muhamad Razif Mohd Razali

The GPUs run as individual cards? Let's say I've put in my two GTX 970s - there's no need to put in the SLI bridge, is there?

How do I enable them to run as individual cards?

Do I still need a motherboard that supports SLI if I intend to use 2x GTX 970 for GPU rendering?

Please, I do need help. I want to use this with 3ds Max applications.

Posted on 2016-07-28 15:36:16