
https://www.pugetsystems.com

Read this article at https://www.pugetsystems.com/guides/1473

RealityCapture 1.0.3: NVIDIA GeForce and Titan Performance Comparison

Written on June 5, 2019 by William George

Introduction

RealityCapture, like other photogrammetry applications, is built to take a batch of photographs and turn them into digital 3D models. The many steps involved in that process can take a lot of time, and they utilize both the CPU and GPU at different points. We recently put together a benchmark tool for RealityCapture, and after looking at processor performance last week we are now diving into a comparison of the current NVIDIA GeForce and Titan video cards.

Since RealityCapture uses CUDA, only NVIDIA cards will work - and indeed, an NVIDIA video card is required for the program to run fully.

Test Hardware

Here is a list of the hardware we tested RealityCapture on. The CPU and RAM capacity were kept the same across all test runs, to avoid either of those throwing off the comparison. We used the Core i9 9900K because it turned in the best performance in our recent CPU article. All results included here are from after the recent Windows 10 security patch addressing MDS vulnerabilities in some Intel processors. We did see a small increase in Core i9 9900K performance when we moved to the newer Z390 motherboard, though, which helped offset some of the performance that chip lost because of the update.

Benchmark Details

For testing photogrammetry applications, we have four image sets that we own the rights to - covering both smaller and larger size model and map projects. The smaller image sets are included in our public RealityCapture benchmark, which you can download and run if you want to compare your system's performance to what we measured in our testing.

  • Rock Model - 45 photos at 20 megapixels each
  • School Map - 51 photos at 18 megapixels each
  • School Model - 278 photos at 18 megapixels each
  • Park Map - 758 photos at 18 megapixels each
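
To put the relative size of these image sets in perspective, here is a minimal sketch that multiplies photo count by per-photo resolution. The counts and megapixel values come from the list above; the names `IMAGE_SETS` and `total_megapixels` are just illustrative, and the totals are nominal (they say nothing about file size or image content).

```python
# Image sets from the list above: (name, photo count, megapixels per photo).
IMAGE_SETS = [
    ("Rock Model", 45, 20),
    ("School Map", 51, 18),
    ("School Model", 278, 18),
    ("Park Map", 758, 18),
]


def total_megapixels(photos: int, mp_each: int) -> int:
    """Nominal amount of image data in a set, in megapixels."""
    return photos * mp_each


totals = {name: total_megapixels(n, mp) for name, n, mp in IMAGE_SETS}

for name, mp in totals.items():
    print(f"{name}: {mp} MP total")
```

By this rough measure, the Park Map set holds about fifteen times as much image data as the Rock Model set.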

Each image set was processed 2-3 times on each GPU, and the fastest overall result was used for the comparisons below.
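
The "fastest of a few runs" rule described above can be sketched as follows. This is only an illustration of the selection step, not our actual benchmark script: the function name `fastest_run`, the GPU labels, and the timings are all made up for the example.

```python
def fastest_run(run_times_s):
    """Return the lowest total time (in seconds) from a list of repeated runs."""
    if not run_times_s:
        raise ValueError("need at least one run")
    return min(run_times_s)


# Hypothetical total times in seconds - NOT measured results.
example_runs = {
    "GPU A": [412.3, 405.8, 409.1],  # three runs
    "GPU B": [398.4, 401.0],         # two runs
}

best = {gpu: fastest_run(times) for gpu, times in example_runs.items()}

for gpu, seconds in best.items():
    print(f"{gpu}: {seconds:.1f} s")
```

Taking the minimum, rather than the average, reduces the influence of one-off slowdowns such as background activity during a run.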

Results Overview

Here are charts for each of the four image sets, showing the total time (in seconds) they took to process on each GPU. The video cards are listed in the same order on all charts, regardless of performance, to make it easy to see how they stack up. The fastest result on each project is highlighted (bold) to ensure it stands out from the crowd, but it turns out that may not have been necessary.

Detailed Results

For those who want to dig further into the differences in how each GPU performs, here is a table showing the times for each step within RealityCapture on each of the image sets:

RealityCapture 1.0.3 NVIDIA GeForce and Titan Performance Table

Analysis

GPU selection appears to have a fairly small, but still measurable, impact on RealityCapture's performance. Across cards ranging from ~$300 to $2500 in price, the spread in performance was only 10-20%. The Titan RTX did come out with the fastest times across the board, but sometimes only by fractions of a second. All in all, reducing processing times in RealityCapture by upgrading the video card is an option - but a costly one, considering the modest gains as you go from one card to the next.

Multi GPU Performance

What about multiple video cards, though? How viable is that path to increasing performance in RealityCapture?

RealityCapture 1.0.3 Multi GPU Scaling with GeForce RTX 2080 Ti

Doubling up on video cards provides a 5-13% boost in processing speed in RealityCapture, which is not too shabby considering the incremental differences we saw earlier from one card to the next. In fact, at some price points this means that two video cards would be a better choice than a single, faster, and more expensive card. For example, check out how two GeForce RTX 2080 Ti cards (2 x $1199 MSRP) best a single Titan RTX ($2499 MSRP) for about the same price:

RealityCapture 1.0.3 Performs Better on Two GeForce RTX 2080 Ti Video Cards Than on a Single Titan RTX for the Same Price
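
The "about the same price" claim is easy to check with the MSRPs quoted above. The sketch below does that arithmetic and also includes a small helper, `percent_time_saved`, for turning two processing times into a percentage. That helper is an assumption about one reasonable way to express such a difference, not necessarily the exact formula behind the figures in this article.

```python
# MSRPs as quoted in the text above (US dollars).
RTX_2080_TI_MSRP = 1199
TITAN_RTX_MSRP = 2499

dual_2080_ti_cost = 2 * RTX_2080_TI_MSRP
price_gap = TITAN_RTX_MSRP - dual_2080_ti_cost
price_gap_pct = 100 * price_gap / TITAN_RTX_MSRP


def percent_time_saved(baseline_s: float, new_s: float) -> float:
    """Percent reduction in processing time relative to a baseline (illustrative helper)."""
    return 100 * (baseline_s - new_s) / baseline_s


print(f"Two RTX 2080 Ti cards: ${dual_2080_ti_cost}")
print(f"One Titan RTX:         ${TITAN_RTX_MSRP}")
print(f"Difference:            ${price_gap} ({price_gap_pct:.1f}% of the Titan RTX price)")
```

At MSRP, the pair of RTX 2080 Ti cards comes in $101 under the Titan RTX, or roughly 4% less.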

Conclusion

All of the NVIDIA graphics cards we tested with RealityCapture worked well, so in the end your choice should probably come down to budget. In some situations, spending a little more can get you a nice boost - like moving from the GTX 1660 Ti to the RTX 2060. Those two cards are less than $100 apart, and that step represents the biggest jump in performance for your dollar among the models in this roundup.

As you go higher, and cards get more expensive, opting for two less expensive video cards instead of one higher cost GPU is often going to be the best value. It is entirely possible that three or four video cards would further increase GPU speeds in RealityCapture, but going beyond the two we tested here would require a bigger chassis and power supply (which further increases cost) and potentially a different CPU. Since the alternatives to the Core i9 9900K are slower in RealityCapture, that could lead to an overall reduction in performance instead of an improvement.

Looking for a Photogrammetry Workstation?

Puget Systems offers a range of workstations tailored for Pix4D, Metashape, and RealityCapture. Even in the most demanding situations, our workstations are designed to minimize downtime and allow you to work as efficiently as possible.

Tags: RealityCapture, photogrammetry, GPU, NVIDIA, GeForce, Titan RTX, RTX, Performance
Tyler Unzicker

Hi William,
We're getting ready to build another RC system and I've found your benchmarks very useful. However, we are still debating the benefits of multiple GPUs and it seems, from our testing, that multi GPUs may have a more significant impact than stated in this article. Isn't it true that if the system you were testing on was running a 9900k then using dual GPU would run at x8/x8? From what I've seen the 9900k only has 16 PCI lanes. The only reason I bring this up is because we have done some tests on one of our current rigs that is running triple 980ti's at x16/x16/x16 and the difference between our 1 card and 3 card test was significant (the 1 card test took almost twice as long as the 3). Even more surprising is that the test with 3 980ti's (i7 5960x, 3 980ti's running at x16, 64gb ram) was slightly faster than our newest rig (TR 2950x, 2 2080ti's running at x16, 128gb ram).

Posted on 2019-07-10 16:47:07

Yes, the test system here had a Core i9 9900K on a Z390 based motherboard. That means 16 PCI-E lanes from the CPU, and when used with two video cards they each get 8 lanes. Most of the time, at least in other applications, we have seen little if any performance impact from having that PCI-E lane reduction... though I have not specifically tested that in RC or any other photogrammetry programs yet.

Going with a different CPU and motherboard / chipset could allow more than two video cards, and potentially also keep some of those video cards at 16 lanes rather than 8... but in trade, you'd have to use a CPU that is slower in RealityCapture than the 9900K. The reason that CPU was selected for this GPU test is that it is the fastest we have found for RC, which you can read about more on this article: https://www.pugetsystems.co...

I suppose there is a possibility that going to a CPU with more PCI-E lanes and three or four GPUs instead of two could end up counteracting the effects of having a slower processor... but would it be enough to actually result in faster overall performance? I am not sure - and *if* it did, it would come at a pretty high price increase (a more expensive motherboard, CPU, CPU cooler, power supply, chassis, and 1-2 more video cards).

If you'd like to see how your current systems stack up against the hardware we have tested in our articles, consider downloading and running our public benchmark for RealityCapture: https://www.pugetsystems.co...

I would love to see the results from your systems in that test :)

Posted on 2019-07-10 17:01:41
Tyler Unzicker

I guess I forgot to mention that we also tested putting a single 2080ti in the i7 rig and it outperformed the single 2080ti test on the TR even though the i7 is only running at 3 ghz. I agree with everything you have said so far, I'm just racking my brain trying to figure out why the i7 rig running at 3 ghz is performing better than the TR at 4.2 ghz.

Also, we have run your public benchmark but we are currently running one of our own data sets to get a more accurate representation of what to expect.

Posted on 2019-07-10 17:12:04

That older Core i7 doing so well surprises me a little bit too... but then again, in our CPU testing it did seem like Threadripper in particular was a poor match for RealityCapture. Clock speed and core count don't always tell the whole story: there are architecture differences, cache behavior, communication between cores & RAM, and more that all come into play in real-world performance.

Also, since you've given it a shot, if you have any feedback about our benchmark or insight about how it does / does not mimic your own data set behavior I would love to hear it! We want to improve our benchmarking over time, so that the public tests and our articles are as beneficial for users as possible :)

Posted on 2019-07-10 17:48:06
Brian LaBrec

Hi William,

Thanks for all of your help! I just wanted to check if your benchmarks are being run at "high detail". I noticed the low times you got for the reconstruction process, even for the "park map" at 758 photos. This might be because your images are a little scarce in pixels for photogrammetry, but I understand that for shared benchmarks gigabytes of data aren't ideal to download.

From our own tests, we've found that the first process (calculating depth maps) in RC reconstruction is very GPU heavy and our times vary drastically based on what GPU/GPU-combo we're using. If we were to go wild with it, it might also be useful to time how long each process takes in a reconstruction per GPU/CPU. For datasets with hundreds of very large and pixel dense images, I have a feeling that utilizing more, higher quality GPUs will speed up their portion of processing more than a higher end CPU will.

Thanks again, I'm interested to hear your thoughts!

Posted on 2019-07-11 13:30:54

The benchmark tests currently run on Normal rather than High in order to keep the processing times more reasonable. If your particular settings and data sets lead to more time being spent on GPU-accelerated steps, though, the GPU selection (both model and quantity) could certainly become a bigger factor! If I have a chance at some point, I might take a look at the impact of moving from Normal to High on one of my larger image sets... but right now, the focus is on testing AMD's new Ryzen 3rd Gen chips to see if they best the Intel Core i7 / i9 chips that are currently the fastest for RC.

Posted on 2019-07-11 20:15:27