
https://www.pugetsystems.com

Read this article at https://www.pugetsystems.com/guides/1819

RealityCapture 1.0.3 HT / SMT Performance Analysis

Written on July 2, 2020 by William George

TL;DR: Disabling HT/SMT increases RC performance on many-core CPUs

RealityCapture performed best with Hyperthreading and Simultaneous Multithreading turned off when running on high core count processors (16+ cores). On processors with fewer cores, leaving those features enabled resulted in better performance.

Introduction

Intel's Hyperthreading (often shortened to HT) and AMD's similar Simultaneous Multithreading (SMT) are features found on many mid-range and almost all high-end processors, and they are enabled by default. These technologies work by duplicating a portion of each CPU core's pipeline, allowing a second software thread to be ready and waiting to execute commands as soon as the core finishes processing the thread it is actively working on. That doubles the number of "cores" the operating system sees, and in some applications can lead to a substantial increase in performance. However, in certain situations, it can also reduce performance - especially if a program struggles to use lots of cores effectively (shown by performance stagnating or even going down as the number of cores in a CPU goes up).
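To make the "doubling" concrete, here is a minimal Python sketch of how many logical processors the operating system would see on the four CPUs tested in this article, with HT/SMT on and off. The physical core counts are the published specifications for each chip rather than figures stated in this article, and `logical_processors` is a hypothetical helper name used only for illustration.

```python
import os

# Physical core counts for the four CPUs tested in this article.
# ASSUMPTION: taken from published specifications, not from the article text.
PHYSICAL_CORES = {
    "Core i9 9900K": 8,
    "Ryzen 9 3950X": 16,
    "Core i9 10980XE": 18,
    "Threadripper 3970X": 32,
}

THREADS_PER_CORE = 2  # HT and SMT both expose two hardware threads per core


def logical_processors(physical_cores, smt_enabled):
    """Return the number of "cores" the operating system would see."""
    return physical_cores * (THREADS_PER_CORE if smt_enabled else 1)


for name, cores in PHYSICAL_CORES.items():
    on = logical_processors(cores, smt_enabled=True)
    off = logical_processors(cores, smt_enabled=False)
    print(f"{name}: {off} logical processors with HT/SMT off, {on} with it on")

# os.cpu_count() reports logical processors, so its value on a given
# machine changes when HT/SMT is toggled in the BIOS.
print("This machine reports", os.cpu_count(), "logical processors")
```

With SMT enabled, the Threadripper 3970X presents 64 logical processors to the application, which is the kind of thread count a program has to scale to before the extra threads help rather than hurt.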

In past articles we have observed that some photogrammetry applications seem to perform better with lower core count processors, so we wanted to look at whether HT and SMT could be negatively impacting performance in these programs. In this article, we are focusing on RealityCapture. If you would prefer to skip our test setup and various benchmark results, feel free to jump straight to the Conclusion.

Looking for a RealityCapture Workstation?

Puget Systems offers a range of powerful and reliable systems that are tailor-made for your unique workflow.

Configure a System!

Labs Consultation Service

Our Labs team is available to provide in-depth hardware recommendations based on your workflow.

Find Out More!

Test Setup

Listed below are the specifications of the systems we will be using for our testing:

AMD Ryzen Test Platform
  • CPU: AMD Ryzen 9 3950X
  • CPU Cooler: Noctua NH-U12S
  • Motherboard: Gigabyte X570 AORUS ULTRA
  • RAM: 4x DDR4-2933 16GB (64GB total)

Intel 9th Gen Test Platform
  • CPU: Intel Core i9 9900K
  • CPU Cooler: Noctua NH-U12S
  • Motherboard: Gigabyte Z390 Designare
  • RAM: 4x DDR4-2666 16GB (64GB total)

AMD Threadripper 3rd Gen Test Platform
  • CPU: AMD TR 3970X
  • CPU Cooler: Noctua NH-U14S TR4-SP3
  • Motherboard: Gigabyte TRX40 AORUS PRO WIFI
  • RAM: 4x DDR4-2933 16GB (64GB total)

Intel X-10000 Series Test Platform
  • CPU: Intel Core i9 10980XE ($979)
  • CPU Cooler: Noctua NH-U12DX i4
  • Motherboard: Gigabyte X299 Designare EX
  • RAM: 4x DDR4-2933 16GB (64GB total)

Shared PC Hardware/Software
  • Video Card: NVIDIA GeForce RTX 2080 Ti 11GB
  • Hard Drive: Samsung 960 Pro 1TB
  • Software: Windows 10 Pro 64-bit (version 1909); RealityCapture 1.0.3.10403; Puget Systems RealityCapture Benchmark

Test Methodology

For benchmarking photogrammetry applications we now have four image sets that we own the rights to, covering both smaller and larger size models and map projects. All of these image sets are available in our public RealityCapture benchmarks, split up to allow quick or extended tests, which you can download and run if you want to compare your system's performance to what we measured for this article.

  • Rock Model - 45 photos at 20 megapixels each
  • School Map - 51 photos at 18 megapixels each
  • School Model - 278 photos at 18 megapixels each
  • Park Map - 758 photos at 18 megapixels each
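For a rough sense of how much input each image set represents, the photo counts and resolutions above can be multiplied out. This is only a crude proxy for workload (processing time does not necessarily scale linearly with pixel count, and the article does not claim that it does); `total_megapixels` is a hypothetical helper name.

```python
# (name, photo count, megapixels per photo), as listed above
IMAGE_SETS = [
    ("Rock Model", 45, 20),
    ("School Map", 51, 18),
    ("School Model", 278, 18),
    ("Park Map", 758, 18),
]


def total_megapixels(photos, megapixels_each):
    """Total input size of an image set, in megapixels."""
    return photos * megapixels_each


totals = {name: total_megapixels(n, mp) for name, n, mp in IMAGE_SETS}
smallest = min(totals.values())

for name, mp in totals.items():
    print(f"{name}: {mp} MP total ({mp / smallest:.1f}x the smallest set)")
```

By this measure the Park Map set is about 15 times the size of the Rock Model set (13,644 versus 900 megapixels).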

Benchmark Results

The focus of this article is on each processor's performance in RealityCapture with Hyperthreading or SMT enabled versus disabled, rather than comparing one processor to another, so the graphs are set up to reflect that. There are two results for each CPU: HT/SMT on is shown in blue and HT/SMT off in red. The charts show the total processing time for each image set, in seconds, so smaller numbers and shorter lines indicate better performance. Scroll through the gallery below to see an overview of the results:

For those who want to dig into how HT and SMT impacted performance in different processing steps, here is a full table of the results:

RealityCapture 1.0.3.10403 Hyperthreading and SMT On vs Off Performance Table


Analysis & Conclusion

The results shown above are a bit mixed, so let's break them down by processor:

  • AMD's Threadripper 3970X benefited across the board from turning off Simultaneous Multithreading. Presumably this would hold true for other 3rd Gen Threadripper processors as well.
  • AMD's Ryzen 3950X also saw an increase in performance with SMT disabled, but a much smaller one than the Threadripper did. In fact, the 1-3% improvements are probably close to the margin of error. I think with anything smaller (the 3900X or lower) the outcome would likely be different.
  • Intel's Core i9 10980XE saw gains in three of the four tests with HT off, but did lose a bit of performance in the fourth test. As with the Ryzen line, I think anything with a lower core count in the Core X family would likely see no benefit from disabling HT.
  • Intel's Core i9 9900K performed best with HT turned on, particularly with the largest image set in our benchmark. The small gains it had with two of the other image sets are likely within the margin of error, so on this and other mainstream Core processors I would recommend leaving Hyperthreading enabled.
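The percentages in the breakdown above come from comparing total processing times, where lower is better. A minimal sketch of that comparison follows. The timings in it are HYPOTHETICAL placeholders, not measurements from this article, and both the function names and the 3% margin (loosely based on the "1-3%" remark above) are assumptions for illustration.

```python
def percent_faster_with_smt_off(seconds_on, seconds_off):
    """Percent reduction in processing time from turning HT/SMT off.

    Positive means HT/SMT off was faster; negative means it was slower.
    """
    if seconds_on <= 0:
        raise ValueError("seconds_on must be positive")
    return (seconds_on - seconds_off) / seconds_on * 100.0


def within_margin(change_percent, margin_percent=3.0):
    """True if the difference is too small to read much into.

    ASSUMPTION: the 3% default is illustrative, not a measured error margin.
    """
    return abs(change_percent) <= margin_percent


# HYPOTHETICAL timings in seconds, for illustration only.
example_on, example_off = 1000.0, 980.0
change = percent_faster_with_smt_off(example_on, example_off)
print(f"{change:.1f}% faster with HT/SMT off; within margin: {within_margin(change)}")
```

Running the same calculation per image set, rather than on one combined total, is what lets a result like the 10980XE's (gains in three tests, a loss in the fourth) show up.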

Should Hyperthreading or SMT be turned on or off for RealityCapture?

It really depends on what processor you are using: for high core count chips, roughly 16 cores or more, turning HT and SMT off should slightly increase performance on average. On processors with fewer cores, leaving those features enabled should give better results.

Looking for a Photogrammetry Workstation?

Puget Systems offers a range of powerful and reliable systems that are tailor-made for your unique workflow.

Configure a System!

Tags: Intel 9th Gen, Intel X-series, AMD Ryzen 3rd Gen, Intel X-10000, AMD Threadripper 3rd Gen, photogrammetry, Core i9 9900K, Ryzen 9 3950X, Hyperthreading, SMT, RealityCapture
Nino Skupnjak

Interesting. Can you test 3950x (and other) with affinity set to only half of processors? I'm wondering what the result would be.

Posted on 2020-07-10 08:45:56

I'm not sure if I'm going to be putting more time into testing this direction or not, but if I do I will try to remember to post back here. However, I am pretty sure I can guess the results - and it would depend on which of the logical cores you set affinity to:

- If you set it to the 16 "real" cores (and none of the SMT cores) then performance should be the same as simply disabling SMT... or very close to it (background processes could still be hitting those SMT cores, which wouldn't happen when disabling it in the BIOS).

- If you set it to 8 real cores and their 8 matching SMT cores, then performance would be similar to an 8-core Ryzen processor (but at the clock speeds of the 3950X, and maybe a little better thanks to more cache memory).

- If you set it to some other, strange mix then performance could be a bit different. I don't think there would be any way to make it "better" than just the 16 real cores, though, whether using affinity or turning SMT off manually... and remember that with this processor (the 3950X) the SMT on/off difference was quite small anyway (<3%). The same principle should apply across all the CPUs that benefit from having SMT off, though: affinity would be another way to achieve the same result, but I personally find it sort of annoying to work with. I think one of my colleagues came up with a script for setting affinity, though; I may check with him to see how easy it was to set up.

Posted on 2020-07-10 17:31:55
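
The affinity options discussed in the reply above can be sketched as processor lists and bitmasks. This is not the script mentioned in that reply, only an illustration, and it rests on a loud ASSUMPTION: that the two hardware threads of each core are numbered adjacently (logical processors 0 and 1 share a core, 2 and 3 share the next, and so on). That numbering varies by platform and should be checked on the actual machine. All function names are hypothetical.

```python
def one_thread_per_core(physical_cores, threads_per_core=2):
    """Logical processor indices for the first hardware thread of each core.

    ASSUMPTION: SMT siblings are numbered adjacently (0/1, 2/3, ...).
    """
    return [core * threads_per_core for core in range(physical_cores)]


def cores_with_siblings(physical_cores, threads_per_core=2):
    """Logical processor indices for the first N cores plus their SMT siblings.

    Relies on the same adjacent-numbering ASSUMPTION as above.
    """
    return list(range(physical_cores * threads_per_core))


def affinity_mask(logical_indices):
    """Combine logical processor indices into a bitmask (bit i = processor i)."""
    mask = 0
    for index in logical_indices:
        mask |= 1 << index
    return mask


# Ryzen 9 3950X: 16 physical cores, 32 logical processors with SMT on.
real_cores_only = one_thread_per_core(16)
half_the_chip = cores_with_siblings(8)

print("16 real cores, no SMT siblings:", hex(affinity_mask(real_cores_only)))
print("8 real cores plus siblings:    ", hex(affinity_mask(half_the_chip)))
```

Under that numbering assumption, the first case gives the mask 0x55555555 and the second gives 0xffff. On Windows, the `start` command's `/affinity` option takes a hexadecimal mask of this kind, and the third-party psutil package's `Process.cpu_affinity()` accepts a list of processor indices directly. As the reply notes, background processes can still land on the unused logical processors, so this would only approximate turning SMT off in the BIOS.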