Agisoft PhotoScan Multi Core Performance
Written on September 11, 2015 by Matt Bach
When designing a computer there are literally thousands of different hardware components to choose from, and each one will have an impact on the overall performance of your system in some shape or form. Depending on the software you will be using, however, some components will simply be more important than others. In the case of PhotoScan, the two components that most affect performance are the CPU and the video card. In this article we want to answer the question: how do you know which CPU is best for PhotoScan?
Before even attempting to answer this question, it is important to understand the two most basic CPU specifications:
- The frequency is essentially how many operations a single CPU core can complete in a second (how fast it is).
- The number of cores is how many physical cores there are within a CPU (how many operations it can run simultaneously).
This doesn't take into account the differences between CPU architectures, but in an ideal world a CPU that has the same frequency but twice the number of cores would be exactly twice as fast. Unfortunately, making software utilize multiple cores (and do so effectively) is difficult in most situations and almost impossible in others. Add in the fact that higher core count CPUs tend to have lower operating frequencies and it becomes even more difficult to ensure that you are choosing the best possible CPU for your software.
In this article, we want to find out how well PhotoScan can utilize multiple cores - also known as multi-threading - to help determine what type of CPU (either one with a high frequency or a high core count) will give you the best possible performance. While PhotoScan can do a number of different tasks, for this article we are going to focus on the four steps PhotoScan goes through to convert a series of photographs into a 3D model:
- Align Photos
- Build Dense Cloud
- Build Mesh
- Build Texture
If you want to skip over our individual benchmark results and simply view our conclusions, feel free to jump ahead to the conclusion section.
For our test system, we used the following hardware:
|Motherboard:||Asus Z10PE-D8 WS|
|CPU:||2x Intel Xeon E5-2687W V3 3.1GHz Ten Core|
|RAM:||8x Kingston DDR4-2133 8GB ECC Reg.|
|GPU:||NVIDIA GeForce GTX Titan X 12GB|
|Hard Drive:||Samsung 850 Pro 512GB SATA 6Gb/s SSD|
|OS:||Windows 8.1 Pro 64-bit|
|PSU:||Antec HCP Platinum 1000W|
|Software:||Agisoft PhotoScan 1.1.6 build 2038 (64-bit)|
Since we want to determine how many CPU cores PhotoScan can effectively utilize, we used a pair of Xeon E5 2687W CPUs to give us 20 physical CPU cores with which to test. To see how well PhotoScan utilizes multiple CPU cores we will change the number of cores made available to the software by setting the affinity in Windows. This way we can accurately benchmark PhotoScan with anywhere from a single core to the full twenty cores possible with this setup. To help with consistency - and since the benchmarks we performed ran for several days - we programmed a custom script using AutoIt to start PhotoScan, set the CPU affinity, load the test file, time how long it takes each of the four steps to complete, close PhotoScan to clear any data from the system RAM, then loop while making more and more cores available.
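To illustrate how core counts map to affinity settings, here is a minimal Python sketch that builds the affinity bitmask for the first N logical processors and formats it as the hexadecimal value accepted by the Windows `start /affinity` option. This is not the AutoIt script used for our testing; the function names and the executable path are hypothetical, and the sketch assumes each logical processor corresponds to one physical core (Hyper-Threading off).

```python
def affinity_mask(core_count: int) -> int:
    """Bitmask with the lowest `core_count` bits set (one bit per logical processor)."""
    if core_count < 1:
        raise ValueError("core_count must be at least 1")
    return (1 << core_count) - 1


def start_command(core_count: int, exe_path: str) -> str:
    """Build a cmd.exe command line that launches `exe_path` pinned to the first N cores.

    `start /affinity` takes the mask as a hexadecimal number without a 0x prefix.
    The empty "" is the window title argument that `start` expects before a quoted path.
    """
    return f'start "" /affinity {affinity_mask(core_count):X} "{exe_path}"'


if __name__ == "__main__":
    # Hypothetical install path, used only for illustration.
    exe = r"C:\Program Files\Agisoft\PhotoScan Pro\photoscan.exe"
    for cores in (1, 4, 10, 20):
        print(cores, start_command(cores, exe))
```

Looping the core count from 1 to 20 produces the same progression of "more and more cores available" described above.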
To analyze the data, we will be presenting our results in terms of how long it took each action to complete with X number of cores compared to how long it took to complete with just a single core. From these results, we will then use Amdahl's Law to estimate the parallel efficiency for that action. 100% is perfect efficiency where a high core count CPU is ideal, but as the efficiency drops lower and lower having a high frequency CPU becomes more and more important.
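For reference, Amdahl's Law and the way a parallel efficiency can be backed out of a measured speedup can be sketched in a few lines of Python. This is a minimal illustration rather than the exact fitting procedure behind our charts: the function names are illustrative, and inverting a single measurement is a simplification of fitting across every core count tested.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Speedup over a single core when `parallel_fraction` of the work scales with cores."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)


def estimate_parallel_fraction(measured_speedup: float, cores: int) -> float:
    """Invert Amdahl's Law for one (cores, speedup) measurement.

    measured_speedup = single-core time / time with `cores` cores.
    Requires cores >= 2, since one core carries no scaling information.
    """
    if cores < 2:
        raise ValueError("need at least 2 cores to estimate a parallel fraction")
    return (1.0 - 1.0 / measured_speedup) / (1.0 - 1.0 / cores)


if __name__ == "__main__":
    for p in (0.9, 0.7, 0.5):
        print(p, [round(amdahl_speedup(p, n), 2) for n in (1, 2, 4, 10, 20)])
```

With a parallel fraction of 0.90, 20 cores work out to a speedup of about 6.9x; at 0.50 the same 20 cores give only about 1.9x, which is why a lower efficiency favors frequency over core count.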
One thing we want to make very clear is that our testing is really only 100% accurate for the photographs we used. We did find that the multi core efficiency doesn't change much based on the resolution or number of photos but if you want more accurate results for what you actually do in PhotoScan we recommend following our Estimating CPU Performance using Amdahls Law guide. It can be a time consuming process (even with automation the testing for this article took a significant amount of machine time) but it is really the only way to know for sure what the parallel efficiency is for what you do.
For our test data we used the Monument sample data that Agisoft has made available on their website. We found that this set of images was a good mix of being hard enough on the hardware to really test the multi core efficiency while also being relatively quick to finish (since we had to complete a large number of runs to give us the data we need). To ensure that our results will be accurate for larger data sets, we also did some spot testing using other projects including a larger data set provided to us from one of our customers. What we found is that while the resolution of the images doesn't influence the time it takes to align the photos, it does change the amount of time it takes to complete the other three steps at close to a 1:1 ratio. In addition, the number of images included in the data set also causes a linear increase in the amount of time it takes to complete all four steps (including aligning the photos). In other words, a data set with either twice the number of images or twice the MP count will take approximately twice the amount of time to complete. Further, if you have a data set with twice the number of images and twice the MP count it will have a roughly 4x longer build time.
What that means is that we have found the multi-core performance to be roughly linear so the CPU that is best for a large number of high res images will also be best for a smaller number of lower res images. The amount of actual build time will of course be different - and you will need more RAM for larger data sets - but a configuration that gives a 50% performance increase for a small data set should give a roughly 50% performance increase for a larger data set.
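The scaling rules above can be written down as a rough estimator. This is only a sketch of the approximately linear behavior we observed: the function name and the baseline timings in the example are hypothetical placeholders, not measurements from this article.

```python
def scale_build_times(baseline_seconds: dict, image_ratio: float, megapixel_ratio: float) -> dict:
    """Scale per-step build times to a data set of a different size.

    image_ratio     = new image count / baseline image count
    megapixel_ratio = new MP per image / baseline MP per image

    Assumes every step scales linearly with image count, and every step
    except "Align Photos" also scales linearly with resolution.
    """
    scaled = {}
    for step, seconds in baseline_seconds.items():
        factor = image_ratio
        if step != "Align Photos":
            factor *= megapixel_ratio
        scaled[step] = seconds * factor
    return scaled


if __name__ == "__main__":
    # Placeholder timings in seconds, for illustration only.
    baseline = {
        "Align Photos": 60.0,
        "Build Dense Cloud": 600.0,
        "Build Mesh": 120.0,
        "Build Texture": 30.0,
    }
    print(scale_build_times(baseline, image_ratio=2.0, megapixel_ratio=2.0))
```

Doubling both the image count and the resolution doubles the alignment time and roughly quadruples the other three steps.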
Depending on the settings you are using, aligning the photos should account for around 1% to 8% of the total time to generate a 3D model from start to finish. This means that having the absolute best CPU for this step is only moderately important at medium and high settings and will have almost no impact at ultra high.
In the graph above, the lines with dots are the actual speedup we recorded in our testing. The solid lines show the calculated efficiency we arrived at by using Amdahl's Law on the results. What is interesting is that for all of our testing in PhotoScan we saw a very distinct drop in performance at 11 cores (which is right when we started to utilize the second CPU). This is actually very normal, since as soon as a second CPU is utilized it introduces all the overhead associated with having to transfer data between the two CPUs. The good news is that PhotoScan recovers very quickly, to the point that after only about four more cores the performance is right on track with what we would expect.
Overall, with medium accuracy we saw about a 90% multi core efficiency and with high accuracy we saw a 92% multi core efficiency. This is fairly middle of the road for software that is able to utilize multiple CPU cores.
Build Dense Cloud
"Build Dense Cloud" is an interesting portion of PhotoScan as it is the one task that is GPU accelerated (so the performance of your GPU impacts the time it takes to complete) and is also the task that takes by far the longest amount of time to complete. Depending on the settings you want to use, this step can be anywhere from 50% of the total model generation time (on medium) to 96% if you use the ultra high settings. In other words, this one step is easily going to be the most influential one for choosing a CPU that will give you the best performance in PhotoScan.
You will notice that we didn't actually run 20 passes with the quality set to ultra high - it simply took way too long at that quality setting for it to be practical. However, we did enough spot testing that we are confident in our estimation of multi core efficiency.
At the medium quality settings, we saw an overall efficiency of about 68%. This increased to 70% when we raised the quality to high, but dropped down to 66% when we set the quality to ultra high. However, since "Build Dense Cloud" utilizes the GPU we did some additional testing to see how the number of video cards in the system impacted the multi core efficiency:
We only did this testing with the ultra high quality setting, but we found that for each GPU we added to the system the multi core efficiency increased by about 5%. We go into more detail on how well PhotoScan utilizes different models and numbers of video cards in our Agisoft PhotoScan GPU Acceleration article but what this means is that using more video cards actually improves how well PhotoScan is able to take advantage of additional CPU cores for this task. And since "Build Dense Cloud" can be anywhere from 50% to 96% of the total time it takes to generate a 3D model this will have a huge impact on deciding which CPU is the best for PhotoScan.
Building the mesh is a bit of an interesting step as at medium settings it accounts for about 30% of the total time to generate a 3D model. However, this drops off to 10% at high settings and only 2% at ultra high settings. This makes it a moderately important factor if you will be using medium settings, but not as big of a deal once you get to high or ultra high settings.
Like the other steps, we saw a drop in performance when we started to use the second CPU but it recovered very quickly. In the end, for building the mesh we saw a multi core efficiency of about 88% for medium settings and 85% for high settings.
Building the texture is the fourth and final step and accounts for anywhere from 8% (on medium) to 3% (on high) to just 1% (on ultra high) of the total time it takes to generate a 3D model in PhotoScan.
Oddly, the results for this step were a bit inconsistent with large spikes and dips in performance. However, we can still make a fairly accurate estimate of 50% for the multi core efficiency. This is honestly a pretty bad multi core efficiency and means that a high frequency CPU with a low core count will be best for this step. The one bit of good news here is that we didn't see any drop in performance when we started to use the second CPU like we did in the other three steps.
To summarize our results, here is the parallel efficiency we saw for each of the four steps at medium, high, and ultra high settings:
(higher is better - 100% is perfect)
|Step||Medium||High||Ultra High|
|Build Dense Cloud||68%*||70%*||66%*|
*Add ~5% for every additional GPU
This information is extremely useful to help us determine what CPU is best for PhotoScan, but to convert these numbers into a single overall multi core efficiency for PhotoScan at different quality settings we need to factor in how long it takes to run each of the four steps compared to the overall time it takes to generate a 3D model. When we do that, we get the following:
(higher is better - 100% is perfect)
|Quality setting||Medium||High||Ultra High|
|Overall parallel efficiency (single GPU)||73.7%||69.2%||66.5%|
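One way to fold per-step efficiencies into a single figure is a time-weighted average, sketched below in Python. Treat this as an assumption about the method rather than the exact calculation behind the table: under Amdahl's Law the weighting is exact when each step's share is its fraction of the single-core build time. The function name and the example shares are hypothetical.

```python
def overall_parallel_fraction(steps) -> float:
    """Combine per-step parallel fractions into one overall figure.

    `steps` is an iterable of (time_share, parallel_fraction) pairs, where
    time_share is the step's portion of the total single-core build time.
    Shares do not need to sum to 1; they are normalized here.
    """
    steps = list(steps)
    total_share = sum(share for share, _ in steps)
    if total_share <= 0:
        raise ValueError("time shares must sum to a positive value")
    return sum(share * fraction for share, fraction in steps) / total_share


if __name__ == "__main__":
    # Hypothetical shares and fractions, for illustration only.
    example = [
        (0.05, 0.90),  # Align Photos
        (0.80, 0.68),  # Build Dense Cloud
        (0.10, 0.88),  # Build Mesh
        (0.05, 0.50),  # Build Texture
    ]
    print(f"{overall_parallel_fraction(example):.3f}")
```

Because "Build Dense Cloud" dominates the total time at higher quality settings, the overall figure ends up close to that step's own efficiency.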
In Amdahl's Law terms, this means that only about 70% of the work PhotoScan does benefits from each core you add. This may sound pretty good at first, but in reality you start seeing diminishing returns for every CPU core you add after only about 4-6 CPU cores, which typically would make a lower core count, higher frequency CPU the better choice for PhotoScan. Normally, we would use this overall efficiency to tell you what the best CPU is for PhotoScan, but unfortunately this is where it starts to get complicated. If you read our Agisoft PhotoScan GPU Acceleration article you will see that the performance and number of video cards in the system make a huge impact on how long it takes to complete the "Build Dense Cloud" step. On top of higher-end video cards being ~35% faster than a mid-range card, we also found that adding a second GPU increases performance by 30-40%, adding a third GPU increases performance by another 20-25%, and adding a fourth GPU increases performance by another 12-15%.
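To get a feel for how those per-GPU gains stack up, the short Python sketch below compounds them into a cumulative range. It assumes each "another X%" is measured relative to the configuration with one fewer GPU, which is one reading of the figures above; the names are illustrative.

```python
# (low, high) fractional gain from adding the Nth GPU, taken from the figures above.
GPU_GAIN_RANGES = {
    2: (0.30, 0.40),
    3: (0.20, 0.25),
    4: (0.12, 0.15),
}


def cumulative_gpu_speedup(gpu_count: int) -> tuple:
    """Return (low, high) speedup of `gpu_count` GPUs relative to a single GPU.

    Assumes each gain compounds on top of the previous configuration.
    """
    if not 1 <= gpu_count <= 4:
        raise ValueError("gains are only listed for 1 to 4 GPUs")
    low = high = 1.0
    for n in range(2, gpu_count + 1):
        low_gain, high_gain = GPU_GAIN_RANGES[n]
        low *= 1.0 + low_gain
        high *= 1.0 + high_gain
    return low, high


if __name__ == "__main__":
    for gpus in range(1, 5):
        low, high = cumulative_gpu_speedup(gpus)
        print(f"{gpus} GPU(s): {low:.2f}x to {high:.2f}x")
```

Under that compounding assumption, four GPUs come out roughly 1.75x to 2.0x faster than one on this step.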
The amount of performance increase you can achieve by using multiple video cards ends up being incredibly significant and will actually be much more of a difference maker than having a CPU that best fits with the 70% multi core efficiency we measured. For most other software packages, this would simply mean that you should invest more into your video cards than your CPU but for PhotoScan it is a bit more complicated due to two factors:
- As you increase the number of GPUs, the multi core efficiency of the "Build Dense Cloud" step increases by about 5% per GPU
- For every physical GPU in the system, Agisoft recommends you disable one CPU core (done through Preferences -> OpenCL)
Since you need one core reserved per GPU and the multi core efficiency increases as you add more GPUs to the system this means that as you add video cards you also want to have a CPU with a higher core count in order to get the best overall performance. Since CPUs with higher core counts are more expensive, this means you have to balance purchasing multiple video cards (which is where your main performance gains will be) with a more expensive high core count CPU to match. Unfortunately, this makes it very difficult to say that one CPU is better than another.
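The interaction between those two factors can be sketched as follows. This is a simplified model, not a benchmark result: it reserves one core per GPU, raises the "Build Dense Cloud" parallel fraction by 5 points per additional GPU starting from the 66% we measured at ultra high with a single GPU, and applies Amdahl's Law to the remaining cores. It models only the CPU side of the step, and the function names are hypothetical.

```python
def usable_cores(total_cores: int, gpu_count: int) -> int:
    """Cores left for CPU work after reserving one core per physical GPU."""
    remaining = total_cores - gpu_count
    if remaining < 1:
        raise ValueError("not enough cores left after reserving one per GPU")
    return remaining


def dense_cloud_parallel_fraction(gpu_count: int, base: float = 0.66, per_gpu: float = 0.05) -> float:
    """Parallel fraction for "Build Dense Cloud": `base` with one GPU, +`per_gpu` per extra GPU."""
    if gpu_count < 1:
        raise ValueError("gpu_count must be at least 1")
    return min(base + per_gpu * (gpu_count - 1), 1.0)


def dense_cloud_cpu_speedup(total_cores: int, gpu_count: int) -> float:
    """Amdahl's Law speedup over one core for the CPU portion of the step."""
    cores = usable_cores(total_cores, gpu_count)
    p = dense_cloud_parallel_fraction(gpu_count)
    return 1.0 / ((1.0 - p) + p / cores)


if __name__ == "__main__":
    for total in (6, 10, 20):
        for gpus in (1, 2, 4):
            print(total, gpus, round(dense_cloud_cpu_speedup(total, gpus), 2))
```

Comparing rows with the same GPU count shows how much more a higher core count CPU contributes once several video cards are installed.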
We came up with two ways to help you make sure that you have the correct combination of CPU and GPU to give you the absolute best performance in PhotoScan for your budget. The first is a Google doc where we used both the results from this testing and our Video Card article to come up with an estimate of approximately how long it should take different combinations of CPU and video cards to build a theoretical 3D model. If you want to play around with that, it is available at:
You will need to make a copy of the sheet (through File -> Make a Copy) before you can edit anything, but this gives you the freedom to input different CPU options (although the ones we have in there should be among the best choices) as well as what your budget is. Based on that, it will highlight the CPU/GPU combinations that fall within your budget so that you can look through them to find the one with the lowest build time.
If you are configuring a system for PhotoScan, we recommend also reading our Agisoft PhotoScan GPU Acceleration article. Alternatively, if you don't want to wade through all those different results, we also came up with three different recommended systems based on whether you have the budget for a dual, triple, or quad GPU system. These systems have a few CPU and video card options but they were designed so that you can't possibly make a bad decision. If you choose a more expensive CPU or GPU option, you will see a decrease in the time it takes to build a 3D model in PhotoScan:
Recommended Systems for Agisoft PhotoScan
Compact Dual GPU
Ideal for budgets of $4,000+
Mid Tower Triple GPU
Increased performance for $6,000+
Full Tower Quad GPU
Maximum performance for $11,000+