
Read this article at https://www.pugetsystems.com/guides/1604
Matt Bach (Senior Puget Labs Technician)

Real world benchmarks are about more than just competing for big numbers

Written on November 8, 2019 by Matt Bach

Over the last year, we have been hard at work improving, polishing, and publicly releasing our internal benchmarks, which cover a range of applications in the content creation, engineering, and scientific computing fields. In fact, while most of them still carry a "BETA" label, eight of our benchmarks are already available for you to download and run right now!

But why are we spending so much effort on this project? After all, this kind of development takes a significant amount of time and is often much harder to do than you might realize since applications like Photoshop, Premiere Pro, DaVinci Resolve, etc. are not made to be used in this manner. They are developed to do the job that they are designed for, not for us (or anyone else) to do performance testing with.

Why are real-world benchmarks necessary?

While real-world benchmarks are important for many reasons, three in particular make us feel it is worth the investment of time and money to develop them:

1) They provide a standardized and accurate method to evaluate hardware performance

While this is a somewhat obvious use for a benchmark, it is surprising how often we get asked why we go through all this effort when we could just use Cinebench, Geekbench, or the plethora of available game benchmarks. To be fair, Cinebench is a great benchmark for evaluating CPU-based rendering performance in Cinema 4D, but even it means almost nothing if the software you actually use is Adobe Photoshop.

Even between applications that are made by the same company, the way they use the CPU, GPU, and other hardware can be significantly different. For example, the best CPU for After Effects is not at all the best CPU for Premiere Pro. If similar applications like these are so different, you can imagine how inaccurate a game benchmark can be for anything beyond other games that use the same engine.

With the sort of targeted benchmarks we are creating, you can get a much more accurate idea of how different hardware will perform in the real world, which in turn helps people get the exact right hardware for their workflow. Of course, the fact that you can do a million different things in most applications means that even our benchmarks won't be 100% accurate for everyone, but it is vastly better than making decisions based on a completely unrelated benchmark.

2) They allow us to democratize hardware testing

While we believe that the testing and hardware evaluation we do is incredibly useful, we cannot feasibly test everything. Just as an example, since we do not sell laptops it is hard for us to spend the effort and funds to benchmark laptops with various applications. It simply isn't something that we can justify financially.

However, there are a ton of hardware reviewers out there who just need a little bit of help to get beyond gaming and synthetic benchmarks. In the end, part of our mission as a company is to help empower creators and that extends beyond our direct customer base. If our real-world benchmarks can indirectly help people get the exact right laptop for their workflow, that is definitely something we want to do!

3) They greatly improve the troubleshooting process

If you are having performance or other issues in an application like Photoshop, it could be caused by several factors including hardware, software, workflow, or simply your expectations not fitting reality. A standardized benchmark like the ones we are developing allows you to greatly reduce the number of factors that could be causing the problem.

For example, if Photoshop is slower than you expect on a new system but our benchmark gives you a result that is right where we would expect it to be, you can be reasonably confident that your system and Photoshop itself are working properly and that the issue is either an expectation or a workflow problem. On the other hand, if the scores are way off, you know that your installation of Photoshop has a problem or something is broken with the system itself.
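As a rough illustration of that decision, here is a minimal Python sketch. The function name `triage_benchmark`, the 10% tolerance, and the example scores are all assumptions made for illustration. They are not values taken from our benchmarks or from our support process.

```python
def triage_benchmark(measured: float, expected: float, tolerance: float = 0.10) -> str:
    """Suggest where to look first when an application feels slow.

    `tolerance` is a hypothetical cutoff (10% by default), not an official one.
    """
    if expected <= 0:
        raise ValueError("expected score must be positive")
    # Relative gap between the measured score and the expected score.
    deviation = abs(measured - expected) / expected
    if deviation <= tolerance:
        # Score is about where it should be: hardware and install look healthy.
        return "workflow or expectations"
    # Score is way off: suspect the application install or the system itself.
    return "application install or system"


# Hypothetical scores, purely for illustration.
print(triage_benchmark(measured=980, expected=1000))
print(triage_benchmark(measured=650, expected=1000))
```

In practice the expected score would have to come from a comparable hardware configuration, and a sensible tolerance would depend on how much a given benchmark varies from run to run.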

This is something our support department is starting to utilize more and more as our benchmarks become available to the public, and it can dramatically reduce the time to resolution when a customer is having an issue with their system. And of course, it extends to everyone, not just our own customers.

What is coming in the future for our benchmarks?

While the benchmarks we already have available are great, there are a couple of projects coming up that we are very excited about.

First, we are exploring releasing a paid version of our benchmarks for commercial use. We do not anticipate removing the free versions of our benchmarks, but a paid version gives us the financial freedom to include features that are useful for hardware reviewers, computer manufacturers, and various other commercial uses. These features include the ability to run from the command line, generate log files, and more official support. This is a big step towards democratizing hardware testing as it will give any reviewer the ability to quickly, easily, and effectively test various hardware configurations in real-world applications.

Another project that is coming up is the ability to upload and browse results. This is a large and complex project, but it will allow individual users to directly compare their system to a range of other hardware configurations. As we mentioned in the last section, we cannot feasibly test every hardware combination out there, but this should go a long way towards mitigating that. While we are not 100% sure what this will look like, the results browser for OctaneBench is the type of system we imagine.

Overall, we are very excited to continue to develop our benchmarks. We feel that hardware reviews are often focused too heavily on gaming or synthetic benchmarks, and we want to do our part to help fix that. As always, if you have any suggestions or feedback on the work we are doing, let us know in the comments!

Tags: benchmarks
Nikolas Kanellopoulos

You are doing excellent work, and thank you for publishing the benchmarks! But what about the Nuke users? Do you have any plans for us? Thanks in advance!

Posted on 2019-11-08 23:12:43

Nuke, modo, and a bunch of others are on the list, but we're not sure when or if we will get to them. Making benchmarks that are real world requires us to have not only a deep technical understanding of the software, but also an understanding of what people are actually doing in their day to day work. Not to mention figuring out how to get the application to actually be used as a reliable benchmark in the first place! That is a very large undertaking that we have to be able to financially justify.

Even with the benchmarks we have now, there is still a lot of work to be done. If we do get to Nuke, it unfortunately likely won't be for several years yet. But it is on the list of potential applications to take on!

Posted on 2019-11-08 23:18:28
Nikolas Kanellopoulos

I understand the complexity of the project; that's why we count on you. There are so many articles about building Nuke PCs, but nobody mentions the real performance differences between two CPUs (what Foundry suggests), a high core count i9, and a high clock speed chip like the 9900K...

Posted on 2019-11-08 23:37:24
Jan Dorniak

If you want to upload benchmark results, why not tie into the Phoronix Test Suite and upload to OpenBenchmarking.org? I acknowledge this might not mesh with your business interests but give it a thought. Also, while they advocate Linux benchmarks, it works on Windows as well.

Posted on 2019-11-09 14:12:38
Dave Sang

Thank you for your effort here!

Posted on 2019-11-11 20:31:46
Fabio Bernardino

I'm curious why most of your systems use Crucial DDR4-2666 instead of faster memory. The Lightroom workstation, for example, uses a Ryzen 3900X, which in theory should perform at its best with 3600MHz memory. Is that choice due to the reliability of the Crucial modules and the stability of the whole system?
Thank you.

Posted on 2019-11-13 15:54:53
jaybates86

+1 to this. I'm curious about the 2666 RAM as well.

Posted on 2019-11-13 16:37:52

Hey Fabio, you are spot on that it is about stability. Our plan is to move to DDR4-2933 once there are 16GB (and hopefully 32GB) modules that follow the JEDEC specs, but we really avoid using RAM that is beyond those JEDEC specs in our workstations since we value reliability over a few percent higher performance. Most of our upcoming articles are actually going to be using 2933MHz for Ryzen in anticipation of this move, but I'm not sure when those sticks will be available for us to purchase.

Something to note is that 3600MHz is way outside what AMD has certified for Ryzen. It can be faster depending on the workload, but the spec for 3rd Gen Ryzen is 2933MHz max if using four sticks or 3200MHz if you are only using two sticks. We did a post about how RAM speed affects Video Editing (https://www.pugetsystems.co...) and while we didn't cover it in that post, we saw significantly more benchmark/application crashes with 3200MHz RAM - likely because using four sticks like we did is beyond spec - and 3600MHz was even worse. If I remember correctly, at one point we had to run our Premiere Pro benchmark 6-7 times just to get it to complete without crashing!

Of course, if you are building your own computer it is completely up to you as to how far you want to push the RAM speed. Our customers are typically not the kind of people that ever want to fiddle with their system, but if you don't mind a tweak here or there, go for it!

Posted on 2019-11-13 17:27:19

Just to add some more detail to what Matt said, and because we get asked about this so often, here are AMD's official supported memory speeds for the 3rd Gen Ryzen processors:

2 modules: DDR4-3200
4 modules (single rank): DDR4-2933
4 modules (dual rank): DDR4-2666

Currently DDR4-3200 is not available in JEDEC spec'd memory or capacities larger than 16GB, so that would impose a 32GB limit on systems that want the best performance within AMD's official supported speeds. For those who want to max out memory, all of the 32GB modules I've seen are 2666MHz (and usually dual rank too, I think), so to get the largest capacity you have to be on the lower end of the spectrum for RAM speeds.

It's a mess, unfortunately, and one that has been compounded by AMD repeatedly using memory speeds *outside of their own specs* in their marketing presentations - leading the public to think that those speeds are within spec, and thus wondering why we aren't testing with them.
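For anyone who wants to check a planned build against the speeds listed above, here is a minimal Python sketch of that lookup. The function names are made up for illustration, and the numbers are simply the three entries from the list. It does not account for JEDEC versus XMP profiles, module availability, or anything else beyond that list.

```python
def max_supported_speed(modules: int, dual_rank: bool = False) -> int:
    """Return the supported DDR4 speed (in MT/s) from the list above.

    The list only distinguishes single and dual rank for four modules,
    so `dual_rank` is ignored when `modules` is 2.
    """
    if modules == 2:
        return 3200
    if modules == 4:
        return 2666 if dual_rank else 2933
    raise ValueError("only 2 or 4 modules are covered by the list above")


def is_within_spec(speed: int, modules: int, dual_rank: bool = False) -> bool:
    """True if `speed` does not exceed the supported speed for this layout."""
    return speed <= max_supported_speed(modules, dual_rank)


# Four single-rank sticks of DDR4-3600 are beyond the listed speeds,
# while four dual-rank sticks of DDR4-2666 are within them.
print(is_within_spec(3600, modules=4))
print(is_within_spec(2666, modules=4, dual_rank=True))
```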

Posted on 2019-11-13 17:48:36
Fabio Bernardino

Thank you for the quick replies. When looking for a motherboard (I'm in Brazil) and checking the QVL memory modules, I noticed that the real speed of some of the fastest modules was only DDR4-2133, and a lot of them were really DDR4-2400 or DDR4-2666 (probably overclocked by the Intel XMP profile).
Actually, only a few of the fastest modules were real DDR4-2933.
I think AMD and YouTubers created a lot of confusion about the Infinity Fabric. It's more arguing than finding a solution to get the work done.
Your comments made the issue simple to understand.
Congratulations on your machines and the publications.

Posted on 2019-11-13 19:14:55

I'm glad we could help! Best of luck with your system build :)

Posted on 2019-11-13 19:40:54

Excellent article; it is very sad that you actually have to write what should be common sense. Every time a new processor surfaces, be it x86-64 or ARM, the first "benchmarks" that tend to hit the media are Geekbench results. Makes me cringe how much faith is put into that one, useless metric.

Posted on 2019-11-13 20:01:41

Synthetic benchmarks have their place - we use them in our production process to make sure the CPU/GPU/etc. are performing as they should in a general sense because they have very little variance between runs. But I completely agree that using them as an actual metric of real world performance is going to be inaccurate more often than it is accurate.
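As a rough way to put a number on "variance between runs", one option is the coefficient of variation (standard deviation divided by the mean) over repeated runs. The sketch below is only an illustration: the function name and both sets of scores are invented, and it is not how our production testing is actually implemented.

```python
from statistics import mean, stdev


def coefficient_of_variation(scores: list[float]) -> float:
    """Sample standard deviation divided by the mean (lower means more repeatable)."""
    if len(scores) < 2:
        raise ValueError("need at least two runs to measure variation")
    return stdev(scores) / mean(scores)


# Invented scores for illustration only.
steady_runs = [1000, 1004, 998, 1001]   # tight spread between runs
noisy_runs = [1000, 930, 1075, 960]     # wide spread between runs

print(f"steady: {coefficient_of_variation(steady_runs):.2%}")
print(f"noisy:  {coefficient_of_variation(noisy_runs):.2%}")
```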

Posted on 2019-11-13 20:06:41