Over the past few months, and several hardware releases, I have learned a lot about how Unreal Engine and my test scenes behave on different hardware. On top of that, the release of 4.26 brought a major update that changed how the Movie Render Queue functions and added GPU Lightmass baking. All of this has led to some changes in how we'll be benchmarking going forward. I'll go over the new plans, and welcome any additional feedback.
The first test scene we used is an ArchViz test file created by Epic. This scene is a fantastic example of an architectural workflow, featuring high-resolution materials and relying heavily on ray tracing features. The main problem with this scene is that when running at 4K with ray tracing enabled, it uses over 18GB of VRAM, which would cause any system with less than 16GB of VRAM to crash. Even cards with enough VRAM would at best manage about 3 frames per second, and at such a low frame rate the internal benchmark script would actually fail. The script is designed to run for three minutes, count how many frames were rendered, and then calculate the average FPS. However, if the FPS dropped too low, the run would take longer than three minutes and always return an average of 2.5 FPS.
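To illustrate how a frame-counting loop like that behaves at very low frame rates, here is a minimal sketch. This is my own reconstruction, not Epic's actual script; the function name and parameters are hypothetical. The key detail is that the clock is only checked between frames, so slow frames push the run past its time budget, and the reported average collapses to the card's raw frame rate regardless of how long the run actually takes.

```python
def average_fps(frame_times_s, budget_s=180.0):
    """Count whole frames until the time budget is spent, then report
    frames / actual elapsed time.

    Hypothetical reconstruction of a benchmark timing loop. The clock is
    only checked between frames, so slow frames can stretch the run well
    past budget_s (three minutes by default).
    """
    elapsed = 0.0
    frames = 0
    for t in frame_times_s:
        if elapsed >= budget_s:   # budget spent -> stop counting
            break
        elapsed += t
        frames += 1
    return frames / elapsed

# A fast card (~16.7 ms per frame) reports ~60 FPS;
# a card struggling at 400 ms per frame reports only 2.5 FPS.
print(round(average_fps([1 / 60] * 11000), 1))  # → 60.0
print(round(average_fps([0.4] * 500), 1))       # → 2.5
```

With 400 ms frames the loop overruns its three-minute budget and the average pins at 2.5 FPS, which is roughly the floor described above.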
So what is the solution? Instead of editing every material, object, and rendering setting, I'm going to shift the focus to rasterized rendering and add the new GPU Lightmass baking. This is still a very standard ArchViz workflow. GPU Lightmass produces excellent results, coming close to ray-traced quality while delivering rasterized performance, and many people across various industries are really looking forward to this tool.
The next scenes are Megascans Apartment and Goddess Temple, created by Quixel and Epic. These sit somewhere between what you might see in a higher-end game, a virtual production, or a cinematic render. They were previously only used for ray tracing testing, but now I'm going to add a rasterized pass to them as well, to get another data point for rasterized performance. The other scene I previously used for rasterized testing is the Virtual Studio; however, that scene is basic enough that it is often CPU limited, which skews the results.
Also with these scenes, I'd like to reintroduce the Cinematic Rendering test. Previously, the Apartment scene would take roughly 3:30 for any video card to render the cinematic, while the Temple scene would crash any card with less than 16GB of VRAM. I'm hoping the improvements in 4.26 will make this a useful test again.
Lastly, there is the Virtual Studio. This is a fairly basic scene representative of what you might see in a news or sports broadcast. The biggest problem here is that with ray tracing off, it easily tops 200 FPS and becomes bottlenecked by the CPU, which makes all video cards report virtually the same FPS.
So I'm going to drop the rasterized tests and only run this scene with ray tracing enabled. Depending on how long the overall test takes, I may add a GPU Lightmass bake here as well. This scene may also be on the chopping block altogether. I originally included it because it is set up to accept an NDI video input with a live greenscreen; however, that turned out to add very little overhead.
In the end, there will be three rasterized passes, three ray tracing passes, two cinematic renders, and one or two GPU Lightmass bakes. These changes should give much more accurate and reliable results in our testing.
This only covers the GPU portion. There is still some work to be done on the CPU portion of the benchmark. Some of it, such as code compiling, takes a significant amount of time on lower-end CPUs, so some decisions need to be made about what is worth testing. CPU testing is also a little complicated because, depending on your workflow, the CPU can be a major bottleneck or something you barely run into.