Read this article at https://www.pugetsystems.com/guides/1102
Matt Bach (Senior Puget Labs Technician)

The thin line between stress testing and hardware abuse

Written on January 30, 2018 by Matt Bach

Over the last month, we have been streamlining and updating our benchmarking and testing process to make it both more efficient and much more effective. As part of this process, the topic of stress testing has come up several times. Stress testing is something we actively perform on all our systems because we want to catch any bad hardware in-house long before the system reaches the customer. In addition, if a component is mostly fine but has a slight flaw that will cause it to fail in a few months, we want to cause that failure to happen for us during the production process so we can fix it with minimal impact on the end user.

Currently, we do this stress testing with a number of applications, including Linpack, Prime95, and Furmark. So far, this method has been very effective. Over the last year, 70% of all video card failures and nearly 80% of all CPU failures were caught in-house long before the system reached the customer. The end result is significantly fewer customers who need to deal with a hardware failure (and the support/repair process that goes along with it), which is a huge win in our books.

However, there are a number of more useful projects we could participate in to stress test our systems, such as Folding@Home, SETI@Home, or even cryptocurrency mining (donating the proceeds to charity). Each of these not only stresses the hardware in the system, but does so in a productive and useful manner. The idea we have kicked around is that anytime a machine is idle for more than a short period of time - which is often the case when in queue for quality control or shipping - we would have it set to run a combination of these useful applications to put an extended heavy load on the system.
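
As a rough illustration of that idea, here is a minimal Python sketch of the decision a machine might make before starting one of these workloads. Everything in it is hypothetical: the `LoadPolicy` name, the `should_run_load` function, the 10-minute idle threshold, and the 8-hour cap are assumptions made for the example, not values or tooling we actually use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoadPolicy:
    # Both defaults are illustrative assumptions, not figures from this article.
    idle_threshold_s: float = 600.0  # how long a machine must sit idle first
    max_load_hours: float = 8.0      # cap on cumulative load time per machine


def should_run_load(idle_s: float, load_hours_so_far: float,
                    policy: LoadPolicy = LoadPolicy()) -> bool:
    """Return True if the machine has idled long enough and is under its cap."""
    idle_long_enough = idle_s >= policy.idle_threshold_s
    under_cap = load_hours_so_far < policy.max_load_hours
    return idle_long_enough and under_cap
```

With these assumed defaults, a machine that has been idle for 15 minutes and has 2 hours of load behind it would start the workload, while one that has already reached the 8-hour cap would not. The cap is there to keep the total load time bounded rather than open-ended.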

The concern we have had is whether it would actually be too effective of a stress test. Is there a line where hardware testing becomes hardware abuse? Yes, we want to cause hardware to fail if it is close to doing so, but stress testing for an excessive amount of time could potentially shorten the lifespan of the system. This is a tricky balance to find. We feel very good about the balance we have right now because we have hard data on the long-term in-house and in-field failure rates of our PCs. On the one hand, "if it isn't broken, don't fix it," but on the other hand, we might do even better with a change to our processes. What do you think? This is something we are very interested in hearing feedback on in the comments.

Tags: Benchmarking, Testing, Stress Test
Drac

Putting a system under stress is fine as long as it is not destructive - like running the system hot! I imagine there are ways to break in a system without doing something that will shorten its life.

Posted on 2018-02-01 03:51:23
MARC

I would say 4 to 12 Hours. I believe you only stress test now for under an hour within the OS? This is not including tests performed on parts before / during assembly of course.

Posted on 2018-02-09 04:30:33
AC

Instead of Furmark, Folding@Home + Einstein@Home (heavy on PCIe bandwidth). Mining is usually less intensive.
For CPU, perhaps Asus Realbench or Handbrake HEVC / x265 encoding should be used.

Posted on 2018-03-01 06:50:38
Mariem Martin Buenaventura

Is there a free software to stress test a computer?

Posted on 2018-03-26 06:32:08

Yes, and a few are listed (named and linked) in this blog post.

Posted on 2018-03-26 18:39:50