
Advantages of ECC Memory

Written on November 5, 2013 by Matt Bach
Table of Contents:
  1. What is ECC?
  2. What about Registered Memory?
  3. ECC Failure Rate Analysis
  4. Downsides of ECC RAM
  5. Conclusion

What is ECC?

ECC (which stands for Error Correction Code) RAM is very popular in servers or other systems with high-value data as it protects against data corruption by automatically detecting and correcting memory errors. Standard RAM uses banks of eight memory chips in which data is stored and provided to the CPU on demand. ECC RAM is different as it has an additional memory chip which provides both error detection and correction for the other eight RAM chips.

Prior to ECC memory, error detection was done via even or odd parity bits. In a computer, data is most commonly stored in 8-bit chunks. When parity is being used, an additional ninth bit - or parity bit - is written which allows the system to detect when there is an error. If the system uses even parity, then the 1's and 0's (including the additional parity bit) should add up to an even number. For example, if the data written to the RAM is "10011011", which contains five 1's, a 1 would be added as the parity bit so that when you add up the numbers (1+0+0+1+1+0+1+1+1), you get an even number. If an error were to occur and the data the RAM sends to the system is instead "10011001+1" (which adds up to an odd number), then the system knows that the data is corrupt.
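The parity example can be sketched in a few lines of Python. This is only an illustration of the arithmetic; the function names are made up for this sketch, and real parity checking is done in hardware.

```python
def even_parity_bit(data: str) -> int:
    """Return the parity bit that makes the total number of 1s even."""
    return data.count("1") % 2


def passes_even_parity(data: str, parity: int) -> bool:
    """True when the data bits plus the parity bit contain an even number of 1s."""
    return (data.count("1") + parity) % 2 == 0


stored = "10011011"
parity = even_parity_bit(stored)  # the data holds five 1s, so the parity bit is 1

read_ok = passes_even_parity(stored, parity)       # data read back unchanged
read_bad = passes_even_parity("10011001", parity)  # one bit flipped
read_two = passes_even_parity("10011000", parity)  # two bits flipped

print(parity, read_ok, read_bad, read_two)
```

The single flipped bit fails the check, but the case with two flipped bits passes it, and in neither case does parity say which bit changed. Parity can detect an odd number of bit errors; it cannot correct anything.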

ECC is an extension of parity: it uses multiple parity bits assigned to larger chunks of data to not only detect single-bit errors, but correct them automatically as well. Instead of a single parity bit for every 8 bits of data, ECC uses a 7-bit code that is automatically generated for every 64 bits of data stored in the RAM. (In practice, ECC modules typically store an eighth check bit as well, which allows double-bit errors to be detected, though not corrected.) When the 64 bits of data are read by the system, a second 7-bit code is generated, then compared to the original 7-bit code. If the codes match, then the data is free of errors. If the codes do not match, the system can determine where the error is and fix it by comparing the two 7-bit codes.

The two codes are most commonly generated and compared using what is called a Hamming code. Warning: only attempt to understand the Hamming code if you really, really like math.
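For the curious, here is a scaled-down sketch of how comparing a stored check code with a freshly generated one can point at the flipped bit. It uses the textbook Hamming(7,4) layout, which protects 4 data bits with 3 check bits, rather than the 64 data bits of a real memory word. The function names are invented for this example, and it is not the circuit any particular memory controller implements.

```python
# Codeword positions (1-based) that hold the four data bits. Positions 1, 2
# and 4 hold the check bits, as in the textbook Hamming(7,4) layout.
DATA_POSITIONS = [3, 5, 6, 7]


def make_code(data):
    """Generate the 3-bit check code for 4 data bits."""
    d3, d5, d6, d7 = data
    return [d3 ^ d5 ^ d7,  # check bit 1 covers positions 3, 5, 7
            d3 ^ d6 ^ d7,  # check bit 2 covers positions 3, 6, 7
            d5 ^ d6 ^ d7]  # check bit 4 covers positions 5, 6, 7


def read_and_correct(data, stored_code):
    """Regenerate the code on read, compare it with the stored code,
    and flip back a single corrupted data bit if one is found."""
    fresh_code = make_code(data)
    diff = [a ^ b for a, b in zip(stored_code, fresh_code)]
    position = diff[0] * 1 + diff[1] * 2 + diff[2] * 4  # 0 means the codes match
    corrected = list(data)
    if position in DATA_POSITIONS:
        corrected[DATA_POSITIONS.index(position)] ^= 1
    return corrected, position


original = [1, 0, 1, 1]
code = make_code(original)      # written to memory alongside the data

damaged = [1, 0, 0, 1]          # third data bit flipped while stored
repaired, where = read_and_correct(damaged, code)
print(repaired == original, where)
```

The difference between the two codes, read as a binary number, is the position of the bad bit in the codeword. A result of 1, 2 or 4 would mean one of the stored check bits was the one that flipped, in which case the data itself needs no repair. This sketch assumes at most one bit has flipped.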

What about Registered Memory?

Registered (often referred to as "buffered") memory uses a technology that is often paired with, but not directly related to, ECC RAM. Registered memory has a "register" that resides between the RAM and the system's memory controller which lessens the load that is placed on the memory controller itself. This allows for more memory modules to be used at one time than would otherwise be possible.

While ECC RAM is not always Registered (since you may need the error correction of ECC without the large quantities made possible by Registered memory), almost all Registered memory will be ECC. This is simply due to the fact that systems that use large amounts of memory are almost always going to prioritize stability as well.

ECC Failure Rate Analysis

ECC RAM is theoretically more stable and reliable than standard RAM, but many times theory does not match up with fact. To see if ECC RAM really is more reliable, we looked up our failure rates for ECC and non-ECC RAM over the past 3 years.

One thing to note is that while we have tried many different brands of memory over the years, we have always returned to Kingston due to their consistently lower failure rates - up to 6x better in some cases! Because of this, we decided to include only Kingston desktop/server memory in our failure rate analysis. Including other brands makes ECC RAM look even better, but we feel that comparing within a single brand is a much more realistic comparison.

As the graph above shows, ECC RAM has a much lower failure rate than non-ECC RAM. The ~1% failure rate of the Kingston non-ECC RAM is still very, very good (which is why we primarily use Kingston), but the ECC RAM is even better at an average 0.24% failure rate.

One thing to notice is that over the past three years, Kingston RAM has become even more reliable. This is true for both ECC and non-ECC RAM, and we are currently at the point where we have not had a single stick of ECC RAM fail this year.

While a lower failure rate is certainly great, it is worth investigating a little further to determine what the cause of each failure was. Memory errors or system instability are much worse than a simple no-POST failure. A faulty stick of RAM causing the system to not POST is an inconvenience, but is very unlikely to affect the data stored on the system. Memory errors, on the other hand, are much more likely to corrupt data if left unchecked.


The incredible thing about the graphs above is that over the past three years, we have not had a single case of memory errors or system instability caused by ECC RAM. Every single failure was due to either no POST or the system rebooting when we tested the memory for errors. While the rebooting issue is not ideal, the 25% reboot failure actually adds up to only 2 sticks ever with that specific problem, and both were all the way back in 2011.

The failures for non-ECC RAM, on the other hand, are overwhelmingly caused by memory errors. In fact, only 9% of the failures (no POST, other/misc, and incorrect size/speed) were the type of failure that would not put your data at risk. The other 91% of failures were the type that you absolutely do not want to see in a server or other system that contains valuable data.

One thing we do want to make clear is that although non-ECC RAM currently has about a 1% failure rate, the testing we perform on all of our systems catches the majority of the issues. In the field, the failure rate for non-ECC Kingston RAM is only about 0.4%, or roughly one stick for every 250 sticks we sell. So while ECC RAM is certainly important for servers and systems with high-value data, non-ECC RAM is more than stable enough for use in most home or work systems.
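To make the percentage-to-sticks conversion explicit, here is a small sketch. The helper name is invented for this example. Note that the two rates are not measured the same way: the non-ECC figure is the in-the-field rate, while the ECC figure is the overall average quoted earlier.

```python
def sticks_per_failure(failure_rate_percent: float) -> float:
    """Convert a percentage failure rate into 'one failure per N sticks'."""
    return 100 / failure_rate_percent


non_ecc_field = sticks_per_failure(0.4)  # field failure rate for non-ECC RAM
ecc_overall = sticks_per_failure(0.24)   # average failure rate for ECC RAM

print(f"non-ECC (field): about 1 in {non_ecc_field:.0f}")
print(f"ECC (average):   about 1 in {ecc_overall:.0f}")
```

A 0.4% rate means 0.4 failures per 100 sticks, which is the same as roughly 1 in 250; the 0.24% ECC rate works out to roughly 1 in 417.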

Downsides of ECC RAM

ECC is designed to be more stable than traditional RAM, and our failure records show that this is indeed the case. However, there are a few downsides to using ECC RAM. The first, and most obvious, is that not every computer can use ECC memory. Most server and workstation motherboards require ECC RAM, but the majority of desktop systems either won't work at all with ECC RAM or will have the ECC functionality disabled.

Second, due to the additional memory chip and the inherently more complex nature of ECC RAM, it costs more than non-ECC RAM. The amount varies, but you should expect to pay roughly 10-20% more depending on the size of the memory stick. The larger the stick, the higher the price premium.

Finally, ECC RAM is slightly slower than non-ECC RAM. Many memory manufacturers say that ECC RAM will be roughly 2% slower than standard RAM due to the additional time it takes for the system to check for any memory errors. To verify this, we examined multiple benchmarks that we run on each system we produce. By using comparable CPUs (for example: Intel Core i7 4771 3.5GHz Quad Core 8MB versus Intel Xeon E3-1275 V3 3.5GHz Quad Core 8MB), we found this 2% estimate to be roughly correct. Our own benchmarks showed a performance hit ranging from 0.72% to 2.2% which, given normal testing deviations, is right in line with the 2% estimate.

Conclusion

If you have a server or system with high-value data where system stability is of utmost importance, these few drawbacks are very likely not even close to being an issue. The cost of RAM has come down so much recently that even a 20% increase in price only equates to about $10 per stick, which in a server environment is a very worthwhile investment. As for the performance decrease, 2% is such a small amount that it is likely never going to be perceptible outside of performance benchmarks.
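As a rough illustration of that price arithmetic: a 20% premium works out to $10 only on a stick that would otherwise cost about $50. The $50 base price below is inferred from those two figures and is not a quoted price; the function name is likewise just for this sketch.

```python
def ecc_premium(base_price: float, premium_percent: float) -> float:
    """Extra cost of an ECC stick over a non-ECC stick at the given base price."""
    return base_price * premium_percent / 100


base_price = 50  # assumed non-ECC price in dollars, inferred from "20% is about $10"
low = ecc_premium(base_price, 10)   # low end of the 10-20% premium range
high = ecc_premium(base_price, 20)  # high end of the range

print(f"ECC premium on a ${base_price} stick: ${low:.2f} to ${high:.2f}")
```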

At the cost of a little money and performance, ECC RAM is many times more reliable than non-ECC RAM. And when high-value data is involved, that increase in reliability is almost always going to be worth the small monetary and performance costs. In fact, anytime it is possible to do so, we would recommend using ECC RAM.

Tags: ECC, Registered, Memory, RAM
Brad

Are there any desktop class motherboards that support ECC RAM?

Posted on 2013-12-24 19:59:27
mrnuke

I'd like to point out that ECC memory is considered server-grade. They are sold to a market which pays much closer attention to reliability than the consumer market. As a result, ECC RAM is usually subjected to much stricter testing and validation before shipping. This is a big factor in why ECC modules see much lower failure rates.

Now to addressing the performance drop with ECC. This has more to do with the memory controller. A well designed memory controller should incur no penalty under normal operation.

A little beside the issue, but ECC RAM is usually registered. That would explain the performance drop more than the time it takes the "system to check for any memory errors". To add a twist to this, when using several ranks per channel, registered RAM can actually have an advantage, as it allows the memory controller to issue a 1T command rate, versus the 2T command rate caused by bus loading. An unbuffered dual-rank module has eighteen times the bus loading on the command lines versus a registered DIMM.

Posted on 2014-11-24 08:25:25
keval Patel

hello sir,
i am a vfx artist and video editor.
i want to buy a workstation but i am confused which one will be better.
core i7 4790k or xeon e3
for adobe after effects, premier (4k video editing)
dell or hp??

Posted on 2015-07-13 09:53:20

The Xeon E3 line only goes up to about 3.7GHz, so the i7 4790K (at 4GHz) would be faster. However, for video editing - particularly 4K editing - I would usually advise something even more powerful. Those processors you asked about are both quad-core, but if you move up to the Xeon E5 line then you can get anywhere from 4 to 18 cores... and you can use two CPUs in a pair, for even more performance. Professional video editing workstations usually consist of processors using that sort of technology, though of course that gets more expensive as well.

As for Dell or HP... I'd suggest neither! If you are in a region we serve (the US, Canada, Mexico, etc) I would recommend purchasing one of our Genesis media editing workstations. Or if you are on a budget, and need to stick with just a quad-core processor, check out the Spirit.

Posted on 2015-07-13 17:55:09
goblin072 .

Some WS motherboards support 4 types of ram. I have read ECC performance hit is less than 1 percent in games. But that was non buffered ECC.

It would be interesting to see Benchmarks GAMES and Non Games

64 Gigs of non ECC

64 Gigs of unbuffered ECC

64 Gigs of Buffered ECC

64 Gigs of LRDIMM ECC

Which would you want in your workstation? I am talking about performance of both games and non game applications. Is unbuffered ECC going to be the fastest or is buffered ECC?

Posted on 2016-02-16 15:33:09
funklord

A 2% performance hit is nothing compared to the severe inconvenience of memory corruption.
I'd choose ECC for every machine including phones and tablets even if it meant a 50% slowdown.

Posted on 2016-05-03 13:15:52
Siegfried

Are you sure that your statistics are in percent?
if you write "is only about .4%, or roughly one stick for every 250 sticks" that means that it is permille (based on 1000) not percent.?

Posted on 2016-07-13 08:02:58

1% = 1 out of 100
0.4% = 0.4 out of 100, or 4 out of 1000
4 : 1000 = 1 : 250

Posted on 2016-07-13 15:28:09