> But once you know you have good DIMMs, it doesn't look like you need to be quite so paranoid about bit errors.
Assuming that only the one-error-per-year cases were due to random bit flips, and all the multiple-errors-per-year cases were due to bad DIMMs, I came up with about a 1/5 chance of getting a single random bit flip over a 6-year lifespan. But there also seems to be about a 1/3 chance of having a DIMM randomly go bad after a couple of years, which of course without ECC would manifest as random crashes and lost (or maybe corrupted) work.
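For anyone who wants to turn that 1/5-over-6-years figure into a per-year number, here is a rough sketch. It assumes bit flips arrive independently at a constant rate (a Poisson process), which is my assumption and not something the data necessarily supports; the function names are made up for illustration.

```python
import math

def yearly_rate(p_lifetime: float, years: float) -> float:
    """Events per year implied by P(at least one event over `years`).

    Assumes a constant-rate Poisson process (an assumption, not a measured fact).
    """
    return -math.log(1.0 - p_lifetime) / years

def p_at_least_one(rate: float, years: float) -> float:
    """P(at least one event) over `years` for a Poisson process with `rate`."""
    return 1.0 - math.exp(-rate * years)

# The 1/5 chance over a 6-year lifespan quoted above.
flip_rate = yearly_rate(1 / 5, 6)
p_flip_one_year = p_at_least_one(flip_rate, 1)

print(f"implied rate: {flip_rate:.4f} flips/year")
print(f"chance of a flip in any one year: {p_flip_one_year:.2%}")
```

Under that assumption the per-year chance of a random flip comes out to a few percent. The 1/3 figure for a DIMM going bad is left out because wear-related failures are unlikely to follow a constant rate.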
The errors we're talking about here are transient. The memory location itself is still usable; only its contents get changed when a cosmic ray hits. After the hit, the location holds the corrupted value without a problem.
Memtest, by contrast, checks whether the memory location has a gross fault that prevents it from storing values correctly, so a clean run only rules out bad DIMMs, not transient flips.
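To make the distinction concrete, here is a toy model, not how Memtest is actually implemented. `SimulatedCell`, `cosmic_ray`, and `pattern_test` are hypothetical names, and the write-then-read-back pattern check is only a stand-in for the general idea of a memory tester.

```python
class SimulatedCell:
    """One byte of toy memory. A nonzero stuck_mask models a hard fault."""

    def __init__(self, stuck_mask: int = 0, stuck_value: int = 0):
        self.stuck_mask = stuck_mask
        self.stuck_value = stuck_value & stuck_mask
        self.value = 0
        self.write(0)

    def write(self, v: int) -> None:
        # Stuck bits ignore whatever is written to them.
        self.value = (v & ~self.stuck_mask & 0xFF) | self.stuck_value

    def read(self) -> int:
        return self.value

    def cosmic_ray(self, bit: int) -> None:
        # Transient upset: the stored contents flip once; the cell is undamaged.
        self.value ^= 1 << bit


def pattern_test(cell: SimulatedCell) -> bool:
    """Write test patterns and read them back; True means no fault found."""
    for pattern in (0x00, 0xFF, 0x55, 0xAA):
        cell.write(pattern)
        if cell.read() != pattern:
            return False
    return True


# A healthy cell that takes a hit: the data is wrong, the hardware is fine.
healthy = SimulatedCell()
healthy.write(0x42)
healthy.cosmic_ray(3)
corrupted = healthy.read()           # 0x4A instead of 0x42
healthy_passes = pattern_test(healthy)

# A cell with bit 0 stuck at 0: this is the kind of fault a tester can see.
faulty = SimulatedCell(stuck_mask=0x01, stuck_value=0x00)
faulty_passes = pattern_test(faulty)

print(hex(corrupted), healthy_passes, faulty_passes)
```

The healthy cell passes the test even though it silently returned a wrong value earlier, while the stuck-bit cell fails. That is the sense in which a tester finds bad hardware but says nothing about past or future transient flips.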