
I think it should use ECC memory and an i3 CPU that supports it. Random memory bit flips are going to corrupt data at a steady pace.

Intel i3 processors that support ECC: http://ark.intel.com/search/advanced/?s=t&FamilyText=2nd...

Also, it'd be interesting to hear why Backblaze doesn't instead use SuperMicro SAS boards with a SAS expander, like the HP SAS Expander.

Oh, about the "random memory flips": in our particular application, the client running on a customer's laptop encrypts the data, then calculates a SHA-1 checksum, THEN transmits the file through HTTPS to the pods. The pods write it to disk along with the checksum. Once every couple of weeks we re-read the file and re-calculate the SHA-1 checksum. If there was ever a problem, we would detect it. These turn out to be VERY rare, but they do happen: a file is fine for many years, then a bit is flipped "on disk" (we don't think the flips are in RAM, but it doesn't matter, it is an "end-to-end" check). We assume this is happening in consumer systems also, but at the rates we see it would be undetectable in the consumer world (1 bit per customer lifetime - it would probably create a tiny misspelling in an MS Word document, or maybe one pixel would be wrong in one JPEG).
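For anyone who wants to see the shape of that end-to-end check, here is a rough sketch in Python. It is not Backblaze's code: the function names, the ".sha1" sidecar file, and the chunk size are all assumptions made up for illustration. The idea is just that the checksum is computed before the data leaves the client, stored next to the file, and compared again on every periodic re-read.

```python
import hashlib


def sha1_of_file(path, chunk_size=1 << 20):
    """Stream the file through SHA-1 so large files need little RAM."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def store(path, data, client_checksum):
    """Hypothetical pod-side write: save the data plus the client's checksum.

    The ".sha1" sidecar file is an assumption, not a known on-disk layout.
    """
    with open(path, "wb") as f:
        f.write(data)
    with open(path + ".sha1", "w") as f:
        f.write(client_checksum)


def scrub(path):
    """Periodic re-read: True if the file still matches its stored checksum."""
    with open(path + ".sha1") as f:
        expected = f.read().strip()
    return sha1_of_file(path) == expected


def flip_bit(path, byte_offset, bit):
    """Simulate a single on-disk bit flip (for testing only)."""
    with open(path, "r+b") as f:
        f.seek(byte_offset)
        original = f.read(1)[0]
        f.seek(byte_offset)
        f.write(bytes([original ^ (1 << bit)]))
```

Because the checksum is taken on the client, a flip anywhere afterwards (network, RAM on the pod, or disk) shows up as a mismatch in scrub(), which is why it does not matter where the flip happened.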

Or 1 bit flip could corrupt an entire 128-bit block of AES-encrypted data. OpenSSL would complain when trying to decipher the file, giving a "bad magic number" error.

BTW, keep up the great work guys!

Disclaimer: I work at Backblaze. The answer to pretty much any question is "sort by price". :-) The SAS expanders are just a tiny bit more expensive, or at least they were the last time we checked. We were worried early on that many other designs seemed to prefer the expanders over the port multipliers, but in all the years and over 450 pods we've never seen the port multipliers give us any problems. Maybe they aren't as fast as the SAS expanders? But that isn't our current bottleneck, so it wouldn't help us at all.

More worried about the lack of ECC. Have you done tests on (normally) undetected errors?
