You've probably all seen it by now, but from @HNStatus: 
Server back up and seemingly stable. Now restoring our latest backup to recover from limited filesystem corruption.
Good thing they're just silly internet points :)
edit: to clarify "lost" karma
Just internet points, fortunately :)
 - https://news.ycombinator.com/bestcomments
It looks like you've already exceeded that now... Well done! The internet gods must favour you.
Now commented back. Hope the thread owner reads them and replies.
On a more general note: if you have backups but don't regularly test restoring them, you don't really have backups! As an added bonus, regular restore tests let you practice for the "real deal", and you learn how long the entire process takes.
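The point above can be sketched as a minimal restore drill: back up a tree, restore it to a scratch location, and diff the two. The paths and tools here are illustrative examples, not anyone's real setup.

```shell
#!/bin/sh
# Hypothetical restore test: take a backup, restore it elsewhere,
# and verify the restored tree matches the original.
set -e

SRC=$(mktemp -d)        # stand-in for the data you care about
RESTORE=$(mktemp -d)    # scratch area for the trial restore
echo "important data" > "$SRC/file.txt"

tar -czf /tmp/backup.tgz -C "$SRC" .        # take the backup
tar -xzf /tmp/backup.tgz -C "$RESTORE"      # restore it (time this step for the "real deal")

diff -r "$SRC" "$RESTORE" && echo "restore verified"

rm -rf "$SRC" "$RESTORE" /tmp/backup.tgz    # clean up
```

Wrapping the restore step in `time` (or just noting the wall clock) is what tells you how long recovery will actually take when it matters.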
We'll never need that old repo again.
The google cache got a few comments, but very few.
I really don't want credit, I just want the original question and the couple of responses to have the opportunity to see the light of day, despite the HN outage. So please vote up the original post!
Indeed, but the time and CPU it takes to do backups can be a very nasty trade-off.
For that matter, as I write this, due to a mistake I made after an 85-minute power outage yesterday, I'm only now doing my daily incremental backup of my home machines to an LTO-4 tape drive. Keeping that drive fed fast enough to prevent "shoe-shining" took some effort; Bacula spools up to 100G at a time to a partition on a single 15K disk on a separate controller. But if I had an LTO-5 drive, from what I've heard there's no single disk in existence that could keep up with it (not counting SSDs, which are a very poor match for this use case).
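For anyone curious how that spooling is wired up: Bacula stages data to disk via a handful of directives, roughly as below. This is a sketch only — the resource names, device path, and spool location are made-up examples, and a real config needs more than is shown.

```
# In the Storage Daemon's Device resource (bacula-sd.conf):
Device {
  Name = LTO4-Drive
  Media Type = LTO-4
  Archive Device = /dev/nst0
  Spool Directory = /spool       # the fast 15K disk on its own controller
  Maximum Spool Size = 100G      # fill up to 100G on disk before despooling to tape
}

# In the Director's Job resource (bacula-dir.conf):
Job {
  Name = "HomeIncremental"
  Spool Data = yes               # stage to disk first, then stream to tape at full speed
  ...
}
```

The idea is that despooling reads sequentially from one fast disk, so the tape drive sees a steady stream instead of the stop-and-go of many small incremental files.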
I'd like to migrate to ZFS but have yet to. Still just running EXT4.
HN should be on a replicated data store like Riak. Losing a node or two shouldn't take the system down; at worst it should run in a degraded state (read-only) until hardware is restored.
edit: the code has been public for a long time, and there is not a database to replicate. the site ran as a single server for years, and it is unlikely the front end caching has changed anything about the "database" components.
Since RAID failures actually are somewhat common, they are probably looking at a higher level replicated storage system now, a la DRBD, or some kind of distributed file system, a la Gluster.
Similarly, if you rm -rf a vital directory tree, RAID can ensure that it goes away reliably.
Filesystem corruption without hardware failure is far rarer in my experience. Have you seen an instance that wasn't a proverbial user error?
Back in ~2004 I watched IT spend a whole day recovering our 60-person startup's main Linux NFS server, due to a software bug in the storage driver. Had to rebuild the whole system from backups.
HN is persisted to flat files.
Maybe that's nothing new, but I just noticed it. Seems like a bug.
It doesn't seem to do anything though.