I wonder what the reliability stats on this setup are, though. Is it really cheaper to jam all those drives into one unit without redundant PSUs, motherboards, or a boot drive?
I'd guess you'd have to build at least two of these units and mirror them to get any sort of reliability. And at that point, how long does it take to copy 58 TB over HTTPS?
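For ballpark numbers (the sustained line rates are my own assumptions, nothing from the article), here's what moving 58 TB works out to:

    # Transfer-time arithmetic for 58 TB at a few assumed sustained speeds.
    TB = 10**12  # decimal terabyte, in bytes
    data_bytes = 58 * TB

    for label, mbit_per_s in [("100 Mbit/s", 100), ("1 Gbit/s", 1_000), ("10 Gbit/s", 10_000)]:
        bytes_per_s = mbit_per_s * 10**6 / 8
        hours = data_bytes / bytes_per_s / 3600
        print(f"{label}: {hours:,.0f} hours ({hours / 24:.1f} days)")

Even at a sustained gigabit that's roughly five and a half days before any protocol overhead, so mirroring a full pod over the network is not a quick operation.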
I would assume they don't care about the hardware, just the data. If you look at the setup and how the drives are arranged, the 45 drives are sub-divided into 15-drive RAID arrays, and it would take 3 drives in the same array dying before they lost data. Essentially, 20% of the drives in a single array would need to die simultaneously for them to lose anything.
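To put rough numbers on that, here's a quick sketch. The 15-drive arrays and the "3 dead drives in one array" threshold come from the parent comment; the 4% annual failure rate and the 48-hour replace-and-rebuild window are figures I made up purely for illustration:

    from math import comb

    drives_per_array = 15
    annual_failure_rate = 0.04   # assumed per-drive AFR, not a real figure
    window_hours = 48            # assumed time to swap a failed drive and rebuild

    # Per-drive probability of failing inside one rebuild window.
    p = annual_failure_rate * window_hours / (365 * 24)

    # Probability that 3 or more of the 15 drives in one array fail within
    # the same window, i.e. the array actually loses data.
    p_array_loss = sum(
        comb(drives_per_array, k) * p**k * (1 - p)**(drives_per_array - k)
        for k in range(3, drives_per_array + 1)
    )
    print(f"chance of losing one array per window: {p_array_loss:.2e}")

It's a crude independent-failure model (real drive failures correlate, especially under rebuild stress), but it shows why needing three near-simultaneous deaths in one array makes data loss far less likely than any single- or double-drive failure.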
Now, the rest of the hardware isn't that important if it fails. If one of the other components dies, you're only looking at some downtime (and possibly a dead hard drive or two from a dying PSU, which I assume they monitor for regularly). As long as the data is secure and in one piece, it doesn't really matter whether the pod is up or down until someone actually needs the data. Just send out your repair guy to replace the part and reboot it, and it's fine.
In sysadmin terms, they have aimed for data integrity over availability.
If a system goes down and they have to replace a disk or a power supply or motherboard, the data is still safe, and if by chance there are any $5/month users who need to restore data held on that one unit, well, they can wait a few hours :-)
data is hard.