

Ask HN: Do you trust Amazon S3 or Mosso Cloudfiles not to lose or corrupt your data? - bk

Do you treat these systems as "reliable" and "safe" backends (since they're physically and geographically replicated) or do you feel it's still necessary to back up data hosted on their services?

How/where do you back up that data to, especially with large amounts of data (e.g. 100s of GB+)?
======
jasonkester
S3 does this for a living. I don't.

S3's entire business revolves around never losing anybody's stuff. They have
hordes of smart people working on the problem, and they have an architecture
that makes it really hard to lose anything by accident.

My business, on the other hand, revolves around letting people draw cartoon
testicles onto other people's powerpoint presentations in the pretense of a
"web meeting". Which of us would you rather trust to keep hold of your
valuable data?

~~~
gcv
_S3 does this for a living. I don't._

So does Carbonite, and, as you can see in today's news, it also lost people's
data. I strongly suspect Amazon does a better job than Carbonite with both
software and infrastructure, but still. Mistakes happen, bugs happen, and data
gets lost even by smart people working for solid and reputable companies.

S3-only backups = eggs in one basket. It's a terrific, strong basket made of
titanium and suspended on aircraft cable, but it's still just one basket.

~~~
sgoraya
_So does Carbonite..._

Good example - when I read the story I was not surprised. It was bound to
happen to at least a company or two. I used Carbonite for a bit before I went
back to my own server (w/nightly backup). I thought I had let my paranoid
tendencies take over, but I just never felt good backing up my data to
Carbonite, et al.

I also agree that S3 is much more failproof but would still rather keep things
on my own server. I may do things differently in the future, but for now
there is no place like my own server!

------
charlesju
I trust S3 and Mosso Cloudfiles more so than my own single-failure HDD. They
have a lot of redundancy built into their systems, and although every system
has risks, theirs are far lower than those of our personal, undistributed
implementations of file storage.

~~~
lyime
It pretty much comes down to this. No one can guarantee 100% reliability, not
even Amazon. Still, their setup and infrastructure will probably increase
overall MTTF (mean time to failure) in comparison to a small hosting company
or your personal backup solution. Very simple math can show that their system
is probably more reliable than others, yours or mine.
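
As a rough illustration of that "very simple math", here is a minimal sketch
that assumes each copy of the data fails independently with the same
probability over some period. The function name and the 5% figure are made up
for the example and are not published numbers for S3, Mosso, or any drive.

```python
def loss_probability(per_copy_failure: float, copies: int) -> float:
    """Chance that every copy is lost, assuming independent failures."""
    if not 0.0 <= per_copy_failure <= 1.0:
        raise ValueError("per_copy_failure must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    # Independent events: multiply the individual failure probabilities.
    return per_copy_failure ** copies


if __name__ == "__main__":
    # Hypothetical 5% chance of losing any single copy in a given period.
    for copies in (1, 2, 3):
        print(copies, "copies ->", loss_probability(0.05, copies))
```

Real failures are often correlated (same data center, same software bug, same
operator mistake), so the independence assumption makes this an optimistic
bound rather than a prediction.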

------
spkthed
Sure? It's like anything else: don't trust a single solution. S3 is good in
addition to local backups, and Mosso in addition to S3. There will always be
accidents, data loss, corruption, etc. The only way to mitigate that risk is
simply to cover your bases and avoid relying on a single thing.

~~~
electromagnetic
Agreed, I have backups on my laptop and external hard drive, so if one of my
drives fails I'm covered. If both fail, well that's why I've got the really
important stuff backed up on my FTP server.

Every few months I email my hotmail account with all my writing. As text is
ridiculously compressible I haven't even hit the 10MB attachment limit. I also
have all my more current documents saved in Google Docs, mostly for the
portability but also for the very slim chance of an "Oh my god I broke my
laptop, holy crap my house burnt down, and dammit I forgot how to connect to
my FTP server and for the love of god I forgot the password to my hotmail
account."

------
rs
They are as safe as any other service provider. Ultimately, it's always good
practice to:

1\. Do your own backups

2\. Routinely test that you can recover from these backups

I would argue that point (2) is much more important than point (1). I do try
to do that at least once a month to ensure that there aren't any bugs in the
backups, including any missing parts of the infrastructure.
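
One way to automate point (2) is to restore into a scratch directory and
compare checksums against the live data. The sketch below is only an
illustration: `verify_restore` and `file_digest` are hypothetical helper names,
and it assumes both trees are reachable as local directories.

```python
import hashlib
from pathlib import Path
from typing import List


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(original: Path, restored: Path) -> List[str]:
    """List files that are missing or differ in the restored tree."""
    problems = []
    for src in sorted(original.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(original)
        dst = restored / rel
        if not dst.is_file():
            problems.append(f"missing: {rel}")
        elif file_digest(src) != file_digest(dst):
            problems.append(f"corrupt: {rel}")
    return problems
```

An empty list means every file in the original tree was found in the restore
with identical contents. It does not check for extra files in the restore.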

For mirroring really large data, rsync is a viable solution.

Edit: I do want to add that performing your own backups is really subjective
and you might need to ask yourself - what's the cost to me/my business/my
users in the event that I can't recover from backups and/or my provider fails
in their own reliability (e.g. Carbonite)?

------
vaksel
Just use both at the same time. Use S3 for active stuff, and Mosso as your
secondary backup. The chances of S3 and Mosso crapping out at the same time
are pretty much nil. And the cost of hosting something on S3/Mosso, as a
one-time backup, is dirt cheap.

------
mikecuesta
I have a lot of faith in S3, more so than any local storage I may have.

------
iamelgringo
My site, cuuute.com, is hosted on EC2. I use Elastic Block Store for database
storage, and I back up both that and my EC2 instance to S3 on a regular basis.

I've been using that as my hosting for 2-3 months, and I couldn't be happier.

------
bbuffone
S3, Mosso and other cloud providers are not an appropriate backup mechanism.
These are good for sharing files and as a temporary storage system. The only
legitimate backup is a physically stored disk. S3 provides neither versioning
nor deletion protection.

The problem is I hear about these new "Cloud" storage companies claiming
backup, but when asked what they actually do, it turns out they rely on Amazon
to move the files into different data centers. But anyone can delete a file or
directory by accident and poof, the files are gone forever.

If the storage provider doesn't have permanent physical storage in their
backup plans, don't think your files are forever.

~~~
JungleDave
Yes, that's a good example of why not to use S3 as a simple backup
destination. However, software like Jungle Disk running on top of S3 adds
versioning and deleted file retention to make it act like a real backup
system.
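
To make the idea concrete, here is a toy in-memory sketch of versioning plus
deleted-file retention layered over a plain key/value store. It is not how
Jungle Disk is implemented; `VersionedStore` and its methods are hypothetical
names, and a real system would persist the history to the storage backend.

```python
from typing import Dict, List, Optional


class VersionedStore:
    """Keeps every version of a key; a delete only appends a tombstone."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[Optional[bytes]]] = {}

    def put(self, key: str, data: bytes) -> int:
        """Store a new version and return its version number."""
        history = self._versions.setdefault(key, [])
        history.append(data)
        return len(history) - 1

    def delete(self, key: str) -> None:
        """Mark the key as deleted without discarding older versions."""
        if key not in self._versions:
            raise KeyError(key)
        self._versions[key].append(None)  # None acts as the tombstone

    def get(self, key: str, version: Optional[int] = None) -> bytes:
        """Return the latest version, or a specific earlier one."""
        history = self._versions[key]
        data = history[-1] if version is None else history[version]
        if data is None:
            raise KeyError(f"{key} is deleted at this version")
        return data
```

After an accidental `delete("report")`, a call like `get("report", version=0)`
still returns the earlier contents, which is the property a bare
overwrite-and-delete store lacks.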

Having a local backup is great too, but that won't protect you from fire,
flood, or theft in many cases.

~~~
bbuffone
I wasn't talking about zipping up your files on a CD and putting them in a
drawer. There are companies like Iron Mountain, which give you off-site
physical storage.

I am sure the deletion protection Jungle Disk provides is good, but what if
somebody internally deletes an S3 file or directory of files? They are gone
forever. This isn't an S3 problem, just an issue with the strategy.

~~~
JungleDave
Someone at Iron Mountain or Fedex could lose your backups too (accidentally or
maliciously). Neither is too likely, nor is someone internally deleting data
at S3. That said, if you have data you can't ever take a chance to lose, don't
store it in only one place ever - not just on S3, not just on a USB drive, not
just in off-site physical storage. Keep at least two copies, maybe more. One
of the things we're working on is allowing you to back up to multiple cloud
providers - so even in the unlikely event of a catastrophic cloud failure
you'd have other copies available.
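
As a rough sketch of the "more than one place" idea (not a description of any
product's implementation), the snippet below writes the same data to several
destinations and reads each copy back to confirm it. Local directories stand in
for providers here, and `replicate` is a hypothetical helper name.

```python
import hashlib
from pathlib import Path
from typing import Dict


def replicate(name: str, data: bytes, destinations: Dict[str, Path]) -> Dict[str, bool]:
    """Write data to every destination and report which copies verified."""
    expected = hashlib.sha256(data).hexdigest()
    results: Dict[str, bool] = {}
    for label, root in destinations.items():
        try:
            target = root / name
            target.write_bytes(data)
            # Read the copy back so a silent write failure is noticed.
            actual = hashlib.sha256(target.read_bytes()).hexdigest()
            results[label] = actual == expected
        except OSError:
            # An unreachable destination should not stop the other copies.
            results[label] = False
    return results
```

A caller can then insist on at least two `True` entries before treating the
backup as done.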

