How/where do you back up that data to, especially with large amounts of data (e.g. 100s of GB+)?
S3's entire business revolves around never losing anybody's stuff. They have hordes of smart people working on the problem, and they have an architecture that makes it really hard to lose anything by accident.
My business, on the other hand, revolves around letting people draw cartoon testicles onto other people's PowerPoint presentations under the pretense of a "web meeting". Which of us would you rather trust to keep hold of your valuable data?
So does Carbonite, and, as you can see in today's news, it also lost people's data. I strongly suspect Amazon does a better job than Carbonite with both software and infrastructure, but still. Mistakes happen, bugs happen, and data gets lost even by smart people working for solid and reputable companies.
S3-only backups = eggs in one basket. It's a terrific, strong basket made of titanium and suspended on aircraft cable, but it's still just one basket.
Good example - when I read the story I was not surprised. It was bound to happen to at least a company or two. I used Carbonite for a bit before I went back to my own server (w/nightly backup). I thought I had let my paranoid tendencies take over, but I just never felt good backing up my data to Carbonite, et al.
I also agree that S3 is much more failproof, but I would still rather keep things on my own server. I may do things differently in the future, but for now there is no place like my own server!
While S3 might be doing this for a living, Amazon doesn't. AFAIK, revenues from cloud services are not at all significant given Amazon's scale. What does the S3 license say? Is Amazon liable if it loses data stored in S3?
Every few months I email my hotmail account with all my writing. As text is ridiculously compressible I haven't even hit the 10MB attachment limit. I also have all my more current documents saved in Google Docs, mostly for the portability but also for the very slim chance of an "Oh my god I broke my laptop, holy crap my house burnt down, and dammit I forgot how to connect to my FTP server and for the love of god I forgot the password to my hotmail account."
1. Do your own backups
2. Routinely test that you can recover from these backups
I would argue that point (2) is much more important than point (1). I do try to do that at least once a month to ensure that there aren't any bugs in the backups, including any missing parts of the infrastructure.
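To make point (2) concrete, here is a minimal sketch of one way to check a test restore: hash every file in the original tree and in the restored copy, then report what is missing or different. The function names (`file_digest`, `tree_digests`, `compare_restore`) are made up for illustration, and it assumes both trees are reachable as local directories.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def tree_digests(root):
    """Map each file's path (relative to root) to its digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_restore(original, restored):
    """Return (missing, changed): files absent from or different in the restore."""
    src = tree_digests(original)
    dst = tree_digests(restored)
    missing = sorted(set(src) - set(dst))
    changed = sorted(k for k in src.keys() & dst.keys() if src[k] != dst[k])
    return missing, changed
```

If both lists come back empty, the restore matched at the moment you checked. It only proves something if you actually restore from the backup first, rather than comparing against a copy that never left the machine.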
For mirroring really large data, rsync is a viable solution.
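As a rough sketch of what that could look like when scripted, the wrapper below builds an rsync command and runs it. The helper names and the example paths are hypothetical; `-a`, `--delete`, and `--dry-run` are standard rsync options, and rsync is assumed to be installed on the machine.

```python
import subprocess


def build_rsync_command(source, destination, delete=False, dry_run=False):
    """Build an rsync argument list that mirrors source into destination."""
    cmd = ["rsync", "-a"]  # -a: archive mode (recursive, keeps times/permissions)
    if delete:
        cmd.append("--delete")  # also remove files that vanished from source
    if dry_run:
        cmd.append("--dry-run")  # report what would change without copying
    # Trailing slash on the source copies its contents, not the directory itself.
    cmd += [source.rstrip("/") + "/", destination]
    return cmd


def mirror(source, destination, **options):
    """Run rsync and raise CalledProcessError if it exits non-zero."""
    subprocess.run(build_rsync_command(source, destination, **options), check=True)
```

For example, `mirror("/data", "backuphost:/mirror/data", dry_run=True)` would show what a sync would do. Note that with `delete=True` an accidental deletion on the source gets mirrored too, so a plain mirror is not a substitute for versioned backups.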
Edit: I do want to add that whether to perform your own backups is really subjective, and you might need to ask yourself: what's the cost to me/my business/my users if I can't recover from backups and/or my provider fails in its own reliability (e.g. Carbonite)?
I've been using that as my hosting for 2-3 months, and I couldn't be happier.
The problem is that I hear about these new "cloud" storage companies claiming to offer backup, but when asked what they actually do, it turns out they rely on Amazon to move the files into different data centers. But anyone can delete a file or directory by accident, and poof, the files are gone forever.
If the storage provider doesn't have permanent physical storage in their backup plans, don't think your files are forever.
Having a local backup is great too, but that won't protect you from fire, flood, or theft in many cases.
I am sure the deletion protection Jungle Disk provides is good, but what if somebody internally deletes an S3 file or a directory of files? They are gone forever. This isn't an S3 problem, just an issue with the strategy.