
Ma.gnolia.com crashes hard -- no backups? - eli
http://ma.gnolia.com/
======
mdasen
This is why it's so important to take some simple steps to ensure that you
don't bork your site. For my production sites, I usually have the following
setup:

* Replicated database with one slave in a separate data center purely as disaster recovery (the separate data center isn't a performance thing). Heartbeat for the slave(s) in the same data center so that it stays up.

* Nightly backups offsite. Used to be rsync, but recently thinking of using tarsnap for the ease of it (and S3 puts it in several data centers).

* Files stored either in a MogileFS setup or S3 in multiple data centers.

It doesn't have instant failover to another data center, but the offsite DB
slave should mean no data loss beyond a second or two. Ma.gnolia is a decent
sized site. Maybe they did have a decent infrastructure and it will only be a
little while before they've gotten everything back. Of course, after the
fiasco with the blog site that used RAID as their backup system, I've started
to think that many people don't take data as seriously as it needs to be
taken.

~~~
bprater
I'd really like to store incrementals of my MySQL at S3. Anyone using
something they really like? Or should I just break down and do full dumps each
night?

~~~
mdasen
Tarsnap.

Tarsnap lets you say: tarsnap -c -f backup01302009 mysql_dir/

And you can just adjust the date each day. It gives you the luxury of a full
dump (anytime you want to restore, just reference backup01302009), but it only
actually stores the deltas (making sure not to duplicate data that might be in
backup01292009 or backup01282009 and so on). Tarsnap stores the data to S3 so
that it's replicated in multiple data centers.

It costs a little more than S3 at 30 cents per GB, but it's metered out so
that if you only use 1MB of storage, you'll only be charged 0.03 _cents_ for
that storage. You _could_ try creating your own way of doing incrementals, but
I doubt you'd get it as efficient as Colin (the math genius behind Tarsnap)
and so I doubt you'd get it cheaper. Plus, this way you don't have to deal
with it.
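
A daily wrapper around that one command can be a couple of lines; this is a
sketch only (it assumes tarsnap is installed and keyed, so the actual call is
left commented out):

```shell
#!/bin/sh
# Sketch of a daily tarsnap wrapper using the naming scheme above.
# Assumes tarsnap is installed and keyed; the real call is commented out.
ARCHIVE="backup$(date +%m%d%Y)"     # e.g. backup01302009
echo "would run: tarsnap -c -f $ARCHIVE mysql_dir/"
# tarsnap -c -f "$ARCHIVE" mysql_dir/
# tarsnap only uploads blocks it hasn't seen in earlier archives,
# so each day costs roughly one delta, not one full dump
```

Drop that in cron and you get the rolling set of dated archives described
above.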

And remember, it's hard to fill up a database.* As the Django Book notes:
"LJWorld.com's database - including over half a million newspaper articles
dating back to 1989 - is under 2GB." So, if they were using Tarsnap, they
might be storing 5 or 10GB tops at a whopping $1.50-$3 per month plus whatever
the transfer of their deltas was for the month. Oh, and tarsnap compresses the
data too. So, maybe they'd be paying $1 or something lower.

* Clearly, if you hit it big time, you might not want to continue paying for tarsnap. However, if you become the next big thing, you can hire someone to deal with it for you.

~~~
joshu
This doesn't work. You can't just copy the files and expect them to be in a
sane or consistent state.

You either need to a) use InnoDB hotbackup or b) use a slave, stop the slave,
run the backup, and restart the slave to catch up.

At delicious we used B, plus a hot spare master, plus many slaves.
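
Option (b) can be sketched in a few lines of shell; the paths and the
mysqladmin/mysqld_safe invocations here are assumptions, and the function is
only defined, not run:

```shell
#!/bin/sh
# Sketch of option (b): shut the replica down, copy its files cold,
# then bring it back so it reconnects to the master and catches up.
# Paths and commands are assumptions for illustration.
backup_from_slave() {
    mysqladmin shutdown                                    # cold copy needs the server down
    tar czf "/backups/db-$(date +%F).tar.gz" /var/lib/mysql
    mysqld_safe &                                          # replica catches back up on restart
}
# backup_from_slave    # run on the replica, never the master
echo "backup_from_slave defined"
```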

Additionally, every time a user modified his account, it would go on the queue
for individual backup; the account itself (and alone) would be snapped to a
file (Perl Storable, IIRC), which only got generated when the account changed,
so we weren't re-dumping users that were inactive. A little bit of history
allowed us to respond to things like "oh my god all my bookmarks are gone" and
various other issues (which were usually due to API-based idiocy of some sort
or another.)

~~~
bjclark
Using a slave isn't foolproof either. If someone were to run a malicious
command, it gets replicated, and could get backed up before being caught.

~~~
joshu
I didn't say that. Read what I wrote.

You use the slave so you can shutdown the database and get a consistent file
snapshot. Then you do offline backup.

------
bprater
Gut wrenching, for sure. To see your whole model explode, it's just not worth
it.

If you don't have a backup plan -- close your browser, stop surfing Y!Hacker,
and go write a shell script to dump your database and rsync to anywhere else
on earth.
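
That script really can be this small; a sketch with host and database names as
placeholders (the dump and rsync lines are commented out so the skeleton runs
anywhere):

```shell
#!/bin/sh
# Bare-minimum nightly backup: dump, compress, ship it somewhere else.
# Host, user, and database names below are placeholders.
set -e
DUMP="db-backup-$(date +%F).sql.gz"
# mysqldump --single-transaction --all-databases | gzip > "$DUMP"
# rsync -az "$DUMP" backupuser@anywhere-else.example.com:backups/
echo "$DUMP"
```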

~~~
eli
...and then do a test restore!

~~~
rarrrrrr
YES! This comment should be modded up to the top.

We talk to a lot of customers about their backups every day. I'd say more than
half of the backup failure stories we hear come from failing to adequately
test that data can actually be restored.

Some have been comically tragic. Like creating an offsite backup using
encryption keys that are only archived as part of said backup... :(

~~~
timf
_"creating an offsite backup using encryption keys that are only archived as
part of said backup..."_

Ugh, that is awful.

------
rarrrrrr
An easy way to have a space efficient, perpetual backup of a database:

Generate a dump file every N units of time.

Compress the dump file using gzip with the --rsyncable option. This increases
size by 1% but makes it efficiently diff/patchable.

Use diff to make a patch going back to the previous compressed dump of the
file. Keep the patch, discard the previous dump if you like. You now can apply
the patches in succession to go back however far you like.

Finally, and most importantly, use par2 to store parity along with your backup
files to protect against silent bitrot.

Note: --rsyncable is in newer gzip versions. It's too new to be included in OS
X Leopard.
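
The diff/patch chain can be demonstrated end to end with two toy dumps;
--rsyncable and par2 are left out here so the sketch runs on a stock system:

```shell
#!/bin/sh
# Demonstrate the patch chain: keep only the newest dump plus reverse
# diffs, then walk a diff backward to recover an older dump exactly.
set -e
cd "$(mktemp -d)"
printf 'row1\n'       > dump_day1.sql
printf 'row1\nrow2\n' > dump_day2.sql
diff -u dump_day2.sql dump_day1.sql > day2_to_day1.diff || true  # diff exits 1 on differences
cp dump_day2.sql recovered.sql
patch -s recovered.sql day2_to_day1.diff    # apply the reverse diff
cmp -s recovered.sql dump_day1.sql && echo "recovered day 1 exactly"
# in real life: gzip --rsyncable each dump, and run par2create on what you keep
```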

~~~
yeahit
Wouldn't these two solutions be even nicer?

* Back up the binlog of the DB (at least it's called the binlog in MySQL)

or

* Make a diff of the uncompressed dump and zip that. This diff would even be human readable.

------
Locke
It's easy to say, "What? No backups... how stupid can you be?" But, without
knowing the particulars, I wouldn't necessarily jump to that conclusion. It's
very easy to have regular automated backups fail in some way.

Unfortunately, it's not enough _just_ to have backups. You have to actually
verify that they're correct and up-to-date. Verification is easy when your
database is small (for example). You can just load it in your development
environment occasionally. But, how do you verify your backups if you have
hundreds of gigs or even terabytes of data?

As an example, I've seen cases where backups were successful every night...
but, they were being run against a slave db and replication had failed. The
result: Excellent backups of weeks old data.
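
A check that would have caught this is only a few lines; the status text below
is a canned sample so the sketch runs anywhere, but in production you'd
capture it from `mysql -e 'SHOW SLAVE STATUS\G'`:

```shell
#!/bin/sh
# Sketch: refuse to back up from a replica that has silently stopped
# or fallen behind. STATUS is canned sample output for illustration;
# in production: STATUS=$(mysql -e 'SHOW SLAVE STATUS\G')
STATUS="Slave_SQL_Running: No
Seconds_Behind_Master: 1209600"
RUNNING=$(echo "$STATUS" | awk '/Slave_SQL_Running/ {print $2}')
LAG=$(echo "$STATUS" | awk '/Seconds_Behind_Master/ {print $2}')
if [ "$RUNNING" != "Yes" ] || [ "$LAG" -gt 300 ]; then
    echo "replication broken or $LAG seconds behind -- aborting backup"
fi
```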

~~~
timf
That's nasty. Maybe you could automatically inject test data into a fake
account and run test queries against the backup. That won't verify you're
getting everything, but at least that you got _something_ recent.

~~~
Locke
Verifying backups is almost always many times more difficult than setting up
the backups themselves. Checking that replication was in fact working turned
out to be fairly easy to automate (although, short of checksumming all the
records in all the tables you can't really be 100% sure everything is okay).

That's the thing with backups, they can fail or become corrupt in many ways.
If you don't use them for something on a regular basis or have some regular
verification process you may not know what you have until it's too late.

And, of course, I've also seen situations where subtle data corruption in the
master database leads to weeks of subtly corrupt backups. By the time the
corruption was discovered we faced the choice of rolling back several weeks to
the last good backup or fixing the corruption. In the end we had to do a kind
of merge -- it was a real pain.

~~~
bemmu
If I just have one MySQL machine and take daily database dumps, what would be
good way of testing that my backups are OK?

~~~
Locke
The best situation for me has been one of my personal projects. It's a gaming
site that I use every day. The database dump is less than 100MB gzipped, so I
load it in my development environment every week or so. That way as I develop
I'm verifying (to some extent) the quality of the backup.

As a baseline, you should at least restore from your backups occasionally.

It helps that I'm familiar with the data -- so after restoring the backup I
expect to see games, messages, forum posts, etc that I've just seen in
production.

I do some more thorough automated tests on the backups less frequently. App
specific things, like replay games and verify results, etc. This process is
more about assuring backward compatibility with the code though.

I think verification should ultimately be somewhat app specific. That said,
I'm sure you can find tools to help with verification.
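
Even the "load it and look" routine can be automated to a smoke-test level. A
toy sketch with a stand-in dump (the real version would pipe the restored SQL
into a scratch database and query it):

```shell
#!/bin/sh
# Smoke-test sketch: prove the backup gunzips cleanly and still
# contains data you expect. File and table names are stand-ins.
set -e
cd "$(mktemp -d)"
printf 'INSERT INTO games VALUES (1);\n' > prod_dump.sql
gzip -c prod_dump.sql > nightly_backup.sql.gz
gunzip -t nightly_backup.sql.gz                 # 1. archive is not corrupt
gunzip -c nightly_backup.sql.gz > restored.sql  # 2. real life: | mysql restore_test
grep -q 'INSERT INTO games' restored.sql        # 3. expected data is present
echo "restore smoke test passed"
```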

------
rarrrrrr
The trouble with convincing humans to back up is that it's important, but not
urgent. For any given day of procrastination, the likelihood of consequences
is small.

Our motto is "If it's not backed up, it's down." (Applies doubly so for us as
a backup company.) For sufficiently seasoned administrators, non-redundant
storage should cause sleeping difficulties.

------
raghus
_Early on the West-coast morning of Friday, January 31st_ - poor guys must
be frazzled. Jan 31 is Saturday.

~~~
seldo
Or else this is a dire warning from the future! There's still time to save the
data; this accident won't happen until tomorrow!

------
quoderat
The backup strategy, in my experience, is the first or second thing you need
to be thinking about when you set up a server, service or anything IT-related.

The first thing I ask when my boss tells me to set up something new: how are
we going to back that up?

~~~
eli
It's a classic dilemma -- most people don't care about backups until
immediately after it's too late.

------
timf
I'm planning on spending time building a full failure-recovery solution:
records snapshotted at least daily, and for any single node/service that dies
entirely, an exact, _tested_ manual recovery checklist or automatic rollover
option in place for each permutation.

This runs counter to the more cavalier "release early, polish later" advice I
keep seeing. Maybe I am doubly freaked out because the things I'm storing are
_not_ easily recovered or re-imported by the users themselves or any kind of
algorithm/redux.

~~~
bprater
It'll be a time sink. Why not do it as dirty as possible now and go back and
tune it as time goes on?

~~~
timf
See the original submission, that's why not :-)

I want to not only have a backup scheme but also make sure it's restore-
tested. Maybe I wasn't totally clear: I'm not planning on a beautiful failover
in each place at the beginning (though I am planning failover for the DB at
least), just a tested (even if manual) restore procedure for each situation.

------
jodrellblank
I doubt Google have backups, take a copy home on a CD, have a massive third
party database with dedicated certified people, or back up to Amazon S3.

I suspect their file storage system is simply good enough to replicate data
intelligently across many machines/sites like a giant RAID array.

I also think that it's a truly massive benefit they have over other companies,
particularly small companies, and that it's rarely discussed as such.

~~~
eli
Magnolia ain't exactly indexing every page on the internet though, ya know?

I agree with you that that is one of Google's key strengths, though. PageRank
is neat, but the real edge is that the site is _fast_ and the infrastructure
is based on commodity hardware, so it's cheap.

~~~
jodrellblank
The real edge is not that it's based on commodity hardware, but that it's
based on scalable low-management software (I imagine all this, and skimmed
their Google File System paper once. I don't work for Google).

It wouldn't matter if they were blades or Sun UltraWhatevers, the real benefit
is that if one dies, nobody need panic - plug another in and "the system" will
rebuild it. If a rack dies, plug another in and the system will rebuild it. If
a datacenter dies, the others will cover for it.

No backup tapes, no manual tweaking each new build.

~~~
mikeyur
As it's been said a million times before, hardware is cheap. What you really
need to build is a system that can put the cheapest Celeron servers you have
to work with the flick of a switch.

Adding a few more servers is much cheaper than downtime.

------
critic
If YC crashes and nulls my karma, there's gonna be a lawsuit!

------
sam_in_nyc
His note says it's January 31st... What will get fixed first? The date, or the
data? I'm betting on data.

~~~
whatusername
I don't know about you - but it's January 31st for me.. :P (although I do
believe it was just a typo)

~~~
jjs
Tell us what the future is like! ;-)

~~~
whatusername
Very hot down here...

Just had 3 days in a row over 43°C (109°F) and today's pretty warm as well
(my house still hasn't cooled down, for sure).

------
acangiano
Backing up is like testing: everyone knows they should do it, but very few
people actually do. That said, I'm sorry to hear about their troubles.
They are good people genuinely trying their best.

------
eli
What gets me is how people even do normal development without _some_ sort of
backup -- even an informal one.

Maybe I'm just lucky enough to work with relatively small datasets, but our QA
server contains a complete copy of the production database and is at most a
week old. So even if we didn't run any other backups (which we do), we could
restore from that server and be back up and running within hours.

------
nickb
From reading the notice, I get the feeling that they don't have any backups!
Not very reassuring at all.

------
decadentcactus
Well, I finally did it. I stopped what I was doing and worked out my backup
strategy, which includes (finally) getting a script put together. Now I'm
adjusting it to email me copies, and I'm contemplating using S3 to store
copies as well. Maybe also box.net, depending on a few things.

------
billturner
They have a page/section at GetSatisfaction, but it doesn't look like it will
be updated anytime soon (though there are 3 employees set to answer questions):
<http://getsatisfaction.com/magnolia>

------
philfreo
... glad I'm on del.icio.us, but going to export all my bookmarks now anyways.

~~~
timf
Let's say you had an insane amount of bookmarks (or something precious like
it). Would you pay to have DVDs burned and sent to you?

It seems like a convenient reassurance for users to get DVD backups mailed to
them every so often -- but it would be a total pain to support this as a small
business...

~~~
rbanffy
A burned DVD full of bookmarks is, indeed, an insane amount of data.

At about, say, 2K per bookmark, that's a couple million bookmarks. If one
bookmarks a site every 60 seconds, 24x7 that's about 4 years of work. A
slightly more reasonable person, bookmarking every 60 seconds for only 8 hours
a day, would have to be bookmarking websites continuously since about 1996 to
fill up a DVD.

~~~
timf
I understand; I'm responding more to the preciousness and irreplaceability of
it. The actual application I have in mind for this is not bookmarks :-)

------
sam_in_nyc
Does anybody use this site? I can never, ever remember the URL, and that
bothers me a lot.

The way I see it, there are too many social bookmarking sites out there. If
there's now one less, I wouldn't mind.

------
palish
I wish Twitter allowed comments on tweets. Companies (like Magnolia) use it
for important announcements / status updates, and yet there's no way to see
feedback from the community.

~~~
tptacek
That's what search.twitter.com is for.

~~~
timf
<http://search.twitter.com/search?q=%40magnolia>

~~~
timf
Yikes: "I feel like I've lost a piece of me. This is _scary_."

------
vaksel
sounds like the same thing as Journal Space
<http://www.techcrunch.com/2009/01/03/journalspace-drama-all-data-lost-without-backup-company-deadpooled/>

------
ahoyhere
We worked out our backup strategy for our time tracking system
(<http://letsfreckle.com/>) while in beta. While we're still small, we do full
db dumps to off-site, once an hour.

This does not require a lot of sophistication. And yes, we might have some
downtime while we restore in the event of a catastrophe, but we _do_ have the
data.

I can't even imagine the nerve of shipping a product to the public that
doesn't have even something this rudimentary in place.

