

A gentle reminder that RAID doesn't make offsite backups - there
http://journalspace.com/this_is_the_way_the_world_ends/not_with_a_bang_but_a_whimper.html

======
tdavis
It is astonishing that a site with years' worth of user-generated content
didn't have any sort of backup plan. Or just years' worth of data in general!

RAID isn't a backup solution, _period_.

------
timf
And no wayback machine either, ouch:

<http://web.archive.org/web/*/http://journalspace.com>

 _"We're sorry, access to <http://journalspace.com> has been blocked by the
site owner via robots.txt."_

~~~
gojomo
Their robots.txt shows they allowed Google (at an extreme 3-minutes-between-
fetches crawl-delay), but no other search engines, nor the Internet Archive.

Otherwise, a tool like 'Warrick' might get back a significant portion of the
content -- especially any that was old and well-inlinked. As it stands, maybe
they can scrape a little from Google's cache.

Warrick: <http://warrick.cs.odu.edu/>

------
dreish
The warning about RAID is well-founded, but I'm finding it hard to imagine
what sort of backup plan could defeat deliberate sabotage by a former insider.
I keep coming up with ideas, and immediately thinking of problems with them.
Short of visually examining a substantial number of backed-up files to confirm
they contain genuine and correct data ... but then wouldn't it be possible,
with enough effort, to create data that looked superficially valid but was in
fact, on much closer inspection, mixed up beyond usefulness?

RAID isn't a guard against filesystem bugs, carelessness, or extreme physical
server damage, but it seems to me this is more of a warning against bad blood.

~~~
olefoo
If they had been backing their stuff up to Amazon S3, or just dragging a USB
disk into the datacenter every month, they would have a copy of the data. And
more than likely, if they used the S3 option, they'd have a copy current from
the night before.

Somebody has to eat the broccoli and do the shit work of making sure that the
data is backed up; if you can't do that or you hand off the job to the sort of
person who is going to sabotage your company because they don't feel
appreciated... you aren't a startup, you're a hobby.

~~~
dreish
The suspected culprit was the chief IT person at the company, and I think
we're being led to believe he was both clever and motivated. USB disks would
likely contain garbage data N days before the culmination of the sabotage,
where N is whatever it takes to make sure the full rotation schedule gets
ruined.

On the other hand, it probably isn't a coincidence that this happened to
someone who had no backups at all.

~~~
cconstantine
At a high level the way to defeat this kind of attack is to require more
people to be involved. Large conspiracies most often fail.

In this case that could mean having 3 USB disks, each with their own 'owner'.
The owner of each disk would be responsible for making the backups and storing
them in a place only they can access.

This is also a lesson in how to let someone go. They say it's possible that a
former employee may have done the deed. When letting someone go on bad (and
even good) terms, you need to have IT disable all of their accounts as soon as
possible, preferably while they are in the process of being fired.

------
eli
Not even a dev server or QA server with some version of the database? Not even
a one-off manual backup from ages ago? _nothing_?

Pretty amazing.

~~~
dabeeeenster
Come to think of it, I find this pretty much impossible to believe. Unless the
data set was many, many terabytes, surely there would be some sort of copy of
the database somewhere? Even the schema with some reference data...

~~~
tdavis
That's entirely possible, but if the data is years old or simply reference
data, what's the point? They just lost years worth of peoples' blogs (or
whatever). That isn't really something that can be recovered from. Would you
go back? Would you recommend it to a friend?

~~~
dabeeeenster
If I owned JournalSpace and only had a schema left, I'd want to give it
another go, done properly. You still have the name, pagerank, domain name etc.
Sure, your name is (rightly) dirt for a while but I'd still want to give it a
go...

------
eli
Or on-site backups, for that matter.

~~~
gaius
I am absolutely certain, having been in a similar situation, that the
sysadmins kept saying "we need this" and the accountants kept saying "no". I
am also certain that the business will blame the sysadmins for not "selling"
it to them.

~~~
eli
Sure, I've been in similar situations too... but if it means buying a USB hard
drive on the office supply budget and running manual backups to that, so be
it.

------
amix
"Real men don't take backups, but they cry a lot."

That said, it's really hard to do backups of lots of data.

~~~
scott_s
If you can fit all of your data on one large drive, then it's not hard to do a
backup.

~~~
drusenko
that's not true. transferring a large amount of data in a consistent manner
can be difficult. dumping your database without taking down your live site (or
seriously impacting performance) can also be difficult without a dedicated
backup slave. and the same can hold for any storage solution where you are
short on IOs, no matter if the data fits on a drive or not.

~~~
eli
I dunno, dude. This doesn't seem like a site that requires extremely high
availability.

And anyway, if they can deal with rebuilding a RAID after a failed drive then
(by definition) the site can deal with copying all the data off the drive.
Heck, you could periodically yank a drive out of the RAID and replace it with
a fresh one, and you'd have a backup.

~~~
cconstantine
Just yanking RAID drives isn't a way to guarantee DB consistency, especially
if it's RAID5.

Database backups almost universally have to be made by the database system.
This is no excuse for lacking a backup system; backing up databases is a
solved problem.

~~~
eli
Seems pretty clear that it's RAID 1 (mirroring). And isn't that exactly what
RAID guarantees in a mirroring setup? That the two drives will have the same
bits?

~~~
cconstantine
Even RAID1 doesn't guarantee this'll work. There are many ways simply
disconnecting the secondary drive can cause problems:

1) The app may be in the middle of a series of DB commands that all need to
complete with success before the DB is consistent at the app layer.

2) The DB is in the process of writing out some table rows and hasn't
finished.

3) The DB has written some temporary locks to portions of the database that
need to be released.

4) The OS hasn't committed writes from the DB to disk.

5) The disk hasn't committed writes from the OS to platter yet.

Your best bet of this working is to cleanly shut down your app, then cleanly
shut down your DB and run an fs sync. After all that it might be ok to yank
the drive.
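For what it's worth, that sequence can be sketched as a script. This is a
minimal sketch, assuming a SysV-style init and a local MySQL; the service names
are hypothetical and the stop commands are left as comments since they depend
on your setup, so only the final `sync` actually runs here:

```shell
#!/bin/sh
# Shutdown-then-sync before detaching a mirror member (sketch, not gospel):
#
#   service myapp stop     # 1. stop the app so no multi-statement work is in flight
#   service mysqld stop    # 2. let the database flush its buffers and exit cleanly
#
sync   # 3. force the OS to commit any remaining buffered writes to disk
# Only after all three steps is it reasonably safe to pull the drive.
```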

Just yanking a RAID1 drive may work sometimes, but I wouldn't count on it.
Especially when every DB system I know of has some sort of backup/dump
mechanism. As someone else mentioned, RAID is great for providing high
availability, but it does not provide disaster recovery.

~~~
gaius
With any sane storage subsystem you can un-mirror and re-silver hot. Lots of
people used to back up Oracle like this (using a 3 way mirror), so you'd only
be in hotbackup mode for a minute. This was before RMAN.

------
acangiano
That's what you get for shouting at those disks. Oh, different story. Sorry.
:)

------
DenisM
Speaking of off-site backups, I have recently discovered
<http://www.jungledisk.com/homeserver/index.aspx>

It backs up your stuff to S3, and it can work on your Mac/PC, or on Windows
Home Server. Haven't tried it yet, but it looks pretty neat.

Also, new NAS from HP has built-in S3 backup:
[http://www.engadget.com/2008/12/29/hp-mediasmart-server-
ex48...](http://www.engadget.com/2008/12/29/hp-mediasmart-server-ex487-gets-
hands-on-love-and-full-blown-rev/)

~~~
acangiano
I agree. JungleDisk is an easy-to-use program that allows you to back up to
S3, and it works with GNU/Linux as well. For $20, you get unlimited free
updates, unlimited installations tied to the same S3 account, and all three
versions (Mac, Windows, Linux). Amazon S3 is pricey, but this program looks
like a great deal. Other recommended services are Dropbox and CrashPlan.

~~~
grouchyOldGuy
I love Dropbox and I am a recent convert to it, thanks to someone remarking
about it in a HN thread a few weeks ago. I now have my critical (encrypted)
accounts & passwords file in Dropbox on computers both at home and work. I
like that it's cross-platform too. I haven't heard about CrashPlan before, so
I am going to Google it now. Thanks.

------
parenthesis
A reminder also to keep a local copy of your blog (or whatever) which, for all
you know, is hosted like this.

~~~
seldo
It's a good thing that stuff like this doesn't happen too often, or people
would be more wary of trusting all their data to web services.

I mean, we _assume_ Google has a really amazing backup strategy for Gmail. We
_hope_ they do, because, jesus, I have a lot of irreplaceable information in
there. But we don't actually know.

Hmmm... does Gmail have an "export all" feature...?

~~~
timmorgan
pop3 + fetchmail + crontab makes me feel a whole lot better
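Something like this, say (the account and schedule are made up; `keep` leaves
the originals on the server, so this is a copy, not a move):

```
# ~/.fetchmailrc -- pull Gmail down via POP3 (hypothetical account):
poll pop.gmail.com proto pop3 user "me@gmail.com" pass "secret" ssl keep

# crontab entry -- fetch every hour, quietly:
0 * * * * fetchmail --silent
```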

~~~
PieSquared
Sadly it doesn't work for chat... I use a lot of Gmail chat, and there is no
known way to download a chat archive or anything of the sort.

------
lionheart
I can't believe that they didn't have _any_ backups. It's not hard to do.

My operation is way smaller, and I have a cron script that runs once a week to
dump the database, zip it, and then transfer it to a backup server.
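A sketch of that kind of weekly job, with hypothetical paths and hostnames.
The real dump command (e.g. `mysqldump --single-transaction`) and the offsite
copy are shown as comments; a stand-in `echo` plays the database here so the
pipeline shape is runnable as-is:

```shell
#!/bin/sh
set -e
# Weekly backup sketch: dump -> gzip -> verify -> ship offsite.
# In a real job the dump step would be something like:
#   mysqldump --single-transaction blogdb
OUT="/tmp/blogdb-$(date +%F).sql.gz"
echo "-- stand-in for the database dump" | gzip > "$OUT"
gzip -t "$OUT"   # make sure the archive isn't corrupt before shipping it
# scp "$OUT" backup@otherbox:/backups/   # hypothetical backup host
echo "wrote $OUT"
```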

This is one of the first things that I set up. It really should be the first
thing any company with user-generated content does.

~~~
chops
After building the app, of course. Without the app, there's no data to back
up. Point taken, though.

~~~
gaius
You'd have a backup strategy in place for your source control system, right?
Even if it's just tarring it and scp'ing it somewhere in a cron job.
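A minimal sketch of that cron job, under the assumption of a plain directory
repo; the path is a stand-in (created here so the tar step actually runs), and
the `scp` to a hypothetical offsite host is left commented:

```shell
#!/bin/sh
set -e
# Stand-in repo so the tar step is runnable; point REPO at your real tree.
REPO="/tmp/demo-repo"
mkdir -p "$REPO"
echo 'hello' > "$REPO/README"

# Tar the whole tree with a datestamp, then (in the real job) copy it offsite.
STAMP=$(date +%Y%m%d)
tar czf "/tmp/repo-$STAMP.tar.gz" -C "$(dirname "$REPO")" "$(basename "$REPO")"
# scp "/tmp/repo-$STAMP.tar.gz" backup@offsite.example.com:/backups/
```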

------
niels
I'd say they deserved to go out of business.

~~~
acangiano
I hate to be judgmental. We are humans, we all make mistakes. But it's my
understanding that they were operating for several years and hosting the data
for many people. Carrying on without a backup plan and hoping for the best was
very irresponsible and I agree with you that they, sadly, deserved to go out
of business.

~~~
grouchyOldGuy
The only good thing to come out of this is that everyone involved (sysadmins
and management) now is (or should be) a devout believer in making routine
backups. It's a hard way to learn a lesson, but they're not the first nor will
they be the last to learn the hard way.

------
catch23
RAID 1 mirroring is pretty much only used to prevent downtime. So if your hard
drive physically screws up, you still have all the boot sectors intact on the
other drive, so even a restart wouldn't kill you. I only use mirroring on disks
with spinning platters; for solid state I just ignore mirroring. Solid state
can go bad, but not as easily as magnetic metal spinning at 7200 RPM.

~~~
dreish
That should say "dramatically reduce downtime". A small number times a small
number is still not zero, and it is possible, though highly unlikely, to
suffer a second drive failure while operating in degraded mode. My
understanding is that's not what happened in this case, though.

~~~
tlb
It's actually very common for both drives to fail at once. Examples:

\- case fans fail, everything overheats

\- room AC fails, everything overheats

\- power supply goes haywire, toasts drive electronics

\- box falls off the shelf, drives crash

\- fire or smoke damages drives

\- roof leaks, drips onto drives

\- one drive fails but nobody notices for a month until the next drive fails

\- burglars steal box

No, not all these have happened to me.

------
PieSquared
...'gentle'?

------
PonyGumbo
There's mention of a staff, but I can't imagine how they survived based on the
traffic estimates I've seen for their site (14,000 visitors a month per
Slashdot). Does anyone know more about their business model?

------
scorpioxy
Doesn't the "automatically copied to both drives" mean they're running RAID 1?

The problem seems obscure...what kind of software bug overwrites all data on
disk?

~~~
olefoo

        dd if=/dev/random of=/dev/sda1

------
jcapote
Especially one that only contains two drives...

