

Coding Horror and blogs.stackoverflow.com experience "100% Data Loss" - gfunk911
http://codinghorror.com
From the site:

Coding Horror experienced 100% data loss at our hosting provider, CrystalTech.

I have some backups and I'll try to get it up and running ASAP!
======
nkohari
To be fair, Atwood thought his hosting provider (CrystalTech) was backing up
his system. As it turns out, their entire VM backup solution failed silently,
so everyone thought the backups were being made.

If anything, I'd say this is a sign not to work with CrystalTech.

~~~
DTrejo
_If anything, I'd say this is a sign not to work with CrystalTech._

After a mistake like this, it may be a sign that they will never mess up
again.

~~~
potatolicious
I disagree - it's a sign that they will never mess up their VM backup system
again.

It's also a sign that they lack experience and competence. So while your odds
of suffering _this_ problem are significantly lower, your odds of the infinite
number of _other_ problems are still troubling.

------
leftnode
1\. Register an Amazon AWS S3 account -
<https://aws-portal.amazon.com/gp/aws/developer/registration/index.html>

2\. Download my S3 backup script (or anyone's S3 Backup script) -
<http://github.com/leftnode/S3-Backup>

3\. Set up a cron job to push hourly/daily/whatever tars of your vhost's
directory to S3.

Spend, like, $10 a month. That's 30 GB of storage, 30 GB up and 30 GB down. Now, I
know that may not be a lot, but I doubt codinghorror.com had that much data.
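
For anyone who wants to see the shape of step 3, here is a rough sketch. It is not the S3-Backup script linked above. The paths, the bucket name `my-backup-bucket`, the script location in the cron line, and the use of the AWS command-line client (`aws s3 cp`) are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: archive a vhost directory and push it to S3.
# VHOST_DIR, BACKUP_DIR and BUCKET are hypothetical placeholders.
VHOST_DIR="${VHOST_DIR:-/var/www/example.com}"
BACKUP_DIR="${BACKUP_DIR:-/var/backups}"
BUCKET="${BUCKET:-my-backup-bucket}"

# Create a timestamped tarball of a directory and print its path.
make_archive() {
    src="$1"
    dest_dir="$2"
    stamp="$(date -u +%Y%m%dT%H%M%SZ)"
    archive="$dest_dir/$(basename "$src")-$stamp.tar.gz"
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1
    printf '%s\n' "$archive"
}

# Upload one file to S3. Assumes the AWS CLI is installed and configured.
upload_archive() {
    aws s3 cp "$1" "s3://$BUCKET/$(basename "$1")"
}

# Only back up when called with "run", so the functions can be tried alone.
# Example crontab entry (script path is hypothetical), daily at 03:00:
#   0 3 * * * /usr/local/bin/s3-backup.sh run
if [ "${1:-}" = "run" ]; then
    archive="$(make_archive "$VHOST_DIR" "$BACKUP_DIR")" && upload_archive "$archive"
fi
```

Keeping the archive step separate from the upload step means the tarball can be checked locally before anything is sent off the machine.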

~~~
PStamatiou
I do the same thing, with a script using ruby s3sync I wrote a while ago (that
I should probably update):

<http://paulstamatiou.com/how-to-bulletproof-server-backups-with-amazon-s3>

~~~
Legion
s3sync is great. I'm using it for automated backups for a number of work
projects. Thank you for writing it and sharing it.

~~~
PStamatiou
Oops I should clarify: "script using ruby s3sync I wrote a while ago"

I wrote a bash script that uses s3sync (not written by me). You can thank the
s3sync community for that! :-) <http://s3sync.net/wiki>

------
idlewords
"looks like it's 100% internet search caches for recovery. Any tips on
recovering images, which typically aren't cached?"

This does not sound like the tweet of a man with backups.

~~~
joshwa
"ugh, server failure at CrystalTech. And apparently their normal backup
process silently fails at backing up VM images."

<http://twitter.com/codinghorror/status/6573094832>

~~~
scott_s
It's not a backup if it's in the same location and managed by the same
process.

~~~
mrduncan
Also, if you don't regularly test backups they might as well not exist.

~~~
Confusion
If you regularly spend time testing backups, you should start paying someone
to do that. For instance, the guys who make the backup. Who should provide
that as part of the service. Which means they are responsible for failures and
can be held accountable.

------
scott_s
Let me get this right: the guy who ran a site whose entire purpose is to make
fun of other people's poor programming practices has no external backups?

~~~
nollidge
I think you're thinking of Daily WTF. Jeff's blog is/was just a programming
blog.

He also frequently said he was the world's worst coder, so...

~~~
pvg
Well, he did make sure to tell us to shut up, stop what we're doing and make a
back-up. Because he knows things, you see:

[http://74.125.93.132/search?q=cache:2HHNAk2SB6EJ:www.codingh...](http://74.125.93.132/search?q=cache:2HHNAk2SB6EJ:www.codinghorror.com/blog/archives/001045.html)

~~~
jrockway
The site says he has a backup, right? So it sounds like he ... does have a
backup.

~~~
pvg
Yes, except he doesn't. Hence his tweets and his own recent question on
stackoverflow about recovering his content from internet caches.

~~~
jrockway
Ouch. I wanted to give him the benefit of the doubt, but I guess he is just
fucked.

------
gfunk911
I want to make a snarky comment, but I just feel for the guy.

~~~
jacquesm
I feel for the guy too, but it really isn't the first time in the last couple
of months that some service provider fails at their #1 stated goal: to keep
your data safe.

And it isn't just the small ones either.

For the love of - insert your favorite deity here - please _try_ to restore
your backups, and try to do so on a regular basis. If not, all you might have
is the illusion of a backup.

It is a very easy trap to fall into, and I really am happy this guy writes
about it because it seems there are still people who feel that if their data
is in the hands of third parties, it is safe.

That goes for your stuff on flickr, but it also goes for your google mail
account and all those other 'safe' ways to store your data online. In the end
you have to ask who suffers the biggest loss if your data goes to the great
big back-up drive in the sky, the service provider or you. If the answer is
'you' then go and make that extra backup.
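
A restore test along those lines can be automated. The sketch below is one way to do it, assuming the backup is a gzip-compressed tarball whose top-level entry is the backed-up directory itself. The helper name `restore_check` and the example paths are made up for illustration.

```shell
#!/bin/sh
# Sketch: restore a tarball into a scratch directory and compare it with
# the directory it was made from. restore_check is a hypothetical helper.
restore_check() {
    archive="$1"   # e.g. /var/backups/site.tar.gz
    live_dir="$2"  # directory the archive was made from
    scratch="$(mktemp -d)" || return 1

    # A backup that cannot be extracted is not a backup.
    if ! tar -xzf "$archive" -C "$scratch"; then
        rm -rf "$scratch"
        return 1
    fi

    # diff -r exits non-zero if any file differs or is missing.
    diff -r "$scratch/$(basename "$live_dir")" "$live_dir" >/dev/null
    status=$?
    rm -rf "$scratch"
    return "$status"
}
```

On a busy site the live copy will have changed since the backup was taken, so in practice the comparison target might be a snapshot, or the check might be loosened to "extracts cleanly and contains the files I expect".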

~~~
pstuart
There's an added bonus to restoring backups on a regular (daily) basis: a one
day old instance of the production environment is always available as a
playground for training, dev or qa.

------
jacobian
"I had backups, mind you, but they were on the virtual machine itself"
(<http://twitter.com/codinghorror/status/6577510116>).

Sigh. When will we ever learn?

------
skrenta
I sent jeff tarballs from blekko's webcrawl for www.codinghorror.com,
blog.stackoverflow.com, www.fakeplasticrock.com and haacked.com - about 6300
pages overall. He's got Coding Horror back up from the basic html.

Unfortunately we don't have the images, but it looks like most of the site is
back up at least. It will probably be more work for him to re-integrate it
into the cms though.

------
walkon
Here's Google's cached version of a post of his about backup strategies:

[http://74.125.95.132/search?q=cache:2HHNAk2SB6EJ:www.codingh...](http://74.125.95.132/search?q=cache:2HHNAk2SB6EJ:www.codinghorror.com/blog/archives/001045.html)

It ends in what is now irony:

"If backing up your data sounds like a hassle, that's because it is. Shut up.
I know things. You will listen to me. Do it anyway."

~~~
natrius
The part you quoted is a quote itself, which could change the meaning for
readers like me.

~~~
walkon
The second sentence was. He quoted it earlier and repeated it at the end as a
clear endorsement of the statement.

------
synnik
I sympathize, but I have little patience for calling out your data provider
when disaster strikes. Sure, there is likely some technical fault on their
end, but ultimately you are accountable for your own site.

I'm not going to imply anything negative about him for losing the site
temporarily. We all have "learning experiences"...

I just hope that when he does come back, he posts an insightful analysis of
how he could have done more for his own reliability, rather than point fingers
at a vendor.

~~~
akeefer
How is it pointing fingers if your hosting provider lost your site plus their
backups of your site? They were being paid to provide a service, and they
failed at it in basically the worst possible way (i.e. total data loss, rather
than just downtime). Sure, he should have had his own backups, but how does
the fact that he could have recovered better from his host's screw-up change
the fact that they screwed up?

That's not too far from saying that it's your fault if you get hit by a car
while walking across a crosswalk because you didn't jump out of the way fast
enough.

They failed, they failed in a catastrophic way, and they deserve to have it
made public knowledge and to lose business over it. He should do a better job
backing up his own stuff, but he's right to be angry with his hosting provider
and to call them out.

~~~
synnik
He is right to be angry, but I have to disagree about calling them out while
the site is still down.

An analysis posted after the fact can lay out where the technical failures
took place, and that is the right time to describe issues with the host.

Pointing fingers in anger is very frowned upon in all organizations I work
with. It implies a lack of ownership of your own products and a lack of
maturity in handling your business affairs. Organizations that pursue blame
before solutions do not have a positive culture -- they have a fear-based
culture.

To answer your direct question, nothing he says will change the fact that the
vendor screwed up. Your analogy is correct that it is the driver's fault if
someone hits you, but flawed in that you cannot absolve yourself of all
responsibility for your own safety when crossing streets.

When a vendor screws up, the level of professionalism that you portray when
dealing with the situation says a great deal about yourself.

~~~
akeefer
Point taken regarding calling them out before the situation is resolved: it
would be far more respectful to allow the provider a reasonable amount of time
to correct the error before explicitly attacking them. In this situation, I
can understand it a little more, as recovering the backups from various
internet caches might seem like a time-critical operation, and it would be
difficult to ask for help with that without some explanation of what's
happened. That explanation certainly could be fairly vague, though. Once the
situation is resolved, though, if the resolution is unsatisfactory I think
it's fair to say so.

I really took issue with your initial comment because "pointing the finger"
has some connotation of assigning blame unfairly or unreasonably; I 100% agree
that companies with a culture of blame are poisonous, and that people should
err on the side of accepting too much responsibility rather than too little,
and do their best to not pass blame on to other people.

There's certainly a line, however, across which I think it's reasonable to
call someone else out. Where that line is depends on the situation and your
relationship: the bar for doing it within your team is astronomically high
(you should basically always deal with those things internally), within even
the same company is still incredibly high (likewise), but it's lower when it
comes to vendor relationships. Where you draw the line is probably different
from where I draw it.

So while I agree that he should have waited, I don't think that publicly
expressing his anger with his hosting provider after the fact would count as
"pointing the finger" or "passing the buck" or otherwise indicative of a lack
of personal responsibility; to me it would be understandable frustration and
anger out of having been so dramatically let down by a third party you were
contracting with. And honestly, that sort of negative publicity is one
of the strongest checks we have on companies, be they hosting companies or
retail stores or any other type of establishment.

More to the point: even if he did have backups, if they really lost his data
and were unable to recover it themselves, I think he'd still be justified in
outing their failure publicly after the fact. But again, you're definitely
right that he should have waited and given the host a chance to resolve the
issue before saying anything.

------
wingo
It's not the sort of thing I like to read, but still, it makes me cringe in
sympathy.

 _/me does a git pull from his weblog_

------
LargeWu
The next Stack Overflow podcast should be quite interesting...

------
apowell
This is why I like restoring my backup slice every so often, just to make sure
it's there. A couple years ago I had to use archive.org as my external backup,
and it wasn't fun.

------
jsteele
<http://web.archive.org/web/*/http://codinghorror.com/>

\--PidGin128 via bmn

~~~
ismarc
Looks like someone changed something near the end of June '08 that prevented
archive.org from storing the site. Gotta suck.

~~~
andrewcooke
they have a lag (it's supposed to be 6 months, but they're trailing that at
the moment).

~~~
ismarc
Yeah, I knew about the lag, but 1 1/2 years of lag seemed to signify other
issues to me.

~~~
andrewcooke
another data point here (i have been wondering about what is happening ever
since this thread) - i just noticed that archive.org's bot has crawled my web
site in the last month (or, at least, something identifying itself as such in
the logs).

------
jodrellblank
When people forget to save a document and close the program, among the "ha ha
I'm so superior" replies are a few people advocating unpopular niche systems
which are nice enough to autosave.

When people make mistakes and save them over the top, there among the smug
laughter are a few people on the user's side reminiscing about long forgotten
systems of old which have versioning filesystems by default so recovery is
only a moment away.

I haven't seen any replies in this thread along those lines - everyone is
putting it firmly on the system administrator or the user. Would it be so hard
during an install for a program to say "and now enter an encryption password
and an ssh server address where I can backup to nightly"? If it's so simple
you can script it yourself in a few minutes, isn't it so simple that many/most
systems should come with that themselves?

It's long past time for the "computer lost my data" / "well, you've only
yourself to blame" exchange to be considered an old-fashioned attitude.

------
scblock
This has to be tough. I run a small server that includes email hosting and an
image gallery site that numerous people contribute to, and I get a lot of
questions or complaints when it is down (usually bad network or a brief DC
power outage). I have been using SSH and rsync for a long time to pull the
contents of every important directory on that server to a local Solaris server
running a RAIDZ pool and time slider (so I do get incremental backup).

I didn't actually lose data like Jeff, but the datacenter the server was in
decided to kill the power to the machine 2 weeks before the scheduled date
(poor processes and a move to a different building) and I didn't get any
notification before it happened. It took another 3 weeks for them to ship my
machine back to me. Because I had nightly backups I was able to restore email
and the photo site to a new Linode instance in a few hours. Without those
backups I would have been hurting bad.

------
pavs
/runs and backs up everything.

------
antirez
I have backups of my silly blog antirez.com; this guy ran a business out of his
blog and doesn't have external backups. The data is small, it is easy to
transfer. It's almost unbelievable to me.

~~~
jrockway
Did he? I thought Jeff had some "real job" in addition to blogging and Stack
Overflow. I honestly doubt an occasionally-updated programming blog was
putting the food on his table.

~~~
antirez
I remember him blogging about making more money from the blog than from his
job at some point and deciding to blog full time.

I guess I can not link to the blog post...

~~~
AgentConundrum
> I guess I can not link to the blog post...

Well, I sort of can:

[http://74.125.93.132/search?q=cache:9vLV7YkK14sJ:www.codingh...](http://74.125.93.132/search?q=cache:9vLV7YkK14sJ:www.codinghorror.com/blog/archives/001074.html+leaving+vertigo+site:codinghorror.com&cd=1&hl=en&ct=clnk&gl=ca)

------
joe_the_user
In the Joel thread, there was mention of "Truly Great Programmers(TM)", what
their qualities are and how to hire them. I don't know if folks were putting
Jeff in that category or not. But as a skeptic of this True Greatness, I'm
perhaps grasping at straws in saying "hey look, anyone can screw up, True
Greatness, phooey".

------
techiferous
In case you'd like to back up your own blog, I posted a four part series on
how to do it with Amazon S3 and cron:
<http://techiferous.com/2009/11/getting-started-with-amazon-s3-and-s3fox/>

This assumes you're running your blog on your own VPS.

------
chewbranca
Sounds like a good coding horror article.

------
mhartl
Redundancy, people. All my really important data lives in at least three (and
often more) of the following eight places: (1) my current MacBook Pro; (2) my
legacy Linux box; (3) my external USB drive; (4) my personal (Slicehost)
server; (5) Dropbox; (6) Mozy; (7) Heroku; (8) GitHub. Wow, that was even more
than I expected. Did I mention redundancy?

------
bad_user
"Only wimps use tape backup: _real_ men just upload their important stuff on
ftp, and let the rest of the world mirror it ;)"

    
    
        Linus Torvalds
    

I guess people use Google's cache and archive.org these days :)

~~~
klodolph
I think it's more of an observation about which of the two are _real_ men, and
less an observation about Google.

------
mncaudill
His blog gets linked to a lot, for better or worse. That's a blow.

------
blhack
This link appears to no longer link to a relevant post :(...

------
radu_floricica
Just my 2 cents here: I encrypt my daily SQL backup before sending it to a
couple of places. This way I can store it pretty much anywhere - one place
is actually on a client's server. Aescrypt for encryption, ssh with expect for
actual transfer. Oh, and watch out for expect timing out - it was a fun moment
when backups grew beyond the default timeout and expect started truncating
them. Yup, backups should be checked often.
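
One cheap guard against that kind of silent truncation is to record a checksum before the transfer and verify it on the receiving side. A rough sketch, assuming GNU coreutils' `sha256sum` is available on both ends. The function names are invented for this example and are not part of Aescrypt or expect.

```shell
#!/bin/sh
# Sketch: detect a truncated or corrupted backup with a checksum.

# Before the transfer: write <file>.sha256 next to the file.
record_checksum() {
    ( cd "$(dirname "$1")" && sha256sum "$(basename "$1")" > "$(basename "$1").sha256" )
}

# After the transfer: check the copy against the recorded checksum.
# Returns non-zero if the file is missing, truncated, or altered.
verify_checksum() {
    ( cd "$(dirname "$1")" && sha256sum -c "$(basename "$1").sha256" >/dev/null 2>&1 )
}
```

The `.sha256` file has to travel with the backup. Because the checksum is taken over the encrypted file, it can be verified on a machine that never sees the key.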

~~~
blasdel

      Do not use SSH with Expect.
      Do not use them here or there.
      You will not use them anywhere.
    

Not just because of how much expect sucks: SSH tools leave out a --password
option for a reason! Use passphrase-less RSA keys with a restricted account.

~~~
_phred
Also, because expect(1) defeats the entire "remote shell" aspect of SSH, which
lets me do this from my workstation:

    
    
      ssh server tar zc /srv/http > http-backup.tar.gz
    

Yes, that's a remote command whose output is piped to a local file. With
imagination, ssh, and shell-fu, one can get very far with backup automation
and testing.
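
In the same spirit, a pulled archive can be sanity-checked on the spot. The following is only a sketch and not part of the command above: `archive_ok` is a made-up helper, and `server` and `/srv/http` are the placeholders from that one-liner.

```shell
#!/bin/sh
# Sketch: check that a pulled tarball is a complete, readable archive.
archive_ok() {
    # gzip -t fails on a truncated or corrupt gzip stream;
    # tar -tzf then confirms the tar structure can be listed.
    gzip -t "$1" 2>/dev/null && tar -tzf "$1" >/dev/null 2>&1
}

# Possible use after a pull (not executed here):
#   ssh server tar zc /srv/http > http-backup.tar.gz
#   archive_ok http-backup.tar.gz || echo "backup is damaged" >&2
```

This only shows that the archive is structurally intact. It says nothing about whether the right files are in it.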

------
jcapote
Maybe he should've bought Microsoft Backup Server 2010

------
thechangelog
Looks like coding horror is now back online.

------
jonknee
Maybe he should have asked a question on serverfault.com about the best way to
back up his server?

------
scorpion032
I feel for him.

But I wonder if this eventually turns into a case study for "How to back up
an entire site using the archives and search engine caches"

I know it's hard; but everything is _there preserved_ anyway. So it _is
possible_

~~~
pronoiac
I won't link it again in this thread, but check out Warrick. It's a tool to do
just that.

------
kolosy
well that didn't take long...

<http://stackoverflow.com/questions/1890914/how-do-i-backup-a-blog>

~~~
MHordecki
The question is gone, apparently.

EDIT: Back at
<http://superuser.com/questions/82036/recovering-a-lost-website-with-no-backup>

------
trs81
he's back: <http://twitter.com/codinghorror/statuses/6581734847>

------
dryicerx
Oh The Irony

This is really bad news for any service, but seriously... codinghorror.com
(shakes head)

 _Rule 16. If you fail in epic proportions, it may just become a winning
failure._

~~~
jrockway
System administration is not "coding".

~~~
gaius
That's splitting hairs. What's a backup script? What's a restore if not a unit
test?

------
clistctrl
What a nightmare, I hope he has a good writeup on the story when the site is
back online. I'm sure there are some great lessons to come out of all this for
the rest of us. Specifically, what were his expectations from the service
providers he worked with.

------
zackattack
Seems like there's room for a new startup for easy backups for shared
webhosts.

------
Methos
how about a google search site:<http://codinghorror.com/>

and then looking at cached results?

------
omouse
AWESOME. Less trash on the Web :D

------
heresy
well it's just his blogs, for a second i thought it was stackoverflow.com and
the rest of his sites, which would have been funny.

------
bham
I cannot imagine how _horrible_ this must be. Oh, the horror!

