
Ask PG: Postmortem of the outage? - lukeqsee
Clarification: This is not meant with any ill will towards PG or any of the other individuals who help run HN. It is a simple request for a postmortem _eventually_. Perhaps it's an unneeded request, but I think a lot of HNers echo the sentiment.
======
pg
I don't know the details. Nick Sivo is in charge of this stuff, and he'll post
something about it. I know he thinks the root of the problem was a disk
failure. The server got wedged, and when we rebooted, the file system was
corrupted. I'm not sure exactly why it took so long to restore. I was out of
town the whole time this was happening.

The reason we lost so much data was that we only do nightly backups. That
seemed enough when we started. Now that HN is a bigger part of more people's
lives, we'll make more of an effort to make it proof against this sort of
problem.

~~~
swalsh
It's amazing to think that HN is still someone's "side project".

~~~
hnriot
Why? The value of HN isn't the software, but the community.

~~~
toomuchtodo
That's the point. It's still a side project, but the community has enormous
value.

------
theGimp
It seems all activity from the past two days has disappeared -- backup storage
is something you never regret paying for.

You've probably all seen it by now, but from @HNStatus: [1]

    Server back up and seemingly stable. Now restoring our latest backup to recover from limited filesystem corruption.

[1]
[https://twitter.com/HNStatus/status/420179162138021888](https://twitter.com/HNStatus/status/420179162138021888)

~~~
aroman
Yeah, I lost about 200 karma (which was about 15% of my total) in the crash.

Good thing they're just silly internet points :)

~~~
bvirkler
I lost 25% of mine! A whole point!

~~~
jijojv
I lost 50%

------
sigvef
During the outage,
[https://twitter.com/HNStatus](https://twitter.com/HNStatus) went from
somewhere around 300 to 1163 followers.

~~~
lukeqsee
Earlier today I saw it had about 45 followers. I think it was a new account.
(Please correct me if I'm wrong.)

~~~
yaddayadda
Their first post is dated 28 July -
[https://twitter.com/HNStatus/status/361707202123268096](https://twitter.com/HNStatus/status/361707202123268096)

edited: "They're" --> "Their" (there/their/they're will be the death of me!)

~~~
asveikau
> (there/their/they're will be the death of me!)

I sure hope that isn't literal. I've heard of "grammar nazis" but that would
be ridiculous. Stay safe!

------
morganherlocker
I was bummed that the conversation around OpenStreetMap got killed in the
middle of it, and now I do not see it on the front page. Does anyone have a
link to that thread or did it disappear?

~~~
eevilspock
Me too. I guess we can start over:
[https://news.ycombinator.com/item?id=7015502](https://news.ycombinator.com/item?id=7015502)

~~~
jpatel3
It's no longer on the front page.

------
yaddayadda
I find it interesting that this question is fresher (by a minute), has more
points (67 v 42 at snapshot), and has more comments (18 v 10 at snapshot) than
"HackerNews down, unwisely returning http 200 for outage message" but is
ranked lower (2 v 1 at snapshot).

snapshot -
[http://oi40.tinypic.com/2mmbv5y.jpg](http://oi40.tinypic.com/2mmbv5y.jpg)

~~~
Kronopath
Self posts are penalized so they don't clog the front page for long.

[http://jacquesmattheij.com/The+Unofficial+HN+FAQ#selfposts](http://jacquesmattheij.com/The+Unofficial+HN+FAQ#selfposts)

------
rhizome
Postmortem: it went down last night when people should have been going to
sleep before their first day back at the job after holidays. It stayed down
until the end of that day, with the last couple of days of vacation insanity
erased.

Appreciate the gift of perspective that has been given.

~~~
carljoseph
Interesting that your perspective is locked into one side of the globe. ;) HN
was down during the day my time, when we had already slept before returning to
work. :)

Appreciate the gift a new perspective gives you.

~~~
rhizome
I scoped it to a YC framing.

------
geerlingguy
Would like to read it too. And it looks like right now is a good time to get
just about anything in the front page. Front pretty much == new.

~~~
ewoodrich
Are you actually able to see 'top'? I'm still getting the error.

EDIT: (never mind, it was just cached)

~~~
geerlingguy
Yeah, the 200 response during the outage is playing with everyone, I think;
you have to do a hard refresh on any URL you had visited during the downtime
:/
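
For anyone wondering why the 200 matters: a browser or proxy may cache a 200 page, while a 503 with `Cache-Control: no-store` tells clients the outage is temporary and should not be stored. Here is a minimal sketch of that idea in Python. It is not how HN serves its pages; `outage_response`, `OutageHandler`, `serve`, the 600-second retry hint, and port 8503 are all made up for illustration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Message text taken from the outage page quoted in this thread.
OUTAGE_BODY = b"Sorry for the downtime. We hope to be back soon.\n"


def outage_response(retry_after_seconds=600):
    """Build (status, headers, body) for a temporary outage page.

    The 600-second default is an arbitrary illustrative value.
    """
    headers = {
        "Content-Type": "text/plain; charset=utf-8",
        "Content-Length": str(len(OUTAGE_BODY)),
        # Hint to clients and crawlers about when to try again.
        "Retry-After": str(retry_after_seconds),
        # Ask browsers and proxies not to keep a copy of this page.
        "Cache-Control": "no-store",
    }
    # 503 Service Unavailable marks the failure as temporary.
    return 503, headers, OUTAGE_BODY


class OutageHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with the outage page."""

    def do_GET(self):
        status, headers, body = outage_response()
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.end_headers()
        self.wfile.write(body)


def serve(port=8503):
    """Run the outage server on localhost (blocks until interrupted)."""
    HTTPServer(("127.0.0.1", port), OutageHandler).serve_forever()
```

Calling `serve()` and then running `curl -i http://127.0.0.1:8503/` should show the 503 status line along with the `Retry-After` and `Cache-Control` headers.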

------
rcfox
pg: I don't know how much you care to get back the data that was lost, but it
seems like it's at least partially available in the hnsearch.com API:
[http://api.thriftdb.com/api.hnsearch.com/items/_search?prett...](http://api.thriftdb.com/api.hnsearch.com/items/_search?pretty_print=true&sortby=create_ts%20desc)

------
joshuaheard
I'm not an expert in internet architecture, but shouldn't a site this
important be running on redundant servers? The irony of a tech site going down
due to a technical issue is making me grin, however. Glad to see it back :)

~~~
alan_cx
"Important"?

Really?

Obviously I'm a fan of the site, etc, etc, but "important"? On what level?

I'm not even sure I'd call Facebook or Twitter important. Banking, yes. Weather
warnings, yes. Things like that, sure. But I'm also pretty sure "important" is
slightly over-egging it for dear HN.

(No offence PG xxxx)

~~~
kamaal
>> I'm not even sure I'd call Facebook or Twitter important.

Imagine Twitter or Facebook being down during Egyptian revolutions.

~~~
mkr-hn
IRC was the go-to before social networking. It's where I got up-to-the-minute
updates as the events of 9/11 unfolded, despite being 800+ miles away. That's
also when I realized TV news is obsolete.

------
dschiptsov
Are there any plans to release a new version of Arc, if it exists, or the
server-side code (without business-critical stuff)? I guess there have been
lots of improvements since the last Arc release.

------
nmc
Despite the website being back online, the root URL still redirects to the
error page (at the time of writing this).

So [https://news.ycombinator.com/news](https://news.ycombinator.com/news)
works, but [https://news.ycombinator.com](https://news.ycombinator.com) still
redirects to _"Sorry for the downtime. We hope to be back soon."_.

~~~
watermel0n
It's your browser cache.

~~~
nmc
Yes it was!

------
rainmaking
This must have been the most productive time for the tech industry in months.

~~~
ithkuil
No, I just kept wasting time reloading the HN home page or following
notifications on Twitter!

------
cenhyperion
I'm also interested in what the infrastructure of HN looks like. One of the
tweets via @HNStatus seemed to imply that the site runs off of one application
server.

~~~
sigvef
HN is indeed running on a single (10 month old) server, it seems [1].

[1]:
[https://news.ycombinator.com/item?id=5229364](https://news.ycombinator.com/item?id=5229364)

------
xmonkee
Social experiment

~~~
noblethrasher
You may jest, but I once suggested something like that:
[https://news.ycombinator.com/item?id=2403880](https://news.ycombinator.com/item?id=2403880)

------
ithkuil
I wonder how much effort would be reasonable to improve the resilience of HN
to this kind of issue, given that it's relatively rare and HN doesn't really
lose money during a downtime such as this.

~~~
ams6110
Probably little. There's no ad revenue being lost, no business transactions
that can't be completed, no life-saving information that can't be accessed.
When you boil it down, it's a social/entertainment site, nothing anyone can't
live without for a day or two.

~~~
pbhjpbhj
> _nothing anyone can't live without for a day or two_ //

On this basis can't you shut down pretty much anything the majority use
day-to-day?

------
DonGateley
If the outage was due to something malicious I don't really expect to see a
postmortem.

------
pearjuice
Do we get the karma we lost refunded somehow? I am certain I am missing around
30 points.

~~~
pbhjpbhj
Because?

------
royalghost
I am sure pg is going to write an essay on this :-)

------
stickhandle
It bothered me more than it (probably) should

------
vanwilder77
Damn, I made it to 1300 karma!

Now I'm back at 1273.

------
nhangen
Because it's not good enough that the site is back, we need to pile on and
complain too...

~~~
lukeqsee
I'm not complaining. I'm asking a legitimate question. A popular site has been
down for a significant period of time; I think a postmortem will be
insightful.

~~~
nhangen
If you're asking, it means you know that PG authors them in most instances.
The fact you felt the need to start a new thread, just minutes after the site
is back, tells me you're anxious and overeager.

~~~
lukeqsee
I don't disagree with your assessment. I think it's more of an opportunist
running an (admittedly) self-serving social experiment of my own. And I was
also curious (naturally) and well-aware _someone_ would ask the question.

I wasn't complaining, however. :)

Edit: clarification on motivation.

