
This was all me. I probably should have thought about it more, but I just wanted to make it clear we knew something was wrong and were working on it. Load was not a concern.

Though the article is correct, with everything else that was going on, response codes and cache headers were the least of my worries.

I think the best takeaway is that you will go down at some point, so it's best to have a reasoned plan in place for when you do. Handling it in the heat of the moment means you'll miss things.

Yep, I think your takeaway is exactly the right one, about planning in advance for outages.

I didn't post to try to make HN look bad; using the wrong HTTP status code isn't a huge deal or anything. I just wanted to take the opportunity to discuss HTTP response codes, an issue near to my heart. (In my day job at an academic library, the fact that most of the vendors we deal with deliver error pages with 200s does interfere with things we'd like to do better.)

Thanks for the reply!
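
For anyone wondering what "the right status code" might look like in practice, here is a minimal sketch using only Python's standard library. It is not how HN serves its pages; the body text, the 600-second `Retry-After` value, and the names `maintenance_response` and `MaintenanceHandler` are all assumptions made up for illustration. The idea is simply to answer with 503 rather than 200, hint at when to retry, and tell caches not to store the outage page.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical outage message and retry hint; adjust to taste.
MAINTENANCE_BODY = b"We know something is wrong and are working on it.\n"
RETRY_AFTER_SECONDS = 600  # assumed value, not taken from the thread


def maintenance_response():
    """Return (status, headers, body) for a temporary outage page."""
    headers = {
        "Content-Type": "text/plain; charset=utf-8",
        "Content-Length": str(len(MAINTENANCE_BODY)),
        # Tells well-behaved clients and crawlers when to come back.
        "Retry-After": str(RETRY_AFTER_SECONDS),
        # Keeps browsers and proxies from caching the outage page.
        "Cache-Control": "no-store",
    }
    # 503 Service Unavailable signals a temporary failure, unlike 200.
    return 503, headers, MAINTENANCE_BODY


class MaintenanceHandler(BaseHTTPRequestHandler):
    """Answers every GET with the maintenance response above."""

    def do_GET(self):
        status, headers, body = maintenance_response()
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.end_headers()
        self.wfile.write(body)
```

To try it locally, pass `MaintenanceHandler` to `http.server.HTTPServer` bound to a local port and call `serve_forever()`. In a real deployment this would more likely be a few lines of front-end web server config prepared ahead of time, which fits the point about having a plan in place before you go down.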

If it becomes a big deal, and you can't get the data from ThriftDB for some reason, I've got a copy of HN data (submission & comment content, user, points, item date) up to id 7018491, a comment by kashkhan, time-stamped 2014-01-05 23:56:34.

edit: Ack, I just realized that item_id got reset back to 7015126 on the reboot. My data matches HN up to 7015125, and then diverges after that.

Can you make it available somewhere? I'd be interested in that data, and I'm sure that others would be too. Thanks!

Sure. This is very temporary, I'll be removing the links sometime tomorrow:



After a semi-random sampling, the comments file appears to contain nothing but comments pre-crash.

The submissions, however, got clobbered a little by the crawler at some point. There are some submissions in there pre-crash and some post-crash; I think everything's OK from 7015172 on, which only leaves 15 possibly damaged rows, and of those, I'd expect most didn't have id collisions. Sorting out the old stuff from the new stuff could be done manually.

(Please let me know if there's anything I should be concerned about in those, or if they shouldn't be posted for some reason, or something. I'm recovering from flu and am still not entirely all here.)

