
This may sound selfish, but GitHub does such a great job of writing up post-mortems that I almost look forward to their outages, just because I know I'm going to learn a lot when they write their follow-up.


Came here to say the same thing. This is Mark Imbriaco's wonder twin power. It really is something to generate goodwill from an outage writeup. Part of it is just unflinching transparency combined with nerdy details; you never feel like they're hiding anything, and you get to learn about all the operational doodads they're working with to run at this scale.


Thanks, I really appreciate that.

For me, the motivation for transparency came from too many frustrating instances of being kept in the dark after things had gone wrong. The worst thing both during and after an outage is poor communication, so I do my best to explain as much as I can what is going on during an incident and what's happened after one is resolved.

There's a very simple formula that I follow when writing a public post-mortem:

1. Apologize. You'd be surprised how many people don't do this, to their detriment. If you've harmed someone else because of downtime, the least you can do is apologize to them.

2. Demonstrate understanding of the events that took place.

3. Explain the remediation steps that you're going to take to help prevent further problems of the same type.

Just following those three very simple rules results in an incredibly effective public explanation.


This sort of approach is the reason that when I need to upgrade to a higher plan on GitHub, I don't flinch. In fact, I love giving you guys more money, simply because you make my life completely painless; I don't think I can say the same about any other service. Keep up the awesome work.


GitHub and CloudFlare are my two favorites in this regard; if you like GitHub's post-mortems, you'll love CloudFlare's: http://blog.cloudflare.com/tag/postmortem


Git itself is a distributed VCS, so the GitHub downtime shouldn't have been too devastating to most people. However, speaking as somebody who uses GitHub Pages a lot and had it go down at one of the worst possible times for me, I can say that this postmortem definitely quenched any possibility of me considering moving somewhere else, since I am confident my data/uptime is as safe with them as it could be with anybody.

Everybody suffers from downtime. It's how you handle it that matters.



