
After fixing a recent bug, I asked my client company what postmortem process, if any, they had. I informally noted about 8 factors that had driven the resolution time to ~8 hours from what probably could have been 1 or 2. Some of them were things we had no control over, but a good 4-5 were within the application team's immediate control or within its orbit.

These are issues that will definitely recur in troubleshooting future bugs, and doing a proper postmortem could easily save 250+ man hours over the course of a year. What's more, fixing some of these issues would also aid in application development. So you're looking at immediate cost savings and improved development speed just by doing a napkin postmortem on a simple bug. I can't imagine how much more efficient an organization with an ingrained and professional postmortem culture would be.

For anybody into podcasts, I can recommend "Causality" https://engineered.network/causality/

John Chidgey digs into well-known catastrophes and analyses what went wrong and what was fixed afterwards. It's not software-related, but it promotes a safety mindset very well.

I love Chidgey's podcasts.

