Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Thank you! Opening an incident as soon as user impact begins is one of those instincts you develop after handling major incidents for years as an SRE at Google, and now at Anthropic.

I was also fortunate to be using Claude at that exact moment (for personal reasons), which meant I could immediately see the severity of the outage.





It's important for companies to use their own products.

Unless using your own dogfood prevents you from fixing it if it breaks

https://www.theguardian.com/technology/2021/oct/05/facebook-...

I have a memory that Slack fell into this trap too (I could be wrong)


Facebook notoriously had to cut open the doors to one of their data centers.

Google SRE still keeps IRC available in case of an emergency.


Now I’m imagining the folks at Slack gritting their teeth and using MS Teams

Take my condolences, Sunday outages are rough

Sweet. Hopefully it is more than instinct but a codified at Anthropic. I.e. a graduate engineer with little experience can assess and raise incident if needed.



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: