
The answer is "Yes" to all of your questions.

Regarding the question about culture: yes, busy people often get sloppy. But when a P1 alert comes in because a site reliability engineer could not resolve the issue by following the playbook, it reflects badly on the team, and all affected stakeholders ask a lot of questions about why the playbook was deficient (when a service goes down at Amazon, it may affect multiple other teams). Nobody wants to be in that situation. In fact, no developer wants to be woken up at 2 a.m. because a service went down and the on-call SRE could not fix the issue. So it is in developers' own interest to write good, detailed playbooks.
