

Failure is an Option - mcfunley
http://www.indecorous.com/failure/

======
wpietri
Fantastic.

One of the biggest shifts in my thinking about deployment has been optimizing
for low MTTR (Mean Time to Recovery) rather than high MTBF (Mean Time
Between Failures). It really helps me push back against all of those
appealing-but-harmful "solutions" where the theory is that if we all just
think a little harder we can be perfect.

I still would prefer zero errors. But when there's a tradeoff between that and
low error impact, I'll almost always take the latter.
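
To put rough numbers on that tradeoff, here is a minimal sketch using the standard steady-state availability formula, MTBF / (MTBF + MTTR). The function name and all of the figures are made up for illustration and don't come from any real system.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Fraction of time the system is up, given mean time between
    failures and mean time to recovery (both in hours)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical system A: fails rarely, but each recovery is slow.
rare_but_slow = availability(mtbf_hours=1000.0, mttr_hours=10.0)

# Hypothetical system B: fails 10x as often, but recovers 100x faster.
frequent_but_fast = availability(mtbf_hours=100.0, mttr_hours=0.1)

print(f"rare but slow:     {rare_but_slow:.4%}")
print(f"frequent but fast: {frequent_but_fast:.4%}")
```

With these assumed numbers the system that fails more often still ends up with higher availability, because each failure costs so little. Real systems won't split this cleanly, so treat it as intuition rather than a model.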

