
The Calculus of Service Availability - kiyanwang
https://queue.acm.org/detail.cfm?id=3096459&__s=dnkxuaws9pogqdnxmx8i
======
nickpsecurity
"Thus far, this article has established what might be called the "Golden Rule
of Component Reliability." This simply means that any critical component must
be 10 times as reliable as the overall system's target, so that its
contribution to system unreliability is noise. It follows that in an ideal
world, the aim is to make as many components as possible noncritical. Doing so
means that the components can adhere to a lower reliability standard, gaining
freedom to innovate and take risks."
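
To put rough numbers on the quoted rule, here is a minimal sketch. It assumes critical components fail independently and sit in series, so every one of them must be up for the system to be up. The function names and the 99.99% / 99.999% figures are illustrative assumptions, not taken from the article.

```python
def serial_availability(component_availabilities):
    """Availability of a system that needs every listed component to be up.

    Assumes failures are independent, so availabilities multiply.
    """
    result = 1.0
    for availability in component_availabilities:
        result *= availability
    return result


def budget_share(component_availability, system_target):
    """Fraction of the system's allowed downtime one component uses by itself."""
    return (1.0 - component_availability) / (1.0 - system_target)


if __name__ == "__main__":
    target = 0.9999        # hypothetical system target ("four nines")
    component = 0.99999    # hypothetical critical component, 10x more reliable
    print("share of error budget:", budget_share(component, target))
    print("three such components in series:", serial_availability([component] * 3))
```

Under these assumptions, one component that is 10 times as reliable as the target eats about a tenth of the downtime budget, and three of them in series still leave the system comfortably above the target.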

In information security, the pioneers discovered the concept of the Trusted
Computing Base. They noted that systems had a lot of attack surface. Problems
would show up everywhere. Verification cost went up and feasibility went down
as the size and complexity of the system increased. The solution was to design
systems where you could trust one or a few components to ensure the security
of the system even if all the others got breached. That's the TCB. It was required
to be NEAT: Non-bypassable, Evaluable, Always-Invoked, and Tamper-proof.

[http://www.landwehr.org/1983-bats-ieee-computer-as.pdf](http://www.landwehr.org/1983-bats-ieee-computer-as.pdf)

High-availability engineers came up with similar concepts, like redundant
systems with voters, which made much of the hardware untrusted. For language
designers, the TCB is the type system and runtime [if any], as far as the
language itself goes. For proof engineers, it's a tiny proof checker that can
spot problems in complex proof assistants. For distributed databases, most of
it is how storage is handled, plus the protocols. This pattern is worth
remembering since it keeps popping up over and over.
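
The redundant-systems-with-voters idea can be sketched as a majority vote over replica outputs. This is a minimal illustration, assuming replicas return hashable values and that the voter itself is the small part you have to trust. The name `majority_vote` is hypothetical and not from any particular system.

```python
from collections import Counter


def majority_vote(outputs):
    """Return the value reported by a strict majority of replicas.

    The replicas are untrusted; only this voter has to be correct.
    Raises if there are no outputs or no value has a strict majority.
    """
    if not outputs:
        raise ValueError("no replica outputs to vote on")
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 <= len(outputs):
        raise RuntimeError("no strict majority among replicas")
    return value


if __name__ == "__main__":
    # Three replicas, one of them faulty: the voter masks the bad answer.
    print(majority_vote([42, 42, 7]))
```

The point of the design is the same as with a TCB: the voter is small enough to verify carefully, so the replicas behind it can be held to a lower standard.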

