Yep. There's a transition point where you can no longer rely on redundancy alone, because there are so many components that it's basically inevitable that something, somewhere, is in a degraded state at any given time. So you design for that case, the degraded-normalcy case: you make a failure somewhere a non-emergency. It takes a lot of work, but once things work that way you can make sure you stay in that state by routinely testing it in production.
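To put rough numbers on the "basically inevitable" part, here's a back-of-the-envelope sketch. It assumes components fail independently and that each one is degraded some fixed fraction of the time; both are simplifications, and the function name and the 0.1% figure are made up for illustration.

```python
def prob_any_degraded(n_components: int, p_degraded: float) -> float:
    """Chance that at least one of n components is degraded right now.

    Assumes independent components, each degraded with probability
    p_degraded at any given moment (an illustrative simplification).
    """
    if n_components < 0:
        raise ValueError("n_components must be non-negative")
    if not 0.0 <= p_degraded <= 1.0:
        raise ValueError("p_degraded must be between 0 and 1")
    # P(at least one degraded) = 1 - P(all healthy)
    return 1.0 - (1.0 - p_degraded) ** n_components


if __name__ == "__main__":
    # Hypothetical fleet sizes, each component degraded 0.1% of the time.
    for n in (10, 100, 1_000, 10_000):
        print(f"{n:>6} components: {prob_any_degraded(n, 0.001):.4f}")
```

With a handful of components the chance is tiny, so a failure is a rare event you can treat as an emergency. With thousands it gets close to 1, which is why "something is degraded" has to be the normal operating state rather than an incident.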

