That I like. From the photos it's also obvious the rack is tipping.
I'm curious why someone isn't doing a regular walkthrough.
There could also be sounds of some hardware failures.
A remote SRE shouldn't need to monitor the hardware health; an on-site person should have caught this sooner.
I wonder how the person responsible for fixing/replacing the wheels felt about the follow-up.
Sure - the issue is that we (the public reading this) don't know when this actually physically failed. It could be that SRE picked up on this before a scheduled walkthrough (which I'm sure occurs) happened.
Disclaimer: Google employee. No knowledge of this event beyond the blog post.