As the owner of the linked GitHub repo (also rendered on https://k8s.af, thanks to Joe Beda), I highly encourage everyone to contribute their failure stories (I'm still looking for the first production service mesh failure story...).

Also be aware of availability bias: Kubernetes enables us to collect failure stories in a (more or less) consistent way, which was previously not easily possible (think about on-premises failures, other fragmented orchestration frameworks, etc.). I'm pretty sure there are many more failure stories in total about other things (like enterprise software), but we will never hear about them as they are buried inside orgs.

BTW, I also have a small post on why I think Kubernetes is more than just a "complex scheduler": https://srcco.de/posts/why-kubernetes.html