
Kubernetes Failure Stories - yankcrime
https://github.com/hjacobs/kubernetes-failure-stories
======
dullgiulio
Nice, exactly what I needed today, after firefighting a latency spike caused by
a compute node after a separate ingress node was restarted.

I would make the taxonomy a bit more precise: failures of software running on
the platform and failures of the platform itself (which of course affect the
software running on top of it: high latency, dropped network packets, DNS not
resolving, ...)

In general I find that when the platform works as expected, it is not that
easy to make the software running on it fail. That is, it is harder to break
than it would be without what Kubernetes provides (granted: you can have the
same without Kubernetes, but many of us never bothered to have capable admins
set things up the right way).

What I find extremely fragile is the platform itself. It works, but you have
to start and stop things in the right order and navigate a sea of self-signed
TLS certificates that can expire, iptables rules, services and logs.
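
For what it's worth, on a kubeadm-provisioned cluster you can at least see how
close you are to the expiry cliff. A minimal sketch (the first command only
exists on newer kubeadm releases; older ones spell it `kubeadm alpha certs
check-expiration`):

    # Run on a control-plane node; paths are the kubeadm defaults.
    kubeadm certs check-expiration
    # Or inspect an individual certificate directly:
    openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -enddate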

All have failure modes you need to learn: it takes a dedicated team. And once
you have that, you'll need another team and cluster to perform an upgrade.

But hey, when it works, it is really cool to deploy software on it.

~~~
tialaramex
> a sea of self-signed TLS certificates that can expire

I would like to know more about what's going on here. Is this just a sloppy
description and in fact Kubernetes uses a private PKI, so that the certs
you're using aren't in fact self-signed but signed by a private CA?

~~~
raesene9
There's (IMO) a mix.

Kubernetes and cloud native software make a _lot_ of use of TLS for mutual
auth.

A standard kubeadm cluster (very vanilla k8s) has three distinct certificate
authorities, all with the CA root keys kept online.
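
For example, with the default kubeadm layout they all live on the
control-plane node's disk (a rough sketch; paths assume stock kubeadm):

    ls /etc/kubernetes/pki/ca.{crt,key}              # cluster CA
    ls /etc/kubernetes/pki/front-proxy-ca.{crt,key}  # front-proxy CA
    ls /etc/kubernetes/pki/etcd/ca.{crt,key}         # etcd CA
    # The admin kubeconfig embeds a client cert signed by the cluster CA:
    grep client-certificate-data /etc/kubernetes/admin.conf | awk '{print $2}' \
      | base64 -d | openssl x509 -noout -subject -enddate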

On top of that things like Helm, FluentD and Istio will make use of their own
distinct TLS certs.

One of the most "fun" pieces is that k8s does _not_ support certificate
revocation, so if you use client certs for AuthN, then a user leaving/changing
jobs/losing their laptop can lead to a full re-roll of that certificate
authority :)

~~~
zxcmx
If you issue user certs, it is best to do it from software and have them live
for 8h max.
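
A rough sketch of that flow via the CSR API (user name hypothetical;
`spec.expirationSeconds` is only honoured on newer Kubernetes releases, and
the signer may still clamp the requested lifetime):

    openssl genrsa -out alice.key 2048
    openssl req -new -key alice.key -subj "/CN=alice/O=dev" -out alice.csr
    cat <<EOF | kubectl apply -f -
    apiVersion: certificates.k8s.io/v1
    kind: CertificateSigningRequest
    metadata:
      name: alice
    spec:
      request: $(base64 -w0 < alice.csr)
      signerName: kubernetes.io/kube-apiserver-client
      expirationSeconds: 28800  # 8 hours
      usages: ["client auth"]
    EOF
    kubectl certificate approve alice
    kubectl get csr alice -o jsonpath='{.status.certificate}' | base64 -d > alice.crt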

Also recommended: keep api behind a jumpbox.

~~~
raesene9
I've seen short-lived certs as a suggested workaround for the lack of
revocation and as a user that might be the best option.

That said, the distributions I've seen that make use of client certs don't do
that (the typical lifetime for a client cert is 1 year), so I'm guessing a
load of people using k8s will have these certs floating about...

~~~
zxcmx
As an aside, if you want revocable, non-expiring user creds, you would be
better served by bearer tokens.

~~~
raesene9
True, the only real in-built option in k8s land for that is service accounts,
which aren't designed for user auth but can be used for that purpose.
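
A rough sketch of what that can look like (names hypothetical; this assumes
the classic behaviour where each service account automatically gets a
non-expiring token stored in a secret, which newer clusters no longer do):

    kubectl create serviceaccount deploy-bot
    SECRET=$(kubectl get serviceaccount deploy-bot -o jsonpath='{.secrets[0].name}')
    TOKEN=$(kubectl get secret "$SECRET" -o jsonpath='{.data.token}' | base64 -d)
    kubectl config set-credentials deploy-bot --token="$TOKEN"
    # "Revocation" is just deleting the secret (or the whole service account):
    kubectl delete secret "$SECRET"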

------
msoad
You know what's worse than Kubernetes complexity? A shallow abstraction of it
by some mediocre team chasing a promotion, leaving everyone else to struggle
with something you can't get help with anywhere else. This happened at the
past two unicorns I worked for. They templated the templates and wrapped
`kubectl` with some leaky abstraction.

At least raw Kubernetes is Google-able!

~~~
hjacobs
We try to keep it simple at Zalando, i.e. just using Mustache templating and
wrapping kubectl to add extra commands. See
[https://www.slideshare.net/try_except_/developer-experience-...](https://www.slideshare.net/try_except_/developer-experience-at-zalando-cncf-end-user-sigdx)
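
(Not our exact tooling, but the shape of it is roughly: render the template,
pipe the result to kubectl. A minimal sketch, assuming the Ruby `mustache` CLI
and hypothetical file names:)

    # deployment.yaml.mustache contains placeholders like {{image}} and {{replicas}}
    mustache values.yaml deployment.yaml.mustache | kubectl apply -f -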

------
manishsharan
A long time back, I went down a rabbit hole trying to get Kubernetes to manage
my MongoDB. Thankfully, I got bitch-slapped into reality within a few days and
gave up, because that rabbit hole goes deep. Now my MongoDB stands outside of
K8s and I have no issues with it that can't be solved with a rebuild of a
replica node.

~~~
elsonrodriguez
Glad I'm not the only one who almost got white-whaled into running stateful
services on K8s.

The API is so slick that it makes you feel you can do anything. But there's
can, and there's should.

~~~
pas
The API is there, but is there a stable MongoDB Operator?
([https://github.com/Ultimaker/k8s-mongo-operator](https://github.com/Ultimaker/k8s-mongo-operator)?
[https://github.com/mongodb/mongodb-enterprise-kubernetes](https://github.com/mongodb/mongodb-enterprise-kubernetes)?)

~~~
elsonrodriguez
Honestly, I'd only think of running Mongo on Kubernetes if my organization
were thoroughly bored with how well it had managed Mongo upgrades, scaling,
outages, migrations, and tuning over the last year on a traditional VM setup.

Even so, if everyone's bored with the stability, why change it?

I'm an advocate for databases in dev/test environments on K8s due to ease of
deployment, but there are too many moving pieces around storage for it to be a
great idea for production out of the gate.

~~~
pas
Well, the benefits (monitoring, telemetry/observability) are nice enough that
I might think about running it on k8s in prod. But only after a sufficient
number of dev/staging clusters have been sacrificed.

------
neya
At one of my previous companies, we had a newbie coder who had just become
fascinated by Kubernetes. Unfortunately, he'd leave every project he touched
half-finished and move on to another one.

This company invested around $1 million into a very new SaaS product that had
a lot of potential. Now, I'm an old-school guy. I like systems where I don't
touch devops. The cost of such a PaaS is usually high, but it saves a lot of my
time, which I value more than money (and it is valuable, in many cases). I
advised management to use something like AppEngine, which is freakin' awesome
for this kind of large-scale project.

Unfortunately, this newbie coder was management's pet. They bit the bullet and
went with his advice to use Kubernetes instead. The system randomly failed, and
they spent tons of time doing devops and microservices on what should have
been a simplified monolith. Development stretched to almost 2 years. By then,
there were competing offerings on the market for much cheaper.

This cost the entire team's morale, which led to missed deadlines and product
launches, which in turn led to many missed opportunities. For instance, there
was a HUGE client we were demoing this SaaS product to. It failed spectacularly
under a relatively light load (I've hosted Rails apps on AppEngine that can
handle FAR more) and we lost that product and the client.

It was still not too late, and I insisted that management identify the root
cause and switch to a managed PaaS stack, given how resource-constrained we
were. But this newbie coder turned it into a political situation ('mine vs
yours'). I tried various ways to resolve this, but it didn't work out.

As a result, I said 'fuck this, I'm out of here' and quit. Within 3 months of
my departure, about 10 people quit, including one of the senior managers. In
about 6 months, the company shrank from double digits to single digits in
headcount. The company almost went into bankruptcy.

The company lost its $1 million investment. Everyone left. Product development
was put on indefinite hold. Finally, this newbie coder left as well. The
founders had to rebuild their company from scratch.

Coincidentally, I went on to become an AppEngine consultant. I'm able to run
unbelievably large monoliths in production, with almost zero downtime for 3
years in a row, simply because I chose to avoid devops and, most importantly,
infrastructure complexity. I pay more, but it is easily worth it in ROI. Most
of the time, microservices aren't required and we can get away with a
simplified monolith. And if you can get away with monoliths, you probably
don't need Kubernetes.

~~~
orthecreedence
> if you can get away with monoliths, you probably don't need Kubernetes

Right, especially if you're doing microservices the "right way" which means,
each service has not only its own app setup but its own data store. I am
responsible for (among other things) maintaining and scaling my company's
infrastructure, and while we do have some vague concept of microservices (app,
api (monolith), queuing system, various ML apis, headless browser, etc) it is
always _always ALWAYS_ "what's the _simplest_ way to do this?" Microservices
are a ton of moving parts, and adding in their own data stores is another can
of worms.

We have a need for a scheduler, and we are using Nomad internally. I chose it
over Kubernetes. Why? I know that Kubernetes has more muscle behind it, can do
more "things" we might need in the future, etc. But I chose Nomad because of
its operational simplicity. It strikes a perfect balance of "scale these three
things up" without needing a dedicated ops team to run it.

So far it has performed great, and I really think it comes down to limiting
the number of moving pieces. Don't get me wrong, microservices are great if
you have an ops team and a 50+ dev team all working on different things. But I
often see microservices being pushed _without warning people of the inherent
costs involved_.

I'm sure kubernetes is great, and I'm continuously re-evaluating where we are
and what we need, but for now simpler is always better. If we do go with Kube,
it will likely be a managed version so I don't have to touch the
internals...not because I don't want to learn them (I definitely would want to
know all the grimy details) but because I just don't have the fucking time to
deal with it.

~~~
StavrosK
How do you use Nomad? Do you run Docker containers with it? I don't need the
scaling; I need the part where new versions of code are deployed
automatically, services are connected together, and basically things update
without having to run ops checklists every time.

Kubernetes feels too heavyweight and deploying machines feels too
snowflakey...

~~~
orthecreedence
We use Nomad a few different ways. One is system jobs, which run DNS and Fabio
(the networking fabric layer) on the host machines. Then, yeah, for all the
apps and services we use Docker containers, which I've found runs pretty
well... with one caveat: make sure that if you're spinning jobs up/down a _lot_
you get a fast disk. For instance, we were running on 64 GB EBS gp2 volumes and
Docker was starting to grind to a halt on them (we spin workers for our queue
up/down quite a lot). We started using instances with ephemeral SSDs instead,
which has a bit more operational complexity on init but overall works really
nicely.

We use Nomad for deployments as well. I wrote some deploy scripts by hand
that create the Docker containers and load the templated Nomad jobs, e.g.

    
    
        image = "container-hub.myapp.com/api:{{ BUILD_TAG }}"
    

becomes

    
    
        image = "container-hub.myapp.com/api:master.20190613-153222.f996f1bd"
    
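
(A minimal sketch of how such a script might do the substitution; file names
and tag format are hypothetical:)

    BUILD_TAG="master.$(date +%Y%m%d-%H%M%S).$(git rev-parse --short HEAD)"
    sed "s/{{ BUILD_TAG }}/${BUILD_TAG}/g" api.nomad.tpl > api.nomad
    nomad job run api.nomad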

I'd say as far as "take this container and run N instances of it and load
balance requests at this hostname to them" goes, Nomad has been pretty great.
There is definitely some work involved in getting the comm fabric all set up
exactly how you want (Fabio does make this easier, but it's still work).
Consul now has Connect
([https://www.consul.io/docs/connect/index.html](https://www.consul.io/docs/connect/index.html)),
which I haven't looked at yet but which might alleviate a lot of this. I think
some of our complexity also came from the fact that we have TCP services we
needed to load balance, and most of the service fabric stuff forces HTTP onto
everything.

Overall my experience with Nomad has been great. It's capable and really not
too difficult to operate for one person who also has tons of other stuff going
on as well =].

~~~
StavrosK
This is super useful, thank you! Do you think all of this is worth doing, now
that you can basically get managed Kubernetes for free from providers as long
as you use their machines?

It feels like this is much easier than rolling your own Kube, but not easier
than using the managed version...

~~~
orthecreedence
Not sure, honestly. I've never used Kube, just taken a preliminary look at the
docs and been scared away by how much abstraction there is. While providers may
manage it for us, I'm not sure to what extent they manage it. We're on AWS and
I haven't been super happy with the responses/response times of their support,
so when dealing with unknowns I'd rather not rely on someone else.

That said, Nomad hasn't been without problems. It's just that the problems
seem to be easier for one person to solve. I set all this up almost a year and
a half ago and haven't touched it much since, so it's possible Nomad, Kube,
and managed services have all come a long way and now is a good time to
re-evaluate.

------
erikrothoff
I'd had enough of random 503s, network errors, slowdowns under mid-to-high
traffic, and dropped TCP connections, so yesterday I set up dedicated VPSes for
our main app servers using Ansible and good ol' fashioned Capistrano. It feels
almost old-school, but so much more stable.

------
grantlmiller
I'm torn on this list. It is important to learn from the mistakes of others,
so I like it (postmortems are great for this reason), BUT it feels like folks
are using these examples as reasons to stay away from Kubernetes. There could
be a significantly larger list of system failures where K8s is not involved.
Similarly, there could probably be a list of "Encryption Failure Stories", but
that doesn't mean we shouldn't encrypt things.

As an industry, one of the things we do pretty well is identify the most
viable patterns to solve a problem and then develop and adopt the best
primitives of those patterns. This is what Kubernetes is for creating
reliable, scalable, distributed systems.

~~~
eeZah7Ux
k8s does not solve any problem around making your system distributed, scalable
and reliable.

You still have to implement your own architecture for quorum, leader election,
failover etc. where needed.

k8s will only redeploy containers when a node fails.

~~~
geezerjay
> k8s will only redeploy containers when a node fails.

That's not true at all. Kubernetes monitors containers' health and
automatically scales deployments according to demand. Thus quite obviously
Kubernetes does help make your system distributed, scalable and reliable. In
fact, that was the design goal of Kubernetes.
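
For instance (a minimal sketch, deployment name hypothetical):
restart-on-failure comes from liveness/readiness probes declared in the pod
spec, and scaling on demand can be as simple as:

    kubectl autoscale deployment myapp --min=2 --max=10 --cpu-percent=70
    kubectl get hpa myapp   # watch it add/remove replicas as CPU load changes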

------
tthisk
In my opinion, many of these issues with Kubernetes arise because it is not
the right abstraction for application developers. You are forced to think
about too many low-level details to get a stable application running.

Many companies forget that they become responsible for getting far more
intricate details of the application runtime right when they shift their
application to Kubernetes.

In some of my recent experience, I have seen a company shift an application
from App Engine to Kubernetes. However, they forgot that App Engine uses an
optimized JVM to run your application. When they deployed to Kubernetes, their
app ground to a halt. They also forgot about all the observability that comes
out of the box in App Engine, and ended up with an unobservable, badly
performing application.

Hopefully more domain-specific runtimes built on top of Kubernetes can help
application developers deploy to a Kubernetes cluster in a sane way. I am
putting some of my money on Knative; however, it is still quite hard to
convince clients to invest in this area.

~~~
threeseed
Not sure what any of this means.

Kubernetes does not prevent you from using an optimized JVM to run your
application. In fact, it doesn't even involve itself at that level; rather, it
focuses on managing containers, which can have anything inside them.

And Kubernetes actually has far more observability than most distributed
applications since you can get plugins which trace/monitor every connection
between containers as well as monitor the health of each container.

I think what you're after is a PaaS on top of Kubernetes? Well, there are many
of those around as well.

~~~
glennpratt
Yes, you can get a bunch of pieces that need a team to evaluate and operate.
You are saying the same thing and somehow making it a disagreement?

------
odiroot
These are all very informative and even amusing. But this is to be expected.
Kubernetes is an enormous system (or ecosystem of systems actually). It's
really hard to understand all the pieces and components, even the built-ins.

I still struggle (after having set up 2 clusters from scratch) with
understanding networking, especially Ingresses and exposing things to the
public internet.

------
cookiecaper
After a couple of years of massive frustration with the entire direction of
the "devops" segment, I think I'm resolved to get out of it and either move
back up to ordinary application development or further down into the actual
kernel, preferably working on FreeBSD or some other OS that's more sane and
focused than Linux.

Kubernetes represents the complete "operationalization" of the devops space.
As companies have built out "devops" teams, they've mostly re-used their
existing ops people, plus some stragglers from the dev side. These are the
people you hear talking about how great Kubernetes is, because for them, they
see it as "run a Helm chart and all done!". Which makes sense, since they
were, not too long ago, the same guys fired up about all the super-neato
buttons to click in the Control Panel. 90% of "devops" people at non-FAANG
companies are operations people who just think of it as a new name for their
old job.

Among this set, there's no recognition of the massive needless complexity that
permeates all the way through Kubernetes, no recognition of the
tried-and-tested toolkit thrown away and left behind, no recognition of the
fact that we're working _so hard_ to get things that we've had as built-in
pieces of any decent server OS for decades. No recognition that Kubernetes
_exists_ so it can serve as Google's wedge in the Rent-Your-Server-From-Me
Wars, and no awareness that, just in general, _there's no reason it should be
this hard_.

Of course, to them, it's not hard. They have an interface with buttons, they
can run `helm install`, they get pretty graphs via their "service mesh".
That's what I mean by "operationalized"; Kubernetes is meant to be _consumed_,
not configured. You don't ask how or why. You run the Minikube VM image
locally and you rent GKE or EKS and go on your merry way. The intricacies are
for the geniuses who've blursed us with this death trap to worry about! Worst-
case, you use something like kops. Start asking questions or putting pieces
together beyond this, and you're starting to sound like you don't have very
much of that much-coveted "devops mindset" anymore.

"What happens if there's a security issue?" Oh, silly, security issues are a
thing of the past in the day and age where we blindly
`FROM crazy-joes-used-mongodb-emporium:latest-built-2-years-ago`. Containers
don't _need_ updates, you goose. They're beautiful, blubber-powered magic, and
the Great Googly Kubernetes in the sky is managing "all that" for us. Right on.

I'm picking on Kubernetes specifically here because it's the epitome of all
this, but really everything in "devops" world has become this way, and
combined with the head-over-heels "omg get us into the cloud right now"
mentality that's overtaking virtually every company, it's a bad scene.

Systems have gotten so much more convoluted and so much dumber over the last 5
years. The industry has a lot to be embarrassed about right now.

~~~
geggam
IMO... AWS has been eating everyone's lunch. K8s was a play by Google to sell
GKE/GCP, and it worked.

The problem is that not everyone has a team of Google SREs to manage it, so
when k8s blows up, the skillset to figure out the issues simply doesn't exist
in the team managing it.

~~~
pjmlp
They are still the 3rd cloud provider.

It is hard to say that it worked, especially when AWS and Azure also offer
Docker and Kubernetes deployments.

In Microsoft's case, it is even supported directly from Visual Studio.

~~~
geggam
Windows is a 2nd-class citizen where automation is concerned, IMO.

I believe Azure is feeding off of enterprises who are 15 years behind everyone
else.

Windows as a platform cannot possibly keep up with the fast-moving changes
required by containers and that ecosystem. The only way Azure is competing is
by offering Linux-based systems.

</2 cents>

~~~
pjmlp
Only for those that don't bother to learn how to do it the Windows way.

My Windows 2000 deployments already had plenty of automation scripts.

------
hjacobs
As the owner of the linked GitHub repo (also rendered on
[https://k8s.af](https://k8s.af) --- thanks to Joe Beda), I highly encourage
everyone to contribute their failure stories (I'm still looking for the first
production service mesh failure story...).

Also be aware of availability bias: Kubernetes enables us to collect failure
stories in a (more or less) consistent way, which was previously not easily
possible (think of on-premise failures, other fragmented orchestration
frameworks, etc.) --- I'm pretty sure there are many more failure stories in
total about other things (like enterprise software), but we will never hear
about them as they are buried inside orgs...

BTW, I also have a small post on why I think Kubernetes is more than just a
"complex scheduler":
[https://srcco.de/posts/why-kubernetes.html](https://srcco.de/posts/why-kubernetes.html)

------
miff2000
Some really interesting issues that hopefully, as a new Kubernetes cluster
maintainer, I'll now be able to avoid!

~~~
hjacobs
Thanks, feel free to reach out on Twitter
([https://twitter.com/try_except_/](https://twitter.com/try_except_/)) --- we
are always happy to share experiences on running K8s.

------
emmelaich
Awesome. I'd like to see a list for other related products.

Such as OpenStack. I bet there are a few stories there.

------
rurban
So the most used tags are: OOMKill, CoreDNS, ndots:5, NotReady nodes.

I'd seriously consider disabling the OOMKiller. Not sure about the DNS issues.
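
(The ndots:5 one is easy to reproduce; pod name hypothetical:)

    # The cluster-default resolver config inside a pod usually contains
    # "options ndots:5", so unqualified external names are tried against every
    # search domain before the absolute lookup, multiplying DNS traffic:
    kubectl exec some-pod -- cat /etc/resolv.conf
    # Mitigations: use fully-qualified names (trailing dot) or lower ndots via
    # the pod's spec.dnsConfig.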

------
cryptica
This is just propaganda to drive companies away from Kubernetes so that they
use expensive hosted 'lock-in' PaaS solutions instead of free, open source, no
lock-in solutions.

~~~
elsonrodriguez
The tone of the repo doesn't really scream propaganda to me. It just looks
like a solid list of failures and what people can learn from them.

