
Kubernetes Best Practices - shifte
https://medium.com/@sachin.arote1/kubernetes-best-practices-9b1435a4cb53
======
thesandlord
This seems to be a straight copy-and-paste of one of my talks. With zero
credit to me :(

[https://speakerdeck.com/thesandlord/kubernetes-best-practices](https://speakerdeck.com/thesandlord/kubernetes-best-practices)

WTF...

(BTW I'm creating a full deep dive series to go into each of these bullet
points in much more detail)

~~~
kuschku
And not only is it a copy-paste, it also manages to introduce errors, and
still get thousands of views.

Well, had I known this post actually was based on your talk, I could have
avoided all the effort (see my comments below in this thread)

~~~
thesandlord
I really appreciated your deep dive comments, they were good! People who see
it will actually learn some best practices.

~~~
kuschku
Thanks! I’m also looking forward to your writeup of the talk. Do you know
where you’re going to post it, so I can find it once it’s finished? (Maybe even
a platform with notifications, so I can subscribe?)

~~~
thesandlord
Not sure right now, but I'd say follow me on Twitter and Medium and I'll
definitely let you know! (both in my HN profile)

------
orf
Thank you for these 30 bullet points with no description or further detail
other than 'do this'. Some are pretty common-sense, others not, and for those,
merely saying it's "best practice" with no extra detail, links, or accompanying
reasoning is not enough.

~~~
kuschku
Not OP, but I’ve been running a kubernetes cluster for a few months myself, so
I’ll try to give context.

> Don’t trust arbitrary base images.

Kinda obvious: the equivalent of "don’t trust any random binary from the web",
since at worst it can contain malware.

> Use small base image.

Generally, you should look at using Alpine as a base image, to reduce storage
size and the memory consumption of the overlay filesystem that Docker uses. But
be aware, Alpine uses musl and busybox instead of GNU libc and utils, so some
software might not work there. Generally, this is also common sense. See more
at [https://alpinelinux.org/about/](https://alpinelinux.org/about/) and
[https://hub.docker.com/_/alpine/](https://hub.docker.com/_/alpine/) – but be
aware, often the projects you want to depend on already have an alpine image
(e.g., the openjdk, nodejs or postgres images are all available in an alpine
version, reducing their size from 500M+ to around 10-20M)
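As a quick sketch of what switching to such a variant looks like (the image tag and file names here are made-up examples, not from the article):

```dockerfile
# Alpine variant of the official Node.js image: tens of MB
# instead of the several-hundred-MB Debian-based default.
FROM node:8-alpine

WORKDIR /app
# Install only production dependencies before copying the rest,
# so the dependency layer is cached across code changes.
COPY package.json .
RUN npm install --production
COPY . .

CMD ["node", "server.js"]
```

The musl/busybox caveat above still applies: anything with native addons may need extra build tooling or may not compile against musl at all.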

> Use the builder pattern.

With containers, many people include all build dependencies in the final
container. Docker has a new syntax to avoid this: first declare a builder
container with its dependencies, run the build in it, then declare the final
container and COPY the build artifacts over from the builder. This, too,
keeps the container size a lot smaller.

You can find more here: [https://docs.docker.com/engine/userguide/eng-image/multistage-build/](https://docs.docker.com/engine/userguide/eng-image/multistage-build/)
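A minimal sketch of the multi-stage syntax, assuming a simple pure-Go program (image tags and paths are illustrative):

```dockerfile
# Build stage: contains the full Go toolchain and sources.
FROM golang:1.9-alpine AS builder
WORKDIR /src
COPY . .
RUN go build -o /bin/app .

# Final stage: only the compiled binary, no build dependencies.
FROM alpine:3.7
COPY --from=builder /bin/app /bin/app
ENTRYPOINT ["/bin/app"]
```

The final image contains only the last stage, so the toolchain never ships to production.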

> Use non-root user inside container.

This is basically common sense with Docker as the runtime: running as root
inside the container is bad security practice (especially when combined with
problematic filesystem mounts). It's also recommended by the Docker team:
[https://docs.docker.com/develop/dev-best-
practices/](https://docs.docker.com/develop/dev-best-practices/)
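A minimal sketch of dropping root in a Dockerfile (the user and group names are arbitrary):

```dockerfile
FROM alpine:3.7
# Create an unprivileged system user and group (busybox adduser syntax),
# then switch to it so nothing after this line runs as root.
RUN addgroup -S app && adduser -S -G app app
USER app
CMD ["sh", "-c", "whoami"]
```

Kubernetes can additionally enforce this on the pod side via `securityContext.runAsNonRoot: true`, which refuses to start containers whose process would be UID 0.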

> Make the file system read only.

I’m not sure what OP is referring to here, but if it refers to the container
filesystem, that is because writing to AuFS or OverlayFS is significantly
slower (and more memory intensive) than writing to a PersistentVolumeClaim or
EmptyDir volume in Kubernetes, so you should always mount an EmptyDir volume
for all log folders, temporary data, etc, and a PersistentVolumeClaim for all
persistent data.

This, too, is recommended by the Docker team: [https://docs.docker.com/develop/dev-best-practices/](https://docs.docker.com/develop/dev-best-practices/)
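As a sketch of what that looks like in a pod spec (the image, pod, and volume names are made up for illustration):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: readonly-example
spec:
  containers:
  - name: app
    image: example/app:1.0
    securityContext:
      # Mount the container's root filesystem read-only.
      readOnlyRootFilesystem: true
    volumeMounts:
    # Writable scratch space backed by an EmptyDir volume,
    # bypassing the overlay filesystem entirely.
    - name: tmp
      mountPath: /tmp
  volumes:
  - name: tmp
    emptyDir: {}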

> One process per container, Don’t restart on failure, crash cleanly instead,
> Log to stdout & stdderr

This is related to the logging system (which mostly looks at stdout and
stderr), and to the fact that Kubernetes itself was mostly designed to work
with a single process per container. Yes, you can spawn multiple processes
from a single shell and print their combined stdout, but then you also need
to ensure that if one crashes, everything restarts properly.

If you use a single process per container, logging to stdout/stderr, then
scaling is a lot simpler, and restarts are handled automatically (and this is
required for staged rollout).

> Add dumb-init to prevent zombie processes.

If you need multiple processes, and one that isn’t PID1 crashes, you’ll end up
with zombie processes. A single process per container obviously avoids this,
but if you have to use multiple, at least add an init system to reap dead
child processes, potentially restart crashed dependent processes, etc.

Normally, docker supports the --init parameter to do this, but the version
recommended for use with Kubernetes does not support this yet (EDIT:
apparently, since 1.7.0, Kubernetes actually does automatically do this for
you), so you could add e.g. [https://github.com/Yelp/dumb-
init](https://github.com/Yelp/dumb-init) or
[https://github.com/krallin/tini](https://github.com/krallin/tini) (both
officially recommended by the Docker team)
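For illustration, adding tini via the Alpine package manager might look like this (the `my-server` command is a placeholder):

```dockerfile
FROM alpine:3.7
# tini runs as PID 1, forwards signals to the child process,
# and reaps any orphaned zombie children.
RUN apk add --no-cache tini
ENTRYPOINT ["/sbin/tini", "--"]
CMD ["my-server"]
```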

~~~
kuschku
Continuing further:

> Use the “record” option for easier rollbacks.

As explained in the documentation here
[https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#checking-rollout-history-of-a-deployment](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#checking-rollout-history-of-a-deployment),
this option records the changes you apply with each version, allowing you to
roll back to any previous version and see the changes.
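In practice that looks roughly like this (deployment name and revision number are hypothetical):

```
# Record the kubectl command in the rollout history:
kubectl apply -f deployment.yaml --record

# Inspect the recorded history, then roll back to a specific revision:
kubectl rollout history deployment/my-deployment
kubectl rollout undo deployment/my-deployment --to-revision=2
```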

> Use plenty of descriptive labels.

Not just descriptive: you should also label by version, service, etc. You can
also use labels in selectors for loadbalancers and ingresses, and you can
query the CLI by label. This not only makes it easier to find things, but
can also be very useful, for example when rolling out a new version – label
each with version=..., and then just change the label selector of the
LoadBalancer.

How to use selectors and labels is explained here:
[https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/](https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/)
(I know that this explanation is very limited, but I don’t know of better
documentation of this feature)
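A sketch of the rollout trick described above, with made-up names and labels:

```yaml
# Service selecting only pods of a specific version; changing the
# "version" value below shifts traffic to the new version.
apiVersion: v1
kind: Service
metadata:
  name: frontend
spec:
  selector:
    app: frontend
    version: "2"
  ports:
  - port: 80
    targetPort: 8080
```

The same labels work on the CLI, e.g. `kubectl get pods -l app=frontend,version=2`.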

> Use sidecar containers for proxies , watchers etc. Don’t use sidecar for
> bootstrapping. Use init container instead.

For bootstrapping, Kubernetes will first execute init containers in order,
then start the main container. This ensures that they operate
deterministically. If you try to do this with sidecars, you might end up with
containers still running when they aren’t necessary anymore, and you also
need to build your own deterministic bootstrapping and handle errors in each
of them yourself.

Also see [https://kubernetes.io/docs/tasks/configure-pod-container/configure-pod-initialization/](https://kubernetes.io/docs/tasks/configure-pod-container/configure-pod-initialization/)
and [https://kubernetes.io/docs/concepts/workloads/pods/init-containers/](https://kubernetes.io/docs/concepts/workloads/pods/init-containers/)
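A sketch of the pattern, assuming a hypothetical `db` service the app needs to wait for (all names are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-init
spec:
  initContainers:
  # Runs to completion before the main container starts;
  # here it blocks until the database service accepts connections.
  - name: wait-for-db
    image: busybox
    command: ['sh', '-c', 'until nc -z db 5432; do sleep 2; done']
  containers:
  - name: app
    image: example/app:1.0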

> Don’t Use latest or no tag.

This is basically common sense, as for any dependency: the way projects
update differs significantly. Some might never make breaking changes, others
might break their entire API in every minor release, and as a result your
service might end up down. This is the reason why a decade ago every sysadmin
used Debian Stable (no breaking changes, ever). On the other hand, if you
specify fixed versions, make sure to check for bugfixes manually (e.g., I
recently saw a container from a major project that was built with an outdated
release of the JVM because they had never updated that version tag).

> Readness & liveness probes are your friends.

Readiness and liveness probes are especially useful for load balancing: they
determine whether a service is ready to serve, and automatically remove
instances that aren't from the pool used by the service, so that requests are
only routed to instances that are up. You don’t have to use HTTP probes
either – for example, several helm charts for clustered databases use their
CLI client as the probe.

More about the probes here:
[https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/](https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/)
and how they affect the pod lifecycle here:
[https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#container-probes](https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#container-probes)
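A sketch of both probe types in a container spec (the image, port, path, and timings are made-up values to tune per service):

```yaml
containers:
- name: app
  image: example/app:1.0
  # Gate traffic: the pod only joins the service's endpoint pool
  # once this succeeds, and leaves it again if it starts failing.
  readinessProbe:
    httpGet:
      path: /healthz
      port: 8080
    initialDelaySeconds: 5
    periodSeconds: 10
  # Restart the container if it hangs or deadlocks.
  livenessProbe:
    httpGet:
      path: /healthz
      port: 8080
    initialDelaySeconds: 15
    periodSeconds: 20
```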

    
    
    

And most of the rest seems pretty obvious.

Generally, for people interested in the topic, /r/kubernetes,
[https://kubernetes.slack.com/](https://kubernetes.slack.com/) and #coreos on
Freenode might be a much better place for a discussion than this HN post of an
article with bullet points, no explanation, and countless typographic errors.

~~~
orf
Thank you both!

------
MBCook
Well, this was really disappointing. I imagine a number of the things in here
are really useful, but without more detail on WHY you should or shouldn’t do
some of these things… I’m not sure why I should follow them.

This is essentially a simple bulleted list.

------
camdenlock
This would be interesting (and useful) if any of the assertions came with
explanations.

------
rileytg
anybody have some more detailed advice along these lines?

~~~
shaklee3
Kuschku has good descriptions above

