
A Manager’s Guide to Kubernetes Adoption - shuss
https://unixism.net/2019/08/a-managers-guide-to-kubernetes-adoption/
======
orev
While I know it’s not the main point of the article, I’m getting really tired
of the anti-sysadmin bias in pretty much anything related to devops. It seems
to be a favorite trope to paint sysadmins as a dying breed of monkeys who are
only capable of keeping pets and doing manual tasks when they get around to
lifting their knuckles off the floor. Who do you think is writing the
playbooks for Ansible? How do you think you can even write those things
without first having a deep understanding of how the system works and having a
set of procedures around them? Who do you think runs the systems your
containers sit on top of?

Just because infrastructure as code appears to use the same tools that
developers use (text files, git) doesn’t mean developers can do the job. Me
having access to a pen and paper doesn’t make me Shakespeare.

This is especially harmful in an article that claims to be aimed at
management, ostensibly trying to set the future path for organizations.

P.S. Otherwise a nice overview of things

~~~
streetcat1
Developers did not take your job, it was the machines...

Kubernetes' strength is its declarative nature, i.e. the actual "tasks" (the
imperative part) move into the controller code and are automated away.

As more and more operators/CRDs (i.e. automatic sysadmins) get written, more
jobs will disappear.

For example, you can imagine a database operator that, just by stating your
schema (as a CRD), will create the tables for you, the indices, do automatic
backup/restore, do automatic migrations, create the monitors and the alerts.
I.e. it will completely replace the DBA.
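
Concretely, a sketch of what such a resource might look like (the API group,
kind, and every field name here are invented for illustration, not any real
operator's API):

    apiVersion: example.com/v1alpha1    # hypothetical API group/version
    kind: DatabaseSchema                # invented CRD kind for illustration
    metadata:
      name: orders-db
    spec:
      tables:
        - name: orders
          columns:
            - {name: id, type: bigint, primaryKey: true}
            - {name: customer_id, type: bigint}
          indexes:
            - columns: [customer_id]
      backups:
        schedule: "0 3 * * *"           # the operator would schedule backups
      monitoring:
        alerts: true                    # and wire up monitors/alerts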

~~~
dkhenry
There already is a database operator that can do everything you have listed
and a lot more (I helped write it), but it doesn't remove the need for
DBAs. It just changes what they do on a day to day basis. I think this is
what OP was saying. Just because you have an operator that can do most of what
a traditional DBA does doesn't mean you can replace all the world's DBAs.
Someone still needs to know _why_ a specific query managed to do a
multiplicative join and lock all your tables for hours, even if the operator
knows how to flag that query and reject it.

~~~
streetcat1
Sure, but in a day of a DBA, or even over a month, how much does he deal with
understanding deadlocks?

BTW, the operators that I saw are not there yet. What I want to see is a CRD
for a set of input schemas and an output schema, and let the operator create
the most efficient query.

------
scarface74
For the life of me I can't figure out why I would recommend Kubernetes to any
company that is already on AWS. Except for the custom stuff, you should
probably use a managed equivalent, and for the custom parts where you need HA,
scalability, etc., just use regular old ECS or Fargate for serverless Docker.
Heck, sometimes it's even simpler to just use a bunch of small VMs, bring
them up or down based on a schedule, number of messages in a queue, health
checks, etc., and throw them behind an autoscaling group.

I’m not familiar with Azure or GCP, but they have to have more easily managed
offerings than K8s.

If you’re on prem - that’s a different use case but my only experience being
responsible for the complete architecture of an on prem solution was small
enough that a combination of Consul/Nomad/Vault/Fabio made sense. It gave us
the flexibility to mix Docker containers, raw executables, shell scripts, etc.

That being said, for both Resume Driven Development reasons and because we
might be able to find people who know what they were doing, if I had to do
another on prem implementation that called for it, I would probably lean
toward K8s.

~~~
empath75
> Consul/Nomad/Vault/Fabio

This is not very much easier to implement than kubernetes, in my experience,
and you end up with a less capable system at the end of it.

~~~
scarface74
And this was on Windows - for reasons.

None of these can run as Windows Services by themselves; I had to use NSSM.
That being said:

- Consul: a three-line yaml configuration sets it up in a cluster. It's a single standalone executable that you run in server mode or client mode.

- Once you install Consul on all of the clients and tell it about the
cluster, the next step is easy.

- Run the Nomad executable as a server; if you already have Consul, there is
no step 2. It automatically configures itself using Consul.

- Run Nomad in client mode on your app/web servers. If you already have the
Consul client running - there is no step 2.

- Vault was a pain, and I only did it as a proof of concept. I ended up just
using Consul for all of the configuration and an encryption class where I
needed to store secrets.

Did I mention that we had a lot of C# framework code that we didn't want to
try to containerize? Nomad handled everything.

That being said, I wouldn't do it again. If we had been a pure Linux shop with
the competencies to maintain Linux, I would have gone with K8s instead if I had
to do an on prem implementation.

But honestly, at the level I’m at now, no one would pay me the kind of money I
ask for to do an on prem implementation from scratch. It’s not my area of
expertise - AWS is.

~~~
empath75
Well you can spin up a working k8s cluster in 5 minutes with Kops, but that’s
obviously not the end of the story.

~~~
scarface74
How well would that work for orchestrating a combination of Docker containers,
C# (.Net Framework, not .Net Core) executables, Powershell scripts, etc.?

------
true_tuna
I think the author missed some key points:

1) It will take you longer than you think.

2) It will be harder than you imagined.

3) It's harder to find people who know it and non-trivial to get good at it
(this should have been closer to the top): your project can be done in six
months by five k8s experts, but you only found one dude who knows it and
he's more a'ight than pro.

It’s probably still worth it, just go in with your eyes open.

Be prepared for this unfortunate pattern: "this thing I want just doesn't work
and probably never will."

Deploying k8s on a small, self-contained project that you just want to set up
and go forever is probably a good place to begin. If you try to move your
whole production workflow in one go... You’re going to have a bad time.

~~~
empath75
> 1) It will take you longer than you think 2) It will be harder than you
> imagined 3) It’s harder to find people who know it

Just so people have an idea of how hard it is to find people. I've got just
about 1 year of experience getting kubernetes into production at a very large
(multi-billion dollar) company. I have so many job offers coming in that I'm
not even talking to companies offering less than 350k total comp. I don't have
a college degree and 5 years ago I was making $50k a year. That might be just
the general bubble-ish nature of the tech industry right now, but if they're
throwing around that kind of money for someone like me, I imagine that small
shops have no way to compete.

~~~
AWebOfBrown
Any chance you'd be happy to elaborate on what your role is? Are you primarily
part of a dev-ops team, or tackling Kubernetes as part of developing the
product? Did you obtain CNCF certification / think there's much value in
those?

~~~
empath75
devops and no certs.

------
lmeyerov
Yuck. We blew a ton of money and time, despite a great infra/hpc/etc team
(PhD, ex-Google, ex-Netflix, ex-scale video, etc), on a skilled but
overambitious infra subteam, where we should have used (and ended on) docker /
compose. K8s makes sense if you have a ton of servers or scale or devs, and
want to pay employees and a team in perpetuity to focus on it. But for half the
folks out there, stick with the individual pieces: otherwise you take on the
cost of openstack and are no better off for the increment over the few pieces
you could have carved out without k8s itself.

The article that leadership needs, vs vendor / fanboy bullshit, is "only do
k8s if you have these exact scale and ops problems, your stack looks exactly
like this, your entire app team looks like this, you have fulltime k8s people
now and forever, and you are willing to tax each team member to now deal with
k8s glue/ops/perf problems, and xyz other things; otherwise, if any of these
simplifying or legacy assumptions apply, perfect: you get to focus on these
way faster and cheaper agile orchestration options and actions".

~~~
reqres
> where we should have (and ended on) docker / compose

We ended up doing this too.

Do you have any links on deploying docker-compose in production? We've not
been able to find much. Our solution seems to work well, however - I am keen
to find out how other people are managing host setup, updates and remote
control of docker-compose.

In our setup, we essentially use docker-compose as a process manager (a rough sketch follows the list):

- Updates with docker-compose.yml via git syncs

- Logging via journald (which in turn is forwarded)

- Simple networking with `network: host` and managing firewalls at the host
level with dashboard/labelling

- Restarts / launch on startup with `restart: always` policy
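
A rough sketch of what such a docker-compose.yml can look like (service name
and image are placeholders; the compose key for host networking is
network_mode):

    version: "3.7"
    services:
      app:
        image: registry.example.com/app:1.2.3   # placeholder image
        restart: always            # relaunch on failure and on boot
        network_mode: host         # simple networking; firewall managed at host level
        logging:
          driver: journald         # forward logs via journald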

IMO, this is more straightforward than what we did in the past:

1) Entrusting everything to a package manager or build script, which might
break on a new version release

2) Maintaining an ansible script repository to do the host configuration,
package management and updates for you - these too always need manual
intervention on major version updates

~~~
lmeyerov
Fairly similar. Docker/docker-compose takes care of launch, healthchecks /
soft restarts, replication, GPU virtualization & isolation, log forwarding,
status checks, and a bunch of other things. Most of our users end up on-prem,
so the result is that _customers_ can admin relatively easily, not just us,
despite weird stuff like use of GPU virtualization. I've had to debug folks in
airgapped rooms over the phone: ops simplicity is awesome.

Some key things beyond your list from an ops view:

-- containers/yml parameterized by version tag (and most things we gen):
simplifies _a lot_

-- packer + ~50 line shell script for airgapped tarball/AMI/VM/etc.
generation + tagged git/binary store copies = on-prem + multiple private cloud
releases are now at a biweekly cadence, and for our cloud settings, turning on
an AMI will autolaunch it

-- low down-time system upgrades are now basically launching new instances
(auto-healthcheck) + running a small data propagation, and upon success, it
DNS flips.

-- That same script will next turn into our auto-updater for on-prem / private
cloud users without much difference. They generally are single-node, which
`docker-compose -p` solves.

-- secrets are a bit wonkier, but essentially docker-compose passes .envs,
and dev uses keybase (= encrypted gitfs) and prod is something else

Some cool things around GPUs happening that I can't talk about for a bit
unfortunately, and supporting the dev side is a longer story.

Some of these patterns and the tools involved are a normal part of the k8s
life... which is my point: going incrementally from docker / docker-compose or
equivalent lightweight tooling will save your team + business time / money /
heartache. Sometimes it's worth blowing months/years/millions and taxing the
folks who'd otherwise be uninvolved, but for easily over half the folks out
there, probably 90%+, it's not worth it. Instead, as we need a thing, you can
see how we incrementally add it from an as-simple-as-possible baseline.

------
Niksko
For our organisation (mid size, varied skill level across the org) Kubernetes
solves the problem of pattern drift. There is one way to do things, and when
there are multiple ways, we enforce a single one.

An example is databases. We offer Postgres (via an operator that provisions an
RDS instance). AWS lets you choose whatever you want, and frankly at our scale
and level of complexity, there is simply no reason to choose one SQL db over
another other than bikeshedding. So being able to enforce Postgres, and as a
result, provide a better, more managed provisioning experience (with security
by default and fewer footguns) plus training and support for just one DB type,
is incredibly powerful.
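
As a rough illustration only (the API group, kind, and fields below are
invented, not our operator's actual API), the kind of resource a team applies
might look like:

    apiVersion: databases.example.com/v1    # hypothetical API group
    kind: PostgresInstance                  # invented kind for illustration
    metadata:
      name: team-orders
    spec:
      version: "11"
      storageGB: 100
      backups:
        retentionDays: 7
      # the operator provisions an RDS instance with secure defaults
      # and hands connection credentials back as a Secret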

As Kelsey Hightower has often said, Kubernetes is a platform for building
platforms. Just deploying a Kubernetes cluster, giving devs access and calling
it a day is rarely the right answer. Think of it as a way to effectively
provide a Heroku alternative inside your company, without starting from
scratch or paying Heroku prices.

~~~
jbmsf
Aren't you using a technology to solve a social problem?

I don't have the experience with Kubernetes to have an informed opinion, but
I've solved plenty of social problems in software organizations. Those are
usually approachable without making a long-term commitment to a technology
that may or may not be the right choice.

------
cfors
Good points in here. The one thing about Kubernetes that I think teams need to
be wary of is that upgrading needs to be treated as a first class citizen. In
order to use the awesome tooling that so many people are building in
Kubernetes (see basically all of the CNCF projects), the cluster API needs to
be kept up to date. Once you start falling behind, you run the risk of being
stuck in a version compatibility nightmare.

Other than that, I can't imagine running non-stateful applications on
anything other than Kubernetes anymore.

~~~
SPascareli13
> Other than that, I can't imagine running non-stateful applications on
> anything other than Kubernetes anymore.

How do you deal with stateful applications like databases?

~~~
markbnj
> How do you deal with stateful applications like databases?

The impedance issues between kubernetes and stateful applications have been
mostly solved. The main problems stemmed from the object model not yet being
mature enough. Deployments did not offer good solutions for assigning stable
identities to pods across restarts, or making it possible for them to find and
attach to persistent volumes. StatefulSets and PersistentVolumeClaims solve
these problems. Most of our k8s workloads are stateless, but we do run some
big elasticsearch workloads in it. Having said that, I think if you're a
small- to mid-sized business running on the cloud, managed offerings are the
first place to look. GCP CloudSQL has worked extremely well for us.
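
For readers who haven't seen those objects, a minimal StatefulSet sketch with
a volumeClaimTemplate (the image and sizes are placeholders):

    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: es-data
    spec:
      serviceName: es-data        # pods get stable names: es-data-0, es-data-1, ...
      replicas: 3
      selector:
        matchLabels: {app: es-data}
      template:
        metadata:
          labels: {app: es-data}
        spec:
          containers:
            - name: elasticsearch
              image: elasticsearch:7.3.0     # placeholder image/version
              volumeMounts:
                - name: data
                  mountPath: /usr/share/elasticsearch/data
      volumeClaimTemplates:        # each pod gets, and re-attaches to, its own PVC
        - metadata:
            name: data
          spec:
            accessModes: ["ReadWriteOnce"]
            resources:
              requests:
                storage: 100Gi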

~~~
jrochkind1
To someone who is not familiar with the technology beyond a high-level
overview like in OP, the sentence "The main problems stemmed from the object
model not yet being mature enough" sounds like "The main problems with
stateful applications stemmed from there being problems with stateful
applications".

~~~
shaklee3
The problem was that these features came a little later after many of the
stateless features were in there. Now we're talking about the stateful stuff
being stable for many years, so all the arguments against it are usually old
or uninformed.

------
stevenc81
DISCLAIMER:

I work for Buffer and have been working on our production Kubernetes cluster.

When we started with Kubernetes we were faced with multiple challenges.
Fortunately, with time we managed to tackle most of them. I will list them
here and share our learnings.

Code Deployment:

https://overflow.buffer.com/2017/08/31/buffer-deploy-code-kubernetes/

Application Observability:

https://itnext.io/when-istio-meets-jaeger-an-example-of-end-to-end-distributed-tracing-2c136eb335eb?source=friends_link&sk=b08b580596aea72b5404521df924a191

https://itnext.io/application-observability-in-kubernetes-with-datadog-apm-and-logging-a-simple-and-actionable-790ee8aefe29?source=friends_link&sk=c1fa55fd604d968c1869e7e1efff218f

Cluster Upgrade:

https://itnext.io/upgrading-kubernetes-cluster-with-kops-and-things-to-watch-out-for-8b5e7dff71c0?source=friends_link&sk=9f326510f264e7a2a2cac97ae70410d0

https://itnext.io/kubernetes-master-nodes-backup-for-kops-on-aws-a-step-by-step-guide-4d73a5cd2008?source=friends_link&sk=6968c6521efcf0875db0f9a07db1fbd9

I do think it's a long term investment but you will eventually see the
benefits after 18 months or so.

------
flerchin
We just moved 34 microservices to EKS, along with associated migration of
datalayer and built out coupling to the few things remaining on-prem. It took
about 24 months and cost about $20M, about twice the optimistic estimates. We
are beginning to see the force multiplier. As a person who does things, it was
a slog, but it seems to have been worth it.

~~~
switch007
What cost $20M?

~~~
flerchin
The labor. The AWS bill has not been significant.

------
kweinber
I really dislike the "herds not pets" analogy. It implies lack of attachment
due to necessity of slaughter - it also implies lack of attachment to your
craftsmanship (as in you care less about each host). I find it jarring every
time I read it and think the analogy is fraught and totally unnecessary.

It is a really awkward way to indicate that you need to think of things on a
service and resource basis as opposed to a hand-crafted machine basis. Do we
need to bring animals into it at all?

~~~
pm90
It's a simple term which uses a well known social phenomenon as an analogy to
describe an architecture. You could have called it the Maxwell-Hertz Philosophy
of infrastructure design, but then you would need to spend a ton of time
socializing it. But ... why? Pets vs cattle is easy to understand, you get the
concept right away with little explanation, and it's kinda tongue in cheek.
Another term I've heard used for it is to treat infrastructure as phoenixes:
you can burn it down but get a functioning one right away. But cattle vs pets
lets you refer to both kinds of machines.

I’m sorry if the references to animal cruelty (kinda?) makes you uneasy, but
let’s be honest, this is CS. Master-Slave replication, parent killing child
processes, canary deployments etc all seem distasteful if all jumbled
together. Their use is intended as an attempt at humor, but also a catchy and
vivid name to easily recall the underlying principles, which is kinda
brilliant.

------
ggregoire
Is Kubernetes all about scaling a service on a cluster of servers? Or is it
also used for managing the same applications on a bunch of servers?

e.g. 100 customers, 100 bare-metal servers, the same dockerized applications
on each one. I need to deploy a new version of one of these applications on
every server.

Or would something like Ansible be better suited and easier for this use case?
I never used Ansible, but I understood that I could make a recipe that does a
git pull and a docker-compose down && docker-compose up -d on every server in
a single command, right?
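
For what it's worth, a minimal playbook along those lines might look like this
(the inventory group, repo and paths are made up):

    # deploy.yml -- run with: ansible-playbook -i inventory deploy.yml
    - hosts: app_servers          # hypothetical inventory group
      become: yes
      tasks:
        - name: Pull the latest compose files
          git:
            repo: git@example.com:org/app-deploy.git   # placeholder repo
            dest: /opt/app
            version: master
        - name: Restart the stack
          shell: docker-compose down && docker-compose up -d
          args:
            chdir: /opt/app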

~~~
ianburrell
You can use Kubernetes to run a service on every machine. But that is usually
done for support services like logging that need to run on every node.
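
That per-node pattern is what a DaemonSet is for; a bare-bones sketch (the
logging image is just a placeholder):

    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: log-agent
    spec:
      selector:
        matchLabels: {app: log-agent}
      template:
        metadata:
          labels: {app: log-agent}
        spec:
          containers:
            - name: agent
              image: fluent/fluentd:v1.7   # placeholder logging agent
    # the scheduler places exactly one copy on every node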

Is it one customer per server? That wouldn't be a very good fit for
Kubernetes. Kubernetes is about running containers on a cluster of machines,
scaling multiple services.

Kubernetes would let you run a service per customer on a cluster of machines
and scale the number of containers for each customer independently. You
wouldn't have to set up a new server for each customer.

~~~
shaklee3
I disagree about it not being a good fit. Even if you run one application per
host (and if we're honest, it will always end up being more once we find
flaws), you still get the benefits of very predictable deployments, failovers,
lifecycle management, etc.

------
slad
I'm genuinely interested to know when containers should be used. I'm not
ignorant and understand the advantages - consistent env, run anywhere, fast
startup time, etc. But there is also considerable cost involved with increased
complexity, educating devs, hiring experienced DevOps, etc.

I have 15+ years of professional development and AWS experience and I
understand the scale.

Our company has 20-30 instances used by 5-6 services. We use terraform/ansible
to deploy infrastructure, so deploys are pretty reliable and repeatable. So I
am genuinely interested to understand if it's worth going the container route?

~~~
pm90
I don’t think you should move unless you expect your company will expand a
lot. It seems like your scale is fairly modest, your setup works and users are
happy. So it’s probably fine.

Advanced orchestration systems like kubernetes are very valuable if you want
to do multiple deployments in the same day, across many different services,
without losing your mind. They let you move very, very quickly... the cycle of
building a docker container, pushing it up to a registry and switching your
deployment to the updated version is ridiculously simple.
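
Concretely, a release can be as small as bumping one field in a Deployment
manifest and re-applying it (the names and tag below are placeholders):

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: web
    spec:
      replicas: 3
      selector:
        matchLabels: {app: web}
      template:
        metadata:
          labels: {app: web}
        spec:
          containers:
            - name: web
              image: registry.example.com/web:2020-01-15   # bump this tag to ship a new build
    # kubectl apply -f web.yaml then rolls the change out with the default rolling update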

So it depends on what kind of shop you are and what your devs want. If they
like to iterate rapidly, containers and kubernetes are a well known path to get
there.

------
dnprock
I find Kubernetes a funny tech experiment. It does not present any new
innovation. We first had Docker. It's too hard to use. Let's abstract Docker
with an orchestration layer. We have Kubernetes. It's now too abstract and
hard to use. Let's create kops to tame it.

I guess this whole experiment is backed by a deep-pocketed tech giant trying to
chip away market share from the other two giants.

If Kubernetes can make us less dependent on AWS, that's a good thing. Until
then, I just sit and learn about this silly tech battle. I'm not keen to move
production stuff into Kubernetes.

~~~
shaklee3
Things that are very general and powerful will always require some level of
skill to use. There are easy ways to use kubernetes (look at kubeadm), but you
still absolutely need to understand the fundamentals in case you hit a
problem.

------
billsmithaustin
Whether Kubernetes will really drive up your server utilization levels depends
on how effective your Operations staff were at bin-packing and whether your
Kubernetes cluster uses CPU quotas. There are inefficiencies in how the Linux
scheduler deals with CPU cgroups that can constrain CPU utilization.
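
Those quotas come from the per-container resource requests/limits in the pod
spec, e.g. (the values here are arbitrary):

    resources:
      requests:
        cpu: "500m"      # what the scheduler uses for bin-packing decisions
      limits:
        cpu: "1"         # enforced at runtime via the CFS cpu cgroup quota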

------
dgoog
K8s is insecure, heavily marketed by non-engineers, and overly complicated
for no reason.

It's basically OpenStack redux for the Instagram generation.

Most devops people I know think it's a total piece.

If the fanboyism doesn't die down I'd like to see these articles heavily
moderated.

