
Common mistakes using Kubernetes - marekaf
https://blog.pipetail.io/posts/2020-05-04-most-common-mistakes-k8s/
======
xrd
I wish there was a way to upvote something 10x once a month here. This would
be the post I use that on.

When I was writing my book, my editor asked me to remove any writing about
mistakes and changes I made in the project for each chapter. I had a bug that
appeared and I wanted to write about how I diagnosed and fixed it. They
said the reader wants to see an expert talking, as if experts never make
mistakes or need to shift from one tack to another.

But, I find I learn the most from explanations that share how your mental
model was wrong initially and how you figured it out and how you did it "more
right" the next time.

That's really how people build things.

~~~
fnord123
> They said the reader wants to see an expert talking, as if experts never
make mistakes or need to shift from one tack to another.

Your editor was very fucking wrong.

~~~
gridlockd
> Your editor was very fucking wrong.

The editor is completely right in what they were saying. You just want them to
be wrong, because you'd prefer to live in the fantasy world where they are
wrong.

Let's say you go in for surgery. You _don't_ want the doctor to tell you
about all the times they fucked up and what the awful consequences were. It
doesn't matter that they're probably a better surgeon now, having learned from
their mistakes. Psychologically, you need that person with the sharp tool
poking around inside your body to be a _superhuman_.

To a lesser degree, the same is true for any expert. Of course _everybody_
makes mistakes. Notice the de-personalization in the word "everybody". You can
talk about the mistakes _everybody_ makes, or those ones that _many people_
make. If you talk about _your own mistakes_ however, you lose the superhuman
status. There may be a few situations where that somehow helps you, but not
when you _want to sell books_.

~~~
sokoloff
I sure do want that surgeon to have been presented/instructed on the common
ways that surgeons before them have made mistakes and how to avoid or overcome
them.

That they won’t tell me (the patient) is quite a different question from
whether they got the material from someone more experienced in their primary
or continuing medical education.

~~~
gridlockd
The situation is different, the psychological effect is not.

------
speedgoose
In my opinion, the most common mistake is not in the article: using
Kubernetes when you don't need it.

Kubernetes has a lot of pros on paper, but in practice it's not worth it
for most small and medium companies.

~~~
balfirevic
Do you also include managed kubernetes offerings, such as from Digital Ocean,
in that assessment?

~~~
jcrawfordor
Honestly my experience has been that managed k8s is often _more_ complicated
from a developer perspective than just k8s - sure, you don't have to deal with
setting it up, but you have to figure out how all the 'management' and
provider-specific features work, and they often seem pretty clumsily
integrated.

~~~
geerlingguy
And in many cases features that would help your use case aren't enabled on
that platform, or are in a release that's still a year or two from being
supported on that platform... I'm looking at you, EKS.

------
yongjik
Well, nobody asked me, and I'm no expert, but here's my list of what (not) to
do in Kubernetes (if I had the authority).

1. There. Is. No. Machine. (Insert Matrix meme here.) Before you open up your
cluster to the rest of the company, drill it into them. Maybe even create a
Google Form where they have to sign "I hereby acknowledge that there is no
machine in k8s and any attempt to tie my job to a particular machine means a
broken config by definition."

2. Thanks to 1, don't let anyone use hostNetwork, hostIPC, hostPID,
hostPorts, host whatever, unless you have a really good reason to (with an
explicit approval process).

3. Don't let anybody start a job without memory/CPU limits (rough sketch at
the end of this list). Make sure they understand that if the job goes over the
memory limit, it dies, and it's not the k8s admin's problem.

4. You can't keep logs inside the pod - when the pod dies, the logs are gone.
You can't log into the machine either (see 1). Therefore, you really need
some kind of logging framework that takes the logs from your pod and saves
them, in their raw form, somewhere safe (like S3). I don't know if there's any
such framework, but there had better be.

5. Make sure every manual operation is logged (who did what to which job
when), unless you like asking "@here Does anybody know who owns fooservice?"
every month.

6. Kubernetes is not magic: if it takes thirty minutes to provision your
service, fix that, instead of moving thirty minutes of manual provisioning
into k8s and somehow expecting it to become magically reliable.

7. Don't bring in existing dependencies uncritically. If your job connects to
a zookeeper server to find out its peers, don't bring that into k8s as is -
rewrite it to use a k8s Service instead.

8. Take extra, extra care when writing your first job specification,
because there are a lot of yaml files to write, and people will just copy
what's already there. If your first k8s job mounts the host's /tmp directory
just because you were testing something and forgot to delete the line, soon
you will have fifty jobs all mounting the host's /tmp directory. Good luck
figuring out which job actually needs it then.
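
For point 3, a rough sketch of what that looks like in a pod spec (the image
name and the numbers are just placeholders to illustrate the fields):

    apiVersion: v1
    kind: Pod
    metadata:
      name: example-job
    spec:
      containers:
      - name: worker
        image: example/worker:latest   # hypothetical image
        resources:
          requests:
            cpu: 250m
            memory: 512Mi
          limits:
            cpu: "1"
            memory: 512Mi              # exceeding this gets the container OOM-killed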

Yeah, again, I'm by no means an expert - I'm not even an admin, so just
consider the list as a rambling of some poor soul who has seen some stuff.
Here be dragons, have fun.

~~~
user5994461
4. Centralized logging is a basic requirement in any company. A container
simply logs to stdout (in kubernetes) and a fluentd/logstash agent can forward
it (rough sketch at the end of this comment).

7. Sadly, if existing software can't run in kubernetes, that severely limits
the benefits of having kubernetes - why use something that can't be used? If
the jobs already have service discovery, they might be better off running on
hostNetwork or whatever allows them to work as is.
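
Re 4, a rough sketch of the usual pattern: a log-forwarder DaemonSet that tails
the container logs on each node. The image, namespace and mount here are
assumptions; a real setup also needs a ConfigMap with parsers and an output
(S3, Elasticsearch, ...) plus RBAC:

    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: fluent-bit
      namespace: logging
    spec:
      selector:
        matchLabels:
          app: fluent-bit
      template:
        metadata:
          labels:
            app: fluent-bit
        spec:
          containers:
          - name: fluent-bit
            image: fluent/fluent-bit:1.9    # any fluentd/logstash-style agent works
            volumeMounts:
            - name: varlog
              mountPath: /var/log           # stdout logs land under /var/log/containers
              readOnly: true
          volumes:
          - name: varlog
            hostPath:
              path: /var/log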

~~~
yongjik
Re: 7, if you use a separate discovery mechanism like zookeeper, then it will
have its own idea of which jobs are alive and ready, and k8s will have its own
idea of the same thing - they won't necessarily agree with each other, and
they don't even know of each other's existence. Maybe it would work, but it
seems like more hassle than rewriting the offending part.
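
For the rewrite direction, the k8s-native equivalent is usually a headless
Service: its DNS name resolves to one record per ready pod, so peers can find
each other without zookeeper, and readiness is kept in sync by k8s itself. A
minimal sketch with hypothetical names:

    apiVersion: v1
    kind: Service
    metadata:
      name: myapp-peers
    spec:
      clusterIP: None      # headless: DNS returns the IPs of all ready pods
      selector:
        app: myapp
      ports:
      - name: peer
        port: 7000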

From what I've seen, Kubernetes is a quintessential Google product - it has a
_very_ particular idea of how jobs should be run, and the farther you stray
away from it, the more it will cost your sanity. If that curtails the benefits
of k8s for you, then yes, I think you should revisit whether you really need
k8s, based on that limitation.

Just my two cents.

~~~
user5994461
If we're talking about zookeeper for service discovery: the registering
application can maintain a connection to zookeeper and ping every few seconds,
so zookeeper has a very good idea of what's running or not - usually more
accurate than kubernetes.

Actually, zookeeper has an API to register only when ready and to listen to
events/changes. I'm not sure kubernetes has an equivalent stable API, so it
might be hard to port over.

A rewrite might be a solution, except it's not, because it's doomed to fail.
We're discussing service discovery, which implies multiple clients and
servers, probably managed by different teams and written in different
languages. The odds of completing a coordinated rewrite effort are abysmal. ^^

Well. I am thinking out loud. It's a real problem my company was facing. We've
got kubernetes clusters that are supposed to run applications and we've got
apps using zookeeper that can't run in kubernetes because it breaks the
service discovery. I will eventually have to give people hints on how to make
it happen, after one year of kubernetes hardly going anywhere.

------
zegl
Great post! If you're in the Kubernetes space for long enough, you'll see all
of these configuration mistakes happening over and over again.

I've created a static code analyzer for Kubernetes objects, called kube-score,
that can identify and prevent many of these issues. It checks for resource
limits, probes, podAntiAffinities and much more.

1: [https://github.com/zegl/kube-score](https://github.com/zegl/kube-score)

~~~
kakakiki
Excellent tool. Can this analyse the result from kustomize files rather than
actual k8s YAML?

~~~
zegl
Yes, kustomize is not supported natively, but you can achieve the same effect
by piping the kustomize output to kube-score.

    
    
        kustomize build | kube-score score -

~~~
kakakiki
Thanks. One more question. For Visual Studio Code, Microsoft has a plugin
called Kubernetes - which I currently use. Have you done a comparison against
that?

------
otterley
I actually disagree with the first recommendation as written - specifically,
the advice not to set a CPU resource request to a small amount. It's not
always as harmful as it might sound to the novice.

It's important to understand that CPU resource requests are used for
scheduling and not for limiting. As the author suggests, this can be an issue
when there is CPU contention, but on the other hand, it might not be. That's
because memory limits are even more important than CPU requests when
scheduling: most applications use far more memory as a proportion of overall
host resources than CPU.

Let's take an example. Suppose we have a 64GB worker node with 8 CPUs in it.
Now suppose we have a number of pods to schedule on it, each with a memory
limit of 2GB and a CPU request of 1 millicore (0.001 CPU). On this node, we
will be able to accommodate 32 such pods.
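
As a rough sketch, the per-container resources block in that example would
look something like this (setting the memory request equal to the limit is an
assumption; the point is the tiny CPU request and the absence of a CPU limit):

    resources:
      requests:
        cpu: 1m          # tiny request: only a scheduling weight, not a cap
        memory: 2Gi
      limits:
        memory: 2Gi      # no CPU limit, so the pod can burst into idle CPU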

Now suppose one of the pods gets busy. This pod can have all the idle CPU it
wants! That's because it's a request and not a limit.

Now suppose _all_ of the pods become fully CPU contended. The way the Linux
scheduler works is that it will use the CPU request as a _relative weight_
with respect to the other processes in the parent cgroup. It doesn't matter
that they're small as an absolute value; what matters is their relative
proportion. So if they're all 1 millicore, they will all get equal time. In
this example, we have 32 pods and 8 CPUs, so under full contention, each will
get 0.25 CPUs.

So when I talk to customers about resource planning, I actually usually
recommend that they start with a low CPU reservation, and optimize for memory
consumption _until_ their workloads dictate otherwise. It does happen that
particularly greedy pods are out there, but that's not the typical case - and
those that are will often want all of a worker's CPUs, in which case you might
as well dedicate nodes to them and forget about micromanaging the situation.

~~~
jeffbee
If you ask for 0.001 CPU shares, you might get it. I would advise caution. If
that pod gets scheduled on a node with another pod that asks for 4 CPUs and
100MB of memory, it's not going to get any time.

~~~
otterley
It depends. If the second pod requests 4 CPUs, it doesn't necessarily mean
that the first pod can't use all the CPUs in the uncontended case.

A lot of this depends on policy and cooperation, which is true for any
multitenant system. If the policy is that nobody requests CPU, then the
behavior will be like an ordinary shared Linux server under load - the
scheduler will manage it as fairly as possible. OTOH, if there are pods that
are greedy and pods that are parsimonious in terms of their requests, the
greedy pods will get the lion's share of the resources if they need them.

The flip side of overallocating CPU requests is cost. This value is subtracted
from the available resources, making the node unavailable to do other useful
work. Most of the time I see customers making the opposite mistake -
overallocating CPU requests so much that their overall CPU utilization is well
under 25% during peak periods.

~~~
jeffbee
Most people would be thrilled to get anything close to 25% CPU util. I guess
one of the big missing pieces from Borg that hasn't landed in k8s is node
resource estimation. If you have a functional estimator, setting requests and
limits becomes a bit less critical.

------
bavell
Great article, I've learned many of these firsthand and agree with their
conclusions. I have some more reading to do on PDBs!

K8s is a powerful and complex tool that should only be used when needed. IMO
you should be wary of using it if you're trying to host fewer than a dozen
applications - unless it's for learning/testing purposes.

It's a complex beast with many footguns for the uninitiated. For those with
the right problems, motivations and skillsets, k8s is the holy grail of scale
and automation.

~~~
toshk
What would you suggest as a simpler alternative for deploying, running and
managing Docker? Docker-compose?

~~~
ghaff
What do you mean by docker hosting? Kubernetes (and other related tools) are
container orchestration/management tools. As is often the case in the
management space, if you're just running at small scale, you may not need
anything beyond container command line tools and some scripts. You could also
use Ansible to automate.

~~~
toshk
Thanks, we are running a few Node servers; right now we deploy via the command
line. For dev we use docker-compose. But we are looking for a way to easily
share our servers. We developed it for Amsterdam as open source. Around 20-30
cities are in line to start using it. It doesn't have to be scalable or have
high availability. Ease of deployment, an easy way to update, and basic
security. All the sysadmins are pushing for Kubernetes; although it makes
sense for the big cities, it's really starting to feel like overkill for small
cities that will run 1-3 non-critical sites with 0.5-5k users per month. Heard
a lot about Ansible, will look into it, thanks!

~~~
ghaff
So it sounds as if you've sort of outgrown the command line but aren't sure
you want to jump in on self-managed Kubernetes. You'd have to look at the
costs but maybe some sort of managed offering would work for you. It could
scale up for larger sites but would be fairly simple for smaller ones -
especially with standardized configurations.

You could look at the big cloud providers directly. OpenShift [I work at Red
Hat] also has a few different types of managed offerings.

------
triodan
The guaranteed QoS example in the article is wrong. Kubernetes only sets the
Guaranteed QoS if the CPU count is an integer (which 0.5 is not).

Also, to take full advantage of the QoS you need to configure the kubelet with
"--cpu-manager-policy static"[0].

[0]: [https://kubernetes.io/blog/2018/07/24/feature-highlight-
cpu-...](https://kubernetes.io/blog/2018/07/24/feature-highlight-cpu-manager/)
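
For reference, a sketch of a container resources block that would satisfy both
constraints - requests equal to limits, and an integer CPU count so the static
CPU manager (with the kubelet flag above) can give it exclusive cores. The
values are placeholders:

    resources:
      requests:
        cpu: "2"         # integer CPU count
        memory: 1Gi
      limits:
        cpu: "2"         # equal to the request
        memory: 1Gi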

~~~
marekaf
Thanks for pointing that out! I will edit it.

------
fergonco
Shameless plug, but on topic. I recently wrote about readiness and liveness
probes with Kubernetes. If you're looking for an educational perspective you
can check: [https://medium.com/aiincube-engineering/kubernetes-
liveness-...](https://medium.com/aiincube-engineering/kubernetes-liveness-and-
readiness-probes-with-spring-boot-185af0d5b5de)

------
haolez

        ...
        requiredDuringSchedulingIgnoredDuringExecution:
            ...
    

This instantly reminded me of this: [https://thedailywtf.com/articles/the-
longest-method](https://thedailywtf.com/articles/the-longest-method)

Kubernetes sometimes shows its Java roots.

~~~
ithkuil
> its Java roots

Citation needed.

AFAIR Borg is implemented in C++ and k8s has been implemented in Go from day
0.

Am I missing some crucial steps in k8s's history?

~~~
detaro
Apparently the first version was based on a Java prototype, but it's unclear
to me how much that's visible:

[https://kubernetes.io/blog/2018/06/06/4-years-
of-k8s/](https://kubernetes.io/blog/2018/06/06/4-years-of-k8s/)

> _Concretely, Kubernetes started as some prototypes from Brendan Burns
> combined with ongoing work from me and Craig McLuckie to better align the
> internal Google experience with the Google Cloud experience. Brendan, Craig,
> and I really wanted people to use this, so we made the case to build out
> this prototype as an open source project that would bring the best ideas
> from Borg out into the open._

> _After we got the nod, it was time to actually build the system. We took
> Brendan’s prototype (in Java), rewrote it in Go, and built just enough to
> get the core ideas across_

------
darkwater
Really nice article, it shows a lot of the small details you have to take into
account to go from "deploying Kubernetes" to "deploying a production-grade
Kubernetes", where production means some real, trafficked site.

------
dang
I'm glad readers are liking the article, but please read and follow the site
guidelines. Note this one: _If the title begins with a number or number +
gratuitous adjective, we'd appreciate it if you'd crop it. E.g. translate "10
Ways To Do X" to "How To Do X," and "14 Amazing Ys" to "Ys." Exception: when
the number is meaningful, e.g. "The 5 Platonic Solids."_

The submitted title was "10 most common mistakes using kubernetes", HN's
software correctly chopped off the "10", but then you added it back.
Submitters are welcome to edit titles when the software gets things wrong, but
not to reverse things it got right.

~~~
marekaf
Oops, sorry. I thought I had made a typo - that's why I corrected it. Thanks
for pointing that out.

------
apple4ever
"more tenants or envs in shared cluster"

This is what I'm trying to convince my current company about. They want
everything in a single cluster (prod, test, stage, qa).

Of course self-hosting makes this more difficult to justify, since it means
additional expense for more machines.

~~~
LambdaB
Have you considered using OpenShift instead of Kubernetes? It comes with
vastly improved multitenancy features, among other things, compared to plain
Kubernetes. OKD, the open source distribution of OpenShift, allows full self-
hosting: [https://www.okd.io](https://www.okd.io)

~~~
apple4ever
OpenShift comes with its own headaches from my understanding. And we are too
deep into Kube to switch now.

------
config_yml
What would a good liveness and readiness probe do for a Rails app? What kind
of work and metrics would these two endpoints do in my app?

~~~
Nextgrid
For the readiness probe a simple endpoint that returns 200 is enough. This
tests your service’s ability to respond to requests without depending on any
other dependencies (sessions which might use Redis or a user auth service
which might use a database).

For the liveness probe I guess you could check if your service is accepting TCP
connections? I don’t think there should ever be a reason for your service to
outright refuse connections unless the main service process has crashed (in
which case it’s best to let Kubernetes restart the container instead of having
a recovery mechanism inside the container itself like supervisord or daemon
tools).
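
As a rough sketch of what that could look like in the container spec, assuming
a Rails/Puma server listening on port 3000 and a hypothetical /healthz route
that just renders a 200:

    livenessProbe:
      tcpSocket:
        port: 3000           # is the server process accepting connections at all?
      initialDelaySeconds: 10
      periodSeconds: 10
    readinessProbe:
      httpGet:
        path: /healthz       # hypothetical route that simply returns 200
        port: 3000
      periodSeconds: 5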

~~~
kevindong
> For the readiness probe a simple endpoint that returns 200 is enough. This
> tests your service’s ability to respond to requests without depending on any
> other dependencies (sessions which might use Redis or a user auth service
> which might use a database).

If the underlying dependencies aren't working, can a pod actually be
considered ready and able to serve traffic? For example, if database calls are
essential to a pod being functional and the pod can't communicate with the
database, should the pod actually be eligible for traffic?

~~~
Nextgrid
The article explicitly warns against that:

> Do not fail either of the probes if any of your shared dependencies is down,
> it would cause cascading failure of all the pods.

The idea would be that the downstream dependencies have their own probes and
if they fail they will get restarted in isolation without touching the
services that depend on them (that are only temporarily degraded because of
the dependency failure and will recover as soon as the dependency is fixed).

------
zomglings
Really great article.

I have used Kubernetes pretty heavily in the past, and didn't know about
PodDisruptionBudget.

------
garadox
One thing to watch for with pod antiAffinity - if you use required instead of
preferred, and your pod count exceeds the node count, the remainder will be
left in Pending and won't spin up anywhere.
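
A rough sketch of the softer, preferred form, which spreads pods across nodes
when it can but still schedules them when pods outnumber nodes (the app label
is a placeholder):

    affinity:
      podAntiAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchLabels:
                app: myapp
            topologyKey: kubernetes.io/hostname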

~~~
dharmab
There's a new feature which does a better job of spreading Pods without
blocking scheduling quite as badly:
[https://kubernetes.io/docs/concepts/workloads/pods/pod-
topol...](https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-
spread-constraints/)
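
Roughly, a spread constraint looks like this (labels are placeholders;
ScheduleAnyway keeps it a soft preference, DoNotSchedule makes it behave more
like required anti-affinity):

    topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: kubernetes.io/hostname
      whenUnsatisfiable: ScheduleAnyway
      labelSelector:
        matchLabels:
          app: myapp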

------
hinkley
> You can't expect kubernetes scheduler to enforce anti-affinites for your
> pods. You have to define them explicitly.

Why isn't this the default behavior? Why don't I have to go in and tell it
that it's okay to have multiple instances on the same node? Why? So that I
somehow feel like I've contributed to the whole process by fixing something
that never should break in the first place?

I know of a few pieces of code where I definitely want to run N copies on one
machine, but for all of the rest? Why am I even running 2 copies if they're
just going to compete for resources?

~~~
jeffbee
It's quite possible that you have a machine with 192 CPU cores in it, but it's
very unlikely that you are able to write a service that scales to that level
... and if you write it in Go it's really unlikely that you can scale even to
8 CPUs. There's nothing weird about having multiple replicas of the same job
on the same node. If you look through the Borg traces that Google recently
published you can find lots of jobs with multiple replicas per node.

~~~
hinkley
This is not how defaults work.

When you are talking about the realm of the _possible_, you provide settings
that allow you to reach the scenarios that you feel are reasonable, desirable,
or lucrative (or commonly enough, some happy combination of the three).

Defaults are the realm of the probable. And nobody is requisitioning a 192
core machine without a good bit of due diligence, which would include deciding
how to set server affinity.

~~~
jeffbee
You're suggesting that preventing multiple replicas of the same job from
scheduling on the same machine is a good default. There's no evidence to
support your conclusion, and my experience is quite the opposite. It is much
better if people running batch jobs just schedule 100000 tiny replicas and let
the scheduler sort it out. This provides the cluster scheduler with plenty of
liquidity. Multiple small processes are more efficient than a shared-nothing
single process.

~~~
hinkley
Still the same question.

Do you think that batch processing is the default activity in Kubernetes, or
something that people find after they are familiar with the system?

~~~
jeffbee
Yes, I think batch workloads are the most common workloads, in resource-
weighted terms, among k8s users.

~~~
hinkley
You're being slippery, which comes across as dishonest.

Why does the resource weight have anything to do with the choice of defaults?
Settings don't care how often they are read, they only care how often they are
set. Large jobs use a disproportionate amount of total resources, sure, but
they are a tiny uptick in total configuration.

The stakes are higher, but so is the 'budget' for getting things right. I can
deploy 5 servers and just wait to see what happens. If I'm doing an overnight
job to process a billion records, I'd better be doing some due diligence
beforehand, or I have nobody to blame but me. And the failure mode here is
that I didn't spend money fast enough to get the job done.

With the current defaults what happens is I blow my monthly budget in one
night. Which is very convenient for the vendor, but not convenient for my
company.

"It is difficult to get a man to understand something when his salary depends
upon his not understanding it." - Upton Sinclair

------
whatsmyusername
TBH, for me it's usually "Using Kubernetes."

Maybe on GCP (I don't see a lot of companies on GCP) it makes sense, but ECS
is AWS native and on bare hardware I immediately go to docker swarm since it
ships with the container runtime (instead of a bolted-on sidecar container
thing).

I like the primitives and features of Kubernetes, but the implementation
doesn't give me warm fuzzies and it always gets passed over for safer bets for
me. Even very early on in its development I always went to Mesos over
Kubernetes (though Mesos is pretty much dead at this point).

------
aganame
Don't use Kubernetes unless you know you need it.

------
kirstenbirgit
Lots of good advice in this article.

------
musicale
Oh I thought this was

Common mistakes: using Kubernetes

------
mlthoughts2018
I think there is more to the story for some of these points and it can be
dangerous to just take this at face value as best practices.

For example on the liveness / readiness probe item, the article says,

> “ The other one is to tell if during a pod's life the pod becomes too hot
> handling too much traffic (or an expensive computation) so that we don't
> send her more work to do and let her cool down, then the readiness probe
> succeeds and we start sending in more traffic again.”

But this is often a very bad idea and masks long-term underprovisioning of a
service.

If the contention of readiness / liveness checks vs real traffic is ever
resulting in congestion, you need the failure of the checks to surface it so
you can increase resources. If you set things up so this failure won’t
surface, like allowing the readiness check to take that pod out of service
until the congestion subsides, you’re only hurting yourself by masking the
issue. It basically means your readiness check acts like a latency exception
handler outside the application, which is a very bad idea.

The other item that is way more complicated than it seems is the issue about
IAM roles / service accounts instead of single shared credentials.

In cases where your company has an enterprise security team that creates
extremely low-friction tools to generate service account credentials and
inject them, then sure, I would agree it’s a best practice to ruthlessly split
the credentialing of every application to a shared resource, so you can
isolate access and revocation.

But if you are on some application team and your company doesn’t have a mature
enough security tooling setup managed by a separate security team, this can
become a bad idea.

It can lead to superlinear growth in secrets management as there will be
manual service account creation and credential propagation overhead for every
separate application. Non-security engineers will store things in a password
manager, copy/paste into some CI/CD tool, embed credentials as ENV permanently
in a container, etc., all because they can’t create and maintain the
end-to-end service account credential tools in addition to their job as an
application team engineer. It’s something they think about twice per year and
need off their plate immediately to move on to other work.

Across teams it means you end up with 20 different team-specific ways to cope
with rapid growth of service accounts, leading to an even worse security
surface area, risk of credential-based outages, omission of important testing
because ensuring ability to impersonate the right service account at the right
place is too hard, etc.

Very often it is a _real_ trade-off to consider that one single service
account credential that has just one way to be injected for every service is
_safer_ in the bigger picture.

Yes it means a credential issue for any service becomes an issue for all, and
this is a risk and you want automated tooling to mitigate it, but it very
often will be less of a risk than insisting on a parochial best practice of
individual service account credentials, resulting in much worse and less
auditable secrets workflows overall _unless_ it is completely owned and
operated by a central security team in such a way that it doesn’t create any
approval delays or workflow friction for application teams.

~~~
jeffbee
You of course should monitor the rate of liveness flapping for your services.
The need to monitor it does not imply that it's a bad feature.

~~~
mlthoughts2018
You can’t have it both ways. If you need to monitor it and take corrective
action (which you do) then you shouldn’t rely on it.

This is an argument _for_ making your liveness probe == readiness probe. It
should just check pod availability in a minimal way, and if continuing to send
the pod traffic based on this indicator turns out bad because of congestion,
you want to see that causing errors and react, not let the scheduler take it
out of service for new traffic.

You want liveness & readiness to check the same thing, and it should be a non-
trivial check of service health that is also very low latency. And as long as
that check is passing, keep sending traffic.

When the check fails, it should always be for a “hard down” reason that tells
you the pod could not, regardless of traffic levels, accept traffic because
it’s fundamentally internally down.

~~~
jeffbee
I don't want the pager to go off just because of some slight non-liveness.
That's a likely outcome of high utilization (usually viewed as a good thing,
isomorphic with low cost). If you're running really hot and a few tasks are
shedding load by playing dead intermittently, that's OK up to a point; if a
large portion of pods are doing that at a high rate, that might be bad. You
might not even alert on it, just throw it up on a dashboard as an informative
indicator for operators.

~~~
mlthoughts2018
> “ I don't want the pager to go off just because of some slight non-
> liveness.“

That’s just bad engineering. Really, one _should_ want the pager to go off for
that and be really pedantic to actually sniff out the root cause and actually
fix it.

Hiding that type of issue by letting something like liveness/readiness policy
tacitly conceal it is just going to result in a far worse or more systemic
issue later with far worse pager disruptions to your life.

You’re skipping flossing every now and then only to need serious root canals
later.

------
sixhobbits
This also needs a companion post called "common mistakes: using kubernetes".

I feel like it's a weekly occurrence now that I hear of a startup launching
their MVP on Kubernetes, having spent 8 months too long in development as a
result.

The other day in an interview someone bragged to me how he had convinced his
team to spend 12 months moving to K8s. Upper management thought it was a waste
of time but eventually agreed. I asked him if there were any measurable
benefits and he said no.

I totally understand why Google needs it. Do you?

~~~
snupples
Yes we need it.

This is absolutely becoming a tiresome trope. K8s is a huge benefit to tons of
companies and none of them are Google.

Yes some people are using K8s when they don't need it. Just like many are
using cloud managed services when they don't need them. Or vms. Or insert any
technology here.

This article has nothing to do with whether K8s fits some particular use case
but may be of help (although I disagree with the entire section on resources
which reflects a lack of long term experience with K8s in production) to those
who do want to use it.

You're the 10th person in this thread saying the same thing and it doesn't
appear you even have that much experience with operations in general.

Sorry to go off on you, but I'm really seeing these types of tropes, quick
depthless one-liner comments, and off-topic snipes lately as the downfall of
the HN comment section.

~~~
recursive
Kubernetes doesn't benefit Google? How not?

~~~
ghostpepper
I think the intention of the post you replied to was to say that many
companies other than google have a legitimate need for kubernetes.

------
chx
It misses the biggest one: using it. I ranted about the cloud a decade ago
[http://drupal4hu.com/node/305](http://drupal4hu.com/node/305) and there's
nothing new under the Sun. Still, most companies doing cloud and Kubernetes
don't need it... practice YAGNI ferociously.

~~~
tcbasche
I think you _may_ be on the wrong side of history here

~~~
chx
Nothing new with that one. I still think git was the wrong choice for DVCS and
yet I have been using it for a decade or more now. I still feel I have an
uneasy truce with it, but not a friendship. I still think github is a shitty
choice for hosted git -- at least most large open source projects have gone
with gitlab, so I am not utterly alone in that. I am now using Kubernetes
because my primary client is using it. Doesn't mean I am happy with it or that
I think it's necessary by any means. It's fine. I am getting old but I can
still learn. Doesn't mean I can't be grumpy about it.

