
Blending complex systems made my latency 10x higher - wheresvic3
https://srvaroa.github.io/kubernetes/migration/latency/dns/java/aws/microservices/2019/10/22/kubernetes-added-a-0-to-my-latency.html
======
bavell
Very misleading title, was hoping for a more substantive read. Kubernetes
itself wasn't causing latency issues, it was some config in their auth service
and AWS environment.

In the takeaways section, the author blames the issue on merging together
complicated software systems. While absolutely true, this isn't specific to
k8s at all. To specifically call out k8s as the reason for latency spiking is
misleading.

~~~
tunesmith
I think the point is that this was a real problem that happened because of
combining k8s and AWS, which is a pretty common scenario. And it underscores
that the bug was hard to find - I'm not sure how many people on my team would
be comfortable looking deeply at both GC and Wireshark. It required asking
"why" a few more levels deep than bugs usually require, and I think a lot of
developers would get stumped after the first couple of levels. So it's another
data point suggesting that a proper k8s integration is not as easy as people
might expect.

I also get the sense that that team has a better-than-average allocation of
resources. On some teams I've been on, this type of problem would be handed to
one person and expected to be solved within an afternoon, with impatient
product people and managers checking in for status after that.

~~~
StreamBright
This is exactly what happens when you abstract anything away, in this case
infrastructure. Most of the time people focus on the value added by the
abstraction. This time somebody had to face the additional burden it
introduced, which made the bug harder to track down.

------
01CGAT
"Once this change was applied, requests started being served without involving
the AWS Metadata service and returned to an even lower latency than in EC2."

Title should be: My configuration made my latency 10x higher.

~~~
donaldihunter
We solved our configuration problems and, most rewardingly, we achieved better
performance than the original EC2 baseline.

~~~
paulddraper
Too long to fit in the title, I suppose.

------
lacker
The problem wasn't in Kubernetes at all! The problem was in KIAM and the AWS
Java SDK.

It would be more accurate to criticize AWS's Kubernetes support. Both KIAM and
the AWS Java SDK are specific to AWS.

~~~
pojzon
We found EKS really disappointing compared to a self-hosted solution. Not only
can you not tweak extremely important kubelet configuration, you also cannot
run real HA. Most of AWS's implementation around EKS was simply terrible and
outclassed by community-driven projects. For me personally, EKS is the same
kind of failed service as the Elasticsearch Service: good for low-to-medium
size workloads but terrible for anything world-class.

~~~
threeseed
Why can't you run "real HA" with EKS?

It is, after all, running the control plane in multiple AZs.

~~~
nielsole
Also, why can't you modify kubelet params? You are completely in charge of the
nodes and can configure them freely.

~~~
pojzon
There are some cluster parameters you simply cannot change, either because the
API refuses them or because they aren't exposed at all (some node- and
cluster-dependent parameters too). One example is the HPA downscale grace
period.
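
To illustrate the kind of knob involved: on a self-hosted cluster the
downscale behaviour is a kube-controller-manager flag, which you can set by
editing the control plane manifest; on EKS the managed control plane puts it
out of reach. A minimal sketch (the value is illustrative):

    # Self-hosted kube-controller-manager static pod manifest (excerpt).
    # Not something you can touch on a managed control plane like EKS.
    spec:
      containers:
        - name: kube-controller-manager
          command:
            - kube-controller-manager
            # wait 10 minutes before scaling a workload back down
            - --horizontal-pod-autoscaler-downscale-stabilization=10m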

------
TimMurnaghan
What grinds my gears is hard-coded magic timeout numbers. Somehow microservice
people seem to think these are good (e.g. for circuit breakers) without
realizing the unexpected consequences of composing systems like this. Your
timeout is not my timeout. So firstly, don't do it - time is an awful thing to
build behaviour on. And secondly, if you ignore that, at least make it a
config parameter - then I've got a chance of finding it without wire-level
debugging (provided you document it).
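
In Kubernetes terms, a minimal sketch of "make it a config parameter" could
look like this (the names are illustrative, not from the article): the timeout
lives in a ConfigMap, so it is documented, visible, and overridable per
environment instead of buried in the code.

    # Hypothetical sketch: the client timeout is configuration, not a constant.
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: upstream-client-config
    data:
      request-timeout: "2s"

The Deployment then injects it (e.g. as an env var via configMapKeyRef) and
the client reads that value instead of a hard-coded literal.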

~~~
kevan
The timeouts this post is talking about are related to credential expiration
and when to refresh, not the request/connection timeouts you'd see in
microservices. In this case, never expiring credentials isn't a great option,
because you'd lose a useful security property: reducing the time window in
which stolen credentials can be used.

For service behavior (e.g. requests/connections), timeouts provide value for
both services and clients. For services, if you never time out, then under
failure conditions you either end up saturating your max concurrent request
limits or growing your concurrent connections indefinitely until you hit a
limit (connections, threads, RAM, CPU). Unless all of your clients are offline
batch processes with no latency SLA, there's a good chance that the work
clogging up your service was abandoned by your clients long before it
completes.

Timeouts also help clients decide if/when they should retry. Even if the
service never times out, clients can't really tell whether their request is
just taking a long time or whether something is wrong and the request will
never succeed (e.g. network partitions, hardware failure). There are at least
implicit latency SLAs on most things we do (1 second? how about an hour, or a
week?). Given that there is a limit somewhere, it makes sense to use that
limit to get benefits like resiliency in services.

> Your timeout is not my timeout.

Absolutely. Client deadlines are a great way to reduce wasted work across
distributed systems. e.g. service has 60s timeout, but client has a 5s timeout
for an online UI interaction. The client can tell the service to give up if
it's not completed within the client's SLA.

------
beat
And this is why I find DevOps work more interesting than programming. System
integration is just endlessly challenging. I always enjoy reading a well-
documented integration debugging session!

~~~
auggierose
In my opinion DevOps shouldn't exist at all.

~~~
icedchai
Okay. So based on 20+ years of experience, I can say that most developers have
no interest in automating deployment, configuration, monitoring, performance,
logging, etc... Who should do this work?

~~~
learned2code
Yeah, operations-focused engineers will continue to have a niche carved out
for them because too many devs black-box infrastructure.

Companies can either choose to have their devs take on ops responsibilities or
continue having dedicated ops jobs.

In either case, whether or not dedicated ops jobs exist, ops responsibilities
always will. I'll be there to pick up the slack, because designing and
maintaining systems is an interesting job that has to be done, that a lot of
people can't or won't do, and that pays accordingly.

~~~
tunesmith
People overlook that there's a common systematic belief that looks like this:

    
    
      1) Developers don't code on prod
      2) Prod needs to be protected
      3) Therefore, developers need to be restricted from prod
    

One of my teams went through the process of "let's do DevOps!" with the intent
of giving developers the ability to push something all the way through to
prod on AWS. Months later, this resulted in a poorly-supported dev-only
VPC with IAM/policy restrictions, and other "official" VPCs that devs are
locked out of in various ways. Since then, devs have had little incentive to
learn and are again reliant on Ops for any deployment problems.

~~~
lostcolony
There's a common systematic belief of that sort because it's the kind of thing
a lot of actual compliance regulations de facto require (i.e., they demand
controls around software deploys, and putting the enforcement of those
controls in the same hands as those who want to deploy, i.e. devs, will fail
an audit).

Source: My employer is currently undergoing SOX compliance

~~~
beat
And there's a good reason that separation of duties is in every compliance
standard...

~~~
lostcolony
Not saying whether it's a good or bad idea, just that it's a common systematic
belief because it's a required thing in many organizations.

------
bboreham
This issue has been hit by quite a number of people in the last year. [1]

AWS have recently added the feature natively [2], so once that version of the
SDK is in use by all your pods you won't need Kiam any more.

[1]
[https://github.com/uswitch/kiam/issues/191](https://github.com/uswitch/kiam/issues/191)
[2] [https://aws.amazon.com/blogs/opensource/introducing-fine-
gra...](https://aws.amazon.com/blogs/opensource/introducing-fine-grained-iam-
roles-service-accounts/)
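
For reference, the native mechanism (IAM Roles for Service Accounts) hangs the
role off the pod's service account via an annotation; a minimal sketch with a
placeholder role ARN:

    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: my-app                    # illustrative name
      namespace: default
      annotations:
        # role assumed via the projected web identity token, no Kiam involved
        eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/my-app-role

Pods using that service account get the token and role ARN injected, which
sufficiently recent SDK versions pick up automatically - hence the "once that
version of the SDK is in use" caveat above.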

------
antoncohen
We have hit similar issues with GKE. GKE has a soon-to-be-deprecated feature
called "metadata concealment"[1]: it runs a proxy[2] that intercepts the GCE
metadata calls. Some of Google's own libraries made metadata requests at such
a high rate that the proxy would lock up and not service any requests. New
pods couldn't start on nodes with locked-up metadata proxies, because those
same libraries that overloaded the proxy would hang if metadata wasn't
available.

That was compounded by the metadata requests using the DNS name rather than
the metadata IP, and until recently Kubernetes didn't have any built-in local
DNS cache[3] (GKE still doesn't); the metadata lookups in turn overloaded
kube-dns, making other DNS requests fail.

We worked around the issues by disabling metadata concealment and adding the
metadata hostnames to /etc/hosts using pod hostAliases:

    
    
        hostAliases:
          - ip: "169.254.169.254"
            hostnames:
              - "metadata.google.internal"
              - "metadata"
    

[1] [https://cloud.google.com/kubernetes-engine/docs/how-
to/prote...](https://cloud.google.com/kubernetes-engine/docs/how-
to/protecting-cluster-metadata#concealment)

[2] [https://github.com/GoogleCloudPlatform/k8s-metadata-
proxy](https://github.com/GoogleCloudPlatform/k8s-metadata-proxy)

[3] [https://kubernetes.io/docs/tasks/administer-
cluster/nodeloca...](https://kubernetes.io/docs/tasks/administer-
cluster/nodelocaldns/)

------
wpietri
I really enjoy find-the-issue stories like this. Bug hunting can be such fun.

------
jeffdavis
"We are blending complex systems that had never interacted together before
with the expectation that they collaborate forming a single, larger system."

I suggest more people read the first few chapters of _Specifying Systems_ by
Lamport. Maybe the rest is good also, but that's as far as I got.

It works through a trivial system (a clock display) and combines it with
another trivial system (a weather display).

Nothing Earth-shattering, but it really stuck with me. Thinking about it at
that level gave me a new appreciation for what combining two systems _means_.

------
brainflake
Isn't this more of an AWS-specific issue?

------
kodablah
Agreed about poor title, but:

> DNS resolution is indeed a bit slower in our containers (the explanation is
> interesting, I will leave that for another post).

I would like to see this expanded upon, or to hear whether anyone else has run
into something similar.

~~~
stuff4ben
Exactly! On our OpenShift production cluster we ran into ndots problems with
DNS and slow DNS resolution overall. This blog post was very helpful in
understanding the issue and ways to fix it: [https://pracucci.com/kubernetes-
dns-resolution-ndots-options...](https://pracucci.com/kubernetes-dns-
resolution-ndots-options-and-why-it-may-affect-application-performances.html)
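
For anyone hitting the same thing, the usual per-pod mitigation described
there is to lower ndots via dnsConfig (or to fully qualify external names with
a trailing dot); a minimal sketch of the pod spec fragment:

    # Pod spec fragment: with ndots lowered, single-label external names are
    # no longer expanded through every search domain before being tried as-is.
    dnsConfig:
      options:
        - name: ndots
          value: "2"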

~~~
smarterclayton
Yeah, Tim Hockin and I still regret not designing the DNS name search process
in Kube better. If we had, we would have avoided the need for ndots:5 and
could have kept 90% of the usability win of “name.namespace.svc” and “name”
being resolvable to services without having to go to 5. And now we can’t
change it by default without breaking existing apps.

Forwards compatibility is painful.

~~~
kodablah
Pardon my lack of Kubernetes knowledge, but any regrets about supporting the
hierarchical lookup, where apps don't have to fully qualify their DNS requests
(and maybe could have used some other way to find their "same namespace")?

~~~
smarterclayton
Good question. I certainly use the “name” and “name.namespace.svc” forms
extensively for both “apps in a single namespace” and “apps generic to a
cluster”.

I know a small percentage of clusters make their service networks public with
DNS resolution (so a.b.svc.cluster-a.myco is reachable from most places).

The “namespace” auto-injected file was created long after this was settled, so
that wasn’t an option. I believe most of the input was “we don’t like the auto
env var injection that Docker did, so let’s just keep it simple and use DNS
the way it was quasi-intended”.

Certainly we intended many of the things istio does to be part of Kube
natively (like cert injection, auto proxying of requests). We just wound up
having too much else to do.

------
btown
Isn't this a KIAM bug? The default configuration of any piece of software
should not cause pathological cases in other pieces of software _that are
commonly used with it._ Maybe I'm just a bleeding heart, but I think good
software delights its users; the deployment and configuration story is a part
of this.

------
nova22033
This is good info, but the title is misleading. It could have easily been "my
latency increased because my local time was off" or something like that. This
had nothing to do with Kubernetes... it was a problem with their setup.

------
nilshauk
It was a nice read. The real lesson here is how to diagnose issues like this.

------
yskchu
> Kubernetes made my latency 10x higher

The title is a bit misleading - Kubernetes didn't cause the 10x latency, and
latency was actually lower after they fixed their issues.

TL;DR version - They migrated from EC2 to Kubernetes; due to some default
settings in Kiam & the AWS Java SDK, application latency increased; it was
fixed after reconfiguration, and the Kubernetes latency ended up lower than
EC2's.
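
For the curious, my understanding is that the reconfiguration boils down to
making Kiam issue longer-lived credentials, so the SDK isn't constantly inside
its refresh window. A rough sketch of the kiam-server container args (I'm
assuming --session-duration is the relevant flag; the value is illustrative):

    # kiam server DaemonSet, container args (sketch)
    args:
      # default is 15m, short enough that the AWS SDK spends a lot of its
      # time refreshing credentials on the request path
      - --session-duration=60m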

~~~
rb808
It is relevant, though, as k8s makes everything more complicated, so you have
to deal with stuff like this. Also, if it were a brand new app they'd maybe
not have noticed the problem in the first place.

~~~
yskchu
I'm not saying it's not relevant - it was actually a good technical read; they
went through a lot of detail in their troubleshooting.

I'm just saying the title is misleading, and clickbaity.

------
crb002
Listen to Grace Hopper. Count the feet (nanoseconds) your data has to travel
at the speed of light, the way the Admiral did, and respect the laws of
physics.

