
Connecting Kubernetes services with linkerd - dankohn1
https://lwn.net/SubscriberLink/719282/d8444c95089cd014/
======
alpb
Great article, I've been able to learn more about why a cluster setup could
utilize linkerd. However, this statement confused me a little:

    However, Finagle is written in Scala, which Gould concedes is "not for everyone".

linkerd is written in Scala [1] and linkerd is written in Rust [2]. If he is
optimizing for other people’s contributions, is this really the best way to
go?

[1] [https://github.com/linkerd/linkerd](https://github.com/linkerd/linkerd)
[2] [https://github.com/linkerd/linkerd-tcp](https://github.com/linkerd/linkerd-tcp)

~~~
daenney
> linkerd is written in Scala [1] and linkerd is written in Rust [2]

Though your links are correct, your statement isn't. Linkerd is written in
Scala; the Rust project is linkerd-tcp, a TCP load balancer. linkerd (the
Scala one) doesn't handle raw TCP itself, since it covers HTTP and gRPC,
optionally encrypting connections between linkers using TLS.

> If he is optimizing for other people’s contributions, is this really the
> best way to go?

But is he optimising for other people's contributions? Using C or Rust for
something like a TCP proxy makes a lot more sense to me than having a JVM do
that. And if he perceives Scala as "not for everyone", maybe adding Rust means
he gets more contributions to those components?

I think that statement actually refers to Finagle itself being written in
Scala, so if you want to use it you have to at least host your own apps on the
JVM and integrate Finagle into them. Finagle is essentially client-side and
requires integration into your app, tightly coupling the two. By spinning it
out as a standalone component that acts as a proxy, like linkerd is doing,
Scala doesn't get in the way and what Finagle offers can be used with a mesh
of microservices written in a myriad of languages.
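
To make that concrete, here is a minimal sketch of what "acts as a proxy"
looks like from the application's side, assuming linkerd is listening on its
default HTTP proxy port (4140); the service name and path are made up:

```go
package main

// Sketch: instead of embedding Finagle (a Scala library) in the app, the app
// sends its HTTP calls through the local linkerd proxy and lets it handle
// routing, retries, and load balancing. Assumes linkerd listens on
// localhost:4140 (its default HTTP port); "users-service" is hypothetical.

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
)

func main() {
	proxyURL, _ := url.Parse("http://localhost:4140")
	client := &http.Client{
		Transport: &http.Transport{Proxy: http.ProxyURL(proxyURL)},
	}

	// The host is a logical service name; linkerd resolves it via service
	// discovery and picks a concrete instance.
	resp, err := client.Get("http://users-service/profiles/42")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	body, _ := io.ReadAll(resp.Body)
	fmt.Println(resp.Status, string(body))
}
```

The point being: the calling service can be written in anything that can send
HTTP through a proxy.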

~~~
alpb
Sorry, dropping the -tcp from linkerd-tcp was completely a typo on my end.

> I think that statement actually refers to Finagle itself being written in
> Scala, so if you want to use it you have to at least host your own apps on
> the JVM and integrate Finagle into them. Finagle is essentially client-side
> and requires integration into your app, tightly coupling the two.

Thanks, this is the answer I have been looking for.

------
Artemis2
Lyft has a tech similar to linkerd called Envoy:
[https://github.com/lyft/envoy](https://github.com/lyft/envoy). It is a bit
lower level than linkerd (based on APIs rather than already integrated with
common services) and written in C++.

I'm working on an adapter which should offer better integration with
Kubernetes. The end goal is to have Envoy replace our Ingress controllers. Not
having to run the JVM on every machine is also nice.

~~~
stock_toaster

> Not having to run the JVM on every machine is also nice.

Personally, the fact that Envoy compiles to a binary makes it much more likely
that I would experiment with it than with linkerd. I have hesitated trying out
linkerd because we don't run anything JVM-based in our stack, and we aren't
currently excited about introducing it.

I am definitely going to take a look at Envoy. Thanks for referencing it!

------
webo
The content of the article does not have anything to do with the title. There
is no mention of how linkerd and k8s work together.

------
jimktrains2
> Facilities that linkerd provides to assist with this include service
> discovery, load balancing, encryption, tracing and logging, handling
> retries, expiration and timeouts, back-offs, dynamic routing, and metrics.

Why not use SRV records and TLS? Retries, expiration, timeouts, and back-offs
can all be placed into the networking library, and commonly are.
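
For illustration, here's a rough sketch of that "do it in the client library"
approach: a DNS SRV lookup plus naive retries with exponential backoff. The
domain name is hypothetical and error handling is minimal:

```go
package main

// Sketch of the "just use SRV records and put retries/timeouts/backoff in the
// client library" approach. The DNS name is made up; a real library would add
// health checking, connection reuse, jitter, etc.

import (
	"fmt"
	"math/rand"
	"net"
	"net/http"
	"time"
)

func callWithRetries(path string, attempts int) (*http.Response, error) {
	// SRV lookup: service "api", protocol "tcp", zone "example.internal".
	_, addrs, err := net.LookupSRV("api", "tcp", "example.internal")
	if err != nil || len(addrs) == 0 {
		return nil, fmt.Errorf("SRV lookup failed: %v", err)
	}

	client := &http.Client{Timeout: 2 * time.Second} // per-request timeout
	backoff := 100 * time.Millisecond

	var lastErr error
	for i := 0; i < attempts; i++ {
		target := addrs[rand.Intn(len(addrs))] // naive load balancing
		u := fmt.Sprintf("https://%s:%d%s", target.Target, target.Port, path)

		resp, err := client.Get(u) // TLS comes from the standard library
		if err == nil {
			return resp, nil
		}
		lastErr = err

		time.Sleep(backoff) // exponential backoff between retries
		backoff *= 2
	}
	return nil, lastErr
}

func main() {
	resp, err := callWithRetries("/healthz", 3)
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```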

Also, regarding [https://linkerd.io/](https://linkerd.io/), what's with landing
pages having almost no useful information on them? "linker∙d is a transparent
proxy that adds service discovery, routing, failure handling, and visibility
to modern software applications". What does that even mean?

> Fast, lightweight, and performant
> Handles tens of thousands of requests per second per instance with minimal
> latency overhead. Scales horizontally with ease.

Well, since you're calling yourself "transparent" I would hope it'd be fast!

> Any language, any environment
> Runs as a transparent proxy alongside existing applications, integrates with
> existing infrastructure.

So, it's just proxying and retrying things and doing tls for me? Why isn't
this better handled in the library I'm using to make the call in the first
place?

> Latency-aware load-balancing
> Balances request traffic using real-time performance, reducing tail
> latencies across your application.

Sounds like it requires a lot of cross-talk on my network, eating up
bandwidth. That doesn't sound lightweight.

> Runtime traffic routing
> Provides dynamic, scoped, logical routing rules, enabling blue-green
> deployments, staging, canarying, failover, and more.

OK, some of that does sound useful, but some of it still feels like something
the caller should handle.

> Drop-in service discovery
> Integrates with most service discovery systems, decoupling applications from
> specific implementations.

What does this mean? Why not just use SRV records?

> Production-tested and proven at scale
> Powers the production infrastructure of banks, artificial intelligence
> companies, social networks, government labs, and more.

I know why this is here, but it feels like _everyone_ claims the same thing,
so, to me, it just becomes noise.

> "linkerd allows you to drop in a transparent layer of application resilience
> and provides the operational affordances critical for modern, cloud native
> environments." > —Oliver Gould, CTO, Buoyant

This doesn't make sense to me: you can't "drop in" resiliency. It needs to be
built in. The whole thing just feels like marketing and doesn't tell me
exactly what problems it's solving for me as a developer.

------------

Maybe I'm just an old fogey, or being dense, but I still don't see how this
solves any problems that I'm not already solving, or couldn't solve myself.

~~~
daenney
> > Latency-aware load-balancing
> > Balances request traffic using real-time performance, reducing tail
> > latencies across your application.

> Sounds like it requires a lot of cross-talk on my network, eating up
> bandwidth. That doesn't sound lightweight.

Why does that require eating bandwidth? It's a proxy. It can derive statistics
from past and current connections and make decisions based on that data for
new ones without talking to anyone else.
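
For what it's worth, a rough sketch of how that can work with purely local
data: keep a moving average of observed latency per backend and prefer the one
that currently looks fastest. The addresses are made up, and real balancers
(linkerd reportedly uses a peak-EWMA variant) are more sophisticated:

```go
package main

// Sketch of latency-aware balancing from locally observed data only: the
// proxy keeps an exponentially weighted moving average (EWMA) of response
// times per backend and routes new requests to the currently fastest one.
// Backend addresses are made up; no cross-node chatter is required.

import (
	"fmt"
	"time"
)

type backend struct {
	addr string
	ewma float64 // smoothed latency in milliseconds
}

const alpha = 0.3 // smoothing factor: higher reacts faster to new samples

// observe folds a measured request latency into the backend's moving average.
func (b *backend) observe(latency time.Duration) {
	sample := float64(latency.Milliseconds())
	if b.ewma == 0 {
		b.ewma = sample
		return
	}
	b.ewma = alpha*sample + (1-alpha)*b.ewma
}

// pick returns the backend with the lowest smoothed latency.
func pick(backends []*backend) *backend {
	best := backends[0]
	for _, b := range backends[1:] {
		if b.ewma < best.ewma {
			best = b
		}
	}
	return best
}

func main() {
	backends := []*backend{{addr: "10.0.0.1:8080"}, {addr: "10.0.0.2:8080"}}

	// Observations gathered from requests the proxy has already handled.
	backends[0].observe(20 * time.Millisecond)
	backends[1].observe(120 * time.Millisecond)
	backends[0].observe(30 * time.Millisecond)

	fmt.Println("next request goes to", pick(backends).addr)
}
```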

> Maybe I'm just an old fogey, or being dense, but I still don't see how this
> solves any problems that I'm not already solving, or couldn't solve myself.

So? If you're happy with your solution, keep it! But maybe others haven't
solved all these problems, don't like your approach to solving them, or don't
even want to spend time solving these problems in the first place. For many, a
component like this means ten fewer things to deal with and more time to focus
on building features that'll matter to their end users.

Besides, just because it can already be solved doesn't mean it's not worth
trying out other approaches.

~~~
jimktrains2
> Why does that require eating bandwidth? It's a proxy. It can derive
> statistics from past and current connections and make decisions based on
> that data for new ones without talking to anyone else.

Wouldn't those stats get out of date really quickly if you're not constantly
talking to that machine?

> For many, a component like this means ten fewer things to deal with and more
> time to focus on building features that'll matter to their end users.

But now you have a new component that you need to test and get to know. You
didn't get rid of those ten things, you just compartmentalized them into
something else that needs to be maintained.

> Besides, just because it can already be solved doesn't mean it's not worth
> trying out other approaches.

Very true; I just don't see the particular advantages over more basic and
well-known tools like DNS and TLS.

------
cookiecaper
I appreciate the sentiment behind linkerd et al but I think we are _really_
over-complicating things and tying ourselves into knots by going head-first
into this world. I can't help but think we're going to regret it in a few
years.

The complexity introduced not only by Docker, but also k8s and all of the
other things competing for a spot in the middle of your cluster, is immense
and _widely_ underestimated.

Your applications need to be aware of and adapted to the k8s model of
automatic pod termination, no persistent local storage, etc. Google has had
over 10 years with the best computer scientists money can buy to perfect its
stack. Most companies haven't.

For most companies, k8s/linkerd/Docker deployments are a case of
overengineering run _frighteningly_ amok.

There are many snafus lying down this path. I'm very worried about how
casually everyone is willing to jump aboard.

One particular worry: most Dockerfiles are built on opaque images registered
with an image registry, and many of the publishers are not well-known. If you
want Ruby, the habit is to go find someone's "pre-configured" alpine-ruby
image (for instance). These are frequently distributed by complete strangers.
Building "from scratch" off the base image distributed by the distro itself is
considered "the old way", at least among those I've worked with.

My understanding is that most public image repositories are unmoderated. The
moment a widely-used image gets pwned, there is going to be panic.

Source: been converting entire company to Docker/k8s

~~~
rubiquity
Yes, it's insane. I can only attribute it to the fact that I don't work on
problems of any scale (number of apps, servers, deployments per day, etc.) and
all of these people do. Who really wants to spend their time and effort
running a Kubernetes master and nodes, 3+ Consul/etcd nodes, and now some
fancy load balancers? The ratio of companies that would benefit from that to
the number of companies working on products in the space is way out of whack.

~~~
cookiecaper
I would not take the claims of scale at face value. People make this claim to
boost their own egos, and the reality is that many of the people who do this
would be more than fine on a handful of servers. They just want to feel
important.

My frank take is that k8s was developed at Google by the seat of their pants
(and yes, I know it's their third-gen orchestration platform), and they talked
about it with outsiders here and there. There was demand for it because
there's demand for anything and everything with Google's name attached to it,
regardless of applicability or propriety or any good sense (case in point:
companies pretending to be smart by "integrating" TensorFlow). So they
released k8s in response to the clamor for the mysterious Google-backed
orchestration platform, much to the detriment of non-Googlers everywhere.

Maybe it's a coordinated plan to harm young startups that may challenge them.
Or, alternate theory: perhaps it's a coordinated plan to get people to buy far
more cloud instances than they actually need, since k8s literally cannot run
on a single instance; to run it locally, you have to run virtual machines.

I understand that k8s has improved and that many people use it (myself among
them), but it maintains that vibe of an internal tool; something so convoluted
and complex and inside baseball that you'd only voluntarily undertake it if
someone said your paycheck was at risk.

~~~
TheIronYuppie
Just FYI, over 50% of contributions come from outside Google, and that's part
of the reason that so many people (CoreOS, Red Hat, Apprenda, Deis, thousands
of individual folks) have rebased on it.

I saw some of your other comments - I'd love to understand better what we
could do to either a) make it feel like not an internal tool and b) if you
didn't use Kubernetes or containers, how you'd prefer to run distributed
workloads.

TBC, I totally agree with you - if you're running 1-2 nodes, and don't care
about downtime, you should NEVER use Kubernetes or any orchestrator. Once you
get to 3+, however, I can't imagine using anything else.

Disclosure: I work at Google on Kubernetes

~~~
cookiecaper
So I actually answered this question for you when you asked it in December:
[https://news.ycombinator.com/item?id=13241681](https://news.ycombinator.com/item?id=13241681)
. ;)

Not to toot the same horn.

I have more experience with Kubernetes now than I did then, though I think a
lot of those interface snafus, which are most of what the old post discussed,
are still valid complaints.

You mentioned back then that if people aren't aware of some kube featureset,
they end up reimplementing it. You mentioned logging. Can you tell me what you
recommend for logging?

Right now, we have everything writing logs to stdout in the container, which
gets recorded by kube in something like /var/log/kube/containers/*. Then we
have a fluentd container that reads the logs and uploads them to an external
machine, which is running fluentd containers that receive the input stream,
transform it according to rules, and pipe it out to a cloud-based log
aggregator.

Is that how it's supposed to work? I know we used to do it somewhat
differently, but had trouble with resource consumption that was disrupting our
pods.
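
(For context, here is a toy sketch of what the per-node shipper in that
pipeline is doing conceptually: tail the container log files on the node and
forward each line to a remote collector. This is not fluentd, and the log path
and collector address are assumptions; it's just meant to show the shape of
the pipeline.)

```go
package main

// Toy stand-in for the per-node log shipper: read container log files from
// the node and relay each line to a remote collector. The log path and the
// collector address are assumptions for illustration only.

import (
	"bufio"
	"fmt"
	"net"
	"os"
	"path/filepath"
)

func main() {
	collector, err := net.Dial("tcp", "logs.example.internal:24224")
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot reach collector:", err)
		return
	}
	defer collector.Close()

	// Container stdout typically ends up in per-container files on the node.
	files, _ := filepath.Glob("/var/log/containers/*.log")
	for _, path := range files {
		f, err := os.Open(path)
		if err != nil {
			continue
		}
		scanner := bufio.NewScanner(f)
		for scanner.Scan() {
			// A real shipper tails continuously and tags lines with pod
			// metadata; here each line is relayed once, prefixed by filename.
			fmt.Fprintf(collector, "%s: %s\n", filepath.Base(path), scanner.Text())
		}
		f.Close()
	}
}
```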

And here's a potentially legitimate Kubernetes issue: the scheduler is only
reading the load _generated from containers administered by Kubernetes_ , so
even though the box is being thrashed, k8s will not be aware of the problem
and continues to report the node's resources as healthy. In our case, the
dockerd process was using 4.5 cores. Kubernetes seemed unaware and it was
affecting our performance because another busy pod was on the server and was
not getting clamped by its quota (since it, technically, was not hitting the
quota for itself). Shouldn't these be defined relative to overall system
resources, and not just the containers that kube can see?

It's possible Kubernetes _was_ aware, but that we didn't know how to see it. A
colleague told me the interface I was looking at for pod performance data
reflects only the _quotas_ , not the actual utilization, so maybe k8s knew
(why isn't there a `kubetop` that can be run on the host to supervise?). I
also hear that automatic rescheduling is still in draft stage, so if k8s
detects high load on a kubelet, instead of moving the pod, it will just not
assign new pods to it.

About addressing. Right now, we have dev, stage, and prod. We have a container
that takes all traffic targeting the kubelet on port 80 and proxies it out to
the pod using servicelb. It's set up to forward from
[http://app-label.env-name.example.com](http://app-label.env-name.example.com)
to a pod matching
that label. It's annoying that we have to run an external container for this
in the first place and do extra configuration on it, but it's also annoying
that since servicelb is not aware of the beta label attempting to simulate
hostnames, we can't individually address instances of apps. Is that how this
is supposed to work, or is there a ready-built solution that makes this much
easier that we're overlooking?

----

> I saw some of your other comments - I'd love to understand better what we
> could do to either a) make it feel like not an internal tool and b) if you
> didn't use Kubernetes or containers, how you'd prefer to run distributed
> workloads.

The best way to improve Kubernetes would be to:

a) simplify the terminology;

b) simplify the kubectl command, and in particular make it follow
semi-standard Unix utility conventions (examples: -f should mean force or not
be used at all (and yes, I know some other tools commit this sin too, but the
association with "force" is nerve-wracking in a production environment), so it
should probably be -r or -o instead; --from-file is ambiguous, and should be
--read-literal or -r/--raw or something like that);

c) simplify the configuration syntax and files;

d) provide simple ways to get real storage and real IPs (I haven't thoroughly
investigated StatefulSets, so they may do this). I know that Google long ago
moved to a plane where these are irrelevant, but normal companies haven't, and
IMO there's no reason they necessarily should;

e) provide simple backup and restoration methods (afaik, this involves
multiple steps, including backing up the etcd cluster (not allowed to run on
the master, or just conventionally doesn't?)).

My suggestion for distributed workloads is:

a) script your machine in Ansible or similar, and make a base image from that;

b) use your cloud provider of choice to deploy new instances under the
circumstances considered necessary;

c) register the new node with haproxy or whatever sits at your front end
(often a cloud-provided LB);

d) use conventional administrative and monitoring tooling.

All of that is fully scriptable, in far less time than it takes to convert to
Docker/k8s.

-----

> TBC, I totally agree with you - if you're running 1-2 nodes, and don't care
> about downtime, you should NEVER use Kubernetes or any orchestrator. Once you
> get to 3+, however, I can't imagine using anything else.

Why is the above-recommended distributed systems strategy inadequate, even for
people with more than 3 nodes who care about downtime? What's fundamentally
wrong with it? What unique value does k8s bring to the table?

What's the k8s replacement for htop? tcpdump? df? and all the other utilities?
Why are they better, and what benefit does k8s provide, that we didn't already
have, that warrants giving all of that up? I know it's not TRYING to replace
those tools per se, but the practicalities make Kubernetes the authority for
such information. You're not supposed to have to get onto the kubelet itself
to diagnose these types of issues, so you can't even use root on the Docker
host to try to do some of this.

In our deployment, the main k8s jockey just kills pods if they're acting up
and hopes that will magically fix it, because the devs don't want to get near
the setup with a 10-foot pole. Before, they would SSH in and go over the
issues. Now, not only can't you SSH in, but there's a convoluted process to
get a shell, and the container won't have any diagnostic tools in it anyway,
so it's very hard to have them collaboratively troubleshoot a problem we're
seeing in the wild.

I can't tell you how many times I've had someone tell me "They _must_ have a
better answer for that; Google uses it, after all...". These people are under
the mistaken impression that if Google is using it, it must be robust and
stable, when in fact the opposite is usually true, and I don't think Google
tries to pretend otherwise. Google has probably just misinterpreted the signal
from the developer community as meaningful approbation, when it's really just
people blindly following the cool kid on the block.

It's not that k8s isn't a neat thing. It's just that it's solving a problem
that pretty much only Google had, and now everyone else is plummeting down the
rabbit hole.

We've spent probably 1.5 man-years getting our infrastructure containerized
and kubed up. We're not at 100% in prod yet, but getting pretty close. It's
clear that I'm really dumb, but I haven't even been the main one doing this,
so you can't blame it on me. That's a _huge_ project, and how does k8s/Docker
justify the cost? It lets us cut down on AWS usage? Sure, but nowhere near in
proportion to the cost. It "makes it easier" to administer the cluster? Nope,
not a chance. It's fun to kill pods, but that's living on the edge; most apps
written outside Google can't take that. Solution? Write an "operator", which
appears to be any random program that intercepts the k8s API and does stuff to
support pods, etc. That sucks; why don't I just keep the init script that
works fine?

One could argue that k8s is about resource utilization, which is
sort-of-but-not-really true. And that has a downside; we've essentially turned our
environment into a shared host now, and despite k8s quotas, we still have pods
that are bad neighbors; this is turning out to be quite difficult to control.

So I'm just really at a loss here. Google wanted it and they built it and it's
working for them, that's great. That's an internal tool. How does this help
anyone who wants to run a normal site this way? I haven't heard a single
success story that had a moral or goal behind it other than "I'm cool like
Google now too."

And on top of all of this, before k8s even comes into the picture, you have to
make a Dockerfile and run your application in Docker, which is a PITA itself,
in terms of stability, security, _and_ configuration (see above-mentioned
dockerd hammering the box, dockerd hangs or breakages that render a box
useless, having to deal with port forwards, name collisions, large Docker
caches, docker's inability to remove old images or containers on its own,
Dockerfile lameness, culture of importing unknown images, etc.).

I like the idea of containers (and liked them back when they were called
"jails" in past eras ;) ) and I like the idea of orchestrators like
Kubernetes. But I think it's very early days for both, and that they will need
radical simplification and improved stability before they are appropriate for
general use.

