
Knative = Kubernetes Networking++ - jlward4th
https://ahmet.im/blog/knative-better-kubernetes-networking/
======
drdaeman
> Knative is about to hit v1.0 and become «stable», has a solid community.

I'm sorry for the negativity but I'm skeptical about the reliability.

AFAIK it still doesn't really address failure scenarios. Software crashes,
hardware fails, power goes out. It is trivial to lose data, and it is trivial
to be blissfully unaware that data loss is even possible, because when
everything's green, things work perfectly fine.

My anecdote is, I was at a conference, listening to the "Introduction to
Knative" talk. Knew just the name and that serverless is a hot new thing.
There was a simple demo app that ingested earthquake events and displayed them
on a map, or something like that. Things looked neat and simple, so I wanted
to jump on that bandwagon, but... My first (and quite obvious) question was,
"what happens if, during event processing, the hardware node running a service
instance suddenly goes dark?".

I was surprised that there wasn't a meaningful answer, so I tried to research
it myself and found that - in my understanding - the event is just lost,
unless someone has taken extra care to implement such guarantees by adding
more and more statefulness. As I understand it, it's still K8s, and I can
deploy my own message bus/queue, but that devalues Knative for me.

[https://github.com/knative/eventing-contrib/issues/656](https://github.com/knative/eventing-contrib/issues/656)
is the issue that tracks it, and it was shoved away to the eventing-contrib
repo...

It could be that I don't understand something, or that I've made the wrong
conclusions.

---

Update: Found
[https://github.com/knative/eventing/pull/1949](https://github.com/knative/eventing/pull/1949)
- it seems they merged something just _yesterday_. Things are improving.
That's good.

~~~
jrockway
I am not sure an event queue is the primary focus of Knative, or at least of
this particular blog post. The post focuses on the problem of Kubernetes
networking, which out of the box is honestly pretty useless.

By default, a Kubernetes "service" exists to create a single IP address for a
group of replicas. You see that your "foo service" is at 10.2.3.4, you open a
TCP connection to it, and the connection lands on one of the available
replicas. You open another connection, and it goes to another replica. This
accomplishes some load balancing and lets new TCP connections avoid unhealthy
replicas.

The problem is, a TCP connection per request is relatively uncommon these
days. You open up one connection to MySQL and send it multiple queries. gRPC
sends multiple requests and responses with one TCP connection. HTTP/2 sends
multiple requests with one TCP connection. The days of one request = one TCP
connection are over.
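The mismatch can be sketched in a few lines of Python (the replica IPs are made up): kube-proxy-style balancing picks a replica once, at connection time, so a single long-lived multiplexed connection sends every request to the same replica:

```python
import itertools

REPLICAS = ["10.2.3.10", "10.2.3.11", "10.2.3.12"]

# kube-proxy style: pick a replica once, when the TCP connection is opened
_conn_rr = itertools.cycle(REPLICAS)

def open_connection():
    """Return the replica this new connection is pinned to."""
    return next(_conn_rr)

def send_requests_over_connection(replica, n):
    """Every request reuses the same connection, so it hits the same replica."""
    return [replica for _ in range(n)]

# One long-lived gRPC or HTTP/2 connection: 100 requests, one replica.
conn = open_connection()
hits = send_requests_over_connection(conn, 100)
print(set(hits))  # a single replica receives all 100 requests
```

Connection-level balancing only helps if connections are opened roughly as often as requests are made, which multiplexing protocols deliberately avoid.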

So there is a lot of work being done to make load balancing transparent at a
level above the TCP connection. Your app asks to connect to foo-service, and
each request is routed to some healthy replica. Replicas come and go, but your
app never has to handle reconnects. You want to send 10% of requests to a
canary instance without having to tell your application that; you want the
networking layer to handle that for you.
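A toy sketch of that request-level routing with a canary weight (the names and the deterministic 1-in-N scheme are illustrative; real meshes do this in a proxy, usually probabilistically or via weighted round-robin):

```python
import itertools

class RequestRouter:
    """Route each request (not each connection) across replicas,
    sending a fixed fraction of requests to a canary pool."""

    def __init__(self, stable, canary, canary_weight=0.1):
        self.stable = itertools.cycle(stable)
        self.canary = itertools.cycle(canary)
        # deterministic: send 1 out of every round(1/weight) requests to the canary
        self.period = round(1 / canary_weight)
        self.count = 0

    def route(self):
        self.count += 1
        if self.count % self.period == 0:
            return next(self.canary)
        return next(self.stable)

router = RequestRouter(["stable-1", "stable-2"], ["canary-1"])
targets = [router.route() for _ in range(100)]
print(targets.count("canary-1"))  # prints 10: exactly 10% hit the canary
```

The application only ever calls `route()` (or, in the transparent versions, just makes a request); the 10% split is configuration, not application code.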

There are several approaches to handling this problem.

The first is having smart clients; if your gRPC library knows how to ask the
Kubernetes API for a list of replicas, it can do its own load balancing and
health checks. It gets a list of valid endpoints, opens up a connection,
subscribes to the streaming health service that each backend provides, decides
on a load-balancing algorithm, and goes! This works quite well, but it is
difficult to set up and difficult to maintain. The C++ gRPC client has to do
this, the C++ HTTP/2 client has to do this, the Go gRPC client has to do this,
the Go HTTP/2 client has to do this, the Python gRPC client has to do this,
the Python HTTP/2 client has to do this, the node.js gRPC client has to do
this... you get the picture.
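A minimal Python sketch of the smart-client idea (the endpoint addresses and the health-update hook are made up; a real client would watch the Kubernetes Endpoints API and subscribe to a streaming health-check RPC):

```python
class SmartClient:
    """Toy smart client: holds the endpoint list, tracks per-backend
    health, and round-robins across the currently healthy backends."""

    def __init__(self, endpoints):
        self.health = {ep: True for ep in endpoints}
        self._next = 0

    def on_health_update(self, endpoint, healthy):
        # in a real client, fed by a streaming health-check subscription
        self.health[endpoint] = healthy

    def pick(self):
        healthy = [ep for ep, ok in self.health.items() if ok]
        if not healthy:
            raise RuntimeError("no healthy backends")
        ep = healthy[self._next % len(healthy)]
        self._next += 1
        return ep

client = SmartClient(["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"])
client.on_health_update("10.0.0.2:8080", False)  # backend reports unhealthy
picks = [client.pick() for _ in range(4)]
print(picks)  # only the two healthy backends are picked, alternating
```

The logic itself is simple; the maintenance burden is that this loop has to be reimplemented correctly in every language and protocol stack you run.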

I am most familiar with this approach from my days at Google, where load
balancing typically consisted of a smart client for every language that
connected to a coordinator, asked for backends and a rule for splitting
traffic among them, and stayed connected to receive updates. It worked
extremely well. You could drain a cluster very quickly, and obviously traffic
kept flowing if the load balancer died. (Not sure I ever saw the load balancer
die, though.) The disadvantage is that a malfunctioning smart client can break
everything; every piece of code you deploy to production in every language has
to work perfectly. (From reading job ads, I feel like few companies are as
disciplined as Google at saying "you have to use one of these 3 supported
programming languages". So in the real world, the "smart client" approach will
never work. You can check out the existing gRPC client libraries to see this
in action; best case, there is round-robin load balancing based on DNS
records, with some metadata that can also be retrieved from DNS. C++ supports
it best, Go is next, the rest of the language bindings... questionable. Raw
HTTP/2? Good luck.)

The second approach is to have a smart proxy that your apps use to get to the
outside world. You run this smart proxy as a "sidecar", receive all traffic
through it, and send all outgoing traffic through it. It knows where all the
backends are and how to load balance, so the app doesn't have to. Personally,
I think this is the most practical approach right now. Run Envoy next to your
apps, make your services "headless" (so Kubernetes just returns all endpoints
over DNS, rather than trying to do TCP-level load balancing), and go! Even
your outdated PHP app will now have detailed metrics and participate in
distributed tracing. The disadvantage is that you have to manually add an
Envoy container and minimal config to everything you run, which people
apparently hate.

I like this approach, it's what I personally do. Envoy has its quirks, but it
does provide a lot of observability with very little effort. You can code
whatever custom tooling you want with an xDS control plane, and even with a
static config file and some DNS records, it does a great job of ensuring that
every request you make hits a working backend. (And alerting you when there
are none.)
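For concreteness, a minimal sketch of what such a static Envoy sidecar config might look like (the `foo` cluster, service name, and ports are hypothetical; this assumes the Envoy v3 API and a headless Kubernetes Service whose DNS returns one A record per pod):

```yaml
static_resources:
  listeners:
  - name: egress
    address:
      socket_address: { address: 127.0.0.1, port_value: 10000 }
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: egress_http
          route_config:
            virtual_hosts:
            - name: foo
              domains: ["*"]
              routes:
              - match: { prefix: "/" }
                route: { cluster: foo }
          http_filters:
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
  clusters:
  - name: foo
    type: STRICT_DNS          # re-resolve the headless service; one endpoint per A record
    lb_policy: ROUND_ROBIN    # balance per request, not per connection
    load_assignment:
      cluster_name: foo
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: foo-service.default.svc.cluster.local
                port_value: 8080
```

The app then just talks to 127.0.0.1:10000, and Envoy handles per-request balancing, retries, and stats.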

The last approach is to redesign Kubernetes networking so that it
transparently provides the same thing that a sidecar would. That is where
things like Istio, linkerd, and knative's networking stuff fit in. They take
your deployment specification, inject a sidecar, and adjust iptables rules to
force your application to talk through the sidecar. Some of them, I believe,
come with a control plane so that you can perform management operations
cleanly (canaries, draining a region or node, etc.).

My personal opinion is that they do too much; Istio infers a port's protocol
from its name, so if you name your syslog port "grpc" by accident, goodbye
traffic! What was supposed to be documentation is now an undocumented part of
the Kubernetes API. And since it's all non-standard, what works on your Istio
cluster may not work on your Linkerd cluster. You can write well-formed config
files that work on one cluster but not on another. So I don't think this is
the approach that will eventually win.

Approach 1, a smart client, is the most likely to be reliable and
understandable. It's also the least likely to work in a world where people's
Excel spreadsheets run in production and need to load-balance their HTTP/2
requests. Approach 2 is probably the most difficult to set up right now, but
it's a good compromise for the real world. You type in the config that you
want, and then that is executed in an observable and maintainable way. You
have to type stuff though, so it's unpopular. Approach 3 is the most popular
and most likely to cause problems. But I'm sure that if we keep bludgeoning it
with enough hammers and hire enough Kubernetes Certified Operators at $300,000
a year, we can make it work!

Sorry for the rant, but it just blows my mind how much time people will spend
to make transparent something that is actually fine to see. All we need from
Kubernetes is to tell us where the replicas are. Then we can write our own
load balancing that works for our application.

~~~
shaklee3
>The disadvantage is that you have to manually add an Envoy container and
minimal config to everything you run, which people apparently hate.

Correct me if I'm wrong, but I think Envoy (or Istio, at least) uses a
mutating admission webhook to inject the sidecar into your pod. This means you
never have to enter anything into your pod spec.

Edit: I noticed you pointed this out with istio.

~~~
jrockway
Yeah, Istio is the Kubernetes glue for Envoy. Envoy is just a proxy; it can
exist without Kubernetes and as far as I know, knows nothing about it.

------
jcmontx
As a developer who started his career during the cloud/everything-as-a-service
boom, thinking about Kubernetes networking gives me anxiety

------
KidComputer
Still waiting for it to support basic but crucial functionality like
tolerations and node affinities. And no, using admission plugins like
PodNodeSelector is not viable on managed clusters like GKE.

------
outside1234
It's not governed under the CNCF, so really, what Knative equals is Google
lock-in.

~~~
wstrange
That seems premature and unfair. Knative has representation from Google, IBM,
and Red Hat, and the steering committee is working on a more inclusive
governance structure. (I have no inside knowledge of this - just what I have
gleaned from Slack ;-) )

~~~
outside1234
What's premature (it has been around for at least a year)?

And why is it unfair to ask for open governance from companies other than
Google and IBM (Red Hat is part of IBM)?

~~~
evankanderson
Pivotal is also on the steering committee.

Representation is based on contributions to the project, and Google has
publicly stated that they look forward to not being a majority on the steering
committee (when other companies have exceeded Google's contributions).

There are CNCF projects whose governance is entirely within a single company.
I get that there's a great narrative here that "Google doesn't share", but it
seems like Google is being held to a higher bar here.

