
Etcd, or, why modern software makes me sad - Spellman
https://www.roguelazer.com/2020/07/etcd-or-why-modern-software-makes-me-sad/
======
mvanga
This is one weird comment section.

There are people attacking the author for a statement made about CoreOS, and
for some hate towards Kubernetes.

The key point of the article is not really being addressed here: vested
interests from large companies are able to introduce huge complexity into
simple, well-designed projects. While the complexity may be good for some end
that said vested interest has in mind, they are also in a position to absorb
the cost of that increased complexity.

In the meantime, the simpler version of the software is long gone, with the
added complexity placing a large burden on the solo developer or small team.

It's almost like there's no good representation in the open-source world for
the solo developer or small team. Funding and adoption (in the extreme, some
might say hijacking) of open-source projects from large corporations dictates
the direction of all major software components today. Along with the roses
come some very real thorns.

Just my 2c.

~~~
dynamite-ready
Agreed. I saw it as a general lament against over-engineering. I don't think
the point got lost in the super specific example...

You could just as easily level similar rants against the likes of React and
its wider ecosystem, TensorFlow, TypeScript (many will disagree), Docker...
I'm sure others have their own bugbears.

Much of this is subjective, of course. But to me, it feels like software
development is trending towards unnecessarily complicated development processes
and architectures.

And the only beneficiaries appear to be the large technology companies.

I suppose in exchange, you're getting a guarantee of maintenance. But is that
really worth the additional complexity associated with the common use of these
tools?

~~~
IggleSniggle
TypeScript is a funny one. I love the language, but at the same time, I
totally agree with the premise that it is unnecessary complexity! And yet I
swear by it. I can't explain why there's not more cognitive dissonance there.

JavaScript taught me to love async, then functional programming, and
TypeScript taught me to love static types. I'm now desperately wishing for
a world of OCaml/Haskell, but where are you going to find teammates using
those? And so I'm back at TypeScript.

I think all of these "higher order" languages that do transpiling (including
Scala, Clojure, Kotlin, Groovy, etc) fit this same scenario. Increased
toolchain complexity for a decrease in development/test/maintenance
complexity.

With any toolchain, though, it can be hard to know where to draw the line for
"worth it for this project" until you're already an expert in using the
toolchain, at which point, why wouldn't you use it?

~~~
flatline
What is unnecessarily complex about TypeScript? It's JavaScript, with static
typing plus type inference, and pretty nice generics. The ecosystem of modern
JS surrounding it is horribly complex but TypeScript itself seems like a
fairly straightforward programming language.

~~~
jazzyjackson
It hasn't been a big detriment for me as someone learning TypeScript on my
own, but it is another moving target: when looking up "how do I do x...",
most of the forum posts are a little outdated, and the latest version of
TypeScript has a different/better way of doing things than just a year or
two ago. I find myself scrolling through GitHub issues, comparing my tsconfig,
to figure out why my stack behaves differently than someone else's.

~~~
rubber_duck
That was my experience with TS maybe two years ago - at this point project
scaffolding tools are good enough at generating sane output that I spend a
little bit of time upfront and then just keep plowing away. Maybe I got better
at it as well - but I haven't kept up with TS news in a long time and I don't
feel like I'm missing out on stuff or encountering things I don't understand.

I've written >50k LoC of TS in the last few months for sure (doing a huge
frontend migration for a client) and I can't remember the last time I googled
anything TS related. Actually, I do remember - a month ago I wanted to know how
to define a type with a field excluded; it took 30 seconds of googling.
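
(For what it's worth, that field-exclusion case is probably just the built-in
`Omit` utility type - a minimal sketch, with made-up field names:)

    interface User {
      id: string;
      name: string;
      passwordHash: string;
    }

    // Same shape as User, minus the excluded field.
    type PublicUser = Omit<User, 'passwordHash'>;

    const u: PublicUser = { id: '1', name: 'Ada' }; // adding passwordHash here is a type error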

Meanwhile, the project started out as mixed TS and ES6 because most of the team
was unfamiliar with TS and there were a few dynamic-typing evangelists - we
ended up going back and just using TS all over the place. The complexity
introduced is minimal, and the productivity boost from good tooling on a
product of this scale is insane.

~~~
IggleSniggle
Typically for me the time cost is in going down rabbit holes trying to improve
implicit static types and get closer to "whole program" functional type
inference (TypeScript repeatedly seduces me into this). The decision inflection
point is generally not application code but the space between application and
script code - the things you might otherwise write Perl or Python scripts to
accomplish. The types are _especially useful_ in this context, because they
tell you a lot more than your typical script does, but they also introduce a
bunch of overhead for a few lines of code.

------
connor4312
> the simple internal data model was replaced by a dense and non-orthogonal
> data model with different types for leases, locks, transactions, and plain-
> old-keys

I maintain a (/the only?) etcd3 library for Node.js[0], and used etcd
extensively on my former team.

None of these things are new to the etcd3 API. All of these are present in v2
as well[1], whose API the author extols, or are built within clients on etcd's
base APIs (e.g. there's no 'lock' API, only leases). However, with etcd3 we
get stricter typing, better performance, and better semantics (e.g. watch
streams and lease streams instead of polling) thanks to gRPC.

In general these rich APIs allow 'average' engineers to build complex
distributed apps more correctly. I've built reliable sharding, hash rings,
elections, and so on based on etcd's API--none in more than a hundred or two
hundred lines of code (more in Go, less in Node.js). All of these are classic
hard problems that etcd makes easy. Sure, there are innumerable standalone
services for each of these things, but often there's no need to take on the
cost of many tools when one would work.
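
To give a feel for it, here is a rough lease-based mutual-exclusion sketch in
TypeScript. The client interface below is a hypothetical stand-in (not the
actual surface of the etcd3 library linked at [0]); the point is just how
little code the primitive needs once the server gives you leases and
compare-and-set:

    // Hypothetical minimal client surface: KV operations plus leases.
    interface KV {
      grantLease(ttlSeconds: number): Promise<string>;                 // returns a lease id
      putIfAbsent(key: string, value: string, lease: string): Promise<boolean>;
      keepAlive(lease: string): () => void;                            // heartbeat; returns a stop fn
      delete(key: string): Promise<void>;
    }

    // Acquire the lock by creating the key only if nobody else holds it.
    // The lease means a crashed holder releases the lock after the TTL.
    async function withLock(kv: KV, key: string, fn: () => Promise<void>): Promise<void> {
      const lease = await kv.grantLease(10);
      while (!(await kv.putIfAbsent(key, 'holder-1', lease))) {
        await new Promise((resolve) => setTimeout(resolve, 250));      // naive retry
      }
      const stopHeartbeat = kv.keepAlive(lease);
      try {
        await fn();
      } finally {
        stopHeartbeat();
        await kv.delete(key);                                          // release
      }
    }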

0. [https://github.com/microsoft/etcd3](https://github.com/microsoft/etcd3)

1. [https://etcd.io/docs/v2/api/](https://etcd.io/docs/v2/api/)

~~~
mst
I feel like the blog post could largely be summed up as "stop threatening to
remove the normal HTTP API, some of us find that a lot easier to debug and
ease of debugging is essential for a core operational component of a
distributed system" \- which is, I would argue, an entirely reasonable thing
to want.

I quite enjoyed the rantiness on a "being entertained" basis but it did rather
work against effectively making the core point.

~~~
umvi
> stop threatening to remove the normal HTTP API, some of us find that a lot
> easier to debug

I've seen the trend toward complexity in other projects too, and it harms not
just ease of debugging, but ease of hacking.

Take, for example, Swagger UI[0]

v2 was so simple. It was vanilla JS using jQuery. I, as an embedded systems
developer, was able to easily hack it so it could read in the OpenAPI JSON
from a database and I even added a little search box so you could filter down
the APIs you wanted to see. Super fast and easy and worked just the way I
wanted it to!

Starting with Swagger UI v3, it became... extremely labyrinthine by
comparison. It was completely rewritten in React, so now I need a bunch of new
tooling to make changes; everything was broken out into dozens of different
modular files, so I couldn't find where I needed to make a given change; and
I've never used React, so it felt like the barrier to hackability was
dramatically increased.

I'm sure full time React folks love the new architecture because it's so much
<cleaner/safer/scalable/etc>, but for me the change was extremely confusing
and made the tool unhackable (I tried for a few hours to get it to do what I
wanted, but it started looking like I was just going to have to learn all of
React and I threw in the towel), and so I'm permanently stuck on v2 for now.

[0] [https://github.com/swagger-api/swagger-ui](https://github.com/swagger-api/swagger-ui)

~~~
threeseed
Complaining about an open source project changing technologies because you
can't be bothered to learn them is a bit rich.

Fork Swagger v2 and make your own improvements. No one is stopping you and
it's what open source is all about after all.

~~~
umvi
> Complaining about an open source project changing technologies because you
> can't be bothered to learn them is a bit rich

Web technology is its own beast. I invested a long time learning and mastering
AngularJS only for all that work and knowledge to be flushed down the toilet
over the next few years. Web tech has terrible ROI so that's why I "can't be
bothered" to learn the latest one. I'm salty specifically about Swagger
jumping on <insert latest framework> instead of sticking with simple vanilla
JS that everyone understands.

> Fork Swagger v2 and make your own improvements

Yeah, that's exactly what I did...

------
alpb
The author does a really good job of discrediting themselves right off the bat
in one sentence:

> for a ~~bullshit~~ unsuccessful project called CoreOS Container Linux that
> was EOL'd several years ago

CoreOS was actually quite successful, to me as an outside observer. It had a
decent paid user base as well as people using it without paying. It showed
people that the Chrome OS model could be used to build an atomically updating
host OS with a read-only fs for container hosting. It inspired many later
container hosts, like RancherOS and Google's Container-Optimized OS. Similarly,
it was probably one of the reasons why Red Hat was interested in the acquisition.

Furthermore, CoreOS was EOL'ed last month; not several years ago.

I could keep going on about why the etcd API was switched to gRPC and what
benefits this offers, even at small scale. But I don't think it's worth
anyone's time trying to convince the author otherwise. Based on their tone,
it's clear to me that they have trouble using software when things get a tad
complicated. Usually there's community decision-making behind these changes,
and they're often deliberated for months, backed with prototypes and data. I'm
pretty sure the author doesn't care, however.

~~~
SNosTrAnDbLe
The author is talking about keeping things simple, which is not a bad idea. I
am genuinely interested, though: why did etcd switch to gRPC?

~~~
alpb
First of all, the etcd API is still available over a JSON interface:
[https://github.com/etcd-io/etcd/blob/master/Documentation/de...](https://github.com/etcd-io/etcd/blob/master/Documentation/dev-guide/api_grpc_gateway.md)
and some historical discussion is here:
[https://github.com/etcd-io/etcd/issues/1980](https://github.com/etcd-io/etcd/issues/1980)

Many reasons to switch from JSON to gRPC:

* gRPC uses HTTP/2, which means you can concurrently make multiple requests on
a single TCP connection, while on a typical JSON API, which probably uses
HTTP/1.1, you can't.

* Type safety of protobuf types.

* Bi-directional streaming, e.g. Kubernetes controllers use the Watch API,
which notifies the control plane of object changes/adds/deletes in etcd (see
the sketch below the list).

* Client libraries are automatically generated, and not error prone.

* RPCs are already optimized for bytes-on-the-wire efficiency, whereas JSON is
not. This matters a great deal as Kubernetes objects get large in size/quantity
over time, but controllers still work effectively by not spending as much CPU
on encoding/decoding as they do with JSON.

* Similar to the previous point, most JSON decoders don't reuse objects, so
every decoded object is a new alloc, whereas gRPC can Reset() and reuse the
same object while decoding/encoding.

* Built-in authentication primitives (such as JWT/tokens, or even adding TLS
to the client and/or server).

* gRPC has support for interceptors, which are like middleware functions you
can inject into requests/responses on both the client and server side for
logging, authorization, etc.
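
To make the streaming point concrete, here is roughly what a watch looks like
from Node with @grpc/grpc-js. Treat the details as illustrative: the service
and field names are from memory of the rpc.proto, and the proto path/include
resolution is elided here.

    import * as grpc from '@grpc/grpc-js';
    import * as protoLoader from '@grpc/proto-loader';

    // Load etcd's rpc.proto (path and proto include dirs elided) and build a
    // client for the Watch service.
    const pkg: any = grpc.loadPackageDefinition(
      protoLoader.loadSync('rpc.proto', { keepCase: true })
    );
    const watchClient = new pkg.etcdserverpb.Watch(
      '127.0.0.1:2379',
      grpc.credentials.createInsecure()
    );

    // One long-lived bi-directional stream: write a create-request once, then
    // the server pushes events as keys change - no poll loop, no reconnects.
    const stream = watchClient.Watch();
    stream.write({ create_request: { key: Buffer.from('/config/feature-flag') } });
    stream.on('data', (resp: any) => {
      for (const event of resp.events ?? []) {
        console.log(event.type, event.kv.key.toString(), event.kv.value.toString());
      }
    });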

The list goes on, but something to note is that etcd was not developed by
Google (and as far as I know, not by ex-Googlers). Both etcd and gRPC are
owned by the same open source foundation, so it's natural that they make use
of an existing technology.

~~~
brabel
> gRPC uses HTTP/2 ... while on a typical JSON API which probably uses
> http/1.1, you can't.

What are you actually talking about? An HTTP API by itself doesn't actually
care about which HTTP version is used; that's something only the HTTP server
that serves the API should care about (nowadays, pretty much any HTTP server
in any language supports HTTP/2 and 1.1).

> Bi-directional streaming e.g. Kubernetes controllers use the Watch API which
> notifies the object changes/add/deletes in etcd to the control plane.

Even HTTP/1.1 has mechanisms for that, like long polling and chunked encoding
(which allows infinite streams, keeping a dual channel of communication open
where each chunk can be treated as a message - with "headers" and all).
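
For instance, a plain HTTP/1.1 endpoint that streams events as chunks is a few
lines of Node (endpoint and payload are made up):

    import * as http from 'http';

    // With no Content-Length set, Node answers with Transfer-Encoding: chunked,
    // so the response can stay open and each write() is delivered as a chunk.
    http.createServer((req, res) => {
      res.writeHead(200, { 'Content-Type': 'application/x-ndjson' });
      const timer = setInterval(() => {
        res.write(JSON.stringify({ ts: Date.now(), msg: 'still here' }) + '\n');
      }, 1000);
      req.on('close', () => clearInterval(timer));
    }).listen(8080);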

> Client libraries are automatically generated, and not error prone.

This is true, but we've had similar technology since the SOAP days... you
could easily do the same with an XML-based HTTP API.

By the way, most of your points are against JSON, not an HTTP API per se,
which would usually allow a number of formats, including at least XML and JSON.

> Builtin authentication primitives...

HTTP also has that when you include cookies and something like OAuth/OpenID.

> gRPC has support for interceptors which are like middleware functions

This kind of thing is better done using the specific platform you're running
on (it depends on the language), so I don't see it as something desirable in
the specific RPC framework you're using. With HTTP, caching, logging, etc. are
trivial to do and are among the strongest advantages of using HTTP in the
first place, so I think you're confusingly making a point for HTTP here -
unless your focus is on the RPC side of things? In which case, you would have
to make a case for why RPC is a better fit than HTTP for etcd, which you haven't.

So all in all, I found your points utterly unconvincing, but presented with so
much conviction that you're actually right that I could not resist responding
(even if I don't want at all to get into a pointless discussion on the merits
of HTTP vs. RPC or JSON vs. protobufs)!

~~~
sercand
By using gRPC, we get much more without thinking too much. Also, bi-directional
streaming is instantaneous; it is not a long-polling afterthought.

For example, the whole etcd API is defined at
[https://github.com/etcd-io/etcd/blob/4c6881ffe4b3bae257c0720...](https://github.com/etcd-io/etcd/blob/4c6881ffe4b3bae257c07205c2334f84c0f4577b/etcdserver/etcdserverpb/rpc.proto)

It is pretty straightforward and not too hard to understand the API. Designing
APIs with protobuf makes things convenient and brings lots of already-written
tools with it.

Since gRPC is built on top of HTTP/2, we saw it as the easiest and most
performant way of writing an HTTP API with good defaults.

~~~
cmrdporcupine
"we get much more without thinking too much" is precisely one of the reasons
for the original push to REST (and against RPC at that time SOAP and CORBA and
RMI etc.) was made in our industry 10-15 years ago. By falsely representing a
_remote_ resource as if it was a _local_ one we open a whole can of worms; not
just performance, but resilience, infrastructure issues, etc. Transferring
documents over HTTP with a common language of HTTP's verbs was supposed to get
programmers to model their applications as the sets of resources that they
are, discourage them from making excessive round trips, make debugging easier,
and make use of standard HTTP load balancing hardware.

Yes it's easy to make that RPC call. But should you?

I've been off doing other things in the intervening period, but while I had my
back turned the industry seems to have turned its back on REST and gone whole
hog on RPC. Again.

At first I was thinking this was just internally here at Google, where
protobufs and gRPC reign supreme. But it seems to have taken hold everywhere.

What did I miss? Why have we swung this way? Again. Is the pendulum going to
go back?

~~~
apta
> By falsely representing a _remote_ resource as if it was a _local_ one we
> open a whole can of worms

I've heard this argument before, but how does gRPC itself cause this issue to
manifest? I'm curious to hear what your opinions are on a better alternative,
and how not to represent a remote resource as a local one.

~~~
cmrdporcupine
You could use gRPC responsibly, yes.

But the fact that it presents the remote resource in an API which resembles a
local object means that programmers often get lazy in the way they think about
these things. The REST semantic is supposed to make this more explicit.

A remote object is not an object in your program or even your computer. It's
something you're taking from something that is computationally miles and miles
away. Compared to the microseconds it takes to dispatch a local call, it's an
eternity away, even on a local network of the highest speed.

Accumulate those latencies over thousands of dispatches, and trouble can
ensue.

I am reminded of an observation from when I first joined Google, coming out of
their acquisition of the scrappy awesome ads company I worked at (Admeld).

We had a little service that kept track of ad impression caps / budgeting. We
didn't want to serve an ad a single time more than the customer wanted us to,
etc. Serving many thousands of ads per second, a process distributed across
multiple machines in multiple data centres, this is a bit of a tricky shared
state problem. The people who came before me had designed a rather clever
solution which used a form of backoff to trickle down the number of ads served
as they got closer and closer to budget cap, and to synchronize this state
across clusters (this was before there were Rafty services to make this kind
of thing easier, BTW).

We had a bit of a show and tell with Google when we first joined. I wanted to
know how they were handling this problem, since their scale was many times
ours, so I asked the question and got a puzzled/annoyed look:

"Oh we just make an RPC call to our budgeting server."

Summary: if you're at Google you don't have to worry as much about these
problems. You still do, but there's an insane amount of infrastructure and
horsepower and an army of SREs to help make it happen.

So, yeah, my point is -- just because Google does something or has invented
something doesn't mean it's the best way to do it, especially in a smaller
more cost conscious organization.

~~~
apta
Thank you for taking the time to write this up, and the interesting anecdote.

What I don't see, though, is how making a (g)RPC call is any different from
making a REST call. Like you said, the REST call is supposedly more explicit,
but at the end of the day, it seems more of a convention than some hard
underlying difference. What's the difference between `httpClient.get("...")`
and `grpcClient.foo(...)`?

~~~
cmrdporcupine
The latter encourages you to model things as remote procedure calls, the
former, well, in my experience it's open to incompetence and abuse, too, so...
ehh... but done properly... well, go read the Roy Fielding paper :-)

I mean, internally at Google we have load balancing for grpc (I'm sure the
outside world does now, too, but it was new to me when I joined) -- but load
balancing HTTP requests containing readable JSON or XML documents, that's far
more sysadmin friendly, wouldn't you say? Off the shelf infrastructure,
nothing exotic, easier to monitor. Same goes for caching, for proxies, etc.

Being able to just stick a URL for a given resource in your browser, or hit it
with wget/curl to read it, that's a serious bonus.

In general, URLs follow conventions similar to those laid down by our Unix
forefathers when they designed the filesystem API. We are all familiar with
this model. And in some ways REST done right is very Unix philosophy --
provide a consistent model upon which a bunch of little tools can
interoperate.

I could go on... have to go put my daughter to bed tho

~~~
apta
Thanks again for the detailed response.

------
tannhaeuser
> _HTTP/2 a.k.a. SPDY is a comically bloated Layer 4/5/7 mega-combo protocol
> designed to replace HTTP. It takes something simple and (most importantly!)
> comprehensible and debuggable for junior programmers and replaces it with an
> insanely over-complicated system that requires tens of thousands of lines of
> code to implement the most minimal version of, but which slightly reduces
> page load time and server costs once you reach the point of doing millions
> of requests per second. I am filled WITH RAGE just thinking about how we
> took a fundamental part of the Internet, simple enough that anyone can
> implement an HTTP server, and replaced it with this garbage protocol pushed
> by big megacorps that doesn't solve any real problems but will completely
> cut out future generations from system programming for the web_

This resonates with me, and I'd like to add that HTTP/2 can only bring
advantages if you actually go all the way to push/bundle resources into
responses _and_ have a strategy/priority for when to push eagerly vs. serve
lazily; I'm even more worried about the upcoming QUIC (and DoH) because
there's no implementation in sight.

~~~
derefr
> HTTP/2 can only bring advantages if you actually go all the way to
> push/bundle resources into responses

Compared to well-optimized HTTP/1 (e.g. using minified CSS and sprite-sheets),
sure. Compared to most HTTP/1 deployments, though: no. HTTP/2 gives you tons
of advantages "for free" that you need build-time processes to attain in
HTTP/1. With HTTP/2, you can do "the naive thing" that you'd have done on
the 1995 Web in Notepad, and it'll _be_ the optimal thing.

Also, HTTP/2 means less OS packet-switching overhead server-side if you have
ancillary connections (e.g. websockets) open against the host, since those
_also_ get muxed into the same carrier socket.

Also, mobile clients wake up less, because there's only one TCP socket to do
idle-keepalive on.

HTTP/2 _also_ means that TCP's Nagling has more to work with, and so is less
likely to end up needing to waste bandwidth on emitting many undersized
packets—it can just pack N short requests into the _same_ TCP jumbo frame,
since they're all going to the same place.

I would also point out an indirect "advantage": HTTP/2 makes it cheaper to
serve ads proxied through the first-party host (as HTTP/2 flows) than for the
client to hit the third-party ad servers directly. People can still block the
ads/trackers either way, but served inline to the origin like this, the people
who _don't_ block ads will get a better experience.

~~~
tannhaeuser
I agree these things can be useful, but again none of these come out of the
box (if I haven't overlooked or misunderstood something). Including "doing the
naive thing"; I mean how do you expect your web server to automatically push
CSS or SVG sprites/fonts unless you're relying on the server to
intercept/parse your HTML for delivery-time optimizations a la PageSpeed and
make heuristic scheduling decisions? Unless you're putting in the effort to
optimize your payloads, you will just end up with as many round trips as with
HTTP/1.1 + keep-alive.

~~~
derefr
Web browsers are limited in the number of concurrent socket connections
they'll open to a given origin. This matters not-at-all in HTTP/2, since
everything is going over a single socket; while mattering quite a lot in
HTTP/1, since dependent resources being loaded _in parallel_ must be loaded on
separate sockets. If you only have six parallel sockets to work with, then if
your page is, say, an image gallery, then the Javascript file that makes it
work (loaded at the bottom of the body) might be blocked waiting behind the
loading of e.g. some large image higher up in the body. The previous requests
need to entirely finish (= a round trip) before the next requests _on the same
socket_ can start. Keep-alive does nothing to fix that.

HTTP/1.1 _pipelining_ partially mitigates this, allowing the client to "queue
up" a list of all the dependent resources it wants from each socket; but it
suffers from head-of-line blocking. Which sounds like some arcane thing, but
in practice it means that _big_ things might block the loading of _small_
things. (The browser doesn't know how big things are, so it _can't_
effectively schedule them; and the server _must_ dumbly queue results up in
the same order the client requested them, because that's the only way the HTTP
pipeline's implicit flow sequence counters will match up.)

HTTP/2 is a full mitigation for this problem, since—even without a heuristic
"prioritization strategy" for the delivery of dependent-resource flows—the
"oblivious" strategy is still a good one: if you attempt to deliver _all_ the
resources in the queue concurrently; and you do so by delivering one fixed-
size chunk of each flow per iteration, in a round-robin fashion; then you'll
end up finishing delivery of resources smallest-to-largest—which means you'll
_usually_ deliver the most-critical resources first, _no matter where in the
dependent-resource queue they started._

Or, in short:

HTTP/1.0 = O(N) required roundtrips for a page with N resources.

HTTP/1.1 with pipelining = O(1) required roundtrips for a page with N
resources (followed by O(N) bytes streamed half-duplex), but the page can
still take nearly the same amount of time to become interactive as if it were
O(N) roundtrips, because of effectively worst-case scheduling.

HTTP/2 = O(1) RTTs + O(N) half-duplex bytes for N resources, loading
"intelligently" such that the page becomes interactive in O(log N) time.

~~~
tannhaeuser
I really appreciate your going into these details, but I still don't
understand the "oblivious" strategy thing which supposedly improves baseline
performance OOTB when the criterion is the time to first render of above-the-
fold content given otherwise same conditions. You say the effect of delivering
all resources concurrently is that "the most-critical resources [are delivered
first], no matter where in the dependent-resource queue they started". But
this isn't proven; it's just a different heuristic to apply to traffic shaping
assuming small resources are needed early vs assuming dependent resources (+
HTML markup itself) are loaded asynchronously in the order the browser parses
an HTML DOM (the default behaviour, and the one authors can influence
directly). You're not magically increasing the bandwidth by multiplexing so
something has to give.

~~~
derefr
The "oblivious approach" _is_ "just a different heuristic to apply", yes. It's
just one that happens to work especially well for HTML rendering, given that
_large_ resources tend to be depended upon in a way (e.g. an img or video tag)
that gives them a pre-download bounding-box size, allowing the rest of the
page to render around them and to not reflow once they're loaded; while the
types of resources that tend to be small (like CSS or JS files) are mostly
depended on with the semantics that they _can_ potentially reflow the page
entirely when they finish loading, which means the browser completely inhibits
interactivity (and/or rendering, depending on the browser) _until_ those
resources finish loading.

The only things that kind of break this heuristic are:

• large single-file Javascript SPAs. These are usually just marked with the
`async` attribute, such that the initial DOM of the page can first render,
then be gradually enhanced when the SPA loads. But with HTTP/2 + ES6, you can
_also_ just _not pack the SPA into a single file_ , instead relying on ES6
modules, which will each be individually smaller and therefore will end up
being delivered first by the content-oblivious round-robin chunk-delivery
strategy.

• web fonts, which are large _and_ will necessarily cause a complete reflow of
the page once loaded. Currently, browsers make a special loading-precedence
exception for web fonts; though it doesn't matter as much right now, as
they're still mostly served from third-party CDNs rather than as first-party
HTTP/2 flows.

AFAIK, for regular web-page retrieval, these are exactly the total set of
things that currently benefit from being server-pushed along with the first
response; everything else just gets handled well even without server-push.

(If you're curious, server-push is _really_ designed for the use-case of
pushing _secondary API responses_ along with an initial API response; to allow
for GraphQL-like "single-round-trip resource-hierarchy/-graph walking" in a
way that's more friendly for caching layers than comingling the results into a
single resource-representation. It makes the most sense in a Firebase-like
system, where un-asked-for server-pushed resources can be obliviously accepted
and dumped into the client-side in-memory set of synced entities
asynchronously to the parsing of the initial response; and then, once the
dependency is parsed out on the client side, the client can discover that it
already has the entity it wants in its in-memory entity store, and doesn't
even need to make the request.)

~~~
MauranKilom
Apologies if I've missed some part of the explanation that answers this. The
way your previous reply described it, the benefit of HTTP/2 is loading smaller
things earlier. But given that goal, couldn't the browser just prioritize
JS/CSS files over images/videos for the same effect, without any new protocol
that sets in stone some heuristic purely based on content size?

~~~
derefr
> the benefit of HTTP/2 is loading smaller things earlier

No, the benefit of HTTP/2 is a lack of head-of-line blocking. Head-of-line
blocking can be _easily seen_ when big things block small things, but that's
not what it _is_. What it _is_, is when something doesn't _make progress_
because another thing is being waited for.

Imagine a multimedia container file-format where you can't interleave audio
frames with video frames, but rather need to put _the whole_ audio track
first, or _the whole_ video track first. This format would be unsuited to
streaming, because downloading the first chunk of the file would only get you
some of _one track_, rather than useful (if smaller) amounts of _all the
tracks required for playback_. Note that this is true no matter which way you
order the tracks within the file—whether the audio (smaller) or video (larger)
track comes first, it's still blocking the progress of the other track.

HTTP/2 is like a streaming multimedia container format: it _interleaves_ the
progress of the things it loads, allowing them to be loaded _concurrently_.

This doesn't _just_ mean that small things requested later _can_ be
prioritized over large things requested early (though it _does_ mean that.) It
also means that, for example, if you load N small Javascript files that each
require a compute-intensive step to parse+load (GPU compute shaders, say),
then you won't have to wait for the compute-heavy load process of the previous
files to complete, before you begin _downloading_ the next ones; but rather
you can concurrently download, parse, and load all such script files at once.
Insofar as they don't express interdependencies, this will be a highly-
_parallelizable_ process, much like serving independent HTTP requests is a
highly-parallelizable process for a web server.

One benefit of HTTP/2's lack of head-of-line blocking, that would be more
talked-about if we had never developed websockets, is that with HTTP/2, you
get a benefit very much _like_ websockets, just using regular HTTP primitives.
You can request a Server-Sent Events (SSE) stream as one flow muxed into your
HTTP/2 connection, and receive timely updates on it, no matter what else is
being muxed into the connection at the same time. Together with the ability to
make normal API requests as other flows over the same connection, this does
everything most people want websockets for. So the use-case where websockets
are the _best_ solution shrinks dramatically (down to when you need a _time-
linearized, stateful, connection-oriented_ protocol over HTTP.)
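
(Concretely, that's just the regular EventSource API on the client, sharing
the origin's HTTP/2 connection with everything else - a tiny sketch, with
made-up endpoints:)

    // One SSE flow for timely server-pushed updates...
    const events = new EventSource('/api/updates');
    events.onmessage = (e) => console.log('update:', e.data);

    // ...while ordinary API requests ride the same HTTP/2 connection concurrently.
    fetch('/api/items').then((r) => r.json()).then(console.log);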

> new protocol that sets in stone some heuristic

Note that there's actually _no_ explicit specification of the order in which
HTTP/{2,3} flows should be delivered. What I'm calling "content-oblivious
round-robin chunk scheduling" is just _the simplest-to-implement strategy that
could possibly meet HTTP/2's non-head-of-line-blocking semantics_ (and so
likely the strategy used by many web servers, save for the ones that have been
highly-optimized at this layer.) But both clients and servers are free to
schedule the chunks of HTTP flows onto the socket however they like. (They can
even impose a flow concurrency cap, simulating browsers' HTTP/1.1 connection
limit and starving flows of progress. It'd make the client/server a non-
conformant HTTP/{2,3} server, but it'd still _work_, as what progress
"should" be being made is unknowable to the peer.)

It's a bit like saying an OS or VM has a "soft real-time guarantee" for
processes. Exactly _how_ does the OS scheduler choose what process will run
next on each core? Doesn't really matter. It only matters that processes don't
break their "SLA" in terms of how long they go without being scheduled.

------
robszumski
Early employee of CoreOS here. I wanted to float a few ideas out there for
discussion.

First, while etcd did get a bit more complicated over the years, it stayed
relatively stable compared to Consul. This was a point of pride of the etcd
team and why it ultimately was picked for Kubernetes. Yes, a gRPC API is not
as easy to use as HTTP, but this gained massive scale for Kubernetes when it
was released. Something like a 10x increase in the number of nodes in a
cluster, without changing any Kubernetes features. Just enabling etcd v3's
gRPC API. That's pretty awesome.

As a member of the product team, I think we could have made different choices
that would have massively impacted the ecosystem. Consul might not exist
today. We did not want to add in a DNS server to etcd in order to keep it lean
and focused. We had an integration with SkyDNS, but it wasn't on by default.
This birthed Consul and its built-in DNS server for service discovery. Consul
is very popular, so clearly this was the right move for some consumers.
Kubernetes didn't need this, nor did Container Linux for its update
coordination. For us, having this possible on top of etcd was the right
trade-off. Consul has added a complete service mesh in its later versions. Again,
etcd has stayed lean and focused. I think you are focusing on pebbles when you
take in the overall ecosystem.

Would you rather etcd had grown to Consul's breadth? Would it have been
forked?

The second point is around the success of Container Linux. I only include this
because I am proud of it. We updated and secured a fleet of machines that
measured larger than some cloud providers. That is a huge impact on the
container and security ecosystems. Flatcar Linux and Fedora CoreOS remain as
successors, along with other projects like Google's Container-Optimized OS,
and I think AWS announced one as well.

This work continues at Red Hat on OpenShift, etcd, Fedora/RHEL CoreOS, and
tying Kubernetes more closely with the OS to keep the combined whole secure,
up to date and successful for engineering teams shipping software.

~~~
grey-area
Thanks for CoreOS, I happily used it for many years.

------
cwyers
> In 2015, an unrelated tool called Kubernetes was released by Google (but,
> really, by Xooglers). I would go so far as to say that Kubernetes (or, as
> the "cool kids" say, k8s) is the worst thing to happen to system
> administration since systemd.

This, I think, is the key to understanding this whole rant. It is entirely of
a piece with the anti-systemd crowd, and I think understanding what was really
going on there helps explain this rant, and a lot of other agita in the
community these days.

Like, if you followed the drama on the Debian mailing list, it could sometimes
feel very surprising that they were willing to go to war over the least sexy
thing ever, an init system. Part of the answer is that systemd is more than
that, it's a bunch of low-level plumbing all lashed together. But, who cares
about low-level plumbing that much?

The answer (other than 'the exact sort of people who run Debian') is this.
Linux started off as being written by hobbyists, for each other. It was very
often a labor of love. Even the people making money off it weren't doing so in
Big Business sort of ways. Linux and the associated stack around it is being
taken over by a lot of big corporate interests: Intel, Red Hat, Google, even
Microsoft. Are they bad? That's a matter of opinion. Are they inept
technically? Probably not, but a lot of it revolves around how you measure
things.

But what is absolutely happening is that people who are paid to maintain these
projects on behalf of large corporate interests are more plentiful, both
in headcount and in person-hours, than people who are maintaining these things
for each other. For people who were used to Linux as this anti-corporate
space, that's a huge and distressing change. And they can feel powerless
against it, because it's really hard to reach a critical mass of people who
feel the same way as you, and are willing to do all the things to not just
support a fork but to keep the rest of the ecosystem they live in open to the
fork.

It's not about any one piece of software, and it's not _really_ about pure
technical merit. It's about the culture and the system of values and the way
that decisions are made and it's about who's in control. The author's
complaint isn't so much that the Google style is bad (although I'm sure he
believes that very strongly), but that it's harder and harder to find
somewhere to escape its reach.

~~~
GekkePrutser
You hit the nail right on the head here.

I too am sick of people telling me to accept ubiquitous containerisation
(snapd), locking down of filesystems (macOS), overly complex init systems
(systemd (1)), lack of server access (manage it via ansible/k8s, not ssh).
Usually the arguments given are "it's necessary 'for security'", "this is
where everything is going", "you have to trust the vendor". This is not the
spirit of Linux or open source. I totally see the effect you mention of paid
maintainers forcing projects into their employer's direction. As such I found
it sad when even Arch fell to the systemd camp. And it's one of the few things
Arch really enforces: it supports 6 different network initialisation methods,
for example, where other distributions just pick one and go with it.

(1) I'm not all anti-systemd by the way. I agree init was bad. But I think
other systems like OpenRC are way easier to work with.

~~~
Foxboron
> As such I found it sad when even Arch fell to the systemd camp. And it's one
> of the few things it really enforces, it supports 6 different network
> initialisation methods for example, where all other distributions just pick
> one and go with it.

Arch has only ever supported one init system at a time. We could have stuck
with sysvinit plus hacks to make it look like OpenRC, but nobody liked that.
The initscripts were a mess to maintain, and systemd freed us from having to
care about those details.

If we had stuck with the old system, what would the argument be today? "Arch
only supports sysvinit! That is not the UNIX philosophy!"? You don't see any
complaints about distributions that only support one init, whether it's runit,
s6 or OpenRC. But when it's systemd, somehow people get sad?

~~~
GekkePrutser
I agree sysvinit was no longer sufficient for today's needs.

The issue I have with systemd was mainly related to its heaviness not really
fitting in with Arch's simplicity, and with the products of corporatism
seeping into even one of the most noncommercial variants of Linux. Basically
what cwyers mentioned above.

I've been putting up with systemd as all the distributions I use have moved to
it, but I never really liked it. I don't like the way the scripts are hidden
so deep in the filesystem, and I don't like journalctl; I prefer having simple
logfiles.

I discovered Alpine linux recently as a server distro (previously I thought it
was mainly developed for docker containers) and I really took a liking to
OpenRC. It reminded me that things could still be simple even in a modern init
system. I think it would have been a good choice for Arch; for me it would
have been better than systemd. I don't even know the other systems you
mention; I'm not an expert on init systems. My opinion is just that of a
user.

~~~
Foxboron
> The issue I have with systemd was mainly related to its heaviness not really
> fitting in with Arch's simplicity.

Arch is a pragmatic distribution first and foremost. If we can build systemd
and ship it as-is to get a complete init, and more features along with it,
that is much more enticing than the alternative.

> I think it (openrc) would have been a good choice for Arch, for me it would
> have been better than systemd.

It would still entail maintaining some form of initscripts. I don't think most
users realize how much of a burden this is; they don't really deal with that
part of the system.

~~~
GekkePrutser
I understand, and it was not my intention to drag the systemd discussion up
again. It was just one of the many things I thought of after reading cwyers
post which really resonated with me, and the parent article as well.

And there are things I do like about systemd, I have to say. Especially the
way you can pass a variable to a service with @, like running 2 instances on
different ports.

I know very well that the rc scripts were a pain to maintain as I've written
some as well in the past. So I really understand the benefit of upstream-
provided service configs.

The transition to systemd for me as a user was just a bit more difficult. It
deviates so much from earlier conventions, and I found adapting to it difficult.
It enforces a lot more than just the init system.

For some reason I always end up having to compile some of the software I use
myself, and then having to write service scripts for it. I found this a lot
easier with OpenRC than with systemd (having pretty much no knowledge of either).

But anyway I'm just one user and you support millions, I know you're making
the right choices for the platform. Thanks for taking the time to reply to me
in fact!

~~~
Foxboron
> and you support millions

I'll be amazed if Arch has a million users. Our napkin math using Steam user
statistics puts us at around 500k-600k users.

------
ixtli
I run k8s in production and I think this is a really bizarre axe to grind that
sort of smells like someone who got upset by how steep the kubernetes learning
curve is. Which, in a way, is understandable.

> 1) Add hundreds of new failure modes to your software

In my entirely anecdotal experience, it removes error modes. It turns out that
just because k8s offers a feature (it offers many!) doesn't mean you're
required to use it.

> 2) Move you from writing portable software configuration to writing
> thousands of lines of k8s-specific YAML

Portability, put another way, is a requirement to maintain n different
deployment mechanisms because you have no meaningful control over the system
on which your software is deployed. This is unavoidable depending on what it
is you're deploying and where, but for those of us who have the ability to
make those decisions the old notion of portability described by automake and
others is actually a bad contract to agree to.

That said, the YAML bit is horrible and should be abolished. This is a great
criticism.

> 3) Ensnare you in a mesh of questionably-good patterns like
> containerization and software defined networking

The author isn't considering that I am not, and never will be, Google's entire
SRE team. I consider it a general win to be able to apply some subset of their
encoded knowledge to the problem of keeping my services running with minimal
downtime even if you pick up some k8s-specific baggage along the way.

~~~
jonfw
What's wrong with the YAML? It's easy to read and write, concise, and
generally has sane defaults meaning you don't have to be overly verbose.

~~~
outworlder
> What's wrong with the YAML?

Nothing at all. Until you start 'templateizing' it (looking at you, Helm).

It's definitely better than JSON (it can even have comments, imagine that).
Most people don't bother reading the spec or even examples though, and don't
realize how good it actually is.

~~~
ithkuil
I wish more of the people stuck with templating YAML would just emit it via
toJson instead of fiddling with indentation levels.

------
gnur
It's mostly a rant about someone not accepting that extra performance can come
at the cost of complexity.

While I could argue with all the points that he's making, my main counterpoint
is this: junior devs don't care that their HTTP/2 server uses way more
"complex" code than their HTTP/1 server; it's just a flag away (or in most
cases, automatic). Senior devs worth their salt also don't care: if I design
an application that is going to run on Kubernetes, I now know it will run on
the big cloud providers and on premises without major changes. It forces you
to accept that your app will die and needs to be able to run from a cold start
without any issues. I can't count the number of machines I've encountered over
the years that couldn't be rebooted because the maintainers had monkeypatched
the crap out of them while running EoL distributions, libraries and web servers.

And now I've realized I've become a geek yelling at the cloud as well.

~~~
EvanAnderson
The performance benefits are aimed squarely at the large, entrenched interests
who can absorb the increased costs as a rounding error. HTTP/2 and SPDY/QUIC
look, to me, to be about decreasing costs and increasing efficiency for large
web hosts and erecting barriers-to-entry for competitors.

~~~
pimterry
Where's the barrier to entry?

HTTP/2 is more complex to implement from scratch (but still quite doable, even
as an individual), but it has built-in support in every language & framework
worth its salt now, so you don't need to do that.

If you're using an existing implementation, that's usually just as easy to do
as HTTP/1.1, because they have almost exactly the same semantics (that's an
explicit goal from the spec), and so most implementations have almost exactly
the same API.

In practice, it's a syntax & connection management change on the wire that's
mostly invisible as a developer building on top of it, plus a set of optional
extra features (like Server Push) that you can use if you want or ignore if
you don't.

Can't speak for QUIC/HTTP3, since I haven't touched them yet, but I'd be
surprised if that's a hugely different story.

------
peanut-walrus
> the worst thing to happen to system administration since systemd

Cool, I need to look into Kubernetes way more, then! There hasn't been any
single development in the Linux world that has made my job as a sysadmin
easier than systemd. Services, timers and networking that just work, are
concisely defined and can be used across all relevant distros? Hell yeah, I
don't ever want to go back to the init.d/ifupdown/crontab world filled with
bad configuration files, strange footguns and bugs that won't ever get fixed
because someone's legacy system might depend on them.

------
kminehart
I really dislike articles like these. I don't think the author is interested
in having any kind of productive discussion or criticism.

There are a million reasons to hate Kubernetes, and the author couldn't be
bothered to venture beyond the lowest-hanging fruit (YAML)?

If I could downvote this I would.

~~~
epse
But it's not about Kubernetes. It's about simple APIs becoming complicated
for no reason (according to the author; I have no experience with etcd), using
etcd as an example.

~~~
cmckn
The author doesn't detail _what_ is more complicated about the gRPC API, other
than the fact that it's gRPC. One could implement the exact same API in HTTP
or with gRPC; so without specific examples, it's kind of a meaningless
critique.

Surely, most consumers of a database like etcd are using a client library, in
which case why does it matter if the API is HTTP or gRPC?

Tangential: after reading some of the comments, I was surprised that the blog
post was only like 250 words; the author really says very little.

~~~
keeganpoppen
"detail what is more complicated"...

------
sisk
Curious why the author declares Container Linux bullshit/unsuccessful? I
thought it was a wonderful project that got better over its (brief) lifetime.
The active/passive upgrade was an absolute blessing. If the claim is that it
was unsuccessful because it no longer exists, Flatcar and Fedora CoreOS are
pretty straightforward successors. fcos is basically functionally equivalent
if you don't use rpm-ostree layers and if you use podman in place of any rkt
containers. If the complaint is "systemd," that's fine—it's not the OS for
you—but I don't think that makes it bullshit.

I didn't like having to move cl machines to fcos but it really wasn't that bad
and I still get coordinated, active/passive upgrades. ¯\\_(ツ)_/¯

------
solumos
There's a lot of inflammatory buildup to the chief complaint:

> With the massive influx of Kubernetes users came, of course, a large number
> of Xooglers who decided to infect etcd with Google technologies, as is their
> way. Etcd's simple HTTP API was replaced by a "gRPC" version; the simple
> internal data model was replaced by a dense and non-orthogonal data model
> with different types for leases, locks, transactions, and plain-old-keys.
> etcd 3.2 added back a tiny subset of the HTTP API through the "gRPC
> Gateway", but not enough to implement any of the rich applications built on
> top of the original API. The v2 API lives on for now, but upstream threatens
> to remove it in every new version and there will surely come a time when
> it'll be removed entirely.

This is the interesting part of the post to me. I can understand the
disappointment - but also, nobody's making the author upgrade to the gRPC-
based etcd v3. This reminds me of other "traditional" sys-admin folks I've
worked with who seem to want to stick to 2010-style administration rather than
work with newly available abstraction layers that eliminate some of the
overhead (e.g. Docker, k8s). If they want to continue linux administration in
that way, that's their choice, and there's some merit in that - but they
shouldn't expect their tools and other developers to necessarily follow them.

~~~
masklinn
> nobody's making the author upgrade to the gRPC-based etcd v3.

They're stating that the v2 API keeps being slated for termination. Surely
once that happens they won't be able to use it anymore, and will have to
either use an unmaintained etcd, fork it entirely, or be forced to use the v3
API?

~~~
shadowgovt
That actually raises the oldest question in open-source engineering:

If the HTTP API has value, why _not_ fork etcd and maintain a version where
the HTTP API is the primary interface and gRPC is an afterthought or missing?

~~~
KaiserPro
because then you'd have to maintain it....

~~~
shadowgovt
As it stands, the version with a gRPC API has to maintain the HTTP API.

If the HTTP API has value, the cost to maintain it should self-justify, right?

~~~
Brian_K_White
The point was not value, but value for whom, or in the service of what end.

Everything anyone ever did had some kind of value to someone, even murdering
babies.

~~~
shadowgovt
So if maintaining an HTTP API has insufficient value to pull enough people
together to do, why are we worrying about it?

------
finnthehuman
> The compression scheme in HTTP/2 is so shitty that the "compression table" in
> RFC 7541 [appendix A] is just a list of the 61 most popular headers from
> Google properties.

I thought this was moderately-funny cheeky banter, but I wanted to see what
implementation decision they were making fun of with this silly
misrepresentation. Quoting RFC 7541: "The static table was created from the most
frequent header fields used by popular web sites." Oh.

~~~
grey-area
Why do you think that is a bad header compression scheme for a static protocol
where the majority of traffic contains those headers/values repeated over and
over? That indexed table shaves about 30% off header size.

[https://www.keycdn.com/blog/http2-hpack-compression](https://www.keycdn.com/blog/http2-hpack-compression)

------
miked85
This just reads as a systems engineer who is angry about new tooling/processes
replacing parts of their job.

~~~
dilandau
Nice projection my dude.

It's a lamentation of the introduction of complexity into a formerly-simple
set of APIs.

~~~
intro-b
Lamenting about complexity just for the sake of lamenting about complexity and
then adding a few potshots at Google/FB/etc. doesn't exactly make for a
thought-provoking, nuanced, or interesting argument, though

~~~
dakiol
But it does not make the argument less true.

------
williamstein
This seems rude and also dishonest/revisionist, eg "In 2015, an unrelated tool
called Kubernetes was released by Google (but, really, by Xooglers)".

~~~
hoebbz
What is dishonest/revisionist about the statement you have quoted?

~~~
listennexttime
How tedious to have to constantly refute FUD when it's easy to find the
answers with Google.

Anyone involved in Kubernetes, near CoreOS at the time, or really anywhere in
the space at the time (instead of looking back at it with anger), knows this
all to be false. CoreOS was setting direction for etcd, and understandably
adding features for one of its bigger users (and in fact, some of those
features are used by things of larger scale than k8s).

Kubernetes itself was started by Googlers, many of whom are still there or
left to go... do Kubernetes at Red Hat (IBM) or as a startup, or at Microsoft.
But to act like it was an outside project started by people who had previously
quit, or are somehow unqualified to work on an orchestrator, is just an
angry untruth. Every major committer to Kubernetes besides a handful of RH
folks was at Google when Kubernetes 1.0 came out. I'm happy to be corrected
but I know it's hip af to hate k8s (just like two days ago
[https://news.ycombinator.com/item?id=23807556](https://news.ycombinator.com/item?id=23807556))

~~~
hoebbz
Thanks. I just wanted to know what the author meant; it wasn't obvious that I
needed to Google the quote to figure out the meaning. I certainly hope that I
wasn't adding any FUD to the discussion.

------
sam_lowry_
"Kubernetes is the worst thing to happen to system administration since
systemd."

I'll take that quote into my fortune file.

~~~
dimgl
What are some of the arguments against `systemd`? I've used it in several
production systems and don't really have an opinion on it.

~~~
tehalex
Here's a good article that I saw on HN a few months ago:
[https://blog.darknedgy.net/technology/2020/05/02/0/index.htm...](https://blog.darknedgy.net/technology/2020/05/02/0/index.html#utopia)

Also:
[https://suckless.org/sucks/systemd/](https://suckless.org/sucks/systemd/)

Personally I avoided it for a while because of the hate, but I haven't really
had issues with it since I started using it.

~~~
dekhn
Ditto. My personal experience with systemd has been "wow, it's easier than
ever to write an init script that runs when it's needed".

Looking at the components that make up a modern UNIX system, it's definitely
time to think "how can we make a _new_ operating system, evolved from UNIX,
but with a more coherent core running the majority of run time orchestration".

------
cerberusss
I don't experience the sadness that the writer of the blog mentions. However I
do understand what they mean. Personally, I've decided I don't really like
huge projects, and instead have focused on smaller iOS apps. They don't have
dependencies, except for what the Apple platform offers. There's a backend,
but it's really abstracted away for me through a RESTful API. And I'm able to
continually, and without fail, deliver something of value to my customer.

------
mjw1007
I don't know anything about what's happened with etcd in particular, but I
think there is a genuine problem in this area with the model of "the people
doing the work get to make the decisions".

If you have a piece of software which is basically finished, and a group of
people come along who are interested in extending it far beyond its original
purpose, then it's very easy for those people to end up as the official
maintainers of the software simply because they're going to be the most
active.

So when it comes to deciding whether expanding the software so much is a good
idea (where the alternative is to make a fork to add all the new stuff in),
the people who are happy with it as it stands don't get as much voice as
perhaps they should, because for obvious reasons they aren't contributing many
patches.

~~~
fightme
Hi, thank you for admitting in your first sentence you're not qualified to
comment on the subject at hand, yet felt compelled to share your two cents.
Here's the context for your totally uninformed opinion:

I was a major contributor for etcd 2.3-3.2 at CoreOS. The people who extended
etcd to support gRPC included the original etcd author (hi Xiang). gRPC
support was necessary for good performance; we had benchmarks to justify the
decision. Likewise, v3 brought about a key-value model change that was
incompatible with v2 to better support binary data, ranging over the keyspace,
transactions etc. A v2-style gateway for v3 with a pretty JSON API was planned
but never completed due to lack of resources; the ugly gRPC json gateway
turned out to be good enough for most people. Similarly I wrote a proxy to run
v2 requests over v3 instances, which does support the v2 JSON API. This isn't
as if a new group of people showed up and ruined the software without caring
about existing users. None of us were "Xooglers".
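For anyone who never used both APIs, here's a rough sketch of the difference
(assuming a local etcd on 127.0.0.1:2379; the gateway prefix has moved between
releases, e.g. /v3alpha, /v3beta, /v3, so treat the paths as illustrative):

    
    
      # v2: plain HTTP/JSON, keys look like paths
      curl http://127.0.0.1:2379/v2/keys/foo -XPUT -d value="bar"
      curl http://127.0.0.1:2379/v2/keys/foo
      
      # v3 via the gRPC-JSON gateway: flat keyspace, base64-encoded keys/values
      # ("foo" -> Zm9v, "bar" -> YmFy)
      curl http://127.0.0.1:2379/v3/kv/put -X POST -d '{"key":"Zm9v","value":"YmFy"}'
      curl http://127.0.0.1:2379/v3/kv/range -X POST -d '{"key":"Zm9v"}'
    

The base64 keys are a big part of why I call the gateway ugly, but it does
work.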

It seems what you're proposing is what the author both argues against and
wrongly believes is what happened. We constantly pushed back against k8s
influence. If we didn't, etcd would be a k8s sub-project right now. I'd also
like to point out that removing the people who do the difficult work of
actually writing the software from the decision process is incredibly
insulting and devalues their labor. How dare you.

~~~
mjw1007
I am saying:

\- The problem the author describes can happen (I have seen it happen)

\- I do not know whether it happened in the case of etcd

I am not saying:

\- I have a proposed solution to this problem

\- The people who do the difficult work of actually writing the software
should be removed from the decision process

Apologies if that wasn't sufficiently clear.

------
lmilcin
Regarding some hate against Kubernetes.

The problem is not with Kubernetes but with the management/architecture teams
of companies that decide to use k8s for projects that have no real scaling
requirements, or where the overhead of just having a constant fleet of nodes
would be less than the overhead of maintaining a working k8s deployment.

Kubernetes is a hugely complex piece of software that is handling hugely
complex cases that arise when you try to deploy and maintain a large number of
applications with hugely differing scaling needs.

In my experience, a k8s deployment _absolutely_ requires a dedicated team with
top-notch k8s knowledge and debugging skills. You also need to understand that
k8s is a huge constant overhead: it requires actual knowledge and experience to
use, and it has complex faults that require the ability to debug really
complex, interdisciplinary problems.

Google could do that because they invest in their teams and because they have
exactly the type of problems that greatly benefit from a single complex
solution that can handle all of them (meaning your engineers can migrate
between projects but they still can use same tools they already know).

The trouble with management/architecture teams is that they do not understand
they are not Google. They are Google-wannabes. They like to promote how great
they are and how great their projects are, but they are either deluding
themselves or being deluded by their lower echelons, and have not set
themselves up to understand what is actually going on.

And that's how you land in a situation where a team with a simple application
that would require just two nodes (the second for redundancy only), and with
no knowledge of Kubernetes, is required to deploy to a k8s instance and deal
with a host of problems they never had to deal with before and that are beyond
their capabilities.

~~~
pas
k8s is great even if your requirements are almost static, because it
standardizes so many small but important details.

~~~
lmilcin
It is. I even mentioned it is nice to have engineers be able to move through
the company and be familiar with the tools (this is also called
standardization).

But if you have _only_ simple applications you will be surprised how complex
it is sometimes to debug k8s problems compared to what you would expect from a
simple application.

If some of your applications already have very complex deployment needs, then
you are replacing one complex homemade process with another complex
general-purpose solution.

------
thyrsus
I'll pile on: I'm working through Marko Luksa's "Kubernetes in Action", where
he introduces this quaint line of code:

    
    
      etcdctl ls /registry
    

He notes in an aside that you may have to do

    
    
      etcdctl get /registry --prefix=true
    

for later versions of the protocol. But here I am with k8s 1.16, and etcd has
been locked in a pod (seriously, why?) so before you contact it you need to
"kubectl exec ..."

We're not finished. Apparently the pod was insufficiently secure, because now
we need to decorate the etcdctl with a couple certificates, a key, and do SSL
encryption to make contact, so in the end I have this wrapper (names changed
to protect the guilty):

    
    
      $ cat ~/bin/etcdctl
      #!/bin/bash
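      # expects $cluster in the environment, e.g. cluster=clust_prodA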
      declare -A EtcdHost=(
        [clust_prodA]=clust1-leader
        [clust_prodB]=clust2-leader
        [clust_qualA]=clust3-leader
        [clust_testA]=clust4-leader
      )
      if [ -n "${EtcdHost[$cluster]:-}" ]; then
        export KUBECONFIG=$HOME/.kube/$cluster
        exec kubectl exec etcd-${EtcdHost[$cluster]} -n kube-system -- sh -c "ETCDCTL_API=3 etcdctl --endpoints https://${EtcdHost[$cluster]}:2379 --cacert /etc/kubernetes/pki/etcd/ca.crt --key /etc/kubernetes/pki/etcd/server.key --cert /etc/kubernetes/pki/etcd/server.crt $*"
        exit 0
      else
        echo "No known etcd host for cluster $cluster";
        exit 1;
      fi
    

The entry points for beginners have been nearly barricaded, and the ladders
charred and dangling high.

~~~
q3k
> (seriously, why?)

Ask whoever deployed this cluster. This is not something that Kubernetes
mandates.

> and do SSL encryption to make contact

And that's a bad thing?

------
outworlder
> In 2015, an unrelated tool called Kubernetes was released by Google (but,
> really, by Xooglers). I would go so far as to say that Kubernetes (or, as
> the "cool kids" say, k8s) is the worst thing to happen to system
> administration since systemd.

Oh for crying out loud. This Kubernetes bashing is getting so old.

Frankly, I think it's the best thing that has happened in the past few years.
Boohoo it requires some YAML. Yeah. And then you get lots of value out of the
box.

Systemd bashing is a bit more deserved, but this horse has been dead for a
while now.

And there's also some CoreOS bashing too. CoreOS is/was spectacular.

~~~
dakiol
Your comment is as valid as what you have quoted. Kubernetes can bring lots of
value, but not to everybody.

------
peterwwillis
> If you are running a truly enormous system and want to have off-the-shelf
> orchestration for it, Kubernetes may be the tool for you.

Not even then. If you're an enterprise using EKS, you still have to work
around the pain points, missing features, incompatibilities, and do a metric
shit-ton of custom integration. And that's before even touching CI/CD, and
ignoring how you build the infrastructure.

It's like a Formula 1 engine. Really complicated, you still have to assemble
the rest of the vehicle, and you need a team of engineers to understand what
it's doing when you drive it.

I think the author is right. gRPC is bullshit that you only use if you're
dedicating your whole organization to one protocol. They actually re-added
that wacky backwards-REST API-gateway because people really needed a REST API.
On the other hand, Consul not only has a REST API since the beginning, but it
incorporates more useful (some would say necessary) features so you don't have
to write them from scratch. So you can go with Consul and get more
compatibility and features, or go with Etcd for the opposite.

But at the same time, a "lightweight database written around Raft consensus"
is like saying "clogs for runners". Most people shouldn't use it, and those
that do won't be happy with the consequences later.

------
loriverkutya
I tried to find something valuable in this post, but this is just a few
paragraphs of rant.

------
marcrosoft
What a breath of fresh air. The author is spot on and cuts right through the
bullshit.

------
amyjess
When I see comments like this:

> cancerous user-hostile protocols of HTTP/2

I know not to take the article seriously. I'm simply not interested in hearing
from someone with that kind of attitude.

------
jupp0r
etcd switching to grpc makes complete sense. The author seems to have limited
experience with the downsides of http long polling and all the implementation
and operational complexities it comes with.

As a counter example of Xooglers making things more complicated, I wanted to
bring forward the removal of the protobuf exposure format for prometheus 2.0
after it was discovered that it's actually more performant to parse
prometheus' text format.

------
jb3689
It's free open source software. Take it or leave it. Fork it if you want. Use
an old version if you want. At the end of the day the maintainers (presumably)
decided to do this to etcd. They could've rejected the changes but didn't.

------
SrslyJosh
I think the takeaway here is that if you don't want a project co-opted by
$MEGAHUGECORP, release (or fork) it under GPL 2/3.

------
lmm
Funnily enough, the part that this author seems to hate the most - moving to a
well-defined protocol that has an actual typed description rather than some
ad-hoc pseudo-plaintext format - is exactly what's wrong with
systemd/kubernetes/etc - reinventing the wheel with new protocols rather than
reusing what already exists. You don't have to like gRPC - I don't
particularly like gRPC - but it gives you an interface in a well-defined
format that's easy to play around with. That's head and shoulders above having
to figure out the edge cases in yet another custom pseudo-plaintext protocol.

------
knorker
This doesn't sound right:

> In 2015, an unrelated tool called Kubernetes was released by Google (but,
> really, by Xooglers)

1.0 was released in 2015, but seems to me the relevant year was 2014:

[https://www.wired.com/2014/06/google-
kubernetes/](https://www.wired.com/2014/06/google-kubernetes/)

And how was it Xooglers? Wikipedia certainly seems to be saying this is a
Google-initiated product, through and through:
[https://en.wikipedia.org/wiki/Kubernetes](https://en.wikipedia.org/wiki/Kubernetes)

------
mooted1
bai

~~~
dijit
> This guy used to run infra at uber and was incredibly salty about every
> single new technology. There were a lot of bad ones, but every conversation
> was about as constructive, free of evidence, and bitter as this blog post.

I'm going to assume you're a developer?

Infra people are usually much more apprehensive to take on new technology.
Crucially I would describe classically trained sysadmins as 'pessimists to the
core'. This is why there's memes of operations saying 'no'.

This is what devops was all about, the shared responsibility of it all. I'm
going to assume that uber was perfect and got devops exactly right- but adding
technology should in my mind always be met with the absolute most critical eye
imaginable; and if he's a classically trained sysadmin then it probably comes
from that place of being once bitten twice shy.

~~~
pm90
The apprehensiveness of sysadmins may have been justified in the world of 5
years ago, but today it sounds somewhat out of place. Note that critically
evaluating new technologies is still an important skill, and many
infrastructure people I work with are extremely cautious about adopting new
tech without spiking/getting to know it. But sysadmins with a penchant for
saying no are probably one of the reasons the devops movement actually kicked
off: developers and infrastructure folks could see the immense productivity
gains from being able to ship code quickly to production and, frustrated with
the amount of time that traditional software deployment processes took, got on
board with the technologies that enabled this.

So in today’s world, that kind of attitude is hardly productive. Skeptical?
Absolutely. But open minded.

~~~
Nextgrid
> the devops movement actually kicked off

And this is why every single project out there is now a house of cards (or
should I say house of YAML files) using insanely complicated technologies
(like Kubernetes) with very "interesting" failure modes to say the least.

This attitude works today because of engineering-driven-development; the whole
purpose of engineering _is_ engineering and business priorities took a
backseat in favor of buzzwords on the careers page and an obligatory
"engineering blog" (describing how they solve self-inflicted problems),
however when it comes to reliability and solving problems a large majority of
projects can get away with much simpler, old-school technologies.

~~~
pm90
Almost everything in your comment is wrong. Kubernetes has enabled a whole
host of observability tooling (eg opentracing), promoted a culture where app
logs are easily and always accessible, enabled zero-downtime deployments for
teams without dedicated infra specialists, and so much more. It has made
deploying reliable applications a lot easier than ever before.

Services today scale to handle a lot more users and traffic than they did not
so long ago; and these reliability guarantees are the norm rather than an
exception.

~~~
Nextgrid
Kubernetes does indeed have advantages but also brings a whole layer of
complexity, overhead and moving parts. From my experience, in many cases the
theoretical advantages don't end up being worth the tradeoff and/or don't even
end up being implemented. Furthermore the particular things you mention
(tracing, centralized logging & no-downtime deploys) can be done just as
easily without Kubernetes.

I disagree about not needing dedicated infrastructure specialists. Kubernetes'
complexity, learning curve and failure modes would make me uncomfortable
operating without having a dedicated "devops" person (or sysadmin as we used
to call them) while I am perfectly comfortable managing a few virtual machines
(or even bare metal hosts) with a load-balancer in front of it. I recommend
building systems in a way that can easily fit in your mind, and there's only
so many abstraction layers and moving parts you can fit in there before you
overload.

When it comes to scaling, not every application _needs_ to scale and even when
it needs to, it's trivial to scale stateless app servers without Kubernetes.
You can scale quite far without Kubernetes, and when you're past that point
you'll realize your main bottleneck is your data store and Kubernetes (or
similar) can't magically solve that.

~~~
pm90
Your answer to everything I said is “actually, it’s not that hard without
kubernetes”. Maybe it’s not for you. For most developers, it absolutely is.
And that’s why kubernetes is popular.

Fads don’t form out of thin air. There’s always some value that they provide.
To someone experienced with setting up infrastructure, the tasks may seem
trivial, and the value add is low. For others who don’t, having a dead simple
way of easily adding tooling around their applications is a godsend. Why is
this so fucking hard for you to understand?

~~~
Nextgrid
I am approaching this from a developer's perspective. Kubernetes and similar
(even local Docker) introduces an extra layer of indirection that you often
have to fight with. Sometimes it's worth it, sometimes it's not. For me, if my
application is misbehaving in production, I prefer being able to just SSH into
the machine and figure out what's wrong than fight with the container layer,
its authentication system, command-line syntax just to obtain a shell inside
the container. When I am developing locally, I prefer having all my files on
the local filesystem instead of having to worry about volume redirection and
"docker exec".

I am not an expert in setting up infrastructure by any means. In fact if I
were I would probably use and promote these technologies. But in my opinion,
adding another layer of abstraction doesn't magically solve the problems of
the underlying stack (it won't protect you against obscure Linux kernel
behavior, but now you have yet another moving part and potential variable
which you need to account for when troubleshooting) but still gets in the way
when you're trying to do something simple that doesn't even require any of the
advantages the container technology is offering.

When it comes to "adding tooling around their applications", I am not sure
what you mean but I will assume you refer to your previous examples, in which
case I do not see how container technologies change the game at all. Tracing
and centralized logging require your application to talk to a centralized log
server (for logs, you can also output to stdout and have systemd/syslog
collect and send them to the logging server) and container technologies don't
change anything here.

I am not saying that container orchestration technologies provide no value. I
am saying that they are often overkill for the task at hand and introduce
extra complexity, moving parts and management overhead.

~~~
dijit
I'm an infrastructure type and everything you said rings true for me.

Kubernetes and abstractions of its ilk (shipping containers for example) have
a place, but every abstraction comes with some form of trade-offs, be that
performance or transparency.

Dealing with a node brown-out in kubernetes is much worse than dealing with a
network, host or service outage because the troubleshooting steps involved
evolve fractally.

That said, obviously there is value- but it's good to critically assess the
value instead of just jumping in.

------
jrockway
As far as I can tell, the only thing this article is saying is that the author
doesn't like gRPC, preferring hand-rolled APIs. (The rest of the rant doesn't
really talk enough about the problems to respond to. The author doesn't like
Kubernetes. The author doesn't like systemd. The author doesn't like software-
defined networking. No reason is given as to why, so there is really no way to
have a constructive conversation about it.)

Hand rolled APIs are easier to understand, but harder to maintain. It's great
if you're only ever going to have one client, but once you need more than one,
it's sure tedious to write and rewrite it for every language you want to
support. Using gRPC means that you can auto-generate the client, and while
they might not be as wonderful as writing each one of them by hand, at least
you can get a client for whatever language you're using. And, the clients all
behave the same way -- trying to figure out how to add interceptors to every
bespoke client you need is quite tedious. (Look at how long it took AWS to get
contexts in go, or how hard it is to add OpenTelemetry to the random hand-
rolled HTTP client, etc. With gRPC, you just do those things once!)

Using protos as the transport layer lets you make backwards-compatible changes
smoothly; adding fields is safe, renaming fields is safe, etc. The same is not
true of using JSON -- if you call something "foo", you can't just one day
rename it to "bar". Clients won't know what "bar" is. So you have to update
clients and servers at the same time, and you can never "make before break".
You see this all the time when someone rolls out a client/server update for a
browser app -- your browser cached the Javascript, and it can't talk to the
server anymore until that cache expires. It's nasty. I don't understand why
people do that to themselves.

gRPC also adds defined semantics for TCP connection lifetime; with HTTP/1.1,
maybe you can reuse your connection, maybe you can't, it depends. You can't
have multiple requests in flight on the same connection, even if you can reuse
it. HTTP/2 fixes this, but gRPC has first-class channels and behavior is well-
understood for request/response, streams, etc.

It is unfamiliar and not as easy to debug on the command-line as "curl
[http://api.example/foo"](http://api.example/foo"), but once you get up and
running, easy things are easy and hard things are possible.
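(For the command-line gap specifically, a tool like grpcurl gets you most of
the way there, assuming the server exposes reflection or you have the .proto
files handy; a sketch against the stock gRPC hello-world service, not etcd:)

    
    
      # list services and inspect one (needs server reflection)
      grpcurl -plaintext localhost:50051 list
      grpcurl -plaintext localhost:50051 describe helloworld.Greeter
      
      # call a method with a JSON request body
      grpcurl -plaintext -d '{"name": "world"}' localhost:50051 helloworld.Greeter/SayHello
    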

As for Kubernetes, I dunno, it hasn't been bad for me. I tend to use the
managed offerings, so I don't have to spare 5 machines for an etcd cluster /
masters, or maintain them. I build a container and Kubernetes ensures that it
runs forever. If it dies, it's restarted. If more replicas are added, they
start receiving traffic. I can manage 100% of the configuration in Git, so if
my cluster or cloud provider blows up, I can re-apply somewhere else and have
a 99.9% chance of it all working within 15 minutes. Before containers and k8s,
production felt very much like a "yolo" thing to me. Most of the world set up
some VPSs, logged in, configured them, and prayed that everything would work
well. Your website went "down for maintenance" every time you did a release.
You needed to distribute root credentials, hoping that you could fully trust
everyone on your team to not mess anything up. It mostly worked, but through
sheer brute force rather than any system working behind the scenes to make
things run smoothly. With k8s, you can delegate this tedium to software. A
developer can update a config file, have the PR approved, and software will
ensure that the new version of the software starts, is assigned some load, and
the old version is shut down. It's smooth, hard for someone to manually mess
up, and quite productive.
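(A rough sketch of the "config in Git, re-apply anywhere" workflow, with
made-up names:)

    
    
      # everything lives as YAML in the repo (Deployments, Services, etc.)
      ls deploy/
      # deployment.yaml  ingress.yaml  service.yaml
      
      # point kubectl at any cluster and recreate the whole thing
      export KUBECONFIG=$HOME/.kube/new-cluster
      kubectl apply -f deploy/
      kubectl rollout status deployment/myapp
    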

I get that people have made their own ad-hoc orchestration and they like it. I
guess that's OK. What I like about k8s is that it's a common language for this
sort of thing -- I've used it for 3 projects, and they've all looked about the
same. I've never had to learn anything company-specific; I can just run my
software. I don't think that's a bad thing.

(The solid alternative to k8s, it seems, is just paying someone else to run
everything but the one application that your company makes. Can't figure out
how to run Prometheus? Just buy a Datadog subscription. Can't figure out how
to run a frontend proxy? Just buy an Application Load Balancer. Can't figure
out how to search and retain application logs? Just buy a Splunk subscription.
Can't figure out how to install MySQL? Just buy it from your cloud provider.
That is all great, but you end up spending a lot of money because you can't
efficiently manage software and instead spend all your time writing rants
about orchestration frameworks. Sometimes I wonder.)

Whenever I see articles like this, I have to ask what the transition from "big
UNIX" to Linux would have looked like on HN. I am sure some people hated it,
and I'm sure some of those people had good reasons. But things got smoothed
out and it turned out that Linux and its ecosystem was pretty good. If you
maintain a production application today, it probably runs on Linux, and that's
good because that's one less thing you have to teach your team. I kind of see
k8s in the same place. Lots of people have trouble running multiple pieces of
software in production. k8s is a common language for doing those things. You
can build it yourself or you can buy a managed service. You can extend it to
do the crazy things you need it to do. It's not a bad thing, and I think it's
pretty disingenuous to compare it to systemd.

~~~
wh-uws
Thanks for the succinct explanation of gRPC over HTTP / REST / whatever.

I've been trying to figure that out for a while.

Think your analysis is spot on overall.

The author strikes me as the kind of person who was saying "why can't I just
keep writing Assembly?" in the '90s.

The answer is you can keep using <X technology at a lower abstraction level>.

But don't be mad when people are using the new abstraction layer to build
interesting stuff because they don't have to worry about the lower level as
much.

------
megapatch
One should not forget that a big corporation is not made out of great people,
but actually only of normal-sized people... just, well, a lot of them!

If you hire a lot of people, they all will need to occupy a space each. To
deserve that space, everybody needs to be either quite good... or good at
faking being good.

Truly innovative things are done by great individuals. And corporations are
not the natural habitat for great individuals.

------
tambourine_man
Isn’t Etcd Apache licensed? Don’t like the direction it’s taking? Fork it.

Is your fork not getting much traction and community support? Maybe you’re the
edge case.

~~~
juliend2
People think it's easy to create a durable fork that will gain enough traction
to live on many years (in a stable manner) and attract contributors. It's not.
Just look at the tons of forks on GitHub for abandoned open source projects.

Let's say there are 10 people willing to fork it to keep it simple as it was
previously. Where is the go-to place for that to happen in the open source
world?

~~~
tambourine_man
It's very hard, that's what I was trying to say.

In your 10 people example, it seems to me that there are a few problems:

\- finding like-minded peers

\- getting critical mass (are 10 enough to maintain the fork?)

\- getting them all to agree on what subset of features to keep

If I were in the market for a fork, I would search Github first before forking
my own, so I think the first item is basically solved. The second varies on a
case by case basis, of course. The third seems more critical to me, probably
dispersing the effort more than anything else.

------
reggieband
When I read these anti-kubernetes articles, of which there is one every couple
of months, I think of an analogy with tailors. I mean, a made-to-measure
garment is strictly superior to what you can buy from Old Navy. If you take a
tailor into Old Navy you would probably get a similar rant about how badly
made the clothing there is. I also recognize that everything that
Kubernetes/Docker is doing could be replicated more simply and with greater
craftsmanship.

I think the unnoticed problem is that the tailors (old school system
administrators and dev ops) are being asked to become the managers of Old Navy
(kubernetes cluster administrators). Their entire skill set is opposed to this
new role. I can't really blame them for being frustrated.

To keep stretching my already thin analogy, there are still tailors in this
world. However, most people buy their clothes at outlets like Old Navy.
Whining about the quality of the clothes and the crummy manufacturing process
at Old Navy won't change that.

~~~
zten
> When I read these anti-kubernetes articles, of which there is one every
> couple of months, I think of an analogy with tailors. I mean, a made-to-
> measure garment is strictly superior to what you can buy from Old Navy. If
> you take a tailor into Old Navy you would probably get a similar rant about
> how badly made the clothing there is. I also recognize that everything that
> Kubernetes/Docker is doing could be replicated more simply and with greater
> craftsmanship.

But down in reality, most companies are probably running their own made-to-
measure deployment and operation schemes with lower quality and consistency
than Old Navy.

~~~
reggieband
Of course, just as many tailors back in the day were likely to make clothing
worse than Old Navy makes today. We fetishize the top 1% of masters as if the
average quality is anywhere near that.

Again, it reminds me of a carpenter friend who hates Ikea. Yeah, I get it,
Ikea is really bad compared to custom built furniture. No one who buys it
expects anything else.

This story is older than the industrial revolution: craftsmen being replaced
by technology. We even instinctually know this is going to happen to knowledge
workers but we seem blind to it when it happens to us.

------
emilecantin
I don't really use k8s nor etcd, but one sentence in the article really stood
out to me:

> I would go so far as to say that Kubernetes is the worst thing to happen to
> system administration since systemd.

I know it's popular to shit on systemd, but as a casual user, I honestly don't
get it. I've written about a dozen systemd unit files over the years, and each
time it's been a pleasure. Write a few lines describing what to launch when,
one command to make systemd aware of your new file, one command to launch it.
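For anyone who hasn't tried it, the whole exercise really is about this small
(service name and paths are made up):

    
    
      # /etc/systemd/system/myapp.service
      [Unit]
      Description=My app
      After=network.target
      
      [Service]
      ExecStart=/usr/local/bin/myapp
      Restart=on-failure
      
      [Install]
      WantedBy=multi-user.target
    

Then `systemctl daemon-reload` to make systemd aware of it and `systemctl
enable --now myapp` to launch it (now and on every boot), and that's it.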

In contrast, I've written a couple sysvinit files, and the experience was
horrible every time. Having to manage pid files and logging myself, setting up
the proper symlinks, etc. It'd always end up being a ~100 line script that'd
be 99% copy-paste.

There might be some use-case where systemd is worse, but for my (admittedly
simple) use-case, it's far superior.

~~~
GordonS
I personally have no beef with systemd and completely agree with you that
having a simple and standard way of composing services/daemons is a
_wonderful_ thing, and one that was sorely lacking for a very long time indeed.

My impression from the systemd debate is that those that are against it feel
that way because of the breadth of systemd - it's about much more than "just"
service units.

------
kureikain
I think this is the problem with open source software: it's not
one-size-fits-all.

We all build complex software and we all know its complexity: sharding,
failover, retry, backoff, schemas, etc. ... everything in a high-scale system.

The problem is that a user who doesn't need the other features feels
frustrated. At the same time, users who run it at scale need features others
don't. What can we do? I don't know.

A good example is Sentry: [https://blog.sentry.io/2019/05/14/sentry-9-1-and-
upcoming-ch...](https://blog.sentry.io/2019/05/14/sentry-9-1-and-upcoming-
changes). Say you are a small single-person SaaS: are you going to install
ZooKeeper, Kafka, Snuba? No. But they need them...

------
mnming
It's almost as if every month there is an article like this popping up on HN.
The main ideas are all similar: "I don't like complexity, simplicity is king.
React, K8S, SAP and blah blah are bad." Maybe I just don't really fit in here.
I almost always feel the opposite.

I don't find that new, so-called "complex" software difficult. I can pretty
easily see why we need those new features and what problems they are trying to
solve. I'm really grateful there are people out there solving my problems one
by one, for free. I feel this new software enables me to do much more, and
better, than I could do 7~8 years ago.

I feel sorry for the fellows who just can't grasp new technology quickly but
this world is cruel.

It's only hard if you can't get it.

------
bborud
Why is gRPC considered bad?

I can understand that people who are used to HTTP based APIs will feel more
comfortable being able to tap into a rich fauna of tooling for HTTP.

However personally I am not really interested in raw network APIs as long as
they work. I'm interested in programming - not poking around low level stuff.

I'm interested in libraries for speaking to services. When I use services like
Twilio and Stripe, I'm not interested in rolling my own. I'm not going to
implement my own unless I really have to. It isn't nice to just publish an API
and not provide libraries for talking to it.

gRPC, at least for me, makes initial bootstrap, development, and maintenance
easier.
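Concretely, the "library for free" part looks roughly like this, assuming
protoc and the per-language plugins are installed and a hypothetical api.proto
describes the service:

    
    
      # generate Go client/server stubs
      protoc --go_out=. --go-grpc_out=. api.proto
      
      # generate Python stubs from the same definition
      python -m grpc_tools.protoc -I. --python_out=. --grpc_python_out=. api.proto
    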

------
asdfk-12
This post sums up the last decade. Shouldn't tools and protocols become
simpler and easier to use? If megacorps can use complexity to reduce
democratization of the web, they will because of profit motive.

------
cryptica
> Anything that has a simple and elegant feature-set ends up coöpted by people
> who just want to build big ungainly architecture and ends up inheriting
> features from whatever megacorp the coöpters came from

So true.

------
AnIdiotOnTheNet
I feel the author's pain. Earlier today I literally said "modern software
makes me wish I was dead". Things are so unbelievably bad now, and nobody in a
position to do anything about it seems to care, and the kids growing up with
this garbage think it's normal that it sucks so bad. No, it's worse than that,
they celebrate it! It is now at the point that I no longer want to work in
this field and am actively seeking alternatives. I just can't deal with the
bullshit anymore.

------
nudpiedo
I can 100% agree regarding the evolution of etcd as I used several versions of
it and the amount of workarounds I had to use on every version went from 0 to
"too many moving parts to know whether I will get a call over the weekend".

With gRPC also came incompatibility of many clients with less popular
architectures, and, as the article describes, in the name of a
super-specialization (because that's the only thing it is), we are losing the
qualities that made OSS legendary.

------
parentheses
Joe Armstrong spoke of "the mess we're in" and I think he captured it well. He
was referring to the symptom in general, whereas this essay is about one of
the root causes.

Having recently learned bits & pieces of Kubernetes, I can see the author's
perspective as it's one I held for some time. Kubernetes is useful and makes
many things easier. In exchange you give up finer-grained control and you have
to work with an abstraction. Such is the nature of all abstractions.

------
djhaskin987
My favorite excerpt from this article and one that sums it up nicely can
actually be found in one of its footnotes:

> I am _filled with rage_ just thinking about how we took a fundamental part
> of the Internet, simple enough that _anyone_ can implement an HTTP server,
> and replaced it with this garbage protocol [HTTP/2] pushed by big megacorps
> that doesn't solve any real problems but will completely cut out future
> generations from system programming for the web.

------
zwischenzug
Don't like what etcd has become? Fork off an older version and help maintain
it as a simpler product.

Open source is a market. If you stop building it, they might come too.

~~~
pelasaco
That's exactly what I wrote. Just fork it!

------
brown9-2
> That's it. That's the story. Popular modern technology is taken over by
> expats from a megacorp and made worse in the service of a hyper-specialized
> (and just plain over-hyped) orchestration platform

If you hate the direction the open source project has gone in, why not fork it
and have the v2/HTTP version live on forever? The project does not owe it to
you to stay crystallized at a moment in time.

~~~
trynewideas
When v2 is removed, I'll be shocked if etcd isn't forked because it still has
enough value to enough people to maintain it.

But it hasn't been removed yet. It might be cynical, but I think a v2 fork now
won't accomplish anything because the people most interested in one would
simply continue to use v2 upstream.

------
winrid
Unrelated note / self plug - your comment section is hard to read. I built a
comment tool that will auto adjust to dark sites like yours, and it supports
importing from your comment tool:

[https://blog.fastcomments.com/(7-07-2020)-fastcomments-on-
si...](https://blog.fastcomments.com/\(7-07-2020\)-fastcomments-on-sites-with-
dark-backgrounds.html)

------
EmanueleAina
> I am filled with rage just thinking about how we took a fundamental part of
> the Internet, simple enough that anyone can implement an HTTP server,

Ahahahahahah, yes, anyone can implement an HTTP server, just badly. HTTP/1.1 is
quite complex, the spec alone spans over eight RFCs: if you can implement all
of that I doubt the HTTP/2 serialization is much of a concern. :P

------
johnmarcus
I stopped reading at "Kubernetes (or, as the "cool kids" say, k8s) is the
worst thing to happen to system administration since systemd.". These are the
_best_ tools to happen for system administration in 20 years. Also, etcd is a
terrible db. You can't back it up and can easily lose quorum (and all your
data) with a network outage.

------
gtaylor
This is a rant post that backs up very few of its assertions. Though, the author
may not have been trying to write for serious consumption. Sometimes it's
therapeutic to have a good rant.

I'm not sure what there is for us to discuss. Nice rant, I guess? The post
does not attempt to persuade or change opinions (which, again: cool. sometimes
it's nice to have a good rant).

------
d3ntb3ev1l
As systems get more complex, whether they need to be or not, the issue I
believe is no one takes the time to explain or document why. They just plow
ahead building complex stuff at times seemingly so they can say “look at this
complex system I built to solve that really hard edge case”. When 99% of the
use cases were already solved.

------
doonesbury
OP's piece is heat, not much light, except maybe the point about replacing
HTTP with gRPC (if true). They should have added a second path and made it
configurable. Storing Kubernetes state in etcd is the raison d'être for
multi-Paxos systems: they store stuff. Also, etcd seems to be an innocent
bystander: the real whine seems to be about k8s.

------
pdonis
The big missing piece in this article: if the author has a real need for the
old simple etcd, he can always fork it. If nobody has done that, that means
nobody has enough of a need for a different etcd from the one the Xooglers
took over. In which case it's hard to work up a lot of concern.

------
MorganGallant
Summarized by GPT-3 as "Etcd is a cool thing, but some bad people took it over
and made it worse."

------
louwrentius
> If you are running a truly enormous system and want to have off-the-shelf
> orchestration for it, Kubernetes may be the tool for you. For 99.9% of
> people out there, it's just an extra layer of complexity that adds almost
> nothing of value.

This is probably even true for 99.9% of HN readers.

------
qqj
In a world of space architects and ignorant cargo cult practitioners, this
post is a breath of fresh air. While some might find the style vitriolic and
“toxic” it’s nothing but honesty peppered with salty experience.

Having said that, the guy represents the old guard of tech, the kind of people
who ask “why do you want to do this” instead of trying to help you, and
usually follow this up with a useless suggestion for an alternative that sort
of does what you want but not really. Thanks, I guess. They have a very
concrete idea of how things should be done, new ways of doing old things are
frowned upon, and anyone challenging their “authority” is seen as an imbecile.

I had the dubious pleasure of working with such a man a few years back and
while I couldn’t but admire his technical expertise and competence, I grew to
hate him to a degree that would rival that which the author of the blog post
hates Google. Get off our lawn, old geeks! You’re not the only ones with
opinions around here.

------
fooster
Why on earth did this garbage make the front page? For shame!

The author is a total jerk trashing everything in his path with hyperbole
rather than sound logical arguments. Read his rant on protobuf. Why would
anyone waste their time tearing down the contributions of others?

------
knodi
At some point, huge complexity is introduced to simplify a problem and solve
for a gap or feature. Don't look at this as a burden; look at it as a use case
you don't currently require, and this may simply not be the right tool for you
to use.

------
charintstr
Well can't say I'm surprised at these reactions. Random internet dude releases
opinion, followed by everyone else releasing their opinions all over the
internet. Discord ensues.

------
jarym
Meh, author certainly has a point about K8s being overkill for 99.9% people
out there.

Unless you need to operate at a certain scale (that most people never will)
then Kubernetes is like taking an F1 car onto a regular road. Recipe for pain.

------
Ericson2314
Hot take: the original stuff using simple etcd had no reason to be
decentralized in the first place, other than more overengineering.

Everything the post says is bad is bad; everything the post says is good is
also bad.

------
watt
OK, there seems to be a parallel universe of computing where systemd,
Kubernetes, gRPC and Protocol Buffers are considered bad. Time will tell if
they have been right, or are sore dinosaurs.

------
einpoklum
1\. The project leaders agreed to this; it's not just the Google people. If it
were just Google, they would have had to fork the project and call it getcd or
whatever, then ruin it at the expense of their own reputation.

2\. Speaking of forking, it's still quite possible for a few interested
developers to fork a clean older version, call it freetcd, and rebuild the
reputation. It worked for LibreOffice after all - and that's when OpenOffice
wasn't terrible, just somewhat mismanaged.

All that said - I don't quite see why you need a specialty database for a
simple key-value store for /etc settings. Aren't there plain-vanilla kv-stores
which would do?

------
luord
I might not agree entirely, or even mostly, but I have seen how trying to
emulate google can seriously harm applications or open source projects. Guess
etcd is an example.

------
joshspankit
This is the first time I can recall where reading the footnotes provides at
least as much meat and interest as the article itself. There are even
footnotes _in_ the footnotes.

------
RcouF1uZ4gsC
> The software development world would prefer to use their multi-gigabyte IDEs
> running on ElectronJS to build thousand-dependency Java applications
> targeting ungainly APIs on hard-to-operate systems than support something
> simpler and better.

I think that developer tools have always taken up a significant proportion of
the resources available on a developer machine. Back in the 90's, Visual Basic
was criticized as being bloated for requiring 4MB of RAM (the exact number
may be off).

Now we have vastly more powerful computers. I think using those resources to
have easier, more extensible, and more capable developer environments is a
good tradeoff.

------
Animats
Google rolling their own replacement for TCP and HTTP is a disappointment.
Their own benchmarks indicate they get maybe 10% more performance on a good
day. All this comes with a huge increase in complexity. Which means more bugs.

The "solution" to bugs today is forced updates. Vendors love being able to
forcibly impose whatever they want to do on the user. Turn off features, put
in more ads, whatever. If software was reliable enough, nobody would upgrade,
which damages the business model. It's so convenient that software seems to
need constant "security updates".

------
123BLiN
I think the author is more dev than ops; a year spent trying to support the
project not in dev but in prod, in a highly available configuration, could be
eye-opening.

------
123BLiN
It all seems like the position of a dev, not ops. I suppose a year spent
supporting his projects in prod, in a highly available configuration, could
change his opinions.

------
mleonhard
I am a Xoogler and I needlessly added protocol buffers to my project, at great
expense. The author is correct on this point.

------
erikbye
How was etcd forced to accept these changes? It is up to the project
owners/maintainers to deny harmful changes.

------
weddpros
The Kubernetes podcast explained just last week how moving etcd from HTTP to
gRPC allowed a large gain in scalability

------
nserrino
I have not found etcd to be 'an absolute pleasure to work with', to put it
lightly. It has been a plague of stability issues and it sometimes seems
better to roll one's own than continue tracking down issue after issue in etcd
(yes I realize DIY has its own set of probably bigger problems :) ) I don't
know if this experience is due to the changes on the original etcd
implementation that the author is describing.

------
mwcampbell
The tone of this article is so unrelentingly snarky and negative, I think it's
just flame bait. So I flagged it.

------
dutch3000
i always cringe when the tech megacorps buy out an up and coming start up that
shows promise. good for the start up employees as they cash in, but i can’t
help but think that overall these actions stifle future innovation moments
that could have been if the start ups matured more on their own.

------
cbsmith
There is an element of "craftsman blaming their tool" here. I don't mean that
as a criticism of the author, who is clearly frustrated with some choices they
didn't make but nonetheless have to suffer the consequences of, but rather of
the larger context. Clearly, someone, somewhere is craftsman blaming their
tools.

------
zxienin
> to infect etcd with Google technologies, as is their way. Etcd's simple HTTP
> API was replaced by a "gRPC"

etcd needed a gatekeeper like Linus Torvalds [1]

[1] [https://lkml.org/lkml/2020/6/1/1567](https://lkml.org/lkml/2020/6/1/1567)

------
DSingularity
I think the author makes some good points. Why not fork the old version?

------
mobilemidget
I actually block all QUIC traffic, the only use I have seen for it so far from
my OSX machines is chrome phone home traffic. At the weirdest moments,
suddenly little snitch pops up, hey chrome wants to connect to ip x.y.z.1 443
QUIC

------
tremguy
It's open source. You can always start your own fork, no?

------
nix23
He is partially right, because I think containers or Vagrant are good tools
for development in 80% of all cases... but on the other hand I like having
monoliths in testing and production... so there's that ;)

------
tedunangst
What happened to etcd v1 that it's no longer useable?

------
justinzollars
I enjoyed your rant, James. It's like a walk to Hermanos.

------
rdiddly
Does one not simply make a fork of the older version?

------
pelasaco
Too much hate when the solution is quite easy: fork it from the version that
you like and maintain it. I'm quite sure a lot of people would love to support
you on that.

------
sheeshkebab
It’s difficult to build simple (to use) things

------
haecceity
gRPC is actually pretty nice. It's simpler than COM. More complex than passing
JSON around but also less bug prone.

------
MauranKilom
I found these footnote labyrinths¹ hard to follow.

¹See [https://xkcd.com/1208/](https://xkcd.com/1208/).

------
dsaron
I think I've just discovered one of my new favourite blogs, speaking unbridled
truth.

------
AtOmXpLuS
Everything in AtOmXpLuS

------
exabrial
I liked this because the author advocates for simplicity.

------
red_admiral
There may well be evil at google, but it's not here.

My take on this is that CoreOS and google have different use cases, and the
best solution would have been to fork the project. Since google is big
[citation not needed], most development effort would probably end up there,
but people who wanted the simple version could stick with it.

The difference between a HTTP API and gRPC is one of scale. If the HTTP one is
a few bytes more per message, and takes a few more clock cycles to decode,
then at google-scale that might mean extra energy use on the order of a small
town. RPC makes complete sense for google here.

Then there's reliability. If something fails one in a million times, it's a
minor inconvenience to someone who uses it once a day but if you're running it
four billion times a day then you start to notice. gRPC is strongly typed so
it actually removes the malformed HTTP failure mode - and you can bet that
google's SREs are taking care of the other ones too. When the OP says "Quality
is, alas, a dying art", I'm not sure whether that's an honest description of
just how many "nines" the google SREs are building into their projects. It's
not like they invented gRPC to make their systems _less_ stable!

I can recommend rachelbythebay's comments on this; she's an SRE who has done
the rounds in silicon valley - appropriate posts here are "Some Items from my
Reliability List" [1] where she famously says "there is far, far too much JSON
at $company" and ends that section with "Why aren't you using some kind of
actual RPC mechanism? Do you like pain?" \- at the scale she's operating, RPC
is _less_ pain it would seem.

In "We have to talk about this Python, Gunicorn, Gevent thing" [2] she
criticises the use of a popular python framework at the scale she's operating
at (I'm sure this was shared on HN in the past).

This doesn't mean that python, "RPC over HTTP", JSON or etcd are in any way
bad. In fact their simplicity makes them great things for beginners to learn
and medium to large companies to use - they're not just prototyping tools,
people can and do build a lot of real stuff with them. It's just that at some
point before you get to the ridiculous scale of google's infrastructure,
there's a tipping point where going full RPC starts saving you a lot more than
it costs; that's not a bad thing and it has nothing to do with trying to
"capture the market" or anything like that. A corollary of this: precisely
because google works at such a scale, there will be a lot more developers
working on grpc-etcd than the "consumer" one.

Summary: developer builds great tool for A; big company has use case B.

[1]
[http://rachelbythebay.com/w/2019/07/21/reliability/](http://rachelbythebay.com/w/2019/07/21/reliability/)
[2]
[http://rachelbythebay.com/w/2020/03/07/costly/](http://rachelbythebay.com/w/2020/03/07/costly/)

------
pwdisswordfish2
6\. HTTP/2 a.k.a. SPDY is a comically bloated Layer 5/6/7 mega-combo protocol
designed to replace HTTP. It takes something simple and (most importantly!)
comprehensible and debuggable for junior programmers and replaces it with an
insanely over-complicated^7 system that requires tens of thousands of lines of
code to implement the most minimal version of, but which slightly reduces page
load time and server costs once you reach the point of doing millions of
requests per second. I am filled with rage just thinking about how we took a
fundamental part of the Internet, simple enough that anyone can implement an
HTTP server, and replaced it with this garbage protocol pushed by big
megacorps that doesn't solve any real problems but will completely cut out
future generations from system programming for the web.

7\. My favorite HTTP/2 interaction has been finding and reporting this bug in
haproxy. The compression scheme in HTTP/2 is so shitty that the "compression
table" in RFC 7541 Appendix A is just a list of the 61 most popular headers
from Google properties.

8\. HTTP/3 is all the badness of HTTP/2, but run over a worse layer 4 protocol
named QUIC that totally fucks up networking for everybody in order to get a
tiny bit more optimization for Google. That's all it does. It makes the
Internet strictly worse for everybody but slightly better for the hugest of
huge web properties. Nobody out here in the real Internet gives the slightest
shit about head-of-line blocking from TCP, and lots of people want TCP state-
aware firewalls and load-balancers to work.

"Nobody out here in the real Internet gives the slightest shit about head-of-
line blocking from TCP, and lots of people want TCP state-aware firewalls and
load-balancers to [continue to] work."

These protocols do not make it more difficult for advertisers or advertiser-
funded companies like Google to serve ads and optimise serving ads, they make
it easier. For example, they allow for faster and more efficient serving of
third party resources, e.g., ads, and siphoning of user data without any
affirmative interaction from the user, and cramming more data into more
headers that are more difficult to monitor. As an end-user I aim to send the
absolute minimum headers needed in the particular instance. 1-2 usually works.
From an end-user perspective we should be trying to _decrease_ headers not
_increase_ them. Websites should only collect the information they actually
need, nothing more. That is not the sort of web envisioned by the "new" HTTP
protocol advocates.

I actually use HTTP/1.1 pipelining with a user-agent I wrote myself and this
original HTTP pipelining has worked great for me for over 20 years. I am not
an advertiser or an ad-supported company; I am just an end-user retrieving
text in bulk from the web. Thus in addition to state-aware firewalls and load-
balancers, I want HTTP/1.1 pipelining to continue to work. The proponents of
these protocols want their audience to believe HTTP/1.1 pipelining doesn't
work or has serious problems. But that is only for _their_ use case. In fact
it does work for end users doing efficient bulk data retrieval without
retrieving any ads or other cruft. But these Google-sponsored protocols are
not written for end-users wanting to avoid ads. Those are not Google's
customers.
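For anyone who hasn't seen it, HTTP/1.1 pipelining is just writing several
requests onto one connection before reading any responses; a crude sketch with
netcat (whether the server actually services them back-to-back is up to the
server):

    
    
      printf 'GET /a HTTP/1.1\r\nHost: example.com\r\n\r\nGET /b HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n' \
        | nc example.com 80
    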

------
dilandau
There's a lot of justified hatred for kubernetes, but for fuck's sake why did
anyone adopt it in the first place, if they didn't understand what they were
getting into? It's like the startups from 10 years ago who absolutely had to
use MongoDB because "scale", and then they never get off the ground because
they're trying to implement ACID from first principles.

~~~
freedomben
It's possible that many people (like me) had been trying to solve a similar
non-business problem for years, including rolling our own solutions and the
like, and when a widely backed open source option emerged and looked like a
possible standard, we accepted some warts in exchange for a broad,
general-purpose, flexible, automatable, and well-thought-out (yes) solution.

Of course K8s is not perfect, and it's overkill for small to medium apps (I
think the hype train convinced a lot of people they would need to scale to
massive cloud levels when really they didn't), but if you have ever needed K8s
(especially for a complex microservice system at big enterprise level) then
you know the value and you remember the proprietary vendor-locked era of
sadness before K8s emerged.

~~~
123BLiN
and me)

