
Running Istio In Production - cr_huber
https://engineering.hellofresh.com/everything-we-learned-running-istio-in-production-part-1-51efec69df65
======
rurounijones
Lots of shallow dismissals in the comments for this and it seems to be a trend
every time microservices come up.

"N Microservices for $BUSINESS_DOMAIN ? Crazy!"

Can't we just take it on good faith that the engineers who do these blog posts
have at least a modicum of competence and therefore their solution has _some_
merit (that is worth discussing) rather than just "Micro-services bad"
dismissals?

<sideshow bob>Yes I realise this can also be taken as a shallow dismissal of
the shallow dismissals</sideshow bob>

~~~
altmind
>> Can't we just take it on good faith that the engineers who do these blog
posts have at least a modicum of competence and therefore their solution has
some merit

On a technical resource, no. In a technical article, technical points need to
be explained. Decisions need to be checked.

Unless your argument is "it cannot be bad because it must have been checked by
many people", we are totally in the position to provide any sort of
constructive feedback.

~~~
rurounijones
If you want to give _constructive_ feedback then have at it.

My comments were aimed at the shallow dismissals at the level to which I gave
an example (and examples of which can be found in the comments on this post).

If a response to these articles is along the lines of "OMG they are using
microservices, BAD!" then that is not constructive or useful unless you are
trying to invoke Cunningham's Law (or whatever the one was about getting help
on a Linux mailing list by saying something cannot be done).

------
altmind
>> At HelloFresh we run hundreds of microservices that do everything from
supply chain management and handling payments to saving customer preferences.
Running microservices at scale is not without its own challenges and many
companies are beginning to experience the pain of complexity.

Let's solve this problem by introducing another level of indirection and not
solving the root cause(?).

At this point I really believe that software architects who don't code don't
belong in this industry. If the implementers and operators are suffering,
there should be a feedback channel.

~~~
cube2222
Indeed, it is another level of indirection, and I’m all for it (having
introduced a custom service mesh based on envoy into the company I work for
having ~100 microservices, though it took more like 2-3 months of me working
solo).

It’s a great way to get a uniform metrics and troubleshooting experience.

Most important though, especially if you have statically compiled binaries
as microservices, a service mesh lets you roll out improved routing logic
easily because it’s all abstracted away, not encoded in client libraries in X
number of languages. Same for tracing. Wanna add a field to all traces being
generated? Go recompile and redeploy 100 services... or change one service
mesh config.

Other than that, envoy can usually withstand much more traffic than the
service it overlays, so you can use it to provide DoS protection in depth, by
limiting on service proxies everywhere. Saved us a couple outage escalations
already.
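
To illustrate the per-proxy limiting mentioned above, here is a minimal token-bucket sketch. It is not Envoy's implementation or configuration; the class name, its parameters, and the injectable clock are hypothetical and chosen only for illustration.

```python
import time


class TokenBucket:
    """Minimal token-bucket limiter (illustrative; not Envoy's implementation)."""

    def __init__(self, rate_per_sec, burst, clock=time.monotonic):
        self.rate = rate_per_sec      # tokens added per second
        self.capacity = burst         # maximum tokens the bucket can hold
        self.tokens = float(burst)    # start full so an initial burst is allowed
        self.clock = clock            # injectable clock, handy for testing
        self.last = clock()

    def allow(self):
        """Return True if the request may pass, False if it should be rejected."""
        now = self.clock()
        # Refill according to elapsed time, capped at the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A proxy in front of each service would call something like `allow()` per request and reject when it returns False, so excess traffic is shed before it reaches the slower service behind it.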

~~~
altmind
>> Other than that, envoy can usually withstand much more traffic than the
service it overlays

Service-to-service peer-to-peer communication creates problems of its own,
and I don't even know what the benefits are. It does not guarantee you anything
per se. All the redundancy and bandwidth improvements need to be... coded...
like with any other approach.

------
geuis
Having struggled with setting up kubernetes over the last month on my own,
I’ve come to realize the absolute value of simplicity.

In the end Helm just introduced more problems than it solved. Rather than
applying configs haphazardly and relying on 3rd party services, it was
ultimately much simpler to just download the configs for whatever service was
needed (nginx ingress controller for me) and committing them to source
control.

My biggest takeaway from the k8s community is that lots of people write
terrible documentation and other people write blog posts and SO answers
without actually understanding how kubernetes works under the hood.

There is still an ocean of depth in regards to k8s I don’t know yet, but I
feel a lot stronger about getting the intermediate basics. I’m at a point
where I’m being productive again.

~~~
theK
This is actually the advice I’m giving every new systems engineer that joins
our organization.

Learning how Kubernetes works is much easier if you first get a firm grasp of
the basics and then start bolting stuff on like Istio, Knative and all the
other cool stickers “modern architects” wet dream about.

~~~
TeMPOraL
Here's a worrying trend: software architects seem to talk in brand names
instead of concepts these days.

~~~
theK
Yes, absolutely. It is like an infection or something. Once you catch it you
can only communicate in logos of hip tech brands.

Edit: punctuation

------
llarsson
I would like to read more about this, but the Part I that was linked is very,
very thin on information. It's essentially just an introduction about the
cloud tech in use, not anything substantial about the actual topic "running
Istio in production".

Looking forward to upcoming parts!

------
core-questions
Delivering pre-packaged meals to people requires _hundreds of microservices_.
That is truly astounding.

~~~
pianoben
It's not entirely unreasonable on its face; physical fulfillment is a
complicated business domain. You might have services dealing with suppliers,
customers, delivery vendors, regulatory compliance, payments, sales taxes, so
on, so forth. You might have ML services to predict product availability or
customer demand. Those subdomains might themselves be decomposed into API
services, vendor gateways, background workers, etc. This doesn't factor in
infra-related things like data stores, caches, etc.

Even if you only sipped the microservices kool-aid, I can easily see dozens of
services.

Granted, it seems reasonable that a handful of monoliths _could_ get the job
done. Without having worked at HelloFresh, I'm inclined to think there's more
to the story that we don't know. Maybe there's a good reason to have as many
services as they do.

~~~
hcnews
I would wager that it's because ex-Uber folks work there and carried their ways
with them.

~~~
pianoben
That's rather pointed, but also matches just about every system-design
interview I've ever given to an Uber engineer.

Next time, I want to ask them how they keep track of all those services!

_edit_: replaced an incorrectly-used idiom

~~~
jordanbeiber
Tip - use a database.

Computers are awesome at automating things, that goes for dev tooling as well.

If you’ve touched the ITSM space you’re used to managing and maintaining many
thousands of assets. A few hundred microservices is nothing, really.

My team uses what you could call a simplified CMDB (configuration management
database), which is cross-referenced against the service discovery.

The CMDB keeps info about every service, such as persistent data sources, VMs,
etc., but most importantly, relationships: domain, team, services and resources.

A microservice is basically a “CI” (configuration item) with a managed
lifecycle.
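
As a rough sketch of that cross-referencing idea: the record fields, service names, and function below are all hypothetical, and a real CMDB would sit in a database rather than a list.

```python
from dataclasses import dataclass, field


@dataclass
class ConfigItem:
    """One CMDB record for a microservice (all field names are hypothetical)."""
    name: str
    team: str
    domain: str
    resources: list = field(default_factory=list)  # e.g. data stores, VMs


def cross_reference(cmdb, discovered):
    """Compare CMDB records with names reported by service discovery.

    Returns (unregistered, missing):
      unregistered: running services with no CMDB record
      missing: CMDB records that are not currently running
    """
    known = {ci.name for ci in cmdb}
    running = set(discovered)
    return sorted(running - known), sorted(known - running)


# Made-up example data: two registered services, two discovered ones.
cmdb = [
    ConfigItem("payments-api", team="billing", domain="payments", resources=["postgres"]),
    ConfigItem("menu-service", team="catalog", domain="recipes"),
]
unregistered, missing = cross_reference(cmdb, ["payments-api", "left-pad"])
print(unregistered, missing)
```

Anything in `unregistered` is running without an owner on record; anything in `missing` is on record but not seen by discovery.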

~~~
anton_gogolev
Keeping track is one thing. Actually _running_ the Lernaean Hydra of an
application in production is a whole other story. The amount of
"housekeeping" you have to do to keep the thing afloat is astounding:
cascading failures, distributed tracing, logging and diagnostics, metrics.
Even the operational side of things requires a lot of attention. Presumably, each
microservice would require at least a minimal level of admin-level tooling
around it.

~~~
jordanbeiber
Logging and monitoring is part of the lifecycle. Use strict automated
conventions to aid developer teams. Always opt for convention before
configuration is our tooling motto! :)

Log shipping is what we do from thousands of servers already (you should at
least!), adding a shipper for a few hundred containers on a set of hosts is no big
deal.

Fluent(d/bit) -> some kind of elastic? There are a few reasonable patterns
available that work and scale pretty well.

Failures and issues with the actual code - well I might have been lucky... DDD
with somewhat senior devs where no spaghetti action takes place. The tooling
we keep usually seems to pinpoint issues fairly well.

We’re on the scale of roughly 40 devs and my team of 3 support them with
tooling that handles service lifecycle and operational stuff.

It lets us be pretty fluent with what and how teams build and iterate stuff.
I guess it requires a certain scale and experience though.

------
trimbo
This is one of those posts that brings out comments about things being
over-architected for the business. My mind definitely went there.

But, at some point, companies that want to keep talented tech people need to
let them go build what they want to build. Maybe those things are
over-architected for what the company needs right now, but it's tough to say if
that's a bigger risk than losing talented tech people.

~~~
kortilla
No, that’s definitely the kind of talent you don’t want to have. Engineers too
naive to realize they’ve over-engineered something are a cancer in an org. You
end up with home-grown complex solutions to problems that nobody but the
original team can grok.

------
BossingAround
I recently started investigating Istio, and though it will probably be an
unpopular opinion here on HN, I honestly don't understand why it's not a core
part of Kubernetes.

It's amazingly simple to configure, the docs are pretty OK, and the benefits
seem huge. Mutual TLS within minutes? BAM, done. Don't worry about cert
rotation, Citadel does that for you.

Block all egress, and whitelist what you need? Like, that's a killer feature!
Also, the inability to do 90-10 canary releases with plain K8s baffles me.
With Istio? Simple...

I don't know, I'm sure I'll find the pains of Istio in the coming months, but
in my dev cluster, it looks amazing.
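
For what a 90-10 canary split amounts to conceptually, here is a small sketch of weighted random routing. It is not Istio's API or configuration; the version names `v1` and `v2` and the function name are hypothetical.

```python
import random


def pick_backend(weights, rng=random):
    """Pick a backend name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


# Simulate 10,000 requests against a 90-10 split (seeded for repeatability).
rng = random.Random(42)
weights = {"v1": 90, "v2": 10}
counts = {"v1": 0, "v2": 0}
for _ in range(10_000):
    counts[pick_backend(weights, rng)] += 1
print(counts)
```

Roughly nine in ten requests land on `v1`. In a mesh the equivalent weights live in routing config, so the ratio can be shifted without redeploying either version.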

~~~
tmpz22
I have a hunch various cloud providers have been aware of TLS related
frustrations for some time and are working on proprietary turn key products to
add to their own offerings at relatively high prices.

This quietly disincentivizes them from leaning into any OSS stacks that would
take away from this future revenue stream.

------
m0zg
Surprisingly, it is possible to build a profitable business in this space. I
looked them up and they're (very slightly) profitable. They also operate in
multiple countries, which likely adds complication to their infrastructure, so
it's not as ridiculous as it sounds.

------
jteppinette
“Now, with our new service mesh—that only took a few months to roll out—a
failure in our hello-fresh-left-pad microservice can be withstood with only a
few hours of downtime” /s - Don’t go work for HelloFresh unless you hate your
nights and weekends.

~~~
anton_gogolev
You can experience the same, if not worse, levels of pain and suffering with a
monolithic application designed by the same Enterprise Architects.

------
loopz
Part 1

