From what I can tell, it handles application-level concerns such as end-to-end authorization/authentication, load balancing, monitoring, etc.
Not sure if it does service discovery or you'll still need something else for that.
A service mesh encapsulates the complexities of distributed service-to-service communication so you don't have to deal with them. There are a few options (not just Istio), and each approaches the problem in a different way.
I have written about this here: https://glasnostic.com/blog/what-is-a-service-mesh-istio-lin... and here: https://glasnostic.com/blog/should-i-use-a-service-mesh
Coding these higher-level concerns yourself is laborious and error-prone, so if your application runs on Kubernetes, then service mesh can be a substantial help.
Now, this all worked because "gRPC" (Stubby) did all of this (I guess): each Borg machine magically collected these metrics/logs/etc., and even if they were not super useful to us normal developers, they were the first thing an SRE would ask for (especially as we moved to Spanner later). Often, if you have an issue while on call, you just increase the sampling to 100% for some time (say 30 seconds), and that captures quite enough for the SRE to take a look (in practice this would be done when there is an incident, or when things are slow, bad, etc.).
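To make that "crank sampling to 100% during an incident" knob concrete, here's a toy sketch of head-based trace sampling. All the names here are mine, not Stubby's or any real tracing library's:

```python
import random

class Sampler:
    """Decide per request whether to record a full trace."""

    def __init__(self, rate=0.01):
        self.rate = rate  # fraction of requests to trace (1% by default)

    def should_sample(self):
        # Head-based sampling: the decision is made once, at the first hop,
        # and propagated downstream with the request.
        return random.random() < self.rate

sampler = Sampler()

def set_sampling(rate):
    """During an incident: set_sampling(1.0), wait ~30s, then restore."""
    sampler.rate = rate
```

Real systems (OpenTelemetry, Census, etc.) do roughly this, plus propagating the sampling decision in request metadata so every hop agrees.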
Now I'm back in a gamedev company, where we started having "micro"-services (without fully accepting it yet) - things talking to various caching backends, things sitting behind Rancher, Postgres, MySQL, custom-built services - but they all talk directly over the socket() API, HTTP, etc. So collecting metrics from them usually means whatever they expose (if anything) to Prometheus, and you never get the whole picture.
Now, and I could be wrong, but this is my understanding: rather than rewriting these services to use gRPC (or something else that plugs in the same metrics), you make these pieces of software talk not directly to each other but to a "service mesh" sidecar (daemon, etc.), which in turn talks to another sidecar on another node/machine. By doing this (plus, I guess, some proper configuration), you get the tracing information you need.
So at least to me, it's an escape hatch: put it in front of something like MySQL, PostgreSQL, etc. and still get it into the e2e picture - e.g. a user has sent a request, and we want to see everywhere it went...
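The sidecar hop described above can be sketched in a few lines. This is a conceptual toy, not how Envoy or Linkerd are actually implemented; the header name `x-request-id` is one that Envoy happens to use, but everything else here is made up for illustration:

```python
import time
import uuid

def sidecar_forward(headers, upstream_call):
    """One hop through a mesh sidecar: ensure a request/trace id exists,
    forward the call, time it, and emit a span-like record for tracing."""
    headers = dict(headers)  # don't mutate the caller's copy
    # Inject a trace id if the application didn't send one; this is how
    # un-instrumented services still end up in the e2e picture.
    headers.setdefault("x-request-id", str(uuid.uuid4()))
    start = time.monotonic()
    response = upstream_call(headers)
    span = {
        "trace_id": headers["x-request-id"],
        "latency_ms": (time.monotonic() - start) * 1000,
    }
    return response, span
```

Because every hop goes through a sidecar doing this, the spans from all hops share a trace id and can be stitched into one end-to-end view, without touching the application code.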
And this is where I see the value of the service mesh. There are also the cases of handling retries, "flaky" (unhealthy) servers via circuit breaking and passive health checks, and cumulative timeouts (deadline propagation is probably the better term): e.g. a 500ms budget on the first request, with the timeout decreasing on each subsequent call, so the request is eventually dropped rather than hanging forever.
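The decreasing-timeout idea is easier to see in code. A minimal sketch of deadline propagation, under my own naming (real meshes configure this declaratively rather than in application code):

```python
import time

class Deadline:
    """One overall budget propagated across hops, instead of fixed
    per-hop timeouts that can add up to far more than the caller waits."""

    def __init__(self, budget_s):
        self.expires = time.monotonic() + budget_s

    def remaining(self):
        return max(0.0, self.expires - time.monotonic())

def call_with_deadline(deadline, do_call):
    remaining = deadline.remaining()
    if remaining <= 0:
        # The budget is spent: drop the call instead of doing useless work.
        raise TimeoutError("deadline already exceeded, dropping call")
    # Each downstream hop only gets whatever budget is left.
    return do_call(timeout=remaining)
```

So a request that starts with 500ms might reach the third hop with only 120ms left, and a hop that would start after the budget is gone never runs at all. That's the "dropping eventually" part.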
Then there's authentication/authorization: rather than re-implementing it (or much of the above) in several languages, you do it once, in one language (C++, in Envoy's case!).
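A toy version of that "implement it once at the proxy" idea (in Python here for readability; an Envoy filter does something morally similar in C++, typically with mTLS or JWT validation rather than a static token set):

```python
def auth_filter(headers, valid_tokens):
    """A single auth check at the proxy layer; every service behind the
    mesh gets it without reimplementing it in each language."""
    token = headers.get("authorization", "").removeprefix("Bearer ")
    if token not in valid_tokens:
        # Reject at the sidecar; the application never sees the request.
        return {"status": 401, "body": "unauthorized"}
    return None  # None means: let the request through to the app
```

The Go service, the Python service, and the legacy PHP thing all get the same check for free, because it runs in the sidecar in front of each of them.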
The elephant in the room is how much this extra hop in the communication costs. There are also things like UDP support, and who knows what else (I'm a simple app developer, I don't know the network details).
I'm curious whether, say in an IT environment where both custom and third-party services live, either on-prem or in the cloud, it is common to have organic architectures and microservice-based applications operating side by side. If so, what are good concepts for making this work with the least effort?
btw: the font on https://glasnostic.com is thin and hard to read :)
In most cases, organic architecture would include microservice-based applications. At a very high level, organic architecture is a style that IT adopts when the multitude of business needs make the organization itself "organic". An outward sign of this transformation is the emergence of parallel teams and independent release cycles. When this happens, applications take on the role of digital capabilities that can be recombined to support new products and services and the difference between an "application" and a "service" fades away. So, in the IT environment you describe, microservice-based applications would be part of the organic architecture.
The key issue in composing capabilities in such a way is the emergence of complex behaviors. For instance, composed fan-out patterns tend to be non-linear, large-scale and highly dynamic. To fully realize the potential of organic architecture, you'll need a way to control these behaviors. This is what we do.
My last three companies were moving from a monolith to microservices. In every case, the rationale for the move was little more than "monolith development is slow, therefore microservices".
Unfortunately for all of them, the true reasons for the slow development was poor separation of concerns, insufficient and brittle tests, and years of accumulated hacky shortcuts to ship new features "faster".
The folks pushing the microservice panacea were "proven right" in that development was much faster... initially. Without years of cruft in their way, devs were able to churn out new stuff (after a significant time spent ramping up on infrastructure code).
Eventually, and in every case, the fundamental flaws resurfaced, this time increasing in severity now that changing the system often required multiple service changes. As well, a host of new kinds of problems emerged: distributed transactions, network failures, client/server versioning, debugging across services, etc.
I'm now a microservice skeptic. Whatever your solution, it should be at the same "level" as the problem. A massive change in technical architecture will not solve the accumulated cost of your poor engineering quality choices.
You absolutely can bootstrap your own cluster of servers running K8s.
The equivalent for Linkerd 2 is Conduit (it was its own app before and is now entirely part of Linkerd 2).
From what I've seen, Linkerd 2 doesn't offer as many features as Istio or Linkerd 1 for now (no circuit breaker, for example). From the few issues I've seen on GitHub, it seems a bit immature for a production environment.
Between the fact that I only started looking into this last month and that it changes at a rate that is really hard to follow, I might be wrong...
I must be losing touch with tech!
DevOps Community: "Hold my beer."
Hold true to the course of solid fundamentals and learn what you need, when you need it; not a moment sooner!