Microservices, Containers and Kubernetes in Ten Minutes (gravitational.com)
384 points by old-gregg 13 days ago | 85 comments

There's so much to comment on in this article. First off, Gravity looks like an interesting product; it's a shame a blog article on that hasn't made it onto the front page.

I hate to be nit-picky but I do feel that articles like this do more harm than good by oversimplifying a fairly advanced architectural pattern by downplaying the testing, resiliency and deployment approaches.

* easier to test - this is flat out wrong. Your application or system logic is spread out across multiple process boundaries - the only way to test it is to deploy the dependent services on your dev machine and test the set, or to design your services so that all the application logic can run in a single process. Think Spark and Spark workers, which move from a threaded to a distributed model through configuration. Application logic can be tested with this approach, but not necessarily system behaviour (which can be simulated).

* rapid and flexible deployment models - in a large microservices fleet where code is managed by multiple teams and sits in several repositories, the dependencies are not explicit (I cannot do a "find usages" on an API call and see all the microservices that are using it) - so deployments can be decoupled, but there tend to be lots of breakages unless you have sophisticated, well-thought-out testing (see first point).

* resiliency - I'm not going to expand on that because the 1st and 2nd points allude to a brittle system and hence reduced resiliency. There are also data and transactional boundaries across services which need to be addressed too. To be fair, there is a hint at solving this problem through something like Kafka, but it isn't called out.
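
A minimal sketch of the idea from the first bullet, where the same application logic runs in a single process or against a deployed service depending on configuration. Everything named here (`PricingService`, `InProcessPricing`, `HttpPricing`, `build_pricing`, the `PRICING_MODE` and `PRICING_URL` settings, the `/price` endpoint) is hypothetical and only illustrates the shape, not any particular framework:

```python
import json
import urllib.parse
import urllib.request
from typing import Protocol


class PricingService(Protocol):
    """Interface the application logic depends on (hypothetical)."""

    def price(self, sku: str, quantity: int) -> int: ...


class InProcessPricing:
    """Runs the pricing logic inside the caller's process."""

    UNIT_PRICES = {"apple": 30, "pear": 45}  # illustrative data only

    def price(self, sku: str, quantity: int) -> int:
        return self.UNIT_PRICES[sku] * quantity


class HttpPricing:
    """Calls the same logic deployed as a separate service (assumed endpoint)."""

    def __init__(self, base_url: str) -> None:
        self.base_url = base_url

    def price(self, sku: str, quantity: int) -> int:
        query = urllib.parse.urlencode({"sku": sku, "quantity": quantity})
        with urllib.request.urlopen(f"{self.base_url}/price?{query}", timeout=2) as resp:
            return json.load(resp)["total"]


def build_pricing(config: dict) -> PricingService:
    """Choose the transport from configuration; callers cannot tell which they got."""
    if config.get("PRICING_MODE", "local") == "local":
        return InProcessPricing()
    return HttpPricing(config["PRICING_URL"])


def order_total(pricing: PricingService, items: dict) -> int:
    """Application logic under test; it only sees the interface."""
    return sum(pricing.price(sku, qty) for sku, qty in items.items())


if __name__ == "__main__":
    pricing = build_pricing({"PRICING_MODE": "local"})
    print(order_total(pricing, {"apple": 2, "pear": 1}))
```

This covers the application logic only. As the bullet says, system behaviour such as timeouts and partial failures still has to be simulated or tested against real deployments.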

Microservices can be simpler with good tooling. Kubernetes is only a very small part of it. A follow-up article could clarify what tooling can provide 1) better testing 2) reliable deployments 3) greater resiliency.

OP here. Largely agree with your clarifications. Your comment is a fantastic example of how nuanced the real world is. Maybe it was a bit naive on my part to even try to compress these topics into a 10 minute read, but for anybody who wants to learn more, comments like these are pure gold. (and gave me a few ideas for what to write about) Thank you.

As a total amateur I found your post a very helpful introduction. Thanks!

To me, this isn't nit-picky, but nuanced. From my experience, Kubernetes and the microservice architecture are essentially a technical substrate for an organisational problem.

I'm not 100% convinced there's inherent technical value until you're running at the kind of scale the big hitters do, but by then you're also looking at the hosting solution as a whole, not a deployment in a cloud. Docker, Kubernetes, offer the illusion of being easy but as soon as you start getting serious, they are anything but.

What it does do, for smaller businesses, is create a more-or-less 1:1 mapping between a team structure and a deployment pipeline. At the end of it you're distributing functions in a codebase and depending on the network for resiliency, as opposed to the language or the VM.

At the same time, the knowledge of these systems has great value because those skills are in demand now.

Docker and/or Dokku are pretty simple and easy to get running. When you outgrow a single-server deployment, that's when it gets more complicated. I like docker in general because it allows me to test/script creation and teardown, even locally.

The biggest shortcoming right now is that Windows support for both Windows and Linux containers (LCOW) is really immature, and Windows containers in particular aren't mature enough at this point. I'm dealing with this for some testing scenarios where containers are simpler than dedicated VMs.

Kubernetes allows for an opinionated way to deploy your applications with many good practices being a part of the deployment cycle. I would say that the tooling matters, and having _some_ common way of deploying applications is a great way to enable developers who may only want to focus on their code but also want more control over their deployment pipeline. In summary:

* enables dev teams to do their own ops (mostly) instead of having a centralized ops team which needs to set things up and be on call for stuff when it breaks. This allows scaling up your organization, since once a couple of teams are set up, you can simply keep adding more and more teams and replicate the same deployment pipeline to all

* trains development teams to do more ops level stuff without getting lost in the weeds. They no longer need to worry about DNS, SSL Certs, LoadBalancers etc., they just use kubernetes ingress and services. Of course they can dig more if they want to but the defaults appear to be sensible enough for most

* promotes a cattle v/s pets model and allows teams to rapidly iterate without worrying about breaking the system in unknown ways

There are different types of testing.

    * Manual Testing
    * Automated Unit Testing
    * Black Box Testing
    * Smoke Testing
    * Functional Systems Testing
With varying layers of complexity... I would say that Unit and Black box testing can be much easier with microservices. Orchestrating smoke tests or functional systems tests is a bit harder.

It also can vary by application design and what tooling you have in place.

It’s not a question of which types of testing are easier, it’s a question of which types of testing are necessary to guarantee the same level of coverage. If you’re testing functionality that crosses a service boundary, you need to do integration testing. If you’re testing functionality that exists within a single service, you can settle for unit tests.

In a monolithic architecture where your only service boundaries are frontend-to-backend and backend-to-datastore, you still need some integration testing but not a ton. If your backend is comprised of a network of microservices, you need to integration test all of those points and, most likely, all of the data access layers that each microservice has.

I'm not quite sure how unit testing can be easier. Instead of stubbing a function call, you have to stub a TCP/IP request. If you think of a "microservice" as a library that runs in a different process and whose dispatch mechanism is via TCP/IP, you can pretty easily see that testing a library is easier. It's even easier if your library is compiled with a language whose types are statically checked because you can guarantee that the library functions match the types of the calls without having to jump through hoops like gRPC.
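
To make the contrast concrete, here is a rough sketch of the two kinds of stub. The `greeting` function, the user payload and the `/users/<id>` path are all made up for illustration. The first stub is a one-line function; the second needs a listening HTTP server, a free port and cleanup:

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Opener that ignores proxy environment variables, so localhost calls stay local.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def greeting(fetch_user, user_id: int) -> str:
    """Code under test: it only needs something that returns a user dict."""
    return f"Hello, {fetch_user(user_id)['name']}!"


def run_with_function_stub() -> str:
    """Library style: the stub is an ordinary function."""
    return greeting(lambda user_id: {"name": "Ada"}, 1)


class StubUserHandler(BaseHTTPRequestHandler):
    """Service style: the stub has to speak HTTP."""

    def do_GET(self) -> None:
        body = json.dumps({"name": "Ada"}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args) -> None:
        pass  # keep test output quiet


def run_with_http_stub() -> str:
    server = HTTPServer(("127.0.0.1", 0), StubUserHandler)  # port 0 picks a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        base = f"http://127.0.0.1:{server.server_port}"

        def fetch_over_http(user_id: int) -> dict:
            with _opener.open(f"{base}/users/{user_id}", timeout=2) as resp:
                return json.load(resp)

        return greeting(fetch_over_http, 1)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(run_with_function_stub())
    print(run_with_http_stub())
```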

If the TCP/IP call and mechanisms are abstracted out into other modules for FaaS, then you only need to test your single, isolated function, what it touches and returns. You don't even need to test the entire platform.

Structurally you can still use a mono-repo for all your deployed functions so that they can even share some test functionality.

I am reading through that section thinking similar things.

An application designed with microservices can be thought of as a monolith with unreliable network calls between the components. That is going to make most things more difficult not easier. Sure microservices might encourage you to design your application in a more modular way, but that's a bit of a strawman argument.

First - always ask whether a monolith would solve your problem.

Second - would a mixed architecture solve your problem?

Third - would microservices solve your problem better?

Not every company has to be a Netflix or Zalando or Google...

I think we're finally at the point where people should stop jumping on buzzwords and do the hard work of going through the pros and cons of every solution.

The advantage of jumping on the buzzwords is that you have many others with you.

I had to build a SaaS platform for hosting a business webapp, with the particularity that every client had their own separate codebase and database (no multitenancy). It was a simple django-like stateless app in front of Postgres and we had a low availability SLA (99.5%), so I thought it was simple: just deploy each app to one of a pool of servers, with a single Nginx and Postgres instance on each server, then point the client's (sub)domain to it.

In the end, it worked fine, but since I couldn't find anyone doing the same, it meant I had to write a whole bunch of custom tooling for deployment, management, monitoring, backups, etc.

Were I starting now and decided on Kubernetes, the solution would be more complex and less efficient, but it would mean we would have one-click deployment and pretty dashboards and such in a couple of days instead of many weeks. And if we had to bring someone in, they wouldn't have to learn my custom system.

Buzzwords are a kind of poor-man's standards, they provide some sort of common ground among many people.

So instead you chose to still have to customize deployments and network setup, plus a lot more complex security config, to do things seemingly "fast". Is that company a startup? If not, it will cost them a lot later on.

You are still going to need to customize kubernetes deployments.

It seems people too often compare the ideal cases between two choices, but completely discount the likely cases. Such seems the situation when debating microservices vs monoliths.

Exactly. In the real world, most projects need a mixed architecture.

I totally agree. Deploying microservices and running k8s sound easy until you actually do it. For example, just see this section of k8s docs about exposing services: https://kubernetes.io/docs/concepts/services-networking/serv... . You need to understand many different concepts first to get this right. However, I think once you cross that hurdle, the traditionally harder stuff like auto scaling and rolling upgrades becomes relatively easier.

However, I would say that it's really early days for K8S and the ecosystem around it. As long as K8S does not try to solve every problem in the world and focuses on the problems it's designed to solve, things will get easier and then maybe a 60-min video can do some justice. ;-)

As somebody who just tried microservices in their latest project, I especially agree with the points you made about deployment and testing. Testing can be quite hard; you essentially have to think about it from the start. Testing is much, much easier if you keep everything in one "monolith".

That doesn't mean microservices are bad. It is incredibly good to have a well-defined border between services, if you manage to draw the border right. And the extra amount of planning that goes into thinking about where to draw the borders and how to design the interfaces is already a big win. They can really help you keep a feeling of freedom when it comes to changes, because every service is just a box with a manageable number of inputs and outputs. Or at least it should be.

AWS SAM nails this so hard.

They straight up have hundreds of pre-generated tests depending on what your function's input is, and you can run _every_ test locally.

It is possible to build balls of mud either with microservices or larger-sized services.

There are pros and cons to both.

But when "easier testing" (from OP) is said to be an advantage of microservices specifically...I am not sure what to say.

I mean it is not like every test has to test the entire monolith. And it is not like testing each microservice in isolation is always going to be sufficient.

It is entirely orthogonal. Of course a well-designed microservice is easier to test than a monolithic ball of mud, but a well designed monolith is also easier to test than a poorly designed spaghetti of microservices.

It is as if some people think "good code" and "microservices" are synonyms. No. They are orthogonal.

Industry is always going in circles. Fads come and go.

In many ways, Cloud Functions are very similar to a horizontally scalable stateless monolith backend. When you break up services small enough, the "monolith" arises again as simply the sum of what you deploy in an organization.

> when "easier testing" (from OP) is said to be an advantage of microservices specifically...I am not sure what to say.

I think the service boundary ends up being a very good place to inject "fakes", because that boundary is not artificial like it is when you fake out parts of a monolith. The RPC service has two methods and that is the only way _anything_ can interact with the real service, so faking out those two RPC calls let you write focused tests easily.
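
Something like the following sketch is what I mean. The `UserStore` service, its two methods and the `rename_user` caller are invented for the example; the point is only that the fake replaces the entire surface of the service, because the two RPC methods are all there is:

```python
from typing import Protocol


class UserStore(Protocol):
    """The whole public surface of the (hypothetical) service: two RPC methods."""

    def get_user(self, user_id: str) -> dict: ...

    def put_user(self, user_id: str, user: dict) -> None: ...


class FakeUserStore:
    """In-memory fake that honours the same two-method contract."""

    def __init__(self) -> None:
        self.users: dict = {}

    def get_user(self, user_id: str) -> dict:
        return self.users[user_id]

    def put_user(self, user_id: str, user: dict) -> None:
        self.users[user_id] = dict(user)


def rename_user(store: UserStore, user_id: str, new_name: str) -> dict:
    """Caller logic under test; it can reach the service only through the two methods."""
    user = dict(store.get_user(user_id))
    user["name"] = new_name
    store.put_user(user_id, user)
    return user


if __name__ == "__main__":
    store = FakeUserStore()
    store.put_user("u1", {"name": "old", "email": "a@example.com"})
    print(rename_user(store, "u1", "new"))
```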

Obviously you can have service boundaries in monolithic applications, but they are easy to ignore "just this once". By having an API boundary enforced in production, you avoid these problems (or the workarounds become more creative, but that's easier to say no to).

The average monolithic app sitting in production is quite difficult to test because no thought is given to internal APIs. Things rendering HTML make literal database queries, and so the only way to test things is to just run the whole monolith against a local database. That ends up being slow and flaky.

Basically, microservices force code to do less. When code does less, it's easier to test. When any HTML page can write to your database, you have a mess on your hands. That is totally orthogonal to microservices, but microservices enforce your API contract in production, which I find valuable.

> The RPC service has two methods and that is the only way _anything_ can interact with the real service, so faking out those two RPC calls let you write focused tests easily.

Except now you have to cover cases like “the service is down”, “the service is too latent”, “the actual outputs from the service differ from the documentation”, “the service is behind a reverse proxy that mutates headers in a surprising way”, “you are behind a reverse proxy that mutates headers in a surprising way”, etc.

Yes, this is a set of problems that you need to care about. This is why there are so many Kubernetes-types working on "the service mesh". They all do a little bit too much for my taste (linkerd wants to provide its own Grafana dashboard and touch every TCP connection so as to measure it, Istio wants to molest _everything_ in your cluster and make even basic network connections be a full-fledged Kubernetes API object), but there are more reasonable solutions available. Envoy, specifically, is very good. You can use it for outgoing or incoming requests, and it can be configured to, say, retry retryable gRPC requests (solving your "the service is down" issue, at least if only one replica is down). It also scales cleanly to the amount of complexity you want; you can define your clusters in a file... or write a program that speaks a certain gRPC API so it can discover them for you automatically.
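
To be clear about what "retry retryable requests" buys you, here is a plain-Python sketch of the behaviour. It is not Envoy configuration or any Envoy API; the exception classes, `call_with_retries` and `FlakyBackend` are made-up names, with `Unavailable` standing in for a retryable status and the fake backend simulating one replica being down:

```python
import time


class Unavailable(Exception):
    """Stands in for a retryable status such as gRPC UNAVAILABLE."""


class InvalidArgument(Exception):
    """Stands in for a non-retryable status; it is never retried below."""


def call_with_retries(call, attempts: int = 3, base_delay: float = 0.01, sleep=time.sleep):
    """Retry `call` on retryable errors only, with exponential backoff between tries."""
    for attempt in range(attempts):
        try:
            return call()
        except Unavailable:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            sleep(base_delay * (2 ** attempt))


class FlakyBackend:
    """Fake backend whose first `failures` calls land on a 'down' replica."""

    def __init__(self, failures: int) -> None:
        self.failures = failures
        self.calls = 0

    def get(self) -> str:
        self.calls += 1
        if self.calls <= self.failures:
            raise Unavailable("replica down")
        return "ok"


if __name__ == "__main__":
    backend = FlakyBackend(failures=1)
    print(call_with_retries(backend.get), backend.calls)
```

The same fake also covers the "service is down" test case from the parent comment: set `failures` higher than `attempts` and assert that the caller sees the error.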

Latency is always going to be a problem and moving things to another computer certainly doesn't decrease it. Everything these days has pretty good support for observability; opentracing to inspect slow requests, prometheus to see how things are doing in general. You can get a handle on it and it doesn't cost much. My team is moving from a PHP monolith that has so much framework code that an empty HTTP response takes 100ms minimum to generate. None of our microservices are that slow, even when 3 or 4 backends and gRPC-web <-> http translation are involved. But it does set an upper bound and that's a reasonable concern to have.

Monolithic apps are not freed from latency; they read from disk, they talk to a database server, etc. So application developers already have this under control (or have filed it away in the "don't care" bucket); for example, every function in Go that does I/O probably takes a context. The context times out and it cancels your local operation just as easily as it cancels a remote operation. So I don't think this is a new concern, or one that people should be too afraid of, other than getting that last bit of performance out of the system.
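
The same idea sketched in Python with `asyncio.wait_for` rather than Go's `context` package (a swapped-in analogue, not the Go API). The two coroutines are placeholders: `asyncio.sleep` stands in for disk I/O and for a slow network hop, and the durations are arbitrary:

```python
import asyncio


async def read_local_file() -> str:
    await asyncio.sleep(0.01)  # stands in for local disk I/O
    return "local"


async def call_remote_service() -> str:
    await asyncio.sleep(5)  # stands in for a slow remote call
    return "remote"


async def with_deadline(operation, seconds: float) -> str:
    """Apply one deadline policy to any awaitable, local or remote."""
    try:
        return await asyncio.wait_for(operation(), timeout=seconds)
    except asyncio.TimeoutError:
        return "timed out"


async def main() -> list:
    return [
        await with_deadline(read_local_file, 0.5),
        await with_deadline(call_remote_service, 0.05),
    ]


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The calling code does not care whether the cancelled operation was local or remote, which is the point: the deadline handling is already there in a well-written monolith.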

As for proxies inside your cluster intercepting traffic to other pods and mutating headers in surprising ways... I recommend not running one of those. (Yes, those magical service meshes are some of those. If you don't know why you need one, I recommend living without one until you know you need one. It may be never!)

If you don't control your network, you won't have good luck talking to services on the network. It is orthogonal to microservices; you will be using the network more, so one that's bad will hurt you more. But in general, if you control your infrastructure and the nodes on it, you won't run into a "reverse proxy that mutates headers in a surprising way”. If there is one of those, I recommend killing it rather than not splitting up logical services into separate jobs.

FYI Linkerd's Grafana dashboards are purely additive. If you don't like them, ignore them.

We also work hard to make Linkerd incremental. It doesn't "touch every TCP connection" so much as "touch every TCP connection that you explicitly tell it to".

Not sure I agree. Running things in production is not free. There is resource usage, attack surface, learning curve, and the chance for something to go wrong. Especially prometheus; it is a high resource usage service using 2G of RAM on nodes that only have 3. I am happy to have one of those in my cluster, but an extra one "just because" actually costs something.

Pods are not free either, at least not on Amazon. You get 18 pods per node (at least on t3.medium, which is what I use), and daemonsets quickly eat into that and make every additional node less useful as you increase cluster capacity. In a world where you're already running aws-node, kube-proxy, jaeger-agent, prometheus-node-exporter, and fluentd, you have to be judicious about the value of additional per-node services. I see the benefit in linkerd, but not all the extra stuff it comes with. Having an envoy cluster do gRPC load-balancing between services is enough; yes, you can't tcpdump the streams, it doesn't transparently add TLS, it doesn't configure itself through Kubernetes objects, and it doesn't quite insert the observability that linkerd does... but it does work well and comes with less tooling and resource cost.
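
Back-of-the-envelope version of that pod budget, using the figures from this comment as given (18 pods per node, the five daemonsets listed); the per-node limit is taken from the comment, not checked against AWS documentation, and `schedulable_pods` is just a helper name for the example:

```python
def schedulable_pods(max_pods_per_node: int, daemonsets: int, nodes: int) -> int:
    """Pods left for application workloads once every node runs each daemonset."""
    return (max_pods_per_node - daemonsets) * nodes


if __name__ == "__main__":
    per_node_limit = 18  # figure quoted in the comment for t3.medium; an assumption here
    daemonsets = ["aws-node", "kube-proxy", "jaeger-agent",
                  "prometheus-node-exporter", "fluentd"]
    overhead = len(daemonsets) / per_node_limit
    for nodes in (1, 3, 10):
        usable = schedulable_pods(per_node_limit, len(daemonsets), nodes)
        print(f"{nodes} node(s): {usable} usable pods, {overhead:.0%} spent on daemonsets")
```

With five daemonsets, 13 of the 18 slots on each node are left for applications, and every extra per-node agent takes one more slot on every node.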

I like this way of thinking about things. I do wonder if this is one of those areas where the utility/reward is very unevenly distributed across team skill levels; if your team is very disciplined, and has strong alignment on architecture and internal APIs, then there are some significant benefits to testing a monolithic system (e.g. I can mock time and still test the full monolithic system end-to-end).

However if the team is undisciplined (or the org is just too big to achieve tight alignment), then having some enforced architectural boundaries (bounded contexts) inside which the complexity is capped will at least limit the scope of poor architecture inside a specific service, and generally puts a floor on unintended coupling.

There's another dimension too though; I think that microservices require a much higher level of devops / CI/CD maturity to do well. So the maximum value might come from poorly aligned but very ops-savvy orgs, whereas the minimal value would come from highly aligned and disciplined orgs that don't have a strong devops/automation skillset.

Not sure which dimension is more important though.

I think the article nails it on the head in the first paragraph

>Change their database schema.

>Release their code to production quickly and often.

>Use development tools like programming languages or data stores of their choice.

>Make their own trade-offs between computing resources and developer productivity.

>Have a preference for maintenance/monitoring of their functionality.

Every single one of those is an organizational problem. Microservices are fantastic for solving the problem of "very large organization trying to manage multiple releases from multiple teams with lots of interconnecting dependencies". But they solve that problem with a giant flaming chainsaw with a greased-up handle. It works, but don't use it unless you have to.

You have an assumption that the initial design/division of responsibility into microservices was perfect.

If not, then no, the unintended coupling is not limited to within a service, it will find its way to the API between the services.

IMO if you are writing backend code where it is difficult to insert fakes at any point where you need to test, then all bets are off anyway. Doesn't matter if you use monoliths or microservices. If you can't abstract out some dependency and insert a fake without actually hitting the network layer then you shouldn't be writing backend code.

It is more about test-driven development than anything else.

You need a policy to do micro-services in the first place. You can also have policies (enforced by peer review) to have well-tested code and internal API boundaries. It doesn't come by itself, but neither do microservices.

At the end of the day you need development processes driven primarily by automated tests and good coders. Bad coders will make a total spaghetti mess of microservices too, in fact the consequences of microservices can be catastrophic and crippling if the developers don't properly understand how to write distributed systems etc. (trust me I've seen it happen).

Difference is, when you have that micro-service spaghetti mess, refactoring it to something with clean nice boundaries can be harder than with a monolith -- since the position of the boundaries tend to be less malleable. (This varies depending on how the microservices are tested & deployed, and also, how "malleable" and how "hard" a boundary is is really orthogonal.)

If micro-services do anything, it is to raise the bar for which developers can succeed in working with them. Put those same developers to work on a monolith though, and I think the results are the same. It's a filter for good developers, more than the things you say, IMO.

Like I said, the value of microservices is that your design is enforced at run time, not just at policy-setting or code review time. There is a lot of value in that; coupling issues can't just "sneak" in.

But behold the coupled mess you get if your up front design (the boundaries between the microservices) wasn't perfectly chosen or isn't robust against spec changes.

Can easily happen for instance when architects that are too removed from the actual code are choosing the boundaries. Then developers have to work around it.

There will still be coupling issues. It just goes over the network, and is a lot harder to refactor.


I think this is something like a No True Scotsman fallacy. Somehow people see a tightly coupled mess distributed over the network as "not real microservices". Ok, but the people who made it intended to make real microservices, and that is what is important.

> It is entirely orthogonal. Of course a well-designed microservice is easier to test than a monolithic ball of mud, but a well designed monolith is also easier to test than a poorly designed spaghetti of microservices.

I would also add that a well-designed monolith is inherently easier to test than a functionally equivalent well-designed microservices architecture.

It just goes to show how few folks truly test in isolation.

I don’t always, but I’ve never had the luxury of working on a greenfield project. With legacy code it’s a mammoth task to pull apart the spaghetti (more like a drawer of tangled headphones) to get the ideal test.

A better way of stating it, "It goes to show how few folks truly value testing in isolation" that is, having good tests in isolation is seen as a nice ideal to pursue, but delivering working code first trumps all.

"How do you know it works?" is a common reply to this line of reasoning, obviously we all have an idea of what "working" looks like, it's just whether or not we have gone the extra step of formalizing it in the form of tests.

Could you elaborate on what you mean with this? It seems like you are hinting at something which I apparently did not pick up.

Testing components in isolation is useful to ensure that your components work as expected. But it will never ensure that your whole system works as expected, unless the interaction between your components is so simple as to be obviously correct (which, in my experience, is almost never the case).

> “It is as if some people think "good code" and "microservices" are synonyms. No. They are orthogonal.”

I disagree very strongly, and it is also part of why I believe monorepos are generally a mistake.

Microservices are a natural extension of things like decoupling and Single Responsibility Principle.

Just because you superficially could achieve similar effects with gargantuan amounts of tooling and imposed conventions in a monolith class or something is absolutely no type of refutation of the fact that modularity and separated boundaries between encapsulations of units of behavior represent a better way to organize and structure the design.

It is no different and there is no leakiness to the same abstraction when you move to discuss services instead of classes or source code units, or polyrepos v monorepos. The abstraction definitely can become leaky if taken too far in other domains, it just happens that the abstraction is depicting precisely the same organizational complexity properties in the case of source code -> service boundaries -> repository boundaries.

Well, but some services composed of microservices turn into a distributed spaghetti monolith though.

They only "naturally decouple" if you draw the lines between units correctly in the first place. And if you are able to do that well, that's the most important step in making any code turn out well regardless of the size of the services and how much of the boundaries are in the same class/repo/process/service. It also correlates with the "tendency to cheat" within a single service.

There ARE real advantages to micro-services, sure. But you trade them against an ability to quickly refactor if it turns out you drew the lines completely wrong initially. Or perhaps you end up with something that becomes very complex that could in fact have been short-circuited and replaced by 5% as much code by looking at the problem from an entirely different angle -- which you never do because of the pattern that has settled in how the micro-services were divided.

(At the end of the day, the code that is simplest to maintain is the one you don't need to run..)

So I maintain that it's a trade-off.

This seems relevant: https://xkcd.com/2044/

What you propose ends up sounding like this from my point of view:

Salad is generally better for your health than red meat. However, some people eat so much salad, with so much dressing, that it ends up being worse than red meat. Meanwhile, with great care about meal planning and moderation, some people stay pretty healthy eating red meat.

Therefore red meat is actually healthier than salad.

Currently trying to debug some microservice based system: I read your text with red meat being the microservices.

Things are always perfect when reading a simple blog post presenting the happy path. How to check what services are down, how to react to the fact, how to come back? Nah. The fact your µs RAM access just became ms network access? Don't care. Someone just decided to change the interface of their microservice so 2 others are not compatible anymore? That's just "microservice done bad". Being able to see the flow of things and add breakpoint where you need it? Nope.

It is funny to see this kind of problem when you've already experienced them in the embedded world with software components in cars or just in distributed computing.

Most applications will never see the kind of scale where adding that kind of code and tooling overhead has any RoI. So you end up with products released too late or products so brittle you may as well not have launched.

> “Currently trying to debug some microservice based system: I read your text with red meat being the microservices.”

This doesn’t make much sense, unless you’re debugging normal-to-high quality code microservices, and still find the code to be worse than average case monolith services.

He is explaining it further down! For instance, he cannot set a breakpoint and follow execution flow (because suddenly the flow resumes in another microservice)

It seems from your comment that you assume one can always work with only one service, and not need to consider the whole system of services acting together. That is naive.

I do not assume you can work only with one service. But what you brought up from the parent comment about breakpoints makes no sense.

It’s like complaining that someone mocked out a complex submodule in a unit test, so your breakpoint descends into a mock instead of the real thing. You’re mistakenly wanting the wrong thing.

Testing that spans service boundaries is a known entity. Most of the time you want to be testing one service in isolation and mocking out any dependency calls it makes to other services.

But in cases when you want to do integration or acceptance testing involving multiple live services, that’s fine too. You could for instance run the suite via something like docker-compose.

But if you want the debugger to step through the internals of some effectively third party dependency, that’s just a poor approach to debugging. You need to mock that away and isolate whether the third party entity (whether it’s an installed package, separate service, whatever) is really to blame before descending to debugging in that entity.

Imagine if someone is debugging a data processing pipeline task. It makes a service call to a remote database. You really think your debugger should follow the service call and step through the database’s code? That’s a terrible way to debug. That example extends perfectly well no matter what the service call is into, whether it’s local or remote, whether it’s in the same language or runtime or not...

Well in that context, red meat is obviously healthier if you know up front that the people in question are going to pile on dressing...

Context is everything.

I am mainly advocating that it depends on the kind of coders on the teams, how many teams, how sure you are about the up front design / boundaries between services (I have seen such boundaries drawn VERY wrong, so wrong that nothing else ever mattered), how sure you are about the spec, etc

Start with monolith and refactor smaller services gradually as the design solidifies...

> Well in that context, red meat is obviously healthier if you know up front that the people in question are going to pile on dressing...

I think you're being too charitable accepting this analogy at all - somehow microservices are presented as something obviously and inherently better (salad) versus non-microservice approach (red meat).

If we're going down the route of silly analogies, which are a terrible way to argue anything, how about this:

Non-microservice architectures are a normal diet of meat, fish, vegetables, grains and sugar which you can keep under control if you have any idea of what you're doing. Microservices are a gluten-free diet - very popular for no good reason, it makes everything harder and you should only pursue it if you have a very good reason to and you understand the cons.

> “somehow microservices are presented as something obviously and inherently better (salad) versus non-microservice approach (red meat).”

Yes, this is called the Single Responsibility Principle, in this case applied to service architecture. More generally it is a property of modularity and decoupling.

All else equal then satisfying these properties is better than not satisfying them.

The all else equal assumption clearly holds in practice, where people write equally awful code in both cases, so microservices introduce no additional tech debt yet they do introduce SRP and modularity benefits.

Could you find specific examples of monolith services with small enough tech debt that they outperform some specific other example using microservices? Of course.

Does this matter for reasoning more generally about which pattern is better ceteris paribus? Very little, probably not at all.

I am not convinced that microservices always cause less coupling as you claim.

Sometimes the coupling just jumps into the network/API layer. (Why would it not?) This can happen unless your initial division into services was perfect (and if you indeed have that much foresight, there would be no reason why a monolith would accrue tech debt either, there would be no temptation to add debt).

The main difference is that when you discover that the initial division into "Responsibilities" was wrong, it is easier to change and come up with another set of "Responsibilities" in a monolith and deploy the refactored service as a unit.

You talk as if you can just initially define the Single Responsibility then things will be fine. But where I have seen real failure is in identifying those initial responsibilities and choosing the wrong way to look at the total system.

My experience is with monoliths having less coupling, and I suspect the cause is that monoliths are easier to refactor as the requirements change; refactoring the very structure of the service mesh while keeping things running is such a big task that one is more tempted to start adding hacks in the API layer.

Yes, one is then violating the Single Responsibility Principle. But if an organization sits there and needs to change the requirements within some deadline -- it is not going to spend 3x the cost and time because a hack violates some principle -- and the alternative is the wrong service taking on some extra work.

If you want to retort "but then they are doing microservices wrong" then I say No True Scotsman. And one could say exactly the same about monolith tech debt too..

> “The main difference is that when you discover that the initial division into "Responsibilities" was wrong, it is easier to change and come up with another set of "Responsibilities" in a monolith and deploy the refactored service as a unit.”

This is generally not true in my experience, because the degree of implementation-sharing and reliance on common leaked abstractions is so high in monolith codebases.

Through great concerted effort, some highly disciplined teams might not fall into that ubiquitous problem of monoliths and for those exceedingly rare teams your way of thinking could work. But this is so rare it is inapplicable when considering which approach to use in general cases.

I’ll also say that I’ve worked on several monolith services and several microservices stored in dozens to hundreds of separate repos. The tooling cost to make either pattern work at scale was the same, but refactoring was so much easier with polyrepos that each isolate one service. Just spin up a new repo and redraw the service boundaries.

Finally, many times services become associated with a fixed, versioned API, and must support backward compatibility for long periods of time. In these cases, redrawing service boundaries is usually not desirable regardless of initial mistakes, until you hit a point when you can release a new major version of the services. In the polyservice / polyrepo case, this is very easy, and the repos and separated code for v2 need not have anything to do at all with v1, and can be developed entirely in parallel, with mocked out assumptions of service boundaries or reliance on legacy v1 stuff.

Actually I wonder if your reasoning is circular..

If you saw a coupled mess of microservices with a lot of technical debt you would probably say that it is not "Microservices" because it is violating the Single Responsibility Principle all over the place. They just tried to do microservices -- but didn't manage to -- so do you then count it as a failure of microservices thinking?

If not then propose a new architectural alternative: The Debt-Free SRP Monolith!!

Sadly, organizations cannot choose to either make an SRP Microservice system or a Debt-Free SRP Monolith. They can only attempt. And I am yet to be convinced that attempting Microservices is that correlated with achieving SRP.

I don’t think it’s circular at all. I’m saying that if the level of tech debt is held equal between both a microservice design and an analogous monolith design, then the fact that the microservice design has greater properties of decoupling, isolation and modularity make it de facto better.

Obviously if the baseline levels of tech debt or poor implementation are not equal, all bets are off.

> In fact, you may discover that you already have a dozen of microservices deployed at your organization.

Very true, robotics/avionics/etc. have been using "microservices" for a long time, nothing new here. Robot Operating System (ROS), Data Distribution Service (DDS), Lightweight Communications and Marshalling (LCM), and others all encourage that architecture.

Yep... any enterprise is likely full of them:

  ldap: authentication microservice
  syslog: logging microservice
  smtp: messaging microservice
  smb/nfs: file storage microservice

They haven't been called that for the past 40 years, but that's what they are.

Back in the day it was called Service Oriented Architecture. Then CORBA happened and that became a curse word, and I guess we will repeat this when HTTP/4.0 features RPC?

CORBA is a bit different: that's more about making remote code appear as if it were local, and having the network disappear. SOA isn't necessarily RPC: it could be message passing, for example. Async invocation like that in RPC is much tougher.

I do think the distinction between "services" and "microservices" is a bit overblown. Clearly something like IMAP is not a microservice - it contains auth, storage, search and a few other pieces. Calling LDAP a microservice is potentially a stretch for similar reasons. But fundamentally, the concerns and architecture of both are extremely similar.

> I do think the distinction between "services" and "microservices" is a bit overblown.

I just realized that adding micro- or nano- or -oriented is a great way to create a buzzword in the current climate. Say, microsecurity, nanolearning, or anger-oriented user interface.

Except that they don't really talk to each other and if they do it's using custom protocols. Also I wouldn't call those "micro".

They don't talk to each other? So you've never seen an smtp server store its spool on an nfs mount, authenticate against ldap, and log to a remote syslog server?

A microservice doesn't have to be micro! They are oftentimes a single bounded context in domain driven design, which can be huge.

So, a service, then?

Don't confuse applications and protocols with microservices.

Microservices are services within an application.

There's every chance your MX MTA authenticates people against LDAP, uses LDAP again to figure out where to forward the mail to another system for final account storage, then the MDA stores the message on a clustered file system or an object store, then tells the logging system to log the delivery. The LDAP systems for authentication and for where the user's mail lives might be separate instances.

Then the MUA connects to an IMAP proxy which talks to LDAP to authenticate and to determine where the messages for this user are stored (again, possibly different LDAP instances), then connects to an IMAP backend that retrieves the data from a clustered file system or object store. The IMAP proxy, IMAP backend, and MDA are separate systems. The object store is, likewise, a separate system.

Meanwhile some of your users are using a webmail client as their MUA. That talks to an outbound-only MTA and the IMAP proxy, but it may talk directly to LDAP for authentication rather than authenticating to the mail servers first. It can pull user preferences from LDAP. It pulls their contact book from LDAP. These might be three different LDAP instances. It has a calendar app in the same page in another tab, but that talks instead to a separate CalDAV server. The folder pane, which updates with the number of unread messages in each folder, updates through a different backend process on a different web server from the listing of mail in your current folder, which is a separate web server from the one that just fetched the content of the highlighted message into your preview pane.

Meanwhile, half of these systems actually forward through another MTA which makes no final delivery decision itself but scans for spam scoring. Those messages that make it through the filtering get forwarded to another service which only scans for malware. Then those messages which pass go to the system that forwards the mail for final delivery to the user or to the remote party's MX server.

All of these systems need their timestamps correct, so they all talk to an NTP service running alone on a VM or container that does nothing else.

All of these systems send their logs to a central logging cluster via a defined protocol. The logging servers do nothing else.

Just because your mail server might run qmail, Courier, Amavis, SpamAssassin, procmail, and mutt on one box with local storage and local logging doesn't mean that's how mail is done at scale. It's pretty clear to me how if you think of "email" as the application that it is composed of microservices.

A very long comment that ignores the one it purports to reply to...

Microservices are called that because they reside within an app that itself provides a service. Otherwise, of course you could call everything and anything 'microservice' but then the term would become meaningless.

GMail (or any other email offering) is "an app" from many perspectives.

Yes, it is an app. Not a 'microservice'. That's the point.

An app containing many services handling individual, separated aspects of the overall app, connected by queues and standard protocols. Many pieces interchangeable with others. If you hid the fact that it's e-mail and wrapped a bunch of HTTP/gRPC around it, you could easily sell the same architecture to people as "a messenger built on microservice principles".

I'm a ROS user (~7 years) and occasional contributor. One interesting wart you find in the ROS community is that there are two main supported ways of testing your ROS code:

- Unit test the individual functions and classes using gtest or nose (this is built into catkin, the build system).

- Integration test your ROS APIs by bringing up an actual ROS multi-process system and having a designated test node which exits to deliver the pass/fail verdict (this is managed by the "rostest" package).

The much-touted advantage of microservices is supposed to be that you have these obvious service interface boundaries at which to test, but the reality in ROS land is that it's a lot of work to test there. It's work to generate the binaries or playback data, failures are harder to understand, and because you're at the mercy of RPC timing, nothing is deterministic, so your tests end up full of fudge factors and tolerances.

And on top of that, the tests themselves execute way slower, since there's so much more set up and teardown.

Is that a failure in the framework? Not sure, but the end result of it is that you only test at the RPC boundary that functionality which can only be tested there; everything else is stuffed into library functions that can be verified with gtest.
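A rough sketch of that split, in Python with plain nose-style test functions instead of gtest. The function `clamp_velocity` and its limits are made up for illustration and are not part of any ROS package. The point is only that the logic has no ROS imports, so it can be checked deterministically without bringing up any nodes.

```python
def clamp_velocity(requested: float, limit: float) -> float:
    """Hypothetical library function: pure logic, no ROS, no RPC, no timing."""
    if limit < 0:
        raise ValueError("limit must be non-negative")
    return max(-limit, min(limit, requested))


# Test functions that a runner such as nose or pytest would collect.
# Nothing here needs a running multi-process system, so there are no
# tolerances or timeouts to tune.
def test_within_limit_is_unchanged():
    assert clamp_velocity(0.5, 1.0) == 0.5


def test_above_limit_is_clamped():
    assert clamp_velocity(3.0, 1.0) == 1.0


def test_below_negative_limit_is_clamped():
    assert clamp_velocity(-3.0, 1.0) == -1.0


def test_negative_limit_is_rejected():
    try:
        clamp_velocity(1.0, -1.0)
    except ValueError:
        return
    raise AssertionError("expected ValueError")
```

The node itself would then only wire messages to `clamp_velocity`, and that thin wiring is what is left for the slower rostest-style integration test.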

The main question I've always had about microservices has been, how do you handle services that are truly depended on by all other services? For example, the authentication service is depended on by pretty much anything else.

This is not a hypothetical question; I'm actually looking for a better way to solve this right now in Aether's business infrastructure.

This is a problem I've seen at every company I've ever worked at. Everyone starts off thinking they'll have independence, but it's only natural to build on what already exists and thus you end up with complex dependency topologies.

Large software companies like Google/Facebook build their own opinionated frameworks for publishing services, which include declaring dependencies via config files. Internal engines then scrape for these configs and manage the relationship topology across environments. As far as SWEs are concerned, it's like magic.
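To make the idea concrete, here is a toy sketch in Python. It is not how any of those internal frameworks work; the service names and the `DECLARED` mapping are invented, standing in for whatever would be scraped from per-service config files. It uses the standard-library `graphlib` module (Python 3.9+) to turn declared dependencies into a start order, and a reverse lookup to answer "who uses this service?".

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical declarations: service name -> services it depends on.
DECLARED = {
    "auth": [],
    "catalog": ["auth"],
    "basket": ["auth", "catalog"],
    "checkout": ["auth", "basket"],
}


def start_order(declared):
    """Return services ordered so every dependency comes before its users.

    Raises graphlib.CycleError if the declarations contain a cycle.
    """
    return list(TopologicalSorter(declared).static_order())


def dependents_of(declared, target):
    """Return the services that directly declare a dependency on `target`."""
    return sorted(name for name, deps in declared.items() if target in deps)


if __name__ == "__main__":
    print(start_order(DECLARED))
    print(dependents_of(DECLARED, "auth"))
    try:
        start_order({"a": ["b"], "b": ["a"]})
    except CycleError as exc:
        print("cycle detected:", exc.args[1])
```

With explicit declarations like this, the "find usages" question for an API becomes a lookup rather than a search across repositories.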

I'm working on trying to standardize such a framework, https://docs.architect.io, and would love preliminary feedback.

What's the particular issue you're running into? A few general notes though:

- Uptime SLAs will probably be very high for such a central service, so consider HA backing databases and / or read-only replicas. For authn & authz, caching session tokens / API keys / policy decisions / ... is fairly essential to avoid overloading your DB and to keep latency down to acceptable levels.

- Work out what latency budget you can afford given that every user action is going to have to go through this service, possibly multiple times. Stop the world garbage collected languages probably aren't ideal here. Judicious use of caching goes a long way.

- Have load shedding / circuit breaking / per-service quota mechanisms in place to prevent issues cascading around your systems. Exponential back-off is a lifesaver.

- Have good integration tests to catch regressions (functionality & performance), roll-out / roll-back mechanisms to catch the ones you miss. Test these with known-bad changes every once in a while.
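For the back-off and circuit-breaking points, a minimal Python sketch under stated assumptions: the class and function names are invented for illustration, the thresholds are arbitrary, and a production system would more likely use an existing library or a service mesh feature than hand-rolled code like this.

```python
import random
import time


class CircuitOpenError(Exception):
    """Raised while calls are being shed because the dependency looks unhealthy."""


class CircuitBreaker:
    """Opens after `max_failures` consecutive failures, then rejects calls
    until `reset_after` seconds have passed (hypothetical minimal design)."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def call(self, fn):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("dependency marked unhealthy")
            # Cool-down elapsed: allow one trial call through. The failure
            # count is still at the threshold, so one more failure reopens.
            self.opened_at = None
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0
        return result


def backoff_delay(attempt, base=0.1, cap=5.0, rng=random.random):
    """Exponential back-off with full jitter: uniform in [0, min(cap, base * 2^attempt))."""
    return rng() * min(cap, base * (2 ** attempt))


def call_with_retries(fn, attempts=5, base=0.1, cap=5.0, sleep=time.sleep):
    """Call `fn`, retrying failures with jittered exponential back-off."""
    for attempt in range(attempts):
        try:
            return fn()
        except CircuitOpenError:
            raise  # load is being shed; retrying would only add pressure
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(backoff_delay(attempt, base, cap))
```

A caller would typically combine the two, for example `call_with_retries(lambda: breaker.call(fetch_policy))`, where `fetch_policy` is whatever remote call is being protected. The jitter matters because many clients retrying on the same schedule would otherwise hit the recovering service in synchronized waves.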

I've recently been doing a load of development on similar systems, so happy to give my two cents - email is in profile if you'd prefer. The two SRE books are a worthwhile read.

Authentication is a service that is obviously used by other microservices, otherwise it would have no use.

But with microservices coupling is as loose as possible.

Your authentication microservice provides an interface (REST or whatever) that is independent of the underlying implementation, and it does not matter how it is implemented. As long as the interface 'contract' is fulfilled, you can re-implement it from Java to Rails, or move it from here to there if you wish, and that is completely transparent to all other microservices.
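As an in-process stand-in for that idea (a real service contract would live at the network boundary, e.g. as a REST schema), here is a small Python sketch. `AuthService`, `InMemoryAuth`, `PrefixAuth` and `greet` are all hypothetical names; the two classes stand in for "the Java version" and "the Rails rewrite".

```python
from typing import Dict, Optional, Protocol


class AuthService(Protocol):
    """The contract callers depend on (hypothetical shape)."""

    def verify(self, token: str) -> Optional[str]:
        """Return the user id for a valid token, or None for an invalid one."""
        ...


class InMemoryAuth:
    """One implementation: looks tokens up in a table."""

    def __init__(self, tokens: Dict[str, str]):
        self._tokens = dict(tokens)

    def verify(self, token: str) -> Optional[str]:
        return self._tokens.get(token)


class PrefixAuth:
    """A different implementation of the same contract: parses 'user:<id>'."""

    PREFIX = "user:"

    def verify(self, token: str) -> Optional[str]:
        if token.startswith(self.PREFIX) and len(token) > len(self.PREFIX):
            return token[len(self.PREFIX):]
        return None


def greet(auth: AuthService, token: str) -> str:
    """A caller that knows only the contract, not the implementation."""
    user = auth.verify(token)
    return f"hello {user}" if user is not None else "denied"
```

`greet` behaves the same with either implementation as long as `verify` keeps its promise, which is the sense in which a swap is transparent. It says nothing about latency or failure modes, which a network contract also has to cover.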

> higher overall resiliency.

yes. because making a page render dependent on the availability of 10s of independent network services is going to greatly improve your resiliency.

I guess it's a matter of perspective.

If you are willing to count a partially working experience as still working, then yes, you might be able to say you're more resilient because only, say the service to add a product to the basket is down - you can still look at products, so technically, you're still up.

That's not how availability is normally measured, though. In my book (and definitely in my users'), a site where some stuff is broken is broken. Period.

And if that's the environment you're working in, you're really not improving matters by adding unnecessary network layers between components of your application.

When you should use microservices: when you have many independent two-pizza dev teams. (If you live in Pittsburgh, three-pizza teams...)

I'll bite. Amazon Pittsburgh office inside joke?

> Small projects should not shy from the monolithic design. It offers higher productivity for smaller teams.

Yes. In my next job search I'd like to find a way to feel out a team for whether they adopt new patterns reflexively and dogmatically vs. consciously and contextually.

The Amazon example reminds me of a former coworker telling me about how their processes ran back in '99-ish. Every webserver process was huge (128MB sticks in my head, if I have the period-correct units straight), and only lived for the length of the session before being terminated, due to memory leaks and the like.

If you want to build microservices on top of kubernetes I'd recommend go-micro https://github.com/micro/go-micro

How do you want your spaghetti served? Baked-in the code, or the rpc variety?

You should not ask: "When to use Microservices?" if your previous question was: "What is the difference between microservices and containers?"

> Small projects should not shy from the monolithic design. It offers higher productivity for smaller teams

Wow. Has the author any real-world experience of using microservices? Or is it just word of mouth? Because I have a completely different take on this.

Example code?

There isn't any. This is essentially a thinly disguised ad for Gravity.

Exactly, a code sample would help with this ad feeling; for me, diagrams are too abstract.

very good explanation!
