Biggest issue with microservices: "Microservices can be monoliths in disguise" -- I'd omit the can and say 99% of the time are.
It's not a microservice if you have API dependencies. It's (probably) not a microservice if you access a global data store. A microservice should generally not have side effects. Microservices are supposed to be great not just because of ease of deployment; they're also supposed to make debugging easier. If you can't debug one (and only one) microservice at a time, then it's not really a microservice.
A lot of engineers think that just having a bunch of API endpoints written by different teams is a "microservice architecture" -- but they couldn't be more wrong.
Once when starting a new gig I inherited a "microservices" architecture.
They were having performance problems and "needed" to migrate to microservices. They developed 12 separate applications, all in the same repo, each deployed independently in its own JVM. Of course, if you were using microservices you needed Docker as well, so they had also developed a giant Docker container holding all 12 microservices, which they deployed to a single host (all managed by supervisord). And since they had 12 different JVM applications, the services needed a host with at least 9GiB of RAM, so they used a larger instance. Everything was provisioned manually, by the way, because there was no service discovery or container orchestration - just a Docker container running on a host (an upgrade from running the production processes in a tmux session). What they really had was a giant monolithic application with a complicated deployment process and insane JVM overhead.
Moving to the larger instance likely solved the performance issues. In its place they now had multiple over-provisioned instances (for "HA"), and combined with other questionable decisions, were paying ~100k/year for a web backend that did no more than ~50 requests/minute at peak. But hey, at least they were doing real devops like Netflix.
For me, I've become a bit more aware of cargo cult development. I can't say I'm completely immune to cargo-cult-driven development either (I once rewrote an entire Angular application in React because "Angular is dead"), so it really opened my eyes to how I could also implement "solutions" without truly understanding why they're useful.
> They developed 12 separate applications, all in the same repo, each deployed independently in its own JVM.
I've dealt with an even worse system: a dozen separate applications, each in its own repo, plus various repos containing shared code. But the whole thing was really one interconnected system, such that a change to one component often required changes to the shared code, which required updates to all the other services.
It was a nightmare. At least your folks had the good sense to use a single repository.
> even though theoretically services can be deployed in isolation, you find that due to the inter-dependencies between services, you have to deploy sets of services as a group
You don't always have to know why, but it's somewhat frightening that so many "engineers" don't have a clue why they are doing something (because Google does it). And I'm of course guilty of it myself, jumping on the hype train or uncritically taking advice from domain experts, only to find out years later that much of it was BS. Most of the time, though, you will not reach enlightenment. I guess it's in our nature to follow authority, hype, trends and groupthink.
What's gonna look better on a dev's CV: 'spent a year maintaining a CRUD monolith app' vs 'spent a year breaking a monolith into microservices, with shiny language X to boot'?
We can be a very fashion and buzzword driven industry sometimes.
EDIT: this perverse incentive goes all the way to the top, through to CTO level. Sometimes I wonder if businesses understand just how much money and effort is wasted on pointless rewrites that make life harder for everyone.
> it's somewhat frightening that so many "engineers" don't have a clue why they are doing something (because Google does it).
This doesn't stop at engineering: open offices, brainteaser/trick-based interviewing, OKRs, ... Even GOOGLE doesn't do some of those things anymore, but the follower sheep still do.
I recently had a similar experience. Our product at work is a monolith, not in the greatest shape due to technical debt we inherited, and it's usually spoken of condescendingly by other teams working on different products. To our surprise, when we started testing it with cloud deployments, it was really lightweight compared to just one of the 25 Java microservices from the other teams.
Their "microservices" suffered from the same JVM overhead and to remedy this they are joining their functionalities together (initially they had 30-40).
>They were having performance problems and "needed" to migrate to microservices. They developed 12 separate applications, all in the same repo, each deployed independently in its own JVM.
9 times out of 10 it's because developers don't know how to properly design and index the underlying RDBMS. I've noticed a severe lack of that knowledge in the average developer.
Sounds like they didn't understand why it's called a microservice to begin with. A microservice isn't supposed to be an entire piece of software, just a dedicated bit of one - at least that's what I'd figure with a name such as "micro". When we adopted microservices at my job (idk if Azure Functions count or not), we did it because we had one task we needed taken out of our main application for performance reasons, and because we knew it would involve way more work to implement (a .NET Framework codebase being ported to .NET Core, which meant the .NET Framework dependencies no longer worked). We eventually turned it into a WebAPI instead due to limitations of Azure Functions for what we wanted to do (process imagery of sorts).
> Of course since they had 12 different JVM applications, the services needed a host with at least 9GiB of RAM so they used a larger instance.
Well, experimentally Oracle solved that problem, somewhat: you can now use CDS (class data sharing) and *.so files for some parts of your application.
It probably doesn't eliminate every problem, but it helps a bit at least.
But it would've been easier to just use Apache Felix or similar to start all the applications in a single OSGi container.
That would've probably saved something like 5-7 GiB of RAM.
> A microservice should generally not have side effects.
That's plainly wrong. I get the gist of what you are saying and I more or less agree with it but you expressed it poorly.
Having API dependencies is not an issue. As long as the microservices don't touch each other's data and only communicate with each other through their API boundaries, microservices can and should build on top of each other.
I think your bad experiences are due to microservice apps which are unnecessarily fragmented into a lot of services. That can be a problem even when you respect service boundaries: when you have to release a bunch of services to ship a feature, that's a sign that you have a distributed monolith on your hands.
I like to think of services, even my own, as third-party ones I can't touch. When I view them this way, the urge to tailor them to the current feature I'm hacking on lessens, and it becomes easier to identify the correct microservice a given modification belongs to.
> That's plainly wrong. I get the gist of what you are saying and I more or less agree with it but you expressed it poorly.
I'm not sure what you think side effects are, but I'm using the standard computer science definition you can look up on Wikipedia. If you have a microservice that modifies, e.g. some hidden state, it's a disaster waiting to happen. Having multiple microservices that have database side-effects will almost always end up with a race condition somewhere. Have fun debugging that.
I'm using the same definition. Writing data to a database is a side effect. If there is no side effect then what's the point of calling a service? So it does computation? Then who saves the result of that computation, thus doing a side effecting operation?
If no one then what's the point of that service's existence?
A side effect is by definition some mutation that's out of the scope of the function -- if the purpose of the microservice is to put stuff in a database, then (by definition) it's not a side effect. Switching a flag on top of doing some work, on the other hand (e.g. flip "processed" to true in a global database) is a side effect.
Modifying data in a database is a side effect. Since you brought up the wikipedia definition, here it is:
> In computer science, a function or expression is said to have a side effect if it modifies some state outside its scope or has an observable interaction with its calling functions or the outside world besides returning a value. For example, a particular function might modify a global variable or static variable, modify one of its arguments, raise an exception, write data to a display or file, read data, or call other side-effecting functions.
Note "write data to a display or file". I think we agree that writing to a database falls under this definition, hence using terms like "side effecting" when talking about microservices is misleading.
Technically it depends on if anything outside the service can see the database and if the state is saved after the service returns. It might seem useless to call a DB if you're not going to save state, but a Rube Goldberg architecture could do so for the lulz.
So, according to your advice, we shouldn't use micro-services that write to a database, ever? That doesn't make any sense to me. Multiple services writing to the same database can be bad, but a single service storing persistent state in a database is perfectly fine. Just because we're micro- doesn't mean we have to cut out 90% of what a service is.
Just like all the other authenticated APIs in the world: you get a token when you log on, and use that token to authenticate yourself on future calls to the services. That's a lot of what OAuth and its ilk handle.
This management of API boundaries is likely handled for you by an app, though, so from a user perspective the story is still "open netflix, enter password, watch movie".
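For illustration, a minimal sketch of what that looks like from the client side, using Java's built-in HttpClient (the endpoint and token are placeholders, and the token-issuing flow itself is omitted):

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class TokenAuthExample {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();

        // Token previously obtained from the auth service at login
        // (placeholder value).
        String token = "eyJhbGciOi...";

        // Every subsequent call to any service carries the token,
        // so no service needs to re-run the login flow itself.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://api.example.com/recommendations"))
                .header("Authorization", "Bearer " + token)
                .GET()
                .build();

        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
```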
But how is the data shared? E.g. when you sign up you store your data. But to edit it, or to just display it you need to access it again, but that data only belongs to the sign-up microservice...?
I think that it is accurate to say that in a system composed of microservices, a microservice should not affect the state of other microservices in the system other than by consuming them.
Whether it should consume other microservices is less clear, and gets into the choreography vs. orchestration issue; choreography provides lower coupling, but may be less scalable.
> Services consuming other services, sounds like recipe for spaghetti.
Can we extend that logic to classes or interfaces? Accessing data operations through a well-established API is generally seen as a good thing and is the exact cure for spaghetti...
Service APIs also entail load balancing and decoupled deployments, so they eliminate the unclear architecture that arises at the app level when trying to tune the whole for individual components. Particularly when a shared component exists across multiple systems.
For a generalized microservices architecture, layering is a bit of a misnomer, as everything is loosely in the same 'service' layer... I'd also point out that in N-tiered applications, application services or domain services calling other services at the same layer is seen as the sole approved channel for reuse, not an anti-pattern.
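As a sketch of that analogy in miniature (hypothetical names): the consumer depends only on the published interface, never on the implementation's internals, which is exactly how a service consuming a peer through its API boundary is supposed to work.

```java
// The published API boundary.
interface InventoryApi {
    int stockLevel(String sku);
}

// One implementation; the consumer never sees its internals.
class Warehouse implements InventoryApi {
    @Override
    public int stockLevel(String sku) {
        return 42; // stand-in for a real lookup
    }
}

class OrderService {
    private final InventoryApi inventory;

    OrderService(InventoryApi inventory) {
        this.inventory = inventory; // consumes a peer only through its API
    }

    boolean canFulfil(String sku, int quantity) {
        return inventory.stockLevel(sku) >= quantity;
    }
}
```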
>Can we extend that logic to classes or interfaces?
This was one of the main ideas behind the original definition of OOP. The original notion of "object" was very similar to our current notion of "service":
Objects received messages, including messages sent over the network. There was not supposed to be a clear distinction between local and remote services - by design. A lot of inter-computer stuff could be/was handled transparently.
Look, I generally prefer choreography over orchestration, too, but architectural dogma can conflict with pragmatism, and at a minimum Netflix’s argument as to why they found orchestration more usable at scale seems plausible enough for me not to reject it out of hand without experience with the two patterns at anything like Netflix scale.
Just like sometimes, in real-world C, “goto” is the right tool, even though arbitrary jumps of that kind are also a “recipe for spaghetti”.
A microservice, imo, should just be a simple black box that takes in some input and returns some output (sometimes asynchronously). No side effects necessary. No fiddling with database flags or global state, and definitely no hitting other microservices. See @CryoLogic's post for a good example. If you're thinking this means you simply can't build some things using microservices -- like logging in a user -- you'd be right.
But that is ridiculous. Fiddling with database flags is silly, I agree, but inserting into and updating a database is a completely normal side effect of most business logic. So if your microservice handles any kind of ordinary feature of your business solution, it will almost certainly have side effects, because it will write to a database. There might be services that just do some computation in memory and return the result, but those will be a small minority; most of your services dealing with features such as payments, subscriptions, identity, etc. (just some examples) will have useful side effects.
I agree, mostly. A black box can have internal state, but it should not have shared state with another black box (that defeats the purpose of calling it a black box). If two black boxes (microservices) shared state, then we'd need to think of the composite as a single black box.
If two black boxes directly contact each other, then that also defeats the purpose. Microservices are not appealing unless they talk via message queues. The whole point of microservices was to handle scale independently for independent functions.
Where do you suggest storing that state if it needs to be persistent? The definition of microservices should not assume anything about how long I need to track my data. If two of your microservices are touching the same database fields, then that's the implementor's mistake.
Why can't you build a microservice that does a handshake with an API gateway to authenticate a user? When doing microservices you have to have a sane auth strategy, and that generally means you encapsulate authentication/authorization in a service that your gateway will talk to.
> a bunch of API endpoints written by different teams is a "microservice architecture"
Or chaos, or madness, or Bedlam.
Most people have enough trouble getting three methods in the same file to use the same argument semantics. Every service is an adventure unto itself.
We have a couple of services that use something in the vein of GraphQL, but some of the fields are calculated from other fields. If you have the derived field but not the source field, you get garbage output, and they don't see the problem with this.
> It's not a micro-service if you have API dependencies
Just out of curiosity, what alternatives are there to avoid API dependencies? Is it really possible to make non-trivial apps while avoiding internal dependencies?
At some level, is it really possible to have a truly decoupled system?
I'm wondering if the original "API dependencies" comment didn't mean "shared API dependencies". As in, multiple API/services depending on the same shared code/library.
APIs calling other APIs is...well, I'm having a hard time understanding how that could be construed as fundamentally wrong.
I'm confused. If a microservice doesn't call the API of any other microservices, then who is sending the requests to any of them?
A large purpose of service oriented architecture is encapsulation. If no other microservices can make requests to your microservice, then you really haven’t encapsulated much.
I tend to think that the job of invoking the services lies within a gateway. For example, you can have a microservice for recipes, but a web gateway that knows all of the various integrations necessary to generate a page. So the web gateway is essentially a monolith.
If and when you need to support mobile devices independently of your web UI, you can have a mobile gateway. Same idea. This gateway is optimized to know how to handle mobile traffic realities like smaller download sizes, etc.
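A minimal sketch of that gateway idea, assuming hypothetical recipe and ratings services: only the gateway knows which services exist and how their responses combine into a page, while the services never call each other.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class WebGateway {
    private final HttpClient http = HttpClient.newHttpClient();

    private String fetch(String url) throws Exception {
        HttpRequest req = HttpRequest.newBuilder(URI.create(url)).GET().build();
        return http.send(req, HttpResponse.BodyHandlers.ofString()).body();
    }

    public String renderRecipePage(String recipeId) throws Exception {
        // Hypothetical internal service hosts.
        String recipe  = fetch("http://recipes.internal/recipes/" + recipeId);
        String ratings = fetch("http://ratings.internal/ratings/" + recipeId);
        // Only the gateway knows how the pieces compose into a page.
        return "<html><body>" + recipe + ratings + "</body></html>";
    }
}
```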
I'm thinking this concept improperly conflates synchronous requests with eventually-consistent asynchrony.
No, you definitely don't want microservices making synchronous requests to other microservices and depending on them that way.
But it still may be necessary for your services to depend on each other, and that's where you can allow that communication through asynchronous eventually consistent communication. Actor communications, queue submission/consumption, caching, etc.
That is nice in theory, and I agree it should be done wherever possible, but a lot of the time business logic will require an immediate synchronous response, because the next step in the workflow will execute a different branch of logic based on the condition/result returned from the previous microservice, and the frontend / consumer / app will need immediate confirmation of whether the action succeeded.
Even in such cases you might want to move the bulk of processing to an asynchronous queue-based system, but part of the logic might need to be executed synchronously (authorising a credit card payment, say: you can process the payment asynchronously later, perhaps in bulk cron jobs like Apple iTunes does, but the initial authorisation, which decides whether the purchase is successful, must be synchronous).
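A rough sketch of that split, using a hypothetical Kafka topic for the asynchronous part (topic name, broker address, and the authorisation stub are all placeholders): the card authorisation is a blocking call whose result the caller needs immediately, while the capture is just an event dropped on a queue for later processing.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class PaymentFlow {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka:9092"); // placeholder host
        props.put("key.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");

        // 1. Synchronous step: the caller must know right away whether the
        //    card authorisation succeeded (stubbed below).
        boolean authorized = authorizeCard("order-42");

        // 2. Asynchronous step: the actual capture/settlement happens later,
        //    driven off the queue by a separate consumer.
        if (authorized) {
            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                producer.send(new ProducerRecord<>("payments.to-capture",
                                                   "order-42", "authorized"));
            }
        }
    }

    static boolean authorizeCard(String orderId) {
        return true; // stand-in for a real synchronous payment-provider call
    }
}
```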
Microservices is a back-end service pattern. MVC (Model View Controller) is a front-end pattern to enforce separation between data, UI and interaction logic.
I just wish someone with "street cred" (or with a famous, recognizable name I could use for appeal to authority) could create a simple post saying "Hey, if you have a shared data store that all services depend on and are accessing directly, you are not doing microservices". "And you also don't have microservices if you have to update everything in one go as part of a "release"".
That way I could circulate it throughout the company and maybe get the point across. I've tried to argue unsuccessfully. After all, "we are doing K8s, so we have micro-services, each is a pod, duh!" No, you have a monolith, which happens to be running as multiple containers...
Microservices with a shared datasource are just Service Oriented Architecture, circa 2005. You might have a variety of middle-tier services, deployed in their own boxes, or at the very least in Java service containers, but ultimately talking to some giant Oracle DB behind it all. Microservices that share a database are not deployed into a container running JBoss, and instead use something more language-agnostic, but it's ultimately the same thing. All you have to do is quote the many criticisms of that era, when any significant DB change was either impossible, or required dozens of teams to change things in unison.
The best imagery I know for this picture is a two-headed ogre. It might have multiple heads, but one digestive system. Doesn't matter which head is doing the eating, ultimately you have the same shit. I've heard semi-famous people talk about this at conferences, but few articles.
> If you can't debug one (and only one) microservice at a time, then it's not really a microservice.
It depends on what you want to debug. It is like unit test vs integration test. If you are finding a bug related to integration between multiple services, you definitely need to debug on multiple services.
Do you have examples of microservices and good data warehouses working well side by side? Your point makes sense, but I keep hoping for a way to have One Data Source of Truth working side by side with the services that access it.
A data warehouse really should be completely orthogonal to any architecture choices. Good data warehouses are fed by data engineering pipelines that don’t care if you have a single rdbms or multiple document stores or people dropping CSVs in an FTP directory.
I hate to burst your bubble, but you shouldn't and can't have truth working alongside systems that access it. Data is messy and tends toward dishonesty. The only way to get clean truth for your organization is by thoughtfully applying rules, cleaning and filtering as you go. The more micro your architecture is, the more this is true. Because there is no way 20 different teams are all going to have the same understanding of the business rules around what constitutes good, clean input data. Even if your company is very clear and well-documented about business and data rules, if you hand the same spec sheet to 20 different teams, you are going to get 20 variations on that spec.
The only way to get usable data that can be agreed upon by an entire company (or even business unit) is by separating your truth from your transactional data. That's kind of the definition of a data warehouse.
If you let your transactional systems access and update data directly in your warehouse, you are in for a universe of pain.
> The only way to get usable data ... is by separating your truth from your transactional data.... If you let your transactional systems access and update data directly in your warehouse, you are in for a universe of pain.
I strongly agree with this assessment :)
I have posted a bit more on this nearby, but Apache Kafka is well positioned as a compromise to support both of those truths: an orthogonal data warehouse full of sanitized purity, and chatty apps writing crappy data to their hearts' content.
By introducing a third system in between the data warehouse and transactional demands, Kafka decouples the communicating systems and introduces a clear separation of concerns for cross-system data communication (be they OLAP, or OLTP).
If your transactional data is crappy (mine is!), and you want your data warehouse pure (I do!), then Kafka can be a 'truthy' middle ground where compromises are made explicit and data digestion/transformation is explicitly mapped, and all clients can feast on data to their hearts' content.
You might want to look into Apache Kafka, with log compaction, which provides a model to accomplish exactly that while also handling message passing/data streaming.
Your data warehouse can suck facts from Kafka (with ETL on either side of the operation, or even integrated into Kafka if you so desire), and you can keep Kafka channels loaded with micro-"Truth"s (current accounts, current employees, etc). That way apps get basically real-time simplified access to the data warehouse while your data warehouse gets a client streaming story that's wicked scalable. And no coupling in between...
It's a different approach than some mainstream solutions, but IMO hits a nice goldilocks zone between application and service communication and making data warehousing in parallel realistic and digestible. YMMV, naturally :)
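For the curious, a minimal sketch of setting up such a compacted topic with Kafka's admin client (topic name, partition/replication counts, and broker address are placeholders): with cleanup.policy=compact, Kafka eventually retains only the latest record per key, which is what makes a topic usable as a "current state" feed.

```java
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.NewTopic;

public class CompactedTopicSetup {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka:9092"); // placeholder host

        try (Admin admin = Admin.create(props)) {
            // cleanup.policy=compact keeps only the latest record per key,
            // so the topic behaves like a continuously updated table
            // ("current accounts", "current employees", ...).
            NewTopic accounts = new NewTopic("current-accounts", 6, (short) 3)
                    .configs(Map.of("cleanup.policy", "compact"));
            admin.createTopics(List.of(accounts)).all().get();
        }
    }
}
```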