It's when an organization grows and the software grows and the monolith starts to get unwieldy that it makes sense to go to microservices. It's then that the advantage of microservices both at the engineering and organizational level really helps.
A team of three engineers orchestrating 25 microservices sounds insane to me. A team of thirty turning one monolith into 10 microservices and splitting into 10 teams of three, each responsible for maintaining one service, is the scenario you want for microservices.
Even though we are in the early stages of redesign, I’m already seeing some drawbacks and challenges that just didn’t exist before:
- Performance. Each of the services talks to the others via a well-defined JSON interface (OpenAPI/Swagger YAML definitions). This sounds good in theory, but parsing JSON and then serializing it N times has a real performance cost. In a giant “monolith” (in the Java world), EJBs talked to each other, which, despite being Java-only in practice, was relatively fast and could work across web app containers. In hindsight, it was probably a bad decision to JSON-ize all the things (maybe another protocol?)
- Management of 10-ish repositories and build jobs. We have Jenkins for our semi-automatic CI. We also have our microservices in a hierarchy, all depending on a common parent microservice. So naturally, branching, building and testing across all these different microservices is difficult. Imagine having to roll back a commit, then having to find the equivalent commit in the two other parent services, then rolling back the horizontal services to the equivalent commit, some with different commit hooks tied to different JIRA boards. Not fun.
- Authentication/Authorization also becomes challenging since every microservice needs to be auth-aware.
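To put a rough number on the performance point above, here is a quick timing sketch using only the stdlib json module (the payload shape is invented for illustration):

```python
import json
import time

# Hypothetical payload, roughly the shape of a service-to-service call.
payload = {
    "order_id": 12345,
    "items": [{"sku": f"SKU-{i}", "qty": i % 5 + 1} for i in range(100)],
}

N = 10_000
start = time.perf_counter()
for _ in range(N):
    blob = json.dumps(payload)   # serialize on the calling service
    json.loads(blob)             # parse again on the receiving service
elapsed = time.perf_counter() - start
print(f"{N} JSON round-trips took {elapsed:.2f}s")
```

In a monolith, those ten thousand "calls" would have been plain method invocations passing a reference, with zero encode/decode work.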
As I said, we are still early in this, so it is hard to say whether we have reduced our footprint or increased productivity in a measurable way, but at least I can identify the pitfalls at this point.
If you ditch the HTTP stuff while you're at it, you can also handily circumvent all the ambiguities and inter-developer holy wars that are all but inherent to the process of taking your service's semantics, whatever they are, and trying to shoehorn them into a 30-year-old protocol that was really only meant to be used for transferring hypermedia documents. Instead you get to design your own protocol that meets your own needs. Which, if you're already building on top of something like protobuf, will probably end up being a much simpler and easier-to-use protocol than HTTP.
Now when someone goes to replace one side, it’s often impossible to even figure out a full definition of the structure of the data, much less the semantics. You watch a handful of data come across the pipe, build a replacement that handles everything you’ve seen, and then spend the next few months playing whack-a-mole fixing bugs when data doesn’t conform to your structural or semantic expectations.
JSON lets you get away with never really specifying your APIs, and your APIs often devolve to garbage by default. Then it becomes wildly difficult to ever replace any of the parts. JSON for internal service APIs is unmitigated evil.
I've migrated multiple services out of our monolith into our micro service architecture and oh boy, it is just impossible to know (or find someone who knows) what structure is passed around, or what key is actually being used or not.
Good luck logging everything and pulling your hair out documenting everything from the bottom up.
That's hardly a JSON problem. You still experience that problem if you adopt any undocumented document format or schema.
But then, of course, that can be considered boilerplate code (and, in the beginning and most of the time, it actually is just a duplication of your internal object structure).
The easiest path with JSON is to do none of this, and so the majority of teams (particularly inexperienced ones) do none of it. With protos, someone must at least sit down and authoritatively outline the structure of any data being passed around, so at a minimum you’ll have that.
But even just forcing developers to do this generally means they start thinking about the API, and have a much higher chance of documenting the semantics and cleaning up parts that might otherwise have been unclear, confusing, or overly complex.
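For example, even a minimal .proto file forces the shape of the data to be written down somewhere authoritative (the message and field names below are invented for illustration):

```protobuf
syntax = "proto3";

package billing;

// The structure is now documented, versioned, and enforced by codegen,
// unlike an ad-hoc JSON blob that each consumer guesses at.
message Invoice {
  int64 id = 1;
  string customer_id = 2;
  repeated LineItem items = 3;
  // Field numbers are append-only: new fields get new numbers,
  // and removed numbers should be marked reserved, never reused.
}

message LineItem {
  string sku = 1;
  uint32 quantity = 2;
  int64 unit_price_cents = 3;
}
```

Whether or not the semantics get documented, the whack-a-mole phase of guessing the structure from sampled traffic goes away.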
They are literally infinitely faster to encode/decode than protobuf.
They even have the same obnoxious append-only extensibility of protobuf if that’s what really gets your jimmies firing.
1) I'm the author of Cap'n Proto; I'm probably biased.
2) A lot could have changed since 2014. (Though, obviously, serialization formats need to be backwards-compatible which limits the amount they can change...)
Start out with protobufs, so you can take advantage of gRPC and all of its libraries and all the other tooling that is out there.
If you profile and determine that serialization/deserialization is actually a bottleneck in your system in a way that is product-relevant or has non-trivial resource costs, then you can look at migrating to FlatBuffers, which can still use gRPC.
That should not happen.
If it does, you don't have a microservice architecture, you have a spaghetti service architecture.
The same issue appeared when OOP was fairly new: people started using it heavily and ended up making messes. They were then told that they were "doing it wrong". OOP was initially sold as magic reusable Lego blocks that automatically create nice modularity. There was even an OOP magazine cover showing just that: a coder using magic Legos that farted kaleidoscopic glitter. Microservices is making similar promises.
It took a while to learn where and how to use OOP and also when not to: it sucks at some things.
If X is a technology I don't like, and it's not working for you, then it's the wrong solution.
If X is a technology I don't like, and it is working for you, then you simply haven't scaled enough to understand its limitations.
If X is a technology I like, but it's not working for you, then your shop is "doing X wrong".
If X is a technology I like, and it's working for you, then it's the right solution and we're both very clever.
> How does one know if X is the wrong solution or if X is the right solution but the shop is "doing X wrong"?
Edit: And a civil conversation ensued. :)
You have an auth.yourapp.com and api.yourapp.com and maybe tracer.yourapp.com and those three things are not a single app that behaves like auth, api or tracer depending on setting of a NODE_ENV variable? If so, you have micro services.
Be that as it may, I believe mirkules's issue is not an uncommon one. Perhaps saying "building a microservice architecture 'the right way' is a complex and subtle challenge" would capture a bit of what both of you are saying.
Something being complex and therefore easy to mess up does not mean it's a great system and the users are dumb, especially if there are other (less complicated, less easy to mess up) ways to complete the task.
Supporting API versioning is not a complex or subtle challenge. It's actually a very basic requirement of a microservice architecture. It's like having to check for null pointers: it requires additional work, but it's still a very basic requirement if you don't want your app to blow up in your face.
A "service" is not defined principally by a code repository or communications-channel boundary (though it should have the latter and may have the former), but by a coupling boundary.
OTOH, maintaining a coupling boundary can have non-obvious pitfalls; e.g., supported message format versions can become a source of coupling--if you roll back a service to the version that can't send message format v3, but a consumer of the messages requires v3 and no longer supports consuming v2, then you have a problem.
The whole point of a microservice is to create a service boundary. If you have a private interface where both sides are maintained by the same team, both sides should be in the same service.
When all applications are adjusted, the Accept headers request a protobuf format in return.
=> Propagated everywhere except when a JS AJAX call happens to the API gateway.
That assertion is not true. Media-type API versioning is a well-established technique.
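As a concrete illustration of media-type versioning (the vendor media type, handler names, and payload are all invented):

```python
def pick_renderer(accept_header: str):
    """Choose a response version based on the Accept header.

    Media-type versioning: clients ask for e.g.
    'application/vnd.myapp.v2+json' and the server renders that version,
    so old clients keep working while new ones migrate.
    """
    if "application/vnd.myapp.v2+json" in accept_header:
        return render_v2
    # Fall back to v1 for plain JSON or unversioned clients.
    return render_v1

def render_v1(order):
    return {"id": order["id"], "total": order["total_cents"] / 100}

def render_v2(order):
    # v2 keeps money as integer cents to avoid float rounding.
    return {"id": order["id"], "total_cents": order["total_cents"]}

order = {"id": 7, "total_cents": 1999}
print(pick_renderer("application/vnd.myapp.v2+json")(order))
```

The same mechanism extends to negotiating protobuf vs. JSON bodies, as the grandparent comment describes.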
You are right. It should not happen. It is difficult to see these pitfalls when unwinding an unwieldy monolith, and, as an organization all you’ve ever done are unwieldy monoliths, that have a gazillion dependencies, interfaces and factories.
We learned from it, and we move on - hopefully, it serves as a warning to others.
A monolith has huge advantages when your codebase is around 100k lines or below:
1. Easy cross module unit testing/integration testing, thus sharing components is just easier.
2. Single deployment process
3. CR visibility automatically promotes to all parties of interests, assuming the CR process is working as desired.
4. Also, just a personal preference: easier IDE code suggestions. If you go through JSON serialization/deserialization across a module boundary, type inference/completion is just out of reach.
And it is not like a monolith has no separation of concerns at all. After all, a monolith can have modules and submodules. Start abstracting with the file system by grouping relevant stuff into folders, before putting it into different packages. After all, once things diverge, it is really hard to go back.
Unless you have a giant team and more than enough engineers to spare for DevOps, microservices can be considered an organizational premature optimization.
And people say web applications are never CPU bound :)
JSON has its advantages. I prefer a feature flag when I need performance (protobuf vs. JSON); HTTP headers do the rest.
Knowing absolutely nothing about your product, this sounds like a bad way to split up your monolith.
Probably something like 5 teams of 3 each managing 1 microservice would be a better way to split things up. That way each team is responsible for defining their service's API, and testing and validating the API works correctly including performance requirements. This structure makes it much less likely services will change in a tightly coupled way. Also, each service team must make sure new functionality does not break existing APIs. Which all make it less likely to have to roll back multiple commits across multiple projects.
The performance issues you cite, also seem to indicate you have too many services, because you are crossing service boundaries often, with the associated serialization and deserialization costs. So each service probably should be doing more work per call.
"all depending on a common parent microservice"
This makes your microservices more like a monolith all over again, because a change in the parent can break something in the child, or prevent the child from changing independently of the parent.
Shared libraries I think are a better approach.
"Authentication/Authorization also becomes challenging since every microservice needs to be auth-aware."
Yes, this is a pain. Because security concerns are so important, it is going to add significant overhead to every service to make sure you get it right, no matter what approach you use.
Surely splitting up your application along arbitrary lines, based on the advice of an internet stranger who's never seen the application and doesn't know the product/business domain, just isn't a sound way of approaching the problem.
Conway's Law is profound. Lately I realized even the physical office layout (if you have one) acts as an input into your architecture via Conway's Law.
We used to have a parent maven pom and common libraries but got rid of most of that because it caused too much coupling. Now we create smaller more focused common libraries and favor copy/paste code over reuse to reduce coupling. We also moved a lot of the cross cutting concerns into Envoy so that the services can focus on business functionality.
This looks like a big step backwards to me.
In my opinion, decoupling should be prioritized over DRYness (within reason). A microservice should be able to live fairly independently from other microservices. While throwing out shared libraries (which can be maintained and distributed independently from services) seems like overkill, it seems much better than having explicit inheritance between microservice projects like the original poster is describing.
For any non-trivial code which needs to be maintained and kept well tested, contrary to the OP, I would favor shared libraries over copy/paste.
Would that user object be the responsibility of one service, or written to many tables in the system under different services, or...?
Do we accept that sometimes things may be out of sync until they aren't? That can be a jarring user experience. Do we wait on the Service B event until responding to the client request? That seems highly coupled and inefficient.
I'm genuinely confused as to how to solve this, and it's hard to find good practical solutions to problems real apps will have online.
I suppose the front end could be smart enough to know "we haven't received an ack from Service B, make sure that record has a spinner/a processing state on it".
Also, you should check your domains and bounded contexts and reevaluate whether A and B are actually different services. They might still legitimately be separate. Just something to check.
Then your question is about optimizing on top of the usual architecture which hopefully is an infrequent source of pain that is worth the cost of making it faster. I could imagine some clever caching, Service A and Service B both subscribing to a source of events that deal with the data in question, or just combining Service A and B into one component.
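A toy sketch of the "both subscribe to the same source of events" option (all names are invented; a real system would use a broker such as Kafka rather than an in-memory bus):

```python
from collections import defaultdict

# Minimal in-memory pub/sub standing in for a real message broker.
subscribers = defaultdict(list)

def subscribe(topic, handler):
    subscribers[topic].append(handler)

def publish(topic, event):
    for handler in subscribers[topic]:
        handler(event)

# Each service keeps its own view of the user, updated from one stream,
# instead of Service A synchronously calling Service B.
service_a_users = {}
service_b_users = {}

subscribe("user.created", lambda e: service_a_users.__setitem__(e["id"], e))
subscribe("user.created", lambda e: service_b_users.__setitem__(e["id"], e))

publish("user.created", {"id": 1, "name": "Ada"})
assert service_a_users[1] == service_b_users[1]
```

With a real broker the two views converge eventually rather than instantly, which is exactly the "out of sync until they aren't" trade-off discussed above.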
For the same reason monoliths tend to split when the organization grows, it is often more manageable to have a small number of services per team (ideally 1, or less).
It's ok if a service owns more than one type of entity.
It's less good if a service owns more than one part of your businesses' domain, however
People seem to forget that there’s a continuum between monolith and microservices, it’s not one or the other.
Multiple monoliths, “medium”-services, monolith plus microservices, and so on are perfectly workable options that can help transition to microservices (if you ever need to get there at all).
Definitely don't just stuff unrelated stuff into a service since a team that normally deals with that service is now working on unrelated stuff. If the unrelated stuff takes off, you now have two teams untangling your monolithic service.
That said, I'm a big fan of medium sized services, the kind of thing that might handle 10 or 20 different entities.
More likely, I suspect, is that either you are shipping way too much data around, you have too much synchrony, or some other problem is being hidden in the distribution. (I once dealt with an ESB service that took 2.5 seconds to convert an auth token from one format to another. I parallelized the requests, and the time to load a page went from 10 sec to <3; then I yanked the service's code into our app and that dropped to milliseconds.)
Performance problems in large distributed systems are a pain to diagnose and the tools are horrible.
This also means that each service should have no other services as dependencies, and if they do, you have too many separate services and you should probably look into why they aren't wrapped up together.
Using a stream from a different service is one thing: you should have clearly defined interfaces for inter-service communication. But if updating a service means you also need to fix an upstream service, you're doing it wrong and are actually causing more work than just using a monolith.
EDIT: and because you have clearly defined interfaces, these issues with updating one service and affecting another service literally cannot exist if you've done the rest correctly.
- Performance: use gRPC/protobuf instead of HTTP/OpenAPI; there's really not much reason to use HTTP/OpenAPI for internal endpoints these days
- Repo Management: No one is stopping you from using a monorepo but yourselves :)
Our product is a collection of large systems used by many customers with very different requirements - and so we often fall into this configurability trap: “make everything super configurable so that we don’t have to rebuild, and let integration teams customize it”
Each service should be fully independent, able to be deployed & rolled back w/o other services changing.
If you're making API changes, then you have to start talking about API versioning and supporting multiple versions of an API while clients migrate, etc.
Which adds some more complexity that just does not exist in a monolithic architecture.
It's not just the serialization cost but latency (https://gist.github.com/jboner/2841832) as well: every step of the process adds latency, from accessing the object graph, to serializing it, to sending it to another process and/or over the network, to building up the object graph again.
The fashion in .NET apps used to be to separate service layers from web front ends and slap an SOA (the previous name for microservices) label on it. I experimented with moving the service in-process and got an instant 3x wall-clock improvement on every single page load; we were pissing away 2/3rds of our performance and getting nothing of value from it. And this was in the best-case scenario: a reasonably optimized app with binary serialization and only a single boundary crossing per user web request.
Other worse apps I've worked on since had the same anti-pattern but would cross the service boundary dozens/hundreds/thousands of times and very simple pages would take several seconds to load. It's enterprise scale n+1.
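The n+1 effect is easy to ballpark. The figures below are illustrative only, in the spirit of the latency table linked above (an in-process call costs on the order of nanoseconds; a same-datacenter RPC, once serialization and the network round trip are included, is on the order of half a millisecond):

```python
RPC_MS = 0.5          # assumed cost of one same-datacenter RPC
IN_PROCESS_MS = 0.0001  # assumed cost of one in-process method call

def page_cost_ms(boundary_crossings: int, rpc: bool) -> float:
    """Pure call overhead for a page, ignoring actual business logic."""
    per_call = RPC_MS if rpc else IN_PROCESS_MS
    return boundary_crossings * per_call

# One page that crosses a service boundary 200 times:
print(page_cost_ms(200, rpc=True))   # 100.0 ms of pure call overhead
print(page_cost_ms(200, rpc=False))  # a few hundredths of a ms in process
```

The business logic is identical in both cases; the difference is entirely in where the boundary sits.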
If you want to share code like this then make a dll and install it on every machine necessary, you've got to define a strict API either way.
- Logging. All messages pertaining to a request (or task) should have a unique ID across the entire fleet of services in order to follow the trail while debugging.
Thought must obviously be given to protocols. JSON is an obviously bad choice for this use case...
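A minimal sketch of the correlation-ID idea from the logging bullet above (the header name and propagation mechanics are assumptions; real setups usually lean on tracing infrastructure such as OpenTelemetry):

```python
import logging
import uuid

HEADER = "X-Request-ID"  # assumed header name; pick one and use it fleet-wide

def incoming(headers: dict) -> str:
    """On ingress, reuse the caller's request ID or mint a new one."""
    return headers.get(HEADER) or uuid.uuid4().hex

def outgoing(headers: dict, request_id: str) -> dict:
    """On every downstream call, forward the same ID unchanged."""
    return {**headers, HEADER: request_id}

logging.basicConfig(format="%(message)s", level=logging.INFO)

rid = incoming({})                       # first hop: no ID yet, mint one
logging.info("request_id=%s msg=%s", rid, "charging card")
hop2 = outgoing({"Accept": "application/json"}, rid)
assert incoming(hop2) == rid             # second service sees the same ID
```

Grepping the aggregated logs for one `request_id` then reconstructs the trail across every service the request touched.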
The point of microservices is loose coupling, including in the code. Having a code hierarchy negates this and arguably is bad practice in general.
Can you explain this a bit more? I thought the point was to have each service be as atomic as possible, so that a change to one service does not significantly impact other services in terms of rollbacks/etc.
If I'm wrong here let me know, our company is still early days of figuring out how to get out of the problems presented by monolith (or in our case, mega-monolith).
My unscientific impression is that some of the organizational costs - just keeping the teams coordinated and on the same page - can become even more "expensive" than the technical costs.
Does each microservice have to live in its own repository? Especially with a common library everyone uses?
It's not really a microservice - it's a distributed monolith.
People forget the original 'microservice': the database. No one thinks about it as adding the complexity of other 'services' because the boundaries of the service are so well defined and functional.
A team size of 10 should be able to move fast and do amazing things. This has been the common wisdom for decades. Get larger, then you spend too much time communicating. There's a reason why Conway's Law exists.
The generation of programmers that Martin Fowler is from are exactly the people from whom I got my ideas around how organizational politics affect software and vice versa. There was plenty of cynicism around organizational politics back then.
> We do not claim that the microservice style is novel or innovative, its roots go back at least to the design principles of Unix.
By the way, for a relatively small service to be shared by multiple applications, try RDBMS stored procedures first.
Was there a consensus resolution?
Smalltalk is awesome. Everyone else is doing it wrong, those dirty unwashed!
What reasons do you have for making that link? What are you referring to?
It's possible to load some code and snapshot as a Smalltalk image; then load some different code and snapshot as a different Smalltalk image.
It's a different story when you're working on a team, and a different story when there are two or more teams using the same repository. Sure, you still have the image. The debate had to do with how the Smalltalk image affected the community's relationship to the rest of the world of software ecosystems, and how the image affected software architecture in the small. That "geography" tended to produce an insular Smalltalk community and tightly bound architecture within individual projects.
> … relationship to the rest … insular Smalltalk community…
Perhaps not the image per se, so much as the ability to change anything and everything.
Every developer could play god; and they did.
Turns out that not every god is as wise and as benevolent as every other god.
There were awesome people who did awesome stuff; and there were others — unprepared to be ordinary.
People at least played around with that as a research project. There's one that showed up at the Camp Smalltalks I went to, with a weird-but-sensible sounding name. (Weird enough I can't remember the name.)
There would have been great utility in such a thing. For one thing, the debugger in Smalltalk is just another Smalltalk application. So what happens when you want to debug the debugger? Make a copy of the debugger's code and modify the debugger hooks so that when debugging the debugger, it's the debugger-copy that's debugging the debugger. With multi-image Smalltalk, you could just have one Smalltalk image run the debugger-copy without doing a bunch of copy/renaming. (Which, I just remembered, you can make mistakes at, with hilarious results.)
If you do the hacky shortcut of implementing one Smalltalk inside another Smalltalk (Call this St2), then the subset of objects that are the St2 objects can act a bit like a separate image. In that case, the host Smalltalk debugger can debug the St2 debugger.
Otherwise — [pdf] "Distributed Smalltalk"
Otherwise (for source code control) — "Mastering ENVY/Developer"
What I'm talking about is loading up multiple images into the same IDE and running them like fully separate images, with maybe some plumbing for communication and code loading between them. You can sorta pull that stunt by, as stcredzero mentioned, running Smalltalk in Smalltalk, but I want separate images.
At the same time? Why? What will that let you do?
Meaning on a single machine. Not across networks.
> run a network of VMs with different code
What do you think prevents that being done with "fully separate images" (VMs in their own OS process) ?
In this example on Ubuntu "visual" is the name of the VM file, and there are 2 different image files with different code in them "visualnc64.im" and "quad.im".
$ /opt/src/vw8.3pul/bin/visual /opt/src/vw8.3pul/image/visualnc64.im &
$ /opt/src/vw8.3pul/bin/visual /opt/src/vw8.3pul/image/quad.im &
Do you see?
Both of those instances of the Smalltalk VM, the one in OS process 8689 and the one in OS process 8690, are headful — they both include the full Smalltalk IDE, and they are both fully capable of editing and debugging code.
(There's a very visible difference between the 2 Smalltalk IDEs that opened on my desktop: visualnc64.im is as-supplied by the vendor; quad.im has an additional binding for the X FreeType interface library, so the text looks quite different).
(iirc Back-in-the-day when I had opened multiple Smalltalk images I'd set the UI Look&Feel of one to MS Windows, of another to Mac, of another to Unix: so I could see which windows belonged to which image.)
No, that would not be enough to make anything work. What I can think of is an IDE that had access to all the VMs running and some plumbing for the VMs to communicate. I would love to be able to spin-up Smalltalk VMs so I can simulate a full system on my desk. Having separate IDEs running means I don't have any integration so I have to debug in multiple different IDEs when tracing communications. I can imagine some of the debugging and code inspection that could be extended to look at code running simultaneously in multiple VMs.
"Open a debugger where you can trace the full stack on all involved machines."
"Inspect objects in the debugger or open inspectors on any of the objects, regardless of the system they are running on."
April 1995 Hewlett Packard Journal, Figure 7 page 90
Not technically, as they increase complexity.
But they enable something really powerful: continuity of means and continuity of responsibility, so that a small team has full ownership of developing AND operating a piece of the solution.
Basically, organization tends to be quite efficient when dealing with small teams (about dozen people, pizza rule and everything), that way information flows easily, with point to point communication without the need of a coordinator.
However, with such architecture, greater emphasis should be put on interfaces (aka APIs). A detailed contract must be written (or even set as a policy):
* how long will the API remain stable?
* how will it be deprecated? with a Vn and Vn-1 scheme?
* how is it documented?
* what are the limitations? (performance, call rates, etc)?
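In the same spirit as the OpenAPI definitions mentioned elsewhere in the thread, the contract can carry its own deprecation policy. A sketch (paths, dates, and limits below are invented):

```yaml
# Fragment of a hypothetical OpenAPI document for a versioned endpoint.
paths:
  /v1/orders:
    get:
      deprecated: true          # the Vn-1 version: still served, flagged for removal
      summary: List orders (deprecated, use /v2/orders)
  /v2/orders:
    get:
      summary: List orders
      description: >
        Stable until at least 2026-01-01; rate limited to
        100 req/s per client.
```

Having the stability window, deprecation scheme, and limits in the machine-readable spec answers the four questions above in one place.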
If you don't believe me, just read Military-Standard-498. Say what you will about military standards, but military organizations, as people who have been specifying, ordering and operating complex systems for decades, know a thing or two about managing complex systems. And interfaces have a good place in their documentation corpus, with the IRS (Interface Requirements Specification) and IDD (Interface Design Description) documents. Keep in mind this MIL-STD is from 1994.
From what I recall, it's very waterfall-minded in terms of specification workflow; it's also quite document-heavy, and the terminology and acronyms can take a while to get used to.
I found it a bit lacking regarding how to put all the pieces together into a big system, aka the Integration step. IMHO it's a bit too software-oriented, lacking on the system side of things (http://www.abelia.com/498pdf/498GBOT.PDF page 60).
While I do feel like one team should hold ownership of a service, they should also be working on others and be open to contributions - like the open source model.
Finally, going from a monolith to 10 services sounds like a bad idea. I'd get some metrics first, see what component of the monolith would benefit the most (in overall application performance) from being extracted and (for example) rewritten in a more specialized language.
If you can't prove with numbers that you need to migrate to a microservices architecture (or: split up your application), then don't do it. If it's not about performance, you've got an organizational problem, and trying to solve it with a technical solution is not fixing the problem, only adding more.
I guess that's where the critical challenge lies. You'd better be damn sure you know your business domain better than the business itself! So you can lay down the right boundaries, contracts & responsibilities for your services.
Once your service boundaries are laid down, they're very hard to change
It takes just one cross-cutting requirement change to tank your architecture and turn it into a distributed ball of mud!
Something so inflexible can't survive contact with reality (for very long).
At work we run 20-something microservices with a team of 14 engineers, and there's no siloing. If we need to add a feature that touches three services then the devs just touch the three services and orchestrate the deployments correctly. Devs wander between services depending on the needs of the project/product, not based on an arbitrary division.
If you are doing http/json between microservices then you are definitely holding it wrong.
Do yourself a favor and use protobuf/grpc. It exists specifically for this purpose, specifically because what you're doing is bad for your own health.
Or Avro, or Thrift, or whatever. Same thing. Since Google took forever to open source grpc, every time their engineers left to modernize some other tech company, Facebook or Twitter or whatever, they'd reimplement proto/stubby at their new gig. Because it's literally the only way to solve this problem.
So use whatever incarnation you like.. you have options. But json/http isn't one of them. The problem goes way deeper than serialization efficiency.
(edit: d'oh! Replied to the wrong comment. Aw well, the advice is still sound.)
I once worked at a company where a team of 3 produced way more than 25 microservices. But the trick was, they were all running off the same binary, just with slightly different configurations. Doing it that way gave the ops team the ability to isolate different business processes that relied on that functionality, in order to limit the scale of outages. Canary releases, too.
It's 3 developers in charge of 25 different services all talking to each other over REST that sounds awful to me. What's that even getting you? Maybe if you're the kind of person who thinks that double-checking HTTP status codes and validating JSON is actually fun...
I worked on an e-commerce site a decade ago where the process types were:
1. Customer-facing web app
2. CMS for merchandising staff
3. Scheduled jobs worker
4. Feed handler for inventory updates
5. Distributed lock manager
6. Distributed cache manager
We had two binary artifacts - one for the CMS, one for everything else - and they were all built from a single codebase. The CMS was different because we compiled in masses of third-party framework code for the CMS.
Each process type ran with different config which enabled and configured the relevant subsystems as needed. I'm not sure to what extent we even really needed to do that: the scheduled jobs and inventory feed workers could safely have run the customer app as well, as long as the front-end proxies never routed traffic to them.
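A sketch of that one-binary pattern (role names and subsystem lists are invented): the same artifact runs every process type, and the role chosen at startup decides which subsystems come up, so ops can isolate workloads without the code splitting into separate services:

```python
import sys

# One binary, many roles; only the enabled subsystems differ per process.
ROLES = {
    "web":   ["http_frontend", "cache_client"],
    "jobs":  ["scheduler", "cache_client"],
    "feeds": ["inventory_feed", "cache_client"],
}

def subsystems_for(role: str) -> list[str]:
    try:
        return ROLES[role]
    except KeyError:
        raise SystemExit(f"unknown role {role!r}; expected one of {sorted(ROLES)}")

if __name__ == "__main__":
    role = sys.argv[1] if len(sys.argv) > 1 else "web"
    print(f"starting role={role} subsystems={subsystems_for(role)}")
```

Canary releases fall out for free: deploy the new build to one process of one role and watch it before rolling the rest.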
What isn't trivial is when someone decides to make an Order microservice and an Account microservice when there's a business rule where both accounts and orders can only co-exist. Good fucking luck with 3 developers, I'm pretty sure with a team of 3 in charge of 23 other microservices you aren't exhaustively testing inter-microservice race conditions.
The apps all handle a bespoke data connection, converting it into a standard model which they submit to our message broker. From then on, our services are much larger and fewer in number. It's very write-once-run-forever; some of these have not been touched since their inception years ago, resulting in decreased complexity and maintenance cost.
The trick is not having REST calls all over yours services. You're just building a distributed monolith at that point.
I've been daydreaming about monoliths and will be asking at interviews for my next job hoping to find more simplified systems. I came from the game industry originally, where you only have one project for the game and one more for the webservice if it had one, and maybe a few others for tools that help support the game.
And 3 companies with micro service infrastructures that had lousy products and little business success.
Can’t totally blame microservices but I recall a distinctly slower and more complicated dev cycle.
These were mostly newer companies where micro services make even less sense and improving product and gaining users is king.
Microservices is a deployment choice. It's the choice to talk between the isolated parts with RPC's instead of local function calls.
So are there no reasons to have multiple services? No there are reasons, but since it's about deployments, the reasons are related to deployment factors. E.g. if you have a subsystem that needs to run in a different environment, or a subsystem that has different performance/scalability requirements etc.
Even if you're working on an early prototype which fits into a handful of source files, it can be useful to organize your application in terms of parallel, independent pieces long before it becomes necessary to enforce that separation on an infrastructure/dev-ops level.