They help teams manage complex interdependencies by creating strong ownership boundaries, strong product definitions, and loose coupling, and by letting teams work at their own velocities and deployment cadences.
If you're doing microservices with 20 people, you're doing it wrong.
It's when you have 500 people that the average engineer doesn't need to know about the eventually consistent data model of your active-active session system. They just need your service API.
Wrong for the project and your employer, most likely, but for you personally? Your CV for the next job will massively benefit from lines about leading an initiative to reimplement a monolith as microservices, compared to lines about old, boring, reliable technologies. I'm only half joking.
Like all generalisations, this rule will be wrong sometimes. I think you're imagining a single coherent "product" that all those people (presumably engineers) are working together on.
But imagine a consultancy (which I have witnessed) where projects are just one person, or occasionally two people, and only last a few months. Project outputs usually involve building a new component and slinging it together with one or two existing ones. In practice relatively few of these components get reused in future projects, but it's very hard to predict up front which ones will turn out to be key. In this case building microservices makes a lot of sense, because the abandoned components don't take up mental bandwidth to maintain but are always available if they do turn out to be useful later (even if they need some updating, being self-contained with a clear interface gives a head start). The multi-process aspect is certainly a pain, and in principle it could be done with carefully curated libraries instead, but then the temptation for tangled interdependencies would be there and you'd be more tied to a specific language.
You just described library boundaries. You did not describe why the interface has to be a socket.
>the average engineer doesn't need to know about the eventually consistent data model of your active-active session system
People can create unnecessary connections between microservices too. It's only slightly harder than punching an extra hole through the public interface of an in-process API. Is that what we're spending so much time and money on?
Or only do that going forward; e.g. for future feature requests that would otherwise require adding another huge chunk to the monolith.
Upgrading a major version of an ORM, for instance. It seems there are certain key improvements that boil down to "the only way this can be improved would result in a big-bang release affecting the entire system and would take a dedicated team quite a long time". It's just not palatable for anyone.
For the parts we carve out as microservices we are essentially able to erase that technical debt. That comes at the cost of increased ops complexity. We're not ready or capable of running 50 microservices from an ops perspective, but we are capable of handling something on the order of ~5.
There are other options in your case, like isolating any dependencies in a module with a well defined API instead of having it all in a separate (micro-)service.
In reality there are maybe 1~3 hotspots in the domain that receive many more transactions than the rest.
In Java, OSGi was particularly good for this, but you could easily build it around a fairly simple DI framework in pretty much every major language.
The most important thing though is to keep your interfaces well-designed and well-communicated. They are the principal source of bugs and misunderstandings.
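To make that concrete: the "module with a well-defined API" can be as boring as an interface plus constructor injection at the composition root. A minimal sketch of what I mean, every name invented for illustration and the "DI framework" being nothing more than plain constructor injection:

    import java.util.HashMap;
    import java.util.Map;
    import java.util.Optional;

    // Hypothetical domain object, just here so the example compiles.
    record Session(String id, String userId) {}

    // The only thing the rest of the codebase sees.
    interface SessionStore {
        Optional<Session> find(String sessionId);
        void save(Session session);
    }

    // One implementation behind the interface. Replacing it with a DB-backed
    // or network-backed version later changes the wiring, not the callers.
    final class InMemorySessionStore implements SessionStore {
        private final Map<String, Session> data = new HashMap<>();

        public Optional<Session> find(String sessionId) {
            return Optional.ofNullable(data.get(sessionId));
        }

        public void save(Session session) {
            data.put(session.id(), session);
        }
    }

    // Constructor injection at the composition root: callers receive a
    // SessionStore, never a concrete class.
    public class App {
        public static void main(String[] args) {
            SessionStore sessions = new InMemorySessionStore();
            sessions.save(new Session("abc", "user-42"));
            System.out.println(sessions.find("abc"));
        }
    }

If the component does turn out to deserve its own process later, the callers already only know the interface, so the cutover is mostly plumbing.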
You don't need to define rigid ownership.
Though some pieces will move slower than others (and some things will never change), a single deploy cadence is probably fine.
There's also the danger of an incomplete microservice migration where you've now got hobbled-together half-service weirdware that you have to support forever.
Don't do microservices until engineers working on their thing break your thing.
Ugh, I have to deal with this at my current job, except it's not an incomplete microservice migration; it's by design, and it comes with a weird hobbled-together event queue implemented as a table in the DB that has a known race condition that pops up every couple of weeks.
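I don't know exactly how ours is implemented under the hood, but the textbook race in DB-table queues is two workers selecting the same 'pending' row before either marks it claimed, so the job runs twice. On Postgres the usual fix is to claim rows atomically with FOR UPDATE SKIP LOCKED. A rough JDBC sketch, table and column names made up:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    public class QueueWorker {

        // The racy pattern: SELECT a pending row, then UPDATE it in a separate
        // step. Two workers can both read the same row before either updates it.

        // Postgres-style claim: lock the row as part of picking it, and let
        // other workers skip past locked rows instead of grabbing the same one.
        static Long claimNextJob(Connection conn) throws SQLException {
            conn.setAutoCommit(false);
            String pick = "SELECT id FROM job_queue WHERE status = 'pending' " +
                          "ORDER BY id LIMIT 1 FOR UPDATE SKIP LOCKED";
            try (PreparedStatement ps = conn.prepareStatement(pick);
                 ResultSet rs = ps.executeQuery()) {
                if (!rs.next()) {
                    conn.rollback();
                    return null; // nothing to do
                }
                long id = rs.getLong("id");
                try (PreparedStatement upd = conn.prepareStatement(
                        "UPDATE job_queue SET status = 'running' WHERE id = ?")) {
                    upd.setLong(1, id);
                    upd.executeUpdate();
                }
                conn.commit();
                return id;
            }
        }
    }

Whether that maps onto our particular weirdware is another question.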
If you really need to build a microservice to replace a part of your monolith, you need to hire a few devs who can act as a dedicated team.
Perhaps the strangler application approach could work to resolve that aspect over time.
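For anyone unfamiliar, the idea is to put a thin routing layer in front, send a handful of routes to the new service, and leave everything else hitting the monolith until it withers. A toy sketch of the routing rule, with all paths and hostnames invented:

    import java.util.LinkedHashMap;
    import java.util.Map;

    // Strangler-style routing: a few path prefixes go to new services,
    // everything else keeps hitting the monolith. The map grows as pieces
    // get carved off, and the monolith shrinks.
    public class StranglerRouter {

        private static final Map<String, String> MIGRATED = new LinkedHashMap<>();
        static {
            MIGRATED.put("/api/invoices", "http://invoice-service.internal");
            MIGRATED.put("/api/notifications", "http://notification-service.internal");
        }

        static String backendFor(String path) {
            return MIGRATED.entrySet().stream()
                .filter(e -> path.startsWith(e.getKey()))
                .map(Map.Entry::getValue)
                .findFirst()
                .orElse("http://monolith.internal"); // default: the old app
        }

        public static void main(String[] args) {
            System.out.println(backendFor("/api/invoices/123")); // new service
            System.out.println(backendFor("/api/orders/9"));     // monolith
        }
    }

In practice the same rule usually lives in the reverse proxy config rather than application code, but the shape is the same.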
This figure excludes other code bases those developers worked on (for instance, at the peak, those ten developers included several who were working on other, discontinued products part time, or even for most of their time).
It also excludes test data, schemas etc.
And several developers took great pride in deleting code; I know I used to have more "-" lines than "+" lines in my few years.
So... Yeah. Thanks for denying the reality I live every day.
We've written over 10 million lines as 20-40 people in the past 15 years. I'll be sure to tell everyone we're not supposed to be moving so fast.
Imagine for a minute that there were 50 developers (ignoring managers and administrative overhead) trying to understand 500 lines a day (a medium-sized module in most languages); at roughly 200 working days a year that's about 5 million lines between them, so they still wouldn't be able to get through half the codebase in a whole year. So basically most of the codebase can't be read or maintained.
Does it have a dictionary of 3M words to support spell check? :D
It's a combination of three different problems working against us in concert.
1) The compute layer is multitenant but the databases are single-tenant (so one physical DB server can hold several hundred tenant databases, with each customer having their own).
2) We're locked into some very old dependencies we cannot upgrade, because upgrading one thing cascades into needing to upgrade everything. This holds us back from leveraging some benefits of more modern tech.
3) Certain entities in the system have known limits: when a customer exceeds a certain threshold, the performance of loading certain screens or reports becomes unacceptable. Most customers don't come near those limits, but a few do, and those few wind up blowing up a database server from time to time, affecting other clients.
For most of the domain stuff, to be honest I'd like to fix the performance problems and deadlocks by just making data access as efficient as possible in those spots. I think that could get us quite a bit more mileage if we took it seriously and pushed it.
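Concretely, the kind of thing I have in mind is boring stuff like collapsing N+1 query loops on the hot screens into one set-based query. A generic sketch of the shape of that fix (not our schema, all table and column names made up):

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.util.ArrayList;
    import java.util.List;

    public class ReportQueries {

        // The hot-path anti-pattern: one query per order, so a big customer
        // with tens of thousands of orders turns one screen load into tens of
        // thousands of round trips (and plenty of lock contention).
        //
        //   for (long orderId : orderIds) {
        //       SELECT sku, quantity FROM order_lines WHERE order_id = ?   // x N
        //   }

        // One set-based query instead: a single round trip, and the DB can use
        // its indexes across the whole customer at once.
        static List<String> loadLineSummaries(Connection conn, long customerId) throws SQLException {
            String sql = "SELECT o.id, ol.sku, ol.quantity " +
                         "FROM orders o " +
                         "JOIN order_lines ol ON ol.order_id = o.id " +
                         "WHERE o.customer_id = ?";
            List<String> rows = new ArrayList<>();
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                ps.setLong(1, customerId);
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        rows.add(rs.getLong("id") + ": " + rs.getString("sku")
                                 + " x" + rs.getInt("quantity"));
                    }
                }
            }
            return rows;
        }
    }
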
For the single-tenant database situation, I don't really know how to approach fixing that. I don't see us having enough resources to ever re-engineer it as it stands. Maybe it's possible for us as a team, maybe it's not. The thinking is that for the parts of the domain we're able to split out, we could make those datastores multitenant.
There's also a bunch of integration code stuck in the monolith that causes various noisy neighbor problems that we are trying to carve out. I think that's a legitimate thing to do and will be quite beneficial.
But yeah... It's a path we're dipping our toes into this year in an effort to address all of these problems which are too big for us to tackle one by one.
I propose this because I think having database instances split up by tenant (even if multiple DBs share the same physical server) is actually a pretty good place to be, especially if you can shuffle per-tenant databases around onto new hardware and play "tetris" with the noisiest tenants' DBs. Moving back to multitenant-everything seems like a regression, and using (message|web|request) routing to break the compute layer up into per-tenant or per-domain clusters of hardware can often unlock some of the main benefits of microservices without a massive engineering effort.
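The routing piece doesn't have to be fancy, either. At its core it's a tenant -> database (or tenant -> cluster) lookup at the edge of each request, something like this sketch (every name and URL hypothetical):

    import java.util.Map;

    // Tenant-aware routing at the compute layer: each request resolves its
    // tenant to a database URL (or cluster address), so a noisy tenant can be
    // moved to dedicated hardware by changing the mapping, not the code.
    public class TenantRouter {

        // In practice this mapping lives in a config store or lookup table.
        private final Map<String, String> tenantToDbUrl;
        private final String defaultDbUrl;

        public TenantRouter(Map<String, String> tenantToDbUrl, String defaultDbUrl) {
            this.tenantToDbUrl = tenantToDbUrl;
            this.defaultDbUrl = defaultDbUrl;
        }

        public String dbUrlFor(String tenantId) {
            return tenantToDbUrl.getOrDefault(tenantId, defaultDbUrl);
        }

        public static void main(String[] args) {
            TenantRouter router = new TenantRouter(
                Map.of("big-noisy-tenant", "jdbc:postgresql://db-dedicated-01/big_noisy"),
                "jdbc:postgresql://db-shared-01/tenants");

            System.out.println(router.dbUrlFor("big-noisy-tenant")); // dedicated box
            System.out.println(router.dbUrlFor("small-tenant"));     // shared server
        }
    }

Shuffling a tenant onto new hardware then becomes a data migration plus a one-line mapping change, which is a lot cheaper than re-architecting the storage model.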
This pretty much describes exactly where we are right now. We've been able to migrate the big customers to a new, less overloaded database server. We could continue to do that. I believe it's what you call a "bridge" architecture, so the compute layer is stateless and can serve any tenant. It's also got a queue/service bus to offload a lot of stuff that the web servers shouldn't be doing. That stuff is all on autoscaling but even that's not a panacea.