Because there is no newly invented architecture called "modular monolith" - monolith was always supposed to be MODULAR from the start.
Microservices were not an answer to monoliths being bad. Something somewhere went really wrong with people's understanding, and there is a bunch of totally wrong ideas floating around.
That is maybe also because a lot of people did not know they were supposed to make modules in their code, and loads of monoliths ended up as spaghetti code, just like lots of microservice architectures now end up with everything depending on everything.
The interesting thing about microservices is not that it lets you split up your code on module boundaries. Obviously you can (and should!) do that inside any codebase.
The thing about microservices is that it breaks up your data and deployment on module boundaries.
Monoliths are monoliths not because they lack separation of concerns in code (something which lacks that is not a ‘monolith’, it is what’s called a ‘big ball of mud’)
Monoliths are monoliths because they have
- one set of shared dependencies
- one shared database
- one shared build pipeline
- one shared deployment process
- one shared test suite
- one shared entrypoint
As organizations and applications get larger these start to become liabilities.
Microservices are part of one solution to that (not a whole solution; not the only one).
Monoliths don’t actually look like that at scale. For example, you can easily have multiple data stores for different reasons, including multiple different kinds of databases. Here’s the tiny relational database used internally, there’s the giant tape library archiving all the scientific data we actually care about, here’s the hard real-time system, and over there’s the billing data, and so on.
The value of a monolith is that it looks like a single, comprehensible thing from the outside; internally it still needs to actually work.
> But all those data sources are connected to from the same runtime, right?
Yes, this is an accurate assessment from what I've seen.
> And to run it locally you need access to dev versions of all of them.
In my experience, no. If the runtime never needs to access it because you're only doing development related to datastore A, it shouldn't fall over just because you haven't configured datastore B. Lots of easy ways to either skip this in the runtime or have a mocked interface.
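Roughly what I mean, as a hedged sketch (Java, with made-up names, not any particular framework): the real store is only wired up when this run is configured for it, and falls back to an in-memory stub otherwise, so datastore B never has to exist locally.

    // Hypothetical sketch: only the datastores that are actually configured get real connections.
    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;

    interface CommentStore {
        void save(String comment);
    }

    class InMemoryCommentStore implements CommentStore {
        private final List<String> comments = new ArrayList<>();
        public void save(String comment) { comments.add(comment); }
    }

    class JdbcCommentStore implements CommentStore {
        private final String jdbcUrl;
        JdbcCommentStore(String jdbcUrl) { this.jdbcUrl = jdbcUrl; }
        public void save(String comment) { /* talk to the real database here */ }
    }

    class DataStores {
        // If datastore B isn't configured, don't fail at startup - stub it instead.
        static CommentStore commentStore(Map<String, String> config) {
            String url = config.get("comments.jdbcUrl");
            return (url == null) ? new InMemoryCommentStore() : new JdbcCommentStore(url);
        }
    }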
> And when there’s a security vulnerability in your comment system your tape library gets wiped.
This one really depends but I think can be an accurate criticism of many systems. It's most true, I think, when you're at an in-between scale where you're big enough to be a target but haven't yet gotten big enough to afford more dedicated security testing at an application code level.
> But all those data sources are connected to from the same runtime, right?
Not always directly; often a modern wrapper is set up around a legacy system that was never designed for network access. This can easily mean two different build systems etc., but people argue about what is and isn't a monolith at that point.
Nobody counts the database or OS as separate systems in these breakdowns so IMO the terms are somewhat flexible. Plenty of stories go “In the beginning someone built a spreadsheet … shell script … and the great beast was hidden behind a service. Woe be unto thee who dares disturb his slumber.”
This actually feels like a good example of the modularity that i talked about, as well as of feature flags. Of course, in some projects it's not what one would call a new architecture (like in my blog post), but rather just careful usage of feature flags.
> But all those data sources are connected to from the same runtime, right?
Surely you could have multiple instances of your monolithic app:
If the actual code doesn't violate the 12 Factor App principles, there should be no problems with these runtimes working in parallel: https://12factor.net/ (e.g. storing data in memory vs in something external like Redis, or using the file system for storage vs something like S3)
> And to run it locally you need access to dev versions of all of them.
With the above, that's no longer necessary. Even in the more traditional monolithic profiles without explicit feature flags at work, i still have different run profiles.
Do i want to connect to a live data source and work with some of the test data on the shared dev server? I can probably do that. Do i want to just mock the functionality instead and use some customizable data generation logic for testing? Maybe a local database instance that's running in a container so i don't have to deal with the VPN slowness? Or maybe switch between a local service that i have running locally and another one on the dev server, to see whether they differ in any way?
All of that is easily possible nowadays.
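Just as a rough sketch of what i mean by run profiles (all names are hypothetical, any config mechanism would do): the same build picks a different data source per run.

    // Hypothetical run profiles: the same build, a different catalog data source per run.
    import java.util.Map;

    interface CatalogClient {
        String lookup(String id);
    }

    class SharedDevCatalog implements CatalogClient {
        public String lookup(String id) { return "record from shared dev server: " + id; }
    }

    class LocalContainerCatalog implements CatalogClient {
        public String lookup(String id) { return "record from local container DB: " + id; }
    }

    class GeneratedTestCatalog implements CatalogClient {
        public String lookup(String id) { return "generated test record: " + id; }
    }

    class RunProfiles {
        // CATALOG_PROFILE is a made-up variable name, not any specific framework's setting.
        static CatalogClient catalog(Map<String, String> env) {
            switch (env.getOrDefault("CATALOG_PROFILE", "mock")) {
                case "shared-dev": return new SharedDevCatalog();
                case "local":      return new LocalContainerCatalog();
                default:           return new GeneratedTestCatalog();
            }
        }
    }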
> And when there’s a security vulnerability in your comment system your tape library gets wiped.
Unless the code for the comment system isn't loaded, because the functionality isn't enabled.
This last bit is where i think everything falls apart. Too many frameworks out there are okay with "magic": taking away control over how your code and its dependencies are initialized, oftentimes doing so dynamically with overcomplicated logic (such as DI in the Spring framework in Java). Contrast that with the startup of your application's threads being a long list of features and their corresponding feature flag/configuration checks in your programming language of choice.
Personally, i feel that in that particular regard, we'd benefit more from a lack of reflection, DSLs, configuration in XML/YAML etc., at least when you're trying to replace writing code in your actual programming language with those, as opposed to using any of them as simple key-value stores for your code to process.
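For example, a minimal sketch of the kind of explicit startup i mean (hypothetical flag and module names, no framework involved):

    // Hypothetical sketch: every module behind an explicit flag check at startup,
    // no reflection, no DSL - if the comment system is off, its code never runs.
    import java.util.Map;

    class Application {
        public static void main(String[] args) {
            Map<String, String> flags = System.getenv();   // or any other key-value source

            if ("true".equals(flags.get("ENABLE_COMMENTS"))) startComments();
            if ("true".equals(flags.get("ENABLE_BILLING")))  startBilling();
            if ("true".equals(flags.get("ENABLE_ARCHIVER"))) startArchiver();
        }

        static void startComments() { /* spin up the comment module's threads */ }
        static void startBilling()  { /* spin up the billing module */ }
        static void startArchiver() { /* spin up the tape-library integration */ }
    }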
You're talking about something very odd here... a monorepo, with a monolithic build output, but that... transforms into any of a number of different services at runtime based on configuration?
Is this meant to be simpler than straight separate codebase microservices?
This is actually quite a nice sweet spot on the mono/micro spectrum. Most bigger software shops I've worked at had this architecture, though it isn't always formally specified. Different servers run different subsets of monolith code and talk to specific data stores.
The benefits are numerous, though the big obvious problem does need a lot of consideration: with a growing codebase and engineering staff, it's easy to introduce calls into code/data stores from unexpected places, causing various issues.
I'd argue that so long as you pay attention to that problem as a habit/have strong norms around "think about what your code talks to, even indirectly", you can scale for a very long time with this architecture. It's not too hard to develop tooling to provide visibility into what's-called-where and test for/audit/track changes when new callers are added. If you invest in that tooling, you can enforce internal boundaries quite handily, while sidestepping a ton of the organizational and technical problems that come with microservices.
Of course, if you start from the other end of the mono/micro spectrum and have a strong culture of e.g. "understand the service mesh really well and integrate with it as fully as possible" you can do really well with a microservice-oriented environment. So I guess this boils down to "invest in tooling and cultivate a culture of paying attention to your architectural norms and you will tend towards good engineering" ... who knew?
> You're talking about something very odd here... a monorepo, with a monolithic build output, but that... transforms into any of a number of different services at runtime based on configuration?
Shudder...a previous team's two primary services were configured in exactly this way (since before I arrived). Trust me, it isn't (and wasn't) a good idea. I had more important battles to fight than splitting them out (and that alone should tell you something of the situation they were in!).
It's really not odd at all...this is how compilers work...we have been doing it forever.
Microservices were a half-baked solution to a non-problem, partly driven by corporate stupidity and charlatan 'security' experts. I'm sure big companies make it work at enough scale, but everything in a microservice architecture was achievable with configuration and hot-patching. Incidentally, you don't get rid of either with a microservice architecture, you just have more of it with more moving parts...an absolute spaghetti-mess nightmare.
It’s not that odd. Databases, print servers, or web servers for example do something similar with multiple copies of the same software running on a network with different settings. Using a single build for almost identical services running on classified and unclassified networks is what jumps to mind.
It can be. If you have two large services that need 99+% of the same code and they're built by the same team, it can be easier to maintain them as a single project.
A better example is something like a chain restaurant running their point of sale software at every location so they can keep operating when the internet is out. At the same time they want all that data on the same corporate network for analysis, record keeping, taxes etc.
> You're talking about something very odd here... a monorepo, with a monolithic build output, but that... transforms into any of a number of different services at runtime based on configuration?
I'd say that it's more uncommon than it is odd. The best example of this working out wonderfully is GitLab's Omnibus distribution - essentially one common package (e.g. in a container context) that has all of the functionality that you might want included inside of it, which is managed by feature flags: https://docs.gitlab.com/omnibus/
Now, i wouldn't go as far as to bundle the actual DB with the apps that i develop (outside of databases for being able to test the instance more easily, like what SonarQube does, so you don't need an external DB to try out their product locally etc.), but in my experience having everything have consistent versions and testing that all of them work together makes for a really easy solution to administer.
Want to use the built in GitLab CI functionality for app builds? Just toggle it on! Are you using Jenkins or something else? No worries, leave it off.
Want to use the built in package registry for storing build artefacts? It's just another toggle! Are you using Nexus or something else? Once again, just leave it off.
Want SSL/TLS? There's a feature flag for that. Prefer to use external reverse proxy? Sure, go ahead.
Want monitoring with Prometheus? Just another feature flag. Low on resources and would prefer not to? It has got your back.
Now, one can argue about where to draw the line between pieces of software that make up your entire infrastructure vs the bits of functionality that should just belong within your app, but in my eyes the same approach can also work really nicely for modules in a largely monolithic codebase.
> Is this meant to be simpler than straight separate codebase microservices?
Quite a lot, actually!
If you want to do microservices properly, you'll need them to communicate with one another and therefore have internal APIs and clearly defined service boundaries, as well as plenty of code to deal with the risks posed by an unreliable network (e.g. any networked system). Not only that, but you'll also need solutions to make sense of it all - from service meshes, to distributed tracing. Also, you'll probably want to apply lots of DDD and before long changes in the business concepts will mean having to refactor code across multiple services. Oh, and testing will be difficult in practice, if you want to do reliable integration testing, as will local development be (do you launch everything locally? do you have the run configurations for that versioned? do you have resource limits set up properly? or do you just connect to shared dev environments, that might cause difficulties in logging, debugging and consistency with what you have locally?).
Microservices are good for solving a particular set of problems (e.g. multiple development teams, one per domain/service, or needing lots of scalability), but adding them to a project too early is sure to slow it down and possibly make it unsuccessful if you don't have the pre-existing expertise and tools that they require. Many don't.
In contrast, consider the monolithic example above:
- you have one codebase with shared code (e.g. your domain objects) not being a problem
- if you want, you still can use multiple data stores or external integrations
- calling into another module can be as easy as a direct procedure call in it
- refactoring and testing both are now far more reliable and easy to do
- ops becomes easier, since you can just run a single instance with all of the modules loaded, or split it up later as needed
I'd argue that up to a certain point, this sort of architecture actually scales better than either of the alternatives. Compared to a regular monolith it's just a bit slower to develop, in that it requires you to think about boundaries between the packages/modules in your code, which i've seen not be done too often, leading to the "big ball of mud" type of architecture. So i guess in a way that can also be a feature of sorts?
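To make the "direct procedure call" point above concrete, here's a minimal sketch (module and method names are made up): the caller only sees an interface, so today's in-process call could later be backed by a remote client without touching the caller, if a module ever does get split out.

    // Hypothetical module boundary inside a monolith: a plain interface and a direct call.
    interface InvoiceModule {
        String createInvoice(String customerId, long amountCents);
    }

    class LocalInvoiceModule implements InvoiceModule {
        public String createInvoice(String customerId, long amountCents) {
            return "inv-" + customerId + "-" + amountCents;   // ordinary in-process call
        }
    }

    class CheckoutModule {
        private final InvoiceModule invoices;
        CheckoutModule(InvoiceModule invoices) { this.invoices = invoices; }

        String completeOrder(String customerId, long totalCents) {
            // If invoicing were ever split out, only this wiring would change,
            // e.g. to an HTTP/gRPC-backed implementation of the same interface.
            return invoices.createInvoice(customerId, totalCents);
        }
    }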
I'd like to challenge one part of your comment - that microservices break up data on module boundaries. Yes, they encapsulate the data. However, the issue that causes spaghettification (whether internal to some mega monolith, across modules, or between microservices), is the semantic coupling related to needing to understand data models. Dependency hell arises when we need to share an agreed understanding about something across boundaries. When that agreed understanding has to change - microservices won't necessarily make your life easier.
This is not a screed against microservices. Just calling out that within a "domain of understanding", semantic coupling is pretty much a fact of life.
that's not at all accurate of any of the monoliths I've worked on. This in particular describes exactly zero of them:
- one shared database
Usually there's one data access interface, but behind that interface there are multiple databases. This characterization doesn't even cover the most common of upgrades to data storage in monoliths: adding a caching layer to an existing database layer.
.NET Remoting, from 2002, was expressly designed to allow objects to be created either locally or on a different machine altogether.
I’m sure Java also had something very similar.
Monolith frameworks were always designed to be distributed.
The reason distributed code was not popular was because the hardware at the time did not justify it.
Further, treating hardware as cattle and not pets was not easy or possible because of the lack of a variety of technologies such as better devops tools, better and faster compilers, containerization, etc.
I would actually disagree - to me you can have "decent separation of concerns in your code" but still have only built the app to support a single entry point. "Modular monolith" to me is a system that is built with the view of being able to support multiple entry points, which is a bit more complex than just "separating concerns"
If your concerns are well separated in a monolith (in practice this means being able to call a given piece of functionality with high confidence that it will only talk to the data stores/external resources that you expect it will), adding new entry points is very easy.
Now, it's not trivial--going from, say, a do-everything webserver host to a separation of route-family-specific web servers, background job servers, report processing hosts, and cron job runners does require work no matter how you slice it--but it's a more or less a mechanical or "plumbing" problem if you start from a monolithic codebase that is already well behaved. Modularity is one significant part of said good behavior.
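A rough sketch of what "adding entry points" can look like in practice (class names hypothetical): one build artifact, several main() methods, each loading only the modules it needs.

    // Hypothetical entry points sharing the same modules: same build, different main().
    class ReportModule {
        void generateDailyReports() { /* shared module code, also callable from the web server */ }
    }

    class WebServerMain {
        public static void main(String[] args) {
            // serve HTTP routes, calling into the shared modules
        }
    }

    class JobWorkerMain {
        public static void main(String[] args) {
            // pull work from a queue and call into the same shared modules
        }
    }

    class CronRunnerMain {
        public static void main(String[] args) {
            new ReportModule().generateDailyReports();
        }
    }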
My theory is that microservices became vogue along with dynamically typed languages. Lack of static types means that code becomes unmanageable at a much lower level of complexity. So the complexity was "moved to the network", which looks like a clear win if you never look outside a single component.
I've wondered if it's not a ploy by cloud vendors and the ecosystem around them to increase peoples' cloud bills. Not only do you end up using many times more CPU but you end up transferring a lot of data between availability zones, and many clouds bill for that.
A microservice architecture also tends to lock you into requiring things like Kubernetes, further increasing lock-in to the managed cloud paradigm if not to individual clouds.
> I've wondered if it's not a ploy by cloud vendors and the ecosystem around them to increase peoples' cloud bills. Not only do you end up using many times more CPU but you end up transferring a lot of data between availability zones, and many clouds bill for that.
Disagree. I'd argue that microservices are inherently more cost effective to scale. By breaking up your services you can deploy them in arbitrary ways, essentially bin packing N microservices onto K instances.
When your data volume is light you reduce K and repack your N services.
Because your services are broken apart they're easier to move around and you have more fine grained scaling.
> further increasing lock-in to the managed cloud paradigm if not to individual clouds.
Also disagree. We use Nomad and it's not hard to imagine how we would move to another cloud.
More granular scaling. Scaling up 8, 16, or 32 GB (or even larger) instances is much more expensive than 1, 2, or 4 GB instances. In addition, monoliths tend to load more slowly since there's more code being loaded (so you can't scale up in sub-minute times).
Obviously there's lazy loading, caching, and other things to speed up application boot, but loading more code is still slower.
The archetypical "microservice" ecosystem I am aware of is Google's production environment. It was, at that point, primarily written in C++ and Java, neither very famous for being dynamically typed.
But, it was a microservice architecture built primarily on RPCs and not very much on message buses. And RPCs that, basically, are statically typed (with code generation for client libs, and code generation for server-side stubbing, as it were). The open-source equivalent is gRPC.
Where "going microservice" is a potential saving is when different parts of your system have different scaling characteristics. Maybe your login system ends up scaling as O(log n), but one data-munging part of the system scales as O(n log n) and another as just O(n). And one annoying (but important) part scales as O(n * 2). With a monolith, you get LBs in place and you have to scale you monolith out as the part that has the worst scaling characteristic.
But, in an ideal microservice world (where you have an RPC-based mechanism taht can be load-balanced, rather than a shared message bus that is harder to much harder to scale), you simply dial up the scaling factor of each microservice on their own.
Amazon was also doing microservices very early and it was a monolithic C++ application originally (obidos).
Microservices were really more about locality and the ability to keep data in a memory cache on a thin service. Rather than having catalog data compete with the rest of the monolithic webserver app on the front-end webservers, requests went over the network to a load balancer, were hashed so that the same request from any of the webservers hit the same catalog server, and that catalog server usually had the right data in memory to serve the response.
Most of the catalog data was served from BDB files which had all the non-changing catalog data pushed out to the catalog server (initially this data had been pushed to the webserver). For updates all the catalog servers had real-time updates streamed to them and they wrote to a BDB file which was a log of new updates.
That meant that most of the time the catalog data was served out of a redis-like cache in memory (which due to the load balancer hashing on the request could use the aggregated size of the RAM on the catalog service). Rarely would requests need to hit the disk. And requests never needed to hit SQL and talk to the catalog databases.
In the monolithic world all those catalog requests are generated uniformly across all the webservers so there's no opportunity for locality, each webserver needs to have all the top 100000 items in cache, and that is competing with the whole rest of the application (and that's even after going to the world where its all predigested BDB files with an update service so that you're not talking SQL to databases).
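Not the actual code, obviously, but the hashing idea reads roughly like this sketch (a real setup would use consistent hashing and handle catalog servers joining or dropping out):

    // Rough sketch of locality via request hashing: the same item id always lands on the
    // same catalog server, so each server's cache holds a distinct slice of the catalog.
    import java.util.List;

    class CatalogRouter {
        private final List<String> catalogServers;

        CatalogRouter(List<String> catalogServers) {
            this.catalogServers = catalogServers;
        }

        String serverFor(String itemId) {
            int bucket = Math.floorMod(itemId.hashCode(), catalogServers.size());
            return catalogServers.get(bucket);
        }
    }

    // Every front-end asking about the same item is routed to the same catalog server,
    // so the usable cache is roughly the aggregate RAM across the catalog fleet.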
It depends. If your monolith requires, say, 16 GB to keep running, but under the hood is about 20 pieces, each happily using less than 1 GB, tripling only a few of these means you may be able to get away with (say) 25 GB. Whereas tripling your monolith means 48 GB.
You can obviously (FSVO "obvious") scale simply by deploying your monolith multiple times. But, the resource usage MAY be substantially higher.
But, it is a trade-off. In many cases, a monolith is simpler. And in at least some cases, a monolith may be better.
If you go with static languages you are pretty much stuck with Microsoft or Oracle, both of which are ahole companies who cannot be trusted. There is no sufficiently common 3rd static option, at least not tied to Big Tech.
Not really. You could do the same by having different organisational units being responsible for different libraries. And the final monolith being minimum glue code combined with N libraries. Basically the same way your code depend on libraries from other vendors/OSS maintainers.
The problem is that orgs are not set in stone. Teams get merged and split in reorgs, buyouts and mergers happen, suddenly your microservices designed around "cleanly defined boundaries" no longer make any sense. Sure you can write completely new microservices but that is a distraction from delivering value to the end customer.
So? The solution is that services travel with teams. I've been through more than one reorg. That's what always happens. The point is to have clear ownership and prevent conflicts from too many people working on the same code.
> monolith was always supposed to be MODULAR from the start
Well, that certainly is sensible, but I wasn't aware that someone had to invent the monolith and define how far it should go.
Alas, my impression is that the term "monolith" doesn't really refer to a pattern or format someone is deliberately aiming for in most cases, but instead refers to one big execution of a lot of code that is doing far more than it should have the responsibility to handle or is reasonable for one repository to manage.
I wish these sorts of battles would just go away, though, because it's not like micro services are actually bad, or even monoliths depending on the situation. They're just different sides of the same coin. A monolith results from not a lot of care for the future of the code and how it's going to be scaled or reused, and micro services are often written because of too much premature optimization.
Most things should be a "modular monolith". In fact I think most things should start out as modular monoliths inside monorepos, and then anything that needs to be split out into its own separate library or microservice can be made so later on.
No one had to invent the monolith or define how far it should go; it was the default.
Microservices came about because companies kept falling into the same trap: because the code base was shared, and because organizational pressures mean features > tech debt, there was always pressure to write spaghetti code rather than to refactor and properly encapsulate. That doesn't mean it couldn't be done, it just meant it was always a matter of time before the business needs meant spaghetti.
Microservices, on the other hand, promises enforced separation, which sounds like a good idea once you've been bitten by the pressures of the former. You -can't- fall into spaghetti. What it fails to account for, of course, is the increased operational overhead of deploying all those services and keeping them playing nicely with each other. That's not to say there aren't some actual benefits to them, too (language agnostic, faults can sometimes be isolated), but the purported benefits tend to be exaggerated, especially compared to "a monolith if we just had proper business controls to prevent engineers from feeling like they had to write kluges to deliver features in time".
The individual code bases of a microservice might not involve spaghetti code, but the interconnections certainly can. I'm looking at a diagram of the service I work on, with seven components (written in three languages), five databases, 25 internal connections, two external interfaces, and three connections to outside databases, all cross-wired (via 6 connections) with a similar setup geographically elsewhere in case we need to cut over. And that's the simplified diagram, not showing the number of individual instances of each component running.
There is clear separation of concerns among all the components, but it's the interconnections between them that take a while to pick up on.
Fair; I should have been more explicit - your code can't fall into spaghetti (since the services are so small). Of course, you're just moving that complexity into the infrastructure, where, yeah, you still have the same complexity and the same pressures.
> Because there is no newly invented architecture called "modular monolith" - monolith was always supposed to be MODULAR from the start.
Isn't "non-modular monolith" just spaghetti code? The way I understand it, "modular monolith" is just "an executable using libraries". Or is it supposed to mean something different?
The way I see it, spaghetti code is actually a very wide spectrum, with amorphous goto-based code on one end, and otherwise well structured code, but with too much reliance on global singletons, on the other (much more palatable) end. While by definition spaghetti code is not modular, modularity entails more. I would define modularity as an explicit architectural decision to encapsulate some parts of the codebase such that the interfaces between them change an order of magnitude less frequently than what's within the modules.
In my world, the solution depends on the requirement. I can't take all the criticism of each as if they are in competition with each other. Also, multiple monoliths (can't stand that word as well) can be applied to distribute resources and data, and to reduce dependencies.
Compute, storage, and other services have gotten to the point where they are effectively unlimited; they were originally designed for what were considered monolithic applications. Shared tenancy is not good in the age of multiple dependencies, nor in terms of security needs for mission-critical applications and data.
Cloud host providers got too eager in seeking to create new lines of business and pushed micro-service architectures far too early to maturity, and now we're just beginning to see their faults, many of which can't be fixed without major changes that will likely make them pretty much useless, or alternatively just similar to monolithic architectures anyway.
Profit and monopolistic goals shouldn't drive this type of IT innovation, solving critical problems should. We shouldn't just throw away all that we've engineered over the past decade and reinvent the wheel... Heck, many liars are still running FINTECH on COBOL.
> * got too eager in seeking to create new lines of business and pushed * far too early to maturity, and now we're just beginning to see it's faults, many which can't be fixed without major changes that will likely make them pretty much useless
> Because there is no newly invented architecture called "modular monolith" - monolith was always supposed to be MODULAR from the start.
Unless you're in a large established org, modularity is likely a counter-goal. Most startups are looking to iterate quickly to find PMF. Unless you write absolutely terrible code, my experience is you're more likely to have the product cycle off a set of functionality before you truly need modularity. From there, you either (1) survive long enough to make an active decision to change how you work, or (2) die, and architecture doesn't matter.
"Modular monolith" is a nice framing for teams who are at that transition point.
Agree. IMO modularity/coupling is the main issue. My issue w/ the microservice architecture is that it solves the modularity problem almost as a side effect of itself but introduces a whole host of new ones that people do not anticipate.
Yes, if you, at the outset, say we will separate things into separate services, you will get separated services. However, you do NOT need to take on the extra complexity that comes with communication between services, remote dependency management, and additional infrastructure to reduce coupling.
I explicitly claim to use a "citadel" architecture [0] when talking about breaking off services for very similar reasons. Having a single microservice split out from a monolith is a totally valid application of microservices, but I've had better results in conversation when I market it properly.
I've found this to go so far as to have "monolith" understood to mean "single instance of a system with state held in memory".