I wonder why no one ever talks about architectures in the middle between those two - modular monoliths.
The point where you split your codebase up into modules (or, if you're a proponent of hexagonal architecture, have designed it that way from the beginning), which lets you put functionality behind feature flags. That way, you can still run it either as a single-instance monolith, or as a set of horizontally scaled instances with only a few particular feature flags enabled (e.g. multiple web API instances), alongside some vertically scaled ones (e.g. a scheduled report instance).
In my eyes, the good part is that you work with one codebase and can refactor easily across all of it, you get better scalability than a plain monolith without all of the ops complexity from the outset, and you don't have to worry as much about shared code - or you can at least approach that issue gently, by extracting code packages first.
The only serious negatives are that this approach is still more limited than microservices: compilation times in static languages suffer, there will be a bit of overhead everywhere depending on how big your project is, and not every framework supports this approach easily.
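To make the "one build, different flags per instance" idea concrete, here's a minimal sketch in Java; the AppModule interface, the module names and the FEATURES environment variable are all made up for illustration rather than taken from any particular framework:

```java
// One build ships every module; per-instance flags decide what actually runs.
import java.util.List;
import java.util.Set;

interface AppModule {
    String name();
    void start();   // spin up whatever threads/endpoints this module needs
}

class WebApiModule implements AppModule {
    public String name() { return "web-api"; }
    public void start() { System.out.println("HTTP endpoints up"); }
}

class ReportModule implements AppModule {
    public String name() { return "reports"; }
    public void start() { System.out.println("report scheduler up"); }
}

public class Main {
    public static void main(String[] args) {
        // e.g. FEATURES=web-api on the horizontally scaled instances,
        // FEATURES=reports on the single beefy scheduled-report instance
        Set<String> enabled = Set.of(System.getenv()
                .getOrDefault("FEATURES", "web-api,reports").split(","));
        for (AppModule module : List.of(new WebApiModule(), new ReportModule())) {
            if (enabled.contains(module.name())) {
                module.start();
            }
        }
    }
}
```

The deployments then differ only in configuration, so the single-instance and split-out setups stay built and tested from the same code.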
Because there is no newly invented architecture called "modular monolith" - monolith was always supposed to be MODULAR from the start.
Microservices were not an answer to monoliths being bad. Something somewhere went really wrong with people's understanding, and there is a bunch of totally wrong ideas floating around.
That is maybe also because a lot of people did not know they were supposed to make modules in their code, and loads of monoliths ended up as spaghetti code - just like lots of microservice architectures now end up with everything depending on everything.
The interesting thing about microservices is not that it lets you split up your code on module boundaries. Obviously you can (and should!) do that inside any codebase.
The thing about microservices is that it breaks up your data and deployment on module boundaries.
Monoliths are monoliths not because they lack separation of concerns in code (something which lacks that is not a ‘monolith’, it is what’s called a ‘big ball of mud’).
Monoliths are monoliths because they have
- one set of shared dependencies
- one shared database
- one shared build pipeline
- one shared deployment process
- one shared test suite
- one shared entrypoint
As organizations and applications get larger these start to become liabilities.
Microservices are part of one solution to that (not a whole solution; not the only one).
Monoliths don’t actually look like that at scale. For example you can easily have multiple different data stores for different reasons, including multiple different kinds of databases. Here’s the tiny relational database used internally, and there’s the giant tape library archiving all the scientific data we actually care about. Here’s the hard real-time system, and over there’s the billing data, etc.
The value of a monolith is that it looks like a single thing from the outside that does something comprehensible; internally it still needs to actually work.
> But all those data sources are connected to from the same runtime, right?
Yes, this is an accurate assessment from what I've seen.
> And to run it locally you need access to dev versions of all of them.
In my experience, no. If the runtime never needs to access it because you're only doing development related to datastore A, it shouldn't fall over just because you haven't configured datastore B. Lots of easy ways to either skip this in the runtime or have a mocked interface.
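One way "skip this in the runtime" can look, sketched in Java (the Lazy helper, the STORE_B_URL variable and the println stand-ins are invented for illustration):

```java
// The connection is created lazily, so an unconfigured datastore only matters
// if a code path actually asks for it.
import java.util.function.Supplier;

class Lazy<T> {
    private final Supplier<T> factory;
    private T value;
    Lazy(Supplier<T> factory) { this.factory = factory; }
    synchronized T get() {
        if (value == null) value = factory.get();
        return value;
    }
}

public class LocalDev {
    static final Lazy<String> STORE_B = new Lazy<>(() -> {
        String url = System.getenv("STORE_B_URL");
        if (url == null) throw new IllegalStateException("STORE_B_URL not configured");
        return url;                 // a real app would open the connection here
    });

    public static void main(String[] args) {
        // Working only on features that need datastore A: B is never touched,
        // so the missing configuration never causes a failure.
        System.out.println("app started without datastore B");
    }
}
```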
> And when there’s a security vulnerability in your comment system your tape library gets wiped.
This one really depends but I think can be an accurate criticism of many systems. It's most true, I think, when you're at an in-between scale where you're big enough to be a target but haven't yet gotten big enough to afford more dedicated security testing at an application code level.
> But all those data sources are connected to from the same runtime, right?
Not always directly - often a modern wrapper is set up around a legacy system that was never designed for network access. That can easily mean two different build systems etc., but people argue about what is and isn’t a monolith at that point.
Nobody counts the database or OS as separate systems in these breakdowns, so IMO the terms are somewhat flexible. Plenty of stories go “In the beginning someone built a spreadsheet … shell script … and the great beast was hidden behind a service. Woe be unto thee who dare disturb his slumber.”
This actually feels like a good example of the modularity and feature flags that I talked about. Of course, in some projects it's not what one would call a new architecture (like in my blog post), but rather just careful usage of feature flags.
> But all those data sources are connected to from the same runtime, right?
Surely you could have multiple instances of your monolithic app, each with a different set of features enabled.
If the actual code doesn't violate the 12 Factor App principles, there should be no problems with these runtimes working in parallel: https://12factor.net/ (e.g. storing data in memory vs in something external like Redis, or using the file system for storage vs something like S3)
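As a rough illustration of the constraint that matters here (the interface and class names below are invented, not taken from the 12factor.net text): any state that parallel instances would otherwise share has to live behind a backing-service boundary, not in the process.

```java
// State shared between instances must sit behind a backing service, not in
// process memory or on the local file system.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

interface SessionStore {                       // the backing-service boundary
    void put(String sessionId, String user);
    String get(String sessionId);
}

// Fine while there is exactly one instance; silently wrong with two in parallel.
class InMemorySessionStore implements SessionStore {
    private final Map<String, String> sessions = new ConcurrentHashMap<>();
    public void put(String id, String user) { sessions.put(id, user); }
    public String get(String id) { return sessions.get(id); }
}

public class SessionDemo {
    public static void main(String[] args) {
        // A Redis- or DB-backed implementation would go behind the same interface
        // and be selected via configuration, so N instances see the same sessions.
        SessionStore store = new InMemorySessionStore();
        store.put("abc", "alice");
        System.out.println(store.get("abc"));
    }
}
```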
> And to run it locally you need access to dev versions of all of them.
With the above, that's no longer necessary. Even in the more traditional monolithic projects at work, without explicit feature flags, I still have different run profiles.
Do I want to connect to a live data source and work with some of the test data on the shared dev server? I can probably do that. Do I want to mock the functionality instead and use some customizable data generation logic for testing? Maybe a local database instance running in a container, so I don't have to deal with the VPN slowness? Or maybe switch between a service that I have running locally and another one on the dev server, to see whether they differ in any way?
All of that is easily possible nowadays.
> And when there’s a security vulnerability in your comment system your tape library gets wiped.
Unless the code for the comment system isn't loaded, because the functionality isn't enabled.
This last bit is where I think everything falls apart. Too many frameworks out there are okay with "magic" - taking away control over how your code and its dependencies are initialized, oftentimes doing so dynamically with overcomplicated logic (such as DI in the Spring framework in Java) - instead of the startup of your application's threads being a long list of features and their corresponding feature flag/configuration checks in your programming language of choice.
Personally, I feel that in that particular regard we'd benefit from less reflection, fewer DSLs and less configuration in XML/YAML etc., at least when those are used to replace writing code in your actual programming language, as opposed to serving as simple key-value stores for your code to process.
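For contrast, a hand-wired composition root can be this small; the Greeter class and its Clock dependency are invented purely for illustration, it's a sketch of the style rather than a framework replacement:

```java
// Dependencies are constructed and passed explicitly, in one readable place.
import java.time.Clock;

class Greeter {
    private final Clock clock;                     // dependency handed in by hand
    Greeter(Clock clock) { this.clock = clock; }
    String greet(String name) { return "Hello " + name + " at " + clock.instant(); }
}

public class Wiring {
    public static void main(String[] args) {
        // No classpath scanning, no annotations, no XML: the object graph is plain
        // code you can step through, and a test can pass Clock.fixed(...) instead.
        Greeter greeter = new Greeter(Clock.systemUTC());
        System.out.println(greeter.greet("world"));
    }
}
```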
You're talking about something very odd here... a monorepo, with a monolithic build output, but that... transforms into any of a number of different services at runtime based on configuration?
Is this meant to be simpler than straight separate codebase microservices?
This is actually quite a nice sweet spot on the mono/micro spectrum. Most bigger software shops I've worked at had this architecture, though it isn't always formally specified. Different servers run different subsets of monolith code and talk to specific data stores.
The benefits are numerous, though the big obvious problem does need a lot of consideration: with a growing codebase and engineering staff, it's easy to introduce calls into code/data stores from unexpected places, causing various issues.
I'd argue that so long as you pay attention to that problem as a habit/have strong norms around "think about what your code talks to, even indirectly", you can scale for a very long time with this architecture. It's not too hard to develop tooling to provide visibility into whats-called-where and test for/audit/track changes when new callers are added. If you invest in that tooling, you can enforce internal boundaries quite handily, while sidestepping a ton of the organizational and technical problems that come with microservices.
Of course, if you start from the other end of the mono/micro spectrum and have a strong culture of e.g. "understand the service mesh really well and integrate with it as fully as possible" you can do really well with a microservice-oriented environment. So I guess this boils down to "invest in tooling and cultivate a culture of paying attention to your architectural norms and you will tend towards good engineering" ... who knew?
> You're talking about something very odd here... a monorepo, with a monolithic build output, but that... transforms into any of a number of different services at runtime based on configuration?
Shudder...a previous team's two primary services were configured in exactly this way (since before I arrived). Trust me, it isn't (and wasn't) a good idea. I had more important battles to fight than splitting them out (and that alone should tell you something of the situation they were in!).
It's really not odd at all... this is how compilers work... we have been doing it forever.
Microservices were a half-baked solution to a non-problem, partly driven by corporate stupidity and charlatan 'security' experts - I'm sure big companies make it work at enough scale, but everything in a microservice architecture was achievable with configuration and hot-patching. Incidentally, you don't get rid of either with a microservice architecture, you just have more of it with more moving parts... an absolute spaghetti mess nightmare.
It’s not that odd. Databases, print servers, or web servers for example do something similar with multiple copies of the same software running on a network with different settings. Using a single build for almost identical services running on classified and unclassified networks is what jumps to mind.
It can be. If you have two large services that need 99+% of the same code and they're built by the same team, it can be easier to maintain them as a single project.
A better example is something like a chain restaurant running their point of sale software at every location so they can keep operating when the internet is out. At the same time they want all that data on the same corporate network for analysis, record keeping, taxes etc.
> You're talking about something very odd here... a monorepo, with a monolithic build output, but that... transforms into any of a number of different services at runtime based on configuration?
I'd say that it's more uncommon than it is odd. The best example of this working out wonderfully is GitLab's Omnibus distribution - essentially one common package (e.g. in a container context) that has all of the functionality that you might want included inside of it, which is managed by feature flags: https://docs.gitlab.com/omnibus/
Now, I wouldn't go as far as to bundle the actual DB with the apps that I develop (outside of bundled databases for easier local testing, like SonarQube does so you don't need an external DB to try out their product locally), but in my experience having everything at consistent versions, tested to work together, makes the solution really easy to administer.
Want to use the built in GitLab CI functionality for app builds? Just toggle it on! Are you using Jenkins or something else? No worries, leave it off.
Want to use the built in package registry for storing build artefacts? It's just another toggle! Are you using Nexus or something else? Once again, just leave it off.
Want SSL/TLS? There's a feature flag for that. Prefer to use external reverse proxy? Sure, go ahead.
Want monitoring with Prometheus? Just another feature flag. Low on resources and would prefer not to? It has got your back.
Now, one can argue about where to draw the line between pieces of software that make up your entire infrastructure vs the bits of functionality that should just belong within your app, but in my eyes the same approach can also work really nicely for modules in a largely monolithic codebase.
> Is this meant to be simpler than straight separate codebase microservices?
Quite a lot, actually!
If you want to do microservices properly, you'll need them to communicate with one another, and therefore have internal APIs and clearly defined service boundaries, as well as plenty of code to deal with the risks posed by an unreliable network (i.e. any networked system). Not only that, but you'll also need solutions to make sense of it all - from service meshes to distributed tracing. You'll probably want to apply lots of DDD, and before long changes in the business concepts will mean refactoring code across multiple services. Oh, and reliable integration testing will be difficult in practice, as will local development (do you launch everything locally? do you have the run configurations for that versioned? do you have resource limits set up properly? or do you just connect to shared dev environments, which might cause difficulties in logging, debugging and consistency with what you have locally?).
Microservices are good for solving a particular set of problems (e.g. multiple development teams, one per domain/service, or needing lots of scalability), but adding them to a project too early is sure to slow it down and possibly make it unsuccessful if you don't have the pre-existing expertise and tools that they require. Many don't.
In contrast, consider the monolithic example above:
- you have one codebase with shared code (e.g. your domain objects) not being a problem
- if you want, you still can use multiple data stores or external integrations
- calling into another module can be as easy as a direct procedure call
- refactoring and testing both are now far more reliable and easy to do
- ops becomes easier, since you can just run a single instance with all of the modules loaded, or split it up later as needed
I'd argue that up to a certain point this sort of architecture actually scales better than either of the alternatives. Compared to a regular monolith it's just a bit slower to develop, in that it requires you to think about the boundaries between the packages/modules in your code - something I've seen done too rarely, leading to the "big ball of mud" type of architecture. So I guess in a way that can also be a feature of sorts?
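A sketch of the "direct procedure call" point in Java (OrderModule, BillingModule and Invoice are invented names): the module boundary is an ordinary interface, so crossing it needs no HTTP client, retries or serialization.

```java
import java.math.BigDecimal;

record Invoice(String orderId, BigDecimal total) {}   // shared domain object

interface BillingModule {                              // the module boundary
    Invoice invoiceFor(String orderId);
}

class SimpleBillingModule implements BillingModule {
    public Invoice invoiceFor(String orderId) {
        return new Invoice(orderId, new BigDecimal("42.00"));
    }
}

class OrderModule {
    private final BillingModule billing;
    OrderModule(BillingModule billing) { this.billing = billing; }
    void checkout(String orderId) {
        // a direct, refactorable, type-checked call into the other module
        Invoice invoice = billing.invoiceFor(orderId);
        System.out.println("charged " + invoice.total());
    }
}

public class CheckoutDemo {
    public static void main(String[] args) {
        new OrderModule(new SimpleBillingModule()).checkout("o-1");
    }
}
```

If billing is ever split out into its own service later, the interface gives you a natural seam to put a remote client behind.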
I'd like to challenge one part of your comment - that microservices break up data on module boundaries. Yes, they encapsulate the data. However, the issue that causes spaghettification (whether internal to some mega monolith, across modules, or between microservices), is the semantic coupling related to needing to understand data models. Dependency hell arises when we need to share an agreed understanding about something across boundaries. When that agreed understanding has to change - microservices won't necessarily make your life easier.
This is not a screed against microservices. Just calling out that within a "domain of understanding", semantic coupling is pretty much a fact of life.
that's not at all accurate of any of the monoliths I've worked on. This in particular describes exactly zero of them:
- one shared database
Usually there's one data access interface, but behind that interface there are multiple databases. This characterization doesn't even cover the most common of upgrades to data storage in monoliths: adding a caching layer to an existing database layer.
.NET Remoting, from 2002, was expressly designed to allow objects to be created either locally or on a different machine altogether.
I’m sure Java also had something very similar.
Monolith frameworks were always designed to be distributed.
The reason distributed code was not popular was because the hardware at the time did not justify it.
Further, treating hardware as cattle and not pets was not easy or possible because of the lack of a variety of technologies such as better devops tools, better and faster compilers, containerization, etc.
I would actually disagree - to me you can have "decent separation of concerns in your code" but still have only built the app to support a single entry point. "Modular monolith" to me is a system that is built with the view of being able to support multiple entry points, which is a bit more complex than just "separating concerns"
If your concerns are well separated in a monolith (in practice this means being able to call a given piece of functionality with high confidence that it will only talk to the data stores/external resources that you expect it will), adding new entry points is very easy.
Now, it's not trivial--going from, say, a do-everything webserver host to a separation of route-family-specific web servers, background job servers, report processing hosts, and cron job runners does require work no matter how you slice it--but it's more or less a mechanical or "plumbing" problem if you start from a monolithic codebase that is already well behaved. Modularity is one significant part of said good behavior.
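The "plumbing" can be as mundane as adding another main class; a sketch (all names invented) of the same well-separated module gaining a second entry point without its internals changing:

```java
class ReportService {                       // the existing, well-behaved module
    String dailyReport() { return "42 orders today"; }
}

class WebEntryPoint {                       // deployable A: the do-everything web host
    public static void main(String[] args) {
        System.out.println("GET /report -> " + new ReportService().dailyReport());
    }
}

class ReportJobEntryPoint {                 // deployable B: a new cron/report host
    public static void main(String[] args) {
        System.out.println("emailing: " + new ReportService().dailyReport());
    }
}
```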
My theory is that microservices became vogue along with dynamically typed languages. Lack of static types means that code becomes unmanageable at a much lower level of complexity. So the complexity was "moved to the network", which looks like a clear win if you never look outside a single component.
I've wondered if it's not a ploy by cloud vendors and the ecosystem around them to increase peoples' cloud bills. Not only do you end up using many times more CPU but you end up transferring a lot of data between availability zones, and many clouds bill for that.
A microservice architecture also tends to lock you into requiring things like Kubernetes, further increasing lock-in to the managed cloud paradigm if not to individual clouds.
> I've wondered if it's not a ploy by cloud vendors and the ecosystem around them to increase peoples' cloud bills. Not only do you end up using many times more CPU but you end up transferring a lot of data between availability zones, and many clouds bill for that.
Disagree. I'd argue that microservices are inherently more cost effective to scale. By breaking up your services you can deploy them in arbitrary ways, essentially bin packing N microservices onto K instances.
When your data volume is light you reduce K and repack your N services.
Because your services are broken apart they're easier to move around and you have more fine grained scaling.
> further increasing lock-in to the managed cloud paradigm if not to individual clouds.
Also disagree. We use Nomad and it's not hard to imagine how we would move to another cloud.
More granular scaling. Scaling up 8, 16, 32 GB or even larger instances is much more expensive than 1, 2, 4 GB instances. In addition, monoliths tend to load slower, since there's more code being loaded (so you can't scale up in sub-minute times).
Obviously there's lazy loading, caching, and other things to speed up application boot but loading more code is still slower
The archetypical "microservice" ecosystem I am aware of is Google's production environment. It was, at that point, primarily written in C++ and Java, neither very famous for being dynamically typed.
But, it was a microservice architecture built primarily on RPCs and not very much on message buses. And RPCs that, basically, are statically typed (with code generation for client libs, and code generation for server-side stubbing, as it were). The open-source equivalent is gRPC.
Where "going microservice" is a potential saving is when different parts of your system have different scaling characteristics. Maybe your login system ends up scaling as O(log n), but one data-munging part of the system scales as O(n log n) and another as just O(n). And one annoying (but important) part scales as O(n * 2). With a monolith, you get LBs in place and you have to scale you monolith out as the part that has the worst scaling characteristic.
But, in an ideal microservice world (where you have an RPC-based mechanism taht can be load-balanced, rather than a shared message bus that is harder to much harder to scale), you simply dial up the scaling factor of each microservice on their own.
Amazon was also doing microservices very early and it was a monolithic C++ application originally (obidos).
Microservices were really more about locality and the ability to keep data in a memory cache on a thin service. Rather than having catalog data competing with the rest of the monolithic webserver app on the front-end webservers, requests went over the network to a load balancer, where they were hashed so that the same request from any of the webservers hit the same catalog server; that catalog server then usually had the right data to serve the response out of memory.
Most of the catalog data was served from BDB files which had all the non-changing catalog data pushed out to the catalog server (initially this data had been pushed to the webserver). For updates all the catalog servers had real-time updates streamed to them and they wrote to a BDB file which was a log of new updates.
That meant that most of the time the catalog data was served out of a redis-like cache in memory (which due to the load balancer hashing on the request could use the aggregated size of the RAM on the catalog service). Rarely would requests need to hit the disk. And requests never needed to hit SQL and talk to the catalog databases.
In the monolithic world all those catalog requests are generated uniformly across all the webservers, so there's no opportunity for locality: each webserver needs to have all of the top 100000 items in cache, and that cache is competing with the whole rest of the application (and that's even after moving to the world where it's all predigested BDB files with an update service, so that you're not talking SQL to databases).
It depends. If your monolith requires, say, 16 GB to keep running, but under the hood is about 20 pieces, each happily using less than 1 GB, tripling only a few of these means you may be able to get away with (say) 25 GB. Whereas tripling your monolith means 48 GB.
You can obviously (FSVO "obvious") scale simply by deploying your monolith multiple times. But, the resource usage MAY be substantially higher.
But, it is a trade-off. In many cases, a monolith is simpler. And in at least some cases, a monolith may be better.
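Making those numbers concrete (the assumption that roughly five of the twenty pieces are the hot ones is mine, just to land near the figures quoted above):

```latex
% Tripling the whole monolith vs. tripling only the hot pieces
3 \times 16\,\mathrm{GB} = 48\,\mathrm{GB}
\qquad\text{vs.}\qquad
16\,\mathrm{GB} + 2 \times (5 \times {\sim}0.9\,\mathrm{GB}) \approx 25\,\mathrm{GB}
```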
If you go with static languages you are pretty much stuck with Microsoft or Oracle, both of which are ahole companies who cannot be trusted. There is no sufficiently common 3rd static option, at least not tied to Big Tech.
Not really. You could do the same by having different organisational units be responsible for different libraries, with the final monolith being minimal glue code combined with N libraries - basically the same way your code depends on libraries from other vendors/OSS maintainers.
The problem is that orgs are not set in stone. Teams get merged and split in reorgs, buyouts and mergers happen, suddenly your microservices designed around "cleanly defined boundaries" no longer make any sense. Sure you can write completely new microservices but that is a distraction from delivering value to the end customer.
So? The solution is that services travel with teams. I've been through more than one reorg. That's what always happens. The point is to have clear ownership and prevent conflicts from too many people working on the same code.
> monolith was always supposed to be MODULAR from the start
Well, that certainly is sensible, but I wasn't aware that someone had to invent the monolith and define how far it should go.
Alas, my impression is that the term "monolith" doesn't really refer to a pattern or format someone is deliberately aiming for in most cases, but instead refers to one big execution of a lot of code that is doing far more than it should have the responsibility to handle or is reasonable for one repository to manage.
I wish these sorts of battles would just go away, though, because it's not like micro services are actually bad, or even monoliths depending on the situation. They're just different sides of the same coin. A monolith results from not a lot of care for the future of the code and how it's going to be scaled or reused, and micro services are often written because of too much premature optimization.
Most things should be a "modular monolith". In fact I think most things should start out as modular monoliths inside monorepos, and then anything that needs to be split out into its own separate library or microservice can be made so later on.
No one had to invent the monolith or define how far it should go; it was the default.
Microservices came about because companies kept falling into the same trap: because the code base was shared, and because organizational pressures always mean features > tech debt, there was constant pressure to write spaghetti code rather than to refactor and properly encapsulate. That doesn't mean it couldn't be done, it just meant it was always a matter of time before the business needs meant spaghetti.
Microservices, on the other hand, promises enforced separation, which sounds like a good idea once you've been bitten by the pressures of the former. You -can't- fall into spaghetti. What it fails to account for, of course, is the increased operational overhead of deploying all those services and keeping them playing nicely with each other. That's not to say there aren't some actual benefits to them, too (language agnostic, faults can sometimes be isolated), but the purported benefits tend to be exaggerated, especially compared to "a monolith if we just had proper business controls to prevent engineers from feeling like they had to write kluges to deliver features in time".
The individual code bases of a microservice might not involve spaghetti code, but the interconnections certainly can. I'm looking at a diagram of the service I work on, with seven components (written in three languages), five databases, 25 internal connections, two external interfaces, and three connections to outside databases, all cross-wired (via 6 connections) with a similar setup geographically elsewhere in case we need to cut over. And that's the simplified diagram, not showing the number of individual instances of each component running.
There is clear separation of concerns among all the components, but it's the interconnections between them that take a while to pick up on.
Fair; I should have been more explicit - your code can't fall into spaghetti (since the services are so small). Of course, you're just moving that complexity into the infrastructure, where, yeah, you still have the same complexity and the same pressures.
> Because there is no newly invented architecture called "modular monolith" - monolith was always supposed to be MODULAR from the start.
Isn't "non-modular monolith" just spaghetti code? The way I understand it, "modular monolith" is just "an executable using libraries". Or is it supposed to mean something different?
The way I see it, spaghetti code is actually a very wide spectrum, with amorphous goto-based code on one end, and otherwise well structured code that relies too heavily on global singletons on the other (much more palatable) end. While by definition spaghetti code is not modular, modularity entails more. I would define modularity as an explicit architectural decision to encapsulate some parts of the codebase such that the interfaces between them change an order of magnitude less frequently than what's within the modules.
In my world, the solution depends on the requirement. I can't take all the criticism of each as if they were in competition with each other. Also, multiple monolithic applications (can't stand that word as well) can be deployed to distribute resources and data, and to reduce dependencies.
Compute, storage, and other services have gotten to the point where they are effectively unlimited; they were originally designed for what was considered monolithic applications. Shared tenancy is not good in the age of multiple dependencies, nor in terms of the security needs of mission-critical applications and data.
Cloud host providers got too eager in seeking to create new lines of business and pushed micro-service architectures far too early to maturity, and now we're just beginning to see their faults, many of which can't be fixed without major changes that will likely make them pretty much useless, or alternatively just similar to monolithic architectures anyway.
Profit and monopolistic goals shouldn't drive this type of IT innovation; solving critical problems should. We shouldn't just throw away all that we've engineered over the past decade and reinvent the wheel... Heck, many liars are still running FINTECH on COBOL.
> * got too eager in seeking to create new lines of business and pushed * far too early to maturity, and now we're just beginning to see their faults, many of which can't be fixed without major changes that will likely make them pretty much useless
> Because there is no newly invented architecture called "modular monolith" - monolith was always supposed to be MODULAR from the start.
Unless you're in a large established org, modularity is likely a counter-goal. Most startups are looking to iterate quickly to find PMF. Unless you write absolutely terrible code, my experience is you're more likely to have the product cycle off a set of functionality before you truly need modularity. From there, you either (1) survive long enough to make an active decision to change how you work or (2) die, and architecture doesn't matter.
"Modular monolith" is a nice framing for teams who are at that transition point.
Agree. IMO modularity/coupling is the main issue. My issue w/ the microservice architecture is that it solves the modularity problem almost as a side effect of itself but introduces a whole host of new ones that people do not anticipate.
Yes, if you say at the outset that we will separate things into separate services, you will get separated services. However, you do NOT need to take on the extra complexity that comes with communication between services, remote dependency management, and additional infrastructure just to reduce coupling.
I explicitly claim to use a "citadel" architecture [0] when talking about breaking off services for very similar reasons. Having a single microservice split out from a monolith is a totally valid application of microservices, but I've had better results in conversation when I market it properly.
I've found this to go so far as to have "monolith" understood to mean "single instance of a system with state held in memory".
This is one of the things I love about Elixir and OTP+The Beam that underpin it all. It's really great that you can just slowly (as is sensible) move your system over to message passing across many VMs (and machines) before you need to move to a more bespoke service oriented architecture.
With Umbrella apps this can already be setup from the start and you can break off bits of your app into new distinct message-able components (micro services you could say) as you like while still being able to share code/modules as needed.
The other thing I'd say is you can introspect how these "services" are running in realtime with IEX and it feels like magic being able to run functions in your application like some sort of realtime interactive debugger. One of the biggest issues I had with micro-services is figuring out what the hell is going wrong with them (in dev or prod) without the layers of extra work you need to add in terms of granular logging, service mesh and introspection etc.
Creating a lot of actors and messaging for the business logic of your application is considered an anti-pattern in Elixir, and a typical novice mistake. Applications in Elixir are structured using functions that are in modules that call other functions in other modules. Yes you can use OTP applications to isolate dependencies but none of this is done with the intent to more easily break up your app into a bunch of micro-services.
Which is a distinct feature made for breaking up the logic of your applications into smaller, domain-bounded libraries. Umbrella apps are for the most part like regular libraries, just internal to the project, which lets you do neat things like share config and deployment across them.
They don't require interacting with OTP functionality unless you make them that way and I think the OP was crossing some wires there.
No, you do not even need OTP functionality on the child project, that's my point. Not everything uses OTP.
Edit: We may be talking past each other, from Sasa Juric:
"It’s worth noting that poison is a simple OTP app that doesn’t start any process. Such applications are called library applications." Which is what I'm thinking of. He also says "Finally, it’s worth mentioning that any project you build will mix will generate an OTP app."
I was mainly talking about how you don't have to use anything like GenServer or other OTP functionality with the split-out app, so it's more like the library application - but that is still in fact an OTP application, even if you're not using any of the downstream OTP features.
Each separate child project is still an OTP application, even if you do not use "OTP" features in them. OTP app is just the term for that artifact, similar to an assembly or archive in other languages - but it is not only terminology: each one will have `start` called when the VM starts, even if they don't spawn any servers.
I've argued for a long time that microservices should grow out of a monolith like an amoeba. The monolith grows until there is a clear section that can be carved off. Often the first section is security/auth, but from there it's going to be application specific. A modulith could be just another step in the carve up process.
But, there is no right answer here. Application domain, team size, team experience, etc... all matter and mean a solution for one team may not work for another and vice versa.
In my experience with enterprise software one of the things that cause most trouble is premature modularization (sibling to the famous premature optimization).
Just like I can't understand how people can come up with the right tests before the code in TDD, I can't understand how people can come up with the right microservices before they start developing the solution.
Perhaps a better way to think of "writing the tests first" in TDD approaches (or, more generally, test-first development, which is a term that has gone out of favor) is that you write test cases in that testing framework to express your intent and then from there can start to exercise them to ensure correctness. It's not a crime to change them later if you realize you have to tweak a return type. But writing them up front necessitates a greater depth of thinking than just jumping in to start hacking.
Not doing this runs the risk of testing what you did, not testing that it's right. You can get lucky, and do it right the first time. Or you can make a bunch of tests that test to make sure nothing changed, with no eye as to whether it's what anybody actually wanted in the first place.
Sitting down with somebody for whom TDD is natural is educational and makes it very hard to argue against it, particularly for library code.
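A minimal test-first sketch of that flow, assuming JUnit 5 (the Slug example itself is invented): the test states the intent before the implementation exists, and tweaking it later when a signature changes is fine.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;
import org.junit.jupiter.api.Test;

// Written first: it pins down what "right" means before any hacking starts.
class SlugTest {
    @Test
    void titlesBecomeUrlSafeSlugs() {
        assertEquals("modular-monoliths", Slug.of("Modular  Monoliths!"));
    }
}

// Written second, to make the already-agreed-on test pass.
class Slug {
    static String of(String title) {
        return title.toLowerCase()
                    .replaceAll("[^a-z0-9]+", "-")
                    .replaceAll("(^-|-$)", "");
    }
}
```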
The issue with most TDD or even BDD I have seen is that it is usually worthless...
You end up testing that numbers came out of a function or that some thing was called, but it doesn't actually solve the real issue, which is getting your use cases correct. Instead it encourages you to break the problem down into a bunch of unrelated bits; it doesn't actually check that your approach is correct or what the customer wants, just that you wrote some bits of code which do things - whether or not those things are the right things...
As it is often practiced it is usually a failsafe for people who struggle to write code at all.
Acceptance tests are too happy pathy, integration tests are rarely done or deemed 'unnecessary', so the only place left to 'think about the problem' becomes actually writing/designing the code. And so tests tend to come last, because you already decided what works, and they check nonsense.
For truly difficult code, such as a mathematical algorithm which is difficult to break down, unit tests make sense; the majority of "when X is true do A, when X is false do B" unit tests are utter garbage.
Well when you hire 40 engineers and 4-5 dev managers, each manager wants an application they can call their own, so they divvy up the potential product into what they feel are reasonable bits. Hence: micro services. It’s the same reason React was created: to more easily Conways-Law (used as a verb) the codebase
On the TDD argument, you should know what the return of a function f should be when given an argument b. Ideally, you have a set of arguments B to which b belongs, and a set of results C to which the return belongs. Your tests codify example mappings from B to C so that you can discover an f that produces the mapping. Take another step and generate random valid inputs and you can have property-based testing. Add a sufficient type system, and a lot of your properties can be covered by the type system itself, and input generators can be derived from the types automatically. So your tests can be very strong evidence, but not proof, that your function under test works as expected. There's always a risk that you modeled the input or output incorrectly, especially if you are not the intended user of the function. But that's why you need user validation prior to final deployment to production.
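To make the "generate random valid inputs" step concrete, here is a hand-rolled sketch with no property-testing library (the reverse() example and the chosen properties are mine): draw random elements of B and assert properties of f(b), not single hard-coded outputs.

```java
import java.util.Random;

public class ReverseProperties {
    static String reverse(String s) {
        return new StringBuilder(s).reverse().toString();
    }

    public static void main(String[] args) {
        Random random = new Random(1);
        for (int i = 0; i < 1_000; i++) {
            // generate a random valid input from B
            int len = random.nextInt(50);
            StringBuilder input = new StringBuilder();
            for (int j = 0; j < len; j++) {
                input.append((char) ('a' + random.nextInt(26)));
            }
            String s = input.toString();
            // properties that must hold for every element of B
            if (!reverse(reverse(s)).equals(s) || reverse(s).length() != s.length()) {
                throw new AssertionError("property violated for: " + s);
            }
        }
        System.out.println("1000 random inputs passed");
    }
}
```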
Likewise, with a microservice architecture, you have requirements that define a set of data C that must be available to an application that issues get/post/put/delete/events in a set B to your service over a transport protocol. You need to provide access to this data via the same transport protocol, transforming the input protocol into a specified output protocol.
You also have operational concerns, like logging that takes messages in a set C and stores them. And monitoring, and authorization, etc. These are present in every request/response cycle.
So, you now split the application into request/response services across the routing boundary -> 1 route = 1 backing model. That service can call other APIs as needed. And that's it. It's not hard. It's not even module-level split deployment - function-level deployment is what most serverless architectures recommend, because it offers the most isolation, while combining services makes deployment easier; that's mostly a case of splitting deployment across several deployment templates that are all alike and can be managed as sets by deployment technologies like CloudFormation and Terraform [1].
You can also think of boundaries like this: services in any SOA are just like software modules in any program - they should obey open/closed and have strong cohesion [2] to belong in a singular deployment service.
Then you measure and monitor. If two services always scale together and mutually call each other, it's likely that they are actually one module, and you won't hurt cohesion by deploying them as a single service to replace the two existing ones, easing ops overhead.
Not deploying and running as a monolith doesn't mean not putting the code into the same SCM/multiproject build as a monorepo, for easy shared cross-service message schema refactoring, dependency management, and versioning. That comes with its own set of problems - service projects within the repo that do not change or use new portions of the common message schema shouldn't redeploy with new shared artifact dependencies; basically everything should still deploy incrementally and independently, and scaling that is hard (see Alphabet/Google's or Twitter's monorepo management practices, for example). There seems to be an extra scale rank beyond Enterprise size where that applies; it's very unlikely you are in that category, and if you are you'll know it immediately.
We like to market microservice architecture as an engineering concern. But it's really about ops costs in the end. Lambdas for services that aren't constantly active tend to cost less than containers/vps/compute instances.
> from there it's going to be application specific
It is actually sensible to keep all the business logic layer(s) in the monolith while it is possible. Easier to grasp the domain for new team members, easier to bound context.
Have you seen this done well more than a few times? Honest question because it’s something I think everyone agrees with as being a good idea but it never gets actually done. It’s definitely not an industry practice that big monoliths eventually get split up in modules and quality increases. It’s something you have to fight for and actively pursue.
I’ve been on projects with very good developers, who were very bought into the idea of separating modules in the monolith, yet for one reason or another it never got done to an extent where you could see the quality improving. This leads me to believe this technique just isn’t practical for most teams, for reasons that are more psychological than technical.
It’s just something about how people approach a single codebase that leads to that; it’s harder to think about contracts and communication when everything is in the same language.
But if you force them to communicate over the wire to something in a different stack then all of a sudden the contracts become clear and boundaries well defined.
It is not a binary flip between monolith and modular monolith - it is a gradual scale, and I have seen teams move toward modularity with varying degrees of success. They may not even use the term "modular monolith" to name their approach. Sometimes they do it to keep the monolith maintainable, sometimes as the first steps towards microservices - defining boundaries and simplifying dependencies certainly help against the antipattern of the distributed monolith.
Ideally, the decision to build modular monolith should be made and implemented from the very start of the project. Some frameworks like Django help with keeping separation.
I found that fitness functions help with policies and contracts. You run them in your CI/CD pipeline and they raise an alarm when they detect a contract/access violation across the boundaries.
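For JVM codebases, one common way to write such a fitness function is with ArchUnit; a rough sketch (the package names are invented, and the exact fluent API may differ slightly between versions):

```java
import com.tngtech.archunit.core.domain.JavaClasses;
import com.tngtech.archunit.core.importer.ClassFileImporter;
import com.tngtech.archunit.lang.ArchRule;
import static com.tngtech.archunit.lang.syntax.ArchRuleDefinition.noClasses;

public class BoundaryFitnessTest {
    public static void main(String[] args) {
        JavaClasses classes = new ClassFileImporter().importPackages("com.example.shop");

        // Fails the build if the billing module reaches into reporting internals.
        ArchRule rule = noClasses().that().resideInAPackage("..billing..")
                .should().dependOnClassesThat().resideInAPackage("..reporting.internal..");

        rule.check(classes);   // run in CI; a violation throws with the offending classes
    }
}
```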
I've been quite happy with multi module projects over the years. I.e. where your modules are a directed graph of dependencies. Whatever modular structure you have right now is never perfect, but refactoring them can be straightforward, which does require active pursuit of the ideal, yes. You start such a project with an educated guess about the module structure, then the team goes heads down developing stuff, then you come up for air and refactor the modules. If you don't do that part then things get increasingly crufty. I think a best practice in these environments is to not worship DRY too much, and instead encourage people to do non-scalable things in the leaf libraries, then later refactor re-usable stuff and move it to the branch libraries.
It helps very much to be in a language and build environment that has first class modularity support. I.e. good build tooling around them and good IDE support. And at the language level, good privacy constraints at the package level, so the deeper libraries aren't exposing too much by accident.
What modules patterns have I seen work over the years? Generally, having business logic and data access apis lower in the tree, and branching out for different things that need to be deployed differently, either because they will deploy to different platforms (say, you're running business logic in a web server vs a backend job), or because they are deployed on different schedules by different teams (services). A nice thing about the architecture is that when you have a bunch of services, your decision as to what code runs on what services becomes more natural and flexible, e.g. you might want to move computationally expensive code to another service that runs on specialized hardware.
But you need to refactor and budget time to that refactoring. Which I think is true in any architecture--it's just often more do-able to do in one pass with the multi module approach.
In my experience, I've seen the modular code approach more often than the separate deployment approach - quite possibly because the latter is still a bit harder to do than just having one instance (or people are just lazy and don't want to risk breaking things that already work for future gains) - but sooner or later the question of scalability does come up, at least in successful projects.
Let me tell you, as someone who has delivered a critical code fix for business continuity after midnight a few times: slapping N instances of an app runtime in a data center somewhere is way easier than struggling with optimizations, introducing more complexity in the form of caches, or rewriting Hibernate queries as really long and hard-to-debug SQL because people previously didn't care enough about performance testing or simply didn't have a feasible way to simulate the loads the system could run into - all while knowing that if your monolith also contains scheduled processes, none of your optimizations will even matter, because those badly optimized processes will eat up all of the resources and crash the app anyway.
In short, the architecture that you choose will also help you mitigate certain risks. Which ones you should pay attention to, however, depends on the specifics of your system and any compliance requirements etc. Personally, as a developer, fault tolerance is up there among the things that impact the quality of my life the most, and it's pretty hard to do it well in a monolith.
In my eyes the problem with contracts is also worthy of discussion, though my view is a bit different - there will always be people who will mess things up, regardless of whether you expect them to use modules someone else wrote and contribute to a codebase while following some set of standards or expectations, or whether you expect them to use some web API in a sane manner. I've seen systems that refuse to acknowledge that they've been given a 404 for a request numerous times (in a business process where the data cannot reappear) and just keep making the same request ad infinitum, whenever the scheduled process on their side needs to run.
So, having a web API contract can make managing responsibility etc. easier, however if no one has their eye on the overall architecture and how things are supposed to fit together (and if you don't have instrumentation in place to actually tell you whether things do fit together in the way you expect), then you're in for a world of hurt.
To that end, when people need to work with distributed systems of any sort, I urge them to also consider introducing APM tools, such as Apache Skywalking: https://skywalking.apache.org/ (sub-par interface, but simple to set up, supports a decent variety of technologies and can be self-hosted on prem)
Or, you know, at least have log shipping in place, like Graylog: https://www.graylog.org/ (simpler to set up than the Elastic Stack, pretty okay as far as functionality goes, and can also be self-hosted on prem)
There is a more serious downside that you don’t mention: splitting things into modules takes time and involves making decisions you likely don’t know the answer to. When starting a new product, the most important thing is to get something up and running as quickly as possible so that people can try it and give you feedback. Based on the feedback you receive, you may realize that you need to build something quite different than what you have. I’ve seen plenty of successful products with shoddy engineering, and I’ve seen plenty of well engineered products fail. Success of a product is not correlated with how well it’s engineered. Speed is often the most important factor.
Selling effectively is the most important factor, not speed. Speed is second as a factor (and obviously very important). That's actually what you're describing when you say success of a product is not correlated to how well it's engineered. It's correlated to how well you can sell what you have to the audience/customers you need. That's why some start-ups can even get jumpstarted without having a functional product via pre-product sign-ups and sales. Getting to selling as reasonably quickly as you can, in other words.
Go when you have something to sell. That's what the MVP is about.
Which also isn't the same as me saying that speed doesn't matter - it matters less than how well you sell. It's better to sell at a 10/10 skill level, and have your speed be 8/10, than vice versa (and that will rarely not be the case). Those are bound-together qualities as it pertains to success, so if you sell at 10/10 and your speed is 1/10, you're at a high risk of failure. Give on speed before you give on selling and don't give too much on either.
> There is a more serious downside that you don’t mention: splitting things into modules takes time and involves making decisions you likely don’t know the answer to.
Partially agreed. Domain driven design can help with answering some of those questions, as can drilling down into what the actual requirements are, otherwise you're perhaps reaching for your code editor before even having an idea of what you're supposed to build.
As for the inevitable changing requirements, most of the refactoring tools nowadays are also pretty reasonable, so adding an interface, or getting rid of an interface, creating a new package, or even in-lining a bunch of code isn't too hard. You just need to know how to use your tools and set aside time for managing technical debt, which, if not done, will cause more problems down the road in other ways.
> Speed is often the most important factor.
If you're an entrepreneur or even just a business person who cares just about shipping the feature, sure. If you're an engineer who expects their system to work correctly and do so for the years to come, and, more importantly, remain easy to modify, scale and reason about, then no, speed is not the most important factor.
Some business frameworks like COBIT talk about the alignment between the tech and business, but in my experience their priorities will often be at odds. Thus, both sides will need to give up bits of what they're after and compromise.
If you lean too heavily in the pace-of-development direction, you'll write unmaintainable garbage, which may or may not be your problem if you dip and go work for another company, but it will definitely be someone else's problem. Thus, I think that software engineering could use a bit more actual engineering in it.
Not necessarily 50-page requirement docs that don't conform to reality and that no one cares about or reads, but actually occasionally slowing down and thinking about the codebases that they're working on. Right now, I've been working on one of the codebases in a project on and off for about 4 years - it's not even a system component, but rather just some business software that's important to the clients. In my personal experience, focusing just on speed wouldn't have been sustainable past the first year, since the codebase is now already hundreds of thousands of lines long.
For a contrast, consider how your OS would work if it were developed just while focusing on the speed of development.
> Speed [of delivery] is often the most important factor.
Depends on the org and the app type. If banks "moved fast and broke things" using millions of dollars, they'd be shut down or sued into oblivion. If it's merely showing dancing cats to teeny-boppers, sure, move fast and break things because there's nothing of worth being broken.
I agree with your position, I'm a big fan of the modular monolith approach. I took a look at your post. This is one thing that jumped out to me:
> Because the people who design programming languages have decided that implementing logic to deal with distributed systems at the language construct level... isn't worth it
I'm not sure if this is just a dead end or something really interesting. The only language I really know that [does this is Erlang](https://www.erlang.org/doc/reference_manual/distributed.html), though it's done at the VM / library level and not technically at the language level (meaning no special syntax for it). What goes into a language is tricky, because languages tend to hide many operational characteristics.
Threads are a good example of that, not many languages have a ton of syntax related to threads. Often it's just a library. Or, even if there is syntax, it's only related to a subset of threading functionality (i.e. Java's `synchronized`).
So there might not be much devotion of language to architectural concerns because that is changing so much over time. No one was talking about microservices in the 90s. Plus, the ideal case is a compiler that's smart enough to abstract that stuff from you.
Recent languages do have syntax related to -concurrency- though.
Languages pre multi-core probably provided a VM/library, as you say, for threading, and then said "generally you should minimize threads/concurrency" (which for mainstream languages were the same thing).
Languages since then have embraced concurrency at the syntax level, even if not embracing threading. Node has event listeners and callbacks (and added async/await as a syntactical nicety), go has goroutines (with special syntax to indicate it), etc.
It's interesting that while languages have sought to find ways to better express concurrency since it became necessary to really use the chips found in the underlying hardware, they largely haven't sought to provide ways to better express distribution, leaving that largely to the user (and which has necessitated the creation of abstracted orchestration layers like K8s). Erlang's fairly unique in having distribution being something that can be treated transparently at the language level.
Mind you, that also has to do with its actor based concurrency mechanism; the reason sending a message to a remote process can be treated the same as sending a message to a local process is because the guarantees are the same (i.e., "you may or may not get a response back; if you expect one you should probably still have a timeout"). Other languages that started with stronger local guarantees can't add transparent remote calls, because those remote calls would have additional failure cases you'd need to account for (i.e., Java RMI is supposed to feel like calling a function, but it feels completely different than calling a local function. Golang channels are synchronous and blocking rather than asynchronous and non-blocking, etc. In each case you have a bunch of new failure conditions to think about and address; in Erlang you design with those in mind from the beginning)
Patterns and automation supporting modularization hasn't received the attention that patterns and automation around services has over the past 10 years.
In practice, modularization raises uncomfortable questions about ownership which means many critical modules become somewhat abandoned and easily turn into Frankensteins. Can you really change the spec of that module without impacting the unknown use cases it supports? Tooling is not in a position to help you answer that question without high discipline across the team, and we all know what happens if we raise the question on Slack: crickets.
Because services offer clear ownership boundaries and effective tooling across SDLC, even though the overheads of maintenance are higher versus modules, the questions are easier and teams can move forward with their work with fewer stakeholders involved.
Modules have higher requirements for self-discipline than services. Precisely because the boundaries are so much easier to cross.
And also because it is harder to guard a module from changes made by other teams. Both technically and politically, a service is more likely to be owned by a single team who understands it. A module is more likely to be modified by many people from multiple teams who are just guessing what it does.
I think anyone who speaks about monolith today assumes you do use modules appropriately.
I mean for "starting with a monolith and splitting out when necessary" you kind need property structures code (i.e code which uses modules and isn't too tightly coupler between them).
Through in my experience you should avoid modularizing code with the thought of maybe splitting it into parts later on. That's kinda defeats the point of starting with a monolith to some degree. Instead modularize it with the thought of making it easy to refactor and use domain logic as main criterium for deciding what goes where when possible ( instead of technical details).
To be fair, modular monoliths weren't talked about much before microservices. I'm thinking about the heyday of Rails, where there were bitter debates over whether you should even move critical business logic out of Rails. I got the impression that most devs didn't even care. (That was sort of the point where I saw that I was a bit of an odd duck as far as developers go if I cared about this stuff, and pivoted my career.) I really enjoy contexts where I can make modular monoliths, however. I'm thinking of mobile/desktop apps here, mostly.
Software design is something of a lost art currently. This is partially due to the current zeitgeist believing that anything that automated tools cannot create/enforce can't be that important in the first place. Of course, there's a whole swath of concerns that cannot be addressed with static or dynamic analysis, or even different languages.
Microservices is a response to the fact that a monolith without any real design moves quickly but can easily converge on spaghetti due to the fact that everything is within reach. It enables teams to create an island of code that talks to other islands only via narrowly defined channels. Additionally, they support larger dev teams naturally due to their isolationist take, and reinforce the belief that "more developers = always better" in the process. In other words, they mesh perfectly with the business context that a lot of software is being built in.
Factor in the fact that $COOL_COMPANY uses them, and you have a winner.
Modular design is decades old though. Rails devs argue about it because it's against the rails way, and that makes things harder for them. But I've been modularizing my Java webapps since I graduated from college, 18 years ago.
But also, microservices weren't pioneered by rails devs. They were pioneered by huge companies, and they definitely have a role to play there, as you point out.
What I think nobody talks about is: (1) the legitimate reason for breaking up services into multiple address spaces, and (2) that using different versions of the runtime, different build systems, and different tools for logging, ORM, etc. in different microservices is slavery, not freedom.
(1) is that some parts of a system have radically different performance requirements than other parts of the system. For instance 98% of a web backend might be perfectly fine written in Ruby or PHP but 2% of it really wants everything in RAM with packed data structures and is better off done in Java, Go or Rust.
(2) The run of the mill engineering manager seems to get absolutely ecstatic when they find microservices means they can run JDK 7 in one VM, run JDK 8 in another VM, run JDK 13 in another VM. Even more so when they realize they are 'free' to use a different build system in different areas of the code, when they are 'free' to use Log4J in one place, use Slf4J someplace etc, use Guava 13 here, Guava 17 there, etc.
The rank and file person who has to actually do the work is going to be driven batty by all the important-but-not-fashionable things being different each and every time they do some 'simple' task such as compiling the software and deploying it.
If you standardize all of the little things across a set of microservices you probably get better development velocity than with a monolith because developers can build (e.g. "make", "mvn install") smaller services more quickly.
If on the other hand the devs need to learn a new way to do everything for each microservice, they are going to pay back everything they gained and then some with having to figure out different practices used in different areas.
(Throw Docker into the mix, where you might need to wrangle 2 GB of files to deploy 2 KB worth of changes in development, 100 times over, to fix a ticket, and you can really wreck your productivity; yet people rarely account for "where does the time go" when they are building and rebuilding their software over and over and over again.)
In my humble opinion microservices are "hot" because in theory you can scale a lot with them if you are able to do cloud provisioning.
Microservices needs DevOps+Orchestration Service.
A good example of microservices architecture is how K8s itself is designed. I think it is overkill for most average needs, so think twice before going down the microservice rabbit hole.
Microservices solve pretty much one problem: you have a larger organization (> 10 devs, certainly > 100) and as a result the coordination overhead between those devs and their respective managers and stakeholders is significantly limiting overall forward progress. This will manifest in various concrete ways such as "microservices allow independent component release and deployment cycles" and "microservices allow fine grain scaling", and "microservices allow components written in different languages", but really it's all Conway.
This is a pretty critical point. The drum I tend to beat is that the positive read of microservices is that they make your code reflect your org chart.
(If they don't do this, and that usually resolves into developers each owning a bunch of microservices that are related concepts but distinct and are context-switching all the live-long day, you've created the reverse of a big ball of mud: you've created a tar pit. Companies used to brag to me when they had 250 microservices with 100 developers, and I don't think any of those companies are going concerns.)
i always think of microservices as a product an organization offers to itself. If you don't have the team, including managers and even marketing, to run an internal product, then you probably shouldn't be doing the microservice thing.
Well, in my eyes equating microservices with Kubernetes is a problem in and of itself. I actually wrote about Docker Swarm as a simpler and more usable alternative to it (for smaller/simpler deployments), though some other folks also enjoy HashiCorp Nomad, which is also nice (another article on my blog; i won't link it here so as not to be spammy myself).
If you evaluate your circumstances and find that microservices could be good for you, then there are certainly options to do them more easily. In my eyes some of the ideas that have popped up, like 12 Factor Apps https://12factor.net/ can be immensely useful for both microservices and even monoliths.
So i guess it's all very situational and a lot of the effort is finding out what's suitable for your particular circumstances. For example, i made the page over at https://apturicovid.lv/#en : when the app was released and the page was getting hundreds of thousands of views due to all of the news coverage, scaling out to something like 8 instances was a really simple and adequate fix to not break under the load.
We've been following this modular monolith approach as well (for 3 years now), bounded contexts and all, but our architectural end goal is still "mostly microservices". Maybe it's specific to PHP and the frameworks we use (what the modulith is written in), but the startup time of the monolith is just unacceptable for our processing rates: for every request the framework has to be reinitialized again, with all the 9000 services wired up via DI and what not. Microservices tend to be much more lightweight (you don't even need DI frameworks). The monolith also encourages synchronous calls (calling a bounded context's API in memory is so simple), which has been detrimental to our performance and stability, whereas microservices, at least in our architecture, encourage event-based communication, which is more scalable and allows clean retries on failure. But again, your mileage may vary; maybe it's specific to our tools.
In my experience, long build times mostly come from not caring about build times. Not from large codebases. A surprising number of developers don't think build times (including running all the tests) are important. They'll just go "ok, so now it takes N minutes to build" rather than thinking "hold on, this shouldn't take more than a few seconds - what's going on here?"
Tried it twice, never pulled its weight. It introduces abstraction layers everywhere, as a premature optimisation, even though they might never be needed, and it's a bad fit for more verbose, statically typed languages due to all the ceremony that's required. Has anyone had similar experiences?
I've mixed feelings about it. As advertised it can be somewhat useful for business rules-rich apps, yes, but most of the time the application side of the project will just grow exponentially faster than your core - especially if you don't keep an eye on commits, as some of your coworkers may choose the easy way out and push business logic to the application side instead of refactoring... It's a clean architecture from a business logic standpoint, indeed, but it doesn't do a lot to keep the outside of the hexagon well organized (besides the port/adapter pattern), and that side of the app can grow into a mess really fast if your team lacks experience.
Not as many abstraction layers as in a classic J2EE app, though. It's not that bad.
I use it all the time. It is my default architecture choice. It works great in my opinion. I wonder why you think there is a lot of ceremony needed? The core idea doesn’t require much. The Hexagonal Architecture idea can be implemented in many ways. I recommend Googling “Elm Architecture” to see an example. It isn’t described as being a Hexagonal Architecture but I will argue that it is one of the best ways to implement it I have seen. It has zero ceremony (a few functions define the whole architecture).
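For what it's worth, here is a minimal sketch of that Elm-style model/update/view shape, transliterated into Go since the thread doesn't pin a language (the Model/Msg names are just illustrative). The whole "architecture" really is three pieces plus a loop:

```go
package main

import "fmt"

// Model is the entire application state.
type Model struct {
	Count int
}

// Msg is anything that can happen to the application.
type Msg interface{ isMsg() }

type Increment struct{}
type Decrement struct{}

func (Increment) isMsg() {}
func (Decrement) isMsg() {}

// update is the only place state changes: old model + message -> new model.
func update(m Model, msg Msg) Model {
	switch msg.(type) {
	case Increment:
		m.Count++
	case Decrement:
		m.Count--
	}
	return m
}

// view renders the model; it never mutates it.
func view(m Model) string {
	return fmt.Sprintf("count = %d", m.Count)
}

func main() {
	model := Model{}
	// In a real app these messages would come from user input, HTTP handlers, timers, ...
	for _, msg := range []Msg{Increment{}, Increment{}, Decrement{}} {
		model = update(model, msg)
		fmt.Println(view(model))
	}
}
```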
I agree. Started with a simple React for frontend and NestJS for backend. Now I am running microservices for distributing third-party widgets, my search engine, and the analytics.
Works well and it actually simplifies things a lot, each service has its repository, pipeline and permissions, developers don't need to understand the whole application to code.
You also don't have to start with Kubernetes to make microservices work, many tools can act as in-betweens. I am using app engine from gcloud, yes it's a lot of abstraction over kubernetes and it is overpriced, but I don't care. It works perfectly for these use cases and even if overpriced, it stays a low absolute value.
The caveat is that you really need to start off with a "stateless mindset".
I even wonder why the word "monolith" got such a bad connotation that it is now used synonymously with "big ball of mud".
I mean, monoliths in the original sense (the Washington Monument, that 2001: A Space Odyssey moon thing, ...) are anything but messy.
It's usually not obvious where to put the seams ahead of time so you can cut them when you need to split into microservices.
Plus keeping the API boundaries clean costs time and resources, and it's tempting to violate them just to launch this one feature. This extra discipline doesn't have any payoff in the short term, and it has unknown payoff in the long term because you're not sure you drew the boundaries ahead of time anyway.
So I think in practice what happens is you create a monolith and just eat the cost of untangling it when the team gets too big or whatever.
> I wonder why no one ever talks about architectures in the middle between those two
Because that's the default. It doesn't make intuitive sense to integrate every separate service you develop internally into one huge bulky thing, nor to split up every little feature into even more small services that need a ton of management mechanisms and glue to even be functional. Only after advocates for both extremes sprung up does it make sense to invent, as you did, a new word for the thing in the middle. It's just the sensible thing to do in most situations.
Agree. But start with modules immediately. SOLID applies to modules as well as classes. [And if we're talking about Rust, then substitute "crate" wherever I write "module"]. Basically be constantly refactoring your code into classes (or equiv.) and the classes into modules.
It's super easy to join multiple modules into a single application: the linker, JVM, whatever does it automatically. It's insanely hard to break a monolithic application into modules.
The problem with modular monoliths is the relative difficulty of managing dependencies in the code. It’s easy to hack the modular interfaces or expose internals, which can make it hard to maintain and painful to pull out into another service down the road.
“But I would never hack an interface like that!” you say. Oh, you sweet summer child.
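One partial mitigation, if the codebase happens to be Go (purely an illustration; the thread isn't Go-specific and the layout below is hypothetical): the toolchain refuses imports of internal/ packages from outside their subtree, so a module can expose a narrow facade while the compiler, rather than discipline, guards everything behind it.

```go
// Hypothetical layout for a "billing" module inside a modular monolith:
//
//   billing/
//     billing.go            // the public facade other modules may import
//     internal/
//       ledger/ledger.go    // implementation detail; the compiler rejects
//                           // imports of this from outside billing/
//
// billing/billing.go:
package billing

import "example.com/app/billing/internal/ledger"

// Charge is the narrow, intentional surface of the module.
// Other teams can call this, but cannot reach into internal/ledger directly.
func Charge(customerID string, cents int64) error {
	return ledger.Append(customerID, cents)
}
```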
IME it's the drive for microservices that encourages the modular monolith. That is to say, the monolith is typically loosely modular (but still with many rule breakages) until the push for microservices starts; _then_ the big refactor to modularity begins.
Splitting code into modules has the same downsides as splitting it into microservices. You can still end up making the wrong splits and needing to back track on things you once thought were modular but no longer are.
The logistics of microservices are rarely the hard part. It's the long term maintenance. Everyone who's ever maintained a "core" library knows the same pain, at some point you just end up making sacrifices just to get things to work.
> Splitting code into modules has the same downsides as splitting it into microservices.
Not to be pedantic, but it has some of the same downsides. Microservices have other major downsides in that they bring in all the fallacies of network computing. Even if you manage to stabilize these in the end they just waste so much time in development and debugging.
There is a lot of talk about monoliths vs microservices lately. I just want to throw into the ring that you can do both at the same time, easily. And no one is going to kill you for it either.
Maybe we are getting caught up in semantics because it's Christmas, but "monorepo/monolith/microservices/etc" is -just- the way you organize your code.
Been developing a monolith for years, but now you have written a 15-line golang http api that converts pdfs to stardust and put it on a dedicated server in your office? Welp, that's a microservice.
Did you write a 150-repo application that cannot be deployed separately anyway? Welp, that's a monolith.
You can also build a microservice ecosystem without Kubernetes on your local network. We have done it for years with virtual machines. Software-defined networking just makes things more elegant.
So don't stop using microservices because they're "hard" or start writing monoliths because they're "easy", because neither is true in the long run.
What is true is that you have a group of people trying to code for a common goal. The way you reach that goal together defines how you organize your code.
> Been developing a monolith for years, but now you have written a 15-line golang http api that converts pdfs to stardust and put it on a dedicated server in your office? Welp, that's a microservice.
But the 15 lines of Golang are not just 15 lines of Golang in production. You need:
- auth? Who can talk to your service? Perhaps ip whitelisting?
- monitoring? How do you know if your service is up and running? If it's down, you need alerts as well. What if there is a memory problem (because the code is not optimal)?
- how do you deploy the service? Plain ansible or perhaps k8s? Just scp? Depending on your solution, how do you implement rollbacks?
- what about security regarding outdated packages the Go app is using? You need to monitor it as well.
And so on. The moment you need to store data that somehow needs to be in sync with the monolith's data, everything gets more complicated.
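To make the gap concrete, here is a hedged Go sketch (port, paths and names invented) of what even the "trivial" converter tends to grow once health checks, timeouts, and graceful shutdown are bolted on - and this still leaves auth, metrics, and deployment to be handled elsewhere:

```go
package main

import (
	"context"
	"log"
	"net/http"
	"os"
	"os/signal"
	"time"
)

func main() {
	mux := http.NewServeMux()

	// The actual "15 lines": accept a PDF, return stardust (stubbed here).
	mux.HandleFunc("/convert", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("stardust"))
	})

	// Liveness endpoint so monitoring has something to poll.
	mux.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})

	srv := &http.Server{
		Addr:         ":8080",
		Handler:      mux,
		ReadTimeout:  10 * time.Second, // don't let slow clients pin connections
		WriteTimeout: 30 * time.Second,
	}

	go func() {
		if err := srv.ListenAndServe(); err != nil && err != http.ErrServerClosed {
			log.Fatal(err)
		}
	}()

	// Graceful shutdown so deploys don't drop in-flight conversions.
	stop := make(chan os.Signal, 1)
	signal.Notify(stop, os.Interrupt)
	<-stop

	ctx, cancel := context.WithTimeout(context.Background(), 15*time.Second)
	defer cancel()
	log.Println("shutting down:", srv.Shutdown(ctx))
}
```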
Many of the points you're mentioning are exactly why k8s was developed. Yes, it makes deploying simple applications unnecessarily hard, but it makes deploying more complicated applications WAY more manageable.
So in the k8s world:
- auth: service meshes, network policies, ...
- monitoring: tons of tooling there to streamline that
- deploy: this at scale is trickier than you'd think; many seem to assume k8s on its own here is the magic dust they need. But GitOps with ArgoCD + helm has worked pretty well at scale in my experience.
- Security is a CI problem, and you have that with every single language, not just Go. See Log4j.
Kubernetes is my bread & butter, but I do realise this has way too much overhead for small applications. However, once you reach a certain scale, it solves many of the really, really hard problems by streamlining how you look at applications from an infrastructure and deployment side of things. But yes - you need dedicated people who understand k8s and know what the hell they're doing - and that's, in my experience, a challenge on its own.
Let's also dispel the myth that k8s is only suitable for microservices. I have clients that are running completely separate monolith applications on k8s, but enough of them that managing them 'the old way' became very challenging, and moving these to k8s in the end simplified things. But getting there was a very painful process.
1. auth? probably an internal service, so don't expose it to the outside network.
2. monitoring? if the service is being used anywhere at all, the client will throw some sort of exception if it's unreachable.
memory problem? it should take <1 day to ensure the code for such a small service does not leak memory. if it does have memory leaks anyway, just basic cpu/mem usage monitoring on your hosts will expose it. then ssh in, run `top`, and voila, now you know which service is responsible.
3. deployment? if it's a go service, literally a bash script to scp over the binary and an upstart daemon to monitor/restart the binary.
rollback? ok, check out the previous version on git, recompile, redeploy. maybe the whole process is wrapped in a bash script or assisted by a CI/CD build job.
4. security? well ok, PDFs can be vulnerable to parser attacks. so lock down the permissions and network rules on the service.
Overall this setup would work perfectly fine in a small/medium company and take 5-10x less time than doing everything the FAANG way. i don't think we should jump to calling these best practices without understanding the context in which the service lives.
I agree more or less with 1 and 4, mostly. But for monitoring, either you would have to monitor the service calling this microservice or you need a way to detect errors.
> if it does have memory leaks anyways, just basic cpu/mem usage monitoring on your hosts
Who keeps on monitoring like this? How frequently would you do it? In a startup there are somewhere in the range of 5 microservices of that scale per programmer, and daily monitoring of each service by running top is not feasible.
> 3. deployment? if its a go service, literally a bash script to scp over the binary and an upstart daemon to monitor/restart the binary.
Your solution is literally more complex than a simple Jenkins or Ansible script for the build followed by kubectl rollout restart, yet it is a lot more fragile. Anyway, the point stands that you need to have a deployment story.
My larger point is basically just against dogma and “best practices”. Every decision has tradeoffs and is highly dependent on the larger organizational context.
For example, kubectl rollout assumes that your service is already packaged as a container, you are already running a k8s cluster and the team knows how to use it. In that context, maybe your method is a lot better. But in another context where k8s is not adopted and the ops team is skilled at linux admin but not at k8s, my way might be better. There’s no one true way and there never will be. Technical decisions cannot be made in a vacuum.
> Overall this setup would work perfectly fine in a small/medium company and take 5-10x less time than doing everything the FAANG way.
The point was never comparing it to the FAANG way. The point is: it's easier (at the beginning) to maintain ONE monolith (and all the production stuff related to it) than N microservices.
It's easy to be snarky and dismissive about these things, but i've stood in the offices of a governmental org and watched people queue up, unable to receive their healthcare services, because some similarly neglected system refused to work. One that also had a lot of the basics left out.
Figuring out what was wrong with it was hard, because the logging was inconsistent, all over the place in where it logged and also insufficient.
The deployments and environments were inconsistent, the sysadmins on clients's side manually changed stuff in the .war archives, all the way up to library versions, which was horrendous from a reproducibility perspective.
The project was also severely out of date, not only package wise, but also because the new versions that had been developed actually weren't in prod.
After ripping out the system's guts and replacing them with something that worked, i became way more careful when running into amateur-hour projects like that, and about managing their risks and eventual breakdown. I suggest that you do the same.
Don't put yourself at risk, especially if some sort of a liability about the state of the system could land on you. What are you going to do when the system gets breached and a whole bunch of personal data gets leaked?
> ...then don't spend hundreds of thousands on infrastructure.
I find it curious how we went from doing the basics of software development that would minimize risks and be helpful to almost any project out there to this.
To clarify, i agree with the point that you'll need to prioritize different components based on what matters the most, but i don't think that you can't have a common standard set of tools and practices for all of them. Let me address all of the points with examples.
> Auth: nobody knows IP address of our server anyway, don't bother with that. And for extra security we have secret port number.
Port scanning means that none of your ports are secret.
JWT is trivial to implement in most languages. Even basic auth is better than nothing with HTTPS, you don't always need mTLS or the more complicated solutions, but you need something.
This should take a few days to a week to implement. Edit: probably a day or less if you have an easily reusable library for this.
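As a rough illustration of how little "something" can be, a Go sketch of basic auth in front of an internal API (credentials would come from config or a secret store in practice; constant-time comparison avoids leaking them through timing):

```go
package main

import (
	"crypto/subtle"
	"net/http"
)

// basicAuth wraps a handler and rejects requests without the expected credentials.
// In a real setup the expected values would come from config or a secret store.
func basicAuth(next http.Handler, wantUser, wantPass string) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		user, pass, ok := r.BasicAuth()
		if !ok ||
			subtle.ConstantTimeCompare([]byte(user), []byte(wantUser)) != 1 ||
			subtle.ConstantTimeCompare([]byte(pass), []byte(wantPass)) != 1 {
			w.Header().Set("WWW-Authenticate", `Basic realm="internal"`)
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	api := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("ok"))
	})
	http.ListenAndServe(":8080", basicAuth(api, "svc-user", "svc-pass"))
}
```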
> Monitoring? Well, we have our clients for that. They'll call us if something happens.
This is not viable if you have SLAs or just enjoy sleeping and not getting paged.
There are free monitoring solutions out there, such as Zabbix, Nagios, Prometheus & Grafana and others. Preconfigured OS templates also mean that you just need the monitoring appliance and an agent on the node you want to monitor in most cases.
This should take close to a week to implement. Edit: probably an hour or less if you already have a server up and just need to add a node.
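With Prometheus, for example, the per-service side really is just an endpoint; a hedged Go sketch using the official client library (the metric name is invented), after which the monitoring server only needs a scrape target added:

```go
package main

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// A counter the service increments; Prometheus scrapes it via /metrics.
var conversions = prometheus.NewCounter(prometheus.CounterOpts{
	Name: "pdf_conversions_total",
	Help: "Total number of PDF conversions handled.",
})

func main() {
	prometheus.MustRegister(conversions)

	http.HandleFunc("/convert", func(w http.ResponseWriter, r *http.Request) {
		conversions.Inc()
		w.Write([]byte("stardust"))
	})

	// Go runtime metrics (GC, goroutines, memory) come along with the default handler.
	http.Handle("/metrics", promhttp.Handler())
	http.ListenAndServe(":8080", nil)
}
```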
This is an error-prone way to do things, as the experience of Knight Capital showed: https://dougseven.com/2014/04/17/knightmare-a-devops-caution.... In addition, manual configuration changes lead to configuration drift, and after a few years you'll have little idea about who changed what and when.
In contrast, setting up Ansible and versioning your config, as well as using containers for the actual software releases alongside fully automated CI cycles addresses all of those problems. In regards to rollbacks, if you have automated DB migrations, you might have to spend some time writing reverse migrations for all of the DDL changes.
This should take between one to two weeks to implement. Edit: probably a day or less once per project with small fixes here and there.
> Outdated packages? We call those stable packages.
Log4j might be stable, but it also leads to RCEs. This is not a good argument, at least for as long as the software packages that we use are beyond our ability to control or comprehend.
This should be a regular automated process that alerts you about outdated and/or insecure packages; at the very least use npm audit or something, or proactive scanning like OpenVAS. This should take close to a week to implement.
All of the mentioned software can easily run on a single node with 2 CPU cores and about 8 GB of RAM. I know this, because i did all of the above in a project of mine (mentioned technologies might have been changed, though). Of course, doing that over N projects will probably increase the total time, especially if they have been badly written.
In my eyes that's a worthwhile investment, since when you finally have that across all of your apps, you can develop with better certainty that you won't have to stay late and deliver 2 or 3 hotfixes after your release, which goes hand in hand with test coverage and automated testing.
That is hardly hundreds of thousands of dollars/euros/whatever on infrastructure. And if the personnel costs are like that, i'd like to work over there, then.
There are plenty of solutions to authentication. But really, don't implement a user system if it is not needed. There are plenty of other ways to secure on applications, which are way out of scope for this discussion.
The main point is that one should never spend "a few days to a week" implementing a feature that at best is useless and at worst is detrimental to the service being stood up.
Implement auth, if it is needed, implement monitoring, CI, CD, dependency monitoring, testing, everything, if it is needed.
But don't implement it as dogmatic consequences of doing software development.
And regarding the spend: one week's worth of work could be USD 8k. So just the initial implementation of your JWT-based authentication system is 4% of the way into the "hundreds of thousands of dollars". Then you need to factor in the extra maintenance complexity, and before you know it we are not talking about hundreds of thousands of dollars but millions...
I feel like "spaghetti garbage now, we'll fix it later" is a big part of why startups fail to execute well. Yeah you saved $8000 by launching your thing in a completely unmaintainable way, but now it's both harder to maintain and more likely to need it. Literally the first time it breaks you will probably lose that cost advantage just because it takes so long to debug.
The point you should have made is that dogmatic approaches usually produce a lot of waste, but the example you gave is exactly why teams end up that way. Otherwise people come up with bullshit hacks like you describe and the entire team pays for it.
> But really, don't implement a user system if it is not needed.
Sure, i'm not necessarily advocating for a full blown RBAC implementation or something like that, merely something so that when your API is accidentally exposed to the rest of the world, it's not used for no good (at least immediately).
> Implement auth, if it is needed, implement monitoring, CI, CD, dependency monitoring, testing, everything, if it is needed.
> But don't implement it as dogmatic consequences of doing software development.
Now this is a bit harder to talk about, since the views here will probably be polarized. I'd argue that if you're developing software for someone else, software that will be paid for (and in many cases even in pro-bono development), most of that is needed, unless you just don't care about the risks that you (or someone else) might have to deal with otherwise.
If there's an API, i want it to at the very least have basicauth in front of it, because of the reasons mentioned above.
If there is a server running somewhere, i want to be alerted when something goes wrong with it, see its current and historic resource usage and get all of the other benefits running a few apt/yum commands and editing a config file would get me, as opposed to discovering that some bottleneck in the system is slowing down everything else because the memory utilization is routinely hitting the limits because someone left a bad JVM GC config in there somewhere.
If something's being built, i want it to be done by a server in a reasonably reproducible and automated manner, with the tasks described in code that's versioned, so i'm not stuck in some hellscape where i'm told: "Okay, person X built this app around 2017 on their laptop, so you should be able to do that too. What do you mean, some-random-lib.jar is not in the classpath? I don't know, just get this working, okay? Instructions? why would you need those?"
Furthermore, if there is code, i want to be sure that it will work after i change it, rather than introducing a new feature and seeing years of legacy cruft crumble before my eyes, and to take blame for all of it. Manual testing will never be sufficient and integration tests aren't exactly easy in many circumstances, such as when the app doesn't even have an API but just a server side rendered web interface, which would mean that you need something like Selenium for the tests, the technical complexity of which would just make them even more half baked than unit tests would be.
Plus, if i ever stumble upon a codebase that lacks decent comments or even design docs/issue management, i will want to know what i'm looking at, and just reading the code will never be enough to understand the context behind everything. But if there are at least tests in place, then things will be slightly less miserable.
I'm tired of hating the work that i have to do because of the neglect of others, so i want to do better. If not for those who will come after me, then at least for myself in a year or so.
Do i do all of that for every single personal project of mine? Not necessarily, i cherrypick whatever i feel is appropriate (e.g. server monitoring and web monitoring for everything, tests for things with "business logic", CI/CD for everything that runs on a server not a local script etc.), but the beauty is that once you have at least the basics going in one of your projects, it's pretty easy to carry them over to others, oftentimes even to different stacks.
Of course, one can also talk about enterprise projects vs startups, not just personal projects, but a lot of it all depends on the environment you're in.
As for the money, i think that's pretty nice that you're paid decently over there! Here in Latvia i got about 1700 euros last month (net). So that's about 425 euros a week, or more like 850 if you take taxes and other expenses for the employer into account. That's a far cry from 100k. So that is also situational.
Offtopic, but if you really are paid 425 euros per week then you are seriously underpaid, even by Eastern European standards. There are (relatively rare, but still) jobs that pay that much per day.
Yep, i've heard that before. Currently thinking of finishing the modernization of the projects that i'm doing at my current place of work and then maybe looking at other opportunities.
However, i just had a look at https://www.algas.lv/en, one site that aggregates the local salary information, based on a variety of stats. As someone with a Master's degree and ~5 years of experience and whose job description states that i'm supposed to primarily work in Java (even though i do full stack and DevOps now), i input those stats and looked at what information they have so far.
The average net salary monthly figures, at least according to the site, are:
So there's definitely a bit of variance, but it's still in the same ballpark. Of course, there are better companies out there, but the reality for many is that they're not rewarded for their work with all that much money.
USD 8k/week is not the pay an employee will receive, but the cost of operating that one employee, and it is a ballpark number -- I work in Europe also; I am a Danish citizen.
Again, I advocate developing in a timely manner and not over-engineering (nor under-engineering).
> USD 8k/w is not the pay a employee will receive, but the cost of the operation of that one employee
I made the above response with that in mind.
> And regarding the spend: one week worth of work could be USD 8k.
The original claim was that one week's worth could be 8000 USD, or let's say roughly 7094 EUR. That comes out to 28376 EUR per month.
Last month i made around 1700 EUR, so it's possible to calculate approximately how much my work cost to my employer. Let's do that with a calculator here: https://kalkulatori.lv/en/algas-kalkulators
After inputting the data that's relevant to me, i got the following:
- Gross salary: 2654.51 EUR
- Social tax: 278.72 EUR
- Personal income tax from income till 1667 EUR: 277.66 EUR
- Personal income tax (987.51 EUR), from part ...: 227.13 EUR
- Social tax, employer's part: 626.20 EUR
- Business risk fee: 0.36 EUR
- Total employer's expenses: 3281.07 EUR
It should be apparent that 28376 EUR is a far cry from 3281 EUR, which is how much my work cost to my employer.
Thus, per week, 7094 EUR is also a far cry from 820 EUR, which is how much my work cost to my employer.
Also, 820 is actually pretty close to my initial guess of 850 EUR.
Of course, it's possible to argue that either i'm underpaid individually, or that many of my countrymen in Latvia are underpaid in general (on which i elaborated in an adjacent comment https://news.ycombinator.com/item?id=29595158), but then the question becomes... so what?
Does that mean that if you're in a well paid country like US, then you cannot afford proper development practices due to all of the payroll expenses that would cause? While that may well be, to me that sounds weird and plain backwards - if that were really true, then US would outsource even more to countries like mine and these outsourced systems would work amazingly well, since you can supposedly afford a team of developers here for what would buy you a single developer over there. And yet, most systems are still insecure, buggy and slow.
Maybe someone else is pocketing a lot of the money they receive in these countries, and is simply charging high consulting rates? The prevalence of WITCH companies here is telling, but that's a discussion for another time.
I really can’t tell how serious you are. I’m too aware that what you describe often is exactly how it works in practice. It’s just that very few admit it in public. :)
Yup. This is why I think that microservices require a stronger operational platform, but then it enables new and more effective ways of developing new functionality.
Our internal software platform is getting to a point where it can answer most of these things - auth via the central OIDC providers, basic monitoring via annotations of the job's services, deployments via the orchestration and some infrastructure around it, including optional checks, automated rollbacks, and automated vulnerability scanning on build servers and for the running systems. It wouldn't be 15 lines of go; more like 15 lines, plus about 100-200 lines of terraform and/or yaml to get everything configured, and a ticket to register the service in the platform. It's pretty nice and our solution consultants like it very much.
The thing is - this took a team about a year to build and it'll take another half a year to get everything we currently want to do right. And it takes a non-trivial time to maintain and support all of this. This kind of infrastructure only makes business sense, because we have enough developers and consultants moving a lot faster with this.
Back when we were a lot smaller, it made a lot more sense to just push a single java monolith on VMs with chef or ansible, because that was a lot easier and quicker to get working correctly (for one thing).
Well, it can quickly become a mess because of people having different ideas about what a microservice is, and also decrying things as "for microservices only" when, for example, I just want to offload auth and monitoring to a specialized service.
It's also a common trope when I'm dealing with k8s decriers - yes, you might have one application that you can easily deploy, but suddenly there are 15 other medium-weight applications that solve different important problems and you want them all ;)
P.S. Recently a common thing in my own architectures is separate keycloak deployment that all services either know how to use, or have it handled at separate request router (service mesh or ingress or loadbalancer)
The task of the microservice is to convert the pdf to stardust and return it to its sender, so no auth.
Furthermore, it's most likely only reachable through the local network, or at least it should be if you don't want some stranger to also be able to make stardust from pdfs.
Monitoring: are you trying to say that it's a lot easier to pick up one logfile than, let's say, 15? They should be aggregated somewhere anyway, no?
Deployment: depending on anything you listed, how do i do anything? Of course i have to define it, but if you want a fancy example: k8s + ArgoCD canary deployments, done. I literally set it up once.
Security? Really?
Please don't get this wrong, but this feels to me like whataboutism, but well, here i go:
i implement security just the same way as i would in the monorepo. The thing/person/entity just has to look into more repositories ;)
It comes down to one sentence, i think:
State is not shared, state is communicated.
A microservice runs as some (somewhat) privileged user, so you may want some auth. Can everyone internally create sales tickets? Or can everyone just query them? If a team provides a library to run, and you run it, you still only run as whatever user you have access to.
Monitoring: it's easier to look at a stack trace, including some other team's external library, than an HTTP 500 error code.
Deployment is certainly easier when you're just shipping code and a build. You don't have to faff around with the previous instance running, maybe having some active connections/transactions/whatever, needing to launch a new one. Maybe it's not hard overall, but less fun.
For monitoring I’d also say the alerting side of things can be done via the older monolith. That is, catch exceptions and log/re-raise them as “PdfServiceException”.
Maybe it is just me, but I always understood that properly designed microservices have their own specific datastore, which is not shared with other microservices even if these all collaborate to the same process.
If this is actually (still) true, that means that "the way you organize your code" is a bit simplistic. Your example of an "http api that converts pdfs to ..." is surely a valid example of a microservice, but most business products have to handle much more "state" than those, and this will create further complications which go far beyond "how to organize your code" (and make monoliths more appealing).
Well, I don't believe that i need a separate datastore, but yes, i need to communicate my state much more.
The PDF example, for instance:
I have to provide all assets that will be converted to stardust, or at least links to them
I have to define where the finished stardust should be sent to (e.g. an http post)
But its function is nonetheless independent. The ephemeral disk on your container is a datastore too. If you actually need what is stored for longer than the request... that is another story.
IMHO microservices done well should actually cut a whole vertical through your application's feature space. So not only should a microservice be completely responsible for its own storage of data, it should be responsible for how that data is shown on the front end (or as close to that as you can realistically achieve). A microservice should ideally be reviews or left navigation, not customer authentication or order processing.
Single writer multiple readers, ideally with permissions to enforce, is a useful hybrid. Enforce going through the front door if you want side-effects. Reporting and ad-hoc data enrichment can be painful to materialize otherwise.
When you have multiple bits of code responsible for writing the same columns, maintaining global invariants becomes much harder.
I can still see rationale for exceptions to the rule, e.g. status fields for rows which logically represent some ongoing process or job, where a UI controller may set something to "processing" and the background job sets it to "processed" when it's complete. But there are ways to close the loop here. For example, implement the background job by having the job processing system invoke the same system which owns the field and sets it in response to the UI, like this:
UI start ---> Back End ---> status := processing
Back End ---> add job to job queue
Job Queue ---> Back End does work
Back End ---> status := processed
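A hedged Go sketch of that loop, with the field owner as the only code path that ever writes the status (the job queue is stubbed with a channel and all names are invented):

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Owner is the single writer of the status field; the UI and the job queue
// both go through it instead of touching the column themselves.
type Owner struct {
	mu     sync.Mutex
	status map[string]string
	queue  chan string // stand-in for a real job queue
}

func NewOwner() *Owner {
	return &Owner{status: map[string]string{}, queue: make(chan string, 16)}
}

// StartJob is what the UI controller calls: set status and enqueue, in one place.
func (o *Owner) StartJob(id string) {
	o.setStatus(id, "processing")
	o.queue <- id
}

// Worker drains the queue and reports completion back through the same owner.
func (o *Owner) Worker() {
	for id := range o.queue {
		// ... the actual background work happens here ...
		o.setStatus(id, "processed")
	}
}

func (o *Owner) setStatus(id, s string) {
	o.mu.Lock()
	defer o.mu.Unlock()
	o.status[id] = s
	fmt.Printf("job %s -> %s\n", id, s)
}

func main() {
	o := NewOwner()
	go o.Worker()
	o.StartJob("42")
	time.Sleep(100 * time.Millisecond) // give the worker a moment before exiting
}
```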
Caveat: I am really not qualified to discuss the nuances (because I have never used microservices so the little I know is based on reading a bit on those here and on other online forums).
"Single writer multiple readers", yes, this is what I would probably use, but yet again, wasn't the "promise" of Microservices being able to work in total isolation?
If I have one table (e.g. "Customer") which is written by one specific microservice and read by a dozen or more... what happens when I decide that I have to change the schema because the current representation of "Zip Code" is not adequate anymore because, I dunno, we started dealing with UK customers now?
Lo and behold, I have to change code in 13 Microservices - the one actually writing to it, and the 12 more that only need to get the data to show or print or convert to JSON or whatever... ¯\_(ツ)_/¯
Database schemas are generally isomorphic with RPC data structures - usually a subset one way or the other. If you had a service which handed out Zip Codes and the structure of Zip Codes changed (maybe you used numbers and now you realize you need to use strings, which I infer from your UK example) you'd have an isomorphic problem: updating 12 consumers and 1 producer for the new type in the RPC message.
Having an RPC to migrate means you can put code on the read path to support backward compatibility while migrating. But you can do the same thing with the database - support both fields until all readers have ported to the new field - and since big database migrations are expensive, in practice large systems do migrations that way anyhow. Introduce the new field, start writing to it, back-fill it, port readers across individually (there may be many read locations even within a single service), drop the old field.
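A compile-only Go sketch of that dual-write window using database/sql (the table, columns, and fallback formatting are hypothetical; the ordering of the migration steps matters more than the code itself):

```go
package migration

import (
	"context"
	"database/sql"
	"fmt"
)

// During the migration window we write both the legacy numeric zip_code and the
// new free-form zip_code_v2, so old and new readers keep working. Once every
// reader has been ported and the back-fill is done, the old column (and this
// dual write) goes away.
func saveCustomerZip(ctx context.Context, db *sql.DB, customerID int64, zip string, legacyZip int) error {
	_, err := db.ExecContext(ctx,
		`UPDATE customers SET zip_code = $1, zip_code_v2 = $2 WHERE id = $3`,
		legacyZip, zip, customerID)
	return err
}

// Readers prefer the new column and fall back while the back-fill is still running.
func loadCustomerZip(ctx context.Context, db *sql.DB, customerID int64) (string, error) {
	var newZip sql.NullString
	var legacy sql.NullInt64
	err := db.QueryRowContext(ctx,
		`SELECT zip_code_v2, zip_code FROM customers WHERE id = $1`, customerID).
		Scan(&newZip, &legacy)
	if err != nil {
		return "", err
	}
	if newZip.Valid {
		return newZip.String, nil
	}
	return fmt.Sprintf("%05d", legacy.Int64), nil
}
```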
If a system evolved from a monolith, one way to partition services is to provide views to limit set of visible columns. That's another point at which you can put shims for compatibility while porting schemas.
First you publish the next version of your service with the new data, then the disparate teams in charge of 12 clients (not you) update the rest of the application, then the old version is retired and every team has coordinated properly. Microservices allow lean coordination, basically just asking every team when they think they'll be ready without anyone messing with someone else's code.
Actually you would only have to change the one reader and the one writer.
The services themselves do not differ, but are fanned out. E.g., 100 workers watch your pdf queue to generate pdfs.
If your zip code is really used by many services, the question is how you want to communicate those objects, and there are really a lot of choices out there: gRPC / SOAP / a shared model package / and the list goes on.
On the other hand, i have observed that people push full-blown models around just to end up reading one field of them in their monolith. I believe the PHP frameworks of the past were the biggest offenders there.
When i was working with sugarcrm for instance, it was not uncommon to pass huuuuge objects between functions because you needed 1 field. It's PHP so by default, function arguments are passed by value, so you can already see where the fun begins.
My point is precisely that: if you have to handle large quantity of state (e.g.: travel agency handling diverse item bookings to sell as a complete holiday packages - note that this includes having conflicts on inventory, like "cruise cabin categories" or "hotel rooms") microservices add latency by "replacing function calls with RPC", and gain you... an unspecified advantage in terms of... deployment? The possibility to have hundreds of developers working on the system in parallel?
I have always worked on medium-size monoliths during most of my career, and "ah, if we had 137 developers all working on this everything would be magically solved, but alas, we have a monolith" was a sentence I uttered (or heard) exactly 0 times so far.
Thank you for this as this is exactly how its implemented in a recent project of mine.
I didn't want to put unneeded information in the first post but here i have some space ;)
What I ended up doing was:
- two sqs queues, one for processing, one for failures after 3 tries
- the go backend reads from its queue, and the message contains a json object with all the information needed for pdf generation
- the backend posts the file back to the "internal api" of the application, which saves it in s3 and also sets processing to true, which means the user can go and download it.
This results in:
- my pdf generated the way i want it (sorry i dont actually make stardust)
- the queues are only reachable through the associated iam roles
- if a job fails it will be added to the failed queue, which sends me an email
- if i receive an email from the failed queue, 90% of the time i have found another edge case
- since the message already contains everything that i need to build the pdf, my edge case has just defined itself
- for the postback i supply an api token that is only valid for that one file
All this means my client has total flexibility in deployment.
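Roughly, the worker side of that setup can be sketched in Go like this (the queue is hidden behind a made-up interface rather than the real SQS client, and the field names are invented; the per-file token simply travels with the job message):

```go
package worker

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// Job is what arrives on the processing queue: everything needed to build the PDF,
// plus where to post it back and a token valid only for that one file.
type Job struct {
	AssetURLs   []string `json:"asset_urls"`
	PostbackURL string   `json:"postback_url"`
	Token       string   `json:"token"`
}

// Queue is a stand-in for the real SQS client (Receive/Delete only, no shared state).
type Queue interface {
	Receive() (body string, receipt string, err error)
	Delete(receipt string) error
}

func work(q Queue) error {
	body, receipt, err := q.Receive()
	if err != nil {
		return err
	}
	var job Job
	if err := json.Unmarshal([]byte(body), &job); err != nil {
		return err // after repeated failures the message moves to the failure queue
	}

	pdf := renderPDF(job.AssetURLs) // the actual "stardust" generation, stubbed

	// Post the result back to the application's internal API with the one-file token.
	req, err := http.NewRequest(http.MethodPost, job.PostbackURL, bytes.NewReader(pdf))
	if err != nil {
		return err
	}
	req.Header.Set("Authorization", "Bearer "+job.Token)
	req.Header.Set("Content-Type", "application/pdf")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("postback failed: %d", resp.StatusCode)
	}
	return q.Delete(receipt) // only acknowledge once the file is safely handed off
}

func renderPDF(assets []string) []byte { return []byte("%PDF-1.7 ...") }
```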
Well, it's all about the data responsibility: who is the owner of the data, how others can access the data. Once you have defined these, you see that you can "share the access" with other microservices (for example read only mode on a view), as long as the ownership and the access rules are preserved.
Yes, a view would be exactly how I would address the problem I described in my other answer above ("What happens if I need to change zip code representation in a data source that is read by multiple microservices?").
But this also means that we are now back into "YesSQL" territory, and specifically that we have to use an RDBMS which allows us to create views.
Goodbye NoSQL, goodbye key+value datastores. (Or maybe you will just create an extra "newZipCode" and maintain it in parallel with "ZipCode", allowing every other consumer to adapt at their leisure...?)
So it is another step back to "more traditional ways" to design a system... or a recipe for a disaster as soon as you start dealing with significant quantities of "state".
> "monorepo/monolith/microservices/etc" is -just- the way you organize your code
I don't think this is true.
I think — at least as far as I've observed — microservices in practice means replacing various function calls with slow and error-prone network requests.
> There is a lot of talk about monoliths vs microservices lately.
Actually it's been going on for years, and it's always the same argument. People think they're thought leaders for saying "Start with monoliths and only move to microservices if you absolutely need to!"
It's a pretty obvious conclusion to anyone who has worked in both environments, or has had to migrate from one to the other, so it's not particularly insightful. And yet here we are, 5+ years later, saying the same thing over and over again.
> It's a pretty obvious conclusion to anyone who has worked in both environments, or have had to migrate from one to the other, so it's not particularly insightful.
It should be obvious, but apparently it's not. So many architecture astronauts drinking the kool-aid and making a mess. Premature microservices can easily kill a product.
> "monorepo/monolith/microservices/etc" is -just- the way you organize your code.
It’s also about how you deploy your code. If you have 1000 micro services do you have 1000 deployment pipelines? If so how do you manage those pipelines? If not, you sacrifice independent deployment of each micro service.
I don't even think the deployment pipelines are the problem. If you have 1000 microservices you have 2^1000 possible states of your application based on any of those services being up or down (reality is much more complex). It is genuinely hard to keep that number of services up and running so you then need to be extraordinarily diligent about how you handle failures.
If you have that many services you could define a set of generic pipeline versions that you maintain. Then you can tag each service with which type of pipeline to use. The rest is solved with automation.
Yes and no. i totally get what you are saying, but this problem has been solved lately, in my opinion.
Also, deployment is part of code organisation, no? i'd like to point out that i explicitly mentioned you do not need to decide - you can use both at the same time - but i would like to try to answer your point anyway:
Maybe I am just spoiled with go and github, but those deployment pipelines can be directly integrated into your repository. In the same way, I can counter-argue that your build CI and deployment phase will take significantly more time, and if you change a variable from uppercase to lowercase you will wait for minutes too.
I come from a c# background a long time ago and this has been true with it for eons:
https://xkcd.com/303/
Another thing that I have noticed is that it's easily scriptable too. What I end up doing is providing a makefile in my repository too. This way i can define the build process, amongst other things, easily.
In the end: we have a predefined spec that will create the desired state on a host of our choosing.
Ansible really does not care if you deploy 1 or 1000 "services" for instance!
Tools like ArgoCD will also deploy stuff when you commit to your master.
There is tooling available for everyone, but what happens quite often is that the people in charge of the company expect the developers to be all-knowing entities that define everything end to end. The Kubernetes space is vast because we are still defining and communicating its future.
But recently? I am trying to think of something that would not be completely automatable in 2021.
Clarifying the microservice architecture concept with "how you are going to deploy your system", as per your example, is exactly what I've been trying to explain to my teams since the inception of the microservice architecture. There are too many concepts conflated into the "microservice" term: code architecture (separation of concerns), source code organization, deployment, etc. This is very confusing, which is the reason why it's now common to say that microservices are "hard".
Exactly, it doesn't have to be one or the other. So far, i've been using a monolith for core functionality and microservices for independent processing tasks.
That way, i can attach as much functionality as i want without bloating the main app and processing scales with demand.
Unless you have a strong technical or organizational reason to use microservices, using microservices is just more work to achieve the same results.
Organizational reason would be multiple people/teams who don't want or can't talk much to each other, so they develop pieces of a larger system as relatively independent projects, with clear API and responsibility boundaries. Frontend/backend style web development is an example of such approach, even though we don't typically call these parts "microservices".
A technical reason I can see is some component of a system actually having to be written in a different stack, or to be run in a separate location for business reasons (separate physical computer, separate VMs or containers don't count). Like a firmware running on an IoT system. Or most of the system uses python, but there's a really good library in java for solving some very specific problem, so let's use it.
If neither of these reasons stands, you don't have a microservice architecture, you have a distributed monolith. You just replaced some function calls with RPC - an RPC call that takes much longer than a local one and can randomly fail. Most of your microservices are written in a single stack, so you refactor common parts into a library, but then different services are stuck using different versions of this library. You end up with a much slower and much more fragile system which is harder to work on, for no good reason.
How about deployment speed? If I’ve got a microservice collecting events off a queue and writing a csv out to S3 on a schedule, it’s really nice to be able to add a column and deploy in minutes without having to rebuild and deploy a giant monolith. It also allows for fine grained permissions: that service can only read from that specific queue and write to that specific bucket.
People throw around “distributed monolith” like it’s a dirty phrase but I’ve found it actually a very pleasant environment to work in.
That's a fallacy. You've optimized for one use case, but you've made everything else more complicated as a consequence.
Deploying a single monolith is faster than deploying 10 microservices, especially if you find yourself in the model where your microservices share code - at that point you've ended up with a distributed monolith instead of microservices.
> You've optimized for one use case, but you've made everything else more complicated as a consequence.
Yes, but that use case happens to be something that I need to do 5 times a day - make a small (<200 line) code change to one small part of the distributed monolith, and deploy it immediately.
This also means that if something goes wrong, I can rollback those changes immediately without rolling back the work of any of the other 50 engineers in the company.
Very little signoff required, very small blast radius.
Say your little service just changed how it parsed backticks. That innocuous change may affect none of the immediately connected microservices, but another service three hops away relied on the old behavior of your parser through some complex business-rules-driven logic. Now go test and later troubleshoot that, versus standing up a single monolithic jar on your laptop and seeing the exception stack trace tell you exactly what you broke.
You would only deploy 10 microservices if all 10 changed at once. Why are all 10 changing at once?
> especially if you find yourself in the model where your microservices share code
Good architecture decisions trump the monolith vs microservices argument. I'm not saying cross-service shared code is inherently bad, but it does have a bad smell.
I don't know, is it really harder to deploy 10 services? Isn't it all automated? The organization overhead is lower because you can leave most of the app untouched and only deploy what you need to.
You could screw it up and make it harder for yourself, but it's not guaranteed either way.
> People throw around “distributed monolith” like it’s a dirty phrase but I’ve found it actually a very pleasant environment to work in.
For the ones I saw, the answer to the question "how do I run our project on a single laptop?" is "haha, you don't". That makes features which could take hours to implement take weeks. But deployment is 5 minutes instead of… 15? Not much of a trade-off.
Your CSV writer probably has both technical and organizational reasons for being an independent unit of development. Or, in other words, it is something which appeared organically, rather than someone deciding months before a project even started that authentication and user profiles should live in two parallel universes.
You bring up an excellent point. As of now, it is impossible for me to run my company's backend ecosystem on my machine locally. What we do could easily be done by one simple monolith, but our eng. lead is obsessed with overengineered microservice architecture, so nobody can actually develop locally. Everything on the backend is done by a bunch of lambdas behind an API gateway.
I got so burnt out developing in that environment that I asked to become a pure frontend dev so that I wouldn't have to deal with it anymore.
Often with monoliths, once they go into production service at scale, the deployment can be "3 days, arranged 4 weeks in advance, requiring 9 sign offs including one VP"
I would call that a megalith - not to be cute, but to give a sense of scale, which is useful. I think there is a point at which a monolith grows so big that it is painfully obvious its size has become an obstacle greater than the benefits.
Pain tolerance differs, so the label gets applied to different-sized monoliths.
> implement change on that branch (in this case adding a column), test
> deploy to prod
I realize the build and deployment process may be more complex than that making it hard... but it doesn't have to be.
I agree that a microservice OR even another system (a collection of services) is a good solution if you need to make quick iterative changes, and you can't do so with your current system.
That workflow is exactly what I do, but it's on a small codebase rather than on a big one.
The benefits of working on a small repo include:
* Faster compilation (only depend on exactly the libraries you need)
* Faster testing (only need to run the tests which are actually relevant)
* Easier to understand for new joiners because there's just less code to sift through
* Faster startup (this is probably java specific - the classloader is slooow)
* No (okay, fewer) rebase wars
* Easy to see what's currently being worked on because the Pull Requests tab isn't 200 long
* Very fast rollbacks without needing to recompile after a git revert (just hotswap the container for that particular service). You can't do this in a monolith without risking rolling back someone else's important changes.
I worked at a place that had a few dozen services and one DB, worked on by dozens of developers. It was amazing. Oh and one repo, that made a huge difference.
One strong operational reason I have seen recently is resource management.
A monolith where most API endpoints are instant and use constant memory, but some use much more memory and can be slower... is tough. If you just give a bunch of memory to each process you're overprovisioning, and if you try to be strict you run into quality-of-service issues.
If you split the API endpoints out into homogeneous groups, you now have well-behaved processes that are each acting similarly. One process could be very small, another could be much larger (but handle only one kind of request), etc...
of course the problem with standard microservice-y stuff is now you gotta have N different applications be able to speak your stack. The idea of a monolith with feature sets is tempting... but also can negate chunks of microservice advantages.
Ultimately the microservice-y "everything is an API" can work well even as a monolith, and you would then have the flexibility to improve things operationally later.
At the same time, you have to be at a pretty huge scale before resource over-provisioning really hurts the bottom line. You can buy a lot of compute for the price of a single engineer's salary, and it usually takes more than one engineer to support a microservice architecture. Most applications hit problems scaling the database vertically long before they exhaust resources at the application level.
Hmm I get what you’re saying of course but in certain domains (think B2B SaaS) you might be running some compute-intensive stuff enough to where the differential is an issue.
Imagine 95% of your workload can run in 100 megs but 5% requires 1000 megs. You can overprovision of course but in a world of containers if you can isolate the 5% and route it you’re going to have a lot less in terms of operational headaches
I mean, this is kind of how microservices should be done. Start with a MVP monolith then carve off microservices if needed (performance or large team size).
The problem is when the lead dev has been huffing the architecture paint too hard and starts prematurely spinning up microservices because it feels good.
Guy who hates micro services here (worked at a startup that tried to adopt them because the CTO's consulting friend who works in enterprise convinced him of how great it was).
From what I can tell, micro services are primarily a solution to an organizational problem and not a technical one and I think people trip over this. If you have 600 developers working on the same back-end then you can benefit from micro services, because you can't possibly fit 600 developers into one standup. You'd never be able to coordinate.
There are rare technical reasons to choose micro services, and there is value to micro services when they're used correctly, but if you don't have those precise problems then you don't need micro services. The list would include situations like having a subsystem that uses far greater resources than any other subsystem and would be more cost effective running on specialized hardware.
Most importantly tho, for going 0 to 1, context is king and always will be. Startups should not do enterprise service patterns because they're not enterprises, and neither is the guy coding alone in his bedroom. You're not Netflix. You're not Google. Those companies are so large that they'll encounter every kind of architectural problem. All a startup needs to do is communicate, release, communicate, iterate, repeat. It's the equivalent of asking yourself what NASA engineers did to send a rocket to the Moon when you're just a guy trying to start a car and get it down the road. Baby steps, for the love of god, baby steps.
Things like upgrading the language version should not require coordination between multiple dev teams.
Of course there's 100s of permutations that work. Optimize for your situation. And if you have no clue what the right call is, go as simple as possible until it breaks on you.
Absolutely agree. The point of microservices is to separate the concerns from an organizational point of view, not for technical reasons.
Most of the advantages attributed to microservices can also be achieved with a monolith architecture using a sane and rigorous design.
I find it frustrating at work when I see teams of 50 people having issues coordinating work because they have a monolith, while we often unnecessarily spread a team of 5 people among 10 microservices.
> There are rare technical reasons to choose micro services
Is this just the difference between working in an infrastructure-oriented space vs. a product-oriented one? In infrastructure I find it is often the case that most logical applications should be decomposed into multiple services, given scaling and blast-radius concerns, where having everything in a single application would increase the impact of outages.
It's not just communication issues. Things like framework and library updates become extremely difficult when the app is so large it is beyond comprehension for a single person. Multiple smaller upgrades are easier than one massive one. Especially when you have people working on it at the same time you try to update it.
Not only that, extremely risky. It's an "all or nothing" approach versus starting at low risk and moving up.
I think this is really important to remember with log4j and other vulnerabilities cropping up. It really sucks updating a single dependency version and having your dependency tree explode in conflict. It sucks even more when that prevents you from updating all your apps since they're all part of the same monolithic codebase
I've heard before that the microservices deployment scheme solves one particular task: if you get traction, you'll be ready for scaling. If you can't do that you are already dead, because being unable to take on 10x more users with a click (when they come) means your competition will do it instead. Is that still true?
No it isn't, and to my knowledge it never was. My previous CTO seemed to believe this too, but dealing with greater throughput can be handled in several ways, and the first would be scaling your infrastructure. If you're using AWS you can do this easily with Elastic Beanstalk, a load balancer, and some triggers so that the system knows to create a new EC2 instance. The thing is, though, your application has to be built in such a way as to support this. The more stateless your backend service the better, and that usually comes down to session management. If you're using JWTs and your backend doesn't keep any sessions, you're probably good. If your backend is keeping sessions then you need them centralized, otherwise you'll have multiple instances using different session information, and if your load balancer is directing traffic to instances round-robin or based on throughput or something, you'll end up with some confusing results on your client device.
So you have to do some extra work to make a monolith scale, but that extra work doesn't have to be microservices. It's much cheaper to spin up Redis and have all your backend instances use it for caching sessions, etc., than it is to split your app into microservices.
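For concreteness, a minimal sketch of a shared session store, assuming the Jedis client for Redis (the class name and key scheme are made up for illustration):

```java
import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPool;

// Hypothetical shared session store: any backend instance can read a session
// written by any other, so the load balancer can route a request anywhere.
public class RedisSessionStore {
    private static final int TTL_SECONDS = 1800; // 30-minute sessions

    private final JedisPool pool;

    public RedisSessionStore(String host, int port) {
        this.pool = new JedisPool(host, port);
    }

    public void save(String sessionId, String serializedSession) {
        try (Jedis jedis = pool.getResource()) {
            // SETEX stores the value with an expiry, so stale sessions clean themselves up.
            jedis.setex("session:" + sessionId, TTL_SECONDS, serializedSession);
        }
    }

    public String load(String sessionId) {
        try (Jedis jedis = pool.getResource()) {
            return jedis.get("session:" + sessionId); // null if expired or unknown
        }
    }
}
```

Whether the session is a JWT blob or a serialized object, the point is that any instance behind the load balancer can pick up where another left off.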
Monolith and Micro-services at different times during the progression of a business can have their places. I have experienced the pains of having to work with both within the same company over 9+ years and here is what I think are forces that can pull you in either direction.
A monolith makes developing initially a lot easier. Over 15 years though, you are bound to have developers of various calibre leave their mark on it. Many don't even understand good modelling and inevitably drive an otherwise healthy monolith with well defined boundaries into a soup of couplings between domain concepts that should not know about each other. In theory though it is totally possible to have micro systems within the same monolith code if you model things that way.
Eventually, your team will flip the table and start thinking about how to avoid the problems they are having with the monolith, and decide to go down the micro-services path. In most situations developers are likely to face a lack of commitment from the business to spend time/money on a technical problem they do not understand but would have to support the development of for a lengthy period of time. Most developers will compromise by building pseudo micro-services without their own databases, which send requests to the monolith where the core business logic had to stay.
The benefit of micro-services IMO comes from being able to separate business domains in a way that keeps each responsibility simple for a newcomer to understand and compact enough not to hide a great deal of complexity. It's worth saying this is a very hard thing to achieve and the average developer shop won't have the experience to be able to pull it off.
This is all to say, regardless of Monolith or Micro-services architecture, the key is experience and discipline and without a good amount of both in your team the outcome is unlikely to be a success over a long enough period of time. This is the curse of a business that lives long enough.
In some languages, you can enforce boundaries within a monolith nicely using the build system. The key is to break the build up into a hierarchically structured set of libraries somehow where each library only gets to use those libraries it is allowed to depend on architecturally. Independent bits of business logic would need to go into libraries that cannot "see" each other when they are compiled. Everything would still be part of a single build process and still be linked together into a single big program at the end.
The exact method depends on the language. In C/C++, you'd limit the include paths for each library. In C#, you'd have to compile different assemblies. And so on.
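In Java specifically, the module system can serve as that build-level fence; a minimal sketch with hypothetical module and package names:

```java
// billing/src/main/java/module-info.java
// Billing may use the shared core, but nothing else in the codebase.
module com.example.billing {
    requires com.example.core;
    exports com.example.billing.api; // only the public API package is visible to others
}

// reporting/src/main/java/module-info.java
// Reporting cannot "see" billing at all: it never requires that module,
// so an accidental import of billing internals fails at compile time.
module com.example.reporting {
    requires com.example.core;
    exports com.example.reporting.api;
}
```

Maven or Gradle multi-module builds (or Bazel visibility rules) achieve the same effect one level up, by controlling which modules may even declare a dependency on which.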
I think you didn't quite get the point of engineers of different levels of quality, talent and opinions working on the same monoliths.
Eventually they tear down any boundary, even those in the build system.
Developer discipline is something that eludes many companies for lack of enough high quality engineers and awareness for the problem in upper management. It's easier to quibble over formatting guidelines.
But the problem with this is it's a technical solution to a social problem.
If your developers are writing crap code in a monolith they're going to continue writing crap code in microservices but now you have new problems of deployment, observability, performance, debugging, etc etc.
As an aside, I have a sneaking, probably ahistorical, suspicion that the microservices hype happened because people realised Ruby (or similar trendy dynamic languages) often ended up being a write-only framework, and rather than try to recover some codebase sanity, people would rather abandon the code entirely and chase the new-codebase high.
> If your developers are writing crap code in a monolith they're going to continue writing crap code in microservices but now you have new problems of deployment, observability, performance, debugging, etc etc.
Anecdotally I witnessed this once. There was this huge ball of mud we had that worked okay-ish. Then the architects decided "hey, microservices could solve this", so we started building out microservices that became a distributed ball of mud. Every microservice shared and passed around a single data model across ~30 microservices, which made things interesting when we needed to change that model. Also, we took mediocre developers and asked them to apply rigor they didn't have in developing these services so that they were "prepared" for the failures that happen with distributed systems.
The big upside to management, though, was that we could farm out parts of the system to different teams on different sides of the planet and have each of them build out the microservices, with each team having different standards as to what is acceptable (what response messages look like, what coding standards should be, etc.). All of this was less of a technical problem and more of a management one, but we felt the pain of it as it was made into a technical problem.
I know. Which is why I have architectural tests that scream if someone uses the wrong module from the wrong place (adds a dependency that isn't what we want).
Of course, any dev could just remove the test, and by that tear down that boundary too. But that test being removed is much more obvious to the code reviewer who would not have noticed the dependency being snuck in. The architectural tests can also contain comments that form a high level documentation of the architecture. Unlike "architectural documents" in the form of some up front word document, this type tends to stay up to date.
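On the JVM, one way to express such an architectural test is with ArchUnit; a small sketch, with hypothetical package names:

```java
import com.tngtech.archunit.core.domain.JavaClasses;
import com.tngtech.archunit.core.importer.ClassFileImporter;
import com.tngtech.archunit.lang.ArchRule;
import org.junit.jupiter.api.Test;

import static com.tngtech.archunit.lang.syntax.ArchRuleDefinition.noClasses;

class ArchitectureTest {

    // Hypothetical packages; the point is that the dependency rule lives next to
    // the code, fails the build when violated, and removing it is obvious in review.
    @Test
    void billingMustNotDependOnReportingInternals() {
        JavaClasses classes = new ClassFileImporter().importPackages("com.example");

        ArchRule rule = noClasses()
                .that().resideInAPackage("com.example.billing..")
                .should().dependOnClassesThat()
                .resideInAPackage("com.example.reporting.internal..");

        rule.check(classes);
    }
}
```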
In .NET I do it on the binaries: I reflect over the assemblies and list the dependencies.
In the same run one can also validate that there are no incompatible transitive dependencies.
Within assemblies (between namespaces) is much harder unfortunately. That means assemblies have to be made to make module boundaries obvious, even though it’s a bit of an antipattern. There are tools such as nsdepcop that watch namespace dependencies, but it’s brittle and a lot of work.
So your argument is that bad developers will mess things up. I am sure most would agree with that argument. What does that have to do with monolith vs. micro-services? Bad developers will make a mess of micro-services as well.
You most definitely can enforce boundaries in libraries. Simply make sure that each library can compile and work with only the dependencies it is allowed to have available.
Code review is usually a bad place to catch design flaws, unless it's done early. Usually a bad design means re-doing a lot of the work. That means either getting management buy in or not hitting expected velocity. If not communicated well it can also lead to bad feelings between the writer and reviewer.
Where I work the code review process has mostly broken down. There is just no bandwidth to get anything done beyond the most basic sanity checks. To actually make someone improve their code I'd need the time, energy, authority, good will and probably other things to explain and teach the other programmer what and how to do it differently. But shit needs to get done and if the code works it's hard to convince management why I should invest so much of our time essentially redoing work that doesn't need redoing.
I suspect that this bandwidth problem exists elsewhere also.
If a team is not disciplined/skilled enough to build a well-structured monolith, the chances they can build and support a microservices solution (a distributed system with orders of magnitude more failure modes, requiring an order of magnitude more tooling and testing) are pretty much zero.
> Eventually, your team will flip the table and start thinking how to avoid the problems they are having with the monolith and decide to do down the micro-services way.
So, yet another comment saying that modularity requires a distributed system.
That's not what I was saying. I'm saying your team is likely to want to do something drastic and is likely to go down the micro-services path. This is not a comment on correctness. This is a behavioral prediction.
Maybe I'm too old, but I don't even want to have to worry about all that. I think in terms of functions and I don't care if they are being called remotely or local.
That was the promise back in the day of J2EE, and it seems to me Microservices are just a rehash of that same promise.
Which never really worked out with J2EE - it was mostly invented to sell big servers and expensive consultants, which is how Sun made money?
These days I sometimes don't even bother to make extra interface classes for everything. If I need it, I can still insert that, but no need to do it up front.
And "DevOps" is just a ploy to make developers do more work, that was formerly done by admins.
Writing every Microservice in a new language also seems like a huge headache. The only bonus is that you can attract developers by being able to promise them that they can work with some shiny new technology. Maybe that is actually the main reason for doing that?
Otherwise, again, perhaps it is my age, but I prefer to minimize the dependencies. Do I really want to have to learn a whole new programming language just so that I can fix a bug in some Microservice? I personally don't.
DevOps is a sad story both for devs and for ops. It was supposed to treat operations as a software problem, and thus take away the toil and draw devs in. In reality, for most places it either means that devs also do pipelines and operations, or that operations were rebranded and are using "DevOps" tooling to do operations.
To me, DevOps means SDEs can troubleshoot, diagnose, and fix app-related prod issues. SREs don’t have the app-level knowledge for that, so it’s wasteful for them to be involved. SREs should be paving paths for SDEs and responding to infrastructure ops issues.
I also think silos are unhealthy. SDEs need to have some overlapping responsibilities with SREs. Otherwise, you’ll likely have a “throw it over the fence” culture
I guess the idea is fine but the implementation tends to be about using 1 employee in multiple roles and working 60-hour work weeks while paying just one salary. You got DevOps, DevTest, then DevSecOps, etc. It's insane, the next step is probably being a DevTestSecOps sleeping under the company desk.
I am 100% not interested in ops. Deploying is handled by others way more qualified than I am.
I never expect a JS expert to build elixir, I don't expect an elixir expert to write bash, and I don't expect a bash expert to know about switches and cabling.
I don't understand where you draw the line..
Should the designer who also crafts the css do ops too?
I think high quality comes from specialists. Sharp knives in the hands of pros.
I suspect you don't mean it like that, but developers have to care just a little about operations. There's the classic stuff about developers who build stuff, because they didn't realize that ops could do the same with a few lines in a web server config, so they waste weeks on trivial stuff. There's also the issue that if you expect databases, queues, disks and so on to just be available, while not thinking about how you use them, then you can get bad performance or cause downtime and crashes in the worst case.
The initial idea of DevOps also seems to have been twisted into having developers do operations, rather than having the two teams work in tandem. If we absolutely must combine the two roles, I'd be more comfortable having operations do development work, but that's hardly ideal either.
These days DevOps is used as a bludgeon to fire sysadmins and make devs pretend to manage infrastructure. The idea from the corporate viewpoint is to save money by not having anyone to handle systems stuff, just shove into "The CloudTM" as serverless microservices, then just restart it when it falls over instead of troubleshooting.
Yes, I'm cynical. Having to use javalike stuff to do what should be a simple shell script will do that to you.
Just like the preceding question, it depends on context.
In a huge organisation, it can work well to be very specialized.
In, for example, a smaller growing business with an engineering team of < 10, having a proper understanding of the context in which your code will run is a game-changer.
The goal is to sustainably build software, which only can be done if ops and maintenance is regarded a goal, a pillar to build upon.
As with TDD, the design of the software changes when you have experience with ops/maintenance. Take logging/tracing: what to log or trace is first and foremost a matter of experience with maintenance/ops!
You build in logging/tracing/monitoring from the get-go because you know you'll want to know what went wrong, and when it went wrong, etc.
Can one regard oneself as an expert while knowing only part of the domain? I think not!
And this person is a DevOps person, in every org I've ever worked in.
Maybe "before my time" it was a webmaster or other admin. I think of "dev ops" as an umbrella term subsuming many specific titles and roles from the past, much like "data science".
It was a bit cheeky and off-hand. I am certainly interested in DevOps. If things can be automated, it is usually a good thing.
Still, I think having extra admins for server maintenance, or nowadays DevOps people, also makes sense. In theory I should be interested, but in practice, somehow I can not really get passionate about the details of how to host things. So it would be better to have specialists who really care.
Ultimately there are also a lot of aspects that are different from the programming problems. It's difficult to keep up to date on both fronts.
I don't know. I totally agree on the premise that developers should be involved in ops as well.
But I'm a developer by heart and my heart aches whenever I see what we devs have wrought in the ops space. (Insert joke about the CNCF technology landscape map.)
There's just so many tech/tools/etc. involved that just reasonably "doing the ops stuff" on the side seems way unrealistic.
Sometimes I feel that all we've accomplished was job security for legions of consultants.
> But I'm a developer by heart and my heart aches whenever I see what we devs have wrought in the ops space.
At the same time, I am conflicted. I don't care for the toss-it-over-the-wall approach that used to be the norm, but I also don't like having devs take on more than they are capable of.
In an ideal environment, I would like to see a crashing together of developers and operations people on teams. What I mean is that for a given team you have several developers and one or two ops folks. This way there is less of an us-vs-them sentiment and teams can be held accountable for their solutions from ideation to running in production. Finally, while it's not the sole responsibility of the devs to manage their DevOps stuff, they will have more knowledge of how it works, and get to put input and ideas into it.
This was the original intent of DevOps - a cultural shift that put sysadmins/ops people into the dev team, thus have a better feedback loop between dev and prod, and tear down the wall between dev and ops. But now it's being used as a way to eliminate ops people and push it all into dev.
> "DevOps" is just a ploy to make developers do more work, that was formerly done by admins.
Developers cost a lot more than admins, so this seems silly.
Moreover “devops” people are supposed to be developers, just ones with a whole bunch of networking and admin skills as well. But few places outside Google have real SREs.
Former Netflix engineer and manager here. My advice:
Start a greenfield project using what you know (unless your main goal is learning) keeping things as simple as possible.
Microservices is more often an organization hack than a scaling hack.
Refactor to separate microservices when either:
1) the team is growing and needs to split into multiple teams, or
2) high traffic forces you to scale horizontally.
#1 is more likely to happen first. At 35-50 people a common limiting factor is coordination between engineers. A set of teams with each team developing 1 or more services is a great way to keep all teams unblocked because each team can deploy separately. You can also partition the business complexity into those separate teams to further reduce the coordination burden.
> Refactor to separate microservices when either: 1) the team is growing and needs to split into multiple teams
I've heard this before, and I just don't get it. I've worked on multiple monoliths where hundreds of engineers contribute, and it's fine. You have to invest a bit in tooling and recommended patterns to keep things from going crazy, but you kind of need to do that either way.
> At 35-50 people a common limiting factor is coordination between engineers
So don't coordinate? If engineers working on different aspects of the codebase need to coordinate, that feels like something is wrong architecturally.
Ok, maybe a better way to say it is that having teams independently develop services is a good way to reduce the coordination tax if you have high coordination costs. If your environment doesn't have that problem I guess this doesn't apply.
Coordination between engineers was a frequent activity everywhere I've been regardless of how well built the systems were. For example: a new requirement for the customers signing up in a given country to have features X, Y, and Z enabled. In a large organization there are probably a few teams that will be involved that make that happen. The question is how to coordinate them.
Many companies try to solve it with top-down decision making, prioritizing certainty but hampering productivity (some teams have to wait) and strictly limiting risky innovation (nothing can be done without approval).
Independent teams (each developing independent services and acting without top-down approval) is a different way to coordinate development that values productivity (keeping everyone unblocked) and innovation (finding better ways of doing things).
> You have to invest a bit in tooling and recommended patterns to keep things from going crazy, but you kind of need to do that either way.
Aha, here's a difference. If we're talking about the same things, common tools and patterns don't need to be enforced. Independent teams can pursue very different patterns without needing to agree with each other. This is a big advantage if you don't like being told what to do by people who are pretty remote to the problem you're solving. Different people react differently to that. Netflix teams tended to be staffed with very experienced and skilled people (no junior or mid level positions) so there wasn't much value in one engineer dictating architecture, patterns, or tooling to another. Nearly all tooling was opt-in, and the good tools were the de facto standards. But if you came up with a better tool or pattern, you had the freedom to try it out. This is how independence fostered innovation, and why microservices were useful in such an environment.
> Independent teams (each developing independent services and acting without top-down approval) is a different way to coordinate development that values productivity (keeping everyone unblocked) and innovation (finding better ways of doing things).
I've had the opposite experience. In the monolith, anyone can make the necessary changes, because it's all one codebase that everyone is familiar with. At most, you might need some help/pairing/approvals from experts in particular areas, but in general any team can work independently.
By comparison, in the microservices world, many teams either don't want you to touch their service, or are using a tech stack so unfamiliar to you that it would take too long to be productive. And there's a rat's nest of interdependent microservices, so you end up begging other teams to adjust their roadmap to fit you in.
> Independent teams can pursue very different patterns without needing to agree with each other.
I see this as more downside than benefit. If everyone is using different tech stacks, it's harder for people to move between and contribute to different teams. And you end up with situations where one team uses Java, while another uses Scala, which brings in extra complexity to satisfy what are essentially aesthetic preferences.
When you had hundreds of engineers contributing, how did you manage releases?
We have a large number of teams working on a shared monolith, and a large number of teams working with separately releasable services.
One of our main drivers transitioning to the latter is the productivity gains that the teams get when they can release just a single set of changes on their own and release them on-demand (and during business hours).
For us, we release the monolith nightly with everyone's changes in it (not to all customers, but concentric releases). We find that the teams that are able to release on their own are happier and more productive.
At the place where this worked well, we released small changes throughout the day. We released the monolith around 50 times per day. Developers release their own changes in small groups. We deployed the main branch, and only used short-lived (< 1-2 weeks max) branches. Used feature flags to control when customers actually saw the features (as needed.)
Presumably the idea is that you make all changes backwards compatible with the currently running version and continuously roll them out in a progressive manner (like 1% of users get the new version, then 10%, etc.).
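Roughly, that kind of progressive rollout can be as simple as bucketing users deterministically; a minimal sketch (flag names and percentages are illustrative, real systems read them from config):

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Percentage-based rollout: hash (flag, user) into a stable bucket 0-99 and
// compare it against the flag's current rollout percentage.
public class Rollout {

    // Stable bucket per (flag, user) so a given user doesn't flip between versions.
    static int bucket(String flagName, String userId) {
        CRC32 crc = new CRC32();
        crc.update((flagName + ":" + userId).getBytes(StandardCharsets.UTF_8));
        return (int) (crc.getValue() % 100);
    }

    static boolean isEnabled(String flagName, String userId, int rolloutPercent) {
        return bucket(flagName, userId) < rolloutPercent;
    }

    public static void main(String[] args) {
        // e.g. show the new code path to 10% of users
        System.out.println(isEnabled("new-checkout", "user-42", 10));
    }
}
```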
What would stop you from splitting the monolith into libraries and having different teams maintain those libraries? It seems to give you the organisational advantages you want without paying the added networking complexity cost. The way I see it, a micro-service is a library + networking. Remove the networking and you have a library that can be compiled into a monolith.
This is exactly how I feel. Great to hear it from someone with Netflix experience. So, so many organizations jump headfirst into microservices before they even realize what that entails just because they heard it's the trendy new thing Netflix is doing.
If you make your code and architecture simple from the get-go, then you can refactor to microservices when you know you really need it.
It wasn't really a horror, and those charts are a little misleading. I'll try to explain.
Plenty of others who were also there at the time might see it differently, but IMO this was a trade off to get higher productivity and higher resiliency at the cost of higher complexity. So it was a feature not a bug.
When the cloud migration began in 2010, we had a straightforward architecture of a handful of Java monoliths and a giant Oracle database all running in a local data center. The DVD service made all the revenue but the future was the streaming service. It would turn out over the next 10 years we would need to grow engineering headcount over 10x to support the growing business complexity. Some of that complexity was foreseen and the intent was to address it with a culture analogous to a cellular ecosystem.
There are many successful designs in nature that evolved to a complexity beyond our current understanding, like the human body or even the protein signalling pathways within individual cells. These systems tend to have excellent homeostasis and survivability in the face of novel stimuli. Notably, each cell is fairly independent: it can receive signals but it acts on its own, and its internal complexity can be hidden from its neighbors.
We created a culture where teams were independent units. The sayings were, "loosely coupled and tightly aligned," and, "no red tape to ship." This is the opposite of an architect making top-down decisions. Instead we prioritized local decision making within teams which tends to be fast and effective, so this worked out well for productivity. But the side effect is that the overall system grew a bit beyond the limit of any one person to easily understand it. Thus the automated tooling to visualize the graph. But the dependency graph itself was rarely a problem. Any given developer was usually able to trace all the requests they were responsible for all the way down the dependency graph, and that's what really matters for development and debugging. Generally, no one needed to grok the zoomed out picture -- even during outages, problems were typically root caused to a particular service, and not related to the whole dependency graph.
But the dependency graph makes for a compelling graphic, so it gets people's attention. The real story is, "productive organizational culture made of independent units."
The only issue I have with microservices is when you're dealing with atomic things. Like in a monolith you'd probably just stick it all in a database transaction. But I can't find any good reads about how to deal with this in a distributed fashion. There's always caveats and ultimately the advice "just try not to do it" but at some point you will probably have an atomic action and you don't want to break your existing architecture.
Distributed transactions? Two stage commits? Just do it and rely on the fact you have 99.9% uptime and it's _probably_ not going to fail?
The need to do a cross-service atomic operation indicates that you chose the wrong service boundaries in your architecture.
And, since it's microservices, it's near impossible to refactor it, while it could have been a simple thing to reorganize some code in a monolith (where it is also a good idea to make sure that DB transactions don't span wildly different parts of the source code, but the refactor to make that happen is easier).
This is the big downside of microservices -- not the difficulty of doing atomic operations, but the difficulty of changing the architecture once you realize you drew the wrong service boundaries.
Microservices is great as long as you choose the perfect service boundaries when you start. To me, that's like saying you always write bug-free code the first time -- it's not doable in practice for large complex projects -- hence I'm not a fan of microservices...
Yes, this sort of explains my situation. A requirement appears down the line that just completely breaks the service boundaries.
An example being something like an online gun store, you have a perfect service that handles orders. It's completely isolated and works fine. But now, 2 years later some local government has asked you "whenever someone buys a gun, you need to call into our webservice the moment the order is placed so we know a gun was sold, and you need to successfully do it, or you can't sell the item"
Now you've got a situation where you need an atomic operation, place the order and call the web service, or don't do it at all. You could say just place the order, do the web service call asynchronously and then delete the order afterwards. But you might not have that choice depending on what the regulations say. You can't do it before you place the order because what if payment fails?
The order service should not have any idea about blocking orders in specific scenarios. And now the architecture has broken down. Do you add this to the order service and break its single responsibility? Will this be a bigger problem in the future, and do you need to completely rearchitect your solution?
I would say this is another problem. If an external call to a web service is involved, then you can NEVER have an atomic call in the first place. One always needs to just have a state machine to navigate these cases.
Even with a monolith, what if you have a power-off at the wrong moment?
What you are describing here is, to me, pretty much the job description of a backend programmer -- think through and prepare for what happens if power disappears between code line N and code line N+1 in all situations.
In your specific example one would probably use a reserve/capture flow with the payment services provider; first get a reservation for the amount, then do the external webservice call, then finally do a capture call.
In our code we pretty much always write "I am about to call external webservice" to our database in one DB transaction (as an event), then call the external webservice, and finally if we get a response, write "I am done calling external webservice" as an event. And then there's a background worker that sits and monitors for cases of "about-to-call events without matching completed-events within 5 minutes", and does according required actions to clean up.
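A minimal sketch of that journaling pattern (not the poster's actual code), using plain JDBC and made-up table/column names:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Timestamp;
import java.time.Duration;
import java.time.Instant;
import java.util.UUID;
import javax.sql.DataSource;

// "Record intent, call out, record completion", plus the sweeper that reconciles
// calls that never recorded completion.
public class ExternalCallJournal {
    private final DataSource db;

    public ExternalCallJournal(DataSource db) { this.db = db; }

    public void callWithJournal(String orderId, Runnable externalCall) throws Exception {
        String callId = UUID.randomUUID().toString();

        // 1. Record the intent (in real code this shares a transaction with the business writes).
        try (Connection c = db.getConnection();
             PreparedStatement ps = c.prepareStatement(
                 "INSERT INTO external_call_journal (id, order_id, status, created_at) " +
                 "VALUES (?, ?, 'STARTED', ?)")) {
            ps.setString(1, callId);
            ps.setString(2, orderId);
            ps.setTimestamp(3, Timestamp.from(Instant.now()));
            ps.executeUpdate();
        }

        // 2. Make the external call. A crash here leaves a STARTED row behind
        //    for the background worker to reconcile.
        externalCall.run();

        // 3. Record completion.
        try (Connection c = db.getConnection();
             PreparedStatement ps = c.prepareStatement(
                 "UPDATE external_call_journal SET status = 'COMPLETED' WHERE id = ?")) {
            ps.setString(1, callId);
            ps.executeUpdate();
        }
    }

    // Background worker: anything STARTED more than 5 minutes ago and never COMPLETED
    // needs its real outcome checked against the external system, then a cleanup action.
    public void reconcileStaleCalls() throws Exception {
        Timestamp cutoff = Timestamp.from(Instant.now().minus(Duration.ofMinutes(5)));
        try (Connection c = db.getConnection();
             PreparedStatement ps = c.prepareStatement(
                 "SELECT id, order_id FROM external_call_journal " +
                 "WHERE status = 'STARTED' AND created_at < ?")) {
            ps.setTimestamp(1, cutoff);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    // Query the external system for the actual outcome, then either
                    // mark this row COMPLETED or compensate (cancel/refund the order).
                }
            }
        }
    }
}
```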
If a monolith "solves" this problem then I would say the monolith is buggy. A monolith should also be able to always have a sudden power-off without misbehaving.
A power-off between line N and N+1 in a monolith is pretty much the same as a call between two microservices failing at the wrong moment. Not a qualitative difference, only a quantitative one (in that power-off MAY be rarer than network errors).
Where the difference is is in the things that an ACID database allows you to commit atomically (changes to your internal data either all happening or none happening).
Well that's the thing isn't it. As soon as you move away from the atomicity of a relational database you can't guarantee anything. And then we, like you do to, resort to cleanup jobs everywhere trying to rectify problems.
I think that's one of the things people rarely think of when moving to microservices. Just how much effort needs to be made to rectify errors.
You can always guarantee atomicity. You will just have to implement it yourself (which is not easy, but always possible, unless there are conflicting requirements around performance and network distribution).
And yes, the cleanup jobs are part of how you implement it. But you shouldn't be "trying to rectify the problems", you should be rectifying the problems, with certainty.
Create the order, and set the status to pending.
Keep checking the web service until it allows the transaction.
Once it does, set the status to authorized and move on to payment.
Keep trying the payment until it succeeds, then officially place the order and set the status to ordered.
I really find it hard to believe that the regulations won't allow you to check the web service for authorization before creating the order. If that's really the case then create the order and check, and if it doesn't work, then cancel the status of the order and retry. It's only a few rows in the database. If this happens often then show the data to your local politician or what not and tell them they need to add more flexibility to the regulation.
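A compact sketch of that status-driven flow (all types here are hypothetical, and the retries would be driven by a scheduled job rather than a tight loop):

```java
public class OrderWorkflow {

    enum Status { PENDING, AUTHORIZED, ORDERED }

    interface RegulatorClient { boolean allowsSale(String orderId); }
    interface PaymentClient { boolean charge(String orderId); }

    static class Order {
        final String id;
        Status status = Status.PENDING;
        Order(String id) { this.id = id; }
    }

    // Called repeatedly until the order reaches ORDERED (or a give-up policy cancels it).
    static void advance(Order order, RegulatorClient regulator, PaymentClient payments) {
        if (order.status == Status.PENDING && regulator.allowsSale(order.id)) {
            order.status = Status.AUTHORIZED;  // the external authority has allowed the sale
        } else if (order.status == Status.AUTHORIZED && payments.charge(order.id)) {
            order.status = Status.ORDERED;     // payment captured, order is official
        }
    }
}
```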
a) redraw the transaction boundaries, aka. avoid it, or
b) don't do it (see below),
c) idempotency -- so you can just retry everything until you succeed.
You can do distributed transactions, but it's a huge pain operationally.
(b) is not always as far fetched as you might think. The vast majority of Real World (e.g. inventory, order fulfilment generally, etc.) systems are actually much more like Eventual Consistency in practice, so you don't gain anything by being transactional when your Source of Truth is a real world thing. Inventory is a great example. It doesn't matter if your DB says that an item is in stock if there simply isn't an item on the shelf when it comes time to fulfill an order for that item. Tracking such things is always imperfect and there are already other systems overlaid which can handle the failure case, e.g. refund the order, place it on backorder, have a CSR find a replacement item, etc. (Of course you want reasonable approximate values for analytics, logistics, etc.)
Most people give up and do orchestration instead at some point.
Fortunately there are not that many things in the world that need to be 100% atomic so you can get away with a lot.
For your own microservices you generally have at least the option of "fixing" the problem properly even if it's at great expense.
But then you hit external systems and the problem resurfaces.
You can go crazy thinking about this stuff, at a certain point most business logic starts to look like connectors keeping different weird databases in sync, often poorly.
Pure crud api? Oh that's a database where client is responsible for orchestration (create folder, upload document...) and some operations might be atomic but there are no transactions for you. Also, the atomicity of any particular operation is not actually guaranteed so it could change next week.
Sending an email or an SMS? You're committing to a far away "database" but actually you never know if the commit was successful or not.
Payments are a weird one. You can do this perfectly good distributed transaction and then it fails months after it succeeded!
Travel booking? runs away screaming so many databases.
There are actually some patterns to deal with this, such as Saga - I'm actually working on a project (not open-source yet) related to this specific problem. You can reach me at can@hazelcast.com if you want to learn more.
Sagas are definitely what OP should look into here. Quite frankly, transactions are a bit of a fool's errand beyond a certain scale, yet are still treated with some absolute purity from the days of Codd. If you have multiple changes that need to be made, the "all or nothing" approach makes it really simple to deal with and manage.
Sagas are about recognizing the individual changes that are necessary, and dealing with the success or failure of them at a higher level. This is complicated though as the developer and the business now need to have a specific conversation around what happens if A succeeds and B fails? Does A need to get "rolled back"? Does B need to be retried? Does C need to wait until B succeeds before proceeding? That all brings in a level of complexity and the only answer is to manage and use the appropriate patterns and tools to do so. Wanting to go back to a land where you can just wrap it all in a transaction so that you get one nice boolean indicating success at the end is quite frankly just naïve. The real world doesn't work like that.
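A stripped-down sketch of the orchestration side of a saga (real saga frameworks also persist progress so that a crash mid-saga can later resume or compensate):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Minimal, framework-free saga orchestration: run each step, remember it,
// and if a later step fails, compensate the completed ones in reverse order.
public class Saga {

    public interface Step {
        void execute();     // e.g. "reserve inventory", "charge card"
        void compensate();  // e.g. "release inventory", "refund card"
    }

    public static void run(List<Step> steps) {
        Deque<Step> completed = new ArrayDeque<>();
        for (Step step : steps) {
            try {
                step.execute();
                completed.push(step);
            } catch (RuntimeException e) {
                // Undo everything that already succeeded, newest first.
                while (!completed.isEmpty()) {
                    completed.pop().compensate();
                }
                throw e; // surface the failure to the caller / orchestrator
            }
        }
    }
}
```

The hard part is exactly the conversation described above: deciding, per step, what its compensation actually means to the business.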
Well, there are distributed transaction systems you could use, but usually it's a good idea to ensure things happen inside one transaction in one microservice (also leads to better code in general, IMO - keep your transaction scope as small as possible)
Well, a simple solution is to only have one middleware that uses database connections, and have all other things around it be purely functional. (Though that may mean using their own databases, of course).
What do you mean by that? What if I have 2 services, each with their own databases, but the action is supposed to be atomic. E.g. there's a foreign key that links the data.
They're both gonna do their own database functions.
All ideas that are good in principle become absurd the moment they are elevated to a kind of dogma, applied to every problem, no matter whether it makes sense to do so or not.
Microservices are no exception from that rule, and often repeat the same mistake as OOP did with its promise of "reusable code".
Does it sometimes make sense to break some larger services up in smaller ones? Yes.
Does it make sense to "factor out" every minor thing of the implementation into something that can individually be manhandled into a docker container because at some point in the far future, someone may save a few minutes of typing by talking to that service? No.
Why not? Because on the 1:1000 chance that what the service does is actually exactly what that other thing requires, it will probably take more time to implement an interface than it would to simply implement a clone of the functionality.
I've seen organizations that have hundreds of developers organized in 5-10 man teams, each managing their microservice. I think it tends to happen when a large organization decides to get down with the kids and start to do microservices.
Conway's law enters into it in a lot of ways. Because of the way the people are organized into tiny isolated teams, the code shares that shape too. There is an event horizon one team/service away, beyond which nobody knows what happens or who wrote the code.
What you get is that the services actually don't do that much, except take a request from one service, translate it to an internal format, perform some trivial operation, then translate it to another external format and pass it on to another service. It's a lot of code, but it's not a lot of logic. Add to that maintaining the test and prod environments as code, and suddenly it looks like this is a lot of work, but you've essentially gotten a hundred people to do work that three people could probably accomplish if it wasn't for this pathological misapplication of an architectural pattern.
Going for microservices without a central director role is indeed madness and leads to inefficiency.
My employer has a landscape like that, hundreds of microservices each managed by a different team (some teams manage multiple). However, we have an enterprise architecture group whose job it is to keep an overview and make sure every microservice is meaningful and fulfills a clear role for the organization. Every project presents their architecture to this group as well as a group of peers and this often results in changes that increase cohesion and avoid redundant work. We had a few semi-interconnected monoliths before, and from what I’m told (I joined after the microservice transition) the new way is better.
However, I still wouldn’t recommend microservices to a new team / org starting from scratch. IMHO microservices only make sense when the system grows so vast it cannot be understood in its entirety by a single person.
> However, I still wouldn’t recommend microservices to a new team / org starting from scratch. IMHO microservices only make sense when the system grows so vast it cannot be understood in its entirety by a single person.
I wouldn't go that far. The problem is prescribing a stock design solution to every problem without even considering the problem domain or what benefits it will bring.
There are domains where this style of programming is an absolute benefit, even at smaller scales, and it's really nothing new either. A lot of the patterns in microservice design rhyme rather well with what Erlang has done for decades.
> The exact reason why I compare microservices to OOP ;-)
That's just because some guru -- Uncle Bob, or even more probably Martin Fowler, I think -- some (rather long by now) time ago wrote a lot of examples along the lines of "methods should be at most five, preferably three, lines long" in Java.
If you look for later examples of the same kind of recommendation you'll probably find they're mostly written in functional languages nowadays, so you could just as well say "that's why I compare microservices to FP".
I agree, but I think this kind of "ad absurdum" endpoint is principally an organisational problem.
Companies are hard. They have histories. Different people with opposing ideas. Dissonance between how the company operates and how it operates on paper. Conflicting personal interests, politics. Stories they tell...
They need some sort of rough-cut ideology, system or whatnot. A way of adjudicating right or wrong. That way arguments can be settled. People can be onboarded more easily. The company's decisions become more legible.
A simple requirement to report on important decisions, the choices available and reasoning behind the choice... stuff like this has consequences. An ethos is more flexible than a ruleset, and provides a way of navigating organisational issues.
People working on their own or in small autonomous groups don't tend to go overboard as easily.
In some companies teams are not divided in a way that follows technical fault lines, but rather along "product owner" lines, so that the only valid division lines are external-facing, superficial aspects.
E.g. think a car company where you are not organized as "engine team" and "transmission team", but rather "sedan team" and "SUV team", and the engine and transmission just need to happen somehow.
The microservices fad and the "every team owns their own services" fad combined can really grind performance to a halt in such a setting.
Product owners are suddenly the unwilling and unwitting chief architects.
At least with a monolith, everything is reasonably standardized and people from different teams can be expected to contribute to larger parts of it..
In what way do Microservices even help? It seems to me you still have to synchronize to be sure that the Microservice from team B does exactly the things that are specified in the new version?
Is it not easier to have a pull request that says "this will do thing x", you merge it into your monolith, and then you can see in the git log that this version will indeed do x?
How do Microservice organizations even manage that? Is it the reason Atlassian has a billion-dollar valuation, because people need it to keep track?
Well, not git, but modularity was invented for that.
You can have a modular monolith that works just as well with 100 people as something service-oriented would. The difference lies in the level of discipline needed. It's much easier to "just go in and make that field public because it makes my implementation easier" when you have a modular monolith. With microservices, you are more explicitly changing an external API by doing that.
Yes, it's the same thing. But somehow, psychologically, people feel worse about changing a networked API than making an identifier public instead of private.
Edit: I forgot, there's one more thing: with service orientation, you can deploy in finer grains. You shouldn't have heavily stateful services, but if you do (and you always do!), it can be cumbersome to redeploy them. At that point, it's nice to be able to deploy only the parts that changed, and avoid touching the stateful stuff.
The only thing you need to synchronise is API, no?
Which is where version numbers comes in.
To me, if you're working with a lot of modules/micro services, lots of modules should be able to sit on old versions and develop independently (which is the crucial part for the 100-developer scenario).
This is the right take on this. All the tech people here raving that microservices really make their life easier, even though they are working in a small team on an entire product, are looking through rose-colored glasses while chugging the Kool-Aid, and they are also not the intended audience. The tech complexity is hardly ever worth it unless you are a large corp.
Org complexity is a valid point. Sure you can solve it using microservices. But in this particular case (solving org complexity) such "microservice" is akin to a library with some RPC. You might as well have each team developing their "microservice" as a shared / statically linkable lib. Same thing in this context.
I feel like microservices are a solution for magpie developers. It's hard to keep a hundred engineers excited about working in an aging Java stack when there's all these shiny new tools out there. But maybe that's just my perspective, coming from a consultancy firm whose devs wanted to stay on the cutting edge.
I think this leads to silos.
MicroServices written in different lang's mean Java Dev X can't maintain Python service Y.. Not in an efficient way.
What's worse, the Java dev can't move to Team Y without upskilling not only on the service but also on the language, so they get pigeonholed. She also can't move because she's the only Java dev left.
Except when the team division in the company does not map to any natural service API boundaries, yet it is insisted that each "team" own their services.
Then microservices increase organizational complexity too.
Suddenly product owners and middle management are chief architects without even knowing it.
Except that you do not need microservices to solve organisational problems. You need, as has always been done, to have well-defined modules with well-defined interfaces.
If there’s no organizational barrier (e.g. microservices architecture, separate repos with strict permissions) that will prevent devs from leaking abstractions across technical boundaries, those well-defined modules and interfaces will devolve into a big ball of mud.
I say this with the assumption that the team is large and members regularly come and go.
Microservices is an organizational optimization. It can allow one team to manage and deploy their own subsystem with minimal coordination with other teams. This is a useful thing, but be aware what it's useful for.
If each of your developers manages a microservice, that probably reflects that you do no actual teamwork.
For the last five years I've been watching my colleagues struggle to dismantle a king of monoliths into services, to enable various teams to move at a faster pace and scale different parts independently.
Those people couldn't be more wrong. Monoliths are not your friend. Deployment times will be longer, requirements for the hosts where you deploy will be bigger, and you will end up renting _huge_ instances because some parts of your monolith want more RAM and some want more compute. You'll have to scale more than you really need to because some parts of your monolith are more scalable than others, etc, etc, etc.
You'll be unable to use different technologies to optimise that one particular use case, because it has to be deployed together with everything else and you totally don't have any infrastructure to call into another service. You'll be stuck with "one size fits all" technological choices for every component in your system.
A monolith is good basically only at the PoC stage. Once you have a business running, you need to dismantle it ASAP.
Good points, but I want to point out some caveats to a few of them. I overall disagree with your conclusion of "Once you have business running, you need to dismantle it ASAP". I think it's a case-by-case thing, and you're ignoring the significant complexity costs of a microservices approach.
> Deployment times will be longer
Not necessarily. I'd say that when you have multiple changes across boundaries that need to go out together, monolith deployments can actually be faster as you only need to do a rolling update of 1 container instead of N. But if by "deployment time" you mean the time between deploys, I agree. But also...so what? As long as your deployment times are good enough, it doesn't really need to be faster.
> requirements for hosts where you deploy it will be bigger
True
> you will end up with _huge_ instances
Not necessarily. Depends on the tech stack, framework, etc. I've seen .NET (and even Rails) monoliths that are huge and only use hundreds of MB/a GB or two of RAM. But I've also seen Java monoliths using 12GB+ to handle small amounts of traffic, so YMMV.
> You'll have to scale more than you really need to because some parts of your monolith are more scalable than another
A problem often easily fixed with throwing a bit of $$$ at the problem, depending on your scale and per-request profitability.
Microservices are such a bad idea for most companies out there.
Especially for companies doing on-prem software that suddenly want to "go to the cloud with microservices" without realising that microservices are the end destination of a long journey involving serious DevOps and a mentality radically different from on-prem.
It is just so easy to get it or implement it plainly wrong that it is a disservice to most companies to suggest microservices without a huge warning sign.
But... it's the CV zeitgeist, as Design Patterns and SOLID were a decade ago. These days if you don't do microservices and don't deploy to k8s you're not worth your salt as a dev. And if you're a company running a monolith you're not worth the time and money of worthy investors. Our field is pop culture. It's all about belonging and a plain aversion to history, which is why we go in endless cycles of dogma and disillusionment.
I'm sorry if I sound cynical, but you get some cynicism when you see the same movie about the great silver bullet for the 3rd or 4th time.
Nice article! Although I think you are overdramatizing microservices complexity a little.
- Kubernetes is rather a harder way to build microservices.
- DB is not an obligatory part of microservices.
- Kafka isn't as well. It's a specific solution for specific cases when you need part of your system to be based on an events stream.
- Jenkins is not necessary; you can still deploy stuff with a local bash script, and you need containerization whether you are on a Microservices or Monolith architecture
- Kibana, Prometheus, Zipkin are not required. But I think you need both logs aggregation and monitoring even if you have just a Monolith with horizontal scalability.
Also, all this is assuming you are not using out of the box Cloud solutions.
I originally had my search engine running on a kubernetes-style setup off mk8s.
The code is a microservice-esque architecture. Some of the services are a bit chonky, but overall it's roughly along those lines. Besides the search engine, I've got a lot of random small services doing things for personal use: scraping weather forecasts, aggregating podcasts, and running a reddit frontend I built.
I'd gone for kubernetes mostly because I wanted to dick around with the technology. I'm exposed to it at work and couldn't get along with it, so I figured we may get on better terms if I got to set it up myself. Turns out, no, I still don't get along with it.
Long story short, it's such a resource hog I ended up getting rid of it. Now I run everything on bare metal debian, no containers no nothing. Systemd for service management, logrotate+grep instead of kibana, I do run prometheus but I've gotten rid of grafana which was just eating resources and not doing anything useful. Git hooks instead of jenkins.
I think I got something like 30 GB of additional free RAM doing this. Not that any of these things use a lot of resources, but all of them combined do. Everything works a lot more reliably. No more mysterious container restarts, nothing ever stuck in weird docker sync limbo, no waiting 2 minutes for an idle kubernetes to decide to create a container. It's great. It's 100 times easier to figure out what goes wrong, when things go wrong.
I do think monoliths are underrated in a lot of cases, but sometimes it's nice to be able to restart parts of your application. A search engine is a great example of this. If I restart the index, it takes some 5 minutes to boot up because it needs to chew through hundreds of gigabytes of data to do so. But the way it's built, I can for example just restart the query parser, that takes just a few seconds. If my entire application was like the query parser, it would probably make much more sense as a monolith.
> - DB is not an obligatory part of microservices.
If the microservices don't have their own store, but are all mucking around in a shared data store, everything will be much harder. I wouldn't even call that a microservice, it's a distributed something. It can work, sure.
Back when I was studying CS in the early 90s, it wasn't obvious at all that I would work with a DB at any point in my career. I loved the subject, I passed with an A*. But I thought I wasn't going to see it again, because I didn't plan to work for a bank or some large enterprise.
Then, in about two years, everything changed. Suddenly, every new web project (and the web was also novel) included a MySQL DB. That's when the idea of the three-tier architecture was born. And since then, a few generations of engineers have been raised that can't think of a computer system without a central DB.
I'm telling this because in microservices I see the opportunity to rethink that concept. I've built and run some microservices based systems and the biggest benefit wasn't technical, but organizational. Once, the system was split into small services, each with its own permanent storage (when needed) of any kind, that freed the teams to develop and publish code on their own. As long as they respected communication interfaces between teams, everything worked.
Of course, you have to drop, or at least weaken, some of the ACID requirements. Sometimes that means modifying a business rule. For example, you can rely on eventual consistency instead of hard consistency, or on replenishing the data from external sources instead of durability.
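To make that concrete, here is a minimal Python sketch (all names hypothetical, not from the comment) of the eventual-consistency trade described above: each service keeps its own copy of the data it cares about and applies events idempotently, so late or duplicated deliveries are harmless.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class OrderPlaced:
        event_id: str      # unique per event, used for de-duplication
        order_id: str
        amount: float

    class BillingReadModel:
        """Billing's private copy of the order data it cares about."""
        def __init__(self):
            self.applied = set()       # event ids we've already seen
            self.open_orders = {}      # order_id -> amount

        def apply(self, event: OrderPlaced) -> None:
            if event.event_id in self.applied:   # replay-safe: at-least-once delivery is fine
                return
            self.open_orders[event.order_id] = event.amount
            self.applied.add(event.event_id)

    # Events may arrive late or twice; the model still converges.
    billing = BillingReadModel()
    for e in [OrderPlaced("e1", "o1", 9.99), OrderPlaced("e1", "o1", 9.99), OrderPlaced("e2", "o2", 25.0)]:
        billing.apply(e)
    print(billing.open_orders)   # {'o1': 9.99, 'o2': 25.0}

The read model ends up in the same state regardless of retries or delays, which is what lets each team own its own storage.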
Otherwise, I agree with the author that if you are starting alone or in a small team, it's best to start with a monolith. With time, as the team gets bigger and the system becomes more complex, your initial monolith will become just another microservice.
I'd see a "distributed something" that takes an input, processes it in multiple ways, possibly passing it around to some APIs or queuing it somewhere without ever needing to store it in its own dedicated area to be a good idea.
That could probably the best instance of something that can be built outside of the monolith and can be manipulated separately.
DB isn't needed. Our microservices pipeline either uses MQ, EFS or S3 for the destination for another pipeline to pick up. Unless you count those 3 as DBs ;)
Yeah, I would say those are key-value/document-based DBs. Just Silicon Valley hipsters coming up with cool names to call something different so it is a bit easier for developers to use. Anything that does permanent storage is a DB.
>"Jenkins is not necessary, you can still deploy stuff with a local bash script and you need containerization whether you are on Microservices or Monolyth architecture"
This is what I do. A single bash script, when run on a bare OS, can install and configure all dependencies, create the database from a backup, and build and start said monolith. All steps are optional and depend on command line parameters.
Since I deploy on dedicated servers I have no real need for containers. So my maintenance tasks are: ssh to the dedicated server and run that script when needed. Every once in a while I run the same thing on a fresh local VM to make sure everything installs, configures, builds and works from scratch.
Of course this advice doesn't always make sense, but it makes sense more often than people want to admit. Not that this is an original claim, if anybody remembers the "majestic monolith."
Simply put, the microservice path makes a lot more sense when your organization is so big that you don't really know what other teams are doing all the time and need a clear delineation of areas of responsibility, or maybe if you're dealing with very large volumes. That doesn't describe most orgs, but if it describes you, consider microservices.
Oftentimes however with microservices it's the tail wagging the dog. Microservices architectures become such a burden that they strain the capacity of the existing team which leads to more hiring which thanks to Conway's law leads to more microservices being built which leads to more operational and architectural overhead which leads to more hiring...
That can definitely happen and be painful. I now work at a very large organization, though, and the benefits of microservice design are obvious (as were the pains of monolithic ones when I was working on an old legacy monolith in the same place).
Here's my take on microservices: It's a form of modularization. Like all modularization it can be useful to uncouple larger parts of a system - to enforce an architecture, avoid unintended cross-dependencies, make explicit the dependencies that are intended, to allow teams to work more independently, and to allow the different parts of the system to be be developed, compiled, built, deployed (and sometimes scaled) independently.
But all this comes at a cost - you have to know where the module boundaries are, because moving them is much harder. The flexibility that you've gained within each module comes at the cost of making it harder to move the border between modules. It's very easy to put down a module border in the wrong spot, and then a refactor (needed for cleanup or a performance improvement) goes from tricky to impossible. E.g. that piece of info needed in this module now canonically lives in the other one. Or we accidentally add a couple of N's to the time complexity of an algorithm that works on data in multiple modules.
But getting the borders right on the first try is hard, unlikely even. Where those borders should be depends on a lot of things - the user domain, the performance characteristics, the development team(s) (Conway's law and all that) and the ways in which the business requirements change. For that reason, I think most successful modularizations are either done by breaking down an existing monolith (providing these insights) or by basing it on a known running system that is "close enough" in its needs and environment.
More and more I think of OOP and services as the same thing at different scales. Objects are services, dependency injection is your service discovery/orchestration layer. Your monolith is already a microservice architecture.
In the end, extracting a microservice from a monolith built this way is just a matter of moving the implementation of the object to a separate application, and making the object a frontend for talking to that application.
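A minimal Python sketch of that idea (names and endpoint are hypothetical, not from the comment): callers depend on an interface, and whether the implementation runs in-process or forwards to an extracted service is decided purely at injection time.

    from typing import Protocol
    import json, urllib.request

    class ReportService(Protocol):
        def generate(self, account_id: str) -> dict: ...

    class LocalReportService:
        """Original monolith implementation: plain method calls."""
        def generate(self, account_id: str) -> dict:
            return {"account_id": account_id, "total": 42}   # stand-in for real logic

    class RemoteReportService:
        """Same object, now a thin frontend for an extracted service."""
        def __init__(self, base_url: str):
            self.base_url = base_url
        def generate(self, account_id: str) -> dict:
            with urllib.request.urlopen(f"{self.base_url}/reports/{account_id}") as resp:
                return json.loads(resp.read())

    def monthly_summary(reports: ReportService, account_id: str) -> str:
        # Callers only see the interface; injection decides which one they get.
        return f"total={reports.generate(account_id)['total']}"

    print(monthly_summary(LocalReportService(), "acct-1"))

Extracting the service later means swapping which class gets injected; the caller does not change.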
The single biggest reason OOP gets a bad reputation is because lots of languages insist on treating OOP as the be-all end-all of code structure (Java and Ruby are particularly bad examples) and insist on trying to shoehorn this sort of logic into tiny dumb pieces of pure data.
For the life of me, I never understood, nor will ever understand, why people think making RPCs is easier or leads to better design than making normal function calls ("split your code into microservices, it will make your code modular, smaller and easier to understand!").
There are legitimate reasons to put a network between one piece of code and another, but "modularity" is not one of them.
Microservices don't necessarily mean K8S running on self-managed compute instances, as in the example given in the article.
The main mistake of the article is that the premise is "microservices" but the examples are about infrastructure (K8S etc). The two things are not tied: you could have that same architecture cited in the article running monolith instances, for example, and it would be every bit as complex to manage as the microservices example, without any of the advantages.
Fully managed serverless solutions like Lambda/Cloudflare Workers/etc, managed through SAM or the Serverless Framework, solve most of the problems cited in the article.
The whole infra in our case is API Gateway + Lambda + DynamoDB, this infra is not more complex to manage than our old monolith which was on ALB + EC2 + RDS.
Deployment is one "sam deploy" away (done in Github Actions), this is actually even simpler than our previous monolith.
I disagree, microservices are an architectural concept related to the software, not to the infrastructure.
Whether you are using containers or VPS or serverless or bare metal for your infrastructure, that's completely unrelated to the concept of microservices: you can deploy either a monolith or microservices in any of the above.
As an example you can deploy a monolith on Lambda[1] or you can deploy microservices on bare metal using one of the several self managed serverless engines available[2].
I've been a professional programmer since the late 1990s. What I observe is that many people flocked to microservices because they had only known monoliths and they were sick of the limitations inherent to them. Now we have a generation of engineers who have only known (or been taught) microservices and a backlash has developed because they are sick of the limitations inherent to them.
This is my reading of the anti-OOP / anti-inheritance movement as well. To spice up Bjarne Stroustrup (inventor of C++), "There are only two kinds of programming languages: those that don't make it and those that people bitch about."
I am just using services with my medium sized application.
They are not "micro", but they separate the domains and different concerns pretty well.
I have the deployment setup more or less like a monolith, but still having separation of concerns with my services. And stateless service runners.. Fair enough I have the state in a single (mirrored) database. But this works perfectly fine for the medium sized app.
Not sure why everything must be black or white.
I've felt SOA is the easiest to grow because it encourages you to swap out concrete implementations as requirements change. For example, IUserService can start off with a "local" UserService implementation that makes direct calls to a database. Once you signup with an IdP this might become UserServiceAzureAD/Okta/Auth0. Unlike microservices, I keep my compile-time guarantees that IUser continues to have the properties I require without any tooling.
Given the rhetoric here I worry that I'm the only person who's genuinely swapped out their implementation. The ol' "N-tier is stupid - you're never going to change the database" comment couldn't be more wrong.
Worked with huge monoliths: business critical, predictable usage, easy debugging, simple deployment, quick onboarding even with less documentation.
Worked with microservices: business critical, predictable usage, less documentation; onboarding took a long time spent on understanding how it works, hard to debug, never needed to scale up. (Why microservices?)
Lessons learnt:
The problem is never with the architecture. Why you choose one over the other is the question to ask when you start.
One problem I've never seen mentioned regarding microservices is what happens when the organization that produced them moves from growing to stagnating and finally into decline. If microservices ship the org chart, what happens when that organization fails to attract engineers?
I worked for a company that microserviced themselves into a pit. We had a huge layoff which completely killed the morale of the remaining engineering staff. Over the next 3 years, more and more engineers left and the company refused to replace them. What started out as teams (~5 people) responsible for ~3 microservices ended up as teams of ~3 people responsible for ~6 microservices.
It was kind of cool to be responsible for more architecture and see how it all fit together, but the sad reality was that too many of the microservices that had been stood up were stitching together disparate data from other microservices to then perform what a single SQL query against a RDBMS was doing.
Managing complexity is hard, no matter the approach to it. Microservices define boundaries around component inputs and output and try to reduce complexity by black boxing functionality. Monoliths try to reduce complexity by managing code (languages, libraries, databases, etc). I'm not sure there really is a good answer between the two because over time:
1. Vendor (and open source) blackboxes get introduced to monoliths, and you end up with a monolith built around a bunch of microservices.
2. Common tooling has a huge benefit, and copy-pasta code gets the job done faster than everything being custom every time. So you end up with microservices where every task ends up importing some giant framework and using common libraries - the monolith gets imported into the microservice.
Software gets complex, faster than we all like to believe... It seems like software has gravity, attracts complexity, and ends up being a big ball of mud in the end, almost every time.
While it is easy to start with monolith, it's not easy to just go from monolith to Microservices in case it is determined that's the best path forward. Often organizations don't have the luxury to "learn" by doing a monolith first and then determining the boundaries. In most cases, market pressures, limited budgets keep monoliths monoliths. OTOH, my experience is that it is easier to combine overly granular Microservices into just right size Microservices on an ongoing basis. And yes, infra requirements are bigger for Microservices, but often it is highly automate-able.
I think, one of the reasons, besides all the usual reasons, that keeps Microservices in vogue is the ability to align with organizational structure and enabling independent-ish teams.
Came here to say this. I think most of us suffer from this symptom of wanting to make the system more easily modifiable later etc. by "decoupling" various components, but as a result turn it into an unnecessarily complex blob that ironically makes our lives harder not easier down the road, in many cases.
If anything, the overarching theme should be "make it simple, but not simpler" and try not to increase the number of components in the system without necessity, until and when the proper time comes. Doing things like proper testing and CI/CD from the start are much more important because that's what actually allows one to refactor the system later to introduce additional components, de-coupling etc. where needed.
Otherwise, all of this is just shifting the complexity around.
It seems to me that the author is poor at designing microservices.
Using his example: the login, session and user services should be a single service (something like Keycloak); there's no advantage to splitting this up, so why would you?
An analytics service should never be a dependency of another service, but should rely on service discovery and a preset telemetry/analytics/RPC endpoint which each of the consumer-facing services implements.
Has anyone claimed that microservices will fend off poor design? There's never a silver bullet.
What has worked well for us, something that IMO combines the best out of both worlds:
* break down the problem into sensible components. For a reporting system I'm working on at the moment, we're using one component per type of source data (postgres, XLSX files, XML), one component for transformations (based on pandas) and one component for the document exporter.
* let those components talk to each other through HTTP requests, including an OpenAPI specification that we use to generate a simple Swagger GUI for each endpoint as well as to validate and test result schemas.
* deploy as AWS Lambda - let Amazon worry about scaling this instead of running our own Kubernetes etc.
* BUT we have a very thin shim in front that locally emulates API Gateway and runs all the lambda code in a single process (see the sketch after this list).
- (big) advantage: we can easily debug & test business code changes locally. Only once that is correct do we start worrying about deployment, which is all done as IaC and doesn't give us trouble that often.
- (minor) disadvantage: the various dependencies of the lambdas can conflict with each other locally, so we have to be careful with library versions etc.
Doing so, the scaling works quite well, there is no up-front infrastructure cost, and code tends to stay nice and clean because devs CANNOT just import something from another component - common functionality needs to first be refactored into a commons module that we add to each lambda, which gives us a nice point for sanity-checking what goes in there.
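Here is a rough sketch of what such a shim can look like in Python (handler names, routes and payloads are made up): it builds API-Gateway-style events and dispatches them to the same handler functions Lambda would call, all inside one local process.

    import json

    # Handlers with the usual Lambda signature: (event, context) -> response dict.
    def transform_handler(event, context):
        rows = json.loads(event["body"])
        return {"statusCode": 200, "body": json.dumps({"count": len(rows)})}

    def export_handler(event, context):
        return {"statusCode": 200, "body": json.dumps({"url": "file:///tmp/report.xlsx"})}

    ROUTES = {
        ("POST", "/transform"): transform_handler,
        ("POST", "/export"): export_handler,
    }

    def local_api_gateway(method: str, path: str, body: str) -> dict:
        """Emulates just enough of API Gateway: build an event, call the handler."""
        handler = ROUTES[(method, path)]
        event = {"httpMethod": method, "path": path, "body": body}
        return handler(event, context=None)

    print(local_api_gateway("POST", "/transform", json.dumps([1, 2, 3])))

In the deployed setup the same handlers are wired to real API Gateway routes; only this routing table is local-only.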
I didn't even get a quarter of the way down the page before I stopped reading.
As soon as the author started listing things like k8s as needed for microservices, it shows they haven't stopped to think outside the box. There is no reason you can't run your set of microservices as 3-4 docker containers on the same host: no load balancers, no k8s, no log aggregation, etc etc etc.
If your application makes sense as microservices, you don't need to start with all of that, so including it all in the cost of startup makes no sense at all. As your application starts to scale out and need them, add them at that time; you're going to need most of it for a monolith application as well, and some of it you may NEVER need (k8s for example - there is no reason you can't run your application on just plain old compute infrastructure, you don't even need to look at the "cloud", that old box in the corner of your office might be all you need for the project).
If you stop and remember that "microservices" just means small single-function services, not things like k8s, you will probably find that you can actually do a lot less work if you go down that road, by letting other existing projects do a whole bunch of the work for you and saving you from re-inventing the wheel to get your project finished and out the door.
I think starting with a decently written monolith is a great idea. The things that I always saw _make things bad_ (fsu) are auth, encapsulation (with no domain boundaries; typically 1 class with lots of responsibilities that are only useful in specific contexts but depend on common data and are hard to refactor - the classic 3k lines of Python code for the User class) and ORMs.
Often these things (auth and orm) are pushed by frameworks and don't scale well because of the magic they use to make things work.
I think today, if I had to start from scratch, I would use lambdas as much as possible for all common external-facing logic (auth etc.) and GET routes (for scalability), and have a monolith (with very simple ORM features) that implements business logic on top of a managed DB (like DynamoDB or Aurora).
Somewhere in the comments I have seen Elixir mentioned, and I really like the approach Elixir (and Phoenix) take to organizing the domains in a monolith, together with a little reactive approach and of course optimizing for deletion: https://www.netlify.com/blog/2020/10/28/optimize-for-deletio...
My understanding from the HTML template days is that a monolith contained both the business and presentation layers together. I made plenty of those with Struts 1.x and Spring MVC. Microservices were pitched as a means to separate the front end and back end.
I'm a bit confused on microservices vs monolith in the modern SPA context. If I have an application whose front end is React and whose back end is Go, and the two communicate over REST/HTTP+RPC, do I have microservices or a monolith?
Having worked on some cloud-native / cloud-first apps, I would love to see a language or framework along the lines of a class being a lambda/cloud function, where calling a method of that class (it doesn't have to be OOP, someone smarter than me must figure that out) makes the language itself sort out all the HTTP requests and deployments and junk for you, and your project becomes scalable, serverless services but with code that is actually coherent and readable.
You'd probably be interested in Unison ( www.unisonweb.org ). It's a Haskell-style functional language, which aims to make the sort of distributed computing you're talking about easy. The central conceits are:
1. Functions are hashable
2. The codebase and runtime are deployed uniformly to every node
3. Networking these nodes allows functions to be cross-referenced and passed by their hash values
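Short of a dedicated language like that, the closest everyday approximation is a dynamic proxy. A rough Python sketch (endpoint and payload shape are purely hypothetical) of turning ordinary method calls into HTTP requests:

    import json, urllib.request

    class CloudFunctionProxy:
        """Any attribute access becomes a callable that POSTs to <base_url>/<method>."""
        def __init__(self, base_url: str):
            self.base_url = base_url

        def __getattr__(self, method_name):
            def remote_call(**kwargs):
                req = urllib.request.Request(
                    f"{self.base_url}/{method_name}",
                    data=json.dumps(kwargs).encode(),
                    headers={"Content-Type": "application/json"},
                )
                with urllib.request.urlopen(req) as resp:
                    return json.loads(resp.read())
            return remote_call

    # Usage (against a hypothetical deployed endpoint): reads like normal code.
    # billing = CloudFunctionProxy("https://example.invalid/billing")
    # invoice = billing.create_invoice(customer_id="c-42", amount=99.0)

What a full language-level solution would add on top of this is the deployment, typing and failure-handling story; the proxy only hides the transport.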
I agree with not starting with microservices... But it's better not to wait too long once the project is growing.
Regarding his points:
- Infrastructure requirements:
You don't need all that stuff! You can have multiple services run on PaaS services / Cloud Run, and you don't need to deal with all the Kubernetes stuff. Even if you prefer K8S, you still don't need a service mesh from the start. Datadog and GitLab bring you very far with hardly any work on your side.
- Faster Deployments
My point: 80 microservices? Crazy... Why? Just have a service per business domain, and try unifying the CI/CD stack.
We had 1 big monolith which would deploy 5 times per day, but every deploy took around 1 hour.
Having 5 to 10 services that all deploy within 5 minutes is so much nicer to work with.
- The Supporting Culture
This is important: Architecture follows company organization, and vice versa. Every team often owns 1 or 2 services. Teams should be organized by domain. Business boundaries should be agreed upon in a higher level.
Sitting in your dev corner, building services without talking to and aligning with the product owners & MT, is a recipe for disaster imho. The services should solve a problem that POs understand. You should have alignment.
- Better Fault Isolation
We never said it was going to be easy... It requires a different way of building your system. You need to think distributed systems.
The only addition: often it is useful to design with microservices because it means you don't have to create and maintain them yourself. Even a database is a microservice. phpMyAdmin (and similar) is a microservice, and so are other open source projects I like: Celery, Sentry, Mailhog, Jupyter, Nextcloud, even Caddy or Nginx. All could be considered microservices. Integrating them into your platform often makes a lot of sense.
For example, right now I’m building a system that will take in a URL, crawls the webpage, and does some processing on the data. The whole process takes a good 10 seconds.
I designed this as a microservice, simply because I know URLs will need to be queued up. Should this have been done as a monolith? Or am I right that microservices were actually the right approach?
In rails this would be an activejob (part of the mono), and spinning up workers would be trivial.
The "workers" would pull off a redis queue, and you could add different jobs to the queue easy-peasy.
That's fair, maybe I'm already planning too far ahead. But I ended up using RabbitMQ and Node (since I wanted to use Puppeteer for this, as I actually needed to render the webpage).
I did something like this too, and figured I'd rather do it for free as a lambda running under Vercel rather than create a droplet or something else and pay for it...
I have worked in companies which had monolith software, but they have been large and established businesses.
I will not name names, but I can say this...
All but one had constant, repetitive and quite painful failures, not fixable, it seemed.
One company was an exception, with some failures but a clear culture of pushing on issues while implementing new features.
And one company I have been with from its startup days until 150 employees had a microservice infrastructure; I have never seen so little downtime, or such a smooth back office, front end, database, and reporting system.
The CTO was the owner and main developer; if something went awry, he would start working on the fix within minutes, no matter the day or time. The fastest issue response time ever.
Two lifetime downtimes of a handful of minutes for the full service, and a couple of components which didn't work for maybe an hour or sometimes overnight.
I have to say though, when one microservice broke, it took down more tangential services than one would think, but other than that, hands down the best software I ever worked with.
I hate the idea of replacing function calls with network requests for organizational reasons, even for a fast network in the same datacenter you add several orders of magnitude of latency to the calls. If the problem is that groups can't work independently of each other in the monolith, it should be solvable with modularization and defining APIs for those modules.
> If you’re going with a microservice:
> A Kubernetes cluster
> A load balancer
> Multiple compute instances[...]
> Jaeger/Zipkin for distributed tracing
To be fair, using K8s + Helm I was able to install logging, Grafana, and Prometheus very easily. If you leverage Helm 3 and Bitnami [1] Helm charts, you can go fast.
Also, you can use pipelines (like GitHub/Bitbucket pipelines) to deploy and remove Jenkins completely: I have done it and it is a viable solution (although with some lock-in).
So the complexity is a bit less if you study your setup well enough, but you must take time to plan your solution.
After three years of K8s study, in my humble opinion K8s is far better than Docker Swarm, even for tiny projects, with K8s as a cloud-managed solution (even small providers have it nowadays).
Another point of confusion among many folks is monorepos. Some consider "monorepos = monoliths", which is more like comparing apples to oranges imo.
Monorepos provide some of the benefits of monoliths in terms of
1) making it easier to refactor your entire codebase at once
2) sharing some common dependencies like framework versions etc., which makes it much easier to keep those up to date and more secure, as a result
3) sharing some common utilities like unit testing harnesses etc.
4) other things
At the same time, monorepos don't force one into a monolith architecture at all, and in fact can provide most of the same benefits as microservices in terms of separate deployments etc.
The most important lesson in my mind is, there's no "panacea" or "perfect structure" as a prescribed way to do things. Every system has its own demands (which also change over time), and the only "should" is to tailor things on a case-by-case basis, adjusting to a better structure when needed.
Microservices are just a way to implement a distributed system.
The problem seems to be that quite a number of teams don't have any training in system design (mono/distributed/mixed).
Most teams go for microservices because of hype, and because they see a spaghetti monolith and believe the problem is the monolith and not the rotten, badly modularized code.
> What if the product (department) doesn’t give a damn about the underlying system architecture? I mean shall they?
They should not. Either it works correctly, or it doesn't. Even buildings need to be useful, even though people might admire the architecture. No non-technical person will admire your microservice or whatever architecture.
IMO, the big advantage of microservices over monoliths (if they're done right) is reducing iteration times. It's just a lot quicker to build/run/etc. I think monoliths are a fine starting point but once your iteration times get slow, that's when it's time to break it up.
It's a big tradeoff for maintenance complexity and cognitive load though, and people often don't realize how big that tradeoff is. Chasing bugs and maintaining the spiderweb of network connections between all your services can quickly become a nightmare. A distributed ball of mud instead of a monolithic ball of mud, but a ball of mud nonetheless.
Personally I lean towards building a monolith first, then breaking out features into separate services if you need it. But I've worked on teams that advocated for microservices needlessly, and it was 100% cargo cult behavior: "Netflix does it, so we should too."
Another anecdote: my current company, a startup, could've launched its product a year earlier than it did, but our engineering lead insisted on pre-engineering a Rube Goldberg machine of a microservices backend before we even had any prospective customers. Took months of engineering time and headaches to grok all of it, when in reality, one monolith and a basic database could've done the job for years before we'd ever have to scale past that.
But microservices architecture looks great on a resume, so /shrug
That's awful, and probably the most egregious example I've heard of a cargo cult gone wrong. The longer I'm in this field the more disappointed I am by the influence that trendy groupthink has. I know no field is immune to it, but it seems more pervasive in software.
Yeah, I don't think there's really a best approach here. I know where I work right now, we have this giant java app that is just a nightmare to even get running. I've been working on the microservices, and they do have all the downsides you're talking about, but I can get that stack up and flying super fast, whereas this giant java app takes 3 minutes to even be (barely) functional, and has so many configuration options that out of this 1000 line config file it's hard to find the 3 things you might actually care about.
There is no single development, in either technology or management technique, which by itself promises even one order-of-magnitude improvement within a decade in productivity, in reliability, in simplicity. -- Fred Brooks
Sussman summed up the problem nicely: "We really don't know how to compute!" So we latch onto whatever semi-plausible idea some consultant cooks up, like flowcharts, structured programming, agile, object-oriented programming, test-driven development, microservices, and countless other things.
Microservices impose a transport layer over whatever it is you were doing before. So that's one extra point of failure that you've got to contend with. Complex problems require complex solutions. Sure, there are better and worse ways of doing things, but there are no miracles.
In these discussions, people tend to forget the unsexy problem: coordinating work between humans. The reason an architecture is chosen is often purely a result of Conway's law. Just like the code organized on disk is often purely a result of how your language or build tool prefers you organize your code. Subconscious bias controls a lot of what we do, so designing systems partly requires an understanding of psychology and management techniques.
If that sounds crazy, just ask W.E. Deming. The reason Toyota came up with TPS was not because they were geniuses, or the discovery of some "magical architecture". They just focused on how to always be improving quality and efficiency - which at first makes no sense at all, and sounds exactly like premature optimization. But the proof is in the pudding.
Someone please write an article, "Don't start with architecture some dude suggested because of their ego".
There are cases when a monolith is bad, and cases when microservices are a must.
Take the healthy approach.
The article is garbage, with no real understanding of how real microservices work:
The deployment:
Modern microservices are built using templates. You deploy them directly from your GitLab/GitHub: you copy-paste your artifact and it's there. Builds take 2 minutes, sometimes less, which means you can react to an issue quickly, as opposed to a 30-minute old-school Java monolith build. Deployments are built in the same cluster you use for everything else. The CI job runner is just another application in your cluster. So if your cluster is down, everything is down.
The culture part:
We use templates where you have all the libraries and tracing in place. In fact, when a request like this comes in we usually have some similar functionality already written, so we reply to product: oh, this feature is very similar to feature X, we'll copy it. While we discuss scheduling, our developer renames the similar project, commits, and it's already deployed automatically to the dev cluster, and the rest of the team joins the development. There is a bad pattern when you need to update your templates. This is the tradeoff of the approach: you don't have libraries as a concept. Note that you can have half the services migrated and half not; that's a bonus. The con is that you need scripts to push everything immediately.
Better Fault Isolation:
Yes, you might have settings down but core functionality working, which means you have fewer SLA-breaking events. That saves you money and customers.
Same thing with error handling. If it's just tooling, you copy-paste a different set of tooling. If the error logging is not implemented properly in the code... it's no different from a monolith, it's just errors in code. But things like tracing are already part of the template, so basic event handlers are traced from deploy #1.
This article sounds like someone who's never successfully implemented either solution. Things that are wrong so far:
Monolithic apps need monitoring (e.g. Prometheus) just as much as microservices.
Monolithic apps can have scaling issues if you only have one database instance (especially if you're write-heavy), so you may need to shard anyway.
Monolithic apps will probably want a messaging system or queue to handle asynchronous tasks (e.g. sending e-mail, exporting data).
Microservices do not require kubernetes. You can run them fine on other systems; at my last job, we just ran uwsgi processes on bare metal.
Microservices do not require a realtime messaging system. You can just use API calls over HTTP, or gRPC, or whatever else. You'll probably want some kind of message queue as above, but not as an integral part of your architecture, meaning it can just be Redis instead of Kafka.
Microservices do not require separate DB instances. If your microservices are all using separate schemas (which they should), you can migrate a service's schema to a new DB primary/replica set whenever you want. In fact, if you have one e.g. MySQL primary you can have multiple secondaries, each only replicating one schema, to handle read load from individual services (e.g. a primary write node and then separate read nodes for user data, product information, and everything else). When it's time to break out the user data into a separate database, just make the read replica the new primary for that DB and add a new read replica off that.
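A tiny Python sketch of the operational idea (DSNs and schema names are hypothetical): because each service only touches its own schema, pointing a service at a new replica or a new primary is a one-entry routing change rather than a code change.

    # Per-schema routing table; moving a schema to new hardware only edits this map.
    REPLICAS = {
        "user":    "mysql://user-replica.internal/user",
        "product": "mysql://product-replica.internal/product",
        "payment": "mysql://primary.internal/payment",   # not split out yet
    }
    PRIMARY = "mysql://primary.internal"

    def dsn_for(schema: str, write: bool = False) -> str:
        """Reads go to that schema's replica; writes still go to the primary."""
        return PRIMARY if write else REPLICAS.get(schema, PRIMARY)

    print(dsn_for("user"))               # reads hit the user-only replica
    print(dsn_for("user", write=True))   # writes go to the primary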
This dude just straight up doesn't know what he's talking about, and it sounds like his experience with microservices is following a bad 'microservices with golang, kafka, cassandra, prometheus, and grafana on k8s' tutorial.
Here's how you write baby's first microservices architecture (in whatever language you use):
1. Decide where the boundaries are in your given application; e.g. user data, product data, frontend rendering, payment systems
2. Write those services separately with HTTP/gRPC/whatever APIs as your interface.
3. For each API, also write a lightweight native interface library, e.g. user_interface, product_interface, payment_interface. Your services use this to call each other, and the method by which they communicate is an implementation detail left up to the interface library itself.
4. Each service gets its own database schema; e.g. user, product, payment, which all live on the same MySQL (or RDS or whatever) instance and read replica.
5. Everything has either its own port or its own hostname, so that your nginx instances can route requests correctly.
There, now you have a working system which behaves like a monolith (working via what seems like internal APIs) but is actually a microservice architecture whose individual components can be scaled, refactored, or rewritten without any changes to the rest of the system. When you swap out your django protobuf-over-HTTP payment processing backend for a Rust process taking gRPC calls over Redis queues, you change your interface file accordingly and literally no one else has to know.
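A minimal sketch of what such an interface library might look like in Python (module name, URL and environment variable are assumptions, not prescribed by the comment):

    # --- user_interface.py ---------------------------------------------------
    import json, os, urllib.request

    _BASE_URL = os.environ.get("USER_SERVICE_URL", "http://localhost:8001")

    def get_user(user_id: str) -> dict:
        """Fetch a user record; the transport is an implementation detail."""
        with urllib.request.urlopen(f"{_BASE_URL}/users/{user_id}") as resp:
            return json.loads(resp.read())

    # --- any other service ----------------------------------------------------
    # import user_interface
    # user = user_interface.get_user("42")
    # If the user service later moves to gRPC over Redis queues, only
    # user_interface.py changes; every caller keeps calling get_user().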
It also means that your deployment times are faster, your unit testing is faster, your CI/CD is faster, and your application startup time is faster when you do have to do a service restart.
I'm not sure why this is so hard for people to understand.
Recently I am getting more and more thoughtful about the "accidental" complexity we add to our solutions in the form of dependencies on external libs/modules, frameworks such as IoC (log services, anyone? ;)), and on the architectural side microservices etc.
It is why the Reverse Conway is preached. Define your desired architecture and shape your organization to fit it. At that point your code base will reflect the organization of people working on it, and the output will reflect the desired architecture.
> If you’re going with a microservice:
A Kubernetes cluster
A load balancer
Multiple compute instances for running the app and hosting the K8S cluster
Try running a high-traffic high-functionality website without something equivalent to this. What's this magical single computer you'll be running it all on? You'll need a cluster of servers on a rack somewhere, managed the old fashioned way. You'll need equivalent tools to deploy to them and monitor them.
I think what this article should be getting at is that low-traffic sites shouldn't start with microservices, and maybe that's true. But if you're trying to build a business that scales rapidly, you want to be ready to scale your website from day one.
Does Stack Overflow count as high-traffic, high-functionality? Pretty sure most people here would be very happy for their startup to have that level of traffic. It's no longer a single computer, sure, but last time I read about it, it was monolithic.
The real pattern is to write a separate service for user auth and then put everything else into a monolith. The auth service is almost always the first one to be "broken out" so just get that out of the way first. Now you have maximum flexibility.
Microservices are supposed to be autonomous: independent services with their own lifecycle. What this article is describing sounds more like what is called a distributed monolith. Too many horizontal dependencies will create problems no matter where they are located.
I worked on monoliths where the shortest theoretical amount of time from a commit to running in production was several hours. The way microservices let you change small parts of the system in a controlled way is incredibly useful.
What I do see, though, is people making microservices too small. Like one table in a monolith database becomes one microservice. Microservices are not about having everything loosely coupled. Cohesion rules for modules still apply.
The point here for me is team dynamics, maturity, and skill level.
If you have a team that can't competently write and maintain a monolith that is modular and built/tested via automation, then that team has no business trying to build and maintain microservices.
Microservices is kind of a superpower that allows you to choose where you want your complexity to be. You are supposed to start with something simple, so the microservices decision needs to come in later.
Not starting full in with microservices is a good pattern.
There's also a product-related argument here. In order to start, most products require a lot of scaffolding: modeling of business entities, general read/write APIs, etc. These are well represented by the big "monolithic" frameworks like Rails or Django. There's not a lot of value in developing a proprietary distributed paradigm for these. Having this core taken care of, and flexible, then allows the devs to build more proprietary functionality as distributed services. Building this core backbone for access to basic functionality, and then distributing the high-value technologies, has been the recipe for many things I've built.
This feels very FUDy. It gives a bunch of examples of ways in which microservices can go wrong, without empirically examining those claims. It also excludes the middle: you can have "milliservices" (really, just service-oriented architecture) which do more than route a single endpoint, but still give flexibility and scaling.
We are a young startup in the ML-service space, with about a dozen engineers + data scientists. Our stack is based on all docker containerized python. We have 6 "micro"-services (not sure at which point they become micro) but each plays their own role, with ~4-30 REST endpoints each. It's been fantastic, and none of us are particularly experienced with microservices. We run on AWS east but you can spin most of the stack up locally with docker-compose. I don't even need k8s, if we wanted, we could probably deploy a local cluster with docker swarm.
- Fault isolation
I can't talk in depth but we were able to handle the recent east-1 outage with only parts of the stack degraded, others stayed functional.
Also, rollout is when something is most likely to fail. Rolling back a service is way easier than disrupting the whole stack.
- Eliminating the technology lock
The ML container is a massive beast with all the usual heavy ML dependencies. None of the other containers need any of that.
- Easier understanding
Yep, definitely true in my experience. The API surface area is much smaller than the code under the hood, so it's easier to reason about the data flow in/out.
- Faster deployment
We can easily patch one service and roll it out, rolling out hotfixes to prod with no disruption in the time to run through the CI pipeline, or roll back a task spec, which is near-instant.
- Scalability
Emphatically so. The difference in scale between our least and most used service is over 100x.
We could probably get away with making the user-facing backend a monolith (and in fact it's the most monolithic, has the most endpoints) but for data pipelining, micro/"milliservices" has been a dream. I don't even know how it would work as a monolith.
As with everything in this field, it all depends on use-case and tradeoffs. If your services each handle roughly the same load, the independent scaling argument weakens. If your business logic is complex, tightly coupled, and fits on a single box, you'll waste a ton of cycles just communicating.
> Service A could be written in Java, service B could be written in Go, service C could be written in Whitespace, if you’re brave enough.
I've always found this to be such a huge strawman. I've yet to encounter an organization that uses a variety of technologies. Usually it's one of the top most popular, sometimes two, rarely three different languages. Devs migrate between teams, hiring is done to match the skills already onboard. Coding styles (even if they differ slightly between teams) usually follow corp "best practices"
Even if it's a multitude of services, the culture of the codebase is usually very monolithic.
Anecdotally, I've seen the strawman become real. Usually it's not more than two different languages. But at my current job, we've got a split between Typescript, vanilla Node.js, and Scala on the backend. All the results of various migrations and refactors spearheaded by different people at different times but not completed. The current vision is to have a unified Scala backend someday, but I'm not holding my breath.
My impression, which seems to be supported by this article, is that when you're small, you can actually get up & running with an MVP with a non-serverless monolithic approach and scale for a bit before you hit a wall. Only at that point: 1) scaling becomes much more complex, and 2) monolithic infrastructure prevents you from easily implementing best-of-breed solutions to optimize particular segments of your functionality. And, from a cloud services $$ POV, it's cheaper and perhaps requires less dev time, though I'm guessing that will depend on the project.
This seems reasonable? At least at the early & early-mid stages. If you make it that far and see things like scaling issues in your future, it seems like you should also be at a stage of growth where you'll be getting reasonable funding offers and can invest the resources into migrating away from the monolith.
Disclaimer: My opinion here is formed mostly from following a not-completely-dissimilar process even when working with things on the more monolithic side: I'll use a high-powered workstation to spin up VMs on the same host, and then if I need to I can migrate individual VMs to their own better-resourced instances on other hardware to scale things. I did this some years ago with a Hadoop cluster and the process worked out nicely.
Although as it turned out, that example didn't last long: Hadoop was overkill because I overestimated the bigness of my data, which turned out to only be on the bigger side of small. Or the smaller side of medium. When I had a rethink, I wrote some Python code against the primary SQL-based data source & used Keras to do what I needed instead: iterating each piece on a nicely-spec'ed workstation took an hour or so, and a full end-to-end run maybe 3-4 hours.
But this is kind of my point: starting out, it's easy to think "Oh, I need to plan for every possible eventuality & level of scale." No, you don't.
And a final caveat to this: YMMV since circumstances differ from project to project. But these are things to consider before you automatically go for slicing each piece of functionality into grains of sand with their own microservice.
On the point about technology lock I've thought that the move to microservices would mean allowing different programming languages, but whenever I've asked existing companies if they would be open to other languages it's usually a no. Part of it seems to be decent enough reasons but I usually think they could be overcome with a reasonable amount of work where the tradeoff may be worth it. I usually suspect it's more social reasons. My company also made the move to microservices and I was able to use another programming language but there was strong social pressure to conform.
For me it highly depends on the size of software, the time to market, the time the software is supposed to work without major rewrites.
If I am doing a blog application or a website for a one-time event, I would pick a monolith.
If I were starting on an ERP or a checkout solution, I would pick microservices.
Also, if manpower is low but I expect growth in the future, I might go for a monolith broken into separate projects with minimal dependencies between modules, so it wouldn't be terribly difficult to break it into microservices when the need and the manpower arrive.
When microservices were new they brought a whole slew of new technologies and architecture patterns with them: SPAs, JWT authentication, micro frameworks, REST/GraphQL, containerization. Things that solved problems people were having with the previous monolithic approach, above all the lack of maintainability and composability. So I see the term microservice today not as something that's measured in lines of code, but above all as embracing the new technology landscape that came with it.
> One or more (relational) databases, depending on whether you’re gonna go with single database per service or not
This imho is where serious complications can come in. A single database for all services is a good trade off if you want the nice parts of microservice decoupling but not the headaches of a distributed system. Just perhaps don’t call it “microservices” to avoid having to deal with arguments from purists who want to explain why this is not true microservices, etc.
Services, serverless, and monolith don't have to be exclusive and can be applied in different areas where they are good fits. You obviously can't have a pure monolith, and everyone knows that, because of geographical distribution, scaling, and availability, let alone languages having niches.
Monoliths are your friends. SOA is your friend. Serverless is your friend. You don't hang out with all your same friends and do the same things with them in real life, either.
Don't start with doing everything faangs do, and hiring grand wizard architects who do nothing but evaluate what third party b2b software we can sign deals with
I feel like a lot of microservice advocates fail to price in the overhead introduced when you split stuff up into multiple independent communicating units.
Example: replacing a simple database query (effectively instant) that relays the data as a local variable with poking a separately hosted microservice, which ends up adding 20ms of overhead doing an HTTPS request, encoding and decoding the result as JSON, etc.
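A self-contained Python sketch (standard library only) that makes the gap measurable on your own machine. Note this runs over loopback without TLS, so a real, separately hosted HTTPS service will be considerably slower than whatever numbers you see here; the point is the relative gap, not the absolute values.

    import http.server, json, threading, timeit, urllib.request

    DATA = {"42": {"id": "42", "name": "Ada"}}          # stand-in for a local query

    def local_lookup(user_id: str) -> dict:
        return DATA[user_id]                            # effectively instant

    class Handler(http.server.BaseHTTPRequestHandler):
        def do_GET(self):
            body = json.dumps(DATA[self.path.lstrip("/")]).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        def log_message(self, *args):                   # keep the demo quiet
            pass

    server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]

    def remote_lookup(user_id: str) -> dict:
        with urllib.request.urlopen(f"http://127.0.0.1:{port}/{user_id}") as resp:
            return json.loads(resp.read())              # HTTP round trip + (de)serialization

    print("local :", timeit.timeit(lambda: local_lookup("42"), number=1000))
    print("remote:", timeit.timeit(lambda: remote_lookup("42"), number=1000))
    server.shutdown()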
Actually, monoliths may be faster than microservices.
Consider: monolith -> CPU L1 cache -> end user.
Microservice -> account service -> foo service -> L1 cache -> end user.
I.e. the monolith goes directly to the CPU cache, while the microservice goes through network calls, which are much slower than the CPU cache.
Also, with microservices you will need distributed application performance metrics for call tracing, distributed central logging, and a container orchestration platform to run all the services.
The best start right now is to make two microservices:
All your services in one executable (one docker image)
Your UI (in another docker image)
From there you're free to separate things out more and more over time if you so choose. Having those two separate makes it so that you are required to at least have some kind of ingress controller that maps requests one way and UI assets another, so expanding should come easy at that point.
Product Owner: Hey guys, I came up with this really great feature. Our competitors are already doing it so we gotta do it quickly. Is it possible to do it in 2 weeks?
Team: No, we need 10 weeks because scale
* Product Owner jumps out of the window (obviously realizing product is doomed and company never get another client without this feature)
I'm curious, has anyone leveraged Bazel's package visibility rules to improve isolation within a monolith? One of the things I don't like about monoliths is that tight coupling is often silent rather than noisy, whereas if someone is editing package visibility rules to make library A's internals visible to package B, I know exactly what's going on.
Google uses visibility rules all the time (with Blaze, which is basically Bazel). Mixed with the repository's OWNERS system which uses a hierarchical tree of permissions files, you can't take a dependency on someone else's private packages without their permission.
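A minimal sketch of what such rules look like in a Bazel BUILD file (Starlark; package and target names are made up, and the load line depends on your rules_python setup):

    load("@rules_python//python:defs.bzl", "py_library")

    py_library(
        name = "ledger_internals",
        srcs = ["ledger_internals.py"],
        # Only the billing package tree may depend on these internals;
        # any other target that tries gets a build error.
        visibility = ["//billing:__subpackages__"],
    )

    py_library(
        name = "billing_api",
        srcs = ["billing_api.py"],
        deps = [":ledger_internals"],
        visibility = ["//visibility:public"],   # the sanctioned entry point
    )

Widening `visibility` then shows up as an explicit, reviewable diff, which is exactly the noisiness the question is after.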
It depends on the age of the company and the size of the products.
A typical startup can kick the tires with a monolithic architecture just to deliver a prototype quickly, but a more stable company that builds products to scale will need to separate out some of the services. They may have a big monolithic chunk with a bunch of other "new" microservices.
I find with microservices, they are a fake separation of concerns. When I have to work on something, and it covers two or three services, I'm actually working on a mono app.
I've found smaller "service" classes that do one job meet the same need. One or two public methods "perform" and "valid" seem to work perfectly.
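A tiny Python sketch of that shape (names are hypothetical): one job per class, with valid and perform as the only public surface.

    class ImportSubscribers:
        """One job: validate and persist a batch of subscriber rows."""
        def __init__(self, rows):
            self.rows = rows

        def valid(self) -> bool:
            return all("email" in r for r in self.rows)

        def perform(self) -> int:
            if not self.valid():
                raise ValueError("bad input")
            # ...persist rows here...
            return len(self.rows)

    print(ImportSubscribers([{"email": "a@example.com"}]).perform())   # -> 1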
It seems to me that the ratio of hate-for-microservices to companies-actually-using-microservices is way out of whack. I know that there are some high-profile cases like Uber, but in nearly every conversation I see online or in person, we're all in agreement that going all in on microservices is hype-driven nonsense. Maybe I'm wrong though.
Example of decoupling things in the goblins framework (nodejs):
one domain gives one package/module; domains are now dependencies
actor models can act as services; one actor in a domain is a service with an API communicating with others through event loops or TCP/IP (microservice?)
we can develop and debug the whole system in a mono-repo <- the monolith is the repository of code.
Avoid any advice that touts itself as gospel! Do your own DD for the product or services you’re building. Maybe you’re never going to hit more than 100 users in total or maybe you’re building a static website. Take some time and evaluate the pros and cons of your architecture and seek advice from those who have come before you!
I remember reading Sam Newman and, if I am not wrong, he says something along the lines of: the team size, the problem at hand and a few other factors determine which one to choose. Nothing is a silver bullet. Trite, but it's about having knowledge of multiple architectures and the wisdom to choose when to use each one.
The idea of microservices sometimes reminds me of visual programming when I look at infrastructure diagrams. Instead of writing the code and executing it, you implement variables (databases), conditions (lambdas), loops (pub/sub, SQS) etc. as separate entities and then put them together.
The problem with monoliths is that they're written fast and chock full of business logic that isn't reusable anywhere else. With microservices, if done correctly a la hasura, your business logic is abstracted to config, or to the highest level of abstraction your business allows.
I have never been in an environment where business logic could come even remotely close to being definable via config, even if config was some kind of fancy DSL.
Every product I've worked on had deep interdependencies between major parts of the code _at least logically_, because _that was the product_.
At some point you can't factor out the business logic anymore without factoring out the entire business, in my experience. This is a major reason software is so tricky.
Still, I'm fascinated that this could exist somewhere. Have you seen it in the wild?
Every time I open the site and it pops up some weird "subscribe to this and that awesome content" in my face, I immediately close it. Each time I really hope that owner has some analytics set up and they will learn that having this thing is not a good thing to do to your readers.
Monoliths and microservices are two bad ways to develop software. Monoliths are rife with hidden dependencies, and microservices tend to collapse from even simple faults.
You need to right-size the modules: more than one, but just a few, with boundaries chosen around recovering from faults.
I tend to build monoliths, because I've got a sub-two-pizza team to work with. If I could throw dozens of people at the problem, and moreover, needed to invent things to keep that many people busy, then I think microservices would be more interesting.
It has the disadvantages of lock-in, but this is why I love ECS. 80% of the advantages of K8S for 20% of the complexity. Even hosted K8S requires a lot of config and concepts that are just baked into ECS.
Your microservices pains are legitimate. That's why we built the Control Plane (https://controlplane.com) platform. Our customers deploy microservices in seconds and get unbreakable endpoints, even when AWS, GCP or Azure go completely down.
They get free logging, metrics, secrets management, load balancing, auto-scaling, MTLS between services, service discovery, TLS, intelligent DNS routing to the nearest healthy cluster and much more.
Multi region and multi cloud used to be hard. Now they are as natural as clicking a button.
Before you write part two of your article, give the https://controlplane.com platform a try. Once you've tried it, I'm 100% convinced you'll make a 180. I'm happy to personally demo it for you.
At one of the hyperscalers myself. We are actively merging components for performance. We will keep internal modular boundaries where we can and there will be some small number of abstraction violations for performance.
I always do this, but how do you know when to re-architect, or at what level of granularity to look for optimizations? Do we burn time looking at async processing, do we just scale the cluster, etc.?
How many more articles do we need to drive this point home? First it was a fashion to develop using microservices. Now writing about not using them is in vogue :-).
I like monoliths... which are modular inside... no need for microservices, easy deployment even without Docker and Kubernetes... just a single binary...
I can't understand why, at this point in time, a blog with cookies and Facebook hooks is teaching architecture; it undermines whatever the main argument was at the moment of landing.
I wrote more about that approach on my blog, as one of the first articles, "Moduliths: because we need to scale, but we also cannot afford microservices": https://blog.kronis.dev/articles/modulith-because-we-need-to...