The interesting thing about microservices is not that they let you split up your code on module boundaries. Obviously you can (and should!) do that inside any codebase.

The thing about microservices is that they break up your data and deployment on module boundaries.

Monoliths are monoliths not because they lack separation of concerns in code (something which lacks that is not a ‘monolith’; it is what’s called a ‘big ball of mud’).

Monoliths are monoliths because they have

- one set of shared dependencies

- one shared database

- one shared build pipeline

- one shared deployment process

- one shared test suite

- one shared entrypoint

As organizations and applications get larger, these start to become liabilities.

Microservices are part of one solution to that (not a whole solution; not the only one).




Monoliths don’t actually look like that at scale. For example, you can easily have multiple different data stores for different reasons, including multiple different kinds of databases. Here’s the tiny relational database used internally, and there’s the giant tape library that’s archiving all the scientific data we actually care about. Here’s the hard real-time system, and over there’s the billing data, etc.

The value of a monolith is that from the outside it looks like a single thing that does something comprehensible; internally it still needs to actually work.


But all those data sources are connected to from the same runtime, right?

And to run it locally you need access to dev versions of all of them.

And when there’s a security vulnerability in your comment system your tape library gets wiped.


> But all those data sources are connected to from the same runtime, right?

Yes, this is an accurate assessment from what I've seen.

> And to run it locally you need access to dev versions of all of them.

In my experience, no. If the runtime never needs to access it because you're only doing development related to datastore A, it shouldn't fall over just because you haven't configured datastore B. There are lots of easy ways to either skip it in the runtime or swap in a mocked interface.
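
To make that concrete, here's a rough sketch of the "skip or mock" idea (the TAPE_ARCHIVE_DSN variable and both class names are made up for illustration, not from any particular codebase):

  import os

  class RealTapeArchive:
      """Placeholder for the real client; would actually connect to the tape library."""
      def __init__(self, dsn):
          self.dsn = dsn

  class InMemoryTapeArchive:
      """Mock stand-in used when no real tape library is configured locally."""
      def __init__(self):
          self.items = []
      def archive(self, blob):
          self.items.append(blob)

  def make_tape_archive():
      dsn = os.environ.get("TAPE_ARCHIVE_DSN")
      if dsn:
          return RealTapeArchive(dsn)
      # Datastore B isn't configured for this dev session, so skip it
      # entirely and hand back a mock instead of falling over at startup.
      return InMemoryTapeArchive()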

> And when there’s a security vulnerability in your comment system your tape library gets wiped.

This one really depends, but I think it can be an accurate criticism of many systems. It's most true when you're at an in-between scale where you're big enough to be a target but haven't yet gotten big enough to afford more dedicated security testing at the application code level.


> But all those data sources are connected to from the same runtime, right?

Not always directly; often a modern wrapper is set up around a legacy system that was never designed for network access. This can easily mean two different build systems, etc., but people argue about what is and isn’t a monolith at that point.

Nobody counts the database or OS as separate systems in these breakdowns, so IMO the terms are somewhat flexible. Plenty of stories go “In the beginning someone built a spreadsheet … shell script … and the great beast was hidden behind a service. Woe be unto thee who dares disturb his slumber.”
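
A minimal sketch of that kind of wrapper (the script path and port are invented for illustration): the “service” is just an HTTP facade shelling out to the legacy job, which keeps its own build system untouched.

  import subprocess
  from http.server import BaseHTTPRequestHandler, HTTPServer

  LEGACY_SCRIPT = "/opt/legacy/nightly_report.sh"  # hypothetical path to the old script

  class LegacyFacade(BaseHTTPRequestHandler):
      def do_GET(self):
          # The "modern wrapper": expose the legacy batch job over HTTP
          # without touching its code or how it gets built.
          result = subprocess.run([LEGACY_SCRIPT], capture_output=True, text=True)
          self.send_response(200 if result.returncode == 0 else 500)
          self.send_header("Content-Type", "text/plain")
          self.end_headers()
          self.wfile.write(result.stdout.encode())

  if __name__ == "__main__":
      HTTPServer(("127.0.0.1", 8080), LegacyFacade).serve_forever()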


This actually feels like a good example of the modularity that I talked about, plus feature flags. Of course, in some projects it's not what one would call a new architecture (like in my blog post), but rather just careful usage of feature flags.

> But all those data sources are connected to from the same runtime, right?

Surely you could have multiple instances of your monolithic app:

  # Runs internally
  app_instance_1_admin_interface:
    environment:
      - FEATURE_ENABLE_ADMIN_INTERFACE=true
  # Runs internally
  app_instance_2_tape_backup_job:
    environment:
      - FEATURE_ENABLE_TAPE_BACKUP_JOB=true
  # Exposed externally through LB
  app_instance_3_comment_system:
    environment:
      - FEATURE_ENABLE_COMMENT_SYSTEM=true
If the actual code doesn't violate the 12 Factor App principles (https://12factor.net/), there should be no problems with these runtimes working in parallel (e.g. storing data in something external like Redis rather than in memory, or using something like S3 for storage rather than the local file system).
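
For example, a quick sketch of what that looks like in practice (assuming the redis-py client; the REDIS_URL variable name and key format are just illustrative): config comes from the environment and shared state lives in a backing service, which is what lets those instances run side by side.

  import os
  import redis  # assumes the redis-py client is installed

  # 12-factor style: configuration comes from the environment, and anything
  # shared between instances lives in a backing service instead of process
  # memory, so instance 1 and instance 3 don't step on each other.
  r = redis.Redis.from_url(os.environ.get("REDIS_URL", "redis://localhost:6379/0"))

  def record_comment_view(comment_id: str) -> int:
      # Behaves the same no matter which app instance handles the request.
      return r.incr(f"comment:{comment_id}:views")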

> And to run it locally you need access to dev versions of all of them.

With the above, that's no longer necessary. Even in the more traditional monolithic projects at work, without explicit feature flags, I still have different run profiles.

Do I want to connect to a live data source and work with some of the test data on the shared dev server? I can probably do that. Do I want to just mock the functionality instead and use some customizable data generation logic for testing? Maybe a local database instance running in a container so I don't have to deal with the VPN slowness? Or maybe switch between a service that I have running locally and another one on the dev server, to see whether they differ in any way?

All of that is easily possible nowadays.

> And when there’s a security vulnerability in your comment system your tape library gets wiped.

Unless the code for the comment system isn't loaded, because the functionality isn't enabled.

This last bit is where I think everything falls apart. Too many frameworks out there are okay with "magic": taking away control over how your code and its dependencies are initialized, often doing so dynamically with overcomplicated logic (such as DI in the Spring framework in Java), instead of the startup of your application's threads being a long list of features and their corresponding feature flag/configuration checks in your programming language of choice.

Personally, I feel that in that particular regard we'd benefit from less reflection, fewer DSLs, less configuration in XML/YAML, etc., at least when those are used to replace writing code in your actual programming language, as opposed to serving as simple key-value stores for your code to process.
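
To put that in code terms, this is roughly the kind of startup I mean; a sketch only (the app.* modules are hypothetical, the flag names match the compose example above):

  import os

  def flag(name: str) -> bool:
      return os.environ.get(name, "false").lower() == "true"

  def main():
      # Startup is an explicit list of feature checks in plain code:
      # no reflection, no XML/YAML wiring, and code for disabled
      # functionality never even gets imported.
      if flag("FEATURE_ENABLE_ADMIN_INTERFACE"):
          from app.admin import start_admin_interface       # hypothetical module
          start_admin_interface()
      if flag("FEATURE_ENABLE_TAPE_BACKUP_JOB"):
          from app.tape_backup import start_tape_backup_job  # hypothetical module
          start_tape_backup_job()
      if flag("FEATURE_ENABLE_COMMENT_SYSTEM"):
          from app.comments import start_comment_system      # hypothetical module
          start_comment_system()

  if __name__ == "__main__":
      main()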


You're talking about something very odd here... a monorepo, with a monolithic build output, but that... transforms into any of a number of different services at runtime based on configuration?

Is this meant to be simpler than straight separate codebase microservices?


This is actually quite a nice sweet spot on the mono/micro spectrum. Most bigger software shops I've worked at had this architecture, though it wasn't always formally specified. Different servers run different subsets of the monolith's code and talk to specific data stores.

The benefits are numerous, though the big obvious problem does need a lot of consideration: with a growing codebase and engineering staff, it's easy to introduce calls into code/data stores from unexpected places, causing various issues.

I'd argue that so long as you pay attention to that problem as a habit/have strong norms around "think about what your code talks to, even indirectly", you can scale for a very long time with this architecture. It's not too hard to develop tooling to provide visibility into what's-called-where and to test for/audit/track changes when new callers are added. If you invest in that tooling, you can enforce internal boundaries quite handily, while sidestepping a ton of the organizational and technical problems that come with microservices.
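
A rough sketch of that kind of tooling, just to illustrate the shape of it (the module names and src/ layout are invented; real tools tend to be fancier): a CI check that fails when one module imports another it isn't allowed to touch.

  import ast
  import pathlib

  ALLOWED = {
      "comments": {"comments", "shared"},   # comments code must not touch billing
      "billing": {"billing", "shared"},
  }

  def violations(src_root="src"):
      found = []
      for path in pathlib.Path(src_root).rglob("*.py"):
          module = path.relative_to(src_root).parts[0]
          allowed = ALLOWED.get(module)
          if allowed is None:
              continue
          for node in ast.walk(ast.parse(path.read_text())):
              if isinstance(node, ast.Import):
                  names = [alias.name for alias in node.names]
              elif isinstance(node, ast.ImportFrom) and node.module:
                  names = [node.module]
              else:
                  continue
              for name in names:
                  top = name.split(".")[0]
                  if top in ALLOWED and top not in allowed:
                      found.append(f"{path}: {module} imports {top}")
      return found

  if __name__ == "__main__":
      problems = violations()
      assert not problems, "boundary violations:\n" + "\n".join(problems)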

Of course, if you start from the other end of the mono/micro spectrum and have a strong culture of e.g. "understand the service mesh really well and integrate with it as fully as possible" you can do really well with a microservice-oriented environment. So I guess this boils down to "invest in tooling and cultivate a culture of paying attention to your architectural norms and you will tend towards good engineering" ... who knew?


> You're talking about something very odd here... a monorepo, with a monolithic build output, but that... transforms into any of a number of different services at runtime based on configuration?

Shudder...a previous team's two primary services were configured in exactly this way (since before I arrived). Trust me, it isn't (and wasn't) a good idea. I had more important battles to fight than splitting them out (and that alone should tell you something of the situation they were in!).


It's really not odd at all... this is how compilers work... we have been doing it forever.

Microservices were a half-baked solution to a non-problem, partly driven by corporate stupidity and charlatan 'security' experts. I'm sure big companies make it work at enough scale, but everything in a microservice architecture was achievable with configuration and hot-patching. Incidentally, you don't get rid of either with an MCS architecture, you just have more of it with more moving parts... an absolute spaghetti-mess nightmare.


It’s not that odd. Databases, print servers, or web servers for example do something similar with multiple copies of the same software running on a network with different settings. Using a single build for almost identical services running on classified and unclassified networks is what jumps to mind.


To what end?

Code reuse?


It can be. If you have two large services that need 99+% of the same code and they're built by the same team, it can be easier to maintain them as a single project.

A better example is something like a chain restaurant running their point of sale software at every location so they can keep operating when the internet is out. At the same time they want all that data on the same corporate network for analysis, record keeping, taxes etc.


> You're talking about something very odd here... a monorepo, with a monolithic build output, but that... transforms into any of a number of different services at runtime based on configuration?

I'd say that it's more uncommon than it is odd. The best example of this working out wonderfully is GitLab's Omnibus distribution - essentially one common package (e.g. in a container context) that has all of the functionality that you might want included inside of it, which is managed by feature flags: https://docs.gitlab.com/omnibus/

Here's an example of what's included: https://docs.gitlab.com/ee/administration/package_informatio...

Now, I wouldn't go as far as to bundle the actual DB with the apps that I develop (outside of databases for being able to test the instance more easily, like what SonarQube does, so you don't need an external DB to try out their product locally etc.), but in my experience having everything at consistent versions, tested to work together, makes for a really easy solution to administer.

Want to use the built in GitLab CI functionality for app builds? Just toggle it on! Are you using Jenkins or something else? No worries, leave it off.

Want to use the built in package registry for storing build artefacts? It's just another toggle! Are you using Nexus or something else? Once again, just leave it off.

Want SSL/TLS? There's a feature flag for that. Prefer to use external reverse proxy? Sure, go ahead.

Want monitoring with Prometheus? Just another feature flag. Low on resources and would prefer not to? It has got your back.

Now, one can argue about where to draw the line between pieces of software that make up your entire infrastructure vs the bits of functionality that should just belong within your app, but in my eyes the same approach can also work really nicely for modules in a largely monolithic codebase.

> Is this meant to be simpler than straight separate codebase microservices?

Quite a lot, actually!

If you want to do microservices properly, you'll need them to communicate with one another, which means internal APIs and clearly defined service boundaries, as well as plenty of code to deal with the risks posed by an unreliable network (which any networked system has to live with). Not only that, but you'll also need solutions to make sense of it all, from service meshes to distributed tracing. You'll probably want to apply lots of DDD, and before long changes in the business concepts will mean refactoring code across multiple services. Oh, and reliable integration testing will be difficult in practice, as will local development (do you launch everything locally? do you have the run configurations for that versioned? do you have resource limits set up properly? or do you just connect to shared dev environments, which might cause difficulties with logging, debugging and consistency with what you have locally?).
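
Even the most basic cross-service call ends up needing plumbing along these lines everywhere (a rough sketch only; the retry count, backoff and timeout values are just illustrative):

  import time
  import urllib.request
  from urllib.error import URLError

  # The kind of defensive code every inter-service call ends up needing:
  # a timeout, bounded retries, and backoff between attempts.
  def call_service(url: str, attempts: int = 3, timeout: float = 2.0) -> bytes:
      for attempt in range(1, attempts + 1):
          try:
              with urllib.request.urlopen(url, timeout=timeout) as response:
                  return response.read()
          except URLError:
              if attempt == attempts:
                  raise
              time.sleep(0.2 * 2 ** attempt)  # back off before retrying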

Microservices are good for solving a particular set of problems (e.g. multiple development teams, one per domain/service, or needing lots of scalability), but adding them to a project too early is sure to slow it down and possibly make it unsuccessful if you don't have the pre-existing expertise and tools that they require. Many don't.

In contrast, consider the monolithic example above:

  - you have one codebase with shared code (e.g. your domain objects) not being a problem
  - if you want, you still can use multiple data stores or external integrations
  - calling into another module can be as easy as a direct procedure call into it
  - refactoring and testing both are now far more reliable and easy to do
  - ops becomes easier, since you can just run a single instance with all of the modules loaded, or split it up later as needed
I'd argue that up to a certain point, this sort of architecture actually scales better than either of the alternatives. Compared to regular monoliths it's just a bit slower to develop, in that it requires you to think about the boundaries between the packages/modules in your code, which I've seen done far too rarely, leading to the "big ball of mud" type of architecture. So I guess in a way that can also be a feature of sorts?


I'd like to challenge one part of your comment: that microservices break up data on module boundaries. Yes, they encapsulate the data. However, the thing that causes spaghettification (whether internal to some mega monolith, across modules, or between microservices) is the semantic coupling that comes from needing to understand data models. Dependency hell arises when we need to share an agreed understanding about something across boundaries. When that agreed understanding has to change, microservices won't necessarily make your life easier.

This is not a screed against microservices. I'm just calling out that within a "domain of understanding", semantic coupling is pretty much a fact of life.


That's not at all accurate of any of the monoliths I've worked on. This in particular describes exactly zero of them:

- one shared database

Usually there's one data access interface, but behind that interface there are multiple databases. This characterization doesn't even cover the most common upgrade to data storage in monoliths: adding a caching layer in front of an existing database layer.
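
In sketch form (the class names here are hypothetical, not from any real codebase), that upgrade is just wrapping the existing store behind the same interface, with callers none the wiser:

  # One data access interface for callers; whether there's a cache, one
  # database or several behind it is an implementation detail.
  class UserStore:
      def get(self, user_id):
          raise NotImplementedError

  class DbUserStore(UserStore):
      def __init__(self, db):
          self.db = db
      def get(self, user_id):
          return self.db[user_id]           # stand-in for a real query

  class CachedUserStore(UserStore):
      def __init__(self, inner):
          self.inner = inner
          self.cache = {}
      def get(self, user_id):
          if user_id not in self.cache:     # miss: fall through to the database
              self.cache[user_id] = self.inner.get(user_id)
          return self.cache[user_id]

  # Adding the caching layer doesn't change a single caller:
  users = CachedUserStore(DbUserStore({"42": {"name": "Ada"}}))
  print(users.get("42"))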


Okay, great. But you get that ten shared databases behind a shared caching layer isn't any better, right? It's still all shared dependencies.


.NET Remoting, from 2002, was expressly designed to allow objects to be created either locally or on a different machine altogether.

I’m sure Java also had something very similar.

Monolith frameworks were always designed to be distributed.

The reason distributed code was not popular was that the hardware at the time did not justify it.

Further, treating hardware as cattle and not pets was not easy, or even possible, because a variety of technologies did not exist yet: better devops tools, better and faster compilers, containerization, etc.



