Micro-monolith anti-pattern (chi.pl)
111 points by kiyanwang on April 13, 2017 | 69 comments



There are a lot of truthy ideas here, but I feel like it still subscribes to this rosy picture of avoiding the pitfalls and winding up with an amazing scalable architecture that is an unmitigated win and fundamentally better than a monolith.

The only problem is that microservices add overhead and require more tuning and control than monoliths. This is especially true if you give teams carte blanche to build things however they want with no shared standards.

Being an early stage startup guy I am probably biased, but I would argue it's best to default to a monolith and try to keep things modular with an eye towards later extracting microservices as A) you find incongruent workloads and B) your team is growing and you need to create some isolation so not everyone has to know everything.

Only once you understand the problem domain and workloads are you fit to draw service boundaries.


Or to put it another way: there is a time when every feature fits within one team. There is also a time when a single feature grows too large for one team.

Recognize which time you're in and make the best choice for now (and some of your expected future).

If everyone designed everything with the expectation of Facebook levels of traffic, nothing would ever get done (your team isn't big enough), and that overhead would be wasted in 99.9% of cases.


I agree and will add: it's a good idea to focus on defining or using common and simple protocols and contracts.


> A requirement of a specific framework for all services.

Eh, I don't see this as an antipattern. Building disparate services but using the same frameworks/languages has tons of advantages and relatively few (none?) disadvantages.

A few:

- Engineers can move between teams and features more seamlessly

- Infrastructure and common functionality can be shared

- You will inevitably end up debugging someone else's code when you're on call; that's going to be painful if it looks completely foreign compared to what you use day to day.


Sharing a framework for logging, routing, and other boring things is fine; sharing a framework with domain-specific logic is when you start getting into trouble. Now if you want to change something in that framework, you have to change and deploy every last one of your services.


> sharing domain specific logic

Isn't that just called a business requirement?

> you have to change and deploy every last one of your services.

How is that different from the code duplication the article recommends? When business requirements change, all affected code points need to change... you can't just wish that away.

-

The real problem here is mismanaging separation of concerns.

CQS mostly solves the hard problems. But having shared read objects will, at times, require multiple systems to make use of the same domain logic. Whether you're better off with code duplication or a hard shared dependency is a trade-off you have to figure out. I tend to go with a mix based on the expected frequency of change and how breaking the changes are likely to be.

Another microservice is also one answer, but it can be an unworkable bottleneck in a lot of these situations.


> Now if you want to change something in that framework, you have to change and deploy every last one of your services.

Do you? As long as an appropriate versioning strategy is in place, it seems to me that you would only need to update the service that prompted the code change, leaving the other services on the older version. If the other services need to have their code changed as well, then that'd be the case whether or not you used a shared library, except you'd have to update the code in 12 places instead of just one.


I don't know that many frameworks with domain-specific logic. It almost seems like a contradiction in terms.


A classic example that I've encountered would be using some custom web socket RPC protocol for communication between microservices instead of just using plain HTTP.

It can be tempting because an engineer thinks "I can optimize this and reduce the network overhead using this creative protocol." But now they have put themselves in a situation where if anything in that custom protocol changes they will have to update every single microservice in their backend ecosystem.


How is that not a problem in HTTP?

This is just API versioning, that needs to happen regardless of the protocol you choose.

Edit: I see you mean the protocol itself (like if gRPC goes from version 1 to 1.2 or something). Yes, that's a thing... most protocol developers are careful about backwards compatibility, allowing some services to be a few minor versions behind to account for this. Also, HTTP still isn't invulnerable to this... HTTP/2 is very much a thing.


That's really not domain specific though. It's just a proprietary protocol.


> Now if you want to change something in that framework, you have to change and deploy every last one of your services.

Would the alternative be changing every one of your services, then deploying?


Yet that negates one of the few advantages services have for most people: the ability to migrate between development tools without porting all the code, so they can use the best tool for the problem or move away from obsolete tooling.

As is often the case, the sweet spot is normally somewhere between those extremes. You probably want to restrict languages and frameworks, but you probably do not want to dictate any single one. Or, if you want to dictate a single one, you'll probably want a monolith.


My experience with this requirement was that each of our "microservices" ended up consuming between 0.5 to 1 GB of RAM just to start up and connect to a database and a message bus. And these were simple CRUD services. Developers with older laptops couldn't even run all of the dependencies for what they were working on, and even those with better hardware were pushing the limits of their machines just a few months after adopting this architectural strategy.


If every microservice uses 500 MB+ of RAM, you are most probably using the wrong technology to implement your microservices.


Yeah, that was more the issue. Implementing them with just one technology would have been fine had more thought gone into that decision.


To expand a bit on user5994461's statement: regardless of the laptop, you're probably doing it wrong if you're working like this. If a microservice depends on another microservice, you might be doing it wrong. But let's say there is a valid reason for that; is there anything we can do to simplify the situation enough to run on a laptop?

Say you boot up the recipes service that needs your auth service. In this scenario, mock the auth service either directly in code (you should be able to do this if you use DI for the actual service repo) or by loading static responses for your work scenario into a proxy/stub server like SimpleHTTPServer from Python [1]. When your recipe service calls out, it hits localhost:8002, where the Python service returns the same general application/json-v1-auth data. You can even go a bit further and make the Python code adaptive based on its input.
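
A rough sketch of such a stub, assuming Python 3's http.server rather than the Python 2 SimpleHTTPServer linked below; the port comes from the example above and the JSON payload is invented:

    import json
    from http.server import BaseHTTPRequestHandler, HTTPServer

    class AuthStub(BaseHTTPRequestHandler):
        def do_GET(self):
            # Always return the same canned auth response, whatever the path.
            body = json.dumps({"user_id": 42, "groups": ["recipe-readers"]}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    # Point the recipe service at localhost:8002 instead of the real auth service.
    HTTPServer(("localhost", 8002), AuthStub).serve_forever()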

You might benefit from splitting the system apart with events too. For example, duplicate some of the authentication or authorization data across the microservices. Let's say your recipe service needs to know who can read a given recipe. Use JWT or some other token scheme to say who the current caller is (in a secured way). In the recipe service, keep track of that user's authorizations locally. The JWT tells you they are authenticated; the local DB can tell you their authorization. Now you don't need the auth service, since you can probably create valid tokens. When the user's authorization changes, an event tells the recipe service. While there is a small period of time in which the person might be able to access something they shouldn't, in a functioning distributed system this isn't too bad. If you can't handle such failures, you might not want a microservice architecture.
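
A minimal sketch of that pattern, assuming PyJWT and an HS256 shared secret; the event shape and field names are invented:

    import jwt  # PyJWT

    SECRET = "dev-only-secret"
    local_authorizations = {}  # user_id -> set of recipe ids the user may read

    def on_authorization_changed(event):
        # Called by whatever delivers the auth-change events (message bus, webhook, ...).
        local_authorizations[event["user_id"]] = set(event["readable_recipes"])

    def can_read_recipe(token, recipe_id):
        claims = jwt.decode(token, SECRET, algorithms=["HS256"])  # authentication
        allowed = local_authorizations.get(claims["sub"], set())  # authorization, from the local copy
        return recipe_id in allowed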

1 - https://docs.python.org/2/library/simplehttpserver.html


How do you deal with cache invalidation? That seems like a potentially major security risk.

Caching other microservice's data in general just sounds like a bad idea.


In the examples above, the primary source of truth is the auth service. However, the rights get distributed (duplicated) to the various services. As I pointed out towards the end, there is a period of time where the auth service might say that a user lacks a group, but the recipe service still says the user has it. During this window of time, the system will allow the user to access a feature that requires a group they are in fact no longer a part of.

The same is true if you put groups in a JWT. Caching is caching. You always have to weigh the benefits against the cons.

In general this is a problem with distributed systems. CAP and all that. What do you do if your network is partitioned such that the recipes service can serve requests while the auth service can't notify the recipe service of changes? You can bring the whole system down. You can continue to allow operations based on the cached rights. You could limit operations if you haven't gotten a heartbeat from the auth service in a while.
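
The last option can be as simple as this (the threshold is invented, just to show the shape):

    import time

    HEARTBEAT_TIMEOUT = 60.0           # seconds without hearing from the auth service
    last_auth_heartbeat = time.time()

    def on_auth_heartbeat():
        global last_auth_heartbeat
        last_auth_heartbeat = time.time()

    def may_use_cached_rights():
        # Keep serving from the cached rights only while the auth service has
        # been heard from recently; otherwise degrade to read-only / deny.
        return time.time() - last_auth_heartbeat < HEARTBEAT_TIMEOUT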

Back to the example: say group changes occur rarely. Say, on top of that, network partitions (including message system failures) occur rarely too. The chance of both happening is P(partition, msg_failure), which is probably small. If you can live with that, your life might be better. If you have a hard requirement to always get the latest auth data, you might need to couple. Even then you're not really guaranteed that the latest data from the auth service is the latest data concurrently.

Finally, even if you have coupling, you can still design the system to use profiles to pick the service. When you're running locally, don't go to the real auth system; just go to SimpleHTTPServer. Don't attempt to run the auth service locally. SimpleHTTPServer will probably use fewer resources than your microservice using some complex library.


That sounds orthogonal to what I was saying. I'm not advocating for bloated services. If you don't need module X, don't include it.


"laptop" => Don't expect to run distributed systems on old laptop.


I don't really think 3-4 micro services ought to be something one can't run locally relatively easily. Even with a brand new MBP I couldn't effectively develop full stack using just a handful of services without running out of memory. Though I suppose this is less about locking yourself into a single framework for your services, and more about not shipping insane amounts of bloat with each of your services.


> I don't really think 3-4 micro services ought to be something one can't run locally relatively easily.

To put things in perspective: There are in fact dozens of microservices running on every laptop out there. They're called "system processes" or "daemons".


The "micro-service" name is a fallacy, there is no reason for services to be "micro" in size.

The language runtimes and dependencies already add a lot of boilerplate, even for the simplest of services. You will also need databases and data sources in the same box.

Laptops have little memory. MacBooks are no exception.


We have one of these and it is just a terrible internally developed framework, but we are stuck with it because all teams use it. If there was more freedom on the team level to pick better solutions, then the whole company would have likely standardized on something better by now.


But the argument is that you have a poor framework which was internally developed. That's technical debt, not a fault of standardizing on one framework. I don't think people are arguing that only one framework should be allowed; rather, the point is to establish a platform-approved list of software/libraries. If you have reason to believe that moving to another framework is better in the long run, speak up and bring a plan for how to address it. But I will say that people (both developers and management) will often dismiss a rewrite or an attempt to replace components because they think it's better to leave things alone. This is a double-edged sword: low morale and little excitement to work on a project even though the product is unstable, but also fear of failure after replacing or rewriting. Strong technical leadership is often absent.

This is actually quite important from both a licensing and a security point of view, and having such a list available encourages discussion. I'd like to write one of my services in Go, but does my company have the resources to maintain a Go application after I depart? What are the strong technical reasons that the service must be written in Go rather than in Python or Java? If Go is approved, then I need to make sure management offers training/materials for developers to learn Go, and that I am available to train others (good documentation, brown bag sessions, and what not).


"I don't think people are arguing for only one framework is allowed"

Read the post I replied to. It advocates requiring a single framework.

Yes, I am worse off because the framework sucks, but there is no incentive for it to improve because everyone is forced to use it. I would love an approved list as you suggest, because at least that would allow some choice, but even better would be working at a place that trusted me as an engineer to make those decisions myself.


Sounds like you just need to fix the tools you have. I'm not sure how allowing everyone to choose whatever they want would somehow end in standardization.


Teams are highly coupled by their dependence on a common framework: they can't choose their own tools, and you are at high risk of having to coordinate with another team (the framework team) to release a new feature, all of which are things microservices are supposed to eliminate. Therefore, it's an antipattern.


I've seen this happen where you develop a micro service and everything is working fine but the framework team makes a breaking change and doesn't tell you. Now your micro service is broken and it is somehow your fault because only your team is responsible for the micro service.


Yes, dependency management is never fun. I think the current approach we have in all languages falls short because we all depend on a text file declaring dependencies. We need to be able to track the dependencies and versions. As part of the build process, before merging code into a release or master branch, it's probably worth reading the flat file, parsing the content, and checking with a service (backed by a database).

-> depends

A -> B

B -> C

C -> nothing

D -> A

If A has changed, then on merging A we should run D against the new A. If C changed, then A, B, C, and D all need to be tested. A flag or tag for "non-backward compatible" should be added, so that on merging we can make the developers (both the producer and the consumers of the library) aware of breaking changes in their review queue. If breaking changes keep popping up, it's time for the tech leads and team managers on both sides to meet and understand how to avoid breaking changes so often.
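
A toy sketch of that affected-set computation, using the example graph above (a real version would parse the dependency files and consult the tracking service):

    # Who depends on whom, as declared by each service's dependency file.
    DEPENDS_ON = {"A": {"B"}, "B": {"C"}, "C": set(), "D": {"A"}}

    def affected_by(changed):
        # Walk the graph backwards: everything that depends, directly or
        # transitively, on the changed component needs to be retested.
        affected, frontier = {changed}, {changed}
        while frontier:
            frontier = {svc for svc, deps in DEPENDS_ON.items()
                        if deps & frontier and svc not in affected}
            affected |= frontier
        return affected

    print(affected_by("C"))  # {'A', 'B', 'C', 'D'} -> all of them need testing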

I don't know; someone working with a large tech stack should comment on their experience. But we can't just use a text file and an email and hope someone follows up or notices.


In my experience it's always worth doing extra work to avoid circular dependencies and couplings.


"Another dangerous decision is to use a lib for reusing code. This is particularly bad if the lib contains domain logic."

Question about this: if we're to take the above to heart, doesn't that imply that we may implement some common functionality multiple times? It seems there's a real tension here between "code re-use" as a generally laudable goal and the need to avoid synchronization of testing/deployments.

I'm curious how others out there have dealt with this particular issue.


In the end I think the idea of microservices as totally independent from each other is just a pipe dream. There will always be dependencies between them. His solution of creating more microservices for shared libs will create a whole other set of problems. In the end, people just have to talk to each other, and sometimes you can't avoid having people synchronize their work.


I wrestle with this one all the time. A shared library creates a weak dependency between the systems that use the library. Making changes to that library may affect those other systems in unpredictable ways. In my opinion if you leave business and domain logic out of those libraries and have a well defined and tested service contract then it is reasonably safe to create shared libraries. I have found that maintaining backwards compatibility and using feature flags in these libraries goes a long way at making them relatively painless for the microservices that use them to deal with change. We also generally only use shared libraries for codifying best practices and conventions.
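
A hypothetical flag in such a shared library might look like this; services keep the old behaviour until they explicitly opt in:

    def backoff_delays(config):
        """Retry delays (in seconds) that a consuming service should use."""
        if config.get("use_exponential_backoff", False):  # new behaviour, off by default
            return [0.1 * (2 ** attempt) for attempt in range(5)]
        return [1.0, 1.0, 1.0]                            # old, fixed-delay behaviour

    print(backoff_delays({}))                                 # service still on old behaviour
    print(backoff_delays({"use_exponential_backoff": True}))  # service that opted in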

But really, it's a constant struggle. Micro-service teams are semi-autonomous and are going to recreate functionality at will. That's a communication problem mostly though.


I'm currently struggling with dealing with shared libraries at work. I think the challenge is that the whole concept of dev-ops is fairly new[1], and from my experience Java, .Net, Python, etc. weren't designed for easy management of versioned dependencies. You have tools like Maven, Nuget, and Pip that pop up because it's a real problem, but they're add-ons to the ecosystem rather than built-in at a foundational level. And more to the point, if you want to set up your own package server[2], it can be a huge effort to integrate it into your build process[3].

[1] Disclaimer: I've only been programming for ~10 years, and professionally for ~3 years

[2] Personally, I'm currently using Sonatype Nexus

[3] I'm currently grappling with trying to automate building a multi-targeted .Net assembly, and the co-dependency between VS, msbuild, and Nuget gives me agita


I can speak best to pip, but versioned dependencies are a non-problem in Python because of it. Who cares if pip isn't baked into the standard library?


Versioned dependencies absolutely still are a problem, even with pip. You still need something like virtualenv to isolate dependencies for different projects. And virtualenv is a complete hack (granted, one that works; but still a hack)


What? Virtualenv is not a hack, and "problem" doesn't mean what you think it means if you think pip is a "problem".

You're barking up a tree that was cut down a few years ago.


pip ships with Python now.


If a library update requires all teams using the library to start using the new version simultaneously, then you're pretty tightly coupled. Not a good thing. But if different services can interoperate while running different versions of the library, then it's less challenging.

Combine that with a reasonable deprecation schedule on old versions of the library and let teams upgrade when they're ready.

Now you have the best of both worlds - a library that keeps moving forward with a known window (per the deprecation schedule) when they can completely drop old functionality, all while other teams are happily using the library.

Of course, a number of things are needed to pull this off:

* independent deployments so services don't automatically share the same library versions

* communication tools so library owners can publish a deprecation schedule and the right people are notified effectively

* verification mechanisms so library owners can make sure everybody has upgraded before pushing out breaking changes


It depends. The author suggests that one way to fix this is to extract the shared logic into a microservice of its own, but that may not always be possible.

I also think that functionally, there's a very small difference between a library with a public API and a microservice with a public API - the only main advantage is that microservices can be written in any language, but that may not be important for many businesses.

I think you should never "take it to heart", at least not in the sense that you should follow advice slavically. It's just one more tradeoff you should be thinking of to avoid cargo culting.


> slavically

Nit, but you probably meant "slavishly" (even given that the author is Polish)


Usually the only thing that should be shared is base process bootstrap-related code. So, for example, I've used a shared lib in a microservices environment which had common boilerplate code for starting a web server, connecting to a database, and initializing a router.

The other microservices all included this shared library, and it benefited devs by providing them with an easy way to create a new microservice quickly without having to recreate a bunch of boilerplate. It also provided an easy way to centrally manage a few things like the version of express (the node.js web server we chose) and a few other things. By updating that central lib the rest of the microservices would all get upgraded automatically the next time we rebuilt them.
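
Roughly the same idea sketched in Python/Flask terms (ours was node/express, and the names here are invented):

    from flask import Flask

    def create_service(name, db_url, register_routes):
        # Shared boilerplate: every microservice gets the same server and DB wiring.
        app = Flask(name)
        app.config["DATABASE_URL"] = db_url
        register_routes(app)  # each service only supplies its own routes
        return app

    # In an individual microservice:
    def routes(app):
        @app.route("/health")
        def health():
            return "ok"

    app = create_service("recipes", "postgres://localhost/recipes", routes)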


I'm not persuaded that teams should be this independent. It always leads to a poor experience for the user. Further, duplicating code and allowing this kind of hairball growth builds a spaghetti-code system, without the oversight needed for a coherent overall design.


This is why a good CTO or VP of Engineering or whoever the team leads are reporting to is important. Someone has to keep the big picture in mind, both from a product and engineering point of view. A good leader can guide teams in the right direction to eliminate bad user experiences or lots of repeating code or other problems that arise from organizations built around microservices.


This. And frankly, probably the longest-recognized anti-pattern in software engineering is when the software architecture ends up materializing the management structure. Team = service pretty much concretizes that.


This blog post needs proofreading. There is obviously something interesting they want to talk about but the language is all over the shop!


I agree, way too many typos. You can tell it's a lack of proofreading (not acceptable), not someone whose first language isn't English (acceptable).


It's a .pl address, so I don't think English is his first language.


I applaud the effort the author put into writing an article in a non-native language...it's hard enough to write well in your mother tongue. But if you're going to do so and publish it for public consumption, at least have a native speaker proofread it for you. If you don't know one, I'm sure plenty of HN readers would be happy to help out.


Now that I read it again, I notice the sentence structure reads like a non-native English speaker's. At first I just paid attention to the words that were wrong and read like bad autocorrections.


and that his name is Tomasz Fijałkowski.


as English a name as John Smith


But one that suggests English might not be his first language.


I agree with most of the article except the idea that you shouldn't have a front facade controller and that each team should be responsible for their own UI.

That isn't feasible in a lot of real world contexts.

- what if the same service is used by web and mobile applications?

- the backend services usually have access to data that shouldn't be exposed in the DMZ. By keeping all of the services that communicate with backend systems behind the firewall and exposing only the front-end services in the DMZ, you get another layer of protection, with the only port open between the DMZ and the internal network being 80/443.


This seems to argue that any microservice shared by many other microservices is an antipattern.

Lots of apps use S3 as a common store. Is S3 an antipattern? The same goes for things like SQS, Google Pub/Sub, Google Datastore, etc. In my opinion, the idea that microservices must be isolated silos is misguided.

There are many things that are top-down (high-level control plane, many small individual, stateless parts controlled by it), but the opposite of this is the bottom-up "shared substrate". Things like data storages and buses that have the responsibility of storing state or shuffling data between actors.

In fact, I consider a microservice that's a silo — that is, one that has its own private data store — to be an antipattern in many cases. To borrow an old phrase, data wants to be free. Today's backend patterns are very API-centric, with APIs acting as awkward gatekeepers to small amounts of data. It's a much better idea to invert this, Haskell-style: let the data be first class, and orient your APIs to work on your data.


I'm not very experienced but the part about (non-)shared front-ends doesn't sound convincing to me at all. Why are some methods to combine front-end code bad (angular) but others are good (ESI tags, allegro)? If there is no coordination between front-ends, how do you get a unified user experience and a consistent design?


You can have a design document with examples and inspiration.


Yes, but why?

This seems to me the same principle as "no libraries, instead keep a spec and re-implement the domain logic in each service" discussed elsewhere in this thread.

You still have a common dependency of all services (the spec/design doc) and you'll still have to modify them all if the dependency changes. Except now you also have heaps of duplicate code.


It's easy to duplicate code. But hard to delete shared code.


This is a related issue and a possible counter-example: how do you define the data structures passed around between microservices? Serialized objects à la protobufs? Versioned XML or JSON? Schema-versioned docs?

This is a great opportunity to have shared libraries, or at least shared schemas, across different microservices. In fact, having each service write its own parsers or serializers would be a mistake in my book.
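
For example, a shared, versioned schema module that every service imports, sketched here with the jsonschema library and an invented message shape:

    from jsonschema import validate

    RECIPE_CREATED_V1 = {
        "type": "object",
        "required": ["recipe_id", "author_id", "title"],
        "properties": {
            "recipe_id": {"type": "string"},
            "author_id": {"type": "string"},
            "title": {"type": "string"},
        },
    }

    def parse_recipe_created(message):
        # The same validation runs in every consuming service; no hand-rolled parsers.
        validate(instance=message, schema=RECIPE_CREATED_V1)
        return message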


I think you would use protobuf, XML, JSON, etc. The idea would be that you encapsulate all of your services behind a gateway of sorts. External clients would interact with the gateway only, and internal services would just provide a one-size-fits-all response. This way front-end teams are responsible for their clients, and there is very little change internally because every base is covered in the general-purpose response.


I think a lot of people mix up microservices with distributed systems. Two different services, for example the SMTP service and the HTTP service, should not need to communicate internally. They should be able to deploy independently of each other. A mail-form service will need to communicate with both, but a common language between all services is not needed. JSON is popular, though.


I just want to call out that Akka doesn't require you to interoperate only with services that also use Akka. If it did, it'd lose quite a lot of the benefits you want from an actor library. In truth, whether to interoperate or not is a decision the developers make, and it's also a decision they can (and should) un-make.


> The shared database hidden behind some kind of ACL (anti-corruption layer) is a short term solution.

Wat


"That core advantages support quick delivery of new features."

What?


There are a number of typos and misspellings throughout the article, but it's possible to discern the general meaning anyway.


"That" should obviously be "these."



