Microservices Ending Up as a Distributed Monolith (infoq.com)
154 points by adamnemecek on Mar 3, 2016 | 51 comments

There's compile-time DRY, and then there's run-time DRY.

* Compile-time DRY allows me to change my code quickly and reliably.

* Run-time DRY allows me to deploy services independently.

Sometimes these objectives are at odds. Sharing code among my services can make my code less repetitive at compile time, but it can also lead to duplicated execution at run time. And vice versa.

The important thing (besides reading the latest blogs and following the latest tech trends, of course) is to figure out what your objective is.

For example, at my company, we want (1) to improve the ease of code development, and (2) to increase production stability for critical functionality.

So we have microservices that allow us to prevent non-critical issues from taking out critical functions. And we use extensively shared code for HTTP, JSON, templating, logging, some business logic, etc. in order to make development easier. We don't care about decoupling releases or segmenting production ownership. In other words, run-time DRY isn't important to us. Our services are simply ways to intelligently distribute execution across limited resources.

Assess your objectives, and then choose the solution that best meets them. Never do the reverse.

If you need separation of concerns and information hiding at runtime, this article has some good tips.

I find this pretty hilarious coming from the guy who wrote Hystrix, which was extremely tightly coupled to another library called Archaius, which in turn pulls in a gazillion other dependencies (that said, I still highly respect Ben).

I had to spend a couple of days decoupling the mess [1].

The other annoying thing with Hystrix is that it's basically the Request/Reply pattern, except it's even worse because it uses inheritance (the command pattern) and a massive singleton (HystrixPlugins). Furthermore, although it uses a streaming library (RxJava), it sort of disingenuously provides very little actual support for streaming.

You don't need to have separate executables to make things decoupled but rather a good build system and architecture. Some of the ways are avoiding Request/Reply, singletons, and the command pattern.

We achieve fairly good decoupling by using a language-agnostic custom message bus (it's over RabbitMQ, which is AMQP... tons of clients), but you could use an actor framework, or, as another poster (@platform) mentioned, Storm and Erlang processes if you want to stick to one language.

[1]: https://github.com/Netflix/Hystrix/pull/1083

"Do as I say, not as I do". It looks to me as if Ben bundled Archaius several years ago and just recently made it a "soft dependency" which isn't required. Couldn't you say he's learned that a "gazillion other dependencies" are bad over the past several years and has learned enough to share his knowledge and implement it?

And then critiquing other parts of his code which have nothing to do with his talk is completely unnecessary.

> And then critiquing other parts of his code which have nothing to do with his talk is completely unnecessary.

It's completely relevant, as these are among the reasons Hystrix is so tightly coupled, which is exactly what the presentation said to avoid. I think Ben would agree, and I'm sure he has learned from his experience.

The tone might sound disparaging, but as I said, I truly hold both the library and Ben in very high regard (otherwise why would I fix it). I suppose I have a weird sense of humor, because my thought was "hey.. I just fixed that library that had these problems, and the creator is talking about those problems :)" ... NOT "Ben is a moron and should have done it right."

Do you have some good examples of why not to use the Request/Reply pattern? I've been trying to understand why some distributed dataflow systems prefer pushing rather than pulling.

Let's say you want to push data from the server to a mobile/web app in "real time". Examples I've come across lately would be a chat application and real time map updates (a la Uber).

Now your clients could either: (1) Pull the server every 0.5 seconds, or (2) Open a connection that stays open and let the server push messages across it.

Request/Reply and pulling is not bad in itself, but in some cases it has two downsides: (1) It delays everything, because you can only pull so often. (2) It's bad for performance and scalability, because all those pulls have to be handled even when there's nothing to return.

Now you want to create a backend that also avoids pulling in the relevant places.

I think it is important to always remember that all push models, under the covers, are just pull models at a lower level of abstraction. This goes all the way down to the interrupt level.

So when we say pull models are bad for performance and scalability, what we really mean is that the abstractions we are putting around our pull model at this level are more costly than the ones one level down (i.e. your message bus is pulling off a TCP/IP socket but avoiding HTTP).

Depending on what kind of performance and scalability you are talking about, you can either address that by going an abstraction down (which is nearly always a performance booster that comes with a development time cost) or you can pull more, less often, at the high level of abstraction (smart batching protocols also exist at every level of abstraction).

Yes this is a fundamental problem with queues, message passing, streams or any pseudo push system.

You must respect backpressure, because if you don't, queues blow up, and the only way to respect backpressure is to have the consumer request (i.e. pull) more data. The trick is to make this a lazy async pull (reactive streams) and not a blocking pull (blocking queues): "I'm ready for more data, send it to me whenever."
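That lazy async pull is exactly the Reactive Streams contract, which the JDK ships as java.util.concurrent.Flow. Below is a minimal sketch (the class name and variables are mine, not from the thread): the subscriber only ever requests one more item after finishing the previous one, so demand, not the producer, drives the flow.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;

public class BackpressureSketch {
    public static void main(String[] args) throws InterruptedException {
        List<Integer> received = new ArrayList<>();
        CountDownLatch done = new CountDownLatch(1);

        try (SubmissionPublisher<Integer> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(new Flow.Subscriber<Integer>() {
                private Flow.Subscription subscription;

                public void onSubscribe(Flow.Subscription s) {
                    subscription = s;
                    s.request(1);           // lazy pull: signal readiness for one item
                }
                public void onNext(Integer item) {
                    received.add(item);     // process, then ask for the next one
                    subscription.request(1);
                }
                public void onError(Throwable t) { done.countDown(); }
                public void onComplete()         { done.countDown(); }
            });

            // submit() blocks if the subscriber's demand lags too far behind,
            // so the queue can never blow up.
            for (int i = 1; i <= 5; i++) publisher.submit(i);
        } // closing the publisher signals onComplete

        done.await();
        System.out.println(received);
    }
}
```

The key contrast with a blocking queue is that `request(1)` never blocks a thread; it just registers demand that the publisher honors asynchronously.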

RabbitMQ deals with this with ACKs, Prefetch count and heartbeats (as well as some other complicated techniques like TTL and queue length).

Your point is valid. But I was under the impression that lower-level stuff often did not pull. At what point does a TCP/IP connection pull, for example? Below epoll/kqueue? Or the keyboard-to-machine connection? I haven't done anything that low-level in a while though, so I could be wrong.

Remember that epoll/kqueue are just more efficient abstractions on top of OS-level event loops. Those event loops are polling from driver-level queues and interrupts.

Interrupts are, roughly, just the CPU polling on interrupt lines in hardware.

There are a myriad of reasons (I can't find a comprehensive link), but I will try to list one or two. Some people even have mixed feelings, since Request/Reply is so ubiquitous and so easy to understand/implement that this often makes up for its downsides.


Request/Reply is obviously not good if your problem is inherently push-based, i.e. Pub/Sub. This is partly for performance reasons (that could be argued), but also because Req/Rep does not map naturally onto pub/sub problems (you can think of pub/sub problems as live streams of data, like stocks or chat systems).


The Request/Reply pattern requires very smart endpoints. For some reason this is extolled heavily in the microservice crowd because it avoids single points of failure. However, smart endpoints need to deal with server discovery, timeouts, circuit breaking, metrics, backpressure and much more. This logic often gets put into a library and suddenly becomes part of ALL clients, which, as the presentation mentions, is bad (this is what Hystrix does to some extent). For example, having all your clients depend on ZooKeeper to find some servers creates pretty tight coupling (this is what Archaius does).

That being said the above can be mitigated by making sure communication doesn't absolutely rely on the endpoints having the same intelligence.

Message Buses like RMQ avoid this issue because the pipe is smart. Your clients don't need to have the above logic which makes implementing clients in many languages far easier... but at a big cost: single point of failure and an extra network hop (broker).

Like smart endpoint problems smart pipe problems can be mitigated as well (e.g. RMQ has cluster support).

We use a mix of both patterns and in both cases have not-so-dumb endpoints and not-so-dumb pipes.

What does the custom message bus offer over rabbitmq? RMQ seems pretty fully featured already.

The custom part isn't really custom. The custom part is just using RMQ consistently: clients follow some guidelines. It's basically saying we support Cap'n Proto, JSON, and Protobuf for the message body (one day we will pick one, but alas...) and do pretty much zero routing (each message type goes to its own queue).

However, we do do some endpoint routing independent of RMQ: if a client pushes a message to the "bus" and we detect that the same client can consume that message, we will sometimes avoid sending it over RMQ (i.e. a local publish). It's basically a performance enhancement that encourages bus usage for decoupling while avoiding the network hop for low latency.
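A sketch of that local-publish optimization follows. All names here (LocalFirstBus, brokerLog, the message types) are hypothetical illustrations, not the poster's actual code; a real broker publish is replaced by a log list so the routing decision is visible.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

public class LocalFirstBus {
    private final Map<String, List<Consumer<byte[]>>> localHandlers = new HashMap<>();
    private final List<String> sentOverBroker = new ArrayList<>(); // stand-in for an RMQ publish

    public void subscribe(String messageType, Consumer<byte[]> handler) {
        localHandlers.computeIfAbsent(messageType, k -> new ArrayList<>()).add(handler);
    }

    public void publish(String messageType, byte[] body) {
        List<Consumer<byte[]>> handlers = localHandlers.get(messageType);
        if (handlers != null && !handlers.isEmpty()) {
            // This process can consume the type itself: deliver locally, skip the broker hop.
            handlers.forEach(h -> h.accept(body));
        } else {
            // Otherwise hand it to the broker (one queue per message type).
            sentOverBroker.add(messageType);
        }
    }

    public static void main(String[] args) {
        LocalFirstBus bus = new LocalFirstBus();
        bus.subscribe("chat.message", body -> System.out.println("handled locally: " + new String(body)));
        bus.publish("chat.message", "hello".getBytes());  // local consumer exists: no network hop
        bus.publish("stock.tick", "AAPL".getBytes());     // no local consumer: goes over the broker
        System.out.println("sent over broker: " + bus.sentOverBroker);
    }
}
```

The point is that callers always publish "to the bus" and stay decoupled; whether the message crosses the network is a routing detail hidden behind `publish`.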

By using protocols like AMQP and serialization that are language agnostic and focusing on doing very little routing we could switch to zmq, kafka, HTTP2 or whatever is in vogue if we wanted to.

With REST you have serious contract complexity: URIs, HTTP headers, POST form parameters, HTTP methods, query parameters and the HTTP body. With an async message bus, the message is the contract.

You could be more generous in your reading and assume he has learned from past errors (hystrix).

I think the article misses some important context about scale and economic efficiency.

When you have 10,000 engineers then decoupling everything behind protocols and formats makes a huge amount of sense because the coordination costs are astronomical.

When you have 10 engineers then you need to standardise tooling and libraries because the costs associated with diversity in the codebase swamp the coordination costs.

I've worked on what I think could be called a "distributed monolith" and it's not bad: services and clients can share interfaces, common build and deployment infrastructure for everything, etc. You still get the benefit of being able to scale out, separate failures and deploy components easily, with the benefits of standardizing around one language.

We do the same. A monolith gets copied to all stations which then run different parts of it. That way they all have the same infrastructure code available.

How do you deal with services that use different versions? Tight coupling means that to make changes to Service A you have to wait for Service B to make the changes...which can be a loooonnngg time if the team looking after Service B is different from your own.

I love this article. Microservices are as much about coupling as they are about scaling. It reminds me of the recent notion of disposable code. We need to get more comfortable with the idea that decoupling likely has more long-term benefits than reusability.

I don't understand how Guava couples my services; could anyone explain it to me?

Let's say I have service A that uses Guava v18, and service B that uses Guava v19. Neither of them cares what the other one uses; all they see is RPC (or HTTP) calls.

I was the guy at Netflix that had the task of moving the Company from Guava 10 to Guava 11. Guava 11 was backward-incompatible with Guava 10. On the Platform team (produced libraries the entire company consumed) we could only move to Guava 11 if EVERYONE moved to Guava 11. The only alternative would have been to shade Guava 11 into the platform library which wasn't pleasant.

That sounds like a really impractical way to run a platform team.

Moving the entire company codebase at the same time may be possible with a dozen developers, but when you reach Netflix size it seems impossible.

What am I missing?

It wasn't practical. But that's what we had at the time.

Just to chime in here -- I think this is a relatively common happening. I've found that companies with lots of teams developing software often look to "standardize" the software being used to create things like web services/pieces of infrastructure.

This often starts with creating a local nexus server (in java land), then is followed relatively quickly by creating a "X company commons" for "everyone" to use.

I see :)

That kind of problem is tricky, since nobody decided to make it that way.

A system worked great when it was instituted, and was never changed, but is now no longer good.

Anyway, if/when I become Netflix CTO, I will make sure this is fixed!

I should mention that the reason this was such a huge problem was that our dependency system at the time gave each group the option of using the "latest" version of the platform library instead of a fixed version. Everyone used it. If, instead, people had locked to a known version of the library, it would've been so much easier to upgrade.

What is shading a library?

I was curious as well, so I went looking: http://stackoverflow.com/a/13620420/4563079

relevant part: "For example, I am developing Foo library, which depends on a specific version (e.g. 1.0) of Bar library. Assuming I cannot make use of other version of Bar lib (because API change, or other technical issues, etc). If I simply declare Bar:1.0 as Foo's dependency in Maven, it is possible to fall into a problem: A Qux project is depending on Foo, and also Bar:2.0 (and it cannot use Bar:1.0 because Qux needs to use new feature in Bar:2.0). Here is the dilemma: should Qux use Bar:1.0 (which Qux's code will not work) or Bar:2.0 (which Foo's code will not work)?

In order to solve this problem, developer of Foo can choose to use shade plugin to rename its usage of Bar, so that all classes in Bar:1.0 jar are embedded in Foo jar, and the package of the embedded Bar classes is changed from com.bar to com.foo.bar. By doing so, Qux can safely depends on Bar:2.0 because now Foo is no longer depending on Bar, and it is using is own copy of "altered" Bar located in another package."
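Concretely, the rename described above is usually done at build time. Here is a minimal maven-shade-plugin sketch for the quoted scenario, assuming Foo bundles Bar 1.0 (the `com.bar`/`com.foo.bar` package names come from the quoted answer; the rest is standard plugin configuration):

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <relocations>
          <relocation>
            <!-- Bar 1.0's classes are embedded in Foo's jar and renamed, so they
                 cannot clash with whatever Bar version the consumer (Qux) uses -->
            <pattern>com.bar</pattern>
            <shadedPattern>com.foo.bar</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```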

So it's like name mangling for libraries?

Friends don't let friends use guava.

Guava is what happens when the owner of a widely used technology is inept. Java is in much better shape today.

I think his point is that if ALL of your services have to use Guava (perhaps because you included some functionality that only Guava provides) then you have coupled your service to Guava. You can't use nodejs to build a service.

That assumes they were written independently. But if your services have a large heap of shared libraries in common, they all need to work together. Then when you upgrade Guava, there may be a lot of code to fix.

Thank you. But if my services have a large heap of shared libraries in common, it's by definition a monolith. With Guava, or without. I shouldn't call it "microservices architecture" in the first place.

What I always meant by "microservices" is servers talking to each other by some kind of RPC, e.g. Protobuf at Google (HTTP+JSON works as well if you don't operate at terabyte scale). Then it doesn't matter what Guava version, or even what programming language, my services are written in, as long as they support the RPC format (e.g. Protobuf).

So I conclude that the original post author does microservices wrong. Or just doesn't understand what he's talking about. Sad to see it on InfoQ.

I don't get it either. If I use Spring WS to provide the endpoints, how does that make it a monolith? It's always about using the right tool for the job, but if one set of libraries is the right tool, why not use it across everything where appropriate?

One of the benefits of microservices is your ability to choose the right tools for each of them instead of a common denominator. I don't know about Spring WS, but if it makes it harder for, say, a Go microservice (or whatever makes sense for your domain) to talk to your service, then I would consider it a liability.

A lot of Microservices advocates talk about abandoning DRY to reduce code coupling.

I think that's completely impractical and that we should still make heavy use of shared abstractions in modules and libraries.

If I have a routine in two application services, e.g. a currency converter, and that code has to be changed, I have a few options:

With code duplication (Abandon DRY):

1. If the domain demands, I have to change the routine in multiple services and build, test and deploy both of those services in lockstep.

2. If the domain allows, I can change it in one service and build, test and deploy that one service in the short term. The other service will likely need to be updated at some point in the future. This is hard to manage and brings duplication of effort.

Without code duplication (DRY):

3. If the domain demands, I have to change it in one shared library and then test and deploy both of the dependent services in lockstep.

4. If the domain allows, I can change one shared library and build, test and deploy only the one service that I need to change in the short term. The other service will likely need to be deployed at some point in the future.
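The shared-library variant (options 3 and 4) can be as small as the sketch below. The class name, constructor parameter, and version numbers are illustrative, not from the thread: the idea is that the converter lives in one versioned artifact, service A can pin version 1.2 while service B stays on 1.1, and each deploys on its own schedule.

```java
// Hypothetical shared artifact, e.g. currency-lib:1.2
public final class CurrencyConverter {
    private final double usdPerEur;

    public CurrencyConverter(double usdPerEur) {
        this.usdPerEur = usdPerEur;
    }

    // The one routine both services share instead of duplicating.
    public double eurToUsd(double eur) {
        return eur * usdPerEur;
    }

    public static void main(String[] args) {
        // Each consuming service constructs this with its own rate source;
        // 1.25 is chosen here only because it multiplies exactly.
        CurrencyConverter converter = new CurrencyConverter(1.25);
        System.out.println(converter.eurToUsd(100));
    }
}
```

Option 4 only works cleanly if consumers pin explicit library versions, which echoes the Netflix "latest version" lesson elsewhere in this thread.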

Shared libraries are orthogonal to the question of loose or tight coupling. We can have no shared libraries but still have two services which are tightly coupled and chatty. That is the situation we need to get away from, and it is achieved through good DDD and service boundaries, but really doesn't seem to have much to do with shared libraries.

I think the problem comes in when the shared library is used to provide some kind of baseline service, so that if changes are made to the library and 2 services using them are out of sync, then chaos will result.

The solution you advocate is correct. Service boundaries need to be well defined and well separated.

> so that if changes are made to the library and 2 services using them are out of sync

Wouldn't versioning help with this?

In what way? Versioning helps pin shared libraries so they are not inadvertently updated. However, when you have 2 services A and B using the same library, changes to the library, if they are not updated in both services, may lead to the services being unable to talk to one another. e.g. Suppose service A and B use library L. But the authors of library L decided to rename a certain field in the JSON of HTTP calls that are crafted by the library. Now, unless both A and B update to the new version, they may not be able to talk to one another, so each has to coordinate their efforts to either hold off using the new version, or deploy the new version at the same time.

> But the authors of library L decided to rename a certain field in the JSON of HTTP calls

The API endpoints should be versioned as well. Once all services have been upgraded to use the next version of the API, the previous endpoint can be deprecated.
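That side-by-side versioning can be sketched with the JDK's built-in HTTP server. The paths, field names, and the user resource below are made up for illustration (they mirror the "renamed JSON field" example above): /v1 keeps the old field name for existing consumers while /v2 carries the rename, and /v1 is retired once everyone has migrated.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class VersionedApi {
    public static void main(String[] args) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(0), 0);
        // Old consumers keep reading "user_name" from v1 ...
        server.createContext("/v1/user", ex -> respond(ex, "{\"user_name\":\"ada\"}"));
        // ... while migrated consumers read the renamed field from v2.
        server.createContext("/v2/user", ex -> respond(ex, "{\"username\":\"ada\"}"));
        server.start();

        // Exercise both versions against the ephemeral port, then shut down.
        int port = server.getAddress().getPort();
        HttpClient client = HttpClient.newHttpClient();
        for (String version : new String[] {"v1", "v2"}) {
            HttpRequest req = HttpRequest.newBuilder(
                URI.create("http://localhost:" + port + "/" + version + "/user")).build();
            HttpResponse<String> resp = client.send(req, HttpResponse.BodyHandlers.ofString());
            System.out.println(version + " -> " + resp.body());
        }
        server.stop(0);
    }

    private static void respond(HttpExchange ex, String body) throws IOException {
        byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
        ex.sendResponseHeaders(200, bytes.length);
        try (OutputStream os = ex.getResponseBody()) { os.write(bytes); }
    }
}
```

Nothing here requires HTTP specifically; the same "run both contract versions until consumers migrate" idea applies to versioned message schemas on a bus.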

I had a quick look at the transcript of Ben's talk, and it seems that one of his main issues with microservices which share the same code base (and therefore dependencies) is the overhead of upgrading their dependencies. I do understand that in some cases this might be problematic. However, if the microservices are written using the same language (or runtime), there is a huge benefit of reusing some abstractions. These might include things like validation schemas, serialisation formats, etc. These are the things that most likely would need to be created anyway for every microservice written in a new language. As usual, there has to be some balance in the amount of coupling between microservices. But there is nothing inherently wrong with reusing code between them. I think the most important factor in a microservice ecosystem is having well defined protocols and APIs.

You can also view the original talk at www.microservices.com/ben-christensen-do-not-build-a-distributed-monolith

Isn't that what's supposed to happen? You don't distribute small systems, you distribute big systems, otherwise the benefits of microservices are way fewer.

People jump on the microservices bandwagon way too often. I'm glad that Jan pointed out that not everything that is broken down is a microservice.

So, if someone creates a DropWizard endpoint, and someone else creates a Spring @RestController endpoint, and someone else creates a Scala Play endpoint, then the problem becomes that someone that knows Spring but not Scala might find themselves tasked with supporting the Play codebase, which is scary. So then the employer seeks to standardize on Spring, but that's apparently bad because if every new endpoint requires Spring then that's a Distributed Monolith...

I have used Apache Storm and Erlang OTP, and I still cannot grasp how microservices are different from Storm's bolts or Erlang's processes.

While subjective, for a JVM-based solution stack, Apache Storm appears to be more elegant, complete and efficient than Linkerd, as an example.

Maybe the advantage of microservices vs Storm becomes more apparent at scale, and I simply haven't had experience at that level.

A microservice would be more like a well-structured, logically-bounded Erlang OTP Application (as in literally the Application pattern provided by OTP), than a single process.

You could run the OTP Application, and a bunch of others, locally. You could stick some on a different machine/cluster. Everything is nice and encapsulated and distributable.

The only issue is this would still violate the concept in the OP because it forces everything to either behave like an OTP Application or at a minimum to behave like an Erlang node.

Something I posted to clarify microservices and SOA - https://medium.com/@kashifrazzaqui/will-the-real-micro-servi...

Message Bus != SOA. What the prior poster is alluding to is the Actor pattern and/or the message-passing pattern.

My problem with microservices is that you're often picking subpar solutions/technologies, particularly if low latency is a priority. For example, HTTP is not always the best protocol and Request/Reply is not always the best messaging pattern, but this is what most people equate with microservices (i.e. HTTP 1.1 REST plus a lightweight list of servers (etcd, ZooKeeper, Consul) or a load balancer).

This is really good. I also think about it in terms of ants and distributed systems.

Thanks for sharing.
