* Compile-time DRY allows me to change my code quickly and reliably.
* Run-time DRY allows me to deploy services independently.
Sometimes these objectives are at odds. Sharing code among my services can make my code less repetitive at compile time, but it can also lead to duplicated execution at run time. And vice versa.
The important thing (besides reading the latest blogs and following the latest tech trends, of course) is to figure out what your objective is.
For example, at my company, we want (1) to improve the ease of code development, and (2) to increase production stability for critical functionality.
So we have microservices that allow us to prevent non-critical issues from taking out critical functions. And we use extensively shared code for HTTP, JSON, templating, logging, some business logic, etc. in order to make development easier. We don't care about decoupling releases or segmenting production ownership. In other words, run-time DRY isn't important to us. Our services are simply ways to intelligently distribute execution across limited resources.
Assess your objectives, and then choose the solution that best meets them. Never do the reverse.
If you need separation of concerns and information hiding at runtime, this article has some good tips.
I had to spend a couple of days decoupling the mess.
The other annoying thing with Hystrix is that it's basically the Request/Reply pattern, except it's even worse because it uses inheritance (the command pattern) and a massive singleton (HystrixPlugins). Furthermore, although it uses a streaming library (rxjava), it somewhat disingenuously provides very little actual support for streaming.
You don't need separate executables to make things decoupled, but rather a good build system and architecture. Some ways to get there are avoiding Request/Reply, singletons, and the command pattern.
We achieve fairly good decoupling by using a language-agnostic custom message bus (it's over RabbitMQ, which is AMQP... tons of clients), but you could use an actor framework or, as another poster (@platform) mentioned, Storm and Erlang processes if you want to stick to one language.
And then critiquing other parts of his code which have nothing to do with his talk is completely unnecessary.
It's completely relevant, as these are many of the reasons why Hystrix introduces so much coupling, which is exactly what the presentation said to avoid. I think Ben would agree, and I'm sure he has learned from his experience.
The tone might sound disparaging, but as I said, I truly hold both the library and Ben in very high regard (otherwise why would I fix it?). I suppose I have a weird sense of humor, because my thought was "hey, I just fixed the library that had these problems, and the creator is talking about those problems :)" ... NOT "Ben is a moron and should have done it right."
Now your clients could either: (1) Pull from the server every 0.5 seconds, or (2) Open a connection that stays open and let the server push messages across it.
Request/Reply and pulling are not bad in themselves, but in some cases they have two downsides: (1) They delay everything, because you can only pull so often. (2) They're bad for performance and scalability, because all those pulls have to be handled even when there's nothing to return.
Now you want to create a backend that also avoids pulling in the relevant places.
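A minimal in-process simulation of the two options, assuming a toy `Server` class whose names are made up for illustration (no real networking involved):

```python
class Server:
    def __init__(self):
        self.outbox = []          # messages waiting for the client
        self.subscribers = []     # callbacks for push delivery

    def publish(self, msg):
        self.outbox.append(msg)
        for cb in self.subscribers:   # push: deliver immediately
            cb(msg)

    def poll(self):
        """Option 1: the client asks 'anything for me?' and drains the outbox."""
        msgs, self.outbox = self.outbox, []
        return msgs

server = Server()

# Option 1: pull. Ten polls, but only one message ever arrives, so nine
# round trips return nothing and delivery waits for the next poll tick.
wasted_polls = 0
for tick in range(10):
    if tick == 3:
        server.publish("hello")
    if not server.poll():
        wasted_polls += 1

# Option 2: push. One registration, zero wasted round trips, and the
# message is handled the moment it is published.
received = []
server.subscribers.append(received.append)
server.publish("world")
```

The wasted polls are exactly the scalability cost the comment above describes: work spent handling requests that return nothing.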
So when we say pull models are bad for performance and scalability, what we really mean is that the abstractions we are putting around our pull model at this level are more costly than the one level down (ie your message bus is pulling off a tcp/ip socket but avoiding http).
Depending on what kind of performance and scalability you are talking about, you can either address that by going an abstraction down (which is nearly always a performance booster that comes with a development time cost) or you can pull more, less often, at the high level of abstraction (smart batching protocols also exist at every level of abstraction).
You must respect backpressure, because if you don't, queues blow up, and the only way to respect backpressure is to have the consumer request (ie pull) more data. The trick is to make this a lazy async pull (reactive streams) and not a blocking pull (blocking queues), ie "I'm ready for more data, send it to me whenever."
RabbitMQ deals with this with ACKs, Prefetch count and heartbeats (as well as some other complicated techniques like TTL and queue length).
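A sketch of the lazy async pull described above, in the style of Reactive Streams: the consumer never blocks waiting for data; it signals demand with `request(n)`, and the producer only emits while demand remains. Class and method names here are illustrative, not a real library API:

```python
class Producer:
    def __init__(self):
        self.pending = []     # data the producer has ready
        self.demand = 0       # how many items the consumer asked for
        self.on_next = None   # consumer callback

    def subscribe(self, on_next):
        self.on_next = on_next

    def request(self, n):
        """Consumer says: 'I'm ready for n more, send them whenever.'"""
        self.demand += n
        self._drain()

    def emit(self, item):
        self.pending.append(item)
        self._drain()

    def _drain(self):
        # Respect backpressure: never push more than was requested,
        # so the consumer-side queue cannot blow up.
        while self.pending and self.demand > 0:
            self.demand -= 1
            self.on_next(self.pending.pop(0))

got = []
p = Producer()
p.subscribe(got.append)
p.request(2)                  # demand 2; nothing is ready yet
for i in range(5):
    p.emit(i)                 # 5 items arrive, but only 2 may be delivered
assert got == [0, 1]          # the other 3 wait at the producer
p.request(10)                 # more demand: the backlog drains
assert got == [0, 1, 2, 3, 4]
```

Note the consumer never blocks: `request` is just a signal, and delivery happens whenever data and demand coincide.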
Interrupts are just the CPU polling on interrupt lines in hardware.
Request/Reply is obviously not good if your problem is inherently push-based, ie Pub/Sub. This is arguably for performance reasons, but also because Req/Rep does not map naturally onto pub/sub problems (you can think of pub/sub problems as live streams of data, like stocks or chat systems).
The Request/Reply pattern requires very smart endpoints. For some reason this is extolled heavily in the microservice crowd because it avoids a single point of failure. However, smart endpoints need to deal with server discovery, timeouts, circuit breaking, metrics, backpressure and much more. This logic often gets put into a library and suddenly becomes part of ALL clients, which, as the presentation mentions, is bad (this is what Hystrix does to some extent). For example, having all your clients depend on ZooKeeper to find servers introduces a lot of coupling (this is what Archaius does).
That being said, the above can be mitigated by making sure communication doesn't absolutely rely on the endpoints having the same intelligence.
Message buses like RMQ avoid this issue because the pipe is smart. Your clients don't need to have the above logic, which makes implementing clients in many languages far easier... but at a big cost: a single point of failure and an extra network hop (the broker).
Like smart-endpoint problems, smart-pipe problems can be mitigated as well (e.g. RMQ has cluster support).
We use a mix of both patterns and in both cases have not-so-dumb endpoints and not-so-dumb pipes.
However, we do do some endpoint routing independently of RMQ: if a client pushes a message to the "bus" and we detect that the same client can consume that message, we will sometimes avoid sending it over RMQ (ie a local publish). This is basically a performance enhancement that encourages bus usage for decoupling while avoiding the network hop for low latency.
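A rough sketch of that local-publish optimization: before handing a message to the broker, check whether a consumer in the same process is subscribed to the topic, and short-circuit if so. The `Bus` class and its method names are hypothetical, not the actual implementation:

```python
class Bus:
    def __init__(self):
        self.local_handlers = {}   # topic -> handler living in this process
        self.sent_to_broker = []   # stand-in for the real RMQ publish call

    def subscribe_local(self, topic, handler):
        self.local_handlers[topic] = handler

    def publish(self, topic, msg):
        handler = self.local_handlers.get(topic)
        if handler is not None:
            handler(msg)                              # local publish: no network hop
        else:
            self.sent_to_broker.append((topic, msg))  # normal broker path

bus = Bus()
seen = []
bus.subscribe_local("orders", seen.append)
bus.publish("orders", {"id": 1})     # consumed in-process, broker never sees it
bus.publish("audit", {"id": 1})      # no local consumer: goes over the bus
```

Callers still talk to one `publish` API, so the code stays decoupled even though delivery is sometimes local.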
By using language-agnostic protocols and serialization like AMQP, and by focusing on doing very little routing, we could switch to zmq, kafka, HTTP2 or whatever is in vogue if we wanted to.
With REST you have serious contract complexity: URIs, HTTP headers, POST form parameters, HTTP methods, query parameters and the HTTP body. With an async message bus, the message is the contract.
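To illustrate "the message is the contract": with a bus, everything a consumer needs lives in one serialized object, instead of being spread across URI, method, headers, query string and body. The message type and field names below are invented for the example:

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class ConvertCurrency:
    """The entire contract between publisher and consumer."""
    amount: float
    from_currency: str
    to_currency: str
    reply_to: str          # where the answer should be published

msg = ConvertCurrency(10.0, "USD", "EUR", "replies.pricing")
wire = json.dumps(asdict(msg))                 # language-agnostic on the wire
decoded = ConvertCurrency(**json.loads(wire))  # any client that can parse JSON can play
assert decoded == msg
```

Evolving the contract means evolving one type, rather than auditing every place a URI, header or query parameter is constructed.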
When you have 10,000 engineers then decoupling everything behind protocols and formats makes a huge amount of sense because the coordination costs are astronomical.
When you have 10 engineers then you need to standardise tooling and libraries because the costs associated with diversity in the codebase swamp the coordination costs.
Let's say I have service A that uses Guava v18, and service B that uses Guava v19. Neither of them cares what the other one uses; all they see is RPC (or HTTP) calls.
Moving the entire company codebase at the same time may be possible with a dozen developers, but when you reach Netflix's size it seems impossible.
What am I missing?
This often starts with creating a local Nexus server (in Java land), then is followed relatively quickly by creating an "X company commons" for "everyone" to use.
That kind of problem is tricky, since nobody decided to make it that way.
A system worked great when it was instituted, and was never changed, but is now no longer good.
Anyway, if/when I become Netflix CTO, I will make sure this is fixed!
"For example, I am developing a Foo library, which depends on a specific version (e.g. 1.0) of a Bar library. Assume I cannot use any other version of Bar (because of API changes, or other technical issues, etc.). If I simply declare Bar:1.0 as Foo's dependency in Maven, it is possible to fall into a problem: a project Qux depends on Foo, and also on Bar:2.0 (and it cannot use Bar:1.0, because Qux needs a new feature in Bar:2.0). Here is the dilemma: should Qux use Bar:1.0 (with which Qux's code will not work) or Bar:2.0 (with which Foo's code will not work)?
In order to solve this problem, the developer of Foo can choose to use the shade plugin to rename its usage of Bar, so that all classes in the Bar:1.0 jar are embedded in the Foo jar, and the package of the embedded Bar classes is changed from com.bar to com.foo.bar. By doing so, Qux can safely depend on Bar:2.0, because Foo is no longer depending on Bar: it is using its own copy of the "altered" Bar located in another package."
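The relocation described in that quote is configured through the shade plugin's `<relocations>` section. A minimal pom.xml fragment for Foo (version numbers omitted) might look like:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <relocations>
          <!-- Rewrite Bar's classes (and all references to them in Foo's
               bytecode) from com.bar to com.foo.bar inside the shaded jar. -->
          <relocation>
            <pattern>com.bar</pattern>
            <shadedPattern>com.foo.bar</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

After this, Foo's published jar carries its own renamed copy of Bar:1.0, and Qux's Bar:2.0 on the classpath never collides with it.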
What I've always meant by "microservices" is servers talking to each other by some kind of RPC, e.g. Protobuf at Google (HTTP+JSON works as well if you don't operate on terabytes). Then it doesn't matter what Guava version, or even what programming language, my services are written in, as long as they support the RPC format (e.g. Protobuf).
So I conclude that the original post's author does microservices wrong. Or just doesn't understand what he's talking about. Sad to see it on InfoQ.
I think that's completely impractical and that we should still make heavy use of shared abstractions in modules and libraries.
If I have a routine in two application services, e.g. a currency converter, and that code has to be changed, I have a few options:
With code duplication (Abandon DRY):
1. If the domain demands, I have to change the routine in multiple services and build, test and deploy both of those services in lockstep.
2. If the domain allows, I can change it in one service and build, test and deploy that one service in the short term. The other service will likely need to be updated at some point in the future. This is hard to manage and brings duplication of effort.
Without code duplication (DRY):
3. If the domain demands, I have to change it in one shared library and then test and deploy both of the dependent services in lockstep.
4. If the domain allows, I can change one shared library and build, test and deploy only the one service that I need to change in the short term. The other service will likely need to be deployed at some point in the future.
Shared libraries are orthogonal to the question of loose or tight coupling. We can have no shared libraries but still have two services which are tightly coupled and chatty. That is the situation we need to get away from, and it is achieved through good DDD and service boundaries, but it really doesn't seem to have much to do with shared libraries.
The solution you advocate is correct. Service boundaries need to be well defined and well separated.
Wouldn't versioning help with this?
The API endpoints should be versioned as well. Once all services have been upgraded to use the next version of the API, the previous endpoint can be deprecated.
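A sketch of that versioning scheme: both versions are served side by side until every consumer has moved to v2, at which point v1 can be retired. Route paths and handlers are invented for the example:

```python
routes = {}

def route(path):
    """Register a handler under a versioned path."""
    def register(handler):
        routes[path] = handler
        return handler
    return register

@route("/v1/convert")
def convert_v1(amount, rate):
    return round(amount * rate, 2)            # original behavior, kept alive

@route("/v2/convert")
def convert_v2(amount, rate, fee=0.0):
    return round(amount * rate - fee, 2)      # new feature: an optional fee

# Old clients keep calling v1 unchanged while new clients adopt v2.
assert routes["/v1/convert"](10, 1.1) == 11.0
assert routes["/v2/convert"](10, 1.1, fee=0.5) == 10.5

# Once traffic to /v1/convert drops to zero, delete the v1 handler.
```

The key point is that deprecation is driven by observed usage of the old endpoint, not by a coordinated lockstep release.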
While subjective, for a JVM-based solution stack, Apache Storm appears to be more elegant, complete and efficient than Linkerd, as an example.
Maybe the advantage of microservices vs Storm becomes more apparent with scale, and I simply have not had experience at that level.
You could run the OTP Application, and a bunch of others, locally. You could stick some on a different machine/cluster. Everything is nice and encapsulated and distributable.
The only issue is this would still violate the concept in the OP because it forces everything to either behave like an OTP Application or at a minimum to behave like an Erlang node.
My problem with microservices is that you're often picking subpar solutions/technologies, particularly if low latency is a priority. For example, HTTP is not always the best protocol and Request/Reply is not always the best messaging pattern, but this is what most people equate with microservices (ie HTTP 1.1 REST plus a lightweight list of servers (etcd, ZooKeeper, Consul) or a load balancer).
Thanks for sharing.