I do like microservices, both as a general architecture for new systems and as wrappers around existing legacy applications. They make integration and evolution of old applications with newer technology stacks a lot smoother.
Apologies in advance, this went longer than I anticipated.
I'm in the midst of re-architecting a legacy system whose terribleness is legendary even in Hell.
Given that this system has a fairly small set of nouns, as well as a limited set of verbs that can apply to them, I opted to try to abstract each of these out into its own service. This allows a rolling replace/upgrade cycle in which the old cruft continues running while limiting the scope of any one effort to something less than the Augean stables.
One characteristic of the beshitted legacy system is all manner of action-at-a-distance and an embarrassing lack of code reuse, so even minor changes in business logic involve a nightmare of grepping and hoping you found all the areas that need to change to either support that logic or implement it. To that end I opted to have a message broker for create/update/delete operations that was responsible for distributing such events to other services as business logic dictated. Internally we nicknamed it "Sorta SOA".
As an example workflow: a user is created, and the originating service publishes the event to the broker. The originator can then treat the response as either a pub/sub message, an acknowledged-level message, or an RPC-style message.
The broker gets the message and generates a global transaction ID that can be used to trace all further emitted messages, as well as to handle the final response to the originating service. It ack-responds to the originator with the ID. It then has a logic chain based on the name of the event message and can make calls to other services, such as the communication service that may email or SMS someone. The comm service acks upon successful receipt of the message, then on delivery responds with any result set. The broker receives that result set and checks whether it has completed all tasks for that transaction. If so, it responds to the originator with the results of all tasks (or a defined response using a subset of data from the results).
All services are idempotent and communication is via RabbitMQ to support fabric changes and persistence/guarantees of delivery.
Services themselves are RESTful HTTP API for manipulation of the specific nouns they are in charge of. It's allowed us to separate concerns to a surprising degree and formalize business logic in a single area (the broker) for event-driven behaviors. It also gives us the flexibility to interface with third party services in a manner that was impossible given the previous disastrous code.
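For the curious, the flow described above can be sketched in miniature. This is a toy, single-process stand-in (all names invented; RabbitMQ, acks, and async delivery elided), just to show the transaction-ID fan-out/aggregate idea:

```python
import uuid
from collections import defaultdict

class Broker:
    """Toy stand-in for the RabbitMQ-backed broker: assigns a global
    transaction ID, fans the event out to subscribed services, and
    gathers results until every task for that transaction reports back."""

    def __init__(self):
        self.routes = defaultdict(list)   # event name -> handler services
        self.pending = {}                 # txn id -> outstanding task count
        self.results = defaultdict(list)  # txn id -> collected results

    def subscribe(self, event, handler):
        self.routes[event].append(handler)

    def publish(self, event, payload):
        txn = str(uuid.uuid4())           # global transaction ID for tracing
        handlers = self.routes[event]
        self.pending[txn] = len(handlers)
        for handler in handlers:
            # Synchronous stand-in for async dispatch over the bus.
            self.complete(txn, handler(payload))
        # All tasks done: respond to the originator with the full result set.
        return txn, self.results.pop(txn)

    def complete(self, txn, result):
        self.results[txn].append(result)
        self.pending[txn] -= 1

broker = Broker()
broker.subscribe("user.created", lambda p: f"emailed {p['email']}")
broker.subscribe("user.created", lambda p: f"smsed {p['phone']}")
txn, results = broker.publish("user.created", {"email": "a@b.c", "phone": "555"})
```

In the real thing each `handler` call would be a message over RabbitMQ and `complete` would fire on the reply queue; the aggregation logic is the same.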
I spend my days dealing with just such a system: action-at-a-distance, grepping and hoping you found everything related, and, to which I'll add, every variable declared global. Whenever you see a part of the program that doesn't appear to initialize a variable, it might be (a) a bug, or (b) a dependency on a human operator process that has hopefully caused a different part of the program to have already placed a value there. In the presence of multiple users it becomes a kind of slow-motion race condition.
"The beshitted legacy system" is a more eloquent turn of phrase for it than anything I've yet conjured up, so I salute you.
Oh yes, we have globals on an, erm, "global" basis. As a bonus, this codebase was originally written to be entirely flat-file driven; the database bolt-ons added later for large portions were merely backups for those files. And being dependent on flat files, the code helpfully avoided any use of OS-provided locking and instead implemented its own lockfile scheme: check for the existence of the lockfile; if it doesn't exist, write it and proceed to modify the data file (never appending, just rewriting). If the lockfile did exist, check for its existence 10,000 times, and if it was still there, delete it and proceed anyway. Because counting to 10K takes a long time on a computer. And because nothing could possibly create a lock between the time the code checked for existence and the time it created its own. And because anything another script is doing that takes longer than counting from 1 to 10K clearly isn't worth waiting for, so just overwrite. And that slow script would never overwrite your write. sigh
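For anyone who hasn't seen the race in question, here's a sketch of the difference between that check-then-create "lock" and an atomic one (paths and helper names are made up for the example):

```python
import os
import tempfile

lock = os.path.join(tempfile.mkdtemp(), "data.lock")

def broken_acquire(path):
    # Racy: another process can create the lock between the exists()
    # check and the open() below, and both writers then proceed.
    if not os.path.exists(path):
        open(path, "w").close()
        return True
    return False

def atomic_acquire(path):
    # O_CREAT | O_EXCL makes creation fail if the file already exists,
    # so the check and the create happen in one atomic syscall.
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL)
        os.close(fd)
        return True
    except FileExistsError:
        return False

assert atomic_acquire(lock) is True    # first caller wins
assert atomic_acquire(lock) is False   # second caller is correctly refused
```

(And even this is only a sketch; on a real system you'd want `fcntl.flock` or the database's locking rather than hand-rolled lockfiles at all.)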
The most amusing part of that last bit was presenting concurrency tests showing that any concurrency > 1 ensured corruption (because the C-level who chose the original contractor/employee refused to believe the actual logical argument about its faults) and still having to deal with arguments of "it hasn't happened yet!" My response: "Your house has never burnt down, yet you own insurance. And as an aside, I will note that it has happened, on several occasions. You've just never had to actually solve it, and your prior monkey kept you in the dark about it."
The original system was purchased via contractor work and then maintained/expanded by an in-house employee who later managed two additional developers. That employee then left and the company decided it needed to find an experienced development manager, which is when I was hired.
The first two years were essentially triage efforts to introduce at least some modicum of dependability and scalability while trying to avoid wholesale rewrites. Eventually, after some C-level management changes, along with a couple of years of my insisting that, delusions to the contrary, we did not in fact have any in-house design talent, the company decided to finally address the woeful usability of its product and hired a usability/design firm to create a new front end. Given that most of the original code completely entwined presentation, models, views, and sewage plumbing, this was the opportunity to re-architect, with the caveat that we wanted to limit the scope of that effort to purely customer-facing areas; the homebrewed in-house "CRM" was at least as craptastic as the customer side of things, but it was a much larger codebase, and a ground-up change there was a recipe for disaster.
So in effect we have two systems running in parallel: rational database design and separation of concerns on the customer side, with duplication of data into the older database to minimize impact on the internal stuff. The rolling replacement of the legacy system will then continue, removing a noun and its associated verbs one at a time. Basically performing the old Ship of Theseus trick, except at the end the wooden rowboat will have become a powered steel-hulled yacht.
One thing that people don't talk much about is the network overhead of marshaling/unmarshaling, which adds up really quickly in a microservices architecture. Imagine pulling large amounts of data three layers out, with all that overhead at each hop. While microservices lend themselves to flexible design, you need to be careful about the costs.
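A toy illustration of the point (names invented for the example): every service boundary re-serializes the full payload, so pulling data through three layers marshals and unmarshals it three times, even when the "services" share a box:

```python
import json

marshal_count = 0

def hop(payload):
    """One service boundary: serialize on the way out, parse on the way in."""
    global marshal_count
    wire = json.dumps(payload)   # marshal (plus a network round trip in real life)
    marshal_count += 1
    return json.loads(wire)      # unmarshal on the other side

data = {"rows": list(range(1000))}
for _ in range(3):               # three layers deep
    data = hop(data)
```

Same bytes, serialized three times; with a monolith those three hops would have been three pointer passes.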
Yeah, such an architecture isn't a panacea. As with any pattern you have to apply it to the right kind of problem. Treat every noun as a microservice, especially for reads, and you end up reinventing an RDBMS (poorly) in your code. Neither would you demand that a file copy be implemented using map reduce.
I think one of the pain points of microservices is the infrastructure, for non-enterprise shops or startups. If you think about it, when your application or system is a bunch of highly dynamic services, the infrastructure needs to be dynamic too. Traditional ways of deploying (like one service per server, which will most likely be statically defined) aren't a good fit.
Systems like Mesos, Flynn, and the like make orchestrating these microservice infrastructures substantially more effective. They allow developers to focus on the service rather than on the infrastructure and its underlying dependencies.
Microservice architecture, as described, appears to be SOA principles applied in a particular way (typically to smaller units of functionality than are often considered with SOA), and with less baggage from association with the WS-* standards (which, logically, aren't at all necessary to SOA, but have in many circles become so associated with it that it's hard to say "SOA" without people thinking you mean "WS-*").
Generally speaking, it's the lack of a linear service bus and orchestration.
In a 'traditional' SOA architecture, the workflow is defined as a linear progression from one state to another, where each 'service' mutates (or not) the state of the data.
Determining 'which' services are used/called in the service bus is typically defined in an orchestration.
They definitely have some similarities, however one thing to note about 'Architecture' is that it's often more about ways of thinking about a problem as much as it is about 'solving a specific problem'.
Another helpful comparison might be:
SOA is like GNU Hurd and Micro-Services are like Unix.
You're right. Or maybe saying "I agree with you" is more pragmatic. Recent case in point - I have an app that fetches and saves a lot of data from and to a server. It has pretty heavy security requirements so I split out user registration and login into a separate server app. The client calls it, gets a token and uses it to access data in another service.
A customer just asked for two-factor authN. I can make that change without affecting any of the business services. Getting it wrong means little more than a short-lived denial of service.
In traditional SOA authN would be wrapped up with authR in a service interface or façade, fronting the monolithic app. Not nearly as extensible or flexible.
Netflix is known for open-sourcing a whole suite of infrastructure components that allow running a microservices architecture on AWS (https://github.com/Netflix). These include various configuration management tools, event buses, monitoring solutions, and much more.
To my mind, the most important benefits of a microservice architecture are a clear separation between teams working on different services and the ability to upgrade those services autonomously without stopping or breaking the consumers, which allows for fast iteration. The important point is that you don't just replace services; you introduce new versions and automatically retire old versions once all of the consumers have upgraded to the new ones.
I feel like one of the major benefits is being able to use the right language for the job. I can't be the only one who cringes at the thought of writing a large-scale webapp in JS, but writing my rendering logic in Python on the backend and again in JS for the frontend breaks DRY. With an SOA you can split the backend into an API and have a presentation/routing layer in JS that sits in front of it. This can use something like React to emit plain HTML from the server, and then that same React code can re-bind to the generated DOM on the client (using the exact same backend API!), giving you a single-page-app experience with graceful degradation and fast server-side rendering for nearly no extra development cost. A guy can dream...
> ...communicating with lightweight mechanisms, often an HTTP resource API
HTTP is great and all, but it's not lightweight. I'm curious what happens when each incoming HTTP request from a client cascades into 5-10 HTTP requests inside your "microservices" ecosystem. Does that scale? Granted, these may all take place on the same box, but that still seems wasteful. Then there are the challenges of making sure each piece of the architecture is working correctly, and if it isn't, are you safely handling the error, routing to a known working node, etc.?
Seems to me that if you start writing your app with a service-oriented architecture in mind, then you have the flexibility to make your calls to services either in-memory function calls or HTTP requests/RPCs.
One example case: you write a monolithic Rails app that is structured around services, but all of the services are in-memory service objects. In production you find that your search service is both doing far more work than your other services and getting queried more often. So you refactor your application so that a call to the search service is no longer an in-memory function call but instead an RPC to a group of servers that handle only the search service and have been rewritten in Java/Go/C++ to be blazing fast. Since you wrote your app with services in mind, you probably don't change the API at all; it just becomes a wrapper for an RPC rather than a class in your monolithic app.
This way you don't automatically bulk up on unnecessary, expensive HTTP requests but you maintain the flexibility of optimizing modules of your app for performance.
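A minimal sketch of that refactor, with the RPC transport faked so it stays runnable (all names here are invented for the example):

```python
from abc import ABC, abstractmethod

class SearchService(ABC):
    """The interface callers depend on; it never changes."""
    @abstractmethod
    def search(self, query): ...

class InMemorySearch(SearchService):
    """Original monolith version: a plain in-process call."""
    def __init__(self, docs):
        self.docs = docs
    def search(self, query):
        return [d for d in self.docs if query in d]

class RemoteSearch(SearchService):
    """Later version: same API, but the call goes over the wire."""
    def __init__(self, rpc_call):
        self.rpc_call = rpc_call   # stand-in for an HTTP/RPC client
    def search(self, query):
        return self.rpc_call("search", query)

docs = ["ruby on rails", "go services", "java search"]
local = InMemorySearch(docs)
# The "remote" side here just delegates to the local one to keep the
# sketch self-contained; in production it would hit the dedicated
# search cluster instead.
remote = RemoteSearch(lambda method, q: local.search(q))
assert local.search("search") == remote.search("search")
```

Callers never know which implementation they got, which is exactly why the cutover doesn't ripple through the codebase.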
This is amazingly great advice. Don't go pre-"optimizing" by adding a bunch of needless network calls; you'll only end up with more instability if done naively. DO start from the beginning with clear interfaces and boundaries between concerns in your codebase. Design for services from day one. If you don't have the discipline to avoid tightly coupling objects together, split into multiple codebases (gems, modules, whatever), tested in isolation, that can be deployed together until it's time to pull them apart.
A well designed class interface can be turned into an HTTP (or socket) API easily enough. By the time you need to, you'll probably have the resources to monitor, scale and maintain the additional service properly.
AOP-type architectures work fairly naturally with out-of-band queues (ZeroMQ, RabbitMQ, or Redis pub/sub, off the top of my head), so it's never really difficult to parallelize those requests out to workers that can largely operate independently. HTTP isn't a silver bullet; for the most part it's how public APIs are defined. Within your own infrastructure, though, there are a bunch of other transport options: Protocol Buffers, Thrift, MessagePack, or whatever. But yes, building some kind of hypervisor or FSM to manage error conditions across a cluster, and particularly across datacenters, can get out of hand very quickly.
Do pipes have a good mechanism for communicating over a network in Linux? Genuinely curious. It seems like the benefit of HTTP is that services can live on the same machine, some other machine, or a cluster of machines, and it won't require anything but changing an address in a config file.
No. Pipes are a local only bidirectional data stream meant for inter-process communication. Sockets were created to be the network form of a data stream. You can get data sent over socket connections using things like netcat or ssh-based IO redirection if you want to get a little creative.
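And sockets give you exactly the property the parent asked about. Here's a minimal sketch with both ends in one process on localhost (service logic invented for the example): the client only ever sees a (host, port) address, so moving the service to another machine means changing nothing but that address.

```python
import socket
import threading

def serve(listener):
    """Trivial service: uppercase whatever the client sends."""
    conn, _ = listener.accept()
    conn.sendall(conn.recv(1024).upper())
    conn.close()

listener = socket.socket()
listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
listener.listen(1)
addr = listener.getsockname()     # (host, port): all a client ever needs
threading.Thread(target=serve, args=(listener,), daemon=True).start()

client = socket.create_connection(addr)
client.sendall(b"hello")
reply = client.recv(1024)
client.close()
```

Swap `addr` for another machine's host and port and the client code is unchanged; a pipe can't do that without something like `socat` bridging it onto the network for you.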
If the benefit of a microservice is that you can distribute it across multiple machines then you really just have SOA.
ZeroMQ is a great idea, but the implementation falls far short of the project's stated goals.
I've tried to use it several times but I've run up against limitations built into the design that make it infuriating to use. Mostly this is due to how ZeroMQ goes out of its way to hide the networking details and refuses to expose them even if you need to know.
"One main reason for using services as components (rather than libraries) is that services are independently deployable. If you have an application  that consists of a multiple libraries in a single process, a change to any single component results in having to redeploy the entire application."
Sounds like it's time for some research on why we call something a "component" and yet cannot replace it at runtime. I understand Erlang has some features in this area.
Even if it does mean redeploying the application, I don't see what the problem is. If you're doing continuous delivery, you've already solved the problem of quickly rebuilding and redeploying. If you haven't solved that problem, solve it; it's not actually terrifically difficult.
Simply put: quickly rebuilding and redeploying is good enough for some applications, but standing up a new version of a component seamlessly is necessary (or at least "sufficiently highly desirable to be worth considering in architecture decisions") in others.
It's not that surprising that microservices are a good idea.
It is a fundamental truth that software adopts the structure of the teams that create it. At whatever level you inspect it, microservices/SOA mirror most team structures more closely than any other model.
This makes the architecture easier and more natural for teams of human beings to handle.
I really got into microservices a couple of years ago and even wrote a pretty cool proxy in Node for writing polyglot APIs using microservices. I called the project/protocol Radial (http://radial.retromocha.com), but it never really caught on.
Perhaps it was ahead of its time, or maybe the idea just wasn't very good. I really don't know, but it's cool to see the same ideas popping up again.