I don't have much ideology behind going with microservices vs. monolith, but what we've done on some recent projects is organize our code into modules that only communicate with each other through a narrow and well defined boundary layer. If we need to split a module out into a separate service, then it isn't nearly as much work to split it out later.
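To make the "narrow boundary" idea concrete, here is a minimal sketch of what I mean. All the names (`BillingApi`, `InProcessBilling`, `OrderModule`) are made up for illustration; the point is only that the order code sees an interface, so the in-process implementation could later be swapped for a client that talks to a separate service.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Invoice:
    order_id: str
    amount_cents: int


class BillingApi(Protocol):
    """Hypothetical boundary: the only surface other modules may depend on."""

    def create_invoice(self, order_id: str, amount_cents: int) -> Invoice: ...


class InProcessBilling:
    """Today's implementation: plain in-process code in the same codebase."""

    def __init__(self) -> None:
        self._invoices: dict[str, Invoice] = {}

    def create_invoice(self, order_id: str, amount_cents: int) -> Invoice:
        invoice = Invoice(order_id, amount_cents)
        self._invoices[order_id] = invoice
        return invoice


class OrderModule:
    """Depends on the BillingApi boundary, never on billing internals."""

    def __init__(self, billing: BillingApi) -> None:
        self._billing = billing

    def place_order(self, order_id: str, amount_cents: int) -> Invoice:
        return self._billing.create_invoice(order_id, amount_cents)


orders = OrderModule(InProcessBilling())
invoice = orders.place_order("order-1", 1999)
```

If billing ever moves out, only a new class satisfying `BillingApi` has to be written; `OrderModule` stays as it is.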
One of the practical issues we've had with microservices that need to interact with each other in real time is ensuring a consistent state across systems. For example, let's say I need to change the status of an object and afterwards, call a separate service to change state there as well. What happens if the call fails in some way? You can't just run all of this inside a single database transaction anymore. Now you have to design your code to deal with several potential failure points and edge cases, which adds complexity. The other consideration is all calls to a service should be idempotent if possible. It makes coding from the client side a lot easier if you can just fire off a call multiple times (in case of local or remote failure) and not have to worry about state.
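A rough sketch of the idempotency point, assuming the client generates one key per logical operation and the service remembers results by that key. `PaymentService`, `call_with_retries` and the key scheme are all hypothetical, not any particular library's API.

```python
import uuid


class PaymentService:
    """Stand-in for the remote service; remembers results by idempotency key."""

    def __init__(self) -> None:
        self._results: dict[str, str] = {}
        self.charges_applied = 0

    def charge(self, idempotency_key: str, amount_cents: int) -> str:
        # A repeated key returns the stored result instead of charging again.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.charges_applied += 1
        receipt = f"receipt-{self.charges_applied}"
        self._results[idempotency_key] = receipt
        return receipt


def call_with_retries(operation, attempts: int = 3):
    """Retry on connection errors; only safe because the call is idempotent."""
    last_error = None
    for _ in range(attempts):
        try:
            return operation()
        except ConnectionError as error:
            last_error = error
    raise last_error


service = PaymentService()
key = str(uuid.uuid4())  # generated once, reused on every retry
failures = iter([True, False])  # first response is "lost", second arrives


def flaky_charge() -> str:
    receipt = service.charge(key, 500)
    if next(failures):
        # Simulate the response being lost after the service did the work.
        raise ConnectionError("response lost")
    return receipt


receipt = call_with_retries(flaky_charge)
```

The client retries blindly, yet the charge is applied exactly once, which is the property that makes the client code simple.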
Just some of my thoughts, since this stuff has been on my plate recently.
Transactions certainly make it easier to maintain global consistency, but one possible contradiction with the above is that if your modules are sharing transactions, then your boundaries are no longer narrow and well defined. By definition, your entire database and all its internal workings are now part of the interface.
This is one problem that I've observed with all the monoliths I've worked on. Because modules are colocated and sharing a database is easy, eventually somebody will do it (even if it wasn't originally intended), and you get lots of ostensibly modular code intertwined with other modules in non-obvious and subtly problematic ways.
I've heard this exact argument at work in favor of microservices. I honestly think it's a lazy man's cop out to say it makes sense to insert a network boundary because it's just assumed somebody's going to violate a module boundary. That's a huge tax to pay for being lazy. If you lack discipline it's going to show up no matter how you decide to distribute your complexity. From what I'm seeing, the idea that "devops" is going to offset the increased complexity of network boundaries as interfaces is going to be the downfall of a great many microservice-based implementations that simply didn't need to take on that burden and risk. I suspect that a hybrid approach is going to end up being the right solution for a great many companies: one or more well-factored monoliths with common shared libraries, plus orthogonal services that each of them uses where it makes sense for something to be a service rather than a shared library.
I take it you've never worked on a monolith project before. There are always reasons (often deadline related, always legitimate, never as a result of laziness or lack of discipline) where developers have been forced to cross module boundaries.
Nobody arrives at microservices unaware of their complexity and complications. Developers are forced to choose that approach because monoliths have their own issues.
Also common shared libraries are a disastrous idea IMHO. They always become riddled with stateful business logic and the difference in requirements between their consumers means they end up brittle, inelegant and full of hacks.
Actually I've worked on a great many monolith projects over the past 20 years, including a J2EE server (WebLogic). And yes that was a huge monolith that had technical debt that needed addressing. When I left there was this future "modularity" project that I'm certain didn't involve introducing network boundaries between the servlet, EJB, JCA containers, etc. to achieve that modularity. And I can tell you that there was a definite effort to enforce interface/package boundaries between the various server components. If you introduced an "illegal" dependency you were going to be talked to about removing it. I want to say that we actually had the build breaking on such infractions after I'd moved to product management.
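For what it's worth, a build check like that doesn't have to be fancy. This is not how we did it back then, just a minimal sketch of the idea: a hypothetical rule table (`ALLOWED`) saying which project modules each module may import, and a function a build step could call to fail on anything else.

```python
import ast

# Hypothetical rule table: which project modules each module may import.
ALLOWED = {
    "orders": {"billing_api"},
    "billing": set(),
}


def illegal_imports(module: str, source: str) -> list[str]:
    """Return imports of other project modules that the rule table forbids."""
    project_modules = set(ALLOWED) | {d for deps in ALLOWED.values() for d in deps}
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            names = [node.module]
        else:
            continue
        for name in names:
            top = name.split(".")[0]
            # Third-party and standard library imports are ignored here.
            if top in project_modules and top != module and top not in ALLOWED[module]:
                found.append(name)
    return found


orders_source = (
    "import json\n"
    "from billing_api import create_invoice\n"
    "import billing.internal.tables\n"
)
violations = illegal_imports("orders", orders_source)
```

A build script would exit non-zero whenever `violations` is non-empty; here it flags only the reach into `billing.internal.tables`.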
The point I was making is exactly what you're talking about. That sometimes deadlines force bad architectural choices and that's called technical debt. The laziness and lack of discipline comes with failing to acknowledge and address that debt in the future. As best I can tell people think that microservices are going to solve that problem and I'm saying they won't and that there's not enough thought going into the price of network.
Just like in real life when it comes to being in shape. Diet and exercise. There's no silver bullet there either and it seems to me as though microservices are the fad diet of the current tech cycle.
I realize this stuff isn't cut and dried and easy. If it were then none of us would have well paying jobs to figure out when to use what tool for what job. There's a time and a place for all solutions but I'm seeing the same groupthink I saw back when everyone was purchasing Sun, Oracle, and WebLogic for sites that didn't and would never need those tools. This is EJB all over again as best I can tell.
As far as your shared libraries comment goes, you wouldn't consider having any shared libraries ever? What you're describing are permutations of a shared library that either need to be addressed by the shared library's design by adapting or splitting into multiple libraries. I'd be interested in learning what you do instead of sharing components? Duplicating everywhere?
> Nobody arrives at microservices unaware of their complexity and complications.
I think the opposite is often true. There are developers that buy into microservice architecture without a full understanding of the complexities involved.
> There are always reasons (often deadline related, always legitimate, never as a result of laziness or lack of discipline) where developers have been forced to cross module boundaries.
Those are still a lack of organisational discipline.
Your argument is, in effect: In a modular monolith it is technically feasible to violate the stated architecture. Some organisations will choose to take that option for good reasons, therefore we should make it so it is no longer technically feasible to do so.
I've seen that too. I've done that too.
But let's call it what it is: It's choosing to implement the more complex technical design in order to remove options that you don't want to be available, because if they are available then someone will override your architectural decisions for short-term commercial reasons. And you don't want them to have that choice.
So, either
(a) the organisation is prone to making short term decisions with undesirable long-term consequences, and has collectively decided that they can't trust themselves to stop, and need technical constraints in place.
or
(b) the organisation is prone to making short term decisions with long-term consequences that the technical team don't like, and the technical team has decided that since they can't convince the organisation to stop, they need to put technical constraints in place.
One more point is that in your monolith you actually have the option of crossing module boundaries. You don't have that luxury with microservices unless you want to introduce XA (God bless you). So you better get your boundaries right. :)
>I honestly think it's a lazy man's cop out to say it makes sense to insert a network boundary because it's just assumed somebody's going to violate a module boundary.
It's not a lazy cop out when the comment he's replying to is the exact example in question.
It helps that a network boundary typically becomes a responsibility boundary (different teams)... in the end disciplined devs will make it work anywhere, anyhow.
It's a matter of making it work with the cards you have.
I agree about this danger. But then again, nests of services can also grow tangled dependencies, now in the form of RPC calls.
It's a general problem in software: adding a dependency without cleaning up (or even becoming aware of) its effects on the dep-graph is often the quickest way to solve today's problem. You then pay for it over the rest of the life of the project.
What happens with monoliths is they tangle at all levels. How many times have you seen a giant "Utils" module which initially contained stateless StringUtils and similar classes then devolve into a dumping ground for stateful business logic?
Or, especially with JVM applications, how many times do library dependencies for one part of the codebase end up causing issues with another? That's a big bonus with microservices, i.e. being able to manage third-party dependencies better.
For a process that is inherently sequentially dependent on previous results, how is this not a transaction other than declaring it not so or 'dropping the outcome all over the floor' if there's a problem?
It's kind of like saying, "You're not allowed to have this problem; you'd be better off if you had some other problem, like the one I have here."
Unless you're just saying to "grow the boundary" until all parts of the sequentially dependent process are inside the boundary? (This may be tricky to deal with the more external systems there are that cannot be "internalized".)
* I am not saying you need distributed transactions - just that some processes cannot easily be encapsulated as "atomic" operations.
We've used microservices for around 6-7 years now. One thing we realized quite early was that letting each microservice store state "silos" independently was a bad idea. You run into the synchronization issue you describe.
Instead, we've moved the state to a central, distributed store that everyone talks to. This allows you to do atomic transactions. Our store also handles fine-grained permissions, so your auth token decides what you're allowed to read and write.
One non-obvious consequence is that some microservices now can be eliminated entirely, because their API was previously entirely about CRUD. Or they can be reduced to a mere policy callback -- for example, let's say the app is a comment system that allows editing your comment, but only within 5 minutes. ACLs cannot express this, so to accomplish this we have the store invoke a callback to the "owner" microservice, which can then accept or reject the change.
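Roughly, the callback idea looks like the sketch below. This is a toy illustration, not our actual store or its API: `Store`, `register_policy` and the object layout are invented, and in the real system the callback is a call out to the owning microservice rather than a local function.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable

# A policy receives the stored object, the proposed change, and the current
# time, and returns True to accept the write.
Policy = Callable[[dict, dict, datetime], bool]


class Store:
    """Toy central store that asks the owner's policy before applying writes."""

    def __init__(self) -> None:
        self._objects: dict[str, dict] = {}
        self._policies: dict[str, Policy] = {}

    def register_policy(self, type_name: str, policy: Policy) -> None:
        self._policies[type_name] = policy

    def create(self, object_id: str, obj: dict) -> None:
        self._objects[object_id] = dict(obj)

    def get(self, object_id: str) -> dict:
        return dict(self._objects[object_id])

    def update(self, object_id: str, change: dict, now: datetime) -> bool:
        current = self._objects[object_id]
        policy = self._policies.get(current["type"])
        if policy is not None and not policy(current, change, now):
            return False  # the owner rejected the change
        current.update(change)
        return True


def comment_edit_policy(current: dict, change: dict, now: datetime) -> bool:
    """Owner-side rule: comments are editable for 5 minutes after creation."""
    return now - current["created_at"] <= timedelta(minutes=5)


store = Store()
store.register_policy("comments.Comment", comment_edit_policy)
posted = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
store.create("c1", {"type": "comments.Comment", "text": "frist", "created_at": posted})

early = store.update("c1", {"text": "first"}, now=posted + timedelta(minutes=2))
late = store.update("c1", {"text": "too late"}, now=posted + timedelta(minutes=10))
```

The edit at two minutes is accepted and the one at ten minutes is rejected, and the comment service itself never has to expose an edit endpoint.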
Another consequence is that by turning the data store into a first-class service, many APIs can be expressed as data, similar to the command pattern. For example, imagine a job system. Clients request work to be done by creating jobs. This would previously be done by POSTing a job to something like /api/jobs. Instead, in the new scheme a client just creates a job in the data store. Then the job system simply watches the store for new job objects.
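As a sketch of "APIs expressed as data" (again a toy, with invented names like `WatchableStore` and `watch`, and with synchronous callbacks where a real changefeed would be asynchronous): the client only creates an object, and the job system reacts to it.

```python
from typing import Callable

Watcher = Callable[[str, dict], None]


class WatchableStore:
    """Toy store that notifies watchers when an object of their type appears."""

    def __init__(self) -> None:
        self._objects: dict[str, dict] = {}
        self._watchers: list[tuple[str, Watcher]] = []

    def watch(self, type_name: str, on_create: Watcher) -> None:
        self._watchers.append((type_name, on_create))

    def create(self, object_id: str, obj: dict) -> None:
        self._objects[object_id] = obj
        for type_name, on_create in self._watchers:
            if obj.get("type") == type_name:
                on_create(object_id, obj)


completed: list[tuple[str, str]] = []


def run_job(job_id: str, job: dict) -> None:
    # The job system's only "API" is the shape of the job object.
    completed.append((job_id, job["task"]))


store = WatchableStore()
store.watch("jobapp.Job", run_job)

# A client requests work by writing data, not by POSTing to /api/jobs.
store.create("doc-7", {"type": "docs.Document", "body": "hello"})
store.create("job-1", {"type": "jobapp.Job", "task": "analyze-text", "document_id": "doc-7"})
```

Only the `jobapp.Job` object triggers the job system; the document write passes through unnoticed.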
Of course, this way of doing things comes with its own challenges. For example, how do you query the data, and how do you enforce schemas? We solved some of these things in a rather ad hoc way that we were not entirely happy with. For example, we didn't have joins, or a schema language.
So about a year ago we went back to the drawing board and started building our next-generation data store, which builds in and codifies a bunch of the patterns we have figured out while using our previous store. It has schemas (optional/gradual typing), joins, permissions, changefeeds and lots of other goodies. It's looking extremely promising, and already forms the foundation of a commercial SaaS product.
This new store will be open source. Please feel free to drop me an email if you're interested in being notified when it's generally available.
And how is that central 'data store service' different from a single 'database service' (RDBMS or NoSQL, CRUD) that all microservices connect to and run their select/insert/update/delete/CRUD ops against?
Other than api - rest vs whatever binary rpc protocol, it sounds very much like a standard database...
The difference may seem subtle, but I'd argue that it is a whole other paradigm. It's one of those things that you either get, or you don't, but it might take some time to fully appreciate.
First of all, we're not an RDBMS, and don't pretend to be. I love the relational model, but there's a long-standing impedance mismatch between it and web apps that I won't go into here. There are clearly pros and cons. Our data store isn't intended as a replacement for classical relational OLTP RDBMS workflows.
If you let all apps share a single RDBMS, you're inevitably going to be tempted to put app-specific stuff in your database. This one app needs a queue-like mechanism, this other app needs some kind of atomic counter support, etc. You may even create completely app-specific tables. How do you compartmentalize anything? How do you avoid forcing different versions of apps to stick to the same strict schema? How do you incrementally upgrade your schemas without taking down all apps? How do you create denormalized changefeeds that encompass the data of all apps? How do you institute systemwide policies like role-based ACLs, without writing a layer in stored procedures and triggers that everything goes through? Etc. There are tons of things that are difficult to do with SQL, even with stored procedures.
I would argue that if you go down that route, you'll inevitably reinvent the "central data store pattern", but poorly.
The issue with a centralized data store is that your services are coupled together by the schemas of the objects that they share with other services. This means you can't refactor the persistence layer of your service without affecting other services.
All that said, a single source of truth does do away with distributed transactions, so I can see the appeal.
He seems to come at it from a slightly different angle, and I can see how his scenario isn't a good idea.
It's worth pointing out that you do have the same challenge in a siloed scenario, but the "bounded contexts" are separated by the applications themselves, with no chance of tight coupling because there's no way to tightly couple anything. In the silo version, apps can still point at each other's data (e.g. reference an ID in another app); there's just no way of guaranteeing that the data is consistent.
The coupling challenge is solved by design -- by avoiding designing yourself into tight couplings.
For example, let's say you desire every object to have an "owner", pointing at the user that "owns" the object. So you define a schema for User, and then every object points to its owner User. But now all apps are tightly coupled together.
In our apps, we typically don't intertwine schemas like that unless there's a clear sense of cross-cutting. An "owner" field would probably point to an object within the app's own schema: A "todoapp.Project" object can point its "owner" field at a "todoapp.User", whereas a "musicapp.PlaylistItem" can point to a "musicapp.User".
(Sometimes you do have clear cross-cutting concerns. An example is a scheduled job to analyze text. The job object contains the ID of the document to analyze. The job object is of type "jobapp.Job". The "document_id" field can point to any object in the store. The job doesn't care what the document is -- all it cares about is that it has fields containing text that can be analyzed. So there's no tight coupling of schemas at all, only of data.)
However... I have played with the idea of a "data interface" concept. Like a Java or Go interface, it would be a type that expresses an abstract thing. So for example, todoapp could define an interface "User" that says it must have a name and an email address. Now in the schema for todoapp.TodoItem you declare the "owner" field as type "User". But it's an interface, not a concrete type. So now we can assign anything that "complies with" the interface. If todoapp.User has "name" and "email", we can assign that to the owner, and if musicapp.User also has "name" and "email" with the right types, it is also compatible. But I can't assign, say, accountingsystem.User because it has "firstName", "lastName" and "email", which are not compatible.
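To show what I mean by "complies with", here is a tiny sketch. It is only the idea, not something the store implements: schemas are modelled as plain dicts from field name to type, and `complies_with` is a hypothetical helper.

```python
def complies_with(interface: dict[str, type], schema: dict[str, type]) -> bool:
    """A schema complies if it has every interface field with the same type."""
    return all(schema.get(field) is expected for field, expected in interface.items())


# Hypothetical interface declared by todoapp.
USER_INTERFACE = {"name": str, "email": str}

# Hypothetical concrete schemas from three apps.
todoapp_user = {"name": str, "email": str, "theme": str}
musicapp_user = {"name": str, "email": str}
accounting_user = {"firstName": str, "lastName": str, "email": str}

todo_ok = complies_with(USER_INTERFACE, todoapp_user)
music_ok = complies_with(USER_INTERFACE, musicapp_user)
accounting_ok = complies_with(USER_INTERFACE, accounting_user)
```

Extra fields don't matter, so both `todoapp.User` and `musicapp.User` qualify, while `accountingsystem.User` does not because it has no `name` field.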
I think the point was not so much about an RDBMS. From an architectural point of view, you have a central data thingy that is similar to a central RDBMS data thingy, in that all parts have to point at the central thingy for their data needs. It's not so much about the pros and cons of an RDBMS.
Isolated, internal state is part of the definition of microservices. It is not hard to find articles that assert this. Taking that away may yield a better result, but you no longer have "microservices", you have something else
I don't have a specific definition of "microservices", nor do I think anyone does.
The central data store pattern arguably makes apps even more "micro", albeit at the expense of adding a dependency on the store. But the opposite pattern is to let each microservice have its own datastore, so you already have a dependency there.
It's just moving it out and inverting the API in the process; for many apps, the data store becomes the API. For example, we have an older microservice that manages users, organizations (think Github orgs, but hierarchical) and users' membership in those orgs. It has its own little Postgres database, and every API call is some very basic CRUD operation. We haven't rewritten this app to use our new data store yet, but when we do, the entire app goes away, because it turns out it was just a glorified gateway to SQL. A verb such as "create an organization" or "add member to organization" now becomes a mere data store call that other apps can perform directly, without needing a microservice to go through.
The users/organizations app example sounds like the app is self-contained anyway, I don't see much difference between the app having its own data store and the centralized data store service, it's just which service to call for upstream. What would you gain by moving from the app's own store to the central store service and eliminating the app?
It doesn't just eliminate that app. It generally means no app needs any CRUD except to encapsulate business logic.
Secondly, all data operations can now be expressed using the one, canonical data store API, with its rich support for queries, joins, fine-grained patching, changefeeds, permissions, etc. Every little microservice doesn't need to reinvent its own REST API.
For example: The users/org app has a way to list all users, list all organizations, list all memberships in an organization, etc. Every app needs to provide all the necessary routes into the data in a RESTful way:
/organizations # All orgs
/organizations/123 # One org
/organizations/123/members # Members of one org
/organizations/123/members?status=pending # Members of one org that have pending invites
/users/42 # One user
/users/42/organizations # One user's memberships
etc.
A client that wants to query this app must first pick which silo to access, then invoke these APIs individually. The way that the querying is done is silo-specific, and the verbs only provide the access patterns the app thinks you want. What if you want to filter invites not just by status, but also by time? Every app must reinvent every permutation of possible access patterns. REST is pretty exhausting that way. GraphQL is a huge improvement, but doesn't really fix the silo problem.
With our new store, a client just invokes:
/query?q=*[is "orgapp.member" &&
organization._ref == "123"
&& status == "pending"]
(Yes, we did invent our own query language. We think it was necessary and not too crazy.)
Or indeed:
/watch?q=*[is "orgapp.organization"]
Now the client gets a stream of new/updated/deleted organizations as they happen.
First, I think when you talk about all the things you've built into the canonical data store (CDS), why can't some of these be decomposed services in their own right? Permissions would be a valuable service to decouple from CDS, for example.
Second, what are the constraints of CDS? How much data can I pack into a single object? silo? How does bad behavior on the part of one caller affect another? What if CDS just doesn't work for a new service you're building?
I do appreciate that your company has invested in providing data storage as a service for yourselves, which I think is a much better idea than having each team rolling their own persistence. However, I think people would be very interested in how you've made sure that CDS isn't a SPOF for all of your data, as well as what kinds of things it isn't good at.
EDIT: I would also point out that there is a difference between having a single CDS and having StorageaaS that vends CDS's.
Our old "1.0" store architecture did in fact decompose things into multiple services. It had a separate ACL microservice that every microservice had to consult in order to perform permission checks. That was a really bad, stupid bottleneck.
For our new architecture, we decided to move things into a single integrated, opinionated package that's operationally simpler to deploy and run and reason about. It's also highly focused and intended for composition: The permission system, for example, is intentionally kept simple to avoid it blooming into some kind of all-encompassing rule engine; it only cares about data access, and doesn't even have things like IP ACLs or predicate-based conditionals. The idea is that if you need to build something complicated, you would generate ACLs programmatically, and use callbacks to implement policies outside of the store (the "comments only editable for 5 minutes" is an example of this), and maybe someday we'll move the entire permission system into a plugin so you can replace it with something else.
It's also important to note that the store isn't the Data Store To End All Data Stores. It covers a fairly broad range of use cases (documents, entity graphs, configuration, analytics), but it's not ideal for all use cases. There are plenty of use cases where you'll want some kind of SQL database.
> let's say the app is a comment system that allows editing your comment, but only within 5 minutes. ACLs cannot express this, so to accomplish this we have the store invoke a callback to the "owner" microservice, which can then accept or reject the change.
I think these kinds of access control rules can be expressed within an entitlement solution. These systems are often called RBAC+ABAC (role-based access control + attribute-based access control). The caller calls a PDP (policy decision point). The policy decision point is a rules engine that can take in the caller's application context (which, in your case, will include the current time and the time of the initial post).
A PDP is often implemented as a microservice, or even as a cache-enabled rules engine that, as an API, resides within the context of every caller (for a faster, lower-latency, more resilient solution).
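A very small sketch of the PDP shape, assuming an in-process rules engine. The names (`Request`, `PolicyDecisionPoint`, the two rules) are invented for illustration and don't correspond to any specific entitlement product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable


@dataclass
class Request:
    role: str          # the RBAC part
    action: str
    attributes: dict   # the ABAC part: caller-supplied application context


Rule = Callable[[Request], bool]


class PolicyDecisionPoint:
    """Permits a request if any rule matches; denies by default."""

    def __init__(self, rules: list[Rule]) -> None:
        self._rules = rules

    def is_permitted(self, request: Request) -> bool:
        return any(rule(request) for rule in self._rules)


def moderators_may_edit(request: Request) -> bool:
    return request.role == "moderator" and request.action == "edit_comment"


def authors_may_edit_briefly(request: Request) -> bool:
    if request.role != "author" or request.action != "edit_comment":
        return False
    age = request.attributes["now"] - request.attributes["posted_at"]
    return age <= timedelta(minutes=5)


pdp = PolicyDecisionPoint([moderators_may_edit, authors_may_edit_briefly])
posted_at = datetime(2024, 1, 1, 12, 0)

fresh = Request("author", "edit_comment",
                {"now": posted_at + timedelta(minutes=3), "posted_at": posted_at})
stale = Request("author", "edit_comment",
                {"now": posted_at + timedelta(minutes=30), "posted_at": posted_at})

fresh_allowed = pdp.is_permitted(fresh)
stale_allowed = pdp.is_permitted(stale)
```

The time-based rule lives in the policy set rather than in the data store or the comment service, which is the trade-off being proposed.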
Idempotency would be nice, but it is often impossible to have at all layers. Eventually, at some point, you deal with stateful microservices and distributed transactions. Depending on how long the transactions take, either two-phase commits or compensating transactions are needed to roll back or restore state when failures happen. And that is not trivial to implement and complicates your system further.
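For the compensation approach, the basic shape is something like this sketch: each step is paired with an undo, and on failure the undos for the completed steps run in reverse order. `run_saga` and the step names are hypothetical, and the sketch ignores the hard parts (compensations that fail, crashes mid-way, persistence of progress).

```python
from typing import Callable

# Each step is an (action, compensation) pair.
Step = tuple[Callable[[], None], Callable[[], None]]


def run_saga(steps: list[Step]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed: list[Callable[[], None]] = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensate)
    return True


log: list[str] = []


def reserve_stock() -> None:
    log.append("reserve stock")


def release_stock() -> None:
    log.append("release stock")


def charge_card() -> None:
    log.append("charge card")


def refund_card() -> None:
    log.append("refund card")


def book_shipping() -> None:
    raise RuntimeError("shipping service unavailable")


def cancel_shipping() -> None:
    log.append("cancel shipping")


ok = run_saga([
    (reserve_stock, release_stock),
    (charge_card, refund_card),
    (book_shipping, cancel_shipping),
])
```

When the third step fails, the card is refunded and the stock released, in that order; the failed step's own compensation is never run because the step never completed.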
Stable and well-defined interfaces between microservices are another luxury that is hard to have in reality, especially when business and application logic constantly evolve. More often than not, it's inevitable that you juggle multiple services to fulfill a need, which takes much more time and effort, and carries more risk, than with a monolith.
While there are plenty of stateful services out there (somebody has to store your data or spin up your VM, after all), I think you'd be surprised at how few, if any, of them require distributed transactions. A lot of problems that appear to require distributed transactions can be solved by more efficient routing as well as more thoughtful approaches to state.
I would also say it's easier to define good, growable, and sustainable APIs when you really think about what the primitives of your service are. Try to avoid baking a lot of opinion into your APIs, which lets your consumers own most of their business and application logic. If you do have a need to embed new business logic into your APIs, think about how you can preserve the current default as well as provide extension points instead of one-off changes.
Hmm, my ideology doesn't say whether microservices are better than "monoliths". But it does roll its eyes when it sees people mistake encapsulating things into modules for the particular technology that helps you do that.
I mean, when OOP was new, people talked as if (a) no one had been trying to separate out modules before OOP, and (b) the class was the natural boundary between modules. Both are false.
BTW: what does that article mean about that [in] "Java 9 a native module system is added..."; presumably this is something distinct from the package system it always had. What are the differences?
Author here. Java 9 adds a new module system where module descriptors are introduced to explicitly demarcate the public API of a module, and to express its dependencies on other modules. Example of a module descriptor:
What happens is that every package except the exported ones is inaccessible to other modules. Non-exported packages are encapsulated; not even reflection can break through that barrier. The requires statements are used by the Java compiler and runtime to verify that the current configuration of modules resolves correctly.
In short, Java makes a great step forward wrt. modularity. When regular JARs transition to modular JARs (adding a module descriptor), many more checks and balances are in place than are currently possible with the classpath.
Java packages are namespaces for code. Java modules are deployable bundles of code. The units of deployment (JAR, WAR, EAR files) pre-Java 9 do not have a consistent model of specifying dependencies and exports needed at runtime.
In other words, packages help the compiler, but give no information to the runtime or deployment about how, when, and from where to deploy or load code.
That's a good point, transactions are hard in the micro-service world. In my experience it's usually possible to redesign the architecture to encapsulate a transaction inside one micro-service. If you have a transaction across multiple micro-services, are they really decoupled?
There's a similar problem with asynchronous requests, e.g. using RabbitMQ: it only works with well-designed boundaries, as it's hard to control the state of a request if you do everything asynchronously.
Anyway, micro-services are not a perfect solution for everything, but despite these problems I love working with them!
Great observation, and exactly what I meant with the modular alternative in this article. When you can 'get away' with such an architecture, it makes a lot of things simpler.
One solution is to use a distributed key value store (like etcd) to coordinate tasks. Kubernetes does that (AFAIK); you should definitely check that out.
I know microservices are successful in many organizations but one downside I've experienced from the microservices hype is starting an application with microservices.
It's very difficult to get the system boundaries correct while you're still iterating on core functionality, and if you get it wrong you're in a world of pain. Refactoring becomes very hard and performance suffers from unnecessary network overhead. Deployment is harder. Coordination is harder. Developers can't run the whole system locally. Testing is harder. Basically, if you don't have a very well defined interface between components, it's going to hurt.
I would not recommend starting with a microservices architecture. Build a modular, well-factored application and split out pieces if they need to be scaled separately or there are other compelling benefits of a microservice.
To quote Kris Jenkins:
This is your return type: Int
This is your return type on microservices: IO (Logger (Either HttpError Int))
I agree. In the system I'm currently working on, they can't. I think it would be possible to implement, but no one has been able to take the time (or, perhaps sees the value in it).
(OTOH, I kind of doubt developers at Google or Facebook run the whole thing locally, so there must be some kind of end state for this.)
I actually disagree; developers don't need to run the whole codebase in a local env. In my company we use a development Docker cluster where we keep instances of all of our micro-services, and they are exposed (via VPN) to the outside world so you can call them by domain. When you work on logic that, e.g., would affect 2 micro-services, you can just set up those 2 and make remote requests to the dev env for everything else. I don't see any reason why you should run all of them on your computer.
Yeah that's a good point. When the system is too big there's no way you can run it all locally. The angle I'm coming from is an application that's fairly new and still small and is changing rapidly with lots of cross-cutting changes. We're paying some heavy microservices taxes with development, testing, deployment, and performance, but not at a point where we see benefits yet (IMHO).
If the alternative to microservices is a monolith, and you can run the monolith locally, then logically microservices can also be run locally. If it's difficult to run all the microservices locally then that's just a sign of weak tooling.
Why does technical overhead take a backseat whenever a microservices vs monolith discussion comes up?
Yes, in a perfect world every org would have sufficient time and engineering resources to implement microservices for better scalability and code quality. In the real world, setting up and maintaining microservices has huge technical overhead, I'd estimate double that of the equivalent monolithic architecture.
If your company isn't flush with cash and the product you're building will never need massive scaling then it makes no sense to use microservices, at least from a business perspective.
There are many business reasons to use micro-services, not only technical ones. In our org we could hire developers fast, as we can use multiple programming languages (we have a few approved stacks, as micro-services do not share a code base).
It allows us to split our teams into small agile units that can, e.g., deploy independently.
Getting new developers on board takes less time, as they work on a few small services and do not need to be aware of the whole code base (at the beginning).
Yes, I agree there is technical overhead: proper CI/CD servers are required, and Docker or, e.g., Vagrant is a must with micro-services. But counting all the benefits, I wouldn't say that we lose 2x more time than we would with a monolithic architecture.
"Why does technical overhead take a backseat whenever a microservices vs monolith discussion comes up?"
I don't think it does, but that's kind of off topic.
"In the real world, setting up and maintaining microservices has huge technical overhead, I'd estimate double that of the equivalent monolithic architecture."
What a ridiculous statement. Splitting a single process into N processes with no shared memory multiplies startup memory by N. It takes a much beefier system to run 8 JVMs than one. The same goes for 8 Python processes. Matters are worse on a VM, which you are probably using, since basically every shop out there issues developers a Windows or OS X machine. Or maybe you don't have "weak tooling," as you put it, and everything's containerized, using even more resources.
"Splitting a single process into N processes with no shared memory multiplies startup memory by N. It takes a much beefier system to run 8 JVMs than one. The same goes for 8 Python processes."
I agree, but I haven't seen the amount of memory ever being a limiting factor when running multiple microservices on a local developer machine. I'd estimate that you can run at least 50 JVM, Python and Node.js processes on a typical single machine, and most applications consisting of microservices have 50 or fewer microservices.
> I'd estimate that you can run at least 50 JVM, Python and Node.js processes on a typical single machine
Unless someone has convinced you that Spring is the right way to build microservices, in which case you're going to need a gigabyte per instance, and most people won't be able to run 50 on a typical machine. I worked on a project like that, and our beefy iMacs really laboured to bring up ten services.
It's not just the runtime. Each Python process will redundantly import modules. The same goes for libraries in a Java project. It definitely adds up, and I have seen memory usage be an issue, particularly on VMs.
Yes, the tooling is weak. Now, let's convince the engineering managers to spend a bunch of time building tooling to start everything together and fix service discovery on localhost. Turns out, they'd rather build features.
I'm with you 100%. It drives me crazy, but right now you cannot run multiple services locally without a bunch of fiddling.
It's not impossible. As a sibling comment says, it usually just requires a bunch of ad-hoc scripts to make sure everything is running. This does get increasingly complex, especially as microservices are written with different stacks, but that's part of the tradeoff.
It shouldn't be too hard to set up some common code for making service calls. If the service isn't responding, the caller compiles and starts it, assuming some env variable is set to signal it's a dev machine. I've done this for my project.
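A minimal sketch of what such shared call code might look like, in Python. Everything here is an assumption for illustration: the `DEV_MACHINE` variable name, the `ensure_running` helper, and the idea that a single command both builds and starts the service are mine, not the commenter's actual setup.

```python
import os
import subprocess
import time
import urllib.error
import urllib.request


def is_up(url, timeout=0.5):
    """Return True if the service answers at all (even with an HTTP error)."""
    try:
        urllib.request.urlopen(url, timeout=timeout)
        return True
    except urllib.error.HTTPError:
        return True  # the service responded, just not with a 2xx status
    except OSError:
        return False  # connection refused, timeout, DNS failure, ...


def ensure_running(url, start_cmd, probe=is_up, launch=subprocess.Popen,
                   retries=20, delay=0.5, env=os.environ):
    """Start the service locally if it is down and DEV_MACHINE=1 is set."""
    if probe(url):
        return True
    if env.get("DEV_MACHINE") != "1":
        return False  # never auto-start anything outside a dev machine
    launch(start_cmd)  # hypothetical build-and-run command for the service
    for _ in range(retries):
        if probe(url):
            return True
        time.sleep(delay)
    return False
```

A caller would run something like `ensure_running("http://localhost:8081/health", ["make", "run"])` before its first request; the URL and command are placeholders. The probe and launcher are parameters so the logic can be exercised without a real service.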
Beginning in 2003, and until the height of the SOA craze around 2010, the Service Component Architecture (SCA) was seen as the holy grail of service integration. It encompassed both local/in-process (or in-JVM) as well as networked services (SOAP and REST), was polyglot in that it defined API bindings for native, Java, and even PHP and Cobol, could access external services, and was still quite practical. For those needing it, it also supported authorization and transaction policies and protocols.
When commercial interest in SOA middleware products dropped sharply, further standardization of version 1.1 slowed down, and Oracle, sitting on the SCA board, voted down all specs that had been worked on without further explanation.
To this day I still haven't understood what makes microservices different from SOA in a technical sense. I can get that the term SOA was probably burnt at some point, but if there's a real lesson to be learned from SOA failures, I'd really like to know. Maybe SOA was seen as too complex because it addressed some of the harder problems such as transaction and authorization protocols/boundaries, BPM, etc. upfront?
For a while I was consulting a lot around SOA - particularly around the IBM stack, which was one of the biggest SOA evangelists in the early days.
SOA was too complex for sure, but it wasn't just that it was complex. It's that complexity didn't actually deliver the return.
Part of the issue was the way SOA was sold. You buy a BPM or a Rules Engine and suddenly you unlock all this value and you can compose things on the fly -- Rules Engines were particularly bad in this respect. Business users were told they'd be able to tweak rules on the fly. This was never a reality.
Then you get outright hits in terms of complexity. Those BPM systems with two-phase commit were monsters. In fact they were so complex that they were _more_ likely to fail, when in the vast majority of cases actual failures were rare and easily handled by a reconciliation process. So the tech never really matched the need.
On top of that, the consultants cost a fortune. Instead of a few test environments, you needed 7. Managing test data and rollout to end-users became a nightmare... And even then, as the technology wasn't mature, you'd hit hurdles very late. Dismal production performance was a common one.
It's a bit sad. A really interesting set of ideas and technologies - but really poorly sold and way past the hype to deliver. It's equally true of related technologies like CORBA.
I know exactly what you mean, having suffered through JRules projects, but don't consider rule/forward-chaining languages part of a SOA stack per se (I especially loved the idea that rule bases, unlike services, don't need testing, because they're end-user configuration parts, and because, like SQL, they're kind of declarative).
Reading through the answers, I still don't know how microservices are any different from SOA ):
Of course, the military has control of the allowed CORBA implementations, so they don't suffer from some of the problems that plagued CORBA, including different runtimes, and they get all the typing and speed benefits.
I think microservices place a greater emphasis on data encapsulation & isolation, whereas SOA focused more on the RPC/API surface. I think this is just a difference in emphasis, though, and not a hard separation.
All the in the wild SOA applications I've seen were just glorified RPC, not much different than what would have been done with CORBA and the like much earlier. Some seemed to recognize this and create micro services instead, which now seems to have also devolved into RPC by a new name.
I hope the next iteration removes the word service entirely, for too many devs the word service is synonymous with web service. I suggest sticking with daemon.
Microservices is to SOA as "JSON over HTTP" APIs are to REST APIs. Fundamentally the same goals but with a focus more on being a general description of a "style" rather than a specification for how it should be implemented.
I co-founded a startup 6 months ago, and we have used micro-services since day 1. For us the biggest benefit was that at the beginning we could hire people who knew different programming languages (we managed to build a team of 5 in 3-4 weeks), and they could build small parts of the system communicating via http/RabbitMQ. The downside is that we had to have CI & CD from day one, and it costs us some resources.
I am not saying microservices are a cure for everything, and of course there is a place for well-maintained monoliths, but I find that even for smaller teams micro-services can be just easier than a monolith.
>they could build a small parts of the system communicating via http/RabbitMQ.
<Cue horrified twitching>
Now, it's totally possible that you're using RMQ completely correctly, but as someone who has seen multiple teams fundamentally misunderstand both the purpose and function of an AMQP server and lose critical data because of it, any mention of RMQ as a primary part of the application's communication mechanism ruffles my feathers.
Sometimes I wonder if the RMQ team is aware of how many people end up using it grossly improperly. It seems they'd put some bigger warning labels on it if they were.
They inject it into the normal data processing/RPC workflow. Instead of writing to a database or some other permanent storage, they just have the application write directly to a RMQ queue, and wait for a worker to pick it up and store it somewhere.
AMQP is asynchronous and the queues can get choked up, so sometimes messages will be delayed for hours, and if the queues get too large, RMQ will begin to evict messages and/or crash due to insufficient resources.
RMQ will throw all the data in your queue away on restart unless you explicitly ask it not to by setting durable mode (don't forget to do this or all your data is down the hole when you're misusing it this way). Even in durable mode, the queue does not provide the type of safety guarantees that would be expected of a real storage solution, which RMQ does not pretend to be, but somehow people still believe it is.
Because RabbitMQ does not provide strong data safety, crash, or resilience guarantees, it should never be the system-of-record for important data.
Furthermore, because the nature of AMQP is to dispose messages as soon as they're picked up (and yes, I know you can use acknowledgements to try to hack around this, but it's not something to trust for the only place where your data is recorded), it's very easy to accidentally black hole messages while everything appears to be working. This can lead to an insidious type of data loss where some records just appear to be mysteriously missing and are very hard to trace.
While RMQ states these things in its documentation and obviously is not intended to be the system of record, people still abuse it in this way. Redis was abused similarly and responded by growing a full featureset to turn it into a reliable in-memory storage solution. It doesn't seem like that's a good track for RMQ to take, so they should stick some giant red warning labels all over their page, clearly embarrassing those who make these dangerous choices.
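To make the "not the system of record" point concrete, here is a rough sketch of the alternative: commit the record to real storage first, and let the queue carry only a pointer to it. This is my own illustration, not anything from the RMQ docs. `queue.Queue` is just an in-process stand-in for a broker, and the `orders` table and function names are made up.

```python
import json
import queue
import sqlite3


def open_store(path=":memory:"):
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS orders ("
        "id INTEGER PRIMARY KEY, payload TEXT NOT NULL, "
        "processed INTEGER NOT NULL DEFAULT 0)"
    )
    return db


def record_then_publish(db, broker, payload):
    """Commit to durable storage first; the queue only carries a pointer."""
    with db:  # commits on success, rolls back on exception
        cur = db.execute(
            "INSERT INTO orders (payload) VALUES (?)", (json.dumps(payload),)
        )
    order_id = cur.lastrowid
    broker.put(order_id)  # if this message is lost, the row still exists
    return order_id


def unprocessed(db):
    """Reconciliation query: rows that no worker has marked as processed."""
    rows = db.execute("SELECT id FROM orders WHERE processed = 0 ORDER BY id")
    return [row[0] for row in rows]
```

With this shape, a black-holed message shows up as a row that stays unprocessed, which a periodic reconciliation job can find, instead of as a record that silently never existed.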
I don't think using micro-services from day 1 is a wise use of resources. What if one of your engineers, who probably owns an entire service, quits? And it's in Haskell because she felt like it?
Well, we have a list of approved stacks, and yes, this may become a problem one day, but we try to prepare for that.
One of the things we do on top of using micro-services is using Docker for everything (dev, build and run in production); it's easier to pick up a project if the whole environment is already set up inside a Docker image. Like I wrote in my first post, it's very hard to find e.g. 10 good engineers who know Python in a month's time (and we are a startup, so we cannot pay twice as much as everyone else). So far this whole strategy has worked out, but I am not deluded and I do not pretend that starting with micro-services is a good decision for everyone in every use case.
This is a totally nutty scenario. Microservices are not "do whatever you want!" - they give you the freedom to choose your stack. It is obviously still a business decision to choose Haskell and you've got other problems if developers are building things in random languages that they feel like using without a larger discussion.
The issue here isn't Haskell it's ownership and process.
Nutty or not, it happens. I was a Ruby dev and was asked to work on a Scala app. I could contribute to Ruby stuff quite nicely, and could hardly figure out how to compile the Scala app. Trade-offs...
I'm sure it happens - my point is that it isn't relevant to microservices. Microservices allow multiple technology stacks, if done properly, but they don't force them on anyone. If your developers are pushing code using a different programming language with no oversight, there's an organizational issue.
That said, it's awesome that microservices let you use different tech stacks to solve different problems.
Were you experienced with micro services before starting your latest thing? When starting a project, you're going to have constraints: money, time, available talent, management's blessing, etc. I'd guess those constraints are probably a driving factor in dictating how all of the project's needs are wired together.
If the technical founder was a Python/Flask/micro-service/Angular/MySQL dev, that's probably what they'd be using to knock out code to build an MVP. If the founder was a Microsoft-MVC/C#/Postgres/Ember/monolith dev, I'd be super surprised if the MVP was a Python/Flask/micro-service/Angular/MySQL app :)
IMO microservices from day one aren't necessarily a premature optimization, or an optimization at all. It is sometimes just the natural way to model a solution.
For example at my last job, we developed several services that constantly generated reports for our clients to run. Instead of embedding the functionality to move these files to other machines in each service, we developed a separate service that monitored a directory to do only that. This meant that the reporting services were more open ended: clients could decide how to handle the files but were still left with a very convenient option. It also meant that we could hand off new versions of the transfer service on its own for customers to install without interrupting reporting services, and only having to deal with the documentation of the transfer service itself.
In terms of scalability or performance, it added absolutely nothing, but it simplified deployment, documentation and development from day 1.
Oh, I didn't mention all the reasons why we use micro-services from day one. We have a very sophisticated use case. We call in to meetings using hangout, go2meeting etc. to record them (we do speech recognition and many other things with these recordings). In our case, to concurrently call in to many meetings and do processing in real time, it wasn't really premature optimisation.
I agree, but I don't think microservices are properly classed as an optimization of any sort, premature or not. Microservices arise because a company can't communicate/manage itself internally.
This does not mean that you must have one giant 50MB executable to run your whole company, but it probably does mean most companies shouldn't have 60 200-line microservices.
I think that microservices may be an actual optimization when the application flow has several clearly separable tasks that have varying requirements and you need to divide the load over several machines. For example, one task may be mostly I/O heavy, another will use a lot of RAM and a third may mostly be CPU bound. When you distribute the load over multiple servers, microservices can make it easier to tailor each server to the needs of the services it runs. The I/O bound workload doesn't need 100GB RAM and the CPU bound workload may not need several gigabit interfaces.
That said, I haven't personally worked with a microservice-based architecture where this ever became a useful optimization. Often it is exactly as you say: a technological workaround for an organizational problem.
Microservices do not feel like a premature optimization. It does not feel premature to make architectural decisions around scaling based on the entirely realistic proposition that you will have more customers in the future than you do today. Architecture is exactly the area you want to get right since it's a pain to optimize when you have a weak architecture.
I heard second hand that this really happened at Living Social. Engineers would write services in whatever they felt like (sorry, I mean, "the best tool for the job"), then get bored and leave.
Heck, this was even a fad for a while: polyglot programming.
Polyglot programming isn't "a fad" - it's something that microservices enable. That does not mean that developers make technical decisions in isolation.
If I were to introduce Haskell to my company there would have to be at least one other person interested in it and at least a few people who would be interested in learning it. I would never commit code using a new technology without discussing that with my manager.
Polyglot isn't a fad. That's like saying, as a carpenter, using more than a hammer is a fad. As a professional, you are supposed to have more than a single tool. That doesn't mean you use all of them on every job, but you should have them.
Exactly, every decision is approved by me at the moment and we keep list of approved stacks.
Also, to be honest, I am not that afraid: if we keep our micro-services really MICRO, the worst-case scenario would be rewriting a single micro-service - still better than struggling to hire a dev team for 3-4 months.
Surely the CTO factors this into signing off on any decision on what the engineers use to build the service. You can have an ecosystem comprised of different languages absolutely, but that doesn't become "Billing Automation is built in Idris because Brian wanted to try it out".
We used Microservices at my last job, and initially, everyone used the same Maven archetype to create a basic Java application of similar structure: property files, environment variables, filters, etc. It wasn't as sexy as, say, Scala or Node or Haskell. OTOH, there were absolutely no developers who didn't already know Java or couldn't learn our framework in a few months - at least enough to follow and be productive.
This came in handy because everyone could easily jump from one service to another and figure things out pretty easily since, not only was it all in the same language, but also the configurations and bootstrap classes were the same. Once someone figured out something clever, it was easily added to the base classes for everyone else to inherit.
Eventually some of the newer hires got bored of Java and wanted to use things like Node, Python, etc.
At the time I left, it was a pure clusterfuck. It was impossible to write once and easily change the whole filter stack (e.g. an auth filter) since we were now up against at least 5 different languages / frameworks. Developers couldn't easily jump from project to project as needed, either.
Anyway, most of our Microservices were just glorified DB -> JSON Crud apps. Microservices themselves were probably not needed for our customer size - we would have been fine with a 2008-style multi-war project on JBoss.
And yes, even Java is pretty damn good at spitting out CRUD / JSON data to front-ends. Absolutely no need for the complexity introduced by the half dozen other frameworks.
Cool story. Our startup (8 engineers) is built on microservices as well. Right now (after 1 year live, with 4 hours of downtime so far) we have 43 microservices running. We have a RabbitMQ broker for fire-and-forget communications. We use Consul for Service Discovery. Jenkins is used for automatic testing and deployment.
I have to say that I love our setup. While there are some things that are a bit more complicated, the increased efficiency is worth it all. On some days we deploy 10 or more times to production.
I have to say, the startup is very well funded and we have a dedicated SysOps guy who helps us with DevOps. He does all the nitty gritty Ansible stuff.
I think the best thing about micro-services is that it enforces service boundaries around aggregates. You just can't leak responsibility if you have to traverse the network. This enforces loose coupling.
It's also much easier to do code review if changes are localized to a small codebase of maybe max 500 lines.
Since you seem to have it figured out... I'm a monolith guy* in a microservices world. Friday, I had a question asked of me that in our old system was a simple query and I could answer in 5 minutes. This question, however, was split across three separate microservices in the new system, and the information had never been captured in a convenient way in hadoop. (Running two separate queries in two separate systems and writing a program to collate the results takes significantly more developer time. While not insurmountable, it's no longer a trivial task.)
Did you run into these problems? What did you build to solve them?
We have a statistics service for our ML guys. If it needs aggregated data from data stored in different services, the services must implement an API and the statistics layer can then make use of it. Usually one developer will do all of this. We don't have strict ownership.
What we also do is, we duplicate a lot of data into a separate event log. This log is used as a readonly store for analytics. This has the nice additional property that we have an event-log. The micro-services don't write directly to the log, but they can publish events to the broker and a dedicated event-service is used to append the data to the correct log.
Yes. Sometimes all of this is significant overhead, because everything has to be tested as well. But mostly when you start a service, the first thing is to provide all the necessary CRUD routines, which 99% of the time is a few lines of code, as every service is mostly built around at most one aggregate.
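For anyone trying to picture the event-log arrangement described above, here is a toy version. It is only a sketch under my own assumptions: the `Broker` class is an in-memory stand-in (not RabbitMQ), and the class names, method names, and topic names are invented.

```python
from collections import defaultdict


class Broker:
    """In-memory stand-in for a message broker with topic subscriptions."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, event):
        for handler in self._subscribers[topic]:
            handler(topic, event)


class EventService:
    """The only writer to the logs; other services just publish events."""

    def __init__(self, broker, topics):
        self._logs = {topic: [] for topic in topics}
        for topic in topics:
            broker.subscribe(topic, self._append)

    def _append(self, topic, event):
        # Store a copy so later changes by the publisher can't rewrite history.
        self._logs[topic].append(dict(event))

    def read(self, topic):
        return tuple(self._logs[topic])  # snapshot for analytics queries
```

A service would call `broker.publish("orders", {"id": 1})` and never touch the log itself; analytics code only ever calls `read`.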
One day I plan to share every detail of our journey, but at the moment we are too busy building our product. So far I am very happy with the decision to go with micro-service oriented architecture, but of course we had some issues with that - but hey there is no perfect solution!
I used to make this argument. I'm not so convinced anymore. Much of this modularity can be achieved now: split your libraries into two, one for APIs and the other for implementations. Then in your build, only include the API libraries as dependencies, and include both at runtime. It doesn't enforce runtime modularity, but it's generally good enough.
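A compressed illustration of that API/implementation split, using Python's `typing.Protocol` in place of separate build artifacts. The package names in the comments and the invoice example are hypothetical; in a real build the three sections would live in separate libraries.

```python
from typing import Protocol


# --- "billing_api" library: the only thing callers depend on at build time ---
class InvoiceStore(Protocol):
    def total_for(self, customer_id: str) -> int: ...


# --- "billing_impl" library: only needs to be present at runtime ---
class InMemoryInvoiceStore:
    def __init__(self, invoices: dict[str, list[int]]):
        self._invoices = invoices

    def total_for(self, customer_id: str) -> int:
        return sum(self._invoices.get(customer_id, []))


# --- caller: written against the API, never names the implementation ---
def format_statement(store: InvoiceStore, customer_id: str) -> str:
    return f"{customer_id}: {store.total_for(customer_id)} cents"
```

As the comment says, nothing stops code from reaching around the API at runtime; the boundary is enforced by what the build lets you import, not by the process.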
Why this isn't great: it encourages a monolithic mindset. A huge benefit to microservices is that they are small! This means that you have releases that are faster per service, build times that are faster per service, and code that can be more easily reasoned about.
The author is correct that it introduces expensive cross service calls, so you do need to be thoughtful on your boundaries, but what you end up getting is simpler on a per instance basis. You don't need to become devops experts; there are plenty of options for deployment that handles a lot of this: heroku, gae, beanstalk. Hell, there's even still available to you for your own D.C., though I'd argue that if you're running in your own D.C. then you should think very hard on why that's important to your business (even if it's very large).
The earlier you have microservice architecture built into your stack, the easier it is to continue on that path, once you're monolithic, it's a huge amount of work to go back the other way.
A huge benefit to modules is that they are small! This means that you have releases that are faster per module, build times that are faster per module, and code that can be more easily reasoned about.
The earlier you have modular architecture built into your stack, the easier it is to continue on that path, once you're monolithic, it's a huge amount of work to go back the other way.
Have you considered what will happen when your application has reached a multi-gig deployment? Where you have static resources commingled with business logic? Where you have rendering blended with DB access?
Modular systems are theoretically as good as SOA or microservices, but in practice they are not.
Failure cannot be isolated as easily: when there is a problem in the system, the entire thing crashes.
Also, to your point: is that a single-repo or multi-repo source control system? In a single-repo system your build and testing cycle becomes longer and longer for the entire service, and regardless of the size of the changes, you have to deploy the entire thing. Every successful business with a monolithic deployment, regardless of repo structure, ends up in the same place (based on my experience): a large unwieldy beast, maintained by dozens or hundreds, or thousands of engineers, who each have limited knowledge of the runtime. It becomes slow, hard to deploy, hard to test, hard to make changes.
Microservices/SOA does not alleviate the need for good design, but it doesn't allow the above to be true on a per-service basis. By shortening turnaround for responding to issues and deploying new features, it is better.
To add more: you get there by not practicing SOA, never breaking up a service, working at a company that constantly ships new product rather than focusing on restructuring the application, not having developers own both the build and deployment of the app such that they don't experience the pain; reasons, etc.
"The modularized monolith can be scaled horizontally as well, but you scale out all modules together."
That's not exactly true. Yes you probably have to ship the whole monolith, but who said that you also have to get traffic for all the parts? For example you have /users/ and /books/. You can configure your nodes to only serve /users/ and configure different (maybe more) nodes to serve /books/. The code sits there but who cares?
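Roughly what that node-side configuration could look like, as a sketch. The `make_node` helper and the handler table are invented for illustration; the point is only that every node ships every handler but answers for a configured subset.

```python
def make_node(enabled_prefixes, handlers):
    """Build a node that ships every handler but serves only some prefixes."""
    def handle(path):
        for prefix, handler in handlers.items():
            if path.startswith(prefix):
                if prefix not in enabled_prefixes:
                    return 404, "not served by this node"
                return 200, handler(path)
        return 404, "unknown path"
    return handle


# The same code is deployed everywhere; only the configuration differs.
HANDLERS = {
    "/users/": lambda path: "user " + path[len("/users/"):],
    "/books/": lambda path: "book " + path[len("/books/"):],
}

users_node = make_node({"/users/"}, HANDLERS)
books_node = make_node({"/books/"}, HANDLERS)
```

In practice the load balancer would simply never send `/books/` traffic to `users_node`, so the 404 branch is a safety net rather than the main mechanism.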
We did this at my last gig. The application was monolithic[1] but ran in multiple environments: web application, API, admin, backend workers.
It worked quite well. The one downside is we had to load all the code (it was Rails) so processes took more RAM than strictly necessary. But sharing data models across all these services was pretty easy.
[1] There were a few separate services, for example a documentation CMS.
I think the article is missing a key value of microservices, or at least smaller services: service ownership. With a monolith, who is on call for the service when something goes wrong? How does that person find an appropriate person to diagnose the issue in one of the 100 libraries included in the monolith? What does the monolith dashboard look like?
The great thing about small services is that a team developing a service can own it from top to bottom, including:
1) being responsible for metrics, alarms, dashboards, and everything else required to monitor the service
2) being on call for the service and getting directly paged when there are problems
And of course, there are other social/organizational problems with a monolith, another example being deployments. I want to deploy my new feature but I'm blocked because someone introduced a bug in some unrelated library that's clogging the entire pipeline. Or, I have to release my feature according to the deployment schedule of the monolith, which may not make sense for my team. With smaller services, a team can own its own deployment pipeline and decide when it wants to deploy.
A third organizational benefit of smaller services comes from process separation. GC'ed languages work really hard to help developers pretend that memory is free, but memory is still a finite shared resource and it only takes one misbehaving module to cause the whole process to start stalling in large GC pauses. With smaller services you get process separation which makes the problem much more tenable. And of course there are other exhaustible shared resources like threads and file descriptors.
At the end of the day, I prefer smaller services because I like the social organization where a company consists of agile, autonomous teams owning their own services. I feel a monolith service actively discourages that and leads to social organizations that are less productive and successful.
All fair points, but I think people ignore the challenges associated with microservices: much more complex to debug (which service is failing); inter-service latency (things run much faster on a single box); VM/instance sprawl (because every service should run on its own host x redundancy x multiple environments); increased costs (see vm sprawl). I still think the advantages outweigh the disadvantages, but they definitely come at a cost.
Those are organizational concerns that you're conflating with microservices vs monolith. They're tangential. I see no reason why you can't assign team responsibility to individual modules vs microservices. You can collect metrics for both modules and microservices and publish them in a dashboard. Alarms and monitoring are probably unnecessary at the module level in most cases - you'd just do it once for the whole monolith.
Good point, but I don't think it's necessarily incorrect to conflate organizational concerns with architectural concerns. I am reminded specifically of Conway's law: https://en.wikipedia.org/wiki/Conway%27s_law
It may be possible to deploy a monolith without having monolithic processes and organizations in place, but does anyone have experience successfully doing so in practice? And how easy is it compared to doing so with smaller services?
For my employer, the whole point of microservices is separate deploys. When we had hundreds of engineers committing on the monolith, a bad change in one out of the few dozen commits in a given day's upgrade could require rolling the whole thing back.
Now each service deploys (usually) one or two commits at a time, completely understood by the person who clicks the "upgrade" button. People working on unrelated code don't need to block each other's release velocity.
More disciplined interfaces could have solved the spaghetti problem, but lots of small services that a few people have absolute power over can move a lot faster than an integration/release process shared by thousands.
I've advocated for this approach before. I think it's definitely true that a single code base is an optimal place to work out early architectural decisions, so you can optimize the interfaces and separation before committing to the overhead of multiple repos and deploy stacks.
One of the challenges, though, is making sure the whole team understands the vision and doesn't violate the intended separations. Working primarily in Rails, I've found it to be pretty terrible in this regard, both due to Ruby's global namespace and because there are no conventions to properly separate app-level modules (Rails Engines are a second-class citizen). Django is significantly better since it gives you apps as first-class citizens, and because of Python's explicit module import, but even still you can quickly end up with a ball of mud if you're not careful.
We built a modular system based on Java with OSGi. It served us well, but now we feel we have reached the limits of such a system and are looking at a more microservice-based approach.
For example a modular system is not going to help you when a module is misbehaving since the whole thing still runs in the same process (JVM in our case). If someone introduces a resource leak in some trivial module the whole thing still comes down.
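The isolation argument can be shown in a few lines, though with Python interpreter processes rather than JVMs, and with a deliberately trivial "module". This is only a sketch: the two source strings are stand-ins I made up for a module that exhausts a resource and dies and for a healthy one.

```python
import subprocess
import sys

# Stand-ins for real modules: one dies with an error code, one works.
LEAKY_MODULE = "import sys; sys.exit(3)"
HEALTHY_MODULE = "print('ok')"


def run_isolated(source):
    """Run a module in its own interpreter process and report how it ended."""
    result = subprocess.run(
        [sys.executable, "-c", source], capture_output=True, text=True
    )
    return result.returncode, result.stdout.strip()


def run_all(sources):
    # One child dying does not stop the loop or the parent process.
    return [run_isolated(source) for source in sources]
```

Inside a single process the failing module would take the healthy one down with it; here the parent just sees a non-zero exit code for one child and carries on.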
Exactly, that's one of the tradeoffs discussed in the article (author here).
Another pattern that might work, which I didn't include in the article, is to scale-out the modular application (assuming it's 'stateless') into several clusters. Each node in the clusters still has the whole modular application. However, each cluster will be responsible for handling a certain part of the API. Then, put a load-balancer/API gateway in front that can route different functional parts of your API to different clusters. Scale up the individual clusters as required by load. Even though all nodes contain all modules, depending on which cluster they're in, only a certain subset of modules really takes up CPU cycles. There's still no node-to-node communication necessary, since all nodes contain all logic.
Certainly not a pattern that's always applicable, but I've used it with success several times for webapps with REST backends.
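The gateway side of this pattern might be sketched as follows. This is an assumption-laden toy: the `Gateway` class, the prefix table, and the node names are hypothetical, and a real setup would use an actual load balancer rather than application code.

```python
import itertools


class Gateway:
    """Routes each functional part of the API to its own cluster of nodes."""

    def __init__(self, clusters):
        # clusters maps a path prefix to the node names in that cluster,
        # e.g. {"/users/": ["u1", "u2"], "/books/": ["b1"]}
        self._pools = {
            prefix: itertools.cycle(nodes) for prefix, nodes in clusters.items()
        }

    def route(self, path):
        for prefix, pool in self._pools.items():
            if path.startswith(prefix):
                return next(pool)  # round-robin within the chosen cluster
        raise LookupError(f"no cluster serves {path}")
```

Scaling one part of the API then means adding node names to one list; every node still runs the full application, so no node ever has to call another.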
For us scaling was never an issue; we did the stateless scale-out thing. Even solving stateful is relatively easy (at least in Java) with solutions like Hazelcast. The main drawback is that there are usually only a few services in your monolith that are required to scale out, but you have to deploy the entire thing when scaling out.
As for OSGi, OSGi is complex; I dare to say that the complexity of OSGi rivals that of a microservice setup. If I had a nickel for every classloader issue I debugged... ORM was especially fun (for example, I wrote this piece way back: http://www.datanucleus.org/products/datanucleus/jdo/osgi.htm... ). But I must admit that I don't think we could have created (and maintained) such a large modular application without OSGi. Debugging itself is also way more complex. When an issue arises you spend a lot more time tracking down which module is misbehaving, even though we had inserted lots of probes (which ended up in Graphite) and log statements (which ended up in Graylog) to counter that.
In my experience, smaller, simpler applications (which I acknowledge also have their own complexity with distributed debugging) are still easier to understand than a modularized application.
I'm with you on the complexity of OSGi, though some of the complexity plainly arises because it truly forces you to modularise vs. just winging it. In that regard, the new Java 9 module system has less of the service dynamics and classloading tricks going on. Very curious to see how the community will pick up the new module system.
This is a cool pattern which deserves to be more well-known.
I once worked on an e-commerce system where we deployed many copies of a monolith. Most were serving user requests. Two were running scheduled jobs. One was something to do with a distributed cache (this was a while ago). One was processing events off a queue. Having one codebase and one binary made development and releasing easy. Having multiple roles made production simpler to understand and more reliable (I think).
Another time, I worked on a business web app. Again, one codebase, one binary, three instances serving users, two instances serving a data export API, and some number running scheduled jobs. When the API went down - which it often did - users were not affected. That company also tried splitting services out of that monolith; one was pretty successful, but one was a chronic failure, because so many features involved coordinated changes to both codebases.
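For anyone who hasn't seen the one-binary-many-roles pattern, here is a minimal sketch of how the role selection might look. The role names, the handler functions and the APP_ROLE environment variable are all made up for illustration; a real handler would start a web server, a scheduler loop or a queue consumer rather than return a string.

```python
import os
from typing import Callable, Dict, Optional

# Hypothetical role handlers. Each one stands in for a long-running entry
# point (HTTP server, scheduler, queue consumer) inside the same codebase.
def serve_web() -> str:
    return "serving user requests"

def run_scheduled_jobs() -> str:
    return "running scheduled jobs"

def process_queue() -> str:
    return "processing events off a queue"

ROLES: Dict[str, Callable[[], str]] = {
    "web": serve_web,
    "jobs": run_scheduled_jobs,
    "queue": process_queue,
}

def main(role: Optional[str] = None) -> str:
    # The role comes from the caller or from the (assumed) APP_ROLE variable,
    # so every instance runs the same artifact with different configuration.
    role = role or os.environ.get("APP_ROLE", "web")
    if role not in ROLES:
        raise SystemExit(f"unknown role {role!r}; expected one of {sorted(ROLES)}")
    return ROLES[role]()

if __name__ == "__main__":
    print(main())
```

Each instance is then started with a different role, so an outage in the instances running one role leaves the others alone while there is still only one thing to build and release.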
To get high availability, every system needs to be able to hand over processing to another instance of itself in some way.
Modules or microservices make no real difference here, except that a bigger collection of modules in a single JVM will take a bit longer to start up. But if you really can't periodically restart your service to reclaim lost resources due to 24/7 requirements, you will need to have several JVM instances serving the same service anyway.
This makes start-up times mostly irrelevant since any decent load balancer will handle seamless handover almost trivially.
Microservices are a rather resource-expensive solution to the problem of modules bringing the JVM down for whatever reason.
Usually the right thing to do is to decrease the amount of state in the application, think in microservices without HTTP and within the JVM, and finally: run a bunch of small JVMs instead of one large one.
The upside of this approach is that when/if you need to migrate some modules to actual microservices, you have already done part of the work and already tested that you can run multiple instances of the same service, all without the significantly increased operational and development complexity of microservices. Microservices are good where they really are the only option, e.g. heterogeneous technology stacks and subsystems with different release schedules.
Simulating the grid/web of microservices including network errors like timeouts, latency and dropped connections in a petri-net (or similar) simulation tool can be a rather eye opening experience if the set of services isn't trivial. The set of weird behaviours a network of services can exhibit is staggering, even without a single line of code.
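To give a feel for this without a Petri-net tool, here is a much cruder stand-in: a Monte Carlo sketch, not a Petri net. It assumes a request passes through a chain of services and that each hop fails independently (timeout, dropped connection) with the same probability. Those assumptions, and all the names below, are mine and purely illustrative.

```python
import random

def request_succeeds(hops: int, p_fail: float, rng: random.Random) -> bool:
    # A request succeeds only if every hop in the chain succeeds.
    return all(rng.random() >= p_fail for _ in range(hops))

def simulate(hops: int, p_fail: float, trials: int = 100_000, seed: int = 1) -> float:
    # Estimate the end-to-end success rate by sampling many requests.
    rng = random.Random(seed)
    ok = sum(request_succeeds(hops, p_fail, rng) for _ in range(trials))
    return ok / trials

def analytic(hops: int, p_fail: float) -> float:
    # Closed form under the independence assumption: (1 - p)^n.
    return (1 - p_fail) ** hops

if __name__ == "__main__":
    for hops in (1, 5, 10, 20):
        print(hops, round(simulate(hops, 0.01), 4), round(analytic(hops, 0.01), 4))
```

Even with a 1% failure rate per hop, a chain of 10 services succeeds end to end only about 90% of the time, and that is before retries, latency or feedback between services enter the picture, which is where a proper timed simulation earns its keep.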
I'll see what I have when I get back to my computer. In the meantime, the magic words to find interesting things are something like "colored timed petri-net".
A lot of it will probably be about logistics, but sending a shipment and calling a service is mostly the same thing, might have to squint a bit.
Toolwise, there are the ProM tools (www.promtools.org), which although they are mostly for mining process networks from logs can also be used for simulation and analysis. A bit of a warning though: It's a rather strange piece of software, not very stable (for me), but it has a ton of modules related to process mining and simulation.
Lots of the petri-net stuff available is generally a bit dated, as although it's a useful, and quite simple formalism, it only ever appears to have caught on in somewhat niche areas.
It is sad to see this as the only comment on OSGi, which actually provides an elegant solution for modularization.
I agree that tracking down problematic modules is difficult. And OOM exceptions (due to leaks) are actually the only serious problem we run into at runtime. On the other hand, even in these cases it's often enough to just restart the single OSGi service (if you know which one).
For NeoSCADA it was definitely the right choice. We have servers running uninterrupted for months (if not years). And being able to hotpatch a library on the fly is really, really great (although seldom used).
At the start-up I currently work for, I designed the system to be composed of microservices (with Swagger/OpenAPI), even when it might've been easier to write modules, for scalability reasons. Before I was brought on as the first engineer, we had a series of contractors build a poorly architected monolith that had issues scaling. Microservices have enabled us to independently scale different parts of the service as load changes. It's also forced a very strong separation of concerns.
My main complaint is that we've run into various instances where we need to access some data in a different service, so we stuff it into some object so that the next service can pull it from the DB. However, this is a solvable implementation issue.
The other complaints I have are that monitoring and profiling are much more involved. Additionally, setting up a new service can be painful if you don't have a template.
Overall, we've had a good experience with microservices and it's enabled us to deploy faster and scale simpler.
A few advantages of microservices over modules, the first being that the performance of one module does not impact other modules:
1. Say one module is running slow; that will impact the entire application.
2. If we want to push code for one module, we need to push code for the entire application. So there is no deployment isolation.
3. We have to have all the modules written in the same programming language and the same version of that language. Hence there is no clear path for upgrading the language version.
Microservices address all of these, assuming we deploy them as separate services (including their data). The tradeoff is a bit of complexity and latency in exchange for complete isolation and independence.
> 1. Say one module is running slow, that will impact entire application.
Why? Unless you're doing something silly like running the whole service on a single process+thread, the other workers should still respond just fine while the slow module is crunching away.
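A minimal sketch of what I mean, assuming the slow module is blocked waiting on something (I/O, a slow query) rather than burning CPU. The function names are hypothetical, and an event stands in for the slow work so the example is deterministic.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

def slow_module(release: threading.Event) -> str:
    # Stands in for a slow call: blocks until released (or 5 s at most).
    release.wait(timeout=5)
    return "slow done"

def fast_module(n: int) -> int:
    return n * 2

def demo() -> Tuple[List[int], bool, str]:
    release = threading.Event()
    with ThreadPoolExecutor(max_workers=4) as pool:
        slow = pool.submit(slow_module, release)
        # Other workers keep answering while the slow call is still blocked.
        fast = [pool.submit(fast_module, n).result(timeout=2) for n in range(3)]
        slow_was_pending = not slow.done()
        release.set()  # let the slow call finish
        return fast, slow_was_pending, slow.result(timeout=5)

if __name__ == "__main__":
    print(demo())
```

This only covers the case where the pool has spare workers; if the slow module saturates CPU or memory on the box, the picture is different.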
> 2. If we want to push code for one module, we need to push code for entire application. So there is no deployment isolation.
Yes and no. In what sense does that worry you? You can still ship only changes to one module, using version control. And you can still do a gradual deployment, avoiding downtime. The only thing you can't easily do is take down a whole module while keeping the rest of the application running.
1. Because infrastructure resources are shared - like CPU & memory
2. If all the modules are managed by the same team, yes, you are right. If we want each module to be managed by one pizza scrum team, we need deployment isolation to avoid churn. Don't you think so?
> Modules are natural units for code-ownership as well.
I read a snarky blog post a few years ago entitled "Blame-Oriented Software Design". It described legitimate design patterns that software developers in a Dilbert corporate environment can use to deflect blame and extra work caused by others' shoddy code. Examples include well-defined module boundaries, strongly-typed APIs, and extensive assertions and logging enabled in production code.
Unfortunately, I can no longer find this blog post, but I chuckle about it often enough that sometimes I consider rewriting what I remember of it myself. :)
I've worked on both types of projects and I would never choose microservices over monolith if presented the option.
Rather, I agree with the author that proper separation into documented modules gives most of the benefits from microservices without any of the numerous drawbacks.
Microservices seriously increase the operational overhead. They increase the hardware expenses. They increase service latency through round trip times and unnecessary work. They increase the code complexity through unneeded serialization/deserialization and REST calls vs plain old function calls. They make debugging a lot harder. They give you all the pain of distributed systems and networking when all you previously needed was a function call.
With Go (and many other statically typed languages) you get proper modules that can't reach into the private implementation of other modules (even with reflection.) They can't have import cycles. The compiler will take care of alerting you when an API changes in an incompatible way. Likely all your modules will share a backend database(s) so you need to take care to use the public interface of the responsible module rather than reach around behind its back with direct queries - but that's a solvable organizational problem.
I wonder sometimes if the microservices craze isn't trying to find a technical solution to human problems that would be better dealt with via communication and creating solid organizational practices. Like team A doesn't trust team B and rather than work out a common set of norms and rules they just start using microservices to bypass dealing with the problem altogether.
For those of you happy with your use of microservices, how many microservices does your team handle?
I don't mind microservices but I definitely think my company has taken it too far: for my team of 5 we have ~25 separate components/deployments which make up ~12 microservices. It makes it impossible to keep track of each service and be familiar with the idiosyncrasies of each project.
In my experience, the biggest obstructions to modularity are the "unknown unknowns". I can't tell you how many times I've been trying to set up some software where it turns out the installation was looking for file ABC in directory XYZ even though the documentation wasn't up to date and listed directory EFG (or nothing at all). Also, environment variables (i.e., global variables) are a pain and should be eliminated altogether. What would replace them? I'm not sure, but any random idea has got to be an improvement of some sort over the current system.
Docker alleviates the environment hell problem somewhat, so it's easy to set up a bunch of contained microservices that aren't going to fall apart when the global environment changes.
We need more research into the complexity of programming patterns. The same patterns and anti-patterns keep popping up under different names in slightly different forms.
Speaking of random ideas: dump the globals into functions (one function per global) - that's right, a dumb single-variable function. You can't monitor or pop an assert on a variable, but you can slap asserts inside a stub function, etc. Plus you know a function ain't local. Globals are such poison that the minor inefficiency created by converting a few globals to functions is usually meaningless.
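A rough sketch of the idea. The setting name and its limits are invented for the example; the point is only that every read and write goes through a function where a check or a log line can live.

```python
# Hypothetical setting; the underscore marks it as private to this module.
_max_connections = 10

def max_connections() -> int:
    # Every read passes through here, so an assert (or a breakpoint,
    # or a log statement) has somewhere to go.
    assert _max_connections > 0, "max_connections must be positive"
    return _max_connections

def set_max_connections(value: int) -> None:
    # Writes are checked too, so a bad value fails at the point of assignment.
    global _max_connections
    assert isinstance(value, int) and value > 0, "value must be a positive int"
    _max_connections = value

if __name__ == "__main__":
    set_max_connections(25)
    print(max_connections())
```

Note that Python strips assert statements when run with -O, so for checks that must always run an explicit raise would be the safer choice.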
I begin to think that we need a richer vocabulary to address "complexity", to analyse it into various forms (some of which are likely to be beneficial). Separating code into lumps means more lumps, but fewer dependencies, more known absence of dependency... We keep using one word, "complexity", to point to various things. Distinguishing between 'complex' (opposite of 'simple') and 'complicated' (opposite of 'easy') is a start I suppose, but I can't help thinking there are more careful distinctions that could be made.
I even found using submodules in git really helpful as well, even though it isn't a module per se. For example, with Mongo, I have models in node.js and it is really simple to share the models in every node project across the org, as the mongo models contain schema and validation etc.
Recently I've seen a few folks use protobufs within parts of a monolith as a mechanism to enforce a strongly typed interface between components. This helps define a contract between teams without the overhead of a distributed system.
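To illustrate the shape of this without pulling in protobuf itself, here is a sketch that uses frozen dataclasses as a stand-in for generated message classes. This is not the protobuf API, and the pricing component, message names and fields are all hypothetical.

```python
from dataclasses import dataclass
from typing import Protocol

@dataclass(frozen=True)
class PriceRequest:
    # Stand-in for a generated request message.
    sku: str
    quantity: int

    def __post_init__(self) -> None:
        # Reject wrongly typed fields at the boundary.
        if not isinstance(self.sku, str) or not isinstance(self.quantity, int):
            raise TypeError("PriceRequest expects sku: str, quantity: int")

@dataclass(frozen=True)
class PriceReply:
    # Stand-in for a generated reply message.
    total_cents: int

class PricingApi(Protocol):
    # The only surface other components are allowed to depend on.
    def quote(self, request: PriceRequest) -> PriceReply: ...

class FlatPricing:
    # One team's in-process implementation of the contract.
    def __init__(self, unit_cents: int) -> None:
        self._unit_cents = unit_cents

    def quote(self, request: PriceRequest) -> PriceReply:
        return PriceReply(total_cents=self._unit_cents * request.quantity)

def checkout_total(pricing: PricingApi, sku: str, quantity: int) -> int:
    # Another team's code: it talks to pricing only through messages.
    return pricing.quote(PriceRequest(sku=sku, quantity=quantity)).total_cents

if __name__ == "__main__":
    print(checkout_total(FlatPricing(unit_cents=250), "ABC", 3))
```

The call is still a plain function call inside one process; only the message types and the interface are shared between the teams.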
I found this a refreshing article that has a nuanced view of different solutions. Also I was glad it acknowledged that different solutions make sense at different scales and times in an organization's life.
This is why I tend to build microservices that implement a bounded context. Then compose those different APIs, as necessary, to implement the public facing API.
I don't need granular level services, it places too much burden on operations, but I still get the separation of context, concerns and 'anti-corruption' layers that I'm looking for in my design.
His comments regarding the lack of compile-time checking of the interfaces between microservices made me think of GraphQL. Using GraphQL as the layer between microservices could alleviate this issue, especially if the usual "deprecate, don't version" approach of GraphQL is followed.
Normally GraphQL is talked about within the context of user facing applications interacting with servers but it could be quite useful for strongly typed machine to machine communication.
We already use GraphQL for our client facing API. We're going to start rolling out microservices at our startup and I think we'll use it for communication between microservices as well.
I'd be interested to hear if anyone has much experience using an actor-type model instead of microservices: things like Akka/.NET, Orleans, Erlang's OTP, etc.
shouldn't the goal be to do whatever provides the most value and builds your product or business the fastest? if you waste time with a microservice architecture where you could've done more, faster with a monolithic app that seems like a poor use of time.
"If I had 8 hours to chop down a tree, I'd spend the first 6 sharpening my axe" - Abraham Lincoln
If you delivered the initial thing in 4 weeks vs 8 but you'd spend way more in future development and developer time, would you say you made the right decision?
To put it another way, delivering the wrong thing faster doesn't make it any more right.
couldn't you just as easily say you build the wrong thing in more time with your hypothetical increase in future speed, but you go out of business because you never have any customers?
what you're saying is the right thing, if you have unlimited money.
If a micro service is independent it makes no calls to other micro services. Thus the system has a single service and that service is a monolith. Contradiction, thus a micro service cannot be independent.
Let's say ycombinator.com was news.y, www.y and apply.y, all served from the same "monolith". And they decided to make them into separate micro-services ... The best way is to make them all independent!