Instead, we've moved the state to a central, distributed store that everyone talks to. This allows you to do atomic transactions. Our store also handles fine-grained permissions, so your auth token decides what you're allowed to read and write.
One non-obvious consequence is that some microservices can now be eliminated entirely, because their API was purely about CRUD. Or they can be reduced to a mere policy callback -- for example, let's say the app is a comment system that allows editing your comment, but only within 5 minutes. ACLs cannot express this, so to accomplish it we have the store invoke a callback to the "owner" microservice, which can then accept or reject the change.
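To make the policy callback idea concrete, here is a minimal sketch of what the "owner" service's decision function might look like. The payload shape (`old`, `new`, `created_at`) and the function name are assumptions for illustration; the actual callback contract isn't described here.

```python
from datetime import datetime, timedelta, timezone

EDIT_WINDOW = timedelta(minutes=5)  # the "only within 5 minutes" rule


def comment_edit_policy(change, now):
    """Return True to accept the proposed change, False to reject it.

    `change` is a hypothetical payload: the stored object under "old"
    (None when the comment is being created) and the proposal under "new".
    """
    old = change["old"]
    if old is None:
        # Creating a new comment is always allowed by this policy.
        return True
    created_at = datetime.fromisoformat(old["created_at"])
    # Edits are accepted only while the comment is younger than the window.
    return now - created_at <= EDIT_WINDOW


# Usage: an edit four minutes after posting.
example_change = {
    "old": {"created_at": "2024-01-01T12:00:00+00:00", "text": "first"},
    "new": {"text": "edited"},
}
example_now = datetime(2024, 1, 1, 12, 4, tzinfo=timezone.utc)
accepted = comment_edit_policy(example_change, example_now)
```

The store would presumably call something like this before committing the write, so the owner service stays tiny: no routes, no persistence, just the rule.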
Another consequence is that by turning the data store into a first-class service, many APIs can be expressed as data, similar to the command pattern. For example, imagine a job system. Clients request work to be done by creating jobs. This would previously be done by POSTing a job to something like /api/jobs. Instead, in the new scheme a client just creates a job in the data store. Then the job system simply watches the store for new job objects.
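A toy sketch of the "API as data" idea, using an in-memory stand-in for the store. The class, its `create` and `watch` methods, and the `_id`/`_type` field names are all hypothetical; a real store would deliver changes over a changefeed rather than a direct function call.

```python
class InMemoryStore:
    """Toy stand-in for the central store: create objects, watch by type."""

    def __init__(self):
        self.objects = {}
        self.watchers = []  # list of (type_name, callback) pairs

    def watch(self, type_name, callback):
        # Register interest in newly created objects of one type.
        self.watchers.append((type_name, callback))

    def create(self, obj):
        self.objects[obj["_id"]] = obj
        # Notify every watcher registered for this object's type.
        for type_name, callback in self.watchers:
            if obj["_type"] == type_name:
                callback(obj)


store = InMemoryStore()
processed = []


def run_job(job):
    # The job system reacts to new job objects; here it just records them.
    processed.append(job["_id"])


store.watch("jobapp.Job", run_job)

# The client "requests work" by creating a job object: no POST to /api/jobs.
store.create({"_id": "job1", "_type": "jobapp.Job", "task": "analyze-text"})
# Objects of other types do not wake the job system.
store.create({"_id": "c1", "_type": "commentapp.Comment", "text": "hi"})
```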
Of course, this way of doing things comes with its own challenges. For example, how do you query the data, and how do you enforce schemas? We solved some of these things in a rather ad hoc way that we were not entirely happy with. For example, we didn't have joins, or a schema language.
So about a year ago we went back to the drawing board and started building our next-generation data store, which builds in and codifies a bunch of the patterns we have figured out while using our previous store. It has schemas (optional/gradual typing), joins, permissions, changefeeds and lots of other goodies. It's looking extremely promising, and already forms the foundation of a commercial SaaS product.
This new store will be open source. Please feel free to drop me an email if you're interested in being notified when it's generally available.
Other than the API (REST vs. whatever binary RPC protocol), it sounds very much like a standard database...
First of all, we're not an RDBMS, and don't pretend to be. I love the relational model, but there's a long-standing impedance mismatch between it and web apps that I won't go into here. There are clearly pros and cons. Our data store isn't intended as a replacement for classical relational OLTP RDBMS workflows.
If you let all apps share a single RDBMS, you're inevitably going to be tempted to put app-specific stuff in your database. This one app needs a queue-like mechanism, this other app needs some kind of atomic counter support, etc. You may even create completely app-specific tables. How do you compartmentalize anything? How do you avoid forcing different versions of apps to stick to the same strict schema? How do you incrementally upgrade your schemas without taking down all apps? How do you create denormalized changefeeds that encompass the data of all apps? How do you institute systemwide policies like role-based ACLs, without writing a layer in stored procedures and triggers that everything goes through? Etc. There are tons of things that are difficult to do with SQL, even with stored procedures.
I would argue that if you go down that route, you'll inevitably reinvent the "central data store pattern", but poorly.
The issue with a centralized data store is that your services are coupled together by the schemas of the objects that they share with other services. This means you can't refactor the persistence layer of your service without affecting other services.
All that said, a single source of truth does do away with distributed transactions, so I can see the appeal.
It's worth pointing out that you do have the same challenge in a siloed scenario, but the "bounded contexts" are separated by the applications themselves, with no chance of tight coupling because there's no way to tightly couple anything. In the silo version, apps can still point at each other's data (e.g. reference an ID in another app); there's just no way of guaranteeing that the data is consistent.
The coupling challenge is solved by design -- by avoiding designing yourself into tight couplings.
For example, let's say you desire every object to have an "owner", pointing at the user that "owns" the object. So you define a schema for User, and then every object points to its owner User. But now all apps are tightly coupled together.
In our apps, we typically don't intertwine schemas like that unless there's a clear sense of cross-cutting. An "owner" field would probably point to an object within the app's own schema: A "todoapp.Project" object can point its "owner" field at a "todoapp.User", whereas a "musicapp.PlaylistItem" can point to a "musicapp.User".
(Sometimes you do have clear cross-cutting concerns. An example is a scheduled job to analyze text. The job object contains the ID of the document to analyze. The job object is of type "jobapp.Job". The "document_id" field can point to any object in the store. The job doesn't care what the document is -- all it cares about is that it has fields containing text that can be analyzed. So there's no tight coupling of schemas at all, only of data.)
However... I have played with the idea of a "data interface" concept. Like a Java or Go interface, it would be a type that expresses an abstract thing. So for example, todoapp could define an interface "User" that says it must have a name and an email address. Now in the schema for todoapp.TodoItem you declare the "owner" field as type "User". But it's an interface, not a concrete type. So now we can assign anything that "complies with" the interface. If todoapp.User has "name" and "email", we can assign that to the owner, and if musicapp.User also has "name" and "email" with the right types, it is also compatible. But I can't assign, say, accountingsystem.User because it has "firstName", "lastName" and "email", which are not compatible.
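Here is a rough sketch of how such a structural compatibility check could work. The schema representation (a plain field-to-type mapping), the "string"/"url" type names and the extra "avatar" field are assumptions made up for illustration; nothing like this exists in the store yet.

```python
# An "interface" is modelled as required field name -> required field type.
USER_INTERFACE = {"name": "string", "email": "string"}

# Hypothetical concrete schemas for the types mentioned above.
SCHEMAS = {
    "todoapp.User": {"name": "string", "email": "string", "avatar": "url"},
    "musicapp.User": {"name": "string", "email": "string"},
    "accountingsystem.User": {
        "firstName": "string",
        "lastName": "string",
        "email": "string",
    },
}


def complies_with(schema, interface):
    """True if the schema has every interface field with a matching type.

    Extra fields on the concrete schema are fine, as with Go interfaces.
    """
    return all(
        schema.get(field) == field_type
        for field, field_type in interface.items()
    )


# Which concrete types could be assigned to an "owner" field of type User?
compatible = sorted(
    type_name
    for type_name, schema in SCHEMAS.items()
    if complies_with(schema, USER_INTERFACE)
)
```

With these schemas, `compatible` contains todoapp.User and musicapp.User but not accountingsystem.User, which lacks a "name" field.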
The central data store pattern arguably makes apps even more "micro", albeit at the expense of adding a dependency on the store. But the opposite pattern is to let each microservice have its own datastore, so you already have a dependency there.
It's just moving it out and inverting the API in the process; for many apps, the data store becomes the API. For example, we have an older microservice that manages users, organizations (think Github orgs, but hierarchical) and users' membership in those orgs. It has its own little Postgres database, and every API call is some very basic CRUD operation. We haven't rewritten this app to use our new data store yet, but when we do, the entire app goes away, because it turns out it was just a glorified gateway to SQL. A verb such as "create an organization" or "add member to organization" now becomes a mere data store call that other apps can perform directly, without needing a microservice to go through.
Secondly, all data operations can now be expressed using the one, canonical data store API, with its rich support for queries, joins, fine-grained patching, changefeeds, permissions, etc. Every little microservice doesn't need to reinvent its own REST API.
For example: The users/org app has a way to list all users, list all organizations, list all memberships in an organization, etc. Every app needs to provide all the necessary routes into the data in a RESTful way:
/organizations # All orgs
/organizations/123 # One org
/organizations/123/members # Members of one org
/organizations/123/members?status=pending # Members of one org that have pending invites
/users/42 # One user
/users/42/organizations # One user's memberships
With our new store, a client just invokes:
/query?q=*[is "orgapp.member" &&
organization._ref == "123"
&& status == "pending"]
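In practice the query contains spaces, quotes and ampersands, so a client has to URL-encode it before sending. A minimal sketch of that, where the base URL and the helper name are assumptions and only the `/query?q=` shape comes from the example above:

```python
from urllib.parse import parse_qs, urlencode, urlsplit

STORE_URL = "https://store.example.com"  # hypothetical base URL


def build_query_url(org_id, status):
    """Build the pending-members query from above as an encoded URL.

    Values are interpolated naively here; a real client would need to
    escape them or use query parameters if the store supports that.
    """
    query = (
        f'*[is "orgapp.member" && organization._ref == "{org_id}" '
        f'&& status == "{status}"]'
    )
    return f"{STORE_URL}/query?" + urlencode({"q": query})


url = build_query_url("123", "pending")

# Round trip: decoding the URL gives back the original query text.
decoded = parse_qs(urlsplit(url).query)["q"][0]
```

The point is that one endpoint plus a query replaces the whole family of hand-written routes.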
Second, what are the constraints of CDS? How much data can I pack into a single object? Into a single silo? How does bad behavior on the part of one caller affect another? What if CDS just doesn't work for a new service you're building?
I do appreciate that your company has invested in providing data storage as a service for yourselves, which I think is a much better idea than having each team rolling their own persistence. However, I think people would be very interested in how you've made sure that CDS isn't a SPOF for all of your data, as well as what kinds of things it isn't good at.
EDIT: I would also point out that there is a difference between having a single CDS and having StorageaaS that vends CDSs.
Our old "1.0" store architecture did in fact decompose things into multiple services. It had a separate ACL microservice that every microservice had to consult in order to perform permission checks. That was a really bad, stupid bottleneck.
For our new architecture, we decided to move things into a single integrated, opinionated package that's operationally simpler to deploy and run and reason about. It's also highly focused and intended for composition: The permission system, for example, is intentionally kept simple to avoid it blooming into some kind of all-encompassing rule engine; it only cares about data access, and doesn't even have things like IP ACLs or predicate-based conditionals. The idea is that if you need to build something complicated, you would generate ACLs programmatically, and use callbacks to implement policies outside of the store (the "comments only editable for 5 minutes" is an example of this), and maybe someday we'll move the entire permission system into a plugin so you can replace it with something else.
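As an illustration of "generate ACLs programmatically", here is a small sketch. The rule shape (subject/object/actions), the role names and the function name are assumptions for the sake of the example, not the store's actual ACL format.

```python
def acls_for_org(org_id, members):
    """Expand an org's membership into flat, simple ACL rules.

    `members` maps user id -> role ("admin" or "member"). The store only
    ever sees the generated rules; the role logic lives out here.
    """
    rules = []
    for user_id, role in sorted(members.items()):
        # Admins may read and write; ordinary members may only read.
        actions = ["read", "write"] if role == "admin" else ["read"]
        rules.append(
            {
                "subject": user_id,
                "object": f"organizations/{org_id}",
                "actions": actions,
            }
        )
    return rules


rules = acls_for_org("123", {"u2": "member", "u1": "admin"})
```

Anything richer than this, such as time windows, would go into a policy callback rather than into the rules themselves.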
It's also important to note that the store isn't the Data Store To End All Data Stores. It covers a fairly broad range of use cases (documents, entity graphs, configuration, analytics), but it's not ideal for all use cases. There are plenty of use cases where you'll want some kind of SQL database.
I think these kinds of access control rules can be expressed within an entitlement solution. These systems are often called RBAC+ABAC (role-based access control + attribute-based access control). The caller calls a PDP (policy decision point). The policy decision point is a rules engine that can take in the caller's application context (which, in your case, would include the current time and the time of the initial post).
A PDP is often implemented as a microservice, or even as a cache-enabled rules engine that, as an API, resides within the context of every caller (for a faster, lower-latency, more resilient solution).
These components are part of XACML.
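A very rough sketch of what such a decision point might look like for the comment-editing case, combining one role check with one attribute check. This is not XACML policy syntax; the request fields, the "moderator" role and the function name are assumptions, and only the Permit/Deny decision names are borrowed from XACML.

```python
from dataclasses import dataclass

EDIT_WINDOW_SECONDS = 5 * 60


@dataclass(frozen=True)
class Request:
    """Application context the caller passes to the decision point."""

    role: str
    action: str
    is_author: bool
    seconds_since_post: float


def decide(req):
    """Return "Permit" or "Deny" for a single request."""
    if req.action != "edit":
        return "Deny"
    # RBAC part: a (hypothetical) moderator role may always edit.
    if req.role == "moderator":
        return "Permit"
    # ABAC part: authors may edit while the post is within the window.
    if req.is_author and req.seconds_since_post <= EDIT_WINDOW_SECONDS:
        return "Permit"
    return "Deny"


decision = decide(
    Request(role="user", action="edit", is_author=True, seconds_since_post=120)
)
```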