I'll be honest with you: I thought this was a parody. It's SO abstract.
It's like you came from doing this abstract thing inside a big company & decided to do the same abstract thing as a startup. And describe it using the specific terminology used inside that specific team of SuperBigCo.
I don't have to work too hard to implement the control plane, as PostgREST takes care of generating the API and Postgres already has authorization controls built in. Authentication is the part that requires a bit of toil to figure out. The rest is managing the schemas of the control plane entities. Basically, I design the data model and most of the rest is generated for me.
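To make the "design the data, generate the rest" idea concrete, here's a toy sketch of the kind of URL-to-SQL translation PostgREST performs for you. This is purely illustrative: real PostgREST supports a much richer operator grammar, emits parameterized queries, and enforces Postgres roles/row-level security, none of which this toy does.

```python
from urllib.parse import parse_qsl

def rest_to_sql(table: str, query: str) -> str:
    """Translate a PostgREST-style filter string such as
    "region=eq.us-east-1" into a SELECT statement.
    Toy illustration only, not PostgREST's actual implementation."""
    clauses = []
    for column, expr in parse_qsl(query):
        op, _, value = expr.partition(".")
        if op == "eq":
            clauses.append(f"{column} = '{value}'")
        elif op == "gt":
            clauses.append(f"{column} > '{value}'")
        else:
            raise ValueError(f"unsupported operator: {op}")
    where = f" WHERE {' AND '.join(clauses)}" if clauses else ""
    return f"SELECT * FROM {table}{where}"
```

So `GET /tenants?region=eq.us-east-1` becomes a plain query against the `tenants` table: define the schema and the API falls out of it.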
The data plane is trickier, but I'm experimenting with using Postgres' streaming logical replication protocol to convert logical changes to the control plane data into business domain events that are forwarded onto the message queue. This is the part that uses the postgresql-replicant library I wrote, but other libraries written in Python exist and can do the same thing.
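The change-to-event conversion can be sketched as a pure function. I'm assuming a wal2json-style decoded payload shape and a `<table>.<verb>` event naming convention here, both of which are illustrative assumptions, not what postgresql-replicant actually emits:

```python
def change_to_event(change: dict) -> dict:
    """Map one decoded logical-replication change (wal2json-style
    shape, assumed for illustration) to a business domain event."""
    kind = change["kind"]   # "insert" | "update" | "delete"
    table = change["table"]
    data = dict(zip(change.get("columnnames", []),
                    change.get("columnvalues", [])))
    # Naming convention is an assumption: "<table>.<created|updated|deleted>"
    verb = {"insert": "created", "update": "updated", "delete": "deleted"}[kind]
    return {"type": f"{table}.{verb}", "payload": data}
```

Each event produced this way is what gets forwarded onto the message queue for downstream consumers.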
This then enables me to implement business logic/data-plane actions asynchronously downstream as isolated, stateful services that follow the event streams and react according to various policies. They can then update the control plane models as they progress, which could add more domain events to the stream, and so on. It's a bit like a functional-reactive architecture.
I don't know if it's a production-ready architecture. Monitoring replication stream performance can be tricky and integration testing is challenging. And managing changes to the business domain is not a fully solved problem: there's still a lot of exploration, tooling, and do-it-yourself duct-taping to do.
But it's simple enough that I can do a lot of work with very little code so far.
I do tend to model the business processes at a high level using domain-driven design and map the aggregates to services that ingest event streams. The services then react to the event streams in several ways: emitting events, updating control plane models, issuing new commands to other services, etc.
Each service keeps its own state internally and if I need to, I can blow away their state and replay all of the business domain events thanks to the durable message stream. That part is key... and is also the most duct-tape-and-toil area of this architecture.
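The blow-away-and-replay property is easy to show in miniature. This is a hedged sketch, not the author's actual services: the event names and fields are hypothetical, and a real implementation would also track checkpoints into the durable log:

```python
class TenantProjection:
    """A downstream service whose state is rebuilt purely from the
    event stream. Event shapes are hypothetical; the point is that
    local state is disposable because the durable log is the truth."""

    def __init__(self):
        self.tenants = {}

    def apply(self, event):
        if event["type"] == "tenant.created":
            self.tenants[event["id"]] = event["name"]
        elif event["type"] == "tenant.deleted":
            self.tenants.pop(event["id"], None)

    def replay(self, log):
        self.tenants = {}      # blow away local state...
        for event in log:      # ...and rebuild it from the durable stream
            self.apply(event)
```

Because `replay` is a fold over the log, corrupted or schema-obsolete service state can always be discarded and recomputed, which is exactly why the durability of the message stream is the key piece.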
I've been toying around with ideas to generalize this into a consolidated application framework but it's still pretty experimental stuff.
The high-level architecture isn't terribly novel, but having standardized tools for common operations, managing migrations of event schemas, managing checkpoints, etc., is still a work in progress.
Can you explain them in a more concrete, conversational way?
Does this service let me e.g. take any docker image and turn it into a SaaS, handling user accounts and billing etc?
Explainers seem to not cover _why_ you would want to separate these "planes". There are several reasons, and I'm no authority, but for starters:
* control messages will have different expectations around them: their volume and frequency, delivery guarantees, and the urgency with which they are processed. Treating this traffic separately means you can engineer appropriately for each.
* the last thing you want is for the control message "stop processing traffic from IP x.x.x.x port y" to be stuck behind traffic from said IP/port...
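The second point above can be sketched with two separate channels, so that control messages are never queued behind a data backlog. The queue names and message shapes here are illustrative, not from any particular system:

```python
import queue

# Separate channels so control traffic never waits behind data traffic.
control = queue.Queue()
data = queue.Queue()

def next_message():
    """Always drain the control channel before touching data traffic."""
    try:
        return ("control", control.get_nowait())
    except queue.Empty:
        pass
    try:
        return ("data", data.get_nowait())
    except queue.Empty:
        return None
```

Even if thousands of data messages are queued, a "stop processing" control message is picked up on the very next dispatch, which is the whole argument for the separation.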
In this context, the meaning is somewhat different. They are referring to administrative traffic vs. "actual work" traffic: auth, billing/accounting, configuration updates, that sort of thing. If you are running a SaaS and your customer is very security-conscious and wants none of their precious data to ever leave their VPC, you have two options: deploy your software into their VPC completely, which makes a variety of things like upgrades hard and increases complexity; or separate control actions from your "worker nodes" and storage, and only deploy the latter into the VPC. You can then work on your control panels, monitor usage, and continuously evolve various admin panels and config options using normal SaaS approaches, while the security-conscious customer knows that their core data is not leaving their virtual walls and only "bob ran a thing and stored results" goes to the vendor.
This post is about abstracting out the common bits of how one implements that, and allowing SaaS offerings to provide that sort of separation more easily.
The control plane is the central lifecycle management system that provides the SaaS experience for your Infra SaaS application: it manages the metadata for your application and pushes this information to all the data planes. Examples of lifecycle management operations include creating a user, creating a new organization, provisioning a data plane in a specific region, deleting a cluster, etc.
The data plane is the product that you want to sell to your customers, and the control plane is the central system that makes your product work in a self-serve way for your customers.
This example can also be mapped to internal use cases. Many companies manage their own infrastructure internally and end up having to build a central control plane to manage all the different infrastructure that they provide as a service to their developers. Hope this helps.
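One common way a central control plane "pushes information to all the data planes" is a reconciliation loop: diff desired state against what each data plane reports and emit lifecycle operations. This is a generic sketch of that pattern with made-up state shapes, not how any particular vendor implements it:

```python
def reconcile(desired: dict, actual: dict) -> list:
    """Diff the control plane's desired state against one data plane's
    reported state and emit lifecycle ops (shapes are illustrative)."""
    ops = []
    for cluster, spec in desired.items():
        if cluster not in actual:
            ops.append(("provision", cluster, spec))
        elif actual[cluster] != spec:
            ops.append(("update", cluster, spec))
    for cluster in actual:
        if cluster not in desired:
            ops.append(("delete", cluster, None))
    return ops
```

Running this loop per region/cloud is what lets one dashboard drive many physically separate deployments.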
Pretend we're a SaaS company offering a database as a service. Adding and removing users, and setting passwords, is control plane stuff. In a sufficiently web-scale system, adding and removing users becomes not just its own microservice but a collection of microservices that authenticate users, send updates to the main product database, and maintain their own separate database.
Overall I'd imagine there are a lot of parallels to other SaaS-ish architectures. One big divergence is that I'd consider the data plane to be a special kind of client (client in the same way that a user's phone or browser is). The big difference is that we (the company) ALSO manage the lifecycle of this "client" (i.e. shutdown, startup, repair, update). Having an untrusted client whose lifecycle you also manage can lead to some interesting design spaces.
But this raises some important questions. You're essentially outsourcing your entire company. Think about it: especially for an open source project, the main way you make money is to build a cloud version of your product. If someone else is doing that for you, what are you left doing? It's a dangerous place to be in.
What's to stop the owner of this service from realizing he can cut you out of the middle and just build a service out of your product himself?
E.g. that's a lot of AWS's M.O. recently; they have really robust internal infrastructure to spin up managed services for any project, and they make a lot of money doing that.
Unfortunately there are limited tools/resources out there to answer the question "how can I build a cloud deployment option for my open source project?" without 25 layers of abstraction. This is why open-source projects end up raising millions of dollars (e.g. Strapi and Appsmith, just off the top of my head). All this money just for all these companies to essentially build the same thing.
Ideally, there should be a service/tool (maybe thenile will be it) where I answer a few questions.
* Allow end users to deploy many instances?
* % premium per instance over cloud costs?
* License per instance?
* Pricing (tiered, volume, stairstep?)
* On the end customer's own infra?
* Min resources per instance?
* Airgapped per user/airgapped per instance/all instances running on the same cloud?
* Instance management API?
* Big Green button to update all instances at once?
And it should spit out a ready-to-deploy setup where I can start monetising my open-source project while concentrating on maintaining the project.
If the above exists, then congrats: you've disrupted the main reason open source projects raise millions of dollars.
One shortcut with all this is just to allow maintainers to charge a premium on deploy buttons.
For Infrastructure SaaS, it is a bit different. You typically have different customers provisioning your infrastructure in different clouds or regions, depending on where they have their own infrastructure. This leads to many physical deployments across many regions and cloud providers. At the same time, for the user, you need to provide a single-pane-of-glass experience where they can manage all their infrastructure from a single dashboard. This requires a central control plane that is responsible for all the lifecycle management operations and communicates metadata back and forth with all the data planes. Things like upgrades, observability, and user and tenant management all need coordination with the data planes. This makes the Infrastructure SaaS use case a bit different from standard B2B SaaS. Hope that helps.