I don't think people are qualified to reject this pattern unless they've spent some serious time working in these ecosystems. It took me a long time with a ton of production experience to evolve my thinking and truly appreciate CQRS/ES.
Like anything else, it's one tool in the tool bag, but like that giant pipe wrench that you're always looking for a reason to use, it's almost always the wrong tool for the situation. CQRS carries horrific complexity and requires commitment to a handful of golden rules or the entire thing comes crashing down.
> I don't think people are qualified to reject this pattern
I don't think most people are qualified to know what the hell this pattern truly is, much less put it into proper operational use. I've seen plenty of people attempt to use CQRS when they should have stuck to a simple CRUD model. At one point I was adjacent to a team that had built 8+ services to handle one or two trivial business processes.
CQRS isn't a fine wine or stinky cheese. If it smells funny for your case, it probably isn't the right choice.
What are the golden rules?
- Don't use the read side from the write side.
- Each aggregate should own its data (or more abstractly, the basics of DDD, bounded contexts).
As with all software architecture, you need to adapt the concept to the problem at hand: following the "rules to the letter" is hardly useful. Most successful CQRS systems are those that do not follow every rule (e.g. letting command handlers return response data makes for a much more convenient workflow).
It's easy to read this and go "of course" for a small conceptual system, but in practice even really experienced engineers want to break those rules in production systems - and it's a lot easier to break those rules than fix the system to respect them.
This is the problem with CQRS. The answer to so many of these little tweaks depends on a ton of system knowledge and contracts and promises, and it's really really easy to make the wrong choice.
Fine - but it isn't a matter of rejecting or accepting; but rather of understanding. And if someone isn't "qualified" to reject a certain architectural pattern... then they probably won't be very good at implementing it either.
So on balance I'd rather be in an environment where people are allowed to "reject" what they don't fully understand yet -- rather than (pretend to) go along with it (on the basis of some article they vaguely read about it, or "because X said so.")
Having tried and failed to apply it shouldn't automatically generalize the failure into a problem with the tool; it may instead point to the tool's inadequacy for the use case, or to its improper use (and this is useful learning as well).
> And while storing raw events may be "expensive" compared to any single optimization, all optimizations can be derived from raw events.
> The reverse is impossible.
But I have to say: the resources that this site links only served to confuse me. Greg Young may have popularized these concepts, but watching his talks left me with little practical implementation guidance, and Udi Dahan was even worse, much, much worse, in terms of leaving me helpless and confused.
What really helped me was "Designing Event Driven Systems", a promotional book by Confluent that nevertheless has many great sections with practical advice for implementing these patterns. Likewise the CQRS/ES FAQ.
And while this post says you don't need Kafka for CQRS/ES, Kafka sure does help. Kafka Streams is the ultimate tool for CQRS/ES. It contains all the primitives you need to do CQRS/ES easily. I am in the process of writing a blog post about my experience and am looking forward to sharing it. People love React/Redux, state as a function of reducers, and time travel debugging on the front-end: there is no reason you can't have that all the way down your stack. Kafka Streams makes this possible, and much easier than you'd think.
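The "state as a function of reducers" analogy maps directly onto event sourcing: current state is a left fold over the event log, and "time travel" is folding over a prefix of it. A minimal sketch, assuming hypothetical account events (in Kafka Streams these would arrive on a topic):

```python
from functools import reduce

# Hypothetical event shapes, for illustration only.
events = [
    {"type": "deposited", "amount": 100},
    {"type": "withdrawn", "amount": 30},
    {"type": "deposited", "amount": 5},
]

def apply_event(state, event):
    """Pure reducer: (state, event) -> new state, just like Redux."""
    if event["type"] == "deposited":
        return {"balance": state["balance"] + event["amount"]}
    if event["type"] == "withdrawn":
        return {"balance": state["balance"] - event["amount"]}
    return state  # ignoring unknown events eases versioning later

# Current state is a left fold over the whole log...
state = reduce(apply_event, events, {"balance": 0})

# ...and "time travel debugging" is just folding over a prefix.
state_after_two = reduce(apply_event, events[:2], {"balance": 0})
```

The same fold is what a Kafka Streams aggregation does for you continuously, with the state kept in a local store.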
We created Crux because we found ourselves routinely needing bitemporal functionality when building immutable systems that are capable of ingesting and interpreting out-of-order/late-arriving events.
So far we have decided against using Kafka Streams to keep our log-storage options very pluggable, but we pretty much implement the same mechanics.
Disclosure: product manager for Crux :)
Edit: the other big picture Confluent quote I like is "You are not building microservices, you are building an inside-out database" (Tim Berglund, Confluent, 2018) which is a perfect answer to this other quote: "The hardest part of microservices is your data" (Christian Posta, Red Hat, 2017)
My thoughts on the process are:
1. Being able to use a pure/functional event-driven model for the command model, and a standard relational model for queries, is the big payoff for us; we get the best of both worlds.
2. Our model does not encapsulate command-side and query-side updates in a single transaction, nor does it require that they live in the same database; this gives us a lot of benefits for scalability, but it introduces eventual consistency, and not having "read-your-writes" consistency can make things harder for our front-end devs.
A simpler model that does transactional updates to both sides might be a win for a lot of teams, and I still wonder if we should have gone that way.
3. We use Kafka for bulk dataflow and inter-service messaging, but not as the internal event store; that gives us some leeway for migrations and surgical edits to the event stream where necessary.
We've found that Kafka's immutability and retention properties do not make it a good fit for a primary source of truth; it's way too easy to "poison" a topic with a single bad message.
4. One thing that none of the books/frameworks do a good job talking about is external side effects with vendors/partners/legacy systems. That's definitely been the single hardest part of our implementation, and we're still evolving our patterns here.
Overall, though, it's been a great direction for us, and we're excited to keep pushing our architecture forward.
Given a query, a system returns some relatively useful model back.
Honestly, why is this so interesting?
Sure, you might want to add some autonomous component(s) that manipulate data before returning them.
A good example of applying CQRS to a small part of a system:
1. Data is constantly being written (commands) into the system, excessively.
2. You want to present a "snapshot" of the data, because trying to do it "realtime" for all your users will demand too many resources, and your system will come to a halt.
3. You create the "snapshot" every 10 seconds, from code in an autonomous component, and then store it as a serialized object in a data store. Sort of like a cache.
4. When a query comes in, the system loads up the "snapshot", deserializes it, and returns that.
Here you have two independent read and write systems. That's CQRS for you.
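The four steps above can be sketched in a few lines. Plain in-memory objects stand in for the write store and the snapshot store, and the 10-second timer is replaced by an explicit call:

```python
import json

write_store = []     # step 1: raw writes land here, excessively
snapshot_store = {}  # steps 3-4: serialized snapshots, like a cache

def handle_command(value):
    """Commands only ever touch the write side."""
    write_store.append(value)

def build_snapshot():
    """An autonomous component would run this every ~10 seconds."""
    summary = {"count": len(write_store), "total": sum(write_store)}
    snapshot_store["latest"] = json.dumps(summary)  # stored serialized

def handle_query():
    """Queries only ever see the snapshot, never the live writes."""
    return json.loads(snapshot_store["latest"])

for v in [10, 20, 30]:
    handle_command(v)
build_snapshot()
handle_command(99)      # arrives after the snapshot was taken...
snap = handle_query()   # ...so the query does not see it yet
```

The query result lags the writes by up to one snapshot interval, which is exactly the eventual consistency trade-off being described.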
Do not apply this pattern with a loose hand.
1. There are 2 event handlers for "writes". One handler writes normalized data in a PostgreSQL DB. The other handler writes denormalized data into Firestore.
2. Our frontend uses Firestore so mutations are reflected realtime in the frontend. We never found a need for the command to return data. There is also no need for complex queries in Firestore since our data is denormalized and optimized for reads.
3. The PostgreSQL DB is useful for reporting and complex queries. Our frontend app displays this data only in the reports area.
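The two-handler setup described in (1)-(3) can be sketched like this; plain dicts stand in for the PostgreSQL tables and the Firestore collection, and the event shape is hypothetical:

```python
# Stand-ins: normalized tables (PostgreSQL) and denormalized docs (Firestore).
pg_orders, pg_customers = {}, {}
firestore_docs = {}

def pg_handler(event):
    """Handler 1: normalized rows, for reporting and complex queries."""
    pg_customers[event["customer_id"]] = {"name": event["customer_name"]}
    pg_orders[event["order_id"]] = {
        "customer_id": event["customer_id"],
        "total": event["total"],
    }

def firestore_handler(event):
    """Handler 2: one denormalized document, optimized for reads."""
    firestore_docs[event["order_id"]] = {
        "customer_name": event["customer_name"],  # duplicated on purpose
        "total": event["total"],
    }

def dispatch(event):
    """Every write event fans out to both handlers."""
    for handler in (pg_handler, firestore_handler):
        handler(event)

dispatch({"order_id": "o1", "customer_id": "c1",
          "customer_name": "Ada", "total": 42})
```

The frontend reads the denormalized document directly with no joins, while reporting queries join the normalized tables.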
So far, I don't see how things can get confusing with this pattern.
It's in fact a very simple pattern. But I wouldn't use it system wide, because not everything has to be eventually consistent.
You are describing CQS (Command Query Separation), a la Bertrand Meyer, rather than CQRS. The only thing that CQRS says is that the read and write paths are different. It does not preclude a command from returning a response - or for that matter being handled synchronously.
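The point about commands being allowed to return a response, and to run synchronously, fits in a few lines. A hypothetical counter service, for illustration only:

```python
class CounterService:
    """Separate read and write paths, but nothing is asynchronous."""

    def __init__(self):
        self._writes = []       # write path: append-only log of commands
        self._read_model = 0    # read path: a projection kept separately

    def increment(self, by):
        """A command. Nothing in CQRS forbids it returning a response,
        and here the projection is updated synchronously."""
        self._writes.append(by)
        self._read_model += by
        return self._read_model  # returning a result is fine

    def current_value(self):
        """A query: no side effects, reads only the projection."""
        return self._read_model

svc = CounterService()
first = svc.increment(3)
second = svc.increment(4)
value = svc.current_value()
```

Strict CQS would say `increment` must not return anything; CQRS only asks that `increment` and `current_value` go down different paths.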
Why this pattern especially?
I would say you shouldn't apply any pattern with a loose hand...
CQRS is not a system wide pattern. It should be applied in small contexts and corners of domains.
I don't agree that every design pattern has the same level of impact on a system. By far IMHO.
Also the cached version: https://webcache.googleusercontent.com/search?q=cache:jZXfTY...
I was working on the backend systems for electric car charging. When real-world events happened (start-charging, stop-charging, etc.), we wrote them directly to Kafka. It was up to other services to interpret those events, e.g. "I saw a start then a stop, so I'm writing an event to say user-has-debt". Yet another service says "I see a debt, I'm going to try to fix this by charging a credit card". I guess you'd call the above the 'C' part of CQRS.
But Kafka by itself is not great for relational queries. So we had additional services for, e.g history. The history service also listened to starts, stops, debts, credits, etc. and built up a more traditional SQL table optimised for relational queries, so a user could quickly see where they had charged before.
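The "interpret the raw events" step can be sketched like this; the event shapes are hypothetical stand-ins for what the real services read off Kafka:

```python
# Hypothetical raw events, as they would appear on the topic.
events = [
    {"type": "start-charging", "user": "u1", "kwh": 0},
    {"type": "stop-charging", "user": "u1", "kwh": 12},
    {"type": "start-charging", "user": "u2", "kwh": 0},
]

def debt_service(log):
    """'I saw a start then a stop, so I write a user-has-debt event.'"""
    open_sessions, derived = {}, []
    for e in log:
        if e["type"] == "start-charging":
            open_sessions[e["user"]] = e
        elif e["type"] == "stop-charging" and e["user"] in open_sessions:
            del open_sessions[e["user"]]
            derived.append({"type": "user-has-debt",
                            "user": e["user"], "kwh": e["kwh"]})
    return derived

debts = debt_service(events)
# u2 started but never stopped, so no debt event is derived for them yet.
```

A separate history service would consume the same raw log and fold it into SQL rows instead, which is what makes the pattern CQRS-shaped.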
The issues we had were:
1) Where's the REST/Kafka boundary? I.e. when should something write to the Kafka log as opposed to POSTing directly to another service? E.g. if a user sets their payment method, do we update some DB immediately, or do we write the fact that they set their payment method onto Kafka, and have another service read it?
2) Services which had to replay from the beginning of time took a while to start up, so we had to find ways to get them not to.
3) You need to be serious about versioning. Since many services read messages from Kafka, you can't just change the implementation of those messages. We explicitly versioned every event in its class name.
Worth it? For our use case, hell yeah.
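The per-event versioning described in (3) might look like the following; the class and field names are hypothetical:

```python
class UserSetPaymentMethodV1:
    def __init__(self, user_id, card_token):
        self.user_id = user_id
        self.card_token = card_token

class UserSetPaymentMethodV2:
    """An incompatible change gets a new versioned class rather than
    silently mutating V1, since old consumers still read V1 messages."""
    def __init__(self, user_id, card_token, billing_country):
        self.user_id = user_id
        self.card_token = card_token
        self.billing_country = billing_country

HANDLERS = {
    "UserSetPaymentMethodV1":
        lambda e: (e.user_id, e.card_token, "unknown"),
    "UserSetPaymentMethodV2":
        lambda e: (e.user_id, e.card_token, e.billing_country),
}

def handle(event):
    # Dispatch on the versioned class name, as the comment describes.
    return HANDLERS[type(event).__name__](event)

v1_result = handle(UserSetPaymentMethodV1("u1", "tok"))
v2_result = handle(UserSetPaymentMethodV2("u2", "tok", "DE"))
```

The cost is that handlers accumulate one branch per live version until the old events age out or get rewritten.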
I believe the term that's emerging for this issue is "collapsing CQRS", and how you handle it is application-dependent (are you using plain synchronous HTTP requests? websockets?). In my case, the HTTP server has a producer that writes to a Kafka topic and a consumer that consumes the answers. The HTTP request waits until the answer appears on the answer topic.
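The request/answer-topic flow can be sketched with in-memory queues standing in for the two Kafka topics (a correlation id ties the answer back to the waiting request):

```python
import queue
import threading
import uuid

# queue.Queue stands in for the command topic and the answer topic.
command_topic, answer_topic = queue.Queue(), queue.Queue()

def backend_worker():
    """The command handler on the far side of the 'topic'."""
    cmd = command_topic.get()
    answer_topic.put({"correlation_id": cmd["correlation_id"],
                      "result": cmd["amount"] * 2})

def http_handler(amount, timeout=2.0):
    """Produce the command, then block until our answer appears."""
    correlation_id = str(uuid.uuid4())
    command_topic.put({"correlation_id": correlation_id,
                       "amount": amount})
    while True:
        answer = answer_topic.get(timeout=timeout)
        if answer["correlation_id"] == correlation_id:
            return answer["result"]
        answer_topic.put(answer)  # someone else's answer; requeue it

threading.Thread(target=backend_worker, daemon=True).start()
result = http_handler(21)
```

In a real deployment each server instance would consume a partition of the answer topic and hold the pending HTTP requests in a map keyed by correlation id rather than requeueing.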
> 2) Services which had to replay from the beginning of time took a while to start up, so we had to find ways to get them not to.
Kafka Streams local state makes this fast, unless you need to reprocess.
> 3) You need to be serious about versioning. Since many services read messages from Kafka, you can't just change the implementation of those messages. We explicitly versioned every event in its class name.
Yes, this is tricky. In my case, I either add fields in a backwards compatible manner, or rebase/rewrite event topics and roll them out while unwinding the previous version of the topic that may still be in use. The former is obviously the simpler option.
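The "add fields in a backwards compatible manner" option usually means defaulting the new field when reading old events. A tiny sketch with hypothetical event shapes:

```python
def upcast(raw):
    """Read old and new events alike by defaulting a field added later."""
    return {
        "user": raw["user"],
        "amount": raw["amount"],
        "currency": raw.get("currency", "EUR"),  # field added in v2
    }

old_event = {"user": "u1", "amount": 10}                      # pre-v2
new_event = {"user": "u2", "amount": 5, "currency": "USD"}    # v2

old_upcast = upcast(old_event)
new_upcast = upcast(new_event)
```

As long as every consumer goes through the upcaster, old messages on the topic never need rewriting.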
Right now the system is running with ~800 million events, with roughly 1.5M added per day :)
He also has this: https://leanpub.com/esversioning
This blog by Jay Kreps (Linked-in) is also a good read: https://engineering.linkedin.com/distributed-systems/log-wha...
- Command and query handlers being individual (testable) classes is great for general organisation and neatness
- Beats 'Transaction Script' as a pattern any day
- Can wrap command/query execution with various value-add pipelines: logging, timing, caching (queries)
- Can direct queries to use read replicas
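The "value-add pipelines" point can be sketched by composing wrappers around an individual handler class; the handler and message names are hypothetical:

```python
import time

class GetUserQuery:
    def __init__(self, user_id):
        self.user_id = user_id

class GetUserHandler:
    """An individual, easily testable handler class."""
    def handle(self, query):
        return {"id": query.user_id, "name": "Ada"}

def with_logging(handler, log):
    """Wrap a handler object; record what gets handled."""
    def handle(msg):
        log.append(f"handling {type(msg).__name__}")
        return handler.handle(msg)
    return handle

def with_timing(handle, timings):
    """Wrap a handle function; record how long each call takes."""
    def timed(msg):
        start = time.perf_counter()
        try:
            return handle(msg)
        finally:
            timings.append(time.perf_counter() - start)
    return timed

log, timings = [], []
pipeline = with_timing(with_logging(GetUserHandler(), log), timings)
result = pipeline(GetUserQuery("u7"))
```

A caching wrapper for queries, or a read-replica router, slots into the same chain without the handler classes knowing about any of it.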
Some dirty tricks:
- Service layer (API endpoint controllers) does pre-validation, declaratively where possible, to enable OpenAPI/Swagger documentation of failure modes
- We do really care about whether commands finish or not - so they do run synchronously in 99.9% of cases. They can be put to a queue, but usually the caller wants to know it finished
- Also, they can throw exceptions with error results... it seemed a trade-off that's worked well enough
Recently had an idea to steal some concepts from serverless, leveraging this: we could measure command/query performance and resource use (CPU/RAM), and, based on some heuristic, farm out execution of one or the other to a separate process. Commands, queries and query results are all serialisable.
Event Sourcing and separate stores seemed a bridge too far. It definitely wants loads of expertise/experience and careful design, whereas the above is easy as pie to get going with.
Anyway, it's made for some enjoyable enough development.
- Compared with a repository pattern or typical fat-fingered ORM, separating command and query models in code can make it easier to let the data store / search engine do the things it has been optimised to do well on the query end (joins, views, etc), rather than over-fetching data and doing too much of the translation in code.
I'd be pretty interested to see the same "store" working on frontend and backend and every redux event being queued up and (eventually) synced to the server to provide a log of what happened. Obviously this leads to building an analytics system for your product quite quickly.
A while ago I built a system that would use a React CQRS plugin and provide optimistic concurrency - allowing really fast UIs at the expense of users having to deal with failed commands if a breaking error occurs:
I think the code examples are out of date so don’t worry about downloading those until the book is officially released. This guy has had some great success stories implementing CQRS/Event Sourcing at his place of work in the last year.
But I would suggest just read up and watch a little online and explore on your own. E.g any talks by Greg Young. The OP article has good further reading links. The pure concepts are simple. Implementations vary.
That being said, I converted a failed application that needed rollback and history from CQRS to standard CRUD with trigger-based operation logging, and achieved the same effect with about 1/10th the code. The architecture was a lot more accessible to lower-level devs too, greatly reducing maintenance costs.