Basically Redux is more or less already doing event sourcing on the client. You have actions ("events") that tell your global state what the next state should be, and replaying the history of all actions will deterministically get you to your current state. The only thing you need to do is also persist that history of actions in your state (which is already commonly done for implementing undo and time travel).
Then all that is left is to make your server use these same events too! This means whenever you save to the server, rather than doing GraphQL/REST-style per-resource updates, you just sync any unsaved Redux actions from the client. The server saves these actions and optionally replays them to get a snapshot of current state (or doesn't - that's just an optimization; the queue of actions is the first-class citizen now).
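A minimal sketch of that idea - the reducer and action names here are made up, but the point is that state is purely a fold over the action history:

```javascript
// A reducer: pure function from (state, action) to the next state.
function cartReducer(state = { items: [] }, action) {
  switch (action.type) {
    case "ITEM_ADDED":
      return { items: [...state.items, action.item] };
    case "ITEM_REMOVED":
      return { items: state.items.filter((i) => i !== action.item) };
    default:
      return state;
  }
}

// Replaying the full action history deterministically rebuilds state.
const replay = (actions) =>
  actions.reduce((state, action) => cartReducer(state, action), undefined);

const history = [
  { type: "ITEM_ADDED", item: "book" },
  { type: "ITEM_ADDED", item: "pen" },
  { type: "ITEM_REMOVED", item: "pen" },
];
console.log(replay(history)); // current state: a single item, "book"
```

Syncing to the server is then just shipping the unsaved tail of `history`; the server can fold the same reducer over it whenever it wants a snapshot.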
Afterwards, realtime collaboration, undo/time travel, history/audit logging, etc. all come for free.
This is a really nice way to get an event shape design that is consistent between client and server.
For examples, see https://news.ycombinator.com/item?id=15946425 or https://github.com/google/boardgame.io.
(I am making it sound a little simpler than it is. There's a lot of details to get right with state design, tagging actions as relevant to be persisted, multiplayer conflict resolution, app versioning issues, event queue compaction, serverside permissions validation, etc. But to get a prototype up and running quickly what I said above will work. Boardgame.io didn't worry about those details but it's still a nice codebase to study)
The next step for improvement would be to split the event-receiving side from the event-displaying side to get the CQRS pattern.
I didn’t write any docs or anything, it was just playing around! In index.js there are some routes that store events and some that show you a projection of the state.
I really hope event sourcing will get more traction. The idea that you defer building your state model is, in my opinion, incredibly strong. But there are some open questions that I find hard to answer! Like instant feedback to the client, replayability of side effects, ensuring ordering when you have multiple services, and so on!
Side effects caused by action handlers are also something to beware - don't launch the missiles when you're replaying the past.
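One common way to handle this - sketched here with made-up names - is to thread a replay flag through the dispatcher, so side effects only fire for live actions and never during playback:

```javascript
// Wrap a reducer so side effects are gated behind a replay flag.
function makeDispatcher(reducer, sideEffects) {
  let state = reducer(undefined, { type: "@@INIT" });

  function dispatch(action, replaying = false) {
    state = reducer(state, action);
    if (!replaying) sideEffects(action); // skipped when replaying the past
  }

  function replay(actions) {
    state = reducer(undefined, { type: "@@INIT" });
    actions.forEach((a) => dispatch(a, true));
  }

  return { dispatch, replay, getState: () => state };
}

let missilesLaunched = 0;
const d = makeDispatcher(
  (s = { count: 0 }, a) => (a.type === "FIRE" ? { count: s.count + 1 } : s),
  (a) => { if (a.type === "FIRE") missilesLaunched++; }
);

d.dispatch({ type: "FIRE" });                    // live: effect runs once
d.replay([{ type: "FIRE" }, { type: "FIRE" }]);  // replay: state updates, no effects
```

After this, state reflects both replayed actions, but the missile counter is still 1.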
I was hired into one $100m+ greenfield project (silly money by a silly investor) where the novice technical management wanted everything done "perfect" - no room for compromise.
The project failed, and the 600 people who were hired over the 18 months were then all fired.
Every piece of data was to be event-sourced, every data transaction on a global message queue (Kafka) - the ultimate pure event sourcing model.
The big expenses with event sourcing are up-front design of protocols, backward compatibility of protocol changes (if you got it wrong), and problematic structuring of coordinated queries (using coordinators.)
Trivial stuff like serialisation starts to become 30%+ of your development cost.
The simplest example we had was the creation of a new user on sign-up.
But the authentication module was not the same as the personalisation module.
So we had authentication creation, then the personalisation module needed to learn about the user and create the personalisation defaults, as a coordinated, distributed query.
All with Kafka. The next thing is you discover you need RPC and not pub/sub. So you end up seeing confused developers bending the architecture out of shape, doing RPC over pub/sub - which looks like a dog's dinner.
My general view of event-sourcing is "do it in the small" for specific problems where you need it.
If you think it's the silver bullet, then please read Fred Brooks again, and stop pushing your golden solution to all things on us all, please.
BTW, you don't need a fancy framework to keep time-versioned histories in a relational database (I realise RDBMS doesn't scale for all systems, but it does for many.) You just need to design your schema well, and preferably, make your writes go through stored procedures which transactionally maintain history and current schemas.
I can vouch that at least 30% of development cost went into things that are otherwise very trivial. The risk of defining the wrong domain boundaries early on has cumulative costs, and outside of specific use cases the benefits do not outweigh the disadvantages.
Even things such as business intelligence need to have knowledge of the events and how they reduce to comprehensible state. For a startup this can turn into a disaster of dealing with technology instead of focusing on solving problems.
Lastly, GDPR really changes things when you need to be able to anonymise or delete data. It doesn't play too well with the event sourcing model.
It almost seems obvious to just cut out reactors when you're replaying, but what if you introduce an almost invisible dependency on a reactor's results that makes the playback of application state non-deterministic? The old joke "then just don't do that" comes to mind, but it would be nicer if the rules about what's allowed to do what, and more detail on what to avoid, were spelled out here.
It almost seems like allowing reactors to generate new events is a mistake! How do you keep that from becoming an unmanageable mess?
One solution I implemented was to log all of the ‘reactions’. Then, when reacting, I checked whether the reaction had already happened, to prevent repeated side effects. For example:
Projections only create/update/delete views and are idempotent.
Event handlers could do side effecty things like emails or generate another command that goes back into the system.
While playing back events, I just didn't hook up the event handlers at all.
Worked beautifully and it was very liberating to be able to wipe out your entire read side and recreate it either exactly the same way or tweak it to handle evolving requirements.
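The reaction-log idea described above might look roughly like this (all names illustrative): before performing a side effect, check a log keyed by event id, so a redelivered or replayed event doesn't repeat the effect.

```javascript
// Reaction log: records which side effects have already happened,
// keyed by (reaction name, event id).
const reactionLog = new Set();
const sentEmails = []; // stand-in for a real email provider

function onUserSignedUp(event) {
  const key = `welcome-email:${event.id}`;
  if (reactionLog.has(key)) return; // reaction already happened, skip
  sentEmails.push(event.email);     // the actual side effect
  reactionLog.add(key);
}

// The same event delivered twice only sends one email.
onUserSignedUp({ id: "evt-1", email: "a@example.com" });
onUserSignedUp({ id: "evt-1", email: "a@example.com" });
```

Projections stay idempotent for free, since they only write views; it's the effectful handlers that need this guard.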
In order to achieve this, I use a process manager.
Say the event "User Signed Up" is dispatched. A process manager will then execute the "Send Welcome Email" command; the aggregate checks whether the user aggregate root is in a state where a welcome email should be sent (something like a `welcome_email_sent` flag). Depending on the result from your email provider, either the event "WelcomeEmailSuccessfullySent" or "WelcomeEmailNotSent" is dispatched.
It is possible to react to the erroneous event with another process manager and retry.
It's an amazing technique for implementing eventual consistency!
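A toy sketch of that flow, using the event names from the comment above and a faked email provider; everything else (the function shapes, the retry wiring) is assumed for illustration:

```javascript
// Process manager: reacts to events, checks the aggregate's flag,
// performs the side effect, and emits a success/failure event.
function processManager(aggregate, events, sendEmail) {
  const out = [];
  for (const event of events) {
    const shouldSend =
      (event.type === "UserSignedUp" || event.type === "WelcomeEmailNotSent") &&
      !aggregate.welcome_email_sent;
    if (!shouldSend) continue;
    if (sendEmail(aggregate.email)) {
      aggregate.welcome_email_sent = true;
      out.push({ type: "WelcomeEmailSuccessfullySent" });
    } else {
      out.push({ type: "WelcomeEmailNotSent" });
    }
  }
  return out;
}

// Fake provider: fails on the first call, succeeds on the retry.
let calls = 0;
const flakyProvider = () => ++calls > 1;

const user = { email: "a@example.com", welcome_email_sent: false };
const emitted = processManager(user, [{ type: "UserSignedUp" }], flakyProvider);
// First pass emits WelcomeEmailNotSent; feeding it back in retries the send.
const retried = processManager(user, emitted, flakyProvider);
```

The failure event going back into the system is what makes the retry loop possible - eventual consistency driven entirely by events.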
I've also learned that, as usual, I didn't know enough and that others had gotten to this problem and thought about it more thoroughly than I ever could.
> These questions could be answered in seconds if we had a full history.
I agree, but that does not necessarily require an event sourcing design. I think the better fit here is bitemporal databases.
A bitemporal database can answer pretty much any query about what is currently true, what was true, and what will be true in future; about the history of a value over a span of time or at a point in time; and not only about what "is" or "was" true, but about when you believed it to be true.
So I can not only ask "what was the total value of the cart last thursday", I can ask "what did we believe the total value was, before we applied a correction?"
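A toy illustration of that second kind of query, assuming a naive in-memory fact store where each fact carries both a valid-time (when it was true in the world) and a transaction-time (when we recorded it); all names and dates are made up:

```javascript
// Each fact has valid-time (validFrom) and transaction-time (recordedAt).
const facts = [
  // Cart total was 100 on Thursday, recorded Thursday.
  { key: "cart-total", value: 100, validFrom: "2023-01-05", recordedAt: "2023-01-05" },
  // Correction recorded Friday: Thursday's total was really 90.
  { key: "cart-total", value: 90, validFrom: "2023-01-05", recordedAt: "2023-01-06" },
];

// "What did we believe the value was at validTime, as of asOf?"
// ISO date strings compare correctly lexicographically.
function believedValue(key, validTime, asOf) {
  const candidates = facts
    .filter((f) => f.key === key && f.validFrom <= validTime && f.recordedAt <= asOf)
    .sort((a, b) => a.recordedAt.localeCompare(b.recordedAt));
  return candidates.length ? candidates[candidates.length - 1].value : undefined;
}

believedValue("cart-total", "2023-01-05", "2023-01-05"); // belief on Thursday
believedValue("cart-total", "2023-01-05", "2023-01-07"); // belief after correction
```

Asking with an `asOf` of Thursday gives the pre-correction belief; asking after Friday's correction gives the corrected value - both for the same valid-time.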
I've recently given this general area of systems a lot of thought, trying to square away several different schools of thought around data. Principally stream processing in the style of Akidau/Chernyak/Lax, bitemporal databases as described by Snodgrass and dimensional modelling as described by Kimball. Plus some ideas pinched from Concourse and a hasty skimming of Enterprise Integration Patterns by Hohpe, Woolf et al.
As it happens I will be trying to pitch some of this lunatic handwaving to colleagues this coming week. My basic goal is that you should have pluggability without needing to rewrite the universe. Streaming without giving up tables. Tables without having to convert them always to streams. Different views of data treated on their own terms, in their own form, without elevating one or the other to being The One True Way Of Managing Data.
Alternatively, I'm wrong.
I'm curious if you think this is because the primitives don't exist (it seems everyone doing CQRS/ES rolls their own) or if it is an "essential complexity" of doing CQRS/ES?
I'm leaning more towards the former and less towards the latter, although I'm only a month down the road of implementing a CQRS/ES system.
But I have also come to think that streaming and eventing systems overstate the argument that the stream is the "true" system. It is and it isn't.
The analogy a lot of folks, including streaming folks, hit on before I did is to calculus. But I think it actually shows that in "streams and tables", neither is the truest of them all.
I can take the instantaneous velocity of an object, or I can take it over a span of time, or I can calculate its acceleration, or perhaps the distance that has been traveled. These are all functions that can be reached from each other by differentiation or integration. But none of them is the high lord master formula. They are just different representations that make sense in different cases.
I've read before that this is not recommended by people who have done it (aside from consulting firms or people promoting Kafka). Can anyone chime in with a reason for this?
How do you synchronize multiple events? How do you handle partial system outages? What is the retry strategy for failed events and how do we handle inconsistent data?
Then comes the fact that you've more than doubled your data size, since you need your source of truth plus a copy in each read view - and that gets expensive quickly.
I agree with Fowler - CQRS/ES is probably too complicated and you should avoid it unless you have a well described bounded domain that fits the model.
This has been my lived experience with it, as well. I find that, in retrospect, the complexity was really not worth it.
And now, with regulation like GDPR, how do you handle things like the right to be forgotten in an (and especially if it's a legacy app) ES/CQRS architecture? What I've personally discovered as the approach we implemented at work is horrifically complex, to borrow your words.
Funny thing is that the traditional stateful model handles these scenarios horrendously as well. Probably more so, since you have no history to reconcile with.
The canonical description of the domain model pattern (also from Fowler) calls for it to be _considered_ in cases where there are "complex and ever-changing business rules". He goes on to say that "if all you have is some sums and null checks, a different transaction processing pattern is more appropriate".
I have seen a lot of event-sourced systems, and the vast majority of those which fail fall into the "just sums and null checks" case rather than the complex-rules one.
Make an event sourcing system single-threaded, and the problems of synchronisation don't exist. Make a "traditional" RDBMS system distributed, and the problems of synchronisation exist.
If you weren't building a distributed (or parallelised) system to start with, why introduce it with event sourcing? If you were, how were you going to solve all those problems with a traditional RDBMS approach?
The solutions are more or less the same.
Perhaps "retry strategy for failed events" is new - but how would you have handled a replica failing in a distributed RDBMS application?
The data size issue in my experience pales in comparison to the problem of having the business ask questions you can't answer because you threw away data. How high is your transactional rate, anyway? Views shouldn't be a data size concern, as they should only contain the pertinent data. In many cases you should be able to hold views completely transiently in memory. It's relatively rare that a view is expensive to compute from an ordered history.
> I agree with Fowler - CQRS/ES is probably too complicated and you should avoid it unless you have a well described bounded domain that fits the model.
Given the model fits any domain in which things happen, this might not be the best way to determine whether ES is appropriate - it's probably more along the lines of how important history is, how important flexibility for future use of data is, and how important time to market is. If we would give up security to get a product out sooner, we would pretty quickly give up a total history of the system, too.
There's also more than one way to skin this cat - you can use transaction scripting and log transactions; you can use database table level auditing; you can have a high level business intent audit log. Probably others. They all have complexities and drawbacks.
I don't personally find CQRS/ES particularly complicated, having worked with it a few times. I don't apply it to every project, the same way I don't apply any of the other history mechanisms to every project, but I'll reach for it over those other mechanisms, because I prefer the costs of ES over the costs of those alternatives.
You will simply get things faster without it, and can always introduce it later to progressively phase out components with lots of debt.
> This post has been created by A, who is a temp, so it must be approved by an editor first. How do I check those things before validating my command? Let's forget about that - just drop in a link (maybe) to the saga pattern and consider it done.
Some things come for free, but a lot of easy things become a lot more complex. This gets hidden in order to sell the new silver bullet. What's funny is they often use git as an example: remember, you rarely use git alone when you want to do things like code review and permission handling.
That said, I completely agree with your assessment. Event sourcing makes simple things a lot harder. It should be used sparingly, only for those situations when the history is genuinely useful.
Has anyone here implemented a system like this based on Splunk, and what was your experience while doing so?
Would be nice if the events here were compatible