Event sourcing isn't nearly as common knowledge among new programmers as the CRUD one-row-per-entity pattern, and it really should be. I liken it to introducing version control for your data: when immutable updates are your canonical source, no matter how much the system behind them or the business requirements change, and no matter how many teams are deriving different things from them in parallel, they can all work off of the same data and "merge" their efforts together.
The one downside is that shifting your business logic to read time means that you need very efficient ways of accessing and memoizing derived data. For some applications, this can be as simple as having the correct database indices over your WhateverUpdates tables, fetching all updates into memory, and merging them on each request. For others, you'll need a real-time stream processing pipeline to preemptively get your derived data into the right shape in a cache. Those are more moving parts than your typical monolith app, but the tradeoff can be well worth it.
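For the simple case, "merging on each request" is just a left fold over the update rows. A minimal Scala sketch (the account/update types here are made up for illustration):

    // Hypothetical update events for an account.
    sealed trait AccountUpdate
    case class Deposited(amount: BigDecimal) extends AccountUpdate
    case class Withdrawn(amount: BigDecimal) extends AccountUpdate

    // Derive the current balance at read time by folding over all
    // updates, oldest first.
    def currentBalance(updates: Seq[AccountUpdate]): BigDecimal =
      updates.foldLeft(BigDecimal(0)) {
        case (balance, Deposited(amount)) => balance + amount
        case (balance, Withdrawn(amount)) => balance - amount
      }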
One benefit to actually using event sourcing with a stream processing system is that, in many cases, it can be the most effective way to scale both traffic capacity and organizational bandwidth, much in the same way that individually scalable microservices can (and it's fully compatible with that approach!). Martin Kleppmann at Confluent (a LinkedIn spinoff creating and consulting on stream processing systems) writes some great and highly approachable articles about this. Highly recommended reading:
http://www.confluent.io/blog/making-sense-of-stream-processi...
http://www.confluent.io/blog/turning-the-database-inside-out...
The CRUD one-row-per-entity pattern is common because it's enough for most projects. It works well with ORMs so you can build quickly and securely. And most of the time, performance isn't an issue and having a history of an entity is unnecessary.
I'm worried that event sourcing is going to become this year's over-applied design pattern, with libraries in every language for every database and blog posts that recommend it be used on every project.
It's a good idea, very useful - in the right hands on the right projects. But it makes sense that junior devs normally use CRUD because that's normally the right solution. At least until better tools come along.
> The CRUD one-row-per-entity pattern is common because it's enough for most projects. It works well with ORMs so you can build quickly and securely.
If by "works well", you mean it works until someone asks for historical data - then IT guy has to say w/ a straight face "we lost it". This is unacceptable considering the value of data and the strategic leverage it can have today.
Considering that immutable fact tables are the most stable data model, that companies often have to re-invent them (poorly) on top of relational databases at some point, that storage is often not a problem, and that clean historical data is crucial for data science, there are increasingly fewer excuses not to adopt a sane data model from day one.
I agree partially w.r.t. tooling - few implementations aid in adopting this pattern - but I believe the value of historical data, over time, outweighs the convenience of slapping some quick Rails CRUD together and then being stuck at a local minimum.
>If by "works well", you mean it works until someone asks for historical data - then IT guy has to say w/ a straight face "we lost it". This is unacceptable
You'd be surprised.
For tons of projects it's totally acceptable, has worked for years, and nobody paying to implement them cares about historical data and its leverage. In fact, the majority of web apps are like this.
I always find it strange when people use "unacceptable" with wild abandon, like they're generals receiving some demand of unconditional surrender.
> This is unacceptable considering the value of data and the strategic leverage it can have today.
The last part is important.
Just because it's been true in the past, it doesn't mean this trend will continue. Maybe it keeps being true for your run-of-the-mill MVP, but I don't see it being acceptable for a system in any industry w/ any chance of making serious money in the mid-term.
As long as managers have limited budgets and projects have deadlines, then tradeoffs will still have to be made.
Event sourcing is an extremely expensive design pattern to implement, and it's also very easy to get wrong. Implementing it tends to preclude junior developers from working on the project, makes it harder for database admins to understand the data, and it requires a lot of thought on how to structure the events.
So on a project with, say, a £20K budget, it might triple the cost. On a project that would take 4 weeks to implement with CRUD, it might take 3 months with event sourcing. You've got to justify that extra cost. It's better to let a BA decide what they will need, and by all means explain the pros and cons of different solutions.
But I don't for a second believe that every single project should now be using event sourcing instead of CRUD.
>Just because it's been true in the past, it doesn't mean this trend will continue. Maybe it keeps being true for your run-of-the-mill MVP, but I don't see it being acceptable for a system in any industry w/ any chance of making serious money in the mid-term.
Again, you'd be surprised. Aside from advertising and consumer behavior analysis, there are not many industries that care or even have a need for such historical data.
The very idea that it must necessarily be valuable to store historical data about "all the things" (apart maybe from some aggregations you can create and store) seems more associated with the recent "big data" fad.
(I've seen 10-12 such trends rise and fall in the industry. In 10 years, I guarantee you it will have fallen off as a keyword, and only used as a technology where really appropriate).
Could happen, but I think event sourcing (and CQRS - Command Query Responsibility Segregation - generally) carries enough implementation overhead in the amount of code required that it's less likely to be adopted in situations where it isn't appropriate.
That isn't to say it won't happen, but I think it's more likely that teams would miss an opportunity to leverage it than leverage it inappropriately.
Here's the term I wish was unfashionable with the kids: reshaping.
Did you spot all those command-to-query-to-event-to-log-to-storage data type conversions in those pretty diagrams? That's a whole bunch of needless reshaping of data as it flows through the system.
For each one of those data transformations to be successful, there have to be accurate communications between people and bug-free code for the data conversion and the routing of messages through the system. All those moving parts make changing the system extremely painful - lotsa ripple effects - and every time you have to make a change to your events, you'll have a data migration project for any running event streams.
Naming things is hard too, and there's a lot more naming of entities needed in a CQRS-ES system.
I like all the promised benefits of CQRS and ES, but I can't imagine a case where I'd take the risk of attempting it on anything but a toy project. Perhaps if I was on the version 5 rewrite project for an insanely profitable system where the requirements and design are completely understood up-front. I would need to grok some canonical example of a large, well-architected, well-implemented representative system before I would ever attempt to implement one.
Are there any non-toy examples of successful CQRS-ES with open source available to read? Did those projects go over-budget, and by how much? Would the authors of those examples still recommend the architecture now that they've gone through the experience?
As someone who has fallen for the "event sourcing" promise before, I'd say the article does a decent job explaining that promise. Not sure if it will be covered in the next article, but the actual task of delivering on this work is where things break. Hard.
The vast majority of the things you will ever program are pretty much guaranteed to execute from one statement to the next. Hard boundaries, where things can fail, are often decently understood and actually quite visible in the code.
Moving everything to be an event completely throws this out the window. You can take a naive view, where you pretend that going from one event to the next is guaranteed to happen safely. However, building the system up to cope when this is not the case makes for a complicated system - in areas that are decidedly not related to your business domain. (Well, for most of us.)
Maybe some day there will be a system that helps with this. Until then, my main advice is to make sure you have solved your system with a naive solution before you move on.
Agree with the potential for complexity. Here's how we've dealt with this (on a so-far/so-good basis): it didn't seem necessary to go asynchronous and beef up on heavy infrastructure, so we went with simple in-thread, in-memory message buses to start with. I think a lot of the perceived complexity of building event sourced systems comes from people starting with the heavy plumbing instead of going YAGNI in order to get the domain model implemented. It's easier to beef things up as needed once everything works.
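For illustration, here's roughly what I mean by an in-thread, in-memory bus - a minimal Scala sketch, much simplified compared to anything real:

    // A minimal synchronous, in-process message bus: handlers run inline
    // on the publishing thread; no queues, brokers, or extra threads.
    class InMemoryBus {
      private var handlers = Map.empty[Class[_], List[Any => Unit]]

      def subscribe[E](eventType: Class[E])(handler: E => Unit): Unit =
        handlers = handlers.updated(
          eventType,
          handler.asInstanceOf[Any => Unit] :: handlers.getOrElse(eventType, Nil))

      def publish(event: Any): Unit =
        handlers.getOrElse(event.getClass, Nil).foreach(h => h(event))
    }

    case class OrderPlaced(orderId: String) // illustrative event
    val bus = new InMemoryBus
    bus.subscribe(classOf[OrderPlaced])(e => println(s"order ${e.orderId} placed"))
    bus.publish(OrderPlaced("42")) // handler runs before this line returns

Swapping this for an asynchronous bus later only changes the plumbing, not the domain code doing the publishing and subscribing.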
I became disillusioned with doing the naive solution, for two reasons:
First, I found it to be impossible (from a time/project mgmt perspective) to ever replace it with the "non-naive" one, so it turns into the usual mess, because CRUD doesn't work well as you load more functionality onto it over time.
Secondly ... these are thinking machines. From a business perspective, does it still make sense to hand-code glorified rolodexes without behaviour? Maybe Excel does the trick. I see it as a red flag if someone asks me to build dumb data entry forms in 2016.
Therefore, I always start eventsourced these days. YMMV.
This seems solid advice. I guess my main questions are:
* If you are in-memory and in-process, why bother with the events in the first place? (Put more simply, why not go with simpler process-based solutions?)
* If you are not testing distributed, how do you know you will be able to distribute?
In particular, there is a very large chance that you will have the same difficulty in replacing an in-memory/process solution that you would have had with a naive one.
And don't underestimate the amount of manpower you can get with success. Nor the number of features that will not help you get that success.
I don't really do eventsourcing for technical reasons. To me it's about creating an environment for project delivery where teams can succeed because they own more of their stack, from an org chart & organizational perspective.
As far as code is concerned, I see that as coming down to managing coupling & cohesion in such a way that the various pieces can be designed, built, supported, deployed and enhanced by the same team, with minimal need to wait for, coordinate with or be on the same page with any other team.
I think event sourcing & CQRS are great for enabling that because they result in the lowest form of coupling: Commands are fire and forget. You don't care who subscribes to your events. You don't care who publishes events you subscribe to. You never query anything you don't own in order to get information you need for your business logic. While the system is still small, we seem to get these advantages just fine when we do in-memory and in-process. Once package management, performance, units of deployment, etc. become an issue, we build out to add infrastructure, as needed. Because the pieces are pretty standalone within the code, it's not all that complicated to do, compared to "traditional" CRUD systems.
Not testing distributed: it's possible to deduce a lot by testing locally if it's simple to understand what's coupled to what and if good shared contracts for commands and events are in place, but ultimately the proof is of course in the pudding. When the time comes, there is a big advantage: the system is already fully operational, and testing comes down to "does it still work?" If one were to do the plumbing before the business functionality and test that in isolation ... does that really amount to knowing that it'll still be OK once it actually runs what it needs to run ...? For starters, there is a danger of over-engineering, because the required performance characteristics of the end product are more difficult to assess if there is no end product yet.
I think the problem is best seen by considering the statement "You don't care who subscribes to your events." This is great when you are literally doing something only on one end of the event creation barrier. This makes sense when creating something takes a lot of effort.
However, for many many shops, this is an abstraction they will never get to. They care intimately about who will subscribe to the events they publish, because they are planning on doing something as the primary subscriber.
> Moving everything to be an event completely throws this out the window.
This is either an antipattern or unrelated to event-sourcing depending on how you read it. "Event sourced" means that state within a transaction boundary is built ONLY through events which are regarded as persistent and immutable; evented is the term I'll use for state which is built when events happen, when those events may or may not be recorded, and may happen before or after state is changed and are independent of the state change.
If you meant "evented" then I would say that there are lots of message-based systems that aren't failing hard, and a lot that are, which says to me that there are patterns some of those systems are using to manage the nature of async, evented development that others aren't.
If you mean "event sourcing" then the application of event-sourced data is neither an application architecture nor appropriate for all areas of your application[1,2]. If you were trying to apply event-sourcing in this way its not surprising you ran into problems with it.
> Until then, my main advice is to make sure you have solved your system with a naive solution before you move on.
It is important to really know and understand the problem domain you are applying ES to. Having a strategy to upgrade streams to new versions of your domain model is a good idea if you're applying ES to a business without a well-understood domain. However, it is very, very difficult to back into an ES implementation from a "naive" solution, which I'm reading as "CRUD".
Unfortunately, you lost me in that first paragraph. I think you left off a set of quotes on the first use of "evented."
Regardless, I meant the idea of moving everything to a communication of events between subsystems. To be fair, the sibling read me correctly and stated that if you just ignore the distributed nature of events, then things aren't that hard.
However, it is easy to follow the lure of "I'll go ahead and make this work for the distributed case" from the beginning. This is for two reasons. First, why not? :) Second, it is seen as something that would be very hard to add in later.
So, I can't disagree that it is an anti-pattern or unrelated. However... I would be surprised if the next version of this article didn't go over the distributed nature. Indeed, it already covers how this more correctly mirrors the distributed nature of the organization.
> I think you left off a set of quotes on the first use of "evented."
Indeed, sorry I lost you there.
> Regardless, I meant the idea of moving everything to a communication of events between subsystems.
Unless the persisted events are the primary representation of data, this describes "evented" (or message-based) rather than ES.
> To be fair, the sibling read me correctly and stated that if you just ignore the distributed nature of events, then things aren't that hard.
> However, it is easy to follow the lure of "I'll go ahead and make this work for the distributed case" from the beginning
So it seems like what we're actually saying here is "Distributed systems are hard to build reliably"—which wouldn't seem to be a surprising result on HN, and certainly a sentiment I'd agree with.
I've moved to using Elixir and Erlang primarily because processes/actors have such a natural fit for this kind of data that I get really grumpy when I have to work in something else now. There is also a long history of distributed systems within the BEAM ecosystem so there is a legacy of design meant to deal with the inherently unreliable nature of building distributed.
Do you have specific examples of where things break hard?
We are currently using ES end to end for a distributed application comprising a wearable device, an iOS app and a Scala backend, and up to now, the things that broke hard in the system are the naive/non-ES parts.
For information, on the server-side, we are currently experimenting with GetEventStore (https://geteventstore.com/), which seems to be working well for us.
I'm curious what has broken that was not directly related to the hard split between the wearable and the backend. This is a particular case where you are by definition doing a distributed system, so any attempts to hide that will be problematic.
My main thoughts are anywhere you are trying to hide that distributed nature, things will go awry. Add to that, anywhere you have introduced the potential for things to be distributed.
The sibling post about keeping it simple as you build up your eventing system is pretty accurate. Remember you are trying to solve an actual customer problem. Keep pointed on that and do not get distracted by any neat engineering problems that come along the way. (This is not to say you will not have to solve some... but if you are solving a neat problem that was not needed for the customer's problem, you are going to have trouble.)
Just don't do eventual consistency if the domain doesn't allow for it and if performance is OK. From a functional perspective there is no need for it in CQRS/ES systems.
However ... if going with eventual consistency, it's essential that the write side is consistent. The read models - not so much. This is because every aggregate (... if Domain Driven Design terminology is your thing) represents a transactional boundary within which state needs to be consistent in order for the business rules to work correctly. Read models are more forgiving.
I'd suggest people start with CQRS+Events before going to CQRS+EventSourcing.
In the first case, you keep however you were loading/saving your entities, and modify them to emit events as they mutate. Then you can play with sending those events to external systems, or using them to drive read-models for queries, etc.
In the second case, you go further, and "dogfood" those events so that they are the authoritative record of what the entity is at a given point in time.
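A rough Scala sketch of the first step (illustrative names; the point is just that the entity keeps loading/saving as before but also records the events it emits):

    case class ItemAdded(productId: String, quantity: Int) // illustrative event

    class Basket {
      private var items = Map.empty[String, Int]
      private var pending = List.empty[Any] // uncommitted events

      def addItem(productId: String, quantity: Int): Unit = {
        // The normal mutation, persisted however you already persist it...
        items = items.updated(productId, items.getOrElse(productId, 0) + quantity)
        // ...plus an event emitted alongside it.
        pending ::= ItemAdded(productId, quantity)
      }

      // After saving, publish these to read models or external systems.
      def uncommittedEvents: List[Any] = pending.reverse
    }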
Architecting around events has several ramifications.
For building up a picture of the world, it's pretty good. It's very nice to be able to replay a log of events and recreate a view of the way things are expected to be; if there's a bug in your code, you can fix it and repeat the replay to get back into a good state (with caveats: sometimes later actions that create events may depend on an invalid intermediate state). Whereas mutating updates erase history, perhaps with some ad-hoc logging on the side that is more often than not worthless for machine consumption.
For decoupled related action, it's not too bad. If you have some subsystem that needs to twiddle some bits or trigger an action when it sees an event go by, it just needs to plug into the event stream, appropriately filtered.
For coordinated action OTOH, e.g. a high-level application business-logic algorithm, you need to start thinking in terms of explicit state machines and, in the worst case, COMEFROM-oriented programming[1]. Depending on how the events are represented, published and subscribed to, navigating control flow involves repeated whole-repo text searching.
It's best if your application logic is not very complicated and inherently suitable to loose coupling, IMO.
FYI in case the author reads this, since this seems to be intended as an intro for people who aren't already familiar with this stuff: I didn't see "CQRS" defined anywhere in this article or in the two or three links I followed from it; they all begin with an assumption that you know the acronym, and delve straight into details. It might be good to define some terms in the front matter (unless I've misunderstood the target audience).
Note that Martin's blog is what inspired the event bus in https://home-assistant.io, an open source home automation project I occasionally contribute to.
I've tried working out how to move to an event sourcing system, but I always struggle with locking behavior. Do you just have to invent your own locking mechanisms on top of event sourcing?
The stream is the consistency (locking) boundary. Your first step is to get your model aligned with such boundaries. For example, your amazon shopping basket is independent of my basket. Then you choose your concurrency model - append to a stream with an expected version (optimistic concurrency) or just append anyway (no expected version). Your amazon basket may be the latter; your amazon payment and shipping checkout may be the former.
Locking across streams is an anti-pattern / smell. It can be done (as can anything) but it usually points to a modelling problem. Example: cancelling an amazon order is a _request_ that is in a race with the fulfillment system (boundary); it may or may not be successful.
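To illustrate, an expected-version append is basically a compare-and-swap on the stream's length. A toy in-memory Scala sketch (a real store such as GetEventStore does this for you):

    class ConcurrencyException(msg: String) extends RuntimeException(msg)

    class InMemoryEventStore {
      private var streams = Map.empty[String, Vector[Any]]

      // Append only if the stream is still at the version we loaded;
      // otherwise another writer got there first and the command can retry.
      def append(streamId: String, expectedVersion: Int, events: Seq[Any]): Unit =
        synchronized {
          val stream = streams.getOrElse(streamId, Vector.empty)
          if (stream.length != expectedVersion)
            throw new ConcurrencyException(
              s"expected version $expectedVersion, stream is at ${stream.length}")
          streams = streams.updated(streamId, stream ++ events)
        }
    }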
Read about how LMAX achieved 6 million transactions per second using a ring-buffer-based concurrency architecture called the Disruptor, all on a single thread and without locks. Event sourcing plays a big role in their architecture [0].
Combining this with the actor model (using Akka or similar) gives you guaranteed "one message at a time" processing, and you don't have to deal with locks.
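A minimal sketch with classic Akka (made-up account example): the mailbox serializes processing per actor, so the mutable state needs no locks even with many concurrent senders.

    import akka.actor.{Actor, ActorSystem, Props}

    case class Deposit(amount: BigDecimal)

    class AccountActor extends Actor {
      private var balance = BigDecimal(0)

      // Messages are processed one at a time, so this mutation is safe.
      def receive: Receive = {
        case Deposit(amount) => balance += amount
      }
    }

    val system = ActorSystem("demo")
    val account = system.actorOf(Props[AccountActor], "account")
    account ! Deposit(BigDecimal(10)) // fire-and-forget, queued in the mailbox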
I question any amount of guarantees around "one message" anything. There might be this guarantee per actor, but you have no such guarantee per system. And, assuming a real system, this will be a problem.
So, you get to pick, "at most once" or "at least once." And then you need to build your system to act accordingly.
I'm not sure what you are suggesting. There is no getting around "at most/at least once." You can shift the goalposts some, but at some point that is your choice. This is a good read on the problem: http://bravenewgeek.com/you-cannot-have-exactly-once-deliver...
"Exactly once delivery" is not the same as "One message at a time". Akka Actors process one message at a time. Akka does no provide exactly once delivery. It default defaults to "at most once".
Correct. Message delivery strategy is separate from message processing once delivered. If you choose at-least-once delivery, you'll need to handle possible duplicates regardless of whether or not you can process a duplicate message in a lock-free manner.
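A sketch of what that duplicate handling can look like (illustrative; a real system would persist the seen IDs together with the state, in the same transaction):

    // Idempotent handling under at-least-once delivery: remember which
    // event IDs have been applied and skip redeliveries.
    class DeduplicatingHandler(process: String => Unit) {
      private var seen = Set.empty[String]

      def handle(eventId: String, payload: String): Unit =
        if (!seen.contains(eventId)) {
          process(payload)
          seen += eventId
        } // else: duplicate delivery, already applied, safely ignored
    }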
ISTM event sourcing actually avoids many locking problems, since it's essentially append-only. Of course every event write should be atomic, but that seems easier than making updates atomic?
When a certain set of events occurs (the files arrive, etc.) I want to kick off one and only one batch processor task. This is accomplished with a transaction and a write lock in an SQL database, but when trying to use event sourcing it ends up requiring a two-step "intent to run" event before running, or some out-of-band synchronization.
This isn't something I would handle with event sourcing. Using ES throughout an application is an antipattern.
For something like this my batch processor is implemented as you're probably used to—files get a CRUD model associated with them, schedule a background job to handle it, let locking get handled there. Once inside the batch processor you can use the same domain services and commands that you'd use from your application layer and commit events on a command basis, or on something like a row in the file (which may generate several commands and dozens of events depending on your model), or on a file level (all or nothing.)
The thing I see people do frequently (and sadly, have done myself on occasion!) that makes their lives harder is trying to shoehorn everything into ES without doing the design work to establish a domain, its boundaries, and what events make sense within it.
I'm not sure you can achieve good event sourcing performance using a regular database engine. Better to view it like writing logs.
If you really must expose an SQL API, perhaps you could read the journal on another thread or process and then make changes to the db based on the incoming "diffs" that the thread determines from the journal?
There is a lot that could be done to make event sourcing easier to work with...
Imagine tooling that allowed an event stream to be used to create state for testing modules, crudlike helpers to allow crud-familiar developers to think that way at first, and workflows based on snapshots, rewind, etc.
I think a model that used events that correlated to graph deltas rather than crud deltas would be the cat's ass, and many queries about the near-current state could be handled efficiently using ephemeral subgraphs as indexes located at the network's edges.
If anyone wants to discuss and possibly build some of this stuff, let me know :)
> Imagine tooling that allowed an event stream to be used to create state for testing modules, crudlike helpers to allow crud-familiar developers to think that way at first, and workflows based on snapshots, rewind, etc.
I know where you're going with this, and I honestly believe it's a terrible idea (not to be discouraging or rude—just experienced.)
If your event streams contain mostly CRUD events (possibly ANY) then you're most likely applying it incorrectly. It's not just a version history of your data. The event type itself is data, which provides context and semantics over and above the notion of writes and deletes. If you're falling back to CRUD events, all you're doing is creating a lot more work for yourself and deriving almost no benefit from the use of ES—in that case, you should just use CRUD and the ORM of your choice.
> If your event streams contain mostly CRUD events (possibly ANY) then you're most likely applying it incorrectly. It's not just a version history of your data. The event type itself is data, which provides context and semantics over and above the notion of writes and deletes.
Right. A good way to think about this is that as with rows in an RDBMS, events in an ES system are facts, and just as tables in an RDBMS define a category of facts with a particular shape, event-types in ES do the same thing. The difference is that whereas in an RDBMS the facts represented by rows can be general (and are often, in many designs, facts about the current state of the world), events are facts about a specific occurrence in the world rather than the state of the world (and the "state of the world" is an aggregate function of the collection of events.)
Right^2: Good events are facts that occur at a higher level of abstraction, trying to capture more of the "why" behind what goes on. It's not about describing the effect on data, but the business-decision itself. (Which, when reapplied to a set of rules, will do the actual data-change.)
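To illustrate with made-up names:

    // A CRUD-level "event" only records the effect on data:
    case class CustomerRowUpdated(id: String, column: String, newValue: String)

    // Domain events record the business fact, and the "why" comes with it:
    case class CustomerRelocated(id: String, newAddress: String)
    case class CustomerAccountClosed(id: String, reason: String)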
> A good way to think about this is that as with rows in an RDBMS, events in an ES system are facts, and just as tables in an RDBMS define a category of facts with a particular shape, event-types in ES do the same thing.
Thank you for this, it has completely cleared up many of my troubles with understanding event sourcing.
> If your event streams contain mostly CRUD events (possibly ANY) then you're most likely applying it incorrectly. It's not just a version history of your data.
Thanks for that. I'd made that mistake: I have a system which now needs to become distributed (a copy of it goes offline for a couple of weeks, and has to merge back into the main datastore) and keep a history of changes. It's currently CRUD backed by MySQL, and I'd latched onto event sourcing as what I'd need.
> The event type itself is data, which provides context and semantics over and above the notion of writes and deletes.
See the presentation by Greg Young I linked to in another comment. One thing he talks about towards the end is the application for occasionally connected clients, which is one I'm looking to tackle myself in the next several months. ES may very well serve your needs, but I'd take a step back from the CRUD model and really think about the domain model. Something I do a lot of times when I'm looking at moving away from CRUD in an app is creating a bunch of domain-based command classes, which take over the persistence job, and move the app layer towards talking only to those commands. Still CRUD under the covers, but now there's an abstraction layer above it, and as you get a better delineation of boundaries you start to see where the services will fall out of it and what areas might benefit from ES the most.
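A rough Scala sketch of such a command class (plain JDBC underneath; names and schema are made up):

    case class CloseAccount(accountId: String, reason: String) // intent, not a row update

    class CloseAccountHandler(db: java.sql.Connection) {
      def handle(cmd: CloseAccount): Unit = {
        // Still CRUD under the covers, but the business intent now lives
        // in the command, and persistence is hidden behind it.
        val stmt = db.prepareStatement(
          "UPDATE accounts SET status = 'closed', close_reason = ? WHERE id = ?")
        stmt.setString(1, cmd.reason)
        stmt.setString(2, cmd.accountId)
        stmt.executeUpdate()
        stmt.close()
      }
    }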
There's nothing wrong with having a system that uses immutable events to represent data, IMO it's mainly a terminology issue: "Event Sourcing" often implies you're recording rather abstract, high-level events that try to capture something about a particular domain. As opposed to, say, a straight diff between two data-dumps.
I was thinking of making CRUD a specific event type that had meaning only in the context of changes to an instance of some schema. Of course, the most interesting events will not be CRUD oriented, but does this mean it's a mistake to include them at all, particularly if interacting with other systems that do use a CRUD metaphor for interaction and state must be synchronized?
I don't know if I'd be strong enough in my words to call it a "mistake" (there are few absolutes), but it's a big stinky code smell, and it likely falls into that bucket of things you're going to regret one day sooner than you realize.
The thing is, DDD isn't easy. You're going to end up with someone on your team at some point who's inexperienced with it, and having a bunch of CRUD events in the schema already is going to entice them to add a few more, because it's easy and they're already there and oh hey what's a few more amongst friends?
Something else I've learned about this stuff is that because it works best in stable domains, it is also going to work best in domain services that have low code churn. Once you get it right, it tends to stay that way. But if you try to model with non-domain events, you're dramatically increasing the surface area that is subject to churn.
> particularly if interacting with other systems that do use a CRUD metaphor for interaction and state must be synchronized?
This is also an antipattern, btw—events between bounded contexts should be somewhat limited, with well-defined behaviors when they do cross. What you're doing with a scheme like that, sharing CRUD events between external services or systems, breaks all kinds of conceptual and logical encapsulation (in addition to the aforementioned CRUD thing.)
I was looking into event sourcing for a system I built recently, and the tooling just doesn't seem to be that widespread yet. How do you read out of the entire event stream to figure out the current state? While there are tools, they seem to be .NET-focused. There just didn't seem to be a "standard" answer yet.
We ended up going with microservices that pub/sub events into Kafka, but maintain their own databases. There's another microservice that lets you query past events for statistics.
We find that a simple in-memory synchronous message bus + event logging to files goes a long way. See e.g. https://github.com/robertreppel/hist for an in-memory bus + file system (and DynamoDB ...) helloworld which isn't .NET.
Scaling that up by adding asynchronicity and more ambitious plumbing when needed seems reasonably straightforward. For something more out-of-the-box, see https://geteventstore.com/ . It has clients in a variety of languages. Comes with a nice HTTP API too.
I wouldn't normally read the entire event stream; usually, only the state of a particular object (aggregate, in Domain Driven Design speak) is of interest, e.g. the customer with ID 12345. Events contain the aggregate ID, so the query to whatever event store you use would be "give me all events with aggregate ID 12345".
Are you using DynamoDB Streams at all? I've been toying the idea of using DynamoDB as an event store and having other services listen to a table's stream, allowing them to update caches/views (the read-side of CQRS), report analytics, perform asynchronous tasks, etc.
You basically consider the event log as a big collection, and you "fold over" the events in order to incrementally build your state/projection, the same way you would with finite collections in a functional language (Scala, Haskell, ...).
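A minimal Scala sketch with illustrative event types:

    sealed trait Event
    case class UserRegistered(id: String, name: String) extends Event
    case class UserRenamed(id: String, newName: String) extends Event

    // Fold the log, oldest event first, into a read-side projection.
    def project(log: Seq[Event]): Map[String, String] =
      log.foldLeft(Map.empty[String, String]) {
        case (users, UserRegistered(id, name)) => users + (id -> name)
        case (users, UserRenamed(id, newName)) => users + (id -> newName)
      }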
For some months now I have been trying to build a small test case for an invoice app. I want a good sync strategy, and the use of ES sounds good. However, I haven't found how to replicate the functionality of a normal app with this: for example, what to do to avoid duplicates and, in general, pre-save validations. Also, I need to use RDBMS tables anyway to hold current data, and RDBMSs don't have a good history of streaming back results.
I have been working with this sort of pattern for a while, but I have yet to find good texts exploring the topic. Does anyone have book or paper recommendations for event sourcing? The stuff I have seen is mostly programmers reporting on something that worked in their particular domain. I am looking for something more rigorous and comprehensive.
Lurk on the CQRS/DDD list [1], lots of good info there. I'm not aware of any textbooks on ES per se but there are a few good books on areas that overlap. [2] [3] [4]
How strange: just today I heard the Event Sourcing name for the first time and thought I didn't know what it was. (Turns out it is this old idea I knew under various different names.) And on the same day I hear about Event Sourcing on HN. What's the buzz?
It's been slowly building steam (under that name) for about ten years, first in .NET and now filtering out to other ecosystems. I think it's kind of inevitable given the recent popularity of functional programming models.
Very curious: if you have multiple datastores, how do you ensure they are consistent? If you scale sideways, how do you ensure nothing gets lost if there's a partition? Etc?
Embrace eventual consistency. A good deal of collaborative domains (things involving human decisions) are naturally eventually consistent. Meat computers appear to be particularly good at resolving conflicts and compensating.
Having been part of a project to rewrite a monolith e-commerce site into an event-sourced, domain driven, CQRS system, let me tell you in which situation that is not possible: when you already have data. Remember that in a DDD, ES, CQRS system, the event store is the single source of truth. If you already have data in a relational database, then the existing data is the source of truth. You can't have two sources of truth, that completely defeats the purpose. So it's not actually possible to migrate to an event sourced system, you can only create one from scratch, with no existing data.
Conceptually, that's not really true: you just transform the pre-ES state into one or more events (in a basic accounting system, which is pretty much the simplest ES system, long predating the name for the model, this is just creating "starting balance" entries as transactions.)
In practice, that can be challenging, but it doesn't seem fundamentally more challenging than any other legacy data conversion effort.
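A sketch of the idea (illustrative names): each current row becomes a synthetic "initial state" event, just like a starting-balance entry in accounting.

    case class AccountMigrated(accountId: String, startingBalance: BigDecimal)

    // Seed the new event store from the legacy rows.
    def seedEvents(legacyRows: Seq[(String, BigDecimal)]): Seq[AccountMigrated] =
      legacyRows.map { case (id, balance) => AccountMigrated(id, balance) }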
Sure, if the existing DB is simple, that is straightforward, but remember that likely this is a monolith that is so bad that even management has agreed that it needs to be rewritten. Likely there are lots of DB tables with foreign keys and relations (sometimes documented and enforced, most often not). This means you can't really convert the entire database into an event sourced system, as that means converting all of the tables in one single go, instead of a gradual change. And believe me, in a system like this you want slow, gradual changes! Also, even if you got it into events, what happened to the domains? There are so many relations between the different event sources (because you didn't put everything into just one event source, right? What happened to bounded contexts?) that you are no better off. And this means you have to prevent anything else from using the database anymore - and in a legacy system where you can just join across any two or three tables to extract whatever information you want, you can be certain there are some analysis engines feeding directly on the SQL data. And there might be other systems writing to the database too!
So the first step is to disentangle all the data and encapsulate it, trying to prevent others from using it, so you have full control over it. This includes tracking down any other system using this data, and ensuring they no longer go directly to the database. And you have to do this for one subsystem at a time, often in several iterations.
> Sure, if the existing DB is simple, that is straightforward, but remember that likely this is a monolith that is so bad that even management has agreed that it needs to be rewritten.
Yeah, but that's not a "converting legacy data to ES" problem, that's a "converting legacy data to any non-broken thing" problem.
> This means you can't really convert the entire database into an event sourced system, as that means converting all of the tables in one single go, instead of a gradual change.
Whether it's ES or something else you are converting to, you either do a big-bang conversion and eat the pain of that (which can be tremendous, sure), or you instead eat the pain of taking the monolith and finding a way to break out components and do it incrementally, even though that takes not only building the new components, but reengineering parts of the old monolith to support that. Which, also, can be tremendous pain. But, again, this isn't really essentially tied to event sourcing; you face this dilemma even if you are going from a (broken for current needs, which is why it is being replaced) classically-designed "current state" RDBMS-backed system to a (meeting current needs, and hopefully more adaptable to future needs) classically-designed "current state" RDBMS-backed system.
Yup, agree, this is the problem of having a legacy monolith RDBMS that needs to be rewritten and split apart. It's tempting to throw every new fancy technology at the problem when that is suddenly an option, but it's better to focus on the goal of splitting it apart only. If you have split it out, and it's now simple to convert to ES CQRS, then you are probably in a situation where you don't need to do that, as it works quite well.
> It's tempting to throw every new fancy technology at the problem when that is suddenly an option, but it's better to focus on the goal of splitting it apart only.
Splitting it apart involves:
(1) Dividing the data and functionality into a legacy component and a new-implementation component,
(2) Making changes to the DB and application code for the legacy component,
(3) Implementing the new-implementation component.
In a monolith that you are breaking apart, the reusability of legacy code for the new-implementation component is likely to be low (you'll actually likely have to do extensive changes to the larger "legacy component" as well, but the reusability should be somewhat higher there.)
You have to use some technology for the new implementation component, and what you should aim for is whatever is the best fit for the job, whether it is similar to what existed before or not.
> If you have split it out, and it's now simple to convert to ES CQRS, then you are probably in a situation where you don't need to do that, as it works quite well.
I disagree. The hard part of converting to ES/CQRS for the components that are broken out ("new implementation" components, not the "legacy" reduced monolith) is done in the analysis phase of what you are breaking out. Once that is done, implementation in an ES/CQRS manner is fairly straightforward, since defining the events that the component will handle is a core part of analysis, as is defining the impacts those events have on stored, reportable data (the query side of CQRS).
"The hard part [...] is done in the analysis phase..." smells like big design up front that is usually more likely to fail than not, especially so for complex systems.
Big design up front would be a complete system replacement, not incremental replacement by component. An incremental replacement still requires definition of the components to be replaced with new implementation and the part to be essentially retained with only the changes necessary to interface with the new component.
Sorry, replied too quickly. My point is that if things are so bad that they warrant a major rewrite, then they are probably so bad that there is no simple way to map the existing data into events or starting conditions. It might be true if you have a simple well defined silo of a system that does one thing well, but not of you have what the author described, a monolith that does several things in the same codebase.
I'd say the data isn't the problem. It's all the code that WRITES data that's the problem. Finding every nook and cranny in your apps that does this, and very frequently across logical boundaries, is a challenging exercise at a minimum, even if you build projections that reproduce the old database state.
This is almost certainly a naive question, but wouldn't you treat your DBMS data as a "snapshot" of that (zero) point in time and then all of the new events update from there until the next snapshot?