The one downside is that shifting your business logic to read time means you need very efficient ways of accessing and memoizing derived data. For some applications, this can be as simple as having the correct database indices over your WhateverUpdates tables, fetching all updates into memory, and merging them on each request. For others, you'll need a real-time stream processing pipeline to preemptively get your derived data into the right shape in a cache. Those are more moving parts than your typical monolith app.
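The "fetch all updates and merge on each request" approach can be sketched in a few lines. This is a minimal illustration, not from the original post; the `AccountUpdates`-style records and field names are invented for the example:

```python
# Hypothetical update records as they might come back from an indexed
# "AccountUpdates" table, already tagged with a sequence number.
updates = [
    {"seq": 1, "field": "email", "value": "a@example.com"},
    {"seq": 2, "field": "plan",  "value": "free"},
    {"seq": 3, "field": "plan",  "value": "pro"},
]

def current_state(updates):
    """Merge all updates at read time: last write per field wins."""
    state = {}
    for u in sorted(updates, key=lambda u: u["seq"]):
        state[u["field"]] = u["value"]
    return state

print(current_state(updates))  # {'email': 'a@example.com', 'plan': 'pro'}
```

With the right index this is one range scan plus an in-memory fold, which is often fast enough; the stream-processing pipeline only becomes necessary when the fold itself gets too expensive to do per request.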
One benefit to actually using event sourcing with a stream processing system is that, in many cases, it can be the most effective way to scale both traffic capacity and organizational bandwidth, much in the same way that individually scalable microservices can (and fully compatible with that approach!). Martin Kleppmann at Confluent (a LinkedIn spinoff creating and consulting on stream processing systems) writes some great and highly approachable articles about this. Highly recommended reading.
I'm worried that event sourcing is going to become this year's over-applied design pattern with libraries in every language for every database with blog posts that recommend it be used on every project.
It's a good idea, very useful - in the right hands on the right projects. But it makes sense that junior devs normally use CRUD because that's normally the right solution. At least until better tools come along.
If by "works well", you mean it works until someone asks for historical data - then the IT guy has to say w/ a straight face "we lost it". This is unacceptable considering the value of data and the strategic leverage it can have today.
Considering that immutable fact tables are the most stable data model; that companies often have to re-invent them (poorly) on top of relational storage at some point; that storage is often not a problem;
and that having clean historical data is crucial for data science: there are increasingly few excuses not to adopt a sane data model from day one.
I agree partially w.r.t. tooling - few implementations aid adopting this pattern - but I believe the value of historical data, over time, outweighs not being able to slap some quick Rails CRUD together and then being stuck at a local minimum.
You'd be surprised.
For tons of projects it's totally acceptable, has worked for years, and nobody paying to implement them cares about historical data and its leverage. In fact, the majority of web apps are like this.
I always find it strange when people use "unacceptable" with wild abandon, like they're generals receiving some demand of unconditional surrender.
* I keep hourly backups of mysqldumps which can be restored in the case of catastrophic mistakes.
* I explain that this kind of data collection would be out of scope, and require significantly more budget.
Just because it's technically possible, it doesn't mean you need to do it. YAGNI.
> This is unacceptable considering the value of data and the strategic leverage it can have today.
The last part is important.
Just because it's been true in the past, it doesn't mean this trend will continue. Maybe it keeps being true for your run-of-the-mill MVP, but I don't see it being acceptable for a system in any industry w/ any chance of making serious money in the mid-term.
Event sourcing is an extremely expensive design pattern to implement, and it's also very easy to get wrong. Implementing it tends to preclude junior developers from working on the project, makes it harder for database admins to understand the data, and it requires a lot of thought on how to structure the events.
So on a project with, say, a £20K budget, it might triple the cost. On a project that would take 4 weeks to implement with CRUD, it might take 3 months with event sourcing. You've got to justify that extra cost. It's better to let a BA decide what they will need, and by all means explain the pros and cons of different solutions.
But I don't for a second believe that every single project should now be using event sourcing instead of CRUD.
Again, you'd be surprised. Aside from advertising and consumer behavior analysis, there are not many industries that care or even have a need for such historical data.
The very idea that it must necessarily be valuable to store historical data about "all the things" (apart maybe from some aggregations you can create and store) seems more associated with the recent "big data" fad.
(I've seen 10-12 such trends rise and fall in the industry. In 10 years, I guarantee you it will have fallen off as a keyword, and only used as a technology where really appropriate).
That isn't to say it won't happen, but I think it's more likely that teams would miss an opportunity to leverage it than leverage it inappropriately.
Did you spot all those command-to-query-to-event-to-log-to-storage data type conversions in those pretty diagrams? That's a whole bunch of needless reshaping of data as it flows through the system.
For each of those data transformations to succeed, there has to be accurate communication between people and bug-free code in the data conversion and the routing of messages through the system. All those moving parts make changing the system extremely painful, with lots of ripple effects - and every time you have to change your events, you have a data migration project for any running event streams.
Naming things is hard too, and there's a lot more naming of entities needed in a CQRS-ES system.
I like all the promised benefits of a CQRS and ES, but I can't imagine a case where I'd take the risk of attempting it on anything but a toy project. Perhaps if I was on the version 5 rewrite project for an insanely profitable system where the requirements and design are completely understood up-front. I would need to grok some canonical example of a large, well-architected, well-implemented representative system before I would ever attempt to implement one.
Are there any non-toy examples of successful CQRS-ES with open source available to read? Did those projects go over-budget, and by how much? Would the authors of those examples still recommend the architecture now that they've gone through the experience?
The vast majority of the things you will ever program are pretty much guaranteed from one statement to the next. Hard boundaries, where things can fail, are often decently understood and actually quite visible in the code.
Moving everything to be an event completely throws this out the window. You can take a naive view, where you pretend that getting from one event to the next is guaranteed to happen. But building up the system to cope when this is not the case means building a complicated system - in areas that are decidedly not related to your business domain. (Well, for most of us.)
Maybe some day there will be a system that helps with this. Until then, my main advice is to make sure you have solved your system with a naive solution before you move on.
I became disillusioned with doing the naive solution, for two reasons:
First, I found it to be impossible (from a time/project management perspective) to ever replace it with the "non-naive" one, so it turns into the usual mess, because CRUD doesn't work well as you load more functionality onto it over time.
Secondly ... it's thinking machines. From a business perspective, does it still make sense to hand-code glorified rolodexes without behaviour? Maybe Excel does the trick. I see it as a red flag if someone asks me to build dumb data entry forms in 2016.
Therefore, I always start event-sourced these days. YMMV.
* If you are in-memory and in-process, why bother with the events in the first place? (Simpler put, why not go with the simpler process based solutions?)
* If you are not testing distributed, how do you know you will be able to distribute?
In particular, there is a very large chance that you will have the same difficulty in replacing an in-memory/process solution that you would have had with a naive one.
And don't underestimate the amount of manpower you can get with success. Nor the amount of features that will not help get success.
As far as code is concerned, I see that as coming down to managing coupling & cohesion in such a way that the various pieces can be designed, built, supported, deployed and enhanced by the same team, with minimal need to wait for, coordinate with or be on the same page with any other team.
I think event sourcing & CQRS are great for enabling that because they result in the lowest form of coupling: Commands are fire and forget. You don't care who subscribes to your events. You don't care who publishes events you subscribe to. You never query anything you don't own in order to get information you need for your business logic. While the system is still small, we seem to get these advantages just fine when we do in-memory and in-process. Once package management, performance, units of deployment, etc. become an issue, we build out to add infrastructure, as needed. Because the pieces are pretty standalone within the code, it's not all that complicated to do, compared to "traditional" CRUD systems.
Not testing distributed: it's possible to deduce a lot by testing locally, if it's simple to understand what's coupled to what and good shared contracts for commands and events are in place, but ultimately the proof is of course in the pudding. When the time comes, there is a big advantage: the system is already fully operational, and testing comes down to "does it still work?" If one were to do the plumbing before the business functionality and test it in isolation ... does that really amount to knowing it'll still be OK once it actually runs what it needs to run? For starters, there is a danger of over-engineering, because the required performance characteristics of the end product are more difficult to assess if there is no end product yet.
However, for many many shops, this is an abstraction they will never get to. They care intimately about who will subscribe to the events they publish, because they are planning on doing something as the primary subscriber.
This is either an antipattern or unrelated to event-sourcing depending on how you read it. "Event sourced" means that state within a transaction boundary is built ONLY through events which are regarded as persistent and immutable; "evented" is the term I'll use for state which is built as events happen, where those events may or may not be recorded, may happen before or after the state change, and are independent of it.
If you meant "evented" then I would say that there are lots of message-based systems that aren't failing hard, and a lot that are, which says to me that there are patterns some of those systems are using to manage the nature of async, evented development that others aren't.
If you mean "event sourcing" then the application of event-sourced data is neither an application architecture nor appropriate for all areas of your application[1,2]. If you were trying to apply event-sourcing in this way, it's not surprising you ran into problems with it.
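The distinction above can be made concrete. In the event-sourced sense, state inside the boundary is derived ONLY by folding the persisted, immutable events - nothing else is authoritative. A minimal sketch, with invented event names:

```python
# Persisted, immutable event log for one aggregate. These event names
# are illustrative, not from any particular framework.
events = [
    ("AccountOpened",  {"owner": "alice", "balance": 0}),
    ("MoneyDeposited", {"amount": 100}),
    ("MoneyWithdrawn", {"amount": 30}),
]

def apply(state, event):
    """Pure fold step: current state + one event -> next state."""
    kind, data = event
    if kind == "AccountOpened":
        return {"owner": data["owner"], "balance": data["balance"]}
    if kind == "MoneyDeposited":
        return {**state, "balance": state["balance"] + data["amount"]}
    if kind == "MoneyWithdrawn":
        return {**state, "balance": state["balance"] - data["amount"]}
    return state  # unknown event types are ignored

state = None
for e in events:
    state = apply(state, e)

print(state)  # {'owner': 'alice', 'balance': 70}
```

In the "evented" reading, by contrast, the dict would be mutated directly and the events would merely be notifications fired alongside the mutation - which is exactly why the failure modes of the two styles differ.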
> Until then, my main advice is to make sure you have solved your system with a naive solution before you move on.
It is important to really know and understand the problem domain you are applying ES to. Having a strategy to upgrade streams to new versions of your domain model is a good idea if you're applying ES to a business without a well-understood domain. However, it is very, very difficult to back into an ES implementation from a "naive" solution, which I'm reading as "CRUD".
Regardless, I meant the idea of moving everything to a communication of events between subsystems. To be fair, the sibling read me correctly and stated that if you just ignore the distributed nature of events, then things aren't that hard.
However, it is easy to follow the lure of "I'll go ahead and make this work for the distributed case" from the beginning. This is for two reasons. First, why not? :) Second, it is seen as something that would be very hard to add in later.
So, I can't disagree that it is an anti-pattern or unrelated. However... I would be surprised if the next version of this article didn't go over the distributed nature. Indeed, it already covers how this more correctly mirrors the distributed nature of the organization.
Indeed, sorry I lost you there.
> Regardless, I meant the idea of moving everything to a communication of events between subsystems.
Unless the persisted events are the primary representation of data, this describes "evented" (or message-based) rather than ES.
> To be fair, the sibling read me correctly and stated that if you just ignore the distributed nature of events, then things aren't that hard.
> However, it is easy to follow the lure of "I'll go ahead and make this work for the distributed case" from the beginning
So it seems like what we're actually saying here is "Distributed systems are hard to build reliably"—which wouldn't seem to be a surprising result on HN, and certainly a sentiment I'd agree with.
I've moved to using Elixir and Erlang primarily because processes/actors have such a natural fit for this kind of data that I get really grumpy when I have to work in something else now. There is also a long history of distributed systems within the BEAM ecosystem so there is a legacy of design meant to deal with the inherently unreliable nature of building distributed.
We are currently using ES end to end for a distributed application including a wearable device, an iOS app and a Scala backend, and up to now, the things that broke hard in the system are the naive/non-ES parts.
For information, on the server-side, we are currently experimenting with GetEventStore (https://geteventstore.com/), which seems to be working well for us.
My main thoughts are anywhere you are trying to hide that distributed nature, things will go awry. Add to that, anywhere you have introduced the potential for things to be distributed.
The sibling post about keeping it simple as you build up your eventing system is pretty accurate. Remember you are trying to solve an actual customer problem. Keep pointed on that and do not get distracted by any neat engineering problems that come along the way. (This is not to say you will not have to solve some... but if you are solving a neat problem that was not needed for the customer's problem, you are going to have trouble.)
Events for writes and current data for most reads.
However ... if going with eventual consistency, it's essential that the write side is consistent. The read models - not so much. This is because every aggregate (... if Domain Driven Design terminology is your thing) represents a transactional boundary within which state needs to be consistent in order for the business rules to work correctly. Read models are more forgiving.
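The "write side must be consistent" point is usually enforced with an expected-version check when appending to an aggregate's stream: a writer that acted on stale state gets rejected and must retry. This toy in-memory store is my own sketch of the idea, not any particular product's API:

```python
class ConcurrencyError(Exception):
    pass

class InMemoryEventStore:
    """Toy event store enforcing per-aggregate (per-stream) consistency:
    an append must state the version it expects the stream to be at,
    so two writers can't both build on the same state."""
    def __init__(self):
        self.streams = {}

    def append(self, stream_id, expected_version, events):
        stream = self.streams.setdefault(stream_id, [])
        if len(stream) != expected_version:
            raise ConcurrencyError(
                f"expected version {expected_version}, "
                f"stream is at {len(stream)}")
        stream.extend(events)
        return len(stream)  # new stream version

store = InMemoryEventStore()
store.append("account-1", 0, ["AccountOpened"])
store.append("account-1", 1, ["MoneyDeposited"])
try:
    store.append("account-1", 1, ["MoneyWithdrawn"])  # stale version
except ConcurrencyError as e:
    print("rejected:", e)
```

Read models need none of this: they can lag behind and catch up, which is what makes the eventual-consistency trade-off tolerable on the query side.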
In the first case, you keep however you were loading/saving your entities, and modify them to emit events as they mutate. Then you can play with sending those events to external systems, or using them to drive read-models for queries, etc.
In the second case, you go further, and "dogfood" those events so that they are the authoritative record of what the entity is at a given point in time.
For building up a picture of the world, it's pretty good. It's very nice to be able to replay a log of events and recreate a view of the way things are expected to be; if there's a bug in your code, you can fix it and repeat the replay to get back into a good state (with caveats, sometimes later actions creating events may be dependent on an invalid intermediate state). Whereas mutating updates erase history, perhaps with some ad-hoc logging on the side that is more often than not worthless for machine consumption.
For decoupled related action, it's not too bad. If you have some subsystem that needs to twiddle some bits or trigger an action when it sees an event go by, it just needs to plug into the event stream, appropriately filtered.
For coordinated action OTOH, e.g. a high-level application business-logic algorithm, you need to start thinking in terms of explicit state machines and, in the worst case, COMEFROM-oriented programming. Depending on how the events are represented, published and subscribed to, navigating control flow involves repeated whole-repo text searching.
It's best if your application logic is not very complicated and inherently suitable to loose coupling, IMO.
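The "explicit state machine" point deserves an illustration. A process manager (or saga) tracks where a multi-step flow is and reacts to events accordingly, instead of relying on linear control flow. This is a hedged sketch with invented event and command names:

```python
class OrderProcess:
    """Coordinates an order flow by reacting to events; the current
    position in the flow is explicit state, not call-stack position."""
    def __init__(self):
        self.state = "awaiting_payment"
        self.commands = []  # commands this process decides to issue

    def handle(self, event):
        if self.state == "awaiting_payment" and event == "PaymentReceived":
            self.state = "awaiting_shipment"
            self.commands.append("ShipOrder")
        elif self.state == "awaiting_shipment" and event == "OrderShipped":
            self.state = "complete"
        # events that don't fit the current state are silently ignored
        # here; a real system would log or park them for later

proc = OrderProcess()
proc.handle("OrderShipped")    # arrives out of order: ignored
proc.handle("PaymentReceived")
proc.handle("OrderShipped")
print(proc.state, proc.commands)  # complete ['ShipOrder']
```

Even in this tiny example you can see the COMEFROM flavor: to answer "what causes `ShipOrder` to be sent?" you have to find every subscriber of `PaymentReceived`, which is exactly the whole-repo text search the comment describes.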
Note that Martin's blog is what inspired the event bus in https://home-assistant.io, an open source home automation project I occasionally contribute to.
Locking across streams is an anti-pattern / smell. It can be done (as can anything) but it usually points to a modelling problem. Example: cancelling an amazon order is a _request_ that is in a race with the fulfillment system (boundary); it may or may not be successful.
So, you get to pick, "at most once" or "at least once." And then you need to build your system to act accordingly.
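With "at least once" the standard move is to make the consumer idempotent, typically by deduplicating on an event id. A minimal sketch (the event shape is invented):

```python
# At-least-once delivery means the same event can arrive twice,
# so the handler records which event ids it has already processed.
processed_ids = set()
balance = 0

def handle(event):
    global balance
    if event["id"] in processed_ids:
        return  # duplicate delivery: safe to ignore
    processed_ids.add(event["id"])
    balance += event["amount"]

handle({"id": "e1", "amount": 100})
handle({"id": "e1", "amount": 100})  # redelivered duplicate
handle({"id": "e2", "amount": -30})
print(balance)  # 70
```

"At most once" flips the trade: no duplicates to handle, but the system must tolerate events that simply never arrive.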
Low-ish volume- design your system such that data flows [at the relevant crucial points] through a single actor to ensure proper concurrency.
High volume- trickier but I think same idea in principle. First thought that comes to mind here is the new GenStage stuff in Elixir.
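The single-actor idea can be shown without any framework: route all writes for an aggregate through one consumer draining a queue, so state changes are serialized without locking the state itself. A sketch in plain Python threads (GenStage plays an analogous role in Elixir):

```python
import queue
import threading

# All mutations flow through one writer thread, so no lock is needed
# around the state dict itself.
inbox = queue.Queue()
state = {"count": 0}

def actor():
    while True:
        msg = inbox.get()
        if msg is None:  # shutdown sentinel
            break
        state["count"] += msg

worker = threading.Thread(target=actor)
worker.start()
for _ in range(100):
    inbox.put(1)   # could come from many producer threads
inbox.put(None)
worker.join()
print(state["count"])  # 100
```

The queue is the serialization point: however many producers there are, the aggregate only ever sees one change at a time.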
You may not have to do deal with locks at a local level. You absolutely have to deal with locks at a system level.
For something like this my batch processor is implemented as you're probably used to—files get a CRUD model associated with them, schedule a background job to handle it, let locking get handled there. Once inside the batch processor you can use the same domain services and commands that you'd use from your application layer and commit events on a command basis, or on something like a row in the file (which may generate several commands and dozens of events depending on your model), or on a file level (all or nothing.)
The thing I see people do frequently (and sadly, have done myself on occasion!) that makes their lives harder is trying to shoehorn everything into ES without doing the design work to establish a domain, its boundaries, and what events make sense within it.
If you really must expose an SQL API, perhaps you could read the journal on another thread or process and then make changes to the db based on the incoming "diffs" that the thread determines from the journal?
There's also a great presentation by the developer, Allard Buijze, at https://www.youtube.com/watch?v=s2zH7BsqtAk.
Imagine tooling that allowed an event stream to be used to create state for testing modules, crudlike helpers to allow crud-familiar developers to think that way at first, and workflows based on snapshots, rewind, etc.
I think a model that used events that correlated to graph deltas rather than crud deltas would be the cat's ass, and many queries about the near-current state could be handled efficiently using ephemeral subgraphs as indexes located at the network's edges.
If anyone wants to discuss and possibly build some of this stuff, let me know :)
I know where you're going with this, and I honestly believe it's a terrible idea (not to be discouraging or rude—just experienced.)
If your event streams contain mostly CRUD events (possibly ANY), then you're most likely applying it incorrectly. It's not just a version history of your data. The event type itself is data, which provides context and semantics over and above the notion of writes and deletes. If you're falling back to CRUD events, all you're doing is creating a lot more work for yourself and deriving almost no benefit from the use of ES—in that case, you should just use CRUD and the ORM of your choice.
Right. A good way to think about this is that as with rows in an RDBMS, events in an ES system are facts, and just as tables in an RDBMS define a category of facts with a particular shape, event-types in ES do the same thing. The difference is that whereas in an RDBMS the facts represented by rows can be general (and are often, in many designs, facts about the current state of the world), events are facts about a specific occurrence in the world rather than the state of the world (and the "state of the world" is an aggregate function of the collection of events.)
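The contrast is easy to show. A CRUD-shaped event only says "a write happened"; a domain event names the fact, so the type alone carries business meaning a consumer can act on. Both event shapes below are invented for illustration:

```python
# A CRUD event records the mechanics of a write:
crud_event = {"type": "RowUpdated", "table": "orders",
              "id": 42, "column": "status", "value": "cancelled"}

# A domain event records the fact itself:
domain_event = {"type": "OrderCancelled", "order_id": 42,
                "reason": "customer_request"}

def notify_warehouse(event):
    """A consumer can dispatch on the fact directly, without having to
    reverse-engineer meaning from table/column plumbing."""
    if event["type"] == "OrderCancelled":
        return f"stop picking order {event['order_id']}"
    return None

print(notify_warehouse(domain_event))  # stop picking order 42
print(notify_warehouse(crud_event))    # None
```

A consumer of the CRUD event would have to know that `orders.status == 'cancelled'` means a cancellation - business knowledge smeared across every subscriber instead of captured once in the event type.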
Thank you for this, it has cleared many of my troubles with understanding event sourcing completely.
Thanks for that. I'd made that mistake: I have a system which now needs to become distributed (a copy of it goes offline for a couple of weeks, and has to merge back into the main datastore) and keep a history of changes. It's currently CRUD backed by MySQL, and I'd latched onto event sourcing as what I'd need.
> The event type itself is data, which provides context and semantics over and above the notion of writes and deletes.
OK, going to have to get my head around that :)
The thing is, DDD isn't easy. You're going to end up with someone on your team at some point who's inexperienced with it, and having a bunch of CRUD events in the schema already is going to entice them to add a few more, because it's easy, they're already there, and oh hey, what's a few more amongst friends?
Something else I've learned about this stuff is that because it works best in stable domains, it is also going to work best in domain services that have low code churn. Once you get it right, it tends to stay that way. But if you try to model with non-domain events, you're dramatically increasing the surface area that is subject to churn.
YMMV but I wouldn't do it.
This is also an antipattern, btw—events between bounded contexts should be somewhat limited and have well-defined behaviors. What you're doing with a scheme like that, sharing CRUD events between external services or systems, breaks all kinds of conceptual and logical encapsulation (in addition to the aforementioned CRUD thing.)
Pattern matching? Creating useful state snapshots? Curious to hear more about your experience.
We ended up going with microservices that pub/sub events into Kafka, but maintain their own databases. There's another microservice that lets you query past events for statistics.
This article was extremely helpful to me for understanding some solutions in this space.
Scaling that up by adding asynchronicity and more ambitious plumbing when needed seems reasonably straightforward. For something more out-of-the-box, see https://geteventstore.com/ . It has clients in a variety of languages. Comes with a nice HTTP API too.
I wouldn't normally read the entire event stream; usually, only the state of a particular object (aggregate, in Domain Driven Design speak) is of interest, e.g. the customer with ID 12345. Events contain the aggregate ID, so the query to whatever event store you use would be "give me all events with aggregate ID 12345".
GetEventStore documentation has some examples of how you can create projections (https://geteventstore.com/blog/20130212/projections-1-theory...), which you can use as inspiration to build your own projections.
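In the same spirit as those docs, a projection is just a fold over the stream that maintains a read model keyed by aggregate ID. The event shapes here are invented for the sketch:

```python
# An event stream; each event carries its aggregate ID.
events = [
    {"aggregate_id": "12345", "type": "CustomerRegistered", "name": "Ada"},
    {"aggregate_id": "12345", "type": "EmailChanged",
     "email": "ada@example.com"},
    {"aggregate_id": "67890", "type": "CustomerRegistered", "name": "Bob"},
]

read_model = {}  # aggregate ID -> current view of that customer

def project(event):
    """Apply one event to the read model."""
    cust = read_model.setdefault(event["aggregate_id"], {})
    if event["type"] == "CustomerRegistered":
        cust["name"] = event["name"]
    elif event["type"] == "EmailChanged":
        cust["email"] = event["email"]

for e in events:
    project(e)

print(read_model["12345"])  # {'name': 'Ada', 'email': 'ada@example.com'}
```

Run live against new events it keeps a cache warm; run over the whole stream from the start, it rebuilds the read model from scratch - which is the replay property people like about ES.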
In practice, that can be challenging, but it doesn't seem fundamentally more challenging than any other legacy data conversion effort.
So the first step is to disentangle all the data and encapsulate it, trying to prevent others from using it, so you have full control over it. This includes tracking down any other system using this data, and ensuring they too go through the database. And you have to do this for one subsystem at a time, often in several iterations.
Yeah, but that's not a "converting legacy data to ES" problem, that's a "converting legacy data to any non-broken thing" problem.
> This means you can't really convert the entire database into an event sourced system, as that means converting all of the tables in one single go, instead of a gradual change.
Whether it's ES or something else you are converting to, you either do a big-bang conversion and eat the pain of that (which can be tremendous, sure), or you instead eat the pain of taking the monolith and finding a way to break out components and do it incrementally, even though that takes not only building the new components, but reengineering parts of the old monolith to support that. Which, also, can be tremendous pain. But, again, this isn't really essentially tied to event sourcing; you face this dilemma even if you are going from a (broken for current needs, which is why it is being replaced) classically-designed "current state" RDBMS-backed system to a (meeting current needs, and hopefully more adaptable to future needs) classically-designed "current state" RDBMS-backed system.
Splitting it apart involves:
(1) Dividing the data and functionality into a legacy component and a new-implementation component,
(2) Making changes to the DB and application code for the legacy component,
(3) Implementing the new-implementation component.
In a monolith that you are breaking apart, the reusability of legacy code for the new-implementation component is likely to be low (you'll actually likely have to do extensive changes to the larger "legacy component" as well, but the reusability should be somewhat higher there.)
You have to use some technology for the new implementation component, and what you should aim for is whatever is the best fit for the job, whether it is similar to what existed before or not.
> If you have split it out, and it's now simple to convert to ES CQRS, then you are probably in a situation where you don't need to do that, as it works quite well.
I disagree. The hard part of converting to ES/CQRS for the components that are broken out ("new implementation" components, not the "legacy" reduced-monolith) is done in the analysis phase of what you are breaking out. Once that is done, implementation in a ES/CQRS manner is fairly straightforward, since defining the events that the component will handle is a core part of analysis, as is defining the impacts those events have on stored, reportable data (the query side of CQRS).
You won't have any history though.