Getting Started with Event Sourcing in Postgres (backerkit.com)
113 points by srt32 on July 13, 2017 | 57 comments



Before anyone goes building their very own eventstore based on this blog article: even calling this implementation of the pattern eventsourcing is a stretch. You don't "query" the event stream; your projections subscribe to it. (Also, there's no projection in this.) And while it's not the first time I've heard of someone using a table per event type, it's definitely something I'd regard as an antipattern.

There are MUCH better articles out there about eventsourcing, even in Rubyland where it's still fairly uncommon. This one is just cargo-culting, poorly.


I'm currently writing a book on this (ES/CQRS), but I have come to the conclusion through some of our projects that a "subscription" model is actually less than ideal, as you end up falling back to keeping a lot of state in the application heap.

We instead prefer to have a subscriber that re-runs the projections (which are purely functional) and dumps their state.

This mandates that you have some way of querying relevant events from the underlying store. Sometimes the best store for this is a DBMS; sometimes it's something simpler, like CAS linked lists in Redis or similar (lists of hashes pointing to keys, similar to git's ref/tag model).
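
For concreteness, here is a minimal sketch of what that querying might look like in Postgres. The table and column names (events, stream_id, version, payload) are illustrative assumptions, not anything from the article:

  CREATE TABLE events (
    id         bigserial PRIMARY KEY,
    stream_id  uuid        NOT NULL,
    version    integer     NOT NULL,
    type       text        NOT NULL,
    payload    jsonb       NOT NULL,
    created_at timestamptz NOT NULL DEFAULT now(),
    UNIQUE (stream_id, version)
  );

  -- a re-run of a purely functional projection just reads its slice and folds it
  SELECT type, payload
  FROM events
  WHERE stream_id = '00000000-0000-0000-0000-000000000001'
  ORDER BY version;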

Either way, this idea that you subscribe to a stream makes reconnection and place-holding (cursor) semantics difficult. Simply calling your update function as frequently as makes sense is simpler to reason about, and it helps highlight performance issues: it's easier to see the perf of frequently called, short-lived processes than to deal with the VM bloat of your chosen language's runtime managing long-lived subscribers.

Caveat: naturally this is all very subjective, just my 2 ¢


>Either way, this idea that you subscribe to a stream makes reconnection and place-holding (cursor) semantics difficult

This is true. It's also a problem if you drop a packet - you need to build up state again on the client side.

If you're writing a book about event sourcing, I should think you'll spend a good deal of time covering trading systems and order books since these are an area where event sourcing (AIUI) has been used for decades.


Yeah, I agree with most of that, particularly the utility of having your projections be pure functions. You can go a really long way on ES by doing on-demand projections (with or without snapshots, as you describe). However, I stand by what I said earlier: you don't query streams. Whether you subscribe as a live projection or request the entire stream (or perhaps the entire stream from version number X onward, to be used with snapshots) is definitely a YMMV kind of question.


Strongly recommend this as a good place to start: https://engineering.linkedin.com/distributed-systems/log-wha...


Do you have a link to some of those articles?


Google Greg Young and Udi Dahan for starters... If you're doing this in Ruby, you can check out Eventide as one option, but I'm not sure if it plays well with Rails, as the team behind it is squarely in the anti-Rails camp.


All this "event sourcing" stuff is fun to watch. I've been using essentially this paradigm in production for years. The difference is that I use a simple custom append-only database for it. The consumer side can read, parse, and handle well over 1M events per second on a single core. Take that, Postgres. If the code wasn't such a mess, I'd open source it.

I'm a bit mystified as to why it's called event sourcing, though. Everything sources events. The difference is that this data model has everything consuming events instead of consuming state.


> If the code wasn't such a mess, I'd open source it.

Those are always the best kind. Real, live, messy production code. You should open source it. We will learn something.


Also, if it works well, someone will come along and clean up and write tests for those hairy bits so you don't have to...


I second this. Please open source it. I suggest using the ISC, MIT or the 2-clause BSD license.


> I'm a bit mystified as to why it's called event sourcing, though. Everything sources events.

That's a bit like saying Historians know everything, since every known fact is in the past and therefore history :P

I think the point is that the events are kept and can continue to be used as the primary source of truth, rather than used and discarded.


I'm not sure but the term is pretty old.

https://martinfowler.com/eaaDev/EventSourcing.html

This is an article I've had for 10+ years and it's still roughly the same way I do it now. I've never needed to scale it beyond something like Postgres or Redis.


Isn't Event Sourcing the same as anything using a Write Ahead Log (WAL)? i.e. it's been around for decades.


No, a WAL is an implementation detail of a database; you could only consider that to be ES if you're building a database. ES refers to the primacy of domain-related events as the source of truth about your application state. Your application consumes events from individual streams (events belong to one or more streams, usually one) and changes state with each event. You can think of it as a functional data pattern.


>you could only consider that to be ES if you're building a database.

I poke around in the guts of databases, so that's what I meant. The WAL is ES in the context of databases. So I guess ES is the generalization/abstraction(?).


Probably more appropriate to think of ES as relying on a WAL for managing an ordered event stream. When getting the current state of a business object (e.g. "SELECT * FROM table WHERE pk=313"), you would replay the event log to get the current state and discard it when you're done. Mutation is only done by adding an event to the stream.

It's not uncommon to see RDBMSs still used to improve read performance (but not as a source of truth, just a cache with SQL).
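
To make the replay-on-read idea concrete, here's a rough Postgres-only sketch. It assumes an append-only events table with an entity_id column and shallow jsonb delta payloads; all names here are hypothetical, not from the article:

  -- a custom aggregate that folds jsonb deltas left-to-right
  CREATE AGGREGATE jsonb_merge_agg(jsonb) (
    SFUNC = jsonb_concat,
    STYPE = jsonb,
    INITCOND = '{}'
  );

  -- "SELECT * FROM table WHERE pk=313" becomes a replay of that entity's events
  SELECT jsonb_merge_agg(payload ORDER BY id) AS current_state
  FROM events
  WHERE entity_id = 313;

  -- mutation is only ever another append
  INSERT INTO events (entity_id, type, payload)
  VALUES (313, 'AddressChanged', '{"city": "Portland"}');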


Right, and then, if the domain allows it, you can compact the logs into the current state so the compacted state is the single source of truth. HDFS works this way, for example, by having an fsimage and the edit logs. After a compaction, the fsimage is the HDFS namespace. The RDBMS that you mention is just a coarse tool for doing the same thing (IIUC).


Ok, you win that round :)


A simple idea made massively over-complicated.


Along those lines, I know geteventstore.com and Kafka are used for event sourcing as "append only" databases, but anyway, it would be cool if you open sourced your system :) even if it's a mess.


I sort of missed this bit: volume numbers and query times.


Not what I do anymore so it isn't a surprise, but yeah, I was unaware this got a fancy name. In the late 90s, I built almost exactly what's described in the article at a dot-com, for a bulk mail delivery manager. Seemed like the obvious thing to do.


I see a lot of event sourcing tutorials based on ordering by timestamp. If events (state transitions) are generated concurrently, that leads to huge problems reconstructing the actual flow (and breaks state machines), for example when a user has multiple sessions open and makes concurrent changes in both.

Wouldn't it be better to store the identifier of the previous state and reconstruct the event tree using CONNECT BY/WITH RECURSIVE?
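
As a sketch of what that might look like in Postgres (table and column names are made up; parent_id is the "identifier of previous state"):

  WITH RECURSIVE branch AS (
    SELECT id, parent_id, type, payload
    FROM events
    WHERE id = 42                     -- some hypothetical leaf event
    UNION ALL
    SELECT e.id, e.parent_id, e.type, e.payload
    FROM events e
    JOIN branch b ON e.id = b.parent_id
  )
  SELECT * FROM branch;               -- one causal chain, walked back to its root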


If your event payload contains the whole updated state of the system, then yes, you need to know the ID of the original state in order to detect concurrent updates to the same original state. But if you want to keep things smooth for the user you'd have to merge those concurrently updated states, which may well be a massive pain, depending on your data model.

A better option is to only include the changes in the event payload, in which case you don't need to know the original state and you can simply project them all in order - and if one makes the next impossible, it's first come, first served. Again, though, depending on your data model, generating an event payload that only includes the changed elements may also be a pain.


But you do not necessarily need to merge events. Contrived example, but say you have some kind of transactional flow (e.g. accounting) that is in the end signed/sent/processed by a central authority/job. There are two concurrent transactions modifying the same object. If the flow is to commit the latest state, then user1 can accidentally commit user2's edits. This can easily happen if accountants connect with company credentials.

--- edit addition ---

For example, there is an error in a submitted (not yet committed) invoice for green and blue pens: the invoice is for 5 boxes of blue and 5 boxes of green pens, but the courier departed with 11 boxes total, because it was a quick call made directly to warehousing. Accounting does not particularly care about pen colour and gets a request to modify the invoice to total 11 boxes. Accountant 1 changes the green pen count to 6, accountant 2 changes the blue pens. There is now a high chance of signing the invoice with 12 boxes.

I understand that the problem in the example is solvable with checks and full-state changesets, but it is a problem nevertheless. That's why I am a proponent of history trees.


In your example, I think ending up with a 12-box invoice is absolutely the correct behaviour for the program.


> I see a lot of event sourcing tutorials based on ordering by timestamp. If events (state transitions) are generated concurrently, that leads to huge problems reconstructing the actual flow (and breaks state machines), for example when a user has multiple sessions open and makes concurrent changes in both.

Indeed. For Postgres it's much better to use https://www.postgresql.org/docs/current/static/logicaldecodi... to convert the WAL into a consistently ordered (commit time, with no issues with non-discrete or out-of-order clocks) stream of changes.
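
Roughly, the mechanics look like this (requires wal_level = logical; test_decoding is just the built-in demo plugin, and a real consumer would use something like wal2json):

  -- create a slot that decodes the WAL into a change stream
  SELECT * FROM pg_create_logical_replication_slot('es_demo', 'test_decoding');

  -- after some committed writes, consume the changes in commit order
  SELECT lsn, xid, data
  FROM pg_logical_slot_get_changes('es_demo', NULL, NULL);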


Correct. Time is also non-discrete. That is, one can only measure it to a certain degree of precision, and thus it provides only weak ordering.


You typically only see this method being used when the store is backed by a database like Postgres that comes down on the CP side of CAP. The timestamps get generated by the database, so they're a cheap way of providing a total ordering of the event stream. In addition, many if not most event stores/interfaces provide something like expectedVersion to enable optimistic concurrency: if the stream version doesn't match the event when it's published, the store kicks it out as an error. This is (I think) what you might be getting at when you say "identifier of previous state", except that events are not state; you build state from the events.
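
A hand-rolled sketch of that expectedVersion check on top of Postgres (not any particular store's API; the events table, stream_id, and version columns are assumed) can be as small as a uniqueness constraint:

  -- assumes an append-only events table with UNIQUE (stream_id, version)
  INSERT INTO events (stream_id, version, type, payload)
  VALUES ('00000000-0000-0000-0000-000000000001', 8, 'EmailDelivered', '{}');
  -- if a concurrent writer already appended version 8 to this stream, the
  -- unique constraint rejects this insert, which is the "wrong expected
  -- version" error the store hands back to the publisher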


The problem is not event ordering itself, which, as you described, is easily solved in a database, but event dependencies: some events can logically only fire when a certain chain of events has already fired. In the OP's article, it would be that an email cannot be delivered if it was not sent.

If possible event sequences are linear (or even a tree), then something intelligent can be done about it: maybe attempt to reconstruct the dropped event, maybe hold or drop the current event until the missing ones are in; it depends. It's no picnic, but doable. I see problems when possible event sequences form a graph. If you don't have metadata like "logically previous event" (which is not the same as "temporally previous event"), then eventually a situation will arise where you have two event sequences that are intermixed with no way to tell them apart, and then there is no way to apply projections and arrive at a consistent state.

Maybe it's the paranoid me talking, though.


Well, the OP is a somewhat contrived example of what I think you're talking about, which necessarily leads to bad discussion. But in that example, if the timekeeper is a single point of integration (e.g. Postgres, in my previous post), then it's not really possible for that kind of inversion to happen unless you're publishing events related to process A from process B, and process B's knowledge of events from A is by inference. That said, I have put myself in situations with asynchronous processes where the projection interleaved two different streams and the temporality of the events became inverted between streams. However, the solution in those situations is fairly straightforward: you pass a causality token to the next command or attach it as metadata to the events it gives rise to, which allows you to re-sort the graph in terms of causes. Worth noting is that this situation can arise even when you're using logical clocks if you have complex relationships between asynchronous processes, which is one reason most ES proponents don't recommend going async until you really need to.


Ordering by timestamp is significantly faster (and even that is too slow in some applications, so you want to cache the state somewhere). There are also applications where you don't know what the previous state is (e.g. because of network partitions).


The CTE in the final query seems like it should be replaced with a subquery (or view?) because the CTE is an optimisation fence, so it's going to seq scan emails every time, even if there's a WHERE condition that could use the message_id index.
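
The article's actual query isn't reproduced here, but as a generic illustration of the fence (table and column names borrowed from the suggested query elsewhere in this thread):

  -- the WHERE can't reach inside the CTE, so emails is scanned in full first
  WITH latest AS (
    SELECT DISTINCT ON (message_id) message_id, status
    FROM emails
    ORDER BY message_id, created_at DESC
  )
  SELECT * FROM latest WHERE message_id = 313;

  -- without the fence, the predicate can sit right next to the scan and use
  -- the message_id index
  SELECT DISTINCT ON (message_id) message_id, status
  FROM emails
  WHERE message_id = 313
  ORDER BY message_id, created_at DESC;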


I've inherited a project with CQRS+Event Sourcing but I can't wait to get rid of it. It's genuinely depressing to work with the code.


Could you elaborate on what you don't like about it?


It's to do with the fact that the original implementors decided to apply this design pattern for the entire system, including places where it's not really useful like CRUD.

A lot of redundant plumbing in the code with classes mapped to other classes. The data structure they chose does not allow inheritance so there are a lot of classes that look exactly alike where sub-classes would be useful.

This notion of a Command that generates an Event that generates more Commands that can generate more Events, asynchronously, with code for eventual consistency. Lovely concept, but database records don't exist. You can't just query a database. You have to build your views based on playing back your events. I just find it completely confusing.

Frustrating as well, because certain things that are trivial in an RDBMS end up costing an incredible amount of development time in CQRS + Event Store. I must admit unfamiliarity with the architecture is definitely a factor, except that my predecessors wrote everything from scratch, down to the JSON format of the data structure. One mistake and the microservices crash, leaving the data in an invalid state. In order to fix the data, you need to replay all the events - and there are millions of them. Nothing ever gets deleted because a 'delete' is a new event.

pffff ....


I don't blame you for being soured on it—sounds like there are a lot of antipatterns in play there!

> including places where it's not really useful like CRUD.

Probably the most common mistake made by architects doing it for the first time.

> You can't just query a database. You have to build your views based on playing back your events.

It's pretty common to have projections that build database tables so you can do exactly that. It sounds more like the original architects violated service boundaries (or delineated them poorly), which just creates a tightly coupled set of microservices, which is worse than the monolithic architecture it seeks to replace.

> In order to fix the data, you need to replay all the events - and there are millions of them. Nothing ever gets deleted because a 'delete' is a new event.

It's also common to archive streams periodically and "declare bankruptcy" with an initial state-setting event, proceeding forward from there. Snapshotting is also a thing.
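
A snapshotting sketch on top of Postgres, with purely illustrative names (and assuming an events table with stream_id/version columns like the one sketched upthread): keep a periodically refreshed state per stream and replay only the tail.

  CREATE TABLE snapshots (
    stream_id uuid    PRIMARY KEY,
    version   integer NOT NULL,
    state     jsonb   NOT NULL
  );

  -- rebuild = snapshot + only the events appended after it
  SELECT e.type, e.payload
  FROM events e
  JOIN snapshots s ON s.stream_id = e.stream_id
  WHERE e.stream_id = '00000000-0000-0000-0000-000000000001'
    AND e.version > s.version
  ORDER BY e.version;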

You might find this helpful in dealing with some of those issues: https://leanpub.com/esversioning/read#leanpub-auto-status


"you need to replay all the events" If your aggregates have millions of events, I would suggest snapshotting, or mementos.


What would you say are the worst pain points?


For those of you who use event sourcing: how do you scale horizontally? Do you have a single process that handles all events sequentially? Maybe I'm overthinking this, but I'm worried this might become a bottleneck eventually. Otherwise, how do you deal with concurrency issues (e.g. 2 different visitors who sign up with the same username concurrently)?


The first rule of eventsourcing is don't eventsource EVERYTHING.

Eventsourcing requires really, really detailed analysis of your boundaries (and no small bit of experimentation) to get right, even when you've done it before. It should really only be applied to the areas of your application that are going to see a real benefit from it. My proxy for measuring this is "how much money does this group of features make for me?" If the answer is some version of "none" or "none, but it's a requirement to have it anyway" then I don't try to apply ES to that area of the application. I never use ES on user records anymore, for example.

I'm sidestepping your actual question, but mostly because your use-case is something I consider to be a bit of a canard. To answer it, though: as others noted, "it depends". I model projections with actors, which means all events DO come through a single process handling events sequentially (and this is a very common way to do it). But because I've got tons of these actors (one per stream that's actively running), there's no real bottleneck; this is also one way you guarantee ordering within a stream.

Higher up in the stack, at the eventstore level, you'll probably have a pool of processes handling writes and routing events to the correct handlers/projections, but this is an implementation detail that can vary widely. The advice not to get bogged down in the details early on is good. The thing I see almost every developer new to ES do is get mired in the basic "how do I implement this" question, and that's really not what's going to kill your project. Not understanding your business and the software boundaries in your system well enough is where most projects go wrong.


> The first rule of eventsourcing is don't eventsource EVERYTHING.

I've heard this rule so often now, but never what exactly it is that you _can_ safely event source. I'm afraid it's something you can only get a feeling for by doing it a few times and making mistakes, but that way of learning is hard to justify in real projects.


> I've heard this rule so often now, but never what exactly it is that you _can_ safely event source. I'm afraid it's something you can only get a feeling for by doing it a few times and making mistakes, but that way of learning is hard to justify in real projects.

see above:

>> My proxy for measuring this is "how much money does this group of features make for me?"

That said, I think you're also correct: there's a lot of trial and error in developing your instincts for this architecture. That's also true of monoliths, web MVC, or really any architectural style. Experience, as they say, is that thing you only get AFTER you needed it.


Do you build upon an event sourcing library, or have you written everything from scratch inhouse?


I've done both. My advice: use a well-maintained OSS library or server to handle the event store and subscriptions, because it's easy to get bogged down in solving those problems yourself, and there's very little value in doing so.


Can you recommend an OSS library you've had success with?


My own is called Replay[1], but I wouldn't recommend it for most people. I wrote it when I was first learning how to apply ES in Ruby, back when the only material being written about it was for .Net. (I think I might have given the first-ever talk on ES at a RailsConf back in 2012.) Replay makes some assumptions I wouldn't make building something like that today, such as a very entity-focused perspective (e.g. in its original incarnation, all events were published from a projection by necessity), and it lacks some features I think a modern evented framework ought to have, like projection snapshotting. I've got a lot of private changes to it to evolve it some, but you're better off going with something like Eventide[2] if you're working in Ruby today.

I've also used some of Ben Smith's libraries in Elixir[3]. If you're in .Net, the Orleans framework has some really interesting ideas going on, and in Javaland, Akka and friends are what I'd start looking at.

[1] https://github.com/karmajunkie/replay

[2] https://github.com/eventide-project

[3] https://github.com/slashdotdash


Very good question. The answer is, of course, 'it depends'.

What it depends on is how much horizontal scale you expect to have. At a small scale, you can use something like Postgres as a centralized broker of sequence. A single Postgres cluster can sequence a lot of records in a hurry, no problem.

At a larger scale you might need to decompose your domain into multiple, independently ordered aggregates. Now you have multiple choke point brokers, each of which can handle a lot of records, but which don't block each other.

Step it up another notch and you can use a consensus algorithm (Paxos or Raft) to get a cluster of machines to agree on the state of the sequence, and perform optimizations such as assigning blocks of the sequence by shard and node.

I recommend not overbuilding your event storage architecture until you've measured your actual needs. There are cheap ways and huge expensive ways, and building too much is an easy way to make your project fail. Also, one of the nice things about a sequenced set of events is that it is pretty easy to replay them into your new, faster event storage later, once you've had the happy problem of too much success.


By having a good partitioning strategy, where you partition based on unique keys like usernames, you can very easily scale horizontally, because each consumer only consumes one partition. And it's still sequential for events that have to be sequential.
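
If the store itself is Postgres, the same idea can be expressed with hash partitioning (PostgreSQL 11+). This is only a sketch of the shape with made-up names, not the setup described above:

  CREATE TABLE events (
    id       bigserial,
    username text  NOT NULL,
    payload  jsonb NOT NULL
  ) PARTITION BY HASH (username);

  -- one consumer per partition: events for the same username always land in
  -- the same partition, so they stay sequential where it matters
  CREATE TABLE events_p0 PARTITION OF events FOR VALUES WITH (MODULUS 4, REMAINDER 0);
  CREATE TABLE events_p1 PARTITION OF events FOR VALUES WITH (MODULUS 4, REMAINDER 1);
  CREATE TABLE events_p2 PARTITION OF events FOR VALUES WITH (MODULUS 4, REMAINDER 2);
  CREATE TABLE events_p3 PARTITION OF events FOR VALUES WITH (MODULUS 4, REMAINDER 3);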


Is this the same as what Datomic does by default?


Maybe I'm missing something, but the final query in the article seems complicated/inefficient for no apparent reason. IMO this one should be better:

  SELECT DISTINCT ON (message_id)
    message_id,
    status
  FROM emails
  ORDER BY message_id, created_at DESC;


Great feedback! You are right and the post has been updated to reflect this feedback!


It's late and I'm tired and far from a SQL expert, so this may be a silly question, but why is the dense_rank needed instead of just using the event's timestamp directly in the 2nd query?


To break ties when there are multiple events with the same timestamp, I would guess.


With or without a dense_rank, a second sort order would be necessary to break ties.


The timestamp would work fine, but if you do a dense_rank you could easily include the number of events per message. They're not doing that in the example but perhaps the application they're basing the blog on does.
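
A sketch of that, reusing the emails table from the example above (the exact column names are assumptions):

  SELECT message_id,
         status,
         dense_rank() OVER w AS event_rank,
         count(*) OVER (PARTITION BY message_id) AS events_for_message
  FROM emails
  WINDOW w AS (PARTITION BY message_id ORDER BY created_at);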



