Stream processing, Event sourcing, Reactive, CEP… and making sense of it all

presty · on Feb 4, 2015

Yeah, good article, but it's already been discussed here: https://news.ycombinator.com/item?id=8966852

jpatte · on Feb 4, 2015

I'm currently in the process of redesigning the whole server architecture of our SaaS application to embrace event sourcing - we were using a classic relational database until now. I'm excited by all the new possibilities this will give us, but it's a lot of work to rewrite just about everything and to migrate old data into event streams. My advice to anyone who considers event sourcing for future projects: use it from the beginning. Transitioning from a more traditional model is hard.

Btw, the author seems to confuse events with commands. These are not the same thing: a command represents an action (by the user, another system or an internal process), while an event represents a change in the data. A command may generate one or more events during its processing. The command itself is not saved in a data store, but it can be replayed in case of failure if retry policies are in place.

boothead · on Feb 4, 2015

I always think that event sourcing is missing the concept of commands.

If you have the idea of commands (which I think are mentioned in some of the literature) it becomes a lot easier. Commands are intentions to change the world that might fail, and they have access to the current state when they are executed. If they succeed when they're run, they emit events. Doing it like this means that you are guaranteed a sane event stream, and it makes it much easier to map from an existing system to events.

I've written both a web scraper and a database exporter that work in this way: as the legacy system is traversed you try to run commands. If a command succeeds and produces events, you know that the events are compliant with your business logic. If the command fails, you have the option to fail the whole operation or continue without the parts of the old system that don't meet your business logic.
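A minimal sketch of that migration approach, assuming a toy domain: the `RegisterUser` command, the `UserRegistered` event, the `handle`/`apply`/`migrate` helpers and the row shape are all hypothetical names for illustration, not taken from any of the tools mentioned in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegisterUser:
    """Command: an intention that may be rejected (hypothetical example)."""
    email: str


@dataclass(frozen=True)
class UserRegistered:
    """Event: a fact recorded only after the command succeeded."""
    email: str


class CommandRejected(Exception):
    pass


def handle(state, cmd):
    """Check the command against the current state; return the events it emits."""
    if "@" not in cmd.email:
        raise CommandRejected(f"invalid email: {cmd.email!r}")
    if cmd.email in state:
        raise CommandRejected(f"duplicate email: {cmd.email!r}")
    return [UserRegistered(cmd.email)]


def apply(state, event):
    """Fold one event into the state (here: the set of known emails)."""
    return state | {event.email}


def migrate(legacy_rows, strict=False):
    """Traverse legacy rows, running one command per row.

    strict=True fails the whole operation on the first rejected command;
    strict=False skips rows that do not meet the business rules.
    """
    state, events, rejected = set(), [], []
    for row in legacy_rows:
        try:
            new_events = handle(state, RegisterUser(row["email"]))
        except CommandRejected as err:
            if strict:
                raise
            rejected.append((row, str(err)))
            continue
        for ev in new_events:
            state = apply(state, ev)
        events.extend(new_events)
    return events, rejected


legacy = [{"email": "a@example.com"}, {"email": "broken"}, {"email": "a@example.com"}]
events, rejected = migrate(legacy)
```

Here only the first row ends up in the event stream; the malformed and duplicate rows are collected in `rejected`, so they can be reviewed rather than silently lost.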

Edit: updated, as you mentioned commands above.

jpatte · on Feb 4, 2015

The concept of commands is very useful, but I don't think it belongs to the event sourcing pattern. Event sourcing is just a way to write and read data; it doesn't say anything about how you process inputs or how that processing can lead to a change of data. For that concern there are other patterns, like CQRS, which introduces the concept of a command. These patterns are complementary, and using both CQRS + ES is actually common practice.
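To illustrate event sourcing taken purely as a way to write and read data, here is a minimal sketch with no commands at all. The `InMemoryEventStore` class, the tuple-shaped events and the account example are assumptions made for illustration; a real store would persist the streams.

```python
from collections import defaultdict
from functools import reduce


class InMemoryEventStore:
    """Hypothetical append-only store: one ordered list of events per stream."""

    def __init__(self):
        self._streams = defaultdict(list)

    def append(self, stream_id, event):
        # Writing data means appending an event; nothing is updated in place.
        self._streams[stream_id].append(event)

    def read(self, stream_id):
        # Return a copy so callers cannot mutate the stored history.
        return list(self._streams[stream_id])


def apply(balance, event):
    """Fold one event into the current state (an account balance)."""
    kind, amount = event
    if kind == "Deposited":
        return balance + amount
    if kind == "Withdrawn":
        return balance - amount
    return balance  # unknown event kinds are ignored in this sketch


store = InMemoryEventStore()
store.append("account-1", ("Deposited", 100))
store.append("account-1", ("Withdrawn", 30))

# Reading data means replaying the stream from an initial state.
balance = reduce(apply, store.read("account-1"), 0)
```

How the `Deposited` and `Withdrawn` events came to be written is outside the pattern, which is where a command-handling layer such as CQRS would sit.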

ghc · on Feb 4, 2015

Some of this is very good, and it's a good beginner's introduction, but there is significant misunderstanding about the applications of FRP and the actor model in the area of complex event processing. These are fertile areas of CS research (to which I made some small contribution in the CS department at Yale), not just "loosely coupled ideas" or industry buzzwords. They have real academic meaning, even if the terminology is sometimes co-opted to make unrelated software sound cutting edge.

anebg · on Feb 4, 2015

Care to point me to a better place to learn about the applications of FRP and the actor model, beyond the buzz around them?

grandalf · on Feb 4, 2015

It's great to see the world finally rediscovering this stuff. CEP is a superb paradigm that makes reasoning about so many kinds of complex, asynchronous systems far easier. One just has to get over any aversion to keeping a massive event store. The good news is that great datastores exist for this purpose, and one can usually bootstrap by storing events in whatever relational DB you are already using until size/perf becomes an issue.

This is a great article. I really like the work from Stanford on Rapide (a declarative, logic-style language for event pattern rules).

http://complexevents.com/stanford/rapide/

The site is a bit outdated but the language is awesome.

Also check out these books on CEP:

http://www.amazon.com/Power-Events-Introduction-Processing-D...

and

http://www.amazon.com/Event-Processing-Action-Opher-Etzion/d...

kiyoto · on Feb 4, 2015

A great article. A couple of open source projects for CEP/event processing.

1. Esper (http://www.espertech.com) has been around for a while. This is the CEP engine a lot of people are familiar with.

2. Norikra (http://norikra.github.io/) is a schema-less event processing engine, often used with Fluentd (https://www.fluentd.org) as the data collector (Fluentd can do stream processing as well).

3. Apache Storm (https://storm.apache.org) has been popular in the Hadoop community, often with Kafka as the event source.

sleazebreeze · on Feb 4, 2015

Storm and Kafka are great building blocks for event streaming and processing, but if you're interested in a very well designed and non-opinionated framework for Java that helps you build CQRS applications, Axon[1] is worth looking into. The creator and maintainer, Allard Buijze, is a great guy (and he is paid to maintain it), and the code base is very solid, with improvements being made on a regular basis.

[1] http://www.axonframework.org/

tristanz · on Feb 4, 2015

This architecture seems great for analytics apps, but when expanded beyond that (e.g., Pete Hunt's Full Stack Flux talk) I never see explanations of basic patterns like validation. Where does validation happen and how are errors sent back to users?

For instance:

AddToCart(prod=1, quantity=1) -> Transactionally check that there is still inventory, return error if there isn't, add to stream if there is.

jpatte · on Feb 4, 2015

This is where the distinction between commands and events is important. "AddToCart" is a command (which might fail), while "AddedToCart" is an event that results from processing only if the command passed validation. You should store events, not commands.
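One way this could look for the AddToCart example, as a rough sketch: validation happens in the command handler, the error goes straight back to the caller, and only a successful command appends an event. The `Shop` class, its in-memory inventory and the lock standing in for a real transaction are assumptions for illustration.

```python
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class AddToCart:
    """Command: may be rejected."""
    prod: int
    quantity: int


@dataclass(frozen=True)
class AddedToCart:
    """Event: stored only when the command passed validation."""
    prod: int
    quantity: int


class Shop:
    def __init__(self, inventory):
        # Simplification: inventory is kept as plain state here; in a fully
        # event-sourced system it would be derived from the event stream.
        self._inventory = dict(inventory)
        self._lock = threading.Lock()  # stand-in for a real transaction
        self.events = []               # the stored event stream

    def handle(self, cmd):
        """Return (ok, message) to the caller; append an event only on success."""
        with self._lock:
            available = self._inventory.get(cmd.prod, 0)
            if cmd.quantity <= 0:
                return False, "quantity must be positive"
            if cmd.quantity > available:
                return False, f"only {available} left of product {cmd.prod}"
            self._inventory[cmd.prod] = available - cmd.quantity
            self.events.append(AddedToCart(cmd.prod, cmd.quantity))
            return True, "added"


shop = Shop({1: 1})
first = shop.handle(AddToCart(prod=1, quantity=1))   # succeeds
second = shop.handle(AddToCart(prod=1, quantity=1))  # rejected: out of stock
```

The rejected command leaves no trace in `shop.events`; the user simply gets the error message from the handler's return value.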

jnaour · on Feb 4, 2015

Really interesting blog post by one of the guys behind Kafka related to this topic: http://engineering.linkedin.com/distributed-systems/log-what...