Hacker News new | past | comments | ask | show | jobs | submit login
EventReduce: An algorithm to optimize database queries that run multiple times (github.com/pubkey)
208 points by eventreduce on April 16, 2020 | hide | past | favorite | 85 comments

Did I understand this correctly? You have a single set of items which you query by evaluating a predicate on each of them and then sort the matching ones. After the initial query you update the query result by looking at all the data update events, i.e. you remove delete items from the result, you insert matching new items in the correct position according to the sort order and you insert, remove, or move updated items as they start or stop matching the predicate and change their position according to the sort order.

Yes this is correct. The performance benefit comes from doing all this stuff on the CPU instead of using disc-io. Also the internal binary decision diagram of EventReduced is optimized in a way to run less logic then a query would do. This makes it even faster then running the query again with an in-memory database.

And the main cost of this (questionable IMO) benefit is losing consistency, which is losing any change to DB not coming from the calling app. You haven't mentioned this cost anywhere.

The writes are not tunneled somehow through this algorithm. You still use the database like your normally would do. So the consistency is not affected.

Also this is an open source project, not something I want to sell you. Feel free to make a PR/issue with any open topics that are not mentioned in the docs.

>The writes are not tunneled somehow through this algorithm

Then I fail to understand how it works. How Event-Reduce becomes aware of these "write events"?

>this is an open source project, not something I want to sell you

You made it open source so others can use it, right? They better be making an informed decision whether your solution suits their needs.

You have to provide the events by yourself. See EventReduce as a simple function that can do oldResults+Event=newResults.

And yes, you should always do testings before you use open source stuff. There is no warranty use it on your own risk.

OK, so you don't "tunnel writes through" EventReduce, you "tee" them to EventReduce.

Anyway, to maintain consistency, you have to limit yourself to one process of your app. No sharding, load-balancing etc. This is significant limitation, and it's not obvious. I encourage you to mention it in README.md.

I encourage you to read the readme and check out the demo. EventReduce is nothing magically drills out your database and affects the consistency of your write-accesses.

It is a simple algorithm that is implemented as a function with two inputs and one output.

> How Event-Reduce becomes aware of these "write events"?

Some DBs expose an event stream, for example, PG:


> For the different implementations in common browser databases, we can observe an up to 12 times faster displaying of new query results after a write occurred.

Is this intended to be an optimisation on top of localStorage and so on? If so, at least you don't have to worry about multiple writers.

localStorage is no database, check out the demo page.

Materialize exists to efficiently solve the view maintenance problem: https://materialize.io/

There is also noria [0] which is basically a SQL database with integrated materialized views with memcached like perf. - But I think it is more in a research state.

[0]: https://github.com/mit-pdos/noria

Any database which offers what are called "continuous queries" also solves this problem. I'm glad the ACM Digital Library is open right now, because I can just point to: https://dl.acm.org/action/doSearch?AllField=continuous+queri...

The work in general purpose streaming grew out of the continuous query work in the early 2000s. Marrying truly general purpose streaming with transactional databases is not a solved problem (as opposed to transactional databases providing streams), but the notion of continuously running database queries has been around for a long time.

They also allow for arbitrary joins. Basically every valid SQL query can be turned into a stream by Materialize which is pretty amazing.

This algorithm is limited to single table aggregates I think?

It should be able to aggregate across multiple tables, but it might need intermediate views. Can you give an example?

"EventReduce can be used with relational databases but not on relational queries that run over multiple tables/collections. (you can use views as workarround so that you can query over only one table). In theory Event-Reduce could also be used for realational queries but I did not need this for now. Also it takes about one week on an average machine to run all optimizations, and having more state functions looks like an NP problem."

And this doesn't even take into account recursive queries.

It's to bad that the EventReduce algorithm doesn't come with a paper, I'd love to compare the two.

If EventReduce needs joins to be delegated to views then it doesn't solve the really hard parts of view maintenance. Interleaving nested joins and aggregates also wouldn't be possible, because the aggregation and join live in separate worlds. And a lot of queries actually have that property. I guess you could push down all joins into a single megatable and then group-by the living hell out of stuff to get back the structure but that would create an O(insane) cartesian product from your database.

Materialize on the other hand is basically the holy grail of view maintenance. It supports ANSI SQL, which is a lot, and it's using the culminated work on differential dataflow and timely dataflow. Also it's written in Rust.

My prediction is in 10 years we'll all be using it in our backends and frontends. You'll stream live updates from your sql database to your frontent where it will completely replace React and Js with Rust and a local db that has the UI as a maintained view.

It seems to me that I want heterogeneous replicas of my databases. Partial indexes on this copy, full indexed on this one, no deletes on this one.

But a simpler solution might be to have a standard WAL format, and use that as a basis for having replicas of the system of record be a first class citizen in my data center instead of something we all cobble together or pay for.

Thanks for pointing that out. I never heard of materialize.io but I will dive into it. On the first glance it looks like materialize is more like a full product while EventReduce is just something that you use on top of your existing solutions.

How does this work for complex queries with sub-queries, LEFT OUTER JOINs, LATERAL JOINs, aggregation (DISTINCT/GROUP BY), window functions, CTEs, RECURSIVE CTEs?

I've a radically different approach: can the queries in question as VIEWs, materialize them, use triggers to update materializations where you can write those triggers easily and the updates are quick, or schedule an update where they're not.

If your RDBMS is very good about pushing WHERE constraints into VIEWs, and depending on how complex a VIEW query is, you might be able to make the update automatic by just querying the materialization's underlying VIEW with appropriate WHERE constraints from the ROWs being INSERTed/UPDATEd/DELETEd. You can tell which VIEWs might suitable for this by checking that the TABLE whose row the trigger is running for is a "top-level" table source for the VIEW's query: meaning a table source that's either the left side of a top-level LEFT JOIN, or either side of an INNER JOIN. If you can run a query on the VIEW with a timeout then you can just do that in the trigger and mark the materialization as needing an update if the query is too slow. Lastly, a scheduled or NOTIFYed job can run to perform any slower updates to a materialization.

It sounds like you did not read the READme. What you are describing is something that does scale up with more data but not with more requests. When you have an application where 1000s of users subscribe to different queries, your view-maintainance would kill the write performance while EventReduce does not.

Forgive me if I missed stuff, please point me in the right direction if you've covered it, but some questions if I may (after I've said well done!). I've considered this problem before and it seems very difficult . So:

1. Do you have a paper on this with a rigorous justification of the algorithm?

2. This surely has to rely on the isolation level being pretty high, or EventReduce might be reading while n other processes are updating. I don't see that mentioned.

3. Surely you need logical clocks for this? If not, could you point me to a high-level description of the algorithm to show why they aren't necessary.

4. Why does sort order matter? A timestamp yes (see 3. above), but I don't understand why the order matters.

thanks (and trying to understand this might be the thing to get me into looking at BDDs again. I never understood their value).

1. No I do not have a paper. I thought a lot about publishing a paper first but then decided against it, because I think that good code and tests and demos are more valuable.

2. EventReduce is mostly useful for realtime applications. I myself use it in a NoSQL database (RxDB). There you stream data and events and a single document write is the most atomic 'transaction' you can do. If you need transactional serial writes and reads that depend on each other, you would not use EventReduce for that.

3. EventReduce is just the algorithm that merges oldResults+Event. It assumes that you feed in the events in the correct order. Mostly this is meant to be used with databases that provide a changeStream where you can be sure that the order is correct.

4. Sort order matters because EventReduce promises you to always return the same results as a fresh query over the database would have returned. When the sort order is not predictable, the returned rows from a query depend on how the data is stored in the database. This order cannot be predicted by EventReduce which means it will then return a wrong result set.

PS: BDDs are awesome :)

It would probably be a good idea to write a paper at some point; it's simply easier to read a document explaining the algorithm with some pseudocode than to dig through an actual codebase with all the messy language-details in between the parts that actually matter.

I understand that reading the plain source code is more painful then reading a paper.

There are many different trade-offs between a paper and the current repository with source code. For me the biggest argument was that EventReduce is a performance optimization. So to be sure if it really works and is faster, you always need an implementation since you cannot predict the performance from a paper.

Because I did not have time for both, I only created the repository with the implementation. Maybe a paper will be published afterwards.

The value of a paper is the peer review by experts.

So if I get you, it's for append-only data - probably no updates, definitely no no deletions? Still don't get how you don't need logical clocks to pick out the delta(s), but thanks for your prompt answer.

Edit: your example gives replaceExisting() so that's supporting an update of some kind.

No it is explicitly not for append-only data. It works with inserts, updates and deletes. I think I have problems understanding what exactly you mean by the need for a logical clock. The algorithm is feeded with the old query results plus one event, and then returns the new query results. Since there is only one event at each point of time, it does not have to order or maintain them.

I had the same question as #2. Basically, it has to be the front-end to any event that reads/writes the data, in strict order of occurrence?

Not exactly. To use EventReduce you must have a changestream out of all writes to your data int the correct order. You can do that by wrapping a frontend over your database.

But easier you do that by using a database that already provides a changestream like couchdb, Postgres, mongodb and so on.

BDDs you are using, are they zero-supressed decision diagrams or it was not necessary to do these kinds of optimizations?

Yes the BDD is minimized with the two rules (reduction and elimination). Also the sorting of the boolean functions is optimized via plain brute forcing.

The was no good JavaScript implementation for BDDs so I had to create my own one https://github.com/pubkey/binary-decision-diagram

Cool, I've checked your code and it's not zero suppressed BDDs, although it might not be a performance gain if you used ZDDs. (zero supressed BDDs are very good at representing sets of permutations/combinations etc. but as far as I understand you have all the possible permutations encoded in the BDD, not a subset of all possible permutations).

This seems conceptually similar to differential dataflow.


"EventReduce can be used with relational databases but not on relational queries that run over multiple tables/collections."

Forgive my ignorance, but that is the whole point of working with a relational database. If cannot use JOINS then this solves only a very limited use case.

The biggest usecase for EventReduce is realtime applications. Most technologies for these like Firebase, AWS AppSync etc. work on non-relational data. If you want to use EventReduce with relational queries, you have to make them non-relational before, for example by using materialized views. If you do not want to do that, you should not use this algorithm in its current featureset.

My guess would be that if you're at a scale where you're thinking about these sorts of things, you are also at a scale where you're running on multiple machines. How does EventReduce share writes across the cluster?

EventReduce is an algorithm and not a database-wrapper. It will not care about your writes or if your database layer is a cluster and so also not affect them.

Sorry I wasn't clear in my original post.

I'm thinking about the application layer. If you have an application that writes data to a table, it's typical to run multiple instances of that application to support scale and reliability requirements.

If I send a write to one instance, how does it communicate and synchronise that write with the other application instances?

I ask because this can be a tricky thing to do, especially when consensus is required, as consensus algorithms such as Raft/Paxos require a number of network roundtrips which will introduce latency, and actually account for much of that latency in the database examples given in some cases.

EventReduce is a simple algorithm. It does not care or affect how you handle propagation of writes or how you handle your events, transactions or conflicts.

See it as a simple function that can do oldResults+event=newResults like shown in the big image on top of the readme.

This means then that if you run multiple application servers, which most do, that you’ll need to implement a data distribution mechanism of some sort.

I must admit, with limitations like this I’m struggling to figure out the use cases for this.

Edit: so I guess this is easier using the change subscriptions you mention in other comments. That does mean many subscribers, but hopefully that’s minimal load. This has the trade-off that it’s now eventually consistent, but I suppose that’s not a problem for many high read applications.

I’m still feeling like this could be solved in a simpler way with just simple data structures and a pub sub mechanism. Now I think of it, we do a similar thing with Redis for one service, and a custom Python server/pipeline in another, but we’ve never felt the need for this sort of thing.

Do you have more details about specific applications/use cases, and why this is better than alternatives?

I think the best example for why this is useful is described by david glasser at his talk about the oplog driver used in meteor.js https://www.youtube.com/watch?v=_dzX_LEbZyI

thank you for the clarification

Noria [1] is a research database that solves the same problem while still supporting all relational database queries.

[1] https://github.com/mit-pdos/noria

I think it is dangerous to propose a database product as a solution to the limitation of a simple algorithm.

This sounds like (a simpler version of?) Lambda Architecture [1, 2]

[1] https://en.wikipedia.org/wiki/Lambda_architecture [2] https://www.manning.com/books/big-data

To me this sounds a lot like how CRDT works (just simpler and not rigorously proven).

e.g. making deterministic changes to some data before you've heard back from any sort of central single-source-of-truth.

Difference with CRDT is that the goal is to never actually hear back from a central authority and still have consistency between clients.

Maybe this is a sort of hybrid CRDT/OT.

I do not think this has many in common. Lambda is used for stream processing much data. EventReduce is used for optimizing the latency of much (repeating) queries.

The goal of this is to reduce DB queries? Why not just queue up / batch writes? What benefits does this provide over application side batching of the event.

EvenReduce assumes there are no other systems interacting with the DB state (by using the old state that the current system saw). If there are no other systems, simple batching would work fine.

All your assumptions are wrong. Please read the other comments here or at least the readme of the repository. I will happily answer all ongoing questions you have afterwards.

I’m going to look at the code, but how does are transactions in the database handled in eventreduce?

Specifically, I’m wondering about isolation levels which determine whether uncommitted changes are queryable before commit/rollback.

IMO An open cursor of Change Stream with Aggregation pipeline (for given use-case) in MongoDB is more flexible solution to achieve this functionality.

In addition, it also tracks the history of changes and hence allows the cursor to go back if needed with "resumeToken"


There is a big difference between a change-stream and a realtime query. For example mongodbs cursor-stream is a good way to observe the events that happen to a specific collection or documents that match some criteria. If you want the realtime-results of a query that has sorting, skip limit etc. than it is really hard to warp the changestream into this. In fact this is exactly what EventReduce could do for you.

For more information about the difference I recommend the video "Real-Time Databases Explained: Why Meteor, RethinkDB, Parse & Firebase Don't Scale" https://www.youtube.com/watch?v=HiQgQ88AdYo&t=1703s

>> There is a big difference between a change-stream and a realtime query. For example mongodbs cursor-stream is a good way to observe the events that happen to a specific collection or documents that match some criteria. If you want the realtime-results of a query that has sorting, skip limit etc. than it is really hard to warp the changestream into this.

Have you actually tried it from mongo shell or any mongodb client driver ?

In the official document link which i shared it is clearly mentioned it supports aggregation pipeline. Any operator which is compatible with aggregation pipeline framework including "$sort" and "$skip" can be used. You can also use JOIN like operator "$lookup" or "$graphLookup".

See this link for info https://docs.mongodb.com/manual/core/aggregation-pipeline-op...

Yes I used it. I actually know it really well. I also did performance comparisons with mongodb and mongodbs change stream and cursors. What I posted here is just an algorithm. You could now compare it to mongodb (a product) and say it is a "more flexible solution" but I do not see the point in directly comparing it simply based on the documentation of both.

>> Yes I used it. I actually know it really well. I also did performance comparisons with mongodb and mongodbs change stream and cursors.

Can you share the link for the code and data in Database against which you are querying to prove your claim ?

No and I also do not want to "claim" something. Feel free to do your own tests.

Very cool! This reminds me of some research I did a few years ago on program consolidation: https://dl.acm.org/doi/10.1145/2594291.2594305

Thanks that looks interesting. I added it to my to-read list.

Databases like PostgreSQL don't offer insights into the query plans, does EventReduce parse the SQL statements to determine which tables and rows will be affected by a query and run the appropriate caching or cache invalidation logic?

This is only for queries where you explicitly want the query plan.

However, your query won't return the query results.

And regular PostgreSQL queries don't let you return the query plan they used on top the query results.

This is a great question. So EventReduce is only the algorithm that calculates your new results. The parsing of SQL is not done by it, you have to bring it by yourself. This works over providing some information about the query like Sort-fields and query-matchers. This is described good in the JavaScript implementation [1]. Providing this functions works easy for NoSQL-Queries because they are better composable. For SQL-queries you have to do some work before you can use EventReduce.

Also see the limitations of EventReduce in the readme.

[1] https://github.com/pubkey/event-reduce/tree/master/javascrip...

PostgreSQL does offer insights into the plan. Just put EXPLAIN before your query.

Right but that's when you explicitly ask for it.

However, you can't ask ask PostgreSQL to run a query and also return the query plan used for said query.

Unless there's some weird edge case that I'm not aware of, Postgres will execute what it's planner tells it to. Passing a query to EXPLAIN will show the plan.

Query plans are based on database statistics.

Explain will calculate the query plan for a query.

Explain analyze will calculate the query plan, run the query and compare the query plan expectations to reality.

However, if statistics change, so does the query plan. So if you run a query then run the same query again with explain analyze, you don't have a guarantee of getting the same information back. And since explain analyze doesn't return the query results you are obligated to run two separated queries.

> However, if statistics change, so does the query plan. So if you run a query then run the same query again with explain analyze, you don't have a guarantee of getting the same information back. And since explain analyze doesn't return the query results you are obligated to run two separated queries.

If that's a problem you can use auto_explain.

It reads like it cannot handle queries over multiple tables.

Many applications solve this by using memory caching (e.g. Redid, memcached, etc.) of performance sensitive datasets. There are a lot of drawbacks to the approach, to the point that I would avoid it altogether.

The difference is that with simple caching you have to run full queries again when the cache becomes invalidated. This is often expensive especially on write intensive data usages.

Have you compared with an in-memory data store like Redis? Due to the lack of support for joins, that seems like a more natural comparison than A relational database.

Interesting, seems similar to Firebase's Firestore NoSQL database. In that you can create a complex query and receive real-time updates on each query.

So... a write-through cache?

The use case seems better fit for a streaming database.

There is a big difference between a database with an event stream and a 'realtime query' that can be created with event reduce.

What is that difference?

I recommend the video "Real-Time Databases Explained: Why Meteor, RethinkDB, Parse & Firebase Don't Scale" https://www.youtube.com/watch?v=HiQgQ88AdYo

That does not answer what differentiates your solution. I work on steaming systems. I am aware of the spectrum of online, latency aware data processing. But what I can tell from your solution is that the changes are coming from the database itself. Since, as I understand it, the database is the still the source of all data, I don’t see why your solution is any faster than continuous queries in a database.

Sooo...Materialized/Indexed Views?

No, see the FAQ in the readme:

Materialized views solve a similar problem but in a different way with different trade-offs. When you have many users, all subscribing to different queries, you cannot create that many views because they are all recalculated on each write access to the database. EventReduce however has a better scalability because I does not affect write performance and the calculation is done when the fresh query results are requested not beforehand.

Others have pointed this out already: This is wholly dependent on the RDBMS used, and Oracle offers incremental refresh on demand. I have to admit though that I mistakenly thought that MSSQL would as well...

Correct me if I'm wrong, but Materialized Views have the limitation that you need to refresh the entire view! Often you know which rows will change based on the data you receive.

This depends on the RDBMS. Oracle for example has incremental materialized view updates.


Applications are open for YC Winter 2022

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact