
Materialize https://materialize.io/ Incremental update/materialization of database views with joins and aggregates is super interesting. It enables listening to data changes not just on a row level, but on a view level. It's an approach that may completely solve the problem of cache invalidation for relational data. Imagine a memcache server, except one that also guarantees consistency. In addition, being able to listen to changes could make live-data applications trivial to build, even with filters, joins, whatever.
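
A minimal sketch of the idea in Materialize's SQL dialect (table and columns invented for illustration):

    -- Keep a per-user aggregate incrementally up to date as rows change.
    CREATE MATERIALIZED VIEW user_totals AS
        SELECT user_id, sum(amount) AS total
        FROM orders
        GROUP BY user_id;

    -- Subscribe to changes on the *view*, not the base table; each
    -- update arrives as a diff (row, timestamp, +1/-1 multiplicity).
    TAIL user_totals;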

Similarly, someone is developing a patch for Postgres that implements incrementally updated/materialized views [1]. I haven't tried it, so I can't speak to its performance or the state of the project, but according to the Postgres wiki page on the subject [2] it supports some joins and aggregates, though it's probably not something you'd want in production yet.

[1] https://www.postgresql-archive.org/Implementing-Incremental-... [2] https://wiki.postgresql.org/wiki/Incremental_View_Maintenanc...
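
Going by the wiki page [2], the patch's syntax is just an extra keyword; a sketch I haven't run myself, per the above:

    -- Syntax as shown on the wiki page [2]; untested.
    CREATE INCREMENTAL MATERIALIZED VIEW order_counts AS
        SELECT user_id, count(*) AS n
        FROM orders
        GROUP BY user_id;

    -- Writes to orders then maintain order_counts automatically,
    -- with no manual REFRESH MATERIALIZED VIEW.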






+1, very excited about this.

They're marketing it in the OLAP space right now, but at some point I'd like to try integrating it with a web framework I've been working on.[1][2] It'd be a more powerful version of Firebase's real-time queries. Firebase's queries don't let you do joins; you can basically only filter over a single table at a time, so you have to listen to multiple queries and join the results by hand on the frontend. That doesn't work if you're aggregating over a set of entities that's too large to send to the client (or that the client isn't authorized to see).

[1] https://findka.com/blog/migrating-to-biff/ [2] https://github.com/jacobobryant/biff


Thanks for the vote of confidence! One thing: We're not marketing it in the OLAP space. Our existing users very much are building new applications.

Initially we went for the metaphor of "what if you could keep complex SQL queries (e.g. 6-way joins and complex aggregations, the kinds of queries that today are essentially impossible outside a data warehouse) incrementally updated in your application within milliseconds? What would you build?"
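
Concretely, a scaled-down sketch of the kind of view we mean (schema invented for illustration):

    CREATE MATERIALIZED VIEW revenue_by_region AS
        SELECT r.name AS region, p.category, sum(li.amount) AS revenue
        FROM line_items li
        JOIN orders o    ON li.order_id   = o.id
        JOIN customers c ON o.customer_id = c.id
        JOIN regions r   ON c.region_id   = r.id
        JOIN products p  ON li.product_id = p.id
        GROUP BY r.name, p.category;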

We're moving away from that metaphor because it seems to be more confusing than helpful. Tips always appreciated!


Ah, thanks for the correction. In any case, I'm looking forward to trying it out eventually; I've got a number of other things ahead of it in the queue, though.

My suggestion would be to consider comparing it to Firebase queries. Firebase devs are already familiar with how incrementally updated queries can simplify application development a lot. But despite Firebase's best marketing attempts, those queries are very restrictive compared to SQL or Datalog.


I’ve always wanted to take the time to try to build this. It’s been possible in PG for a while to use a foreign data wrapper to directly update an external cache from a trigger, or to pubsub the change to something that can do it for you.

Making it easy here would be absolutely fascinating.
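
E.g. the trigger-plus-pubsub half is only a few lines of plpgsql; a minimal sketch, assuming a hypothetical orders table and an external listener on the channel:

    -- Publish each row change on a NOTIFY channel; an external
    -- listener picks it up and updates the cache.
    CREATE FUNCTION notify_change() RETURNS trigger AS $$
    BEGIN
        PERFORM pg_notify('cache_invalidation', row_to_json(NEW)::text);
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER orders_notify
        AFTER INSERT OR UPDATE ON orders
        FOR EACH ROW EXECUTE FUNCTION notify_change();

The hard part is the other half: knowing which cached view results a given row change actually invalidates.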



Materialize is based on differential dataflow, which in turn is based on timely dataflow. The abstraction works like magic: distributed computation, ordering, consistency, storage, recalculation, invalidation... all those hard-to-solve problems are handled naturally by the computing paradigm. Maybe the products look similar, but the principles behind them are not.

Principles only matter to hackers, but the end result for end users is identical.

It’s just very unfortunate that Materialize has a much, much bigger marketing team than the Datomic people.


Materialize is streaming; Datomic is poll-based.

How are they close?

This looks great - I've been looking into Debezium for a similar idea, but it doesn't natively support views, which makes sense from a technical POV but is rather limiting. There are a few blog posts on attaching metadata/creating an aggregate table, but they involve the application creating that data itself, which seems backwards.

It would be huge if Materialize supported this out of the box. I believe it's a very useful middle ground between CRUD overwriting data and event sourcing: I still want my source of truth to be an RDBMS, but downstream services could consume the change stream instead.


This is exactly what we do! This is a walkthrough of connecting a db via Debezium and defining views in Materialize (these docs are for MySQL, but Postgres works and is almost identical): https://materialize.io/docs/demos/business-intelligence/
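
Condensed from that walkthrough, the shape of it is roughly (broker, topic, and registry addresses are placeholders):

    -- Ingest the Debezium change stream for a table from Kafka...
    CREATE SOURCE orders
    FROM KAFKA BROKER 'kafka:9092' TOPIC 'dbserver1.inventory.orders'
    FORMAT AVRO USING CONFLUENT SCHEMA REGISTRY 'http://schema-registry:8081'
    ENVELOPE DEBEZIUM;

    -- ...then define incrementally maintained views over it as usual.
    CREATE MATERIALIZED VIEW orders_per_user AS
        SELECT user_id, count(*) AS n FROM orders GROUP BY user_id;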

That's super interesting. Will need to read a lot more about it though.

Cool! It's interesting!

Doesn't Hasura or PostGraphile do this better? They give a GraphQL API over Postgres with support for subscriptions, along with authentication, authorization, etc.

You could shoehorn Hasura into this use case, but those tools are primarily intended for frontend clients to subscribe to a schema you expose.

Change data capture lets you stream database changes to a message bus or log, which fits backend service requirements much better. Example: if a downstream service goes down, how would it retrieve the missed events from Hasura? With Kafka or a buffered message bus, you can replay events to the service.

Never mind having to support websockets in all your services :/


It looks similar to CouchDB?


