The biggest difference between automatically keeping an MV up to date vs keeping indices up to date is that the write amplification of the latter is a function of the index count you have, while the former is a function of data and query. It’s easy to come up with cases when users update a single db row, and you end up having to update millions of rows in a MV (eg the every row in the MV has a name “Rob” and Bob changes his name).
I read the timely dataflow which underpins materialize.com. It seems like we don’t necessarily need the support of loops, which timely dataflow allows, for regular SQL, which is a DAG of operators. It appears that as long as the database supports snapshot reads, one can have a push-based query execution to enable incremental MV updates. The problem, I think, is still in the write-amp-as-a-function-of-data, which is unbounded. It is very cool regardless.
The technique can be used for cache invalidation as well, given the data cached needs to be described in SQL, which seems reasonable.
I read the timely dataflow which underpins materialize.com. It seems like we don’t necessarily need the support of loops, which timely dataflow allows, for regular SQL, which is a DAG of operators. It appears that as long as the database supports snapshot reads, one can have a push-based query execution to enable incremental MV updates. The problem, I think, is still in the write-amp-as-a-function-of-data, which is unbounded. It is very cool regardless.
The technique can be used for cache invalidation as well, given the data cached needs to be described in SQL, which seems reasonable.