Ground: A Data Context Service (2017) [pdf] (berkeley.edu)
46 points by espeed 4 months ago | 4 comments

Needs a (2017) tag. Anyway, here's the repo: https://github.com/ground-context/ground

Interesting paper, but the repo (https://github.com/ground-context/ground) seems abandoned. I have seen companies lose out on a lot of the value they could extract from their data because metadata context (discoverability, lineage, and semantics) is the last thing on their plate or is not considered important across the org.

The primary author is Joseph Hellerstein [1], head of the Berkeley RISELab [2] (the new Spark/AMPLab). Their primary project is called Fluent [3] (which until recently was called Anna [4])...it's the all-encompassing name changes that get ya! ;-)

Similarly, see the SIGMOD 2018 paper MISTIQUE [5] by Matei Zaharia (the original Spark lead) and his team at MIT (he's now at Stanford)...

[1] Joe Hellerstein https://github.com/jhellerstein

[2] Berkeley RISELab https://rise.cs.berkeley.edu https://github.com/ucbrise

[3] Fluent Compute Platform https://github.com/fluent-project/fluent

[4] Anna: A Crazy Fast, Super-Scalable, Flexibly Consistent KVS https://databeta.wordpress.com/2018/03/09/anna-kvs/ Discussion: https://news.ycombinator.com/item?id=16551072

[5] MISTIQUE: A System to Store and Query Model Intermediates for Model Diagnosis https://cs.stanford.edu/~matei/papers/2018/sigmod_mistique.p...

Or that it's so hard to do well enough to be useful that hardly anyone can, except possibly Google as a special case.

