
A major problem with these table formats that will surface soon enough is that they use serial numerical ordering for versions.

It's like inventing SVN for data. Soon enough git will have to be invented as well.
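To make the SVN-vs-git point concrete, here is a toy sketch (not taken from any of these table formats; `next_serial` and `commit_id` are made-up names for illustration). With serial numbers, two writers starting from the same version both claim the same next number; with parent-hash ids, they get distinct ids that share a parent, which is what makes branching and merging possible.

```python
import hashlib

# Serial scheme: a version is just the next integer, so two writers
# that both start from version 3 both try to create "version 4".
def next_serial(current: int) -> int:
    return current + 1

# Git-like scheme (illustrative): a version id is a hash of the parent id
# plus the content, so concurrent writers get distinct ids with a shared parent.
def commit_id(parent: str, content: str) -> str:
    return hashlib.sha256(f"{parent}\n{content}".encode()).hexdigest()[:12]

base = 3
writer_a = next_serial(base)
writer_b = next_serial(base)
serial_conflict = writer_a == writer_b  # both writers claim the same number

root = commit_id("", "initial snapshot")
branch_a = commit_id(root, "append rows from job A")
branch_b = commit_id(root, "append rows from job B")
dag_conflict = branch_a == branch_b  # distinct ids, common parent
```

In the serial case one writer has to lose and retry; in the hashed case both histories can exist side by side until someone merges them.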




"Project Nessie: Transactional Catalog for Data Lakes with Git-like Semantics"

https://projectnessie.org/

It supports lightweight branches as well as transactional commits and merges. I haven't used it, and it seems cool, but it also seems a little heavyweight if all you want is cross-table transactions on top of these table formats (which would be my primary use case).
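For what I mean by cross-table transactions, a minimal sketch, assuming a single catalog that owns the current snapshot pointer of every table. This is not Nessie's API; `ToyCatalog` and its methods are hypothetical and only show the all-or-nothing idea.

```python
import threading

class ToyCatalog:
    """Hypothetical in-memory catalog: one commit can move several table pointers."""

    def __init__(self):
        self._lock = threading.Lock()
        self._heads = {}  # table name -> current snapshot id

    def commit(self, expected: dict, updates: dict) -> bool:
        # Apply all updates or none: succeed only if every table in `expected`
        # is still at the snapshot the writer read (compare-and-swap across tables).
        with self._lock:
            for table, snapshot in expected.items():
                if self._heads.get(table) != snapshot:
                    return False
            self._heads.update(updates)
            return True

    def head(self, table):
        return self._heads.get(table)

catalog = ToyCatalog()
ok_first = catalog.commit({}, {"orders": "s1", "payments": "p1"})
# Stale writer: still believes "orders" is at s0, so nothing is applied.
ok_stale = catalog.commit({"orders": "s0"}, {"orders": "s2", "payments": "p2"})
```

The stale commit leaves both tables untouched, so readers never see "orders" updated without the matching "payments" change.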


Project Nessie also powers Dremio's Arctic service, so you can get all the branching and other benefits of Nessie with an intuitive UI to browse, create, and merge branches. It is also a cloud-managed service with a free tier.



Check out lakeFS (https://github.com/treeverse/lakeFS). It doesn't rely on the object store for ordering and is heavily influenced by Git itself (but designed to work on object stores at very large scale).



