I've used DVC in the past and generally liked its approach. That said, I wholeheartedly agree that it's clunky. It does a lot of things implicitly, which can make it hard to reason about. It was also extremely slow for medium-sized datasets (low 10s of GBs).
In response, I created a command-line tool that addresses these issues[0]. To reduce the comparison to an analogy: Dud : DVC :: Flask : Django. I have a longer comparison in the README[1].
In response, I created a command-line tool that addresses these issues[0]. To reduce the comparison to an analogy: Dud : DVC :: Flask : Django. I have a longer comparison in the README[1].
[0]: https://github.com/kevin-hanselman/dud
[1]: https://github.com/kevin-hanselman/dud/blob/main/README.md#m...