And further, with this pg_mooncake extension allowing you to store the data in S3, is Postgres simply providing compute to run DuckDB? I suppose it's also providing a standardized interface and "data catalog."
Transactions are also managed by Postgres as if these were native tables, so you don't need to worry about coordinating commits between Postgres and the S3 data.
Yep. I personally like DVC's pipeline implementation because it's lightweight and language-agnostic, but I haven't gotten into using their experiment tracking features.
I have noticed this as well. There is a huge resistance to learning Git, and I think it's partly warranted. Researchers know what it is and know that it's valuable, but they think it will take too long to learn, and they want to move fast. I recently started building a tool called Calkit (https://github.com/calkit/calkit) in an attempt to simplify and unify Git and DVC for these types of researchers. I'm hoping to convince folks that working reproducibly is actually faster in the long run, never mind the fact that it makes their work more directly usable by others, which helps push the field forward more quickly overall.