Ask HN: Machine Learning Versioning
4 points by plumsempy 8 days ago | 1 comment
Hello friends

I am looking for advice and resources on versioning and managing machine learning models and data.

Thanks in advance.

It depends. Most non-toy projects are done by more than one person. The tools work great for one person, and you can stitch them together: say, MLflow or GuildAI for tracking experiments and saving models, and MinIO for data, since they have merged a pull request to support versioned buckets.
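
To make the one-person case concrete, here is a rough sketch of the idea those tools build on: give each dataset and model file a version id derived from its content, and write a small record tying a model to the data and parameters that produced it. This is not the MLflow, GuildAI, or MinIO API. The function names, the directory layout, and the parameter and metric values are all made up for illustration, and it only uses the Python standard library.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def save_version(store: Path, kind: str, payload: bytes) -> str:
    """Store payload under its SHA-256 digest and return the digest as the version id."""
    version = hashlib.sha256(payload).hexdigest()
    target = store / kind / version
    target.parent.mkdir(parents=True, exist_ok=True)
    if not target.exists():  # identical content is stored only once
        target.write_bytes(payload)
    return version


def record_run(store: Path, data_version: str, model_version: str,
               params: dict, metrics: dict) -> dict:
    """Write a JSON record linking a model version to its data, params, and metrics."""
    record = {
        "data_version": data_version,
        "model_version": model_version,
        "params": params,
        "metrics": metrics,
    }
    body = json.dumps(record, sort_keys=True)
    # The run id is derived from the record itself, so the same inputs give the same id.
    run_id = hashlib.sha256(body.encode()).hexdigest()[:12]
    runs = store / "runs"
    runs.mkdir(parents=True, exist_ok=True)
    (runs / f"{run_id}.json").write_text(body)
    return {"run_id": run_id, **record}


def demo() -> dict:
    """Version a tiny dataset and a stand-in model, then record the run."""
    with tempfile.TemporaryDirectory() as tmp:
        store = Path(tmp)
        data_v = save_version(store, "data", b"x,y\n1,2\n3,4\n")
        # Placeholder bytes stand in for a serialized model file.
        model_v = save_version(store, "models", b"placeholder model bytes")
        # Illustrative values only, not real results.
        return record_run(store, data_v, model_v, {"lr": 0.01}, {"rmse": 0.42})
```

Calling demo() returns a dict with the run id plus the data and model version ids. Because everything is keyed by content, rerunning with the same bytes and parameters gives the same ids, which is roughly the property you want from any versioning setup before you worry about sharing it with a team.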

However, if you're more than two people, the increase in complexity is substantial, because the workflows are different.

We use our own machine learning platform, https://iko.ai, for our machine learning projects. We've drawn on our experience from the past seven years of doing ML for large organizations, and we're removing every part that sucked and offloading it to the platform.

It really depends on what kind of work you're doing. If you're experimenting on the weekend on toy projects with no stakes, then anything goes. If you're on a team and you need to access data, collaborate on notebooks, have fresh environments, track experiments, launch training jobs, and deploy and monitor models, the complexity is exponential.


