
Data Scientists Should Be Able to Deploy and Iterate Their Own Models - mikeyanderson
https://blog.algorithmia.com/data-scientists-should-be-able-to-deploy-and-iterate-their-own-models/
======
hadsed
I fully agree, but the tooling is super immature. I think there's going to be
incredible opportunity for engineers to build tools for doing ML in a very
efficient and scientifically rigorous way.

For instance, Jupyter is the closest thing we have to an IDE for science. It is
an incredibly innovative project, but it is not what we need. We need a
Photoshop, a Visual Studio, a Final Cut Pro for doing ML.

There are a lot of interesting projects out there solving some of these
problems. My favorites are Prodigy (by Explosion AI), Pachyderm, and
Paperspace, to name a few. But I think it'll be a decade until we get to a
serious place with it as an industry.

I myself have found that understanding models after training is incredibly
difficult. I'm talking about analyzing misclassifications, visualizing
embeddings, and looking at saliency maps. We just don't know enough about how
models work, and when we do, it's only after a great effort that most small
shops don't have the resources for. This was true when I was trying to get my last
company off the ground and is still true now that I'm running ML at my current
company. There is a pretty big opportunity here, especially given that most
cloud ML companies seem to focus on just training and deployment. I'm actually
thinking about trying to start this myself, given how much of a pain point it
is for me today.
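
To make the first of those concrete, here is a minimal sketch of analyzing
misclassifications, assuming all you have is a list of true labels and a list
of predicted labels. The function name `top_confusions` and the toy labels are
hypothetical, made up for illustration, and real tooling would need to go much
further (pulling up the actual examples, slicing by metadata, and so on).

```python
from collections import Counter


def top_confusions(y_true, y_pred, k=3):
    """Return the k most frequent (true, predicted) pairs among the errors."""
    # Count only the pairs where the prediction disagrees with the label.
    errors = Counter((t, p) for t, p in zip(y_true, y_pred) if t != p)
    return errors.most_common(k)


# Toy labels, invented for this example only.
y_true = ["cat", "cat", "dog", "dog", "bird", "cat"]
y_pred = ["cat", "dog", "dog", "cat", "cat", "dog"]

confusions = top_confusions(y_true, y_pred)
for (true_label, predicted_label), count in confusions:
    print(f"{true_label} -> {predicted_label}: {count}")
```

Sorting errors by frequency like this is only a starting point, but it shows
where to look first before moving on to embeddings or saliency maps.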

~~~
yboris
For model versioning / keeping track of ML experiments, I can recommend Comet
ML - https://www.comet.ml/ It takes just about 1 line of code to set up (and
of course you can customize it a lot after).

~~~
hadsed
Comet looks great, thanks for the suggestion.

