Are there ML models that are trained while being used? Humans learn as we go along, but this "train and deploy" pattern that's so common doesn't seem sustainable.
If it can't be done online, then it shouldn't be that difficult to save each week's outputs and finetune the production model on them each weekend, right? Especially if there are humans correcting the faulty or uncertain outputs.
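Roughly what I'm imagining, as a sketch (every name here is hypothetical, and the `fit` call stands in for whatever finetuning the framework offers):

```python
# Sketch of the weekly loop I'm imagining; all names are hypothetical.
import json
from pathlib import Path

LOG = Path("predictions_this_week.jsonl")

def log_prediction(example_id, features, prediction, confidence):
    # During the week: persist every model output for later review.
    with LOG.open("a") as f:
        f.write(json.dumps({"id": example_id, "features": features,
                            "prediction": prediction, "confidence": confidence}) + "\n")

def weekend_finetune(model, human_corrections):
    # On the weekend: build a finetune set, letting human labels
    # override faulty or low-confidence outputs.
    records = [json.loads(line) for line in LOG.open()]
    X = [r["features"] for r in records]
    y = [human_corrections.get(r["id"], r["prediction"]) for r in records]
    model.fit(X, y)   # stand-in for a few epochs of finetuning
    LOG.unlink()      # start a fresh log for next week
    return model
```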
That's a pretty standard part of MLOps. I have a fraud model in production; it's incrementally retrained each week on a sliding window of data from the last x months.
You can do it "online", which works for some models, but most need monitoring to make sure they don't go off the rails.
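For the "online" flavor, a minimal sketch using scikit-learn's `partial_fit` (the fraud framing and the 0.9 alert threshold are just illustrative, not my production setup):

```python
# Minimal online-learning sketch using scikit-learn's partial_fit.
import numpy as np
from sklearn.linear_model import SGDClassifier

model = SGDClassifier(loss="log_loss")
CLASSES = np.array([0, 1])  # 0 = legit, 1 = fraud

def on_new_batch(X, y):
    # Prequential ("test, then train") evaluation: score the incoming
    # batch before training on it, so monitoring reflects unseen data.
    if hasattr(model, "coef_"):  # skip scoring before the first fit
        acc = model.score(X, y)
        if acc < 0.9:  # arbitrary threshold for the sketch
            print(f"warning: accuracy dropped to {acc:.2f}, possible drift")
    model.partial_fit(X, y, classes=CLASSES)
```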
That's good to hear. How does it work in practice? Is it basically running the same training as from scratch, but with only the new data, on a separate machine, producing a new version that then replaces the old production version? Is part of MLOps starting a new training session each week, checking that the loss function looks OK, and then redeploying?
I still think of how humans work. We don't get retrained from time to time to improve; we learn continually as we gain experience. It should be doable in at least some cases, like classification, where it's easy to tell if a label is right or wrong.
* Take the previous model checkpoint and retrain/finetune it on a window that includes the new data. You typically don't want to retrain everything from scratch; starting from the checkpoint saves time and money. Large models need specialized GPUs to train, so the training typically happens on separate machines.
* Check the model statistics in depth. We look at way more statistics than just the loss function.
* Check actual examples of the model in action.
* Check the data quality. If the data is bad, then you're just amplifying human mistakes with a model.
* Push it to production and monitor the result.
MLOps practice differs from team to team; this checklist isn't universal, just one possible approach. Everyone does things a little differently. Condensed into code, the automatable part looks something like the sketch below.
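(The thresholds, paths, and the `load_window`/`deploy` hooks here are placeholders for whatever a team actually uses, not a universal recipe.)

```python
# Condensed weekly retrain-evaluate-gate job. Thresholds, paths, and the
# load_window()/deploy() hooks are placeholders, not a universal recipe.
import joblib
from sklearn.metrics import precision_score, recall_score, roc_auc_score

def weekly_retrain(load_window, deploy):
    # 1. Start from the previous checkpoint instead of training from scratch.
    model = joblib.load("checkpoints/latest.joblib")
    X_train, y_train, X_holdout, y_holdout = load_window(months=6)

    # 2. Finetune on the sliding window of recent data.
    model.fit(X_train, y_train)

    # 3. Look at more than the loss: per-metric checks catch failures
    #    that a single aggregate number hides.
    preds = model.predict(X_holdout)
    scores = model.predict_proba(X_holdout)[:, 1]  # assumes a probabilistic classifier
    metrics = {
        "precision": precision_score(y_holdout, preds),
        "recall": recall_score(y_holdout, preds),
        "auc": roc_auc_score(y_holdout, scores),
    }

    # 4./5. Gate the deployment: only push if the new model clears the bar.
    if metrics["recall"] > 0.80 and metrics["auc"] > 0.95:
        joblib.dump(model, "checkpoints/latest.joblib")
        deploy(model)
    else:
        print(f"holding back deployment, metrics: {metrics}")
```

The manual steps, eyeballing examples and checking data quality, sit between steps 3 and 4 and don't really automate away.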
> I still think of how humans work. We don't get retrained from time to time to improve; we learn continually as we gain experience. It should be doable in at least some cases, like classification, where it's easy to tell if a label is right or wrong.
For some models, like fraud, correctness is critical, and those models need a lot of babysitting. As for humans: think about how the average facebooker reacts to misinformation; you don't want that happening with your model.
Other models, like recommendation systems, are OK with more passive monitoring.
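Even the passive case can be cheap to automate, e.g. a week-over-week check on the score distribution (the KS test and the 0.05 cutoff here are just one conventional choice):

```python
# Passive monitoring sketch: flag drift in the model's score distribution.
from scipy.stats import ks_2samp

def check_drift(last_week_scores, this_week_scores):
    # Two-sample KS test between consecutive weeks of model outputs.
    stat, p_value = ks_2samp(last_week_scores, this_week_scores)
    if p_value < 0.05:  # conventional cutoff, not a tuned value
        print(f"possible drift (KS statistic {stat:.3f}), take a closer look")
```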
Continuous online training can be done; maybe take a look at reinforcement learning? It's not widely applied and has some limitations, but also some interesting applications. These kinds of things might become more common in the future.
When I learned about RL, we were taught to disable exploration when evaluating the model, since the exploration part is stochastic. I don't think that would work in production: either you keep exploring and occasionally serve users a random action, or you turn exploration off and the deployed model stops gathering new experience to learn from.
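To make that concrete, in ε-greedy terms it's just forcing ε to zero at serve time (toy sketch, not tied to any particular RL library):

```python
# Toy epsilon-greedy policy showing the train/eval exploration split.
import random

def select_action(q_values, epsilon):
    # Explore with probability epsilon, otherwise exploit the best action.
    if random.random() < epsilon:
        return random.randrange(len(q_values))                  # explore
    return max(range(len(q_values)), key=q_values.__getitem__)  # exploit

# Training: keep some exploration so the agent keeps discovering.
train_action = select_action(q_values=[0.1, 0.7, 0.2], epsilon=0.1)

# Evaluation/production: epsilon=0 makes the policy deterministic, which is
# the "disable exploration" rule from class; it also means the deployed
# policy generates no exploratory data to keep learning from.
prod_action = select_action(q_values=[0.1, 0.7, 0.2], epsilon=0.0)
```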