
Ask HN: Do you know what your ML systems are doing? - ThePhysicist
Dear ML developers, data scientists and ML DevOps: How do you track your ML systems in production and which metrics do you monitor? How do you make sure your models do what they're supposed to do when confronted with new data? Do you worry about security and robustness of your models? Can you debug problems in your ML pipeline as effectively as in your software pipeline?
======
jverre
Great question!

Monitoring models in production is actually quite tricky, especially when the
ground truth label is either not available or arrives with a long delay (for
example, if you are predicting the sales forecast for next quarter, you will
have a 3 month delay between your prediction and the ground truth label).

 _Monitoring:_

What I have found to work is to track data distributions instead. You can then
compare your training distribution to the distribution in real time using the
Wasserstein distance [1].
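
The comparison above can be sketched in a few lines, assuming SciPy is available; the threshold value and the synthetic data here are placeholders you would tune per feature:

```python
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)

# Stand-ins for a feature's training distribution and a drifted
# production sample (mean shifted by 0.5).
train_feature = rng.normal(loc=0.0, scale=1.0, size=10_000)
live_feature = rng.normal(loc=0.5, scale=1.0, size=1_000)

# Wasserstein distance between the two empirical distributions;
# larger values mean the production data has drifted further.
drift = wasserstein_distance(train_feature, live_feature)

# Alert when drift exceeds a per-feature threshold (0.3 is arbitrary here).
if drift > 0.3:
    print(f"drift alert: {drift:.2f}")
```

For a pure mean shift like this one, the distance roughly recovers the size of the shift, which makes it easy to reason about thresholds.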

I've also heard of people training auto-encoders on the training data and then
tracking the reconstruction error in production. If the data changes
substantially, the reconstruction error should increase.
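
A minimal sketch of that idea, using scikit-learn's `MLPRegressor` trained to reconstruct its own input as a stand-in autoencoder (a real deployment would likely use a proper deep-learning framework; all data here is synthetic):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)
X_train = rng.normal(size=(500, 8))  # stand-in for training features

# A small hidden layer acts as the bottleneck: the network must learn
# the structure of the training distribution to reconstruct it.
ae = MLPRegressor(hidden_layer_sizes=(3,), max_iter=2000, random_state=0)
ae.fit(X_train, X_train)  # target == input, i.e. an autoencoder

def reconstruction_error(X):
    """Mean squared reconstruction error per row."""
    return np.mean((ae.predict(X) - X) ** 2, axis=1)

baseline = reconstruction_error(X_train).mean()

# Drifted production data (mean shifted by 2) reconstructs poorly,
# so the error rises and can be alerted on.
X_drifted = rng.normal(loc=2.0, size=(100, 8))
drifted = reconstruction_error(X_drifted).mean()
```

Tracking the rolling mean of `reconstruction_error` in production, against the baseline measured on the training set, gives a single drift signal even when you have hundreds of features.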

 _Debugging:_

Debugging ML pipelines can be difficult, but what I have found to work is to
log all input and output features (with a correlation ID so that you can link
them to the other systems). A dashboard where you can enter a correlation ID
and, for that request, see the value of each feature overlaid on top of that
feature's distribution in the training set is very valuable.
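
The logging side of this can be as simple as emitting one structured record per prediction. A hedged sketch using only the standard library; `predict_and_log` and the dummy model are hypothetical names, not part of any particular framework:

```python
import json
import logging
import uuid

logger = logging.getLogger("ml_requests")
logging.basicConfig(level=logging.INFO)

def predict_and_log(model, features: dict) -> dict:
    """Run the model and log inputs + output under one correlation ID."""
    correlation_id = str(uuid.uuid4())
    prediction = model(features)  # model is any callable here
    # One JSON line per request; downstream systems can join on
    # correlation_id to trace a prediction end to end.
    logger.info(json.dumps({
        "correlation_id": correlation_id,
        "features": features,
        "prediction": prediction,
    }))
    return {"correlation_id": correlation_id, "prediction": prediction}

# Usage with a dummy model that doubles one feature.
result = predict_and_log(lambda f: f["x"] * 2, {"x": 3})
```

With the logged features in hand, the dashboard query is just "fetch the record for this correlation ID and plot each feature value against its training histogram".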

Hopefully this provides at least some answers! But it's a very broad topic, so
I could continue talking about this for hours!

When deploying ML models in the past, I've encountered exactly the issues you
are talking about and didn't find any monitoring solutions, so I created
Stakion [2] to do just this! It tracks ML models in real time and requires
just a few lines to integrate, no matter how you deploy your models.

[1] [https://towardsdatascience.com/life-of-a-model-after-deployment-bae52eb83b75](https://towardsdatascience.com/life-of-a-model-after-deployment-bae52eb83b75)

[2] [https://stakion.io](https://stakion.io)

