Another good resource from Google is [1], which focuses more on the operational impacts after you deploy a system that relies on ML (I'm a coauthor). [2], which was written by my boss, is also great.
Beautiful. I see so many software engineers who make design mistakes because they are overconfident in their understanding of ML, statistics, and economics. This is a welcome inclusion.
My guess (I haven't had a chance to fully digest the paper) would be that they are saying spam is designed to emulate high-quality signals, so discounting it in the quality ranking could mute actual organic quality signals.
Instead, the approach they suggest (from my quick read) is to not include any spam-related information in the quality ranking system, but rather to let the spam classifier/filter deal with it on its own terms.
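To make that concrete, here is a rough sketch of the separation I think they mean (made-up names, not their actual pipeline):

```python
# Hypothetical sketch: the spam filter runs as its own stage, and the
# quality ranker never sees spam signals.

def rank_results(candidates, quality_score, spam_probability, spam_threshold=0.9):
    """candidates: list of dicts with 'quality_features' and 'spam_features'.
    quality_score / spam_probability: two separately trained models (callables)."""
    # The spam filter deals with spam on its own terms, before ranking.
    not_spam = [c for c in candidates
                if spam_probability(c["spam_features"]) < spam_threshold]
    # Quality ranking uses only quality features, so spam that mimics
    # quality signals can't distort the ranker.
    return sorted(not_spam,
                  key=lambda c: quality_score(c["quality_features"]),
                  reverse=True)
```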
Great resource. I have a Chrome extension, http://compakt.nhatqbui.com, where I naively added ML features, but it was too inconsistent. I now understand the cautious approach the guide advises for adding ML to a product.
It's Google's A/B experiment framework that runs on production traffic and allows you to slice by various attributes and apply experimental models to the slice.
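Not the internal tool itself, but a toy sketch of the idea: hash a request attribute into buckets so a deterministic slice of production traffic gets the experimental model (all names here are made up):

```python
import hashlib

def bucket(user_id: str, num_buckets: int = 100) -> int:
    """Deterministically map a user to one of num_buckets experiment buckets."""
    return int(hashlib.md5(user_id.encode()).hexdigest(), 16) % num_buckets

def choose_model(request, baseline_model, experimental_model, experiment_pct=5):
    # Slice production traffic by an attribute (here: country) and divert a
    # small, deterministic fraction of that slice to the experimental model.
    if request["country"] == "US" and bucket(request["user_id"]) < experiment_pct:
        return experimental_model
    return baseline_model
```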
Yeah, that rule isn't very well described, but if you read between the lines:
> Play once had a table that was stale for 6 months, and refreshing the table alone gave a boost of 2% in install rate
Your best engineers will tell you that your fancy new ML system needs to have a silent, behind-the-scenes _heuristic_ system backing it up. The heuristic system in this example is reading from the same tables and running some heuristics to produce output that "makes sense."
If the ML system and the heuristic system are disagreeing in a lot of cases, you have a problem. (Or, for this example, the heuristic system just flags that the table hasn't been updated.)
These flagged items are the things you check _before_ exporting your model to production.
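Roughly, the pre-export check I have in mind looks like this (a sketch only; the names and thresholds are made up):

```python
import time

MAX_TABLE_AGE_DAYS = 7          # assumption: what counts as "stale"
MAX_DISAGREEMENT_RATE = 0.05    # assumption: tolerated ML-vs-heuristic disagreement

def ok_to_export(model_predict, heuristic_predict, sample, table_mtime):
    """Pre-export sanity check: compare the model against the heuristic
    backstop and flag stale input tables."""
    flags = []

    # The heuristic system reads the same tables, so staleness is easy to flag.
    age_days = (time.time() - table_mtime) / 86400
    if age_days > MAX_TABLE_AGE_DAYS:
        flags.append(f"input table is {age_days:.0f} days old")

    # Count cases where the model and the heuristic disagree on the same inputs.
    disagreements = sum(1 for x in sample if model_predict(x) != heuristic_predict(x))
    rate = disagreements / max(len(sample), 1)
    if rate > MAX_DISAGREEMENT_RATE:
        flags.append(f"model disagrees with heuristic on {rate:.1%} of the sample")

    return (not flags), flags
```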
So between developing the heuristic system and chasing down regressions in unit testing, it turns out ML isn't going to save you 90% of your engineering time.
> Your best engineers will tell you that your fancy new ML system needs to have a silent, behind-the-scenes _heuristic_ system backing it up.
This should have been in the guide somewhere. It makes a lot more sense than the "data not updated" example, which is a bit too narrow.
> So between developing the heuristic system and chasing down regressions in unit testing, it turns out ML isn't going to save you 90% of your engineering time.
There are two rules about whether to use ML. First, there have to be some underlying rules for the machine to learn. Second, you only want ML if figuring out those rules yourself is the harder option.
Of course you can bend the second rule a bit if you are an ML expert (and maybe even have the heuristic system lying around already).
> your fancy new ML system needs to have a silent, behind-the-scenes _heuristic_ system backing it up.
Note that the heuristic only needs to be "good enough".
Example: If Netflix's recommendation service is unavailable, the system just shows a list of popular movies instead of the personalized recommendations.
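In code, that kind of fallback can be as simple as a try/except around the recommender call (names here are made up):

```python
def get_homepage_rows(user_id, recommender, popular_titles, timeout_s=0.2):
    # "Good enough" fallback: if the personalized recommender is down or
    # slow, serve the globally popular list instead of failing the page.
    try:
        return recommender.recommend(user_id, timeout=timeout_s)
    except Exception:
        return popular_titles
```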
[1] https://sites.google.com/site/wildml2016nips/SculleyPaper1.p...
[2] https://research.google.com/pubs/pub43146.html