"To make great products:
Do machine learning like the great engineer you are, not like the great machine learning expert you aren’t."
Not sure if they are saying quality is an input to the spam classifier?
Instead, the approach they suggest (from my quick read) is not to include any information about spam in the quality ranking system, but rather to let the spam classifier/filter deal with it on its own terms.
This is almost an oxymoron :)
> Play once had a table that was stale for 6 months, and refreshing the table alone gave a boost of 2% in install rate.
If the ML system and the heuristic system disagree in a lot of cases, you have a problem. (Or, for this example, the heuristic system can simply flag that the table hasn't been updated.)
These flagged items are the things you check _before_ exporting your model to production.
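A minimal sketch of that pre-export check (the predict functions and the disagreement threshold here are hypothetical, just to show the shape of the gate):

```python
def flag_disagreements(examples, ml_predict, heuristic_predict, max_rate=0.05):
    """Compare the ML model against the heuristic baseline.

    Collects the examples where the two systems disagree and blocks
    the export if the disagreement rate exceeds a threshold.
    """
    disagreements = [x for x in examples if ml_predict(x) != heuristic_predict(x)]
    rate = len(disagreements) / len(examples)
    ok_to_export = rate <= max_rate
    return ok_to_export, disagreements

# Toy usage: heuristic calls anything containing "win" spam,
# the model flags anything longer than 20 characters.
msgs = ["win a prize now!!!", "hi", "meeting at 3pm today?", "win big"]
heuristic = lambda m: "win" in m
model = lambda m: len(m) > 20
ok, flagged = flag_disagreements(msgs, model, heuristic, max_rate=0.25)
# ok is False here, so you'd inspect `flagged` before shipping the model.
```

The point is that the heuristic earns its keep twice: once as the baseline and again as a regression check on every new model.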
So between developing the heuristic system and chasing down regressions in unit testing, it turns out ML isn't going to save you 90% of your engineering time.
This should have been in the guide somewhere. It makes a lot more sense than the "data not updated" example, which is a bit too narrow.
> So between developing the heuristic system and chasing down regressions in unit testing, it turns out ML isn't going to save you 90% of your engineering time.
There are two rules about whether to use ML. First, there have to be some underlying rules for the machine to learn. Second, you only want ML if learning the rules from data is easier than figuring them out yourself.
Of course you can bend the second rule a bit if you are an ML expert (and maybe even have the heuristic system lying around already).
Note that the heuristic only needs to be just "good enough".
Example: If Netflix's recommendation service is unavailable, the system just shows a list of popular movies instead of the personalized recommendations.
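That graceful-degradation pattern might look like this (the service and list names are hypothetical; this is just the shape of the fallback, not Netflix's actual implementation):

```python
def get_recommendations(user_id, personalized_service, popular_movies):
    """Try the personalized recommender; fall back to a static
    popularity list if the service is unavailable."""
    try:
        return personalized_service(user_id)
    except Exception:
        # The heuristic fallback only needs to be "good enough".
        return popular_movies

# Hypothetical static fallback list.
POPULAR = ["The Matrix", "Inception", "Spirited Away"]

def flaky_recommender(user_id):
    # Simulate the recommendation service being down.
    raise TimeoutError("recommendation service unavailable")

recs = get_recommendations(42, flaky_recommender, POPULAR)
# recs is the popular-movies list, since the service call failed.
```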