Robust learning from untrusted sources (acolyer.org)
49 points by feross 28 days ago

Some of the comparisons do not seem fair. Here the optimizer has all the data (though it was collected from untrusted sources), while in some of the works mentioned the optimization steps are performed by the data owners. Specifically, this means that trust-related regularizers aren't as useful.

More recently, the Snorkel framework was released, and it seems to address the dataset-melding problem too: https://www.snorkel.org/ https://github.com/snorkel-team/snorkel

Snorkel is mentioned in the article and is at least 1 year old.

