How about training a classifier to detect and remove potentially bad stuff, then letting the users who uploaded it argue that the classifier made a mistake? Human moderators would only look at those cases.

They already have classifiers, of course. It's not like the moderators go through everything posted to FB; you'd need millions of people to do that. What the mods actually do, presumably, is review content reported by users.

Then perhaps they need better classifiers? I don't remember seeing any papers from FB about that, so it does not seem like a priority for them. Or just make the existing bad classifiers more aggressive and shift the moderators' effort to reviewing the good content that gets wrongly removed.
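
To make the "more aggressive" idea concrete, here is a toy sketch of the trade-off. Everything in it is made up for illustration: the `Post` type, the scores, the thresholds, and the assumption that only owners of wrongly removed posts bother to appeal. It says nothing about how FB's real systems work.

```python
from dataclasses import dataclass


@dataclass
class Post:
    post_id: int
    score: float   # hypothetical classifier's estimate that the post is bad
    is_bad: bool   # ground truth, known here only because this is a toy example


def triage(posts, threshold):
    """Remove every post scoring at or above the threshold; the rest stay up."""
    removed = [p for p in posts if p.score >= threshold]
    kept = [p for p in posts if p.score < threshold]
    return removed, kept


def summarize(posts, threshold):
    """Return (appeals for moderators, bad posts left up) at a given threshold."""
    removed, kept = triage(posts, threshold)
    # Assumption: only uploaders of good posts that were removed file an appeal.
    appeals = [p for p in removed if not p.is_bad]
    missed = [p for p in kept if p.is_bad]
    return len(appeals), len(missed)


# Made-up data: three bad posts and three good ones with overlapping scores.
POSTS = [
    Post(1, 0.95, True),
    Post(2, 0.70, True),
    Post(3, 0.60, False),
    Post(4, 0.40, True),
    Post(5, 0.30, False),
    Post(6, 0.05, False),
]

if __name__ == "__main__":
    for threshold in (0.9, 0.5, 0.2):
        appeals, missed = summarize(POSTS, threshold)
        print(f"threshold={threshold}: appeals={appeals}, bad posts left up={missed}")
```

Lowering the threshold leaves fewer bad posts up but sends more appeals to the moderators, which is the shift in effort I mean.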

