Yes, I’ve been surprised at how this meme was everywhere in the comments even though the data doesn’t support it. I’d bet it comes from splashy headlines in news outlets. It’s important to correct it so that policy focuses on what’s most effective.
You can simply use specific training examples that teach the model whatever you please, e.g. a set of examples that steer the ranking/retrieval/filtering models. The models are already trained online, with weights likely updated every ~1 hour (or even less).
It’d be easy to start from a set of “moderators” who find examples, use those examples to query for related content, and feed the results back in as negative training samples. Just a guess.
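To make the guess concrete, here is a rough sketch of what I mean, not how any real platform does it. Everything in it is made up for illustration: the toy 2-d embeddings, the 0.9 similarity threshold, the `expand_negatives` and `online_update` helpers, and the use of plain logistic regression as a stand-in for the ranking model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def expand_negatives(flagged_ids, embeddings, threshold=0.9):
    """Return the moderator-flagged items plus anything close to them.

    The threshold is an arbitrary illustrative value.
    """
    negatives = set(flagged_ids)
    for item_id, vec in embeddings.items():
        if any(cosine(vec, embeddings[f]) >= threshold for f in flagged_ids):
            negatives.add(item_id)
    return negatives

def online_update(weights, vec, label, lr=0.5):
    """One SGD step of logistic regression on a single example."""
    score = sum(w * x for w, x in zip(weights, vec))
    pred = 1.0 / (1.0 + math.exp(-score))
    return [w - lr * (pred - label) * x for w, x in zip(weights, vec)]

# Hypothetical content embeddings (a real system would use learned ones).
embeddings = {
    "post_a": [1.0, 0.0],    # flagged by a moderator
    "post_b": [0.95, 0.05],  # near-duplicate of post_a
    "post_c": [0.0, 1.0],    # unrelated
}

# Step 1: moderators flag a seed; step 2: query for related content.
negatives = expand_negatives(["post_a"], embeddings)

# Step 3: feed the expanded set to the model as negative (label 0) samples.
weights = [0.0, 0.0]
for item_id in sorted(negatives):
    weights = online_update(weights, embeddings[item_id], label=0.0)
```

After those updates the toy model scores the flagged post and its near-duplicate lower than the unrelated one, which is the whole trick: a handful of human-picked seeds gets amplified into many training samples.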