This is less about 'self-supervised' learning, and more about ground truth.
I see a variant of this in medical AI. E.g. people generate 'fake' augmented brain MRI datasets using image-processing tricks in order to provide more training data. But the 'fake' MRI datasets are really just similar-looking images, which may or may not be anatomically correct.
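To be concrete, here is a minimal sketch of the kind of augmentation tricks I mean (NumPy/SciPy, with illustrative parameter values only, not any real pipeline):

```python
import numpy as np
from scipy.ndimage import rotate

rng = np.random.default_rng(0)

def augment_slice(img: np.ndarray) -> np.ndarray:
    """Return a geometrically and intensity-perturbed copy of a 2D MRI slice."""
    out = img.astype(np.float32)
    if rng.random() < 0.5:            # random left-right flip
        out = np.fliplr(out)
    angle = rng.uniform(-10, 10)      # small random rotation, in degrees
    out = rotate(out, angle, reshape=False, order=1, mode="nearest")
    # additive Gaussian noise scaled to the slice's own intensity spread
    out = out + rng.normal(0.0, 0.01 * out.std(), out.shape)
    return out

# e.g. double the dataset with one augmented copy per original slice:
# augmented = [augment_slice(s) for s in slices]
```

The outputs look plausible to the eye, but nothing in those transforms knows anything about anatomy, which is exactly my concern.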
If you start accepting a flood of fake images as legitimate data for training AI, then it will be impossible to trust the predictions, however good they are in the majority of cases.
I think you’re describing introduction of noise to training sets, which is a staple of training.
Your definitions of “fake” and “legitimate” are circular, and miss the central point of large ML models: they can extrapolate from imperfect data because of their massive scale.
Yes, the predictions will be imperfect. That’s true today, of both ML models and human radiologists. It’s about reducing the error rate, not designing a perfect algorithm that is never wrong. I’m pretty sure Gödel or someone can explain why the latter isn’t even possible, for machine or human.
That does not feel like a good example. The context here is largely generative models replacing human creations. We don’t need to generate brain MRIs. The use case you outline is a niche technique for training better models, not doing the thing we actually need done at such a scale that humans aren’t doing the original task anymore.