
The model makes inferences about the world from training data. When it sees more female nurses than male nurses in its training set, it infers that most nurses are female. This is a correct inference.

If they were to weight the training data so that male and female nurses appeared in equal numbers, the model might well generate male and female nurses with equal probability, but it would also learn an incorrect picture of the world.

That is quite distinct from weighting the data so that it corresponds more closely to reality. For example, if Africa is underrepresented in the training set, weighting training data from Africa more strongly is justifiable.
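To make the distinction concrete, here's a minimal sketch of both schemes. Every number here (the label counts and the "census" figure) is made up purely for illustration:

    from collections import Counter

    # Hypothetical training labels for "nurse" examples.
    labels = ["female"] * 900 + ["male"] * 100
    counts = Counter(labels)
    n = len(labels)

    # Scheme 1: force a 50/50 split. The model now sees a world in
    # which half of all nurses are male, contradicting its own data.
    equalized = {k: 0.5 * n / c for k, c in counts.items()}
    # -> female ~0.56, male 5.0

    # Scheme 2: reweight toward an external estimate of reality,
    # e.g. a made-up census figure of 88% female / 12% male.
    census = {"female": 0.88, "male": 0.12}
    corrective = {k: census[k] * n / c for k, c in counts.items()}
    # -> female ~0.98, male 1.2

The second scheme only adjusts the weights to the extent the sample deviates from an external estimate of the real distribution; the first imposes a distribution the data never had.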

The point is, it’s not a good thing for us to intentionally teach AIs a world that is idealized and false.

As these AIs work their way into our lives, it is essential that they reproduce the world in all of its grit and imperfections, lest we start to dissociate from reality.

Chinese media (or insert your favorite unfree regime) also presents China as a utopia.



> The model makes inferences about the world from training data. When it sees more female nurses than male nurses in its training set, it infers that most nurses are female. This is a correct inference.

No, it is not, because you don't know whether the model was shown each sample the same number of times, or whether some samples were weighted more heavily than others. There are perfectly ordinary reasons both of these happen.
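As a toy illustration of how upsampling distorts what the model effectively sees (all numbers made up):

    # Raw corpus: 70 female-nurse examples, 30 male-nurse examples.
    raw = {"female": 70, "male": 30}

    # Suppose the pipeline upsampled the male examples 3x (deduplication,
    # epoch scheduling, and loss weighting can all have this effect).
    weights = {"female": 1.0, "male": 3.0}

    effective = {k: raw[k] * weights[k] for k in raw}
    total = sum(effective.values())
    print({k: v / total for k, v in effective.items()})
    # -> {'female': 0.4375, 'male': 0.5625}

The raw data is majority female, yet the effective training distribution is majority male, so the model's output frequencies tell you little about the raw counts.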


> As these AIs work their way into our lives it is essential that they reproduce the world in all of its grit and imperfections...

Is it? I'm reminded of the Microsoft Tay experiment, where they attempted to train an AI by letting Twitter users interact with it.

The result was a non-viable mess that nobody liked.



