Hacker News

> If the AI is simply reflecting the data it was trained on and this data is a representative sample of all data, isn’t it unbiased by definition?

No, “data” is just information that has been gathered. “All data” can still be biased.

Also, data can itself carry bias, even if its collection isn’t biased. For instance, a text-generation model trained on an unbiased collection of all text ever written by humans would, in one sense, produce “unbiased, human-representative text”. It would also reproduce the biases of the authors, weighted by the volume of writing expressing each bias.
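That volume-weighting effect can be illustrated with a toy simulation. The group names and proportions below are invented for illustration: two viewpoints held by equally many authors, but one group writes far more text, so a model that samples “representatively” from all text reproduces the volume-weighted mix.

```python
import random
from collections import Counter

random.seed(0)

# Hypothetical corpus: authors holding viewpoint A and viewpoint B are
# equally numerous in the population, but A-authors wrote 9x more text.
# (Labels and the 90/10 split are made up for illustration.)
corpus = ["viewpoint_A"] * 90 + ["viewpoint_B"] * 10

# A model that faithfully mirrors its training data samples in
# proportion to text volume, not in proportion to people.
generated = [random.choice(corpus) for _ in range(10_000)]
counts = Counter(generated)

share_a = counts["viewpoint_A"] / len(generated)
print(f"share of viewpoint_A in generated text: {share_a:.2f}")
# The output skews heavily toward A, even though the two viewpoints
# are held by equal numbers of people in this toy population.
```

The point of the sketch is that “representative of all text” and “representative of all people” are different distributions; faithful training targets the former.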

> That’s just a convenient excuse for OpenAI (or others like them) to get away with what effectively is censorship of certain ideas or political views.

While one might object to the editorial choices, I can’t see any rational grounds for objecting, as a generality, to the idea that the creator of a model would censor “certain ideas or political views”.



> It would also reproduce the biases of the authors, weighted by the volume of writing expressing each bias.

Yes, but I think there are ways we could reduce this bias, perhaps even significantly.

> While one might object to the editorial choices, I can’t see any rational grounds for objecting, as a generality, to the idea that the creator of a model would censor “certain ideas or political views”.

You are right; my wording was unfair.

I think it would be fairer to say that OpenAI is inadvertently biasing ChatGPT’s answers as a side effect of its RLHF training. The answers and rankings used in that training come from people (the AI trainers [1]) who are not a representative sample of the population; they probably lean significantly toward one side of the political discourse (presumably OpenAI employees or Silicon Valley-based contractors?).

This likely biases ChatGPT toward certain kinds of answers to certain kinds of questions, answers it would probably not produce otherwise, and which the other side of the political discourse indeed perceives as quite biased.
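A toy sketch of the mechanism being claimed: RLHF-style pipelines fit a reward model to pairwise preference labels, so if the rater pool is skewed, the labels (and hence what the tuned model is pushed toward) reflect the raters rather than the wider population. The camp proportions and deterministic preferences below are invented assumptions, not a description of OpenAI’s actual process.

```python
import random

random.seed(1)

# Hypothetical rater pool: 80% from camp X, 20% from camp Y,
# while the wider population is imagined to be split 50/50.
RATER_POOL = ["camp_X"] * 80 + ["camp_Y"] * 20

def rank(rater: str) -> str:
    """Simplification: each camp always prefers 'its own' answer."""
    return "answer_X" if rater == "camp_X" else "answer_Y"

# Collect pairwise preference labels, as an RLHF pipeline would.
labels = [rank(random.choice(RATER_POOL)) for _ in range(5_000)]
pref_x = labels.count("answer_X") / len(labels)
print(f"fraction of labels preferring answer_X: {pref_x:.2f}")

# A reward model fit to these labels rewards answer_X about 80% of
# the time, so the tuned model drifts toward X-flavored answers,
# regardless of the population's 50/50 split.
```

The sketch just shows that preference data inherits the demographics of whoever produces it; how strongly that propagates through an actual reward model and policy update is a separate empirical question.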

[1] https://openai.com/blog/chatgpt/





