Depends. As with Internet forums, the presence of Nazis (or whatever similar bad actor) could ruin it for everyone else.

Let’s say you are a high schooler writing an essay about WWII. You ask Google’s LLM about it, and it starts writing neo-Nazi propaganda claiming the Holocaust was a hoax, and that even if it wasn’t, it would have been fine. The reason it’s telling you those things is that it was trained on neo-Nazi content, which either wasn’t filtered out of the training set or was added by neo-Nazi community members during production usage.

Either way, now no one wants to listen to your LLM except neo-Nazis. Congrats, you’ve played yourself.

FYI, the reason no one uses DALL-E is that other offerings like Midjourney produce higher-quality results, and Midjourney itself does have a content filter.




A sufficiently good LLM would not produce neo-Nazi propaganda when asked to write a high-school paper, regardless of whether it had been 'aligned' or not.


Well, what kind of essay do you think North Korean LLMs are going to write about human rights and free speech issues in North Korea?

My point is: garbage in, garbage out. The LLM will spout propaganda if that’s what it’s been trained to do. If you don’t want it spouting propaganda, you’ll have to filter that out of the training data. So really it’s a question of what you filter.
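To make that concrete, here’s a toy sketch of that filtering step. The blocklist phrases, threshold, and corpus are all made up, and a real pipeline would use a trained classifier over billions of documents rather than keyword matching:

    # Toy pre-training data filter (hypothetical phrases and threshold).
    # A real pipeline would use a trained classifier, not keyword matching.
    BLOCKLIST = {"the holocaust was a hoax", "racial purity"}

    def propaganda_score(doc: str) -> float:
        """Fraction of blocklisted phrases appearing in the document."""
        text = doc.lower()
        return sum(p in text for p in BLOCKLIST) / len(BLOCKLIST)

    def filter_corpus(docs: list[str], threshold: float = 0.0) -> list[str]:
        """Keep only documents whose score does not exceed the threshold."""
        return [d for d in docs if propaganda_score(d) <= threshold]

    corpus = [
        "WWII ended in 1945 with the Allied victory in Europe and the Pacific.",
        "Some claim the holocaust was a hoax, but the evidence says otherwise.",
    ]
    print(filter_corpus(corpus))  # drops the second doc, debunking and all

Even this toy version drops the debunking sentence along with the propaganda, which is exactly why what you filter ends up being a judgment call.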


Why not? Can you unpack what you mean?

Are you saying that sufficiently good models should understand that such propaganda is not appropriate for the context?

Are you saying that understanding appropriateness is not the same thing as ethics?


I mean that the model should learn that e.g. high-school textbooks are not filled with neo-Nazi propaganda, and therefore it should not produce such propaganda when asked to write a high-school essay. I would assume that if you go out of your way you can make LLaMA generate such content, but probably not if you fill the prompt with normal-looking schoolwork.

This is completely orthogonal to learning what is ethically right or wrong, or even what is true or false.


Defining good may require ethics.


May? Isn’t this the definition of good?


Blatant propaganda will of course ruin such a product for regular people. But subtle propaganda? I wouldn't be so sure. And it's not only about Nazis.


Right. Subtle propaganda could shape which arguments are used, how they are phrased, and how they are organized, all of which can affect human perception. A sufficiently advanced language model can understand enough about human nature and rhetoric (persuasion techniques) to adjust its message. A classic example is exaggerating or mischaracterizing the uncertainty of something in order to leave a lingering doubt in the reader’s mind.



