Depends. As with Internet forums, the presence of Nazis (or whatever similar bad actor) could ruin it for everyone else.

Let’s say you are a high schooler writing an essay about WWII. You ask Google’s LLM about it, and it starts writing neo-Nazi propaganda claiming the Holocaust was a hoax, and that even if it wasn’t, it would have been fine. The reason it’s telling you those things is that it was trained on neo-Nazi content, which either wasn’t filtered out of the training set or was added by neo-Nazi community members during production usage.

Either way, now no one wants to listen to your LLM except neo-Nazis. Congrats, you’ve played yourself.

FYI, the reason no one uses DALL-E is that other offerings like Midjourney produce higher-quality results, and Midjourney itself does have a content filter.




A sufficiently good LLM would not produce neo-Nazi propaganda when asked to write a high-school paper, regardless of whether it had been 'aligned' or not.


Well, what kind of essay do you think North Korean LLMs are going to write about human rights and free speech issues in North Korea?

My point is: garbage in, garbage out. The LLM will spout propaganda if that’s what it’s been trained to do. If you don’t want it spouting propaganda, you’ll have to filter that out of the training data. So really it’s a question of what you filter.
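To make that concrete, here’s a toy sketch of that filtering step. The blocklist phrases, threshold, and corpus are all made up, and a real pipeline would use a trained classifier over billions of documents rather than keyword matching:

    # Toy pre-training data filter (hypothetical phrases and threshold).
    # A real pipeline would use a trained classifier, not keyword matching.
    BLOCKLIST = {"the holocaust was a hoax", "racial purity"}

    def propaganda_score(doc: str) -> float:
        """Fraction of blocklisted phrases appearing in the document."""
        text = doc.lower()
        return sum(p in text for p in BLOCKLIST) / len(BLOCKLIST)

    def filter_corpus(docs: list[str], threshold: float = 0.0) -> list[str]:
        """Keep only documents whose score does not exceed the threshold."""
        return [d for d in docs if propaganda_score(d) <= threshold]

    corpus = [
        "WWII ended in 1945 with the Allied victory in Europe and the Pacific.",
        "Some claim the holocaust was a hoax, but the evidence says otherwise.",
    ]
    print(filter_corpus(corpus))  # drops the second doc, debunking and all

Even this toy version drops the debunking sentence along with the propaganda, which is exactly why what you filter ends up being a judgment call.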


Why not? Can you unpack what you mean?

Are you saying that sufficiently good models should understand that such propaganda is not appropriate for the context?

Are you saying that understanding appropriateness is not the same thing as ethics?


I mean that the model should learn that e.g. high-school textbooks are not filled with neo-Nazi propaganda, and therefore it should not produce such propaganda when asked to write a high-school essay. I would assume that if you go out of your way you can make LLaMA generate such content, but probably not if you fill the prompt with normal-looking schoolwork.

This is completely orthogonal to learning what is ethically right or wrong, or even what is true or false.


Defining good may require ethics.


May? Isn’t this the definition of good?


Blatant propaganda will of course ruin such a product for regular people. But subtle propaganda? I wouldn't be so sure. And it's not only about Nazis.


Right. Subtle propaganda could shape which arguments are used, how they are phrased, and how they are organized, all of which can affect human perception. A sufficiently advanced language model can understand enough about human nature and rhetoric (persuasion techniques) to adjust its message. A classic example is exaggerating or mischaracterizing the uncertainty of something in order to leave a lingering doubt in the reader’s mind.



