Depends. As with Internet forums, the presence of Nazis (or whatever similar bad actor) could ruin it for everyone else.
Let's say you are a high schooler writing an essay about WWII. You ask Google's LLM about it, and it starts writing Neo Nazi propaganda about how the Holocaust was a hoax, and even if it wasn't, it would have been fine. The reason it's telling you those things is that it was trained on Neo Nazi content, which either wasn't filtered out of the training set or was added by Neo Nazi community members during production usage.
Either way, now no one wants to listen to your LLM except Neo Nazis. Congrats, you’ve played yourself.
FYI, the reason no one uses DALL-E is that the results are of higher quality in other offerings like Midjourney, which itself does have a content filter.
Well, what kind of essay do you think North Korean LLMs are going to write about human rights and free speech issues in North Korea?
My point is: garbage in, garbage out. The LLM will spout propaganda if that's what it's been trained on. If you don't want it spouting propaganda, you'll have to filter that out. So really it's a question of what you filter.
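To make "what do you filter" concrete, here's a minimal sketch of training-data filtering as a keyword blocklist. This is a toy: the blocklist terms are placeholders, and real pipelines typically use trained classifiers rather than keyword matching, but the shape of the decision is the same.

```python
# Toy training-data filter: drop documents that match a blocklist before
# training. BLOCKLIST contents are placeholders, not a real denylist;
# production systems typically use trained classifiers instead of keywords.
BLOCKLIST = {"example denialist phrase", "example extremist slogan"}

def keep_document(text: str) -> bool:
    """Return True if the document should stay in the training set."""
    lowered = text.lower()
    return not any(term in lowered for term in BLOCKLIST)

corpus = [
    "A normal history textbook passage about WWII.",
    "A screed containing an example extremist slogan.",
]
training_set = [doc for doc in corpus if keep_document(doc)]
print(training_set)  # only the first document survives the filter
```

Either way, whoever writes the blocklist (or trains the classifier) is the one deciding what the model never sees.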
I mean that the model should learn that e.g. high school textbooks are not filled with Neo Nazi propaganda, and therefore it should not produce such content when asked to write a high school essay. I would assume that if you go out of your way you can make LLaMA generate such content, but probably not if you fill the prompt with normal-looking schoolwork (a rough way to test this is sketched below).
This is completely orthogonal to learning what is ethically right or wrong, or even what is true or false.
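If you wanted to probe that claim empirically, one rough approach is to compare completions for the same topic with and without schoolwork framing. A sketch using the Hugging Face transformers API; the model name and prompts here are assumptions, and any local causal LM you have weights for would do:

```python
# Compare how prompt framing steers generation. The model name is a
# placeholder (Llama 2 weights are gated); substitute any local causal LM.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # assumption: you have weight access
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompts = [
    # Schoolwork framing: the register should pull the model toward
    # textbook-style prose.
    "10th grade history essay.\nTopic: the causes of WWII.\nEssay:\n",
    # Bare continuation with no framing to condition on.
    "The truth about WWII is",
]

for prompt in prompts:
    inputs = tokenizer(prompt, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=120, do_sample=True, top_p=0.9)
    print(tokenizer.decode(out[0], skip_special_tokens=True))
    print("---")
```

This doesn't prove anything on its own, but sampling enough completions per framing would show whether the schoolwork register actually suppresses fringe content the way the base distribution suggests it should.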
Right. Subtle propaganda could shape what arguments are used, how they are phrased, and how they are organized, all of which can affect human perception. A sufficiently advanced language model can understand enough human nature and rhetoric (persuasion techniques) to adjust its message. A classic example of this is exaggerating or mischaracterizing the uncertainty of something in order to leave a lingering doubt in the reader's mind.