>Train your AI's on this first, and make sure it never hallucinates over this issue
The creators have the same noble goal as you do here, but LLMs can't work that way. You can't begin with a small text document; you end with it. You're welcome to feed the UDHR into the system prompt of the OpenAI API - it won't fix the bias problems. It has to go in at the end, because it's only after the model trains on massive datasets that it produces coherent English sentences at all.

There's a saying, from Carl Sagan: "If you wish to make an apple pie from scratch, you must first invent the universe." Likewise, if you want to add the Universal Declaration of Human Rights as a filter layer to an LLM, you must first have the billions of weights it needs to even read those sentences in English and approximate the censorship you desire. And even after you add that final censorship layer, the model is still a next-token approximator, and reality will never fit in a set of weights in RAM, so it will always hallucinate.

If you want to skip deep learning (LLMs, etc.) entirely and construct a different kind of AI - perhaps a symbolic knowledge system from the 60s - then yes, you will need to manually feed it base truths like that, but those systems can't approximate anything beyond their base truths, so they aren't as useful.

Trust me, lots of smart people are trying to solve this bias problem in earnest. It's not a matter of bad creators baking their personal lack of morals into the models. It's that cleaning the datasets of every troll comment ever posted online is an exhausting task, and discovering the biases later and filtering them out is an exhausting task. The LLM will even hallucinate (approximate) new biases you've never seen before. It's whack-a-mole.
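To make the "system prompt" point concrete, here's a minimal sketch of what prepending the UDHR to a request looks like with the OpenAI Python client. The model name and the udhr.txt path are placeholder assumptions, not recommendations; the point is that this text only steers an already-trained model at inference time - it never touches the weights learned from the training data, which is where the biases live.

```python
# Sketch: feeding the UDHR into the system prompt via the OpenAI Python client.
# Placeholders: "udhr.txt" (local copy of the declaration) and the model name.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Load the full text of the Universal Declaration of Human Rights.
with open("udhr.txt", "r", encoding="utf-8") as f:
    udhr_text = f.read()

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[
        # This steers the already-trained weights; it does not retrain them.
        {"role": "system",
         "content": "Answer in accordance with the following document:\n" + udhr_text},
        {"role": "user", "content": "Is everyone entitled to equal pay for equal work?"},
    ],
)
print(response.choices[0].message.content)
```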
The point was not for me to expose my ignorance of how AI is trained - but that is, after all, where we are at.
The point really was that the creators of AI have to test against the UDHR.
Perhaps, actually, this is more of a legislative issue - which is why I would say it's even more important for the technologically elite to get this right before it hits that wooden wall...