Pardon my ignorance -- assuming that range and codomain are approximately equivalent in this context, how do you specify a prompt with a large codomain? Is there a canonical example of a prompt with a large codomain?
It seems to me that, in natural language, the size of the codomain is related to the specificity of the prompt. For instance, if the prompt is "We are going to ..." then the codomain is enormous. But if the prompt is "2 times 2 is..." the codomain is, mathematically, {4, four}, some series of 4 symbols, e.g. IIII, or some other representation of the concept of "4" (i.e. different base or language representations: 0x04, 0b100, quatro, etc).
But if this is the case, a broad codomain is approximately synonymous with "no correct answer" or "result is widely interpretable", which implies that the larger the codomain, the easier it is to claim an answer is "correct" in the context of the prompt.
How do you reconcile loose interpretability with statistical rigor?
I ask the question, what is 2 * 2, which is an obviously loaded question that's pattern matched to death.
The LLM can answer "4" or "The answer is 4" or "looks like the answer is 4".
All valid answers but all the same. We count all 3 of those answers as just 4 out of the set of numbers. But we have to use our own language faculties to cut through the noise of the language itself.
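To make "count all 3 of those answers as just 4" concrete, here is a rough Python sketch of collapsing differently worded replies to one canonical value. The function name `canonical_answer` and the tiny `WORDS` table are made up for illustration, and taking the last number in the reply as "the answer" is an assumption that only holds for short replies like these.

```python
import re

# Hypothetical lookup for a few spelled-out forms; far from exhaustive.
WORDS = {"four": 4, "quatro": 4}

def canonical_answer(text):
    """Map a free-form reply to the integer it states, or None if none is found."""
    lowered = text.lower()
    # Look for hex, binary, or decimal literals. Assumption: the last
    # literal in the reply is the answer.
    literals = re.findall(r"0x[0-9a-f]+|0b[01]+|\d+", lowered)
    if literals:
        token = literals[-1]
        if token.startswith("0x"):
            return int(token, 16)
        if token.startswith("0b"):
            return int(token, 2)
        return int(token, 10)
    # Fall back to spelled-out numbers from the lookup table.
    for word in re.findall(r"[a-z]+", lowered):
        if word in WORDS:
            return WORDS[word]
    return None

replies = ["4", "The answer is 4", "looks like the answer is 4"]
distinct = {canonical_answer(r) for r in replies}
print(distinct)
```

Three surface strings collapse to a single element of the codomain. This only works because the codomain is small and well defined; for an open-ended prompt there is no obvious `canonical_answer` to write.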
> I ask the question, what is 2 * 2, which is an obviously loaded question that's pattern matched to death.
Yeah, that was my point. Small codomain -> easy to validate. Large codomain -> open to interpretation. You implied that to prove reasoning, pick a prompt with a large codomain, and if the LLM answers accurately and precisely, then voilà, reasoning.
So my question was, can you give an example of a prompt with a large codomain that isn't subject to wide interpretation? It seems the wider the codomain, the easier it is to say, "look! reasoning!"