It's answering with "cherries", though "cherries" were never mentioned anywhere in the question since the task was to choose between "apples" and "pears" this time,
and not "cherries" and "pears" like the example found on the internet.
I agree with the person you're responding to. Cheating, to me, would imply that there's some sort of hard-coded guidance given to the LLM. This just seems like a typical LLM hallucination?
I don't understand the leap to "cheating" either. LLMs aren't abstract logic models; they don't promise to reason from first principles at all. They give you an answer based on training data. That's what you want them to do. The reasoning features now bolted around the inference engine are something companies are rushing to provide (with... somewhat mixed success).
This is not hard to understand. The claim is that LLMs can solve never-before-seen logic puzzles. This specific case proves that it HAD encountered this puzzle before, which means it was not doing anything emergent, just basic remembering. Worse, it's not even reading the prompt correctly.
There are no logical rules built in at all. But the Transformer architecture is specifically trained to learn the combinatoric play and rules of engagement from the data, so it can extrapolate and do cool, new things that are not in the training data. In a way, you give them a chess board and the rules of the game, and then they can play; you don't teach them every possible board state. What's interesting is that with a significant number of parameters it seems to encode a more and more abstract and human-like understanding of the 'elements' at play and the 'rules of engagement' on top of them.
Edit: Not a native speaker. I'm not sure 'rules of engagement' is the correct English term here.
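To make the "no rules built in" point concrete, here's a minimal sketch (a plain bigram model on made-up toy data, not an actual transformer; a transformer swaps the lookup row for attention over the whole context, but the objective is the same): the only hard-coded part is "predict the next token", and any apparent rule lives purely in the learned weights.

```python
import numpy as np

# Hypothetical toy data echoing the thread; any real corpus would do.
corpus = "apples pears apples pears cherries pears".split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}
V = len(vocab)

# One logit row per current token; a transformer replaces this lookup with
# attention over the whole context, but the training objective is identical.
W = np.zeros((V, V))

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

# Plain SGD on the next-token cross-entropy loss -- the only thing "built in".
lr = 0.5
for _ in range(200):
    for cur, nxt in zip(corpus, corpus[1:]):
        p = softmax(W[idx[cur]])
        grad = p.copy()
        grad[idx[nxt]] -= 1.0        # gradient of -log p[next] w.r.t. the logits
        W[idx[cur]] -= lr * grad

# The "rule" that pears follow apples in this toy data now exists only as
# learned weights, not as an explicit rule anywhere in the code.
print({w: round(float(p), 2) for w, p in zip(vocab, softmax(W[idx["apples"]]))})
```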
> What makes them different to a very advanced Markov chain?
Really, nothing. There's some feedback structure in the layers of the model; it's not just one big probability table. But the technique is fundamentally the same: it's Markov, just with the whole conversation as the input state and with billions of parameters.
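For what it's worth, here's a toy sketch of that framing (my own illustration; `fake_model` and the tiny corpus are made up). Both samplers run the same "pick the next token given the state" loop; the Markov chain keys an explicit count table on the last couple of tokens, while the LLM-style step conditions a learned function on the entire history.

```python
import random
from collections import Counter, defaultdict

# Classic Markov chain: state = the last k tokens, distribution = a count table.
def build_table(tokens, k=2):
    table = defaultdict(Counter)
    for i in range(len(tokens) - k):
        table[tuple(tokens[i:i + k])][tokens[i + k]] += 1
    return table

def markov_step(table, state, k=2):
    counts = table.get(tuple(state[-k:]))
    if not counts:
        return None
    words, weights = zip(*counts.items())
    return random.choices(words, weights=weights)[0]

# "LLM view": same loop, but the state is the *entire* conversation so far and
# the distribution comes from a learned function of billions of parameters.
# `fake_model` is a stand-in for that function -- here it just returns a
# uniform distribution so the sketch stays runnable.
def fake_model(full_history, vocab):
    return {tok: 1.0 / len(vocab) for tok in vocab}

def llm_step(full_history, vocab):
    probs = fake_model(full_history, vocab)   # conditioned on everything so far
    words, weights = zip(*probs.items())
    return random.choices(words, weights=weights)[0]

if __name__ == "__main__":
    corpus = "the cat sat on the mat and the cat ran".split()
    table = build_table(corpus, k=2)
    state = ["the", "cat"]
    for _ in range(5):
        nxt = markov_step(table, state)
        if nxt is None:
            break
        state.append(nxt)
    print("markov chain:", " ".join(state))

    history = ["choose", "between", "apples", "and", "pears"]
    vocab = sorted(set(corpus) | set(history))
    print("llm-style step:", llm_step(history, vocab))
```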