
I'm confused about how this is "cheating". Isn't it just getting the answer wrong?



It's answering with "cherries", even though "cherries" was never mentioned anywhere in the question; the task this time was to choose between "apples" and "pears", not "cherries" and "pears" like the example found on the internet.


I agree with the person you're responding to. Cheating, to me, would imply that there's some sort of hard-coded guidance of the LLM. This just seems like typical LLM hallucination?


It's cheating because it has memorized the answer to the puzzle instead of using logic to solve it.


Your concept of cheating is simply how LLMs work.


It is not. LLMs do not just memorize; they also extrapolate, otherwise they would be useless. Just like any ML model.


I thought that is essentially what LLMs do? They learn what words/topics are associated with each other and then stream a response.

In some ways, this is proof that Gemini isn't cheating... It is just doing typical LLM hallucination


Well, sometimes. Sometimes not. https://arxiv.org/abs/2310.17567


LLMs can also do some exploration based on combinatoric play of learned elements.


I don't understand the leap to "cheating" either. LLMs aren't abstract logic models; they don't promise to reason from first principles at all. They give you an answer based on training data. That's what you want them to do. That they have some reasoning features bolted around the inference engine is a feature companies are rushing to provide (with... somewhat mixed success).


This is not hard to understand. LLMs are supposed to be able to solve never-before-seen logic puzzles. This specific one proves that the model HAD encountered this puzzle before, so it was not doing anything emergent, just basic remembering. Worse, it's not even reading the prompt correctly.


Thank you for answering a question I had half-formed in my head.

Do LLMs have logical rules built in? What makes them different to a very advanced Markov chain?

Are there any models out there that start from logical principles and train on top of that?

(Apologies for poor understanding of the field)


There are no logical rules built in at all. But the Transformer architecture is specifically trained to learn combinatoric play and rules of engagement from the data, so it can extrapolate and do cool, new things that are not in the training data. In a way, you give it a chess board and the rules of the game, and then it can play; you don't teach it every possible board state. What's interesting is that with a significant number of parameters it seems to encode a more and more abstract, human-like understanding of the 'elements' at play and the 'rules of engagement' on top of them.

Edit: Not a native speaker. I'm not sure 'rules of engagement' is the correct English term here.


I understood you just fine, your English is great!

Thank you for the explanation. It seems like the LLM "plays" to learn? That's very cool, thank you again.


>Do LLMs have logical rules built in?

Handcrafted by humans? No.

But it's still possible to learn such rules from the data in an effort to complete the primary objective (predicting the next token)
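
Roughly, the entire training signal is a next-token cross-entropy loss; anything that looks like a learned rule falls out of minimizing it over a lot of text. A minimal sketch, assuming PyTorch (the tiny embedding + linear head here is just a stand-in for a real Transformer stack):

    import torch
    import torch.nn.functional as F

    vocab_size, dim = 1000, 64
    embed = torch.nn.Embedding(vocab_size, dim)
    head = torch.nn.Linear(dim, vocab_size)          # stand-in for a full Transformer

    tokens = torch.randint(0, vocab_size, (1, 16))   # a toy "document" of token ids
    inputs, targets = tokens[:, :-1], tokens[:, 1:]  # target is the input shifted by one

    logits = head(embed(inputs))                     # (1, 15, vocab_size)
    loss = F.cross_entropy(logits.reshape(-1, vocab_size), targets.reshape(-1))
    loss.backward()                                  # every learned "rule" comes from this one objective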


> What makes them different to a very advanced Markov chain?

Really, nothing. There's some feedback structure in the layers of the model; it's not just one big probability table. But the technique is fundamentally the same: it's Markov, just with the whole conversation as input and billions of parameters.
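
To make the comparison concrete, here's a minimal sketch in plain Python (names are illustrative): a classic Markov chain conditions on a short fixed window via a lookup table of observed continuations, while an LLM plays the same next-token game with the whole conversation as the state and a learned network in place of the table.

    import random
    from collections import defaultdict

    def build_table(words, order=1):
        # transition table: last `order` words -> observed next words
        table = defaultdict(list)
        for i in range(len(words) - order):
            table[tuple(words[i:i + order])].append(words[i + order])
        return table

    def sample(table, state, n=10):
        out = list(state)
        for _ in range(n):
            candidates = table.get(tuple(out[-len(state):]))
            if not candidates:
                break
            out.append(random.choice(candidates))
        return " ".join(out)

    corpus = "the cat sat on the mat the cat ate the fish".split()
    print(sample(build_table(corpus, order=1), ("the",)))
    # An LLM does the analogous thing, except the "state" is every token of
    # the prompt so far and the next-token distribution comes from a trained
    # network instead of a table of memorized continuations.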



