LLMs are mostly trained on Internet posts, so they are Internet simulators.
If an Internet post explains its reasoning step by step, it reaches a correct conclusion more often than other posts do. Therefore LLMs that explain their reasoning are also more likely to reach the correct solution.
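A minimal sketch of what that looks like in practice. The question and the prompt wording here are purely illustrative (not from any particular benchmark or model client); the point is only that the second prompt asks for the intermediate tokens before the answer:

    # Same question, asked directly vs. with an explicit "show your work"
    # instruction (chain-of-thought prompting). No real model client is
    # assumed; the prompts are just illustrative strings.

    question = "A train leaves at 14:10 and arrives at 16:45. How long is the trip?"

    direct_prompt = f"{question}\nAnswer:"

    cot_prompt = (
        f"{question}\n"
        "Work through this step by step, showing each intermediate calculation, "
        "then give the final answer on its own line."
    )

    # With the second prompt, the model emits intermediate tokens such as
    # "16:45 - 14:10 = 2 h 35 min" before the answer; those tokens condition
    # the final prediction, which is the mechanism described above.
    print(direct_prompt)
    print(cot_prompt)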
Find me good Internet posts that use verification and general reasoning. They are rare. The Internet posts I read suck at verification and general reasoning.
Therefore LLMs will suck at verification and general reasoning until we refine or augment our datasets.
Correct. Internet posts contain explanations, if anything, rather than reasoning, and an explanation is not the right model for an LLM: it starts with the conclusion and then describes, in hindsight, the steps that were already taken, invisibly, to reach it. It's a retrospective trace of our own chain of thought, not the chain of thought itself. We can only write explanations because our actual reasoning is, invisibly, present in our mental context window. But those invisible tokens are exactly the ones that would actually help an LLM.
(That's also why we can explain answers backwards as easily as forwards.)
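To make the distinction concrete, here is the same toy question from the earlier sketch written both ways. The wording is made up; what matters is the ordering of the tokens, since only the second form puts the intermediate steps before the conclusion:

    # An "explanation" in the Internet-post sense: conclusion first,
    # justified in hindsight.
    explanation = (
        "The trip takes 2 hours 35 minutes. That's because the arrival time, "
        "16:45, is 2:35 after the 14:10 departure."
    )

    # The reasoning trace that actually produced it: steps first,
    # conclusion last.
    reasoning_trace = (
        "Departure 14:10, arrival 16:45. "
        "From 14:10 to 16:10 is 2 hours; from 16:10 to 16:45 is 35 minutes. "
        "So the trip takes 2 hours 35 minutes."
    )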
I always objected to the grumpy old people who complained "this has been covered and answered many times, so let's close this thread."
Having said that, this has been covered and answered many times. Can we please create a stock responder for any HN question that can be answered with "LLMs are Internet simulators, and they do that because that is how Internet posts are"?