
They're just outputting tokens that resemble a reasoning process. The underlying tech is still the same LLM it always has been.

I can't deny that doing it that way improves results, but any model could do the same thing if you add extra prompts to encourage a reasoning process, then use that output as context for the final solution. People discovered that trick before "reasoning" models became the hot thing. It's the "work it out step by step" trick, just baked into a dedicated fine-tune.
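
To make the trick concrete, here's a minimal sketch of that two-stage prompting. It uses the OpenAI Python SDK purely as an example backend; the model name, prompt wording, and the answer_with_reasoning helper are made up for illustration, not anyone's actual implementation:

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def complete(prompt: str, model: str = "gpt-4o-mini") -> str:
        """Send a single-turn prompt to the model and return the text reply."""
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

    def answer_with_reasoning(question: str) -> str:
        # Stage 1: ask for step-by-step scratch work ("work it out step by step").
        reasoning = complete(
            "Work through the following problem step by step, "
            "checking each step for mistakes:\n\n" + question
        )
        # Stage 2: feed the scratch work back in as context for the final answer.
        return complete(
            "Problem:\n" + question
            + "\n\nScratch work:\n" + reasoning
            + "\n\nUsing the scratch work above, give only the final answer."
        )

Any base model can be swapped in behind complete(); the point is only that the reasoning tokens are generated first and then conditioned on.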


> They're just outputting tokens that resemble a reasoning process.

Looking at one such process of emulated reasoning (I run deepseek-70B locally), I'm starting to wonder how it differs from actual reasoning. We "think" about something, may make errors in that thinking, look for things that don't make sense, and correct ourselves. That "think" step is still a black box.

I asked that LLM a typical question about gas exchange between two containers; it made some errors, but it also noticed calculations that didn't make sense:

> Moles left A: ~0.0021 mol

> Moles entered B: ~0.008 mol

> But 0.0021 +0.008=0.0101 mol, which doesn't make sense because that would imply a net increase of moles in the system.

Well, that's a completely invalid calculation; it should be a "-" in there, since the moles leaving A and the moles entering B have to cancel rather than add up. It also noticed elsewhere that those two quantities should be the same.
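
For what it's worth, the sanity check the model was fumbling toward is just conservation of moles in a closed system: whatever leaves A enters B, so the two figures above should match rather than sum. A tiny sketch using the model's own numbers (which is exactly why it doesn't balance):

    # Closed two-container system: moles leaving A must equal moles entering B.
    moles_left_a = 0.0021    # the model's figure for container A
    moles_entered_b = 0.008  # the model's figure for container B

    imbalance = moles_entered_b - moles_left_a
    print(f"imbalance: {imbalance:.4f} mol")  # ~0.0059 mol, so at least one figure is wrong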

Eventually, after 102 minutes and 10,141 tokens, having checked the answer from different angles multiple times, it output an approximately correct response.
