amelius · 4 days ago | on: Reasoning models reason well, until they don't
The problem is that the training data doesn't contain a lot of "I don't know".
pegasus · 4 days ago
The bigger problem is that the benchmarks and multiple-choice tests they're trained to optimize for don't distinguish between a wrong answer and "I don't know", which is both stupid and surprising. There was a thread here on HN about this recently.
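A minimal sketch of the incentive described above, with hypothetical numbers (not drawn from any specific benchmark): suppose a model genuinely knows 60% of the questions, and a blind guess on the rest is right 25% of the time. Under accuracy-only scoring, guessing strictly dominates abstaining, so "I don't know" is trained away; a penalty for wrong answers flips the incentive.

    P_SURE = 0.6          # fraction of questions the model actually knows (assumed)
    P_GUESS_RIGHT = 0.25  # chance a blind guess is right, e.g. 4-way multiple choice

    def score(correct, wrong, idk, penalty_wrong=0.0):
        """Expected score per question: +1 correct, -penalty wrong, 0 for IDK."""
        return correct * 1.0 - wrong * penalty_wrong + idk * 0.0

    # Strategy A: always answer, guessing when unsure.
    guesser = dict(correct=P_SURE + (1 - P_SURE) * P_GUESS_RIGHT,
                   wrong=(1 - P_SURE) * (1 - P_GUESS_RIGHT),
                   idk=0.0)
    # Strategy B: say "I don't know" when unsure.
    abstainer = dict(correct=P_SURE, wrong=0.0, idk=1 - P_SURE)

    # Accuracy-only grading (penalty 0): guessing wins, 0.7 vs 0.6,
    # so that is what optimizing against the benchmark converges to.
    print(score(**guesser), score(**abstainer))
    # Grading that penalizes confident errors (-1 per wrong answer):
    # abstaining now wins, ~0.4 vs 0.6, and "I don't know" pays off.
    print(score(**guesser, penalty_wrong=1.0),
          score(**abstainer, penalty_wrong=1.0))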
astrange · 4 days ago
That's not important compared to the post-training RL, which isn't "training data".
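The same incentive can survive into that stage, though. A hypothetical sketch of a verifiable-reward rule of the kind used in RL post-training (the exact rule and values are assumptions, not any particular lab's setup): if a wrong answer and an abstention both earn reward 0, the policy gradient never has a reason to prefer "I don't know".

    def reward(answer: str, gold: str) -> float:
        # Hypothetical binary-correctness reward (assumed, for illustration).
        if answer.strip().lower() == "i don't know":
            return 0.0  # abstaining earns nothing...
        return 1.0 if answer == gold else 0.0  # ...and so does being wrong

    # Wrong answers and abstentions both score 0, so any guess with a
    # nonzero chance of matching gold has strictly higher expected reward
    # than abstaining, and the policy is pushed toward always answering.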