DeepSeek Laconic Decoding (twitter.com/alexgdimakis)
1 point by svenfaw 21 days ago | 2 comments



Can we weight the sampling based on a prediction of how long an answer will be?

For instance, with a second model that looks at a candidate token and says, "Oh boy, that word sounds like the beginning of an essay of nonsense, I don't think that's what I want to say"?
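
A rough sketch of what that could look like, purely as an illustration: subtract a penalty from each candidate token's logit in proportion to how long a second model expects the answer to become, then sample from the adjusted distribution. The length predictor, the `predicted_lengths` input, and the `alpha` knob are all hypothetical here; nothing in the thread says any real system works this way.

```python
import math
import random


def length_weighted_sample(logits, predicted_lengths, alpha=0.01, rng=random):
    """Sample a token index, down-weighting tokens predicted to lead to long answers.

    logits: base-model scores, one per candidate token.
    predicted_lengths: output of a HYPOTHETICAL length predictor, i.e. the
        expected number of remaining tokens if that candidate is chosen.
    alpha: HYPOTHETICAL penalty strength; alpha=0 is ordinary softmax sampling.
    """
    # Penalize each logit by the predicted length of the continuation.
    adjusted = [l - alpha * n for l, n in zip(logits, predicted_lengths)]

    # Numerically stable softmax over the adjusted scores.
    peak = max(adjusted)
    weights = [math.exp(a - peak) for a in adjusted]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Draw one index according to the adjusted probabilities.
    index = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return index, probs
```

With two equally likely tokens where one is predicted to start a 1000-token answer and the other a 100-token answer, the shorter continuation ends up with almost all of the probability mass. How to train a length predictor that is actually reliable is the open part of the question.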


"Discovered a very interesting thing about DeepSeek-R1 and all reasoning models: The wrong answers are much longer while the correct answers are much shorter."
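
If that observation holds, one simple way to act on it, sketched here under my own assumptions and not necessarily what the linked thread does, is to sample the model several times and keep the shortest answer. `generate` is a hypothetical stand-in for whatever call produces one sampled completion, and `n=5` is an arbitrary default.

```python
from typing import Callable


def shortest_of_n(generate: Callable[[str], str], prompt: str, n: int = 5) -> str:
    """Draw n independent completions and return the one with the fewest words.

    generate: HYPOTHETICAL callable that returns one sampled completion.
    n: number of samples to draw (arbitrary default, an assumption).
    """
    candidates = [generate(prompt) for _ in range(n)]
    # Whitespace word count is a crude proxy for token count.
    # On ties, min() keeps the earliest candidate.
    return min(candidates, key=lambda text: len(text.split()))
```

This only helps when sampling is stochastic (temperature above zero), since otherwise all n completions are identical.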



