He's just repeating a lot of the disinformation that has been released about DeepSeek in the last few days. People who took the time to test DeepSeek models know that they produce results of the same or better quality for coding tasks.
Benchmarks are great to have, but individual/org experiences on specific codebases still matter tremendously.
If an org consistently finds that one model performs worse on their corpus than another, they aren't going to keep using it just because it ranks higher in some set of benchmarks.
But you should also be very wary of these kinds of anecdotes, and this thread highlights exactly why. That commenter says in another comment (https://news.ycombinator.com/item?id=42866350) that the token limitation he is complaining about actually has nothing to do with DeepSeek's model or their API; it is a consequence of an artificial limit that Kagi imposes. In other words, his conclusion about DeepSeek is completely unwarranted.
It mashed the header and the C++ implementation file together, which is egregiously bad in the context of Qt. This isn't a new library; it's been around for almost thirty years. Max token sizes have nothing to do with that.
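To make the complaint concrete: Qt convention puts the class declaration in a header and the definitions in a separate .cpp, partly because moc scans headers for the Q_OBJECT macro. A minimal sketch of the expected layout (the class name here is hypothetical, not from my actual prompt):

    // mywidget.h -- declaration only; moc scans this for Q_OBJECT
    #ifndef MYWIDGET_H
    #define MYWIDGET_H

    #include <QWidget>

    class MyWidget : public QWidget {
        Q_OBJECT  // enables signals/slots via the meta-object compiler
    public:
        explicit MyWidget(QWidget *parent = nullptr);
    signals:
        void valueChanged(int value);
    };

    #endif // MYWIDGET_H

    // mywidget.cpp -- implementation, compiled as its own translation unit
    #include "mywidget.h"

    MyWidget::MyWidget(QWidget *parent) : QWidget(parent) {}

Dumping both into one file isn't just a style nit, either: a Q_OBJECT class defined inside a .cpp needs an explicit #include of the generated .moc file at the end just to build under qmake or CMake's automoc.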
I invite anyone to post a chat transcript showing a successful run of R1 against this prompt (and please tell me which API/service it came from so I can go use it too!)
I wasn't suggesting using the anecdotes of others to make a decision.
I'm talking about individuals and organizations deciding whether or not to use a model based on their own testing. That's what ultimately matters here.