DeepSeek doesn't have the infrastructure to support Apple hardware. I suspect they are also not interested in catering to Apple's needs, given their mission and small size.
DeepSeek's open-source inference code, while correct, may not be fully efficient. For example, MLA (multi-head latent attention) needs its matrix multiplications evaluated in the right associative order to be efficient.
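To illustrate the associativity point (this is a generic low-rank sketch, not DeepSeek's actual code; the dimensions and factor names A, B are made up): MLA stores projections in factored low-rank form, so multiplying a vector through the factors one at a time is far cheaper than materializing the full matrix first, even though both orders give the same result.

```python
import numpy as np

d, r = 4096, 512           # hypothetical model dim and low-rank (latent) dim, r << d
rng = np.random.default_rng(0)
A = rng.standard_normal((d, r))  # down-projection factor (illustrative)
B = rng.standard_normal((r, d))  # up-projection factor (illustrative)
x = rng.standard_normal(d)

# Inefficient order: materialize the full d x d matrix first.
# Cost: ~d*r*d multiplies for (A @ B), plus d*d for the matvec.
y_slow = (A @ B) @ x

# Efficient order: keep everything as matrix-vector products.
# Cost: ~r*d + d*r multiplies -- vastly fewer when r << d.
y_fast = A @ (B @ x)

# Same result (up to floating-point error), very different cost.
assert np.allclose(y_slow, y_fast)
```

With these sizes the naive order does on the order of d²·r ≈ 8.6 billion multiplies versus about 2·d·r ≈ 4 million for the factored order, which is the kind of gap a straightforward but unoptimized implementation can leave on the table.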
Have you run any benchmarks yet? I'm interested in how many tokens/sec you can get. Though in the end it should be more efficient to run the model on distributed server clusters.
Recent developments like V3, R1, and s1 are actually clarifying, pointing towards more understandable, efficient, and therefore more accessible models.