Hacker News

Is there evidence for the claim that the training was that much cheaper? Or is it 'apparent' by inspection of the verified ingestion architecture?

I can't seem to find any evidence beyond the team's statements. What have I missed?

The hypothesis I have is that China has way more compute resources than they are willing to share.

Compute resources they officially should not have access to, given export bans; acknowledging them might lead to their export-ban bypass getting rolled up.


Maybe. They're under no obligation to tell the truth about this.

Hypothetical: take a large short position on NVDA, then announce to the market that you trained a massive model without tens of millions of dollars' worth of rare-as-hen's-teeth NVDA hardware. Settle the position, then quietly settle the giant compute bill. Difficult to know either way, but the market seems to have taken the team at face value. I guess we'll know if and only if this reduced-training-cost methodology is replicated.
