No. They publish PDFs that hype up their models, but they do not publish anything even resembling a high-level overview of model architecture.



Given that you can download and use the weights, the model architecture has to be included as part of that. And I did read a paper from them recently describing their MoE architecture and how it differs from the original GShard.


Excuse me? What weights can you download from OpenAI? GPT-2 does not count.


Sorry, I meant that DeepSeek releases their models. Wrong context.



