Yi-34B, Llama 2, and common practices in LLM training (eleuther.ai)
41 points by helloericsf 5 months ago | 3 comments



In short, all modern large language models (LLMs) are made from the same algorithmic building blocks. The architectural differences between Llama 2 and the original 2017 Transformer were not invented by Meta, and are all public owing to open access publishing being the norm in computer science. So, even though Yi-34B adopts Llama 2's architecture, Meta's model did not give 01.AI access to any previously inaccessible innovation.


Yi-34B sounds like a stellar object.


Funny, I had never seen the claim that Yi-34B was based on LLaMA until now.



