CrystalChat – 7B model from LLM360 beating Mistral (twitter.com/llm360)
3 points by sunandcoffee 11 months ago | 2 comments



This does look like a truly open model, with all the components needed to replicate it released under Apache 2.0. It seems to be a fine-tuned version of their CrystalCoder model.

Kudos for releasing a fully open model that will (hopefully) foster collaboration in the community. Looks like they are also planning to release a 65B model (see Diamond model: https://www.llm360.ai/).

CrystalCoder Dataset (including prep): https://github.com/LLM360/crystalcoder-data-prep
CrystalCoder Training code: https://github.com/LLM360/crystalcoder-train
CrystalChat Model & Weights: https://huggingface.co/LLM360/CrystalChat


Looks like the model is slightly worse on language ability but better on coding ability. This might be a good model for trying agents.



