OpenLLaMA 7B Training Completed to 1T Tokens (huggingface.co)
58 points by jncraton on June 7, 2023 | 3 comments



Be sure to read the warning in their repo: https://github.com/openlm-research/open_llama#loading-the-we...

> Please note that it is advised to avoid using the Hugging Face fast tokenizer for now, as we’ve observed that the auto-converted fast tokenizer sometimes gives incorrect tokenization
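In practice that means loading the slow, SentencePiece-based LlamaTokenizer (or passing use_fast=False to AutoTokenizer) instead of the auto-converted fast tokenizer. A minimal sketch, assuming the Hugging Face repo id is openlm-research/open_llama_7b:

    from transformers import LlamaTokenizer, LlamaForCausalLM

    model_path = "openlm-research/open_llama_7b"  # assumed repo id

    # Slow (SentencePiece) tokenizer, per the repo's warning that the
    # auto-converted fast tokenizer sometimes tokenizes incorrectly.
    tokenizer = LlamaTokenizer.from_pretrained(model_path)
    # Equivalent: AutoTokenizer.from_pretrained(model_path, use_fast=False)

    model = LlamaForCausalLM.from_pretrained(model_path)
    ids = tokenizer("The quick brown fox", return_tensors="pt").input_ids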


This is great. Based on the throughput of 2200 tokens/sec and the 1,000,000,000,000 tokens used in training, this was at least $183k worth of compute (based on the three-year committed use rate). And now we can have it for free!
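A rough reconstruction of that back-of-the-envelope estimate; the assumption that 2200 tokens/sec is per TPU v4 chip and the ~$1.45/chip-hour three-year committed rate are mine, not stated in the comment:

    # 1T tokens at an assumed 2200 tokens/sec per chip
    tokens = 1_000_000_000_000
    tokens_per_sec_per_chip = 2200
    usd_per_chip_hour = 1.449          # assumed 3-yr committed TPU v4 rate

    chip_hours = tokens / tokens_per_sec_per_chip / 3600
    cost_usd = chip_hours * usd_per_chip_hour
    print(f"{chip_hours:,.0f} chip-hours -> ~${cost_usd:,.0f}")
    # ~126,263 chip-hours -> ~$182,954, i.e. roughly $183k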


That's roughly in line with the training cost MosaicML quoted for their 7B model[0], and with Falcon 7B.

[0] https://twitter.com/MosaicML/status/1660738892306485248




