> Please note that it is advised to avoid using the Hugging Face fast tokenizer for now, as we’ve observed that the auto-converted fast tokenizer sometimes gives incorrect tokenization
This is great. Based on the throughput of 2,200 tokens/sec and the 1,000,000,000,000 tokens used for training, this represents at least $183k worth of compute (based on the three-year committed-use rate). And now we can have it for free!
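The back-of-envelope math can be sketched like this. The throughput and token count are from the comment above; the ~$1.45/hr rate is my assumption, reverse-engineered from the $183k figure as a plausible three-year committed-use price for a single accelerator.

```python
# Rough check of the compute-cost estimate: tokens / throughput -> hours,
# hours * hourly rate -> dollars.
TOKENS_TRAINED = 1_000_000_000_000  # 1 trillion tokens (from the comment)
TOKENS_PER_SEC = 2_200              # observed throughput (from the comment)
HOURLY_RATE = 1.45                  # USD/hr -- assumed committed-use rate

hours = TOKENS_TRAINED / TOKENS_PER_SEC / 3600
cost = hours * HOURLY_RATE

print(f"{hours:,.0f} accelerator-hours, ~${cost:,.0f}")
# → 126,263 accelerator-hours, ~$183,081
```

That works out to roughly 126k accelerator-hours, consistent with the $183k estimate at that rate.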