I commented this elsewhere, but word in the ether is that OLMo is not actually that good a model given its size and compute budget. I'm not entirely sure why, and it's still valuable to have the full recipe for at least one model out in the open, but the current OLMo is definitely a cautionary tale for people training their own models.
https://huggingface.co/allenai/OLMo-7B