Cost 2: Training a 1.5 B parameter transformer network.
I'm assuming it could be trained with google's TPUs. And the model would just be a bigger version of the model open ai released.
> Thanks. So then it was 32 TPUv3s, to be more precise, and sticker-price training costs would then be per Smerity 32 * 24 * 7 * 8 = $43k?