
Ask HN: How much would it cost to replicate GPT-2's model? - sharemywin
Cost 1: Scrape 8 million articles from Reddit posts.

http://files.pushshift.io/reddit/

Cost 2: Train a 1.5B-parameter transformer network.

I'm assuming it could be trained on Google's TPUs, and that the model would just be a bigger version of the model OpenAI released.
======
tree_of_item
> Their model used 256 of Google's Cloud TPU v3, though I've not seen training
> durations. The TPU v3 is only available individually outside of @Google
> (though @OpenAI likely got special dispensation) which means you'd be paying
> $8 * 256 = $2048 per hour.

[https://twitter.com/Smerity/status/1096189352743301120](https://twitter.com/Smerity/status/1096189352743301120)

> Thanks. So then it was 32 TPUv3s, to be more precise, and sticker-price
> training costs would then be, per Smerity, 32 * 24 * 7 * 8 = $43k?

[https://www.reddit.com/r/MachineLearning/comments/aqlzde/r_o...](https://www.reddit.com/r/MachineLearning/comments/aqlzde/r_openai_better_language_models_and_their/)
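The two figures quoted above are just devices × hours × hourly rate. A minimal sketch of that arithmetic, assuming the $8/hr-per-TPU-v3 rate and the 32-TPU, one-week duration discussed in the Reddit thread (neither is an official OpenAI number):

```python
# Back-of-envelope sticker-price estimates from the quotes above.
# Assumed inputs: $8/hr per Cloud TPU v3; device counts and the
# one-week duration come from the Twitter/Reddit discussion.

TPU_V3_HOURLY_USD = 8

def training_cost(num_tpus, hours, hourly_rate=TPU_V3_HOURLY_USD):
    """Sticker-price cost: devices * hours * hourly rate."""
    return num_tpus * hours * hourly_rate

# Hourly burn rate at 256 TPUs (Smerity's first figure)
hourly_burn = training_cost(256, 1)        # $2,048/hr

# 32 TPUs for one week: 24 h/day * 7 days = 168 hours
week_estimate = training_cost(32, 24 * 7)  # $43,008, i.e. ~$43k

print(f"${hourly_burn:,}/hr at 256 TPUs; ~${week_estimate:,} for 32 TPUs over a week")
```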

~~~
solomatov
It's likely even more, since they would have needed to do hyperparameter tuning.
Multiply it by 10 or 100 to get a more realistic estimate.
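To put numbers on that multiplier: if hyperparameter search meant on the order of 10 to 100 full training runs (the commenter's guess, not a measured figure), the ~$43k single-run sticker price scales accordingly:

```python
# Hedged extension of the point above: the 10x-100x multipliers are
# the commenter's rough guess for hyperparameter-search overhead.

single_run_usd = 32 * 24 * 7 * 8  # ~$43k per full training run, per the thread

for runs in (10, 100):
    print(f"{runs} runs: ~${single_run_usd * runs:,}")  # ~$430k to ~$4.3M
```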

