Ask HN: How much would it cost to replicate GPT-2's model?
4 points by sharemywin 36 days ago | 2 comments
Cost 1: Scrape the 8 million articles linked from Reddit posts.

http://files.pushshift.io/reddit/

Cost 2: Train a 1.5B-parameter transformer network.

I'm assuming it could be trained on Google's TPUs, and that the model would just be a bigger version of the one OpenAI released.




> Their model used 256 of Google's Cloud TPU v3, though I've not seen training durations. The TPU v3 is only available individually outside of @Google (though @OpenAI likely got special dispensation) which means you'd be paying $8 * 256 = $2048 per hour.

https://twitter.com/Smerity/status/1096189352743301120

> Thanks. So then it was 32 TPUv3s, to be more precise, and sticker-price training costs would then be per Smerity 32 * 24 * 7 * 8 = $43k?

https://www.reddit.com/r/MachineLearning/comments/aqlzde/r_o...
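
The arithmetic in the two quotes can be checked with a short sketch. The $8 per TPU v3 per hour rate and the one-week duration (the 24 * 7 factor) come from the quotes; the function name and its parameters are hypothetical, and the real training duration is not confirmed anywhere in this thread.

```python
# Sticker-price estimate using only the figures quoted in this thread.
# Assumptions: $8 per TPU v3 per hour (Smerity's figure) and one week
# of training (the 24 * 7 factor in the Reddit comment).

def training_cost(num_tpus, hours, usd_per_tpu_hour=8):
    """Return the on-demand cost in USD for num_tpus running for hours."""
    return num_tpus * hours * usd_per_tpu_hour

hourly_256 = training_cost(256, 1)      # 256 TPUs for one hour
weekly_32 = training_cost(32, 24 * 7)   # 32 TPUs for one week

print(hourly_256)  # 2048
print(weekly_32)   # 43008
```

Changing `num_tpus` or `hours` shows how sensitive the total is to the two unknowns, the device count and the training time.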


It's likely even more, since they needed to do hyperparameter tuning. Multiply it by 10 or 100 to get a more realistic estimate.
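
As a rough sketch of what that multiplier implies, assuming the roughly $43k single-run figure quoted above (32 TPUs * 24 hours * 7 days * $8) as the base. The helper name is hypothetical, and the 10x and 100x factors are this comment's guess, not measured values.

```python
# Range for the total cost once hyperparameter tuning is included.
# Assumption: one training run costs 32 * 24 * 7 * 8 USD at sticker price.
single_run_usd = 32 * 24 * 7 * 8

def with_tuning(base_usd, multipliers=(10, 100)):
    """Scale a single-run cost by each tuning multiplier."""
    return [base_usd * m for m in multipliers]

low, high = with_tuning(single_run_usd)
print(f"${low:,} to ${high:,}")  # $430,080 to $4,300,800
```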



