
All spaces do[0], but please don't abuse it: it's there for demo purposes only. If you hammer it, it will go down for everyone, and they might not bring it back up.

It can be run locally on a GPU with ~16 GB of VRAM; you may be able to run it at lower precision on GPUs with half that memory.
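As a rough back-of-the-envelope check (assuming GPT-J's ~6B parameter count; actual usage will be higher due to activations, KV cache, and framework overhead), weight storage scales linearly with bytes per parameter, which is why halving the precision roughly halves the VRAM needed:

```python
# Rough weight-memory estimate for a ~6B-parameter model like GPT-J/GPT-JT.
# This counts only the stored weights; real inference needs extra memory
# for activations, the attention KV cache, and framework overhead.

def weight_memory_gib(n_params: float, bytes_per_param: int) -> float:
    """Approximate weight storage in GiB for a given precision."""
    return n_params * bytes_per_param / 2**30

N_PARAMS = 6e9  # approximate GPT-J parameter count

fp32 = weight_memory_gib(N_PARAMS, 4)  # full precision
fp16 = weight_memory_gib(N_PARAMS, 2)  # half precision
int8 = weight_memory_gib(N_PARAMS, 1)  # 8-bit quantized

print(f"fp32: {fp32:.1f} GiB, fp16: {fp16:.1f} GiB, int8: {int8:.1f} GiB")
```

At fp16 the weights alone come to roughly 11 GiB, which is consistent with the ~16 GB figure once overhead is included; in practice lower precision is typically requested when loading the weights, e.g. via the `torch_dtype=torch.float16` argument to `from_pretrained` in Hugging Face `transformers`.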

[0]: https://huggingface.co/spaces/togethercomputer/GPT-JT/blob/m...




There are also a number of commercial services that offer GPT-J APIs (and will surely offer GPT-JT APIs within a couple of days) on a pay-per-token or pay-per-compute-second basis. For light use cases those can be extremely affordable.


Can't one do inference on a CPU?


thank you!



