
What does this mean? Can I download the trained model and run it on my machines? Assuming I won't need a supercomputer to run it.



Perhaps first try it out at https://huggingface.co/spaces/togethercomputer/GPT-JT to see what kind of things you can do with it.


I'm flabbergasted. I translated the tweets to Hebrew and reran the example - it returned the correct results. I then changed the input to a negative, and it again returned the correct results. So it's not only in English, and I'm sure that the Hebrew dataset was much smaller. Perhaps it is translating behind the scenes.


Thanks! Perhaps I'm not good at prompt engineering, but I could barely get anything useful out of it.


It's mainly for text classification, which explains why it doesn't give outputs comparable to GPT-3.


Yes, you can download the trained model and run it on your machine. The article links to a Hugging Face model that you can play with in the web browser as a toy example, then download locally and use from code.
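For the download-and-run route, here's a minimal sketch using the `transformers` library. The checkpoint id `togethercomputer/GPT-JT-6B-v1`, the few-shot prompt format, and the generation settings are assumptions on my part; check the model card for the real details:

```python
# Sketch: run GPT-JT locally for few-shot classification.
# The checkpoint id below is an assumption; downloading the weights
# (~12 GB at fp16) and loading them needs a capable machine.

def build_sentiment_prompt(examples, text):
    """Few-shot classification prompt: labeled examples, then the query."""
    lines = [f"Tweet: {t}\nLabel: {label}" for t, label in examples]
    lines.append(f"Tweet: {text}\nLabel:")
    return "\n\n".join(lines)

def classify(text, model_id="togethercomputer/GPT-JT-6B-v1"):
    # Lazy import so the prompt helper works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    prompt = build_sentiment_prompt(
        [("I love this!", "positive"), ("Awful service.", "negative")],
        text,
    )
    inputs = tokenizer(prompt, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=3)
    # Decode only the newly generated tokens, i.e. the predicted label.
    return tokenizer.decode(out[0][inputs["input_ids"].shape[1]:]).strip()
```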


another noob here - does this hugging face model expose an api? i have a light classification use case i might wanna try it out on but think running it on my machine/a beefy cloud machine would be overkill


All spaces do[0], but please don’t abuse it: it is just for demo purposes. If you hammer it, it will be down for everyone, and they might not bring it back up.

It can be run locally on a GPU with ~16GB of VRAM; you might be able to run it at lower precision on GPUs with half that memory.

[0]: https://huggingface.co/spaces/togethercomputer/GPT-JT/blob/m...
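If you do want to hit the Space over HTTP for a light classification case, a sketch of the request shape: the `/api/predict` route and the `{"data": [...]}` payload follow the Gradio convention of the time, the host URL is hypothetical, and the argument order in `data` is a guess — check the Space's own API page (linked in [0]) for the real contract, and again, go easy on it:

```python
# Sketch: call a Gradio Space over HTTP instead of running the model locally.
# Endpoint path, payload shape, and argument order are assumptions based on
# the Gradio convention; the host below is illustrative, not the real one.
import json
import urllib.request

SPACE_URL = "https://togethercomputer-gpt-jt.hf.space"  # hypothetical host

def build_request(prompt, max_tokens=16):
    payload = {"data": [prompt, max_tokens]}  # argument order is a guess
    return urllib.request.Request(
        f"{SPACE_URL}/api/predict",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

# Usage (not run here, to avoid hammering the demo):
# resp = urllib.request.urlopen(build_request("Tweet: great!\nLabel:"))
# print(json.loads(resp.read())["data"])
```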


There are also a number of commercial services that offer GPT-J APIs (and surely in a couple days GPT-JT APIs) on a pay-per-token or pay-per-compute-second basis. For light use cases those can be extremely affordable.


Can't one do inference on a CPU?
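For scale, here's rough weight-memory arithmetic for a GPT-J-class (~6B parameter) model, assuming the usual bytes-per-parameter for each dtype. These are estimates that ignore activations and framework overhead, but they line up with the ~16GB VRAM and half-the-RAM-at-lower-precision figures upthread:

```python
# Back-of-the-envelope weight memory for a ~6B-parameter model.
# Ignores activations, KV cache, and framework overhead.
PARAMS = 6_000_000_000

def weight_gb(bytes_per_param):
    """GB needed just to hold the weights at a given precision."""
    return PARAMS * bytes_per_param / 1e9

fp32 = weight_gb(4)  # ~24 GB: the usual CPU-inference dtype, so CPU works
                     # in principle if you have the RAM (expect it to be slow)
fp16 = weight_gb(2)  # ~12 GB: fits the ~16 GB VRAM figure mentioned above
int8 = weight_gb(1)  # ~6 GB: 8-bit quantization, roughly half that again
```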


thank you!



