Hacker News

1 token/s is way too slow for dialogue, especially since a token isn't even a full word but often only part of one. 1 t/s might be sufficient for asynchronous processing, but if you want a ChatGPT-like therapy dialogue, that's not good enough.
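To put rough numbers on that point: a back-of-the-envelope sketch, assuming the commonly cited figure of roughly 1.3 BPE tokens per English word (an assumption; the exact ratio depends on the tokenizer and text):

```python
def words_per_second(tokens_per_second: float, tokens_per_word: float = 1.3) -> float:
    """Convert a token throughput into an approximate word throughput."""
    return tokens_per_second / tokens_per_word

def seconds_for_reply(n_words: int, tokens_per_second: float,
                      tokens_per_word: float = 1.3) -> float:
    """Approximate wall-clock time to generate an n_words-long reply."""
    return n_words * tokens_per_word / tokens_per_second

# At 1 token/s, a modest 50-word reply takes over a minute.
print(round(words_per_second(1.0), 2))      # ~0.77 words/s
print(round(seconds_for_reply(50, 1.0)))    # ~65 seconds
```

That minute-plus latency per short reply is why 1 t/s rules out interactive dialogue even though it may be fine for batch work.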


Do we know what the speed difference actually is? I'm not sure which benchmarks would measure that. My best plan so far is to just run a smaller model on one of the GPUs and time how long generation takes.
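That timing plan can be sketched generically: wrap whatever generation call your runtime exposes, time it, and divide new tokens by elapsed seconds. A minimal sketch, assuming `generate` is any callable that takes prompt tokens and a token budget and returns the full token sequence (the dummy model below is purely illustrative):

```python
import time

def measure_tokens_per_second(generate, prompt_tokens, max_new_tokens):
    """Time one generation call and return throughput in new tokens per second."""
    start = time.perf_counter()
    output_tokens = generate(prompt_tokens, max_new_tokens)
    elapsed = time.perf_counter() - start
    new_tokens = len(output_tokens) - len(prompt_tokens)
    return new_tokens / elapsed

def dummy_generate(prompt_tokens, max_new_tokens):
    """Stand-in 'model' that emits one token per millisecond."""
    out = list(prompt_tokens)
    for i in range(max_new_tokens):
        time.sleep(0.001)  # simulate per-token decode latency
        out.append(i)
    return out

tps = measure_tokens_per_second(dummy_generate, [101, 102, 103], 50)
print(f"{tps:.0f} tokens/s")
```

In practice you'd swap `dummy_generate` for your actual inference call and use a decent token budget, since per-token decode speed is roughly constant once the prompt is processed, so a single timed run gives a usable estimate.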




