>Does this mean it may be possible to self-host a ChatGPT clone assuming you have a 70B model?

Not only possible but quite easy. Inference for a 70B model can be done with llama.cpp on CPU only, on commodity hardware with more than 64 GB of RAM (a 4-bit quantized 70B fits in roughly 40 GB).
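
A rough sketch of what that looks like (model filename, prompt, and thread count are placeholders; check the llama.cpp README for your version's exact binary names):

  # build llama.cpp (CPU-only is the default backend)
  cmake -B build && cmake --build build --config Release

  # run a 4-bit quantized 70B GGUF on the CPU
  ./build/bin/llama-cli \
    -m models/llama-2-70b.Q4_K_M.gguf \
    -p "Explain how a transformer works." \
    -n 256 -t 8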




I have 64 GB on my 5-year-old ThinkPad. What kind of performance (tokens per second) could I expect on that nowadays for a 70B model?


llama.cpp's speed is dramatically improved by AVX instructions. If your CPU has them, it will run much faster than without.

And if it doesn't, you'll need some workarounds at compile time, and it gets a bit harder to run. You can check support and disable those code paths as sketched below.
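
For anyone wondering: on Linux you can check AVX support from /proc/cpuinfo, and turn off the AVX paths when building. A sketch, assuming the current GGML_* CMake options (older llama.cpp versions used LLAMA_* instead):

  # check which AVX variants your CPU reports
  grep -o 'avx[^ ]*' /proc/cpuinfo | sort -u

  # build with the AVX code paths disabled
  cmake -B build -DGGML_AVX=OFF -DGGML_AVX2=OFF -DGGML_FMA=OFF
  cmake --build build --config Release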



