tommek4077 on Sept 11, 2023 | on: Nvidia’s AI supremacy is only temporary
Well, llama.cpp running on CPUs with decent speed, plus its fast pace of development, hints toward CPUs. And there the size of the model matters less, since RAM is the limit. At least for inference, this is now a viable alternative.
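To make the CPU-inference point concrete, here is a minimal sketch using the llama-cpp-python bindings; the model filename, prompt, and thread count are placeholders, not anything from the thread:

    # Minimal CPU-only inference sketch via llama-cpp-python
    # (pip install llama-cpp-python). Model path is hypothetical.
    from llama_cpp import Llama

    llm = Llama(
        model_path="./llama-2-7b.Q4_K_M.gguf",  # hypothetical local GGUF file
        n_threads=8,      # match your physical core count
        n_gpu_layers=0,   # 0 keeps every layer on the CPU
    )

    out = llm("Q: What limits model size on a CPU? A:", max_tokens=64, stop=["Q:"])
    print(out["choices"][0]["text"])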
redox99 on Sept 11, 2023
Outside of Macs, llama.cpp running fully on the CPU is more than 10x slower than a GPU.
tommek4077 on Sept 11, 2023
But having 32 real cores in a CPU is so much cheaper than having multiple GPUs. RAM is also much cheaper than VRAM.
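To put rough numbers on "RAM is the limit": a back-of-envelope sketch of the weight footprint under 4-bit quantization (the ~4.5 bits/weight average and the parameter counts are assumptions, and KV cache overhead is ignored):

    # Weight-only memory footprint for quantized models; ignores the
    # KV cache and activations. ~4.5 bits/weight approximates common
    # 4-bit quant formats (assumption, not a figure from the thread).
    def weights_gib(params_billion: float, bits_per_weight: float = 4.5) -> float:
        return params_billion * 1e9 * bits_per_weight / 8 / 2**30

    for b in (7, 13, 34, 70):
        print(f"{b:>3}B @ 4-bit: ~{weights_gib(b):.0f} GiB")
    # 70B @ 4-bit: ~37 GiB, which fits in commodity system RAM but
    # exceeds the VRAM of any single consumer GPU.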
kirill5pol on Sept 11, 2023
For local use, yes, but at data-center scale the parallelization is still often worth it.