
Do LLMs have a way around the high end GPU requirements, or can CPU code potentially be much more optimized somehow?

This is the only thing I can think of; not everyone will have the latest high-end GPUs to run such software..




If you're doing inference on a neural network, each weight has to be read at least once per token. That means you'll read at least the full size of the model, per token, during inference. If your model is 60GB and you're streaming it from the hard drive, your minimum time per token is bounded by your drive's read throughput. MacBooks have ~4GB/s sequential read speed, so your inference time per token will be strictly more than 15 seconds. If the model is in RAM, then (according to Apple's advertising) your memory bandwidth is 400GB/s, which is 100x your hard drive speed, and memory throughput stops being as much of a bottleneck.
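The arithmetic above can be sketched in a few lines. The numbers (60GB model, ~4GB/s SSD, 400GB/s advertised unified memory) are the ones from this comment, not measurements:

```python
def min_seconds_per_token(model_size_gb: float, bandwidth_gb_per_s: float) -> float:
    """Lower bound on per-token latency when inference is bandwidth-bound:
    every weight must be read at least once per token."""
    return model_size_gb / bandwidth_gb_per_s

# Streaming a 60GB model from an SSD at ~4GB/s:
ssd = min_seconds_per_token(60, 4)      # 15.0 s/token at minimum
# Same model resident in RAM at Apple's advertised 400GB/s:
ram = min_seconds_per_token(60, 400)    # 0.15 s/token at minimum

print(f"from SSD: >= {ssd:.2f} s/token")
print(f"from RAM: >= {ram:.2f} s/token")
```

This is only a floor, of course; compute, cache behavior, and software overhead all sit on top of it.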


Your answer applies equally to GPU and CPU, no?

The comment to which you replied was asking about the need for a GPU, not the need for a lot of RAM.


LLM-specific chips, specialized for the task, will be coming to market soon.

Tesla has already been building AI chips for the FSD features in its vehicles. Over the next few years, everyone will be racing to be first to put out LLM-specific chips, with AI-specific hardware devices following.


The next generation of Intel/AMD IGPs operating out of RAM should be quite usable.


What exactly is the ideal hardware for running and training large models? Do you just need a high-end version of basically everything?


Check out llama.cpp


Looks like this was hacked together pretty quickly. CPU inference like this is exactly what needs to be optimized to run on more devices, if that's even possible..

I guess it will take hardware and software a while to catch up enough to compete with ChatGPT..


If you look at the news, yes, it came together quickly, but it has also received a lot of performance upgrades that have improved it significantly.



