
A data point for you: 7B models at 5-bit quantization run quite comfortably under llama.cpp on the AMD Radeon RX 6700 XT, which has 12GB VRAM and was part of a lot of gaming PC builds around 2021-22.
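For a rough sense of why that fits, here is a back-of-the-envelope sketch of the weight memory. It assumes exactly 5 bits per weight and counts weights only. Real llama.cpp quantization formats carry some extra per-block metadata, and the KV cache and runtime buffers need VRAM too, so treat the result as a lower bound. The function name is just for illustration.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights alone, in GB (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumption: a 7B-parameter model at a flat 5 bits per weight.
weights_gb = weight_memory_gb(7e9, 5)

# What is left of a 12 GB card for KV cache, buffers, and the desktop.
headroom_gb = 12 - weights_gb

print(f"weights: ~{weights_gb:.2f} GB, headroom on 12 GB: ~{headroom_gb:.2f} GB")
```

So the weights come to roughly 4.4 GB under these assumptions, which leaves a fair amount of the 12 GB free for context.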

I can’t give this as a recommendation, since there are far more tools available for Nvidia GPUs, but from what I can see AMD GPUs offer more VRAM at lower prices.



