Hacker News

AFAIK you want a model that will fit within the GPU's 24 GB of VRAM while leaving a couple of gigabytes free for context. Once you start spilling into system RAM on a PC, you're smoked: it'll still run, but you'll hate your life.
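A quick back-of-the-envelope sketch of that sizing logic (the bits-per-weight figure and the 2 GB context reserve below are illustrative assumptions, not measured numbers):

```python
# Rough VRAM budget check for a quantized local model.
# Assumption: ~4.5 effective bits/weight for a typical 4-bit quant,
# and ~2 GB reserved for KV cache / context and runtime overhead.

def model_vram_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight memory in GB for a quantized model."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1024**3

weights_gb = model_vram_gb(13, 4.5)   # a 13B model at ~4.5 bits/weight
budget_gb = 24 - 2                    # 24 GB card minus context headroom
print(f"weights ~{weights_gb:.1f} GB, fits: {weights_gb <= budget_gb}")
```

By this estimate a 4-bit 13B model takes roughly 7 GB of weights, leaving plenty of room on a 24 GB card; the same arithmetic shows why a 70B model at 4 bits (~37 GB) spills into system RAM.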

Have you ever run a local LLM at all? If not, getting one running well is still a little annoying. I would start here:

https://www.reddit.com/r/LocalLLaMA/


