Hacker News new | past | comments | ask | show | jobs | submit login

Llama v3.3 70B after quantization runs reasonably well on a 24GB GPU (7900XTX or 4090) and 64GB of regular RAM. Software: https://github.com/ggerganov/llama.cpp .



Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: