I tried the 27b-iat model on a 4090m (16 GB VRAM) with mostly default args via llama.cpp, and it didn't fit: it filled the VRAM and spilled about 2 GB into system RAM. Performance in this setup was under 5 tokens/s.
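A rough back-of-envelope estimate makes the overflow unsurprising. This sketch assumes ~4.5 effective bits per weight (typical of 4-bit block-quantized formats, which store per-block scales alongside the weights); the exact figure depends on the quantization scheme, and the KV cache and compute buffers add more on top, scaling with context length.

```python
# Back-of-envelope VRAM estimate for a 27B-parameter model at 4-bit
# block quantization (assumed ~4.5 effective bits/weight incl. scales).
params = 27e9
bits_per_weight = 4.5  # assumption, varies by quant format

weights_gib = params * bits_per_weight / 8 / 2**30
print(f"weights alone: ~{weights_gib:.1f} GiB")  # already close to 16 GiB
```

Weights alone land around 14 GiB here, so once the KV cache and llama.cpp's compute buffers are added, exceeding a 16 GB card at default settings is plausible; the usual workarounds are shrinking the context or offloading fewer layers to the GPU.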

