
When I said

> such great performance that I've mostly given up on GPU for LLMs

I meant that I used to run ollama on GPU, but llamafile gave approximately the same performance on CPU alone, so I switched. Now that might just be because my GPU is weak by current standards, but that is in fact the comparison I was making.

Edit: Though to be clear, ollama would easily be my second pick; it also has minimal dependencies and is super easy to run locally.
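If you want to reproduce the comparison on your own hardware, both tools expose a local HTTP server, so a rough tokens-per-second check is only a few lines of Python. This is a sketch, not my actual benchmark: the model name, prompt, and ports (ollama's default 11434, llamafile's default 8080) are assumptions you may need to adjust.

    import time
    import requests  # assumes the `requests` package is installed

    PROMPT = "Explain the Transformer architecture in one paragraph."

    def bench_ollama(model="llama3", url="http://localhost:11434/api/generate"):
        """Wall-clock tokens/sec for one non-streaming ollama generation."""
        t0 = time.time()
        r = requests.post(url, json={"model": model, "prompt": PROMPT,
                                     "stream": False}, timeout=600)
        r.raise_for_status()
        # ollama's response includes eval_count (tokens generated); it also
        # reports its own eval_duration if you want server-side timing
        tokens = r.json().get("eval_count", 0)
        return tokens / (time.time() - t0)

    def bench_llamafile(url="http://localhost:8080/v1/chat/completions"):
        """Wall-clock tokens/sec against llamafile's OpenAI-compatible endpoint."""
        t0 = time.time()
        r = requests.post(url, json={
            "model": "local",  # placeholder; llamafile serves its embedded model
            "messages": [{"role": "user", "content": PROMPT}],
        }, timeout=600)
        r.raise_for_status()
        tokens = r.json()["usage"]["completion_tokens"]
        return tokens / (time.time() - t0)

    if __name__ == "__main__":
        print(f"ollama:    {bench_ollama():.1f} tok/s")
        print(f"llamafile: {bench_llamafile():.1f} tok/s")

Run each server with the same model and compare the two numbers; wall-clock timing over one request is crude, so average a few runs if the difference is close.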



