
I'm always wondering how people afford Kubernetes clusters and all that just to run stuff like Ollama.



The author is using https://k3s.io/ not the full k8s, so it doesn't have to be extremely expensive.


K3s is a proper, fully conformant k8s distribution and has instructions for running multi-node clusters; it just ships batteries-included, with defaults that play nicely on a single node.
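For reference, the single-node setup is roughly the following sketch. The install script URL and the bundled kubectl wrapper are from the official k3s quick-start docs; production flags will differ:

```shell
# Install k3s as a single-node cluster (server and agent on one machine).
# get.k3s.io is the official install script endpoint.
curl -sfL https://get.k3s.io | sh -

# k3s bundles kubectl, so no separate install is needed to inspect the node.
sudo k3s kubectl get nodes
```

After that, the same node schedules workloads by default, which is what makes a one-box homelab cluster cheap to run.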


Thanks!

Today I learned!


You can run a local LLM on a $100 mini PC with Vulkan GPU acceleration and get a usable token generation rate.
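As a sketch of that kind of setup: llama.cpp ships a Vulkan backend that runs on most integrated GPUs. The build flags and CLI options below are from the llama.cpp docs; `model.gguf` is a placeholder for whichever quantized model you download, not a specific recommendation:

```shell
# Build llama.cpp with the Vulkan backend enabled.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build -DGGML_VULKAN=ON
cmake --build build --config Release

# Run a quantized GGUF model, offloading layers to the GPU.
# -ngl 99 requests as many layers on the GPU as will fit;
# model.gguf is a placeholder filename.
./build/bin/llama-cli -m model.gguf -ngl 99 -p "Hello"
```

On a low-end iGPU the offload is what moves generation from painful to usable, so a small quantized model is the practical choice.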



