
What's the minimum GPU/NPU hardware and memory to run Qwen3 locally?


There is a 0.6B model so basically nothing.

And the MoE 30B one has a decent shot at running OK without a GPU. I'm on a 5800X3D, so two generations old, and it's still very usable.


I'm running the 4B model on my 8GB AMD 7600 via Ollama.


`model.safetensors` for Qwen3-0.6B is a single 1.5GB file.

Qwen3-235B-A22B has 118 `.safetensors` files at 4GB each.

There are a bunch of models and quants between those.
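
For a rough feel of where those file sizes come from, here is a back-of-the-envelope sketch: weight size is roughly parameter count times bits per parameter. The helper name `weight_size_gb` is made up for illustration, the parameter counts are the nominal ones from the model names, and real checkpoints and quantized files will differ somewhat (embeddings, quantization metadata, mixed-precision layers).

```python
def weight_size_gb(params_billion: float, bits_per_param: float) -> float:
    """Approximate size of the weights alone, in decimal GB.

    params_billion * 1e9 params * (bits / 8) bytes / 1e9 bytes-per-GB
    simplifies to params_billion * bits / 8.
    """
    return params_billion * bits_per_param / 8


if __name__ == "__main__":
    # Nominal sizes taken from the model names; treat them as approximations.
    for name, params in [("Qwen3-30B-A3B", 30), ("Qwen3-235B-A22B", 235)]:
        for bits in (16, 8, 4):
            size = weight_size_gb(params, bits)
            print(f"{name} @ {bits}-bit: ~{size:.1f} GB")
```

At 16 bits the 235B estimate lands close to the 118 x 4GB figure above. Note that for MoE models all experts have to sit in memory, so the total parameter count drives the memory requirement, while the active count (the A3B / A22B part) mostly affects speed.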


Does it run in 8x80G? Or does the KV cache and other buffers push it over the edge?
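
One way to sanity-check this is to estimate the KV cache separately and add it to the weights. The sketch below uses the standard per-token formula (keys plus values, per layer, per KV head). The layer count, KV head count, and head dimension are assumptions that should be checked against the model's `config.json`, the token budget is hypothetical, and activations, framework buffers, and fragmentation are ignored, so treat the result as a lower bound.

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   n_tokens: int, bytes_per_elem: int = 2) -> int:
    """KV cache size in bytes: 2 tensors (K and V) per layer per token."""
    return 2 * n_layers * n_kv_heads * head_dim * n_tokens * bytes_per_elem


# ASSUMED architecture values for Qwen3-235B-A22B; verify against config.json.
ASSUMED_CFG = dict(n_layers=94, n_kv_heads=4, head_dim=128)

WEIGHTS_GB = 118 * 4   # from the comment above: 118 files at 4GB each
BUDGET_GB = 8 * 80     # 8 GPUs, reading "80G" as 80 decimal GB each

# Hypothetical load: 8 concurrent sequences of 32k tokens, 16-bit cache.
N_TOKENS = 8 * 32768

if __name__ == "__main__":
    kv_gb = kv_cache_bytes(n_tokens=N_TOKENS, **ASSUMED_CFG) / 1e9
    total_gb = WEIGHTS_GB + kv_gb
    print(f"weights ~{WEIGHTS_GB} GB, KV cache ~{kv_gb:.1f} GB, "
          f"total ~{total_gb:.1f} GB of {BUDGET_GB} GB")
```

Under those assumptions the cache is small next to the weights, so whether it fits depends mostly on how much headroom the serving framework needs for everything this estimate leaves out.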


Qwen3 is a family of models. The very smallest are only a few GB and will run comfortably on virtually any computer from the last 10 years or a recent-ish smartphone. The largest, well, that depends on how fast you want it to run.


There are models down to 0.6B and you can even run Qwen3 30B-A3B reasonably fast on CPU only.



