Try Kilocode with deepseek v4 (via API directly to deepseek, much cheaper than v...

sfifs · 2026-06-11T11:09:27 1781176167

DeepSeek Flash v4 actually runs on systems with 128 GB of memory (about 14 tok/sec). Antirez created a fabulous 2-bit quant and a highly tuned LLM server:

https://github.com/antirez/ds4

LoganDark · 2026-06-11T07:05:30 1781161530

I do use DeepSeek, and it's exceptionally cheap! Inference is slow, though, and it's not particularly intelligent, but the experience is still better than local inference.