
Tried. The context windows just weren't big enough.



Qwen3.6-27B supports a 1 million token context window.

Of course, you need the right hardware to run a context window that large: on my DGX Spark it takes about 100 GB of memory with a full f16 KV cache on the q4_k_xl model.
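For a rough sense of why the KV cache dominates memory at long context, here is a minimal sketch of the usual size estimate for a plain transformer with grouped-query attention. The layer count, KV head count, and head dimension below are hypothetical placeholders, not the real configuration of any Qwen model, and models with hybrid or sliding-window attention will not follow this formula exactly.

```python
def kv_cache_bytes(n_tokens, n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    """Estimate KV cache size in bytes for a standard attention stack.

    bytes_per_elem=2 corresponds to an f16 cache.
    """
    # Factor of 2: each layer stores one key tensor and one value tensor.
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem * n_tokens


# Hypothetical architecture numbers, for illustration only.
cfg = dict(n_layers=48, n_kv_heads=4, head_dim=128)

for ctx in (32_768, 262_144, 1_000_000):
    gib = kv_cache_bytes(ctx, **cfg) / 2**30
    print(f"{ctx:>9} tokens: {gib:.1f} GiB of KV cache")
```

The cache grows linearly with context length, so whatever the real per-token cost is, a 1M-token window needs roughly 30x the cache of a 32K one. That is on top of the weights themselves.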


Got a similar result (my RTX 4070 only has 12 GB). I'm curious whether 24 or 32 GB improves this enough to make it useful.

Try running it on the CPU from system RAM.

It’s slower, but the models will run.


Good idea for evaluating the models, thanks.

Prompt more directly instead of asking open-ended questions.


