
The PR is a bit confusing: the context is 32k/64k, or 131k on the paid API.

Also, this model: https://huggingface.co/Qwen/Qwen3-235B-A22B

It is natively 32k. So the 64k and 131k variants rely on RoPE scaling, which is not the best for effective context.
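
To put rough numbers on that, here is a minimal sketch of how far positions have to be stretched past the native window. It assumes 32k means 32768 tokens, 64k means 65536, and 131k means 131072; the helper name `rope_scaling_factor` is made up for illustration and is not part of any library.

```python
def rope_scaling_factor(native_ctx: int, target_ctx: int) -> float:
    """Return how many times the native window must be stretched.

    Hypothetical helper: a factor of 1.0 means no scaling is needed.
    """
    if native_ctx <= 0 or target_ctx <= 0:
        raise ValueError("context lengths must be positive")
    # Never report a factor below 1.0: shorter targets fit natively.
    return max(1.0, target_ctx / native_ctx)


NATIVE_CTX = 32 * 1024  # assumed native window of 32768 tokens

for target in (64 * 1024, 128 * 1024):
    factor = rope_scaling_factor(NATIVE_CTX, target)
    print(f"{target} tokens -> scaling factor {factor}")
```

The bigger the factor, the further the model runs beyond the positions it actually saw in training, which is why a natively long-context model is preferable to a stretched one.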

Whereas Qwen3-Coder (https://qwenlm.github.io/blog/qwen3-coder/) is 256k native: https://huggingface.co/Qwen/Qwen3-Coder-480B-A35B-Instruct


