Also, this model https://huggingface.co/Qwen/Qwen3-235B-A22B is 32k native, so the 64k and 131k variants rely on RoPE scaling, which is not the best for effective context.
Qwen3-Coder (https://qwenlm.github.io/blog/qwen3-coder/), on the other hand, is 256k native: https://huggingface.co/Qwen/Qwen3-Coder-480B-A35B-Instruct.
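For context, the extended windows on the 235B model are usually enabled by turning on YaRN-style RoPE scaling over the native 32k window (the official model card recommends YaRN). A minimal sketch with transformers, assuming the standard `rope_scaling` config field; the factor 4.0 comes from 131072 / 32768:

```python
from transformers import AutoConfig, AutoModelForCausalLM

# Sketch only: stretch Qwen3-235B-A22B's native 32k context to ~131k via YaRN.
config = AutoConfig.from_pretrained("Qwen/Qwen3-235B-A22B")
config.rope_scaling = {
    "rope_type": "yarn",
    "factor": 4.0,  # 131072 / 32768
    "original_max_position_embeddings": 32768,
}

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-235B-A22B",
    config=config,
    torch_dtype="auto",
    device_map="auto",
)
```

Static scaling like this applies to all inputs, which is part of why effective quality on shorter prompts can suffer compared to a model trained natively at the longer length.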