The PR is a bit confusing: 32k/64k, and 131k if paid API. Also this model https://huggi... | Hacker News


mehdibl 25 days ago | on: Cerebras launches Qwen3-235B, achieving 1.5k token...

The PR is a bit confusing: the context is 32k/64k, and 131k if you are on the paid API. Also, this model, https://huggingface.co/Qwen/Qwen3-235B-A22B, is native 32k, so the 64k and 131k options rely on RoPE scaling, which is not the best for effective context. Meanwhile Qwen3-Coder (https://qwenlm.github.io/blog/qwen3-coder/) is 256k native: https://huggingface.co/Qwen/Qwen3-Coder-480B-A35B-Instruct.
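To make the 32k vs 64k/131k point concrete, here is a rough sketch of the arithmetic behind RoPE scaling: the longer windows are reached by stretching the native 32k position range by a factor. The exact token counts (32,768 / 65,536 / 131,072) are my assumption for "32k/64k/131k", and the config keys are hypothetical, modeled loosely on the `rope_scaling` convention used in Hugging Face model configs, not taken from Cerebras' actual setup.

```python
from typing import Optional

# Assumed exact token counts for the "32k" native window named in the comment.
NATIVE_CONTEXT = 32_768


def rope_scaling_factor(target_context: int, native_context: int = NATIVE_CONTEXT) -> float:
    """Return how much the native position range must be stretched."""
    if target_context <= native_context:
        # Within the native window: no scaling needed.
        return 1.0
    return target_context / native_context


def rope_scaling_config(target_context: int) -> Optional[dict]:
    """Build an illustrative scaling config (key names are hypothetical)."""
    factor = rope_scaling_factor(target_context)
    if factor == 1.0:
        return None
    return {
        "rope_type": "yarn",  # assumed scaling method, for illustration only
        "factor": factor,
        "original_max_position_embeddings": NATIVE_CONTEXT,
    }


if __name__ == "__main__":
    for target in (32_768, 65_536, 131_072):
        print(target, rope_scaling_factor(target), rope_scaling_config(target))
```

So 64k means a 2x stretch and 131k a 4x stretch of positions the model was trained on, which is why effective context tends to lag behind a natively long window such as the 256k one.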
