
How large is the context size?



Just 4K. DeepSeek doesn't allow the use of flash attention, which means you can't run quantised qkv.
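
To see why an unquantised cache limits context, here is a rough back-of-the-envelope sketch. It assumes a plain attention layout where each layer stores one K and one V vector per KV head per token. The model shape below is hypothetical and chosen only for illustration (it is not DeepSeek's real config, and DeepSeek's actual cache layout may differ). The 4-bit figure assumes about 4.5 bits per element, i.e. 4-bit values plus a per-block scale.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_ctx, bytes_per_elem):
    """Approximate KV cache size for a standard attention layout."""
    # Factor 2: one K vector and one V vector per layer, per KV head, per token.
    return 2 * n_layers * n_kv_heads * head_dim * n_ctx * bytes_per_elem


# Hypothetical model shape, for illustration only (not a real DeepSeek config).
N_LAYERS, N_KV_HEADS, HEAD_DIM = 60, 128, 128

F16_BYTES = 2.0        # unquantised 16-bit cache
Q4_BYTES = 4.5 / 8     # assumed ~4.5 bits per element for a 4-bit block format

for n_ctx in (4096, 16384):
    f16 = kv_cache_bytes(N_LAYERS, N_KV_HEADS, HEAD_DIM, n_ctx, F16_BYTES)
    q4 = kv_cache_bytes(N_LAYERS, N_KV_HEADS, HEAD_DIM, n_ctx, Q4_BYTES)
    print(f"ctx={n_ctx}: f16 {f16 / 2**30:.2f} GiB, 4-bit {q4 / 2**30:.2f} GiB")
```

The cache grows linearly with context length, so under these assumptions a 16-bit cache costs roughly 3.5x the memory of a 4-bit one at the same context. With a fixed memory budget, that is the gap between a 4K context and a much longer one.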



