Models with Large Context Window
4 points by Roshni1990r on May 9, 2024 | hide | past | favorite | 2 comments
Gradient AI recently announced Llama 3 models with a 1M-token context window for both the 8B and 70B sizes, and they just dropped a 4M-token context window version of the 8B model.

What do you think about this, and about models with large context windows in general?




I use SBERT-type models for ranking, retrieval, classification, etc.

These work properly for documents that fit in the context window, but not for larger ones. You can cut a document into smaller pieces, but it just isn't the same.
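The usual workaround is sliding-window chunking. A minimal sketch of that, stdlib only (the window and stride sizes here are illustrative assumptions, not defaults from any particular SBERT model):

```python
# Illustrative sliding-window chunking for documents that exceed an
# encoder's context window. Window/stride values are assumptions.

def chunk_tokens(tokens, window=512, stride=256):
    """Split a token sequence into overlapping chunks that each fit the
    context window. The overlap (window - stride) preserves some
    cross-boundary context, but long-range links are still lost."""
    if len(tokens) <= window:
        return [tokens]
    chunks = []
    for start in range(0, len(tokens), stride):
        chunks.append(tokens[start:start + window])
        if start + window >= len(tokens):
            break
    return chunks

# Each chunk would then be embedded separately (e.g. with an SBERT model)
# and the per-chunk scores aggregated with max or mean pooling -- which is
# exactly where quality degrades versus encoding the whole document at once.
```

A long-context encoder would make this aggregation step unnecessary, which is the appeal.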

I would like to see longer-context models of that type, and I could accept some extra inference cost for them.

I think summarization-type use cases will benefit from a larger context window, but the computational complexity of a posed problem can grow explosively as a function of problem size: a 10x context window might not mean the model can consistently solve a 10x-sized problem.


I am using Moonshot AI; they claim to have a 2M-token context window.



