
Actually, the latest interfaces use cognitive compression to keep memory inside the context window.

It’s a widely used trick and pretty easy to implement.




Do you know of any chat tools that are publicly documented as using this technique?


No, but we all talk about it behind the scenes, and everyone seems to use some form of it.

Just have the model reflect on and summarize the conversation so far, retaining key concepts based on the trajectory and goals of the conversation. There are a couple of different techniques depending on how much compression you want: key-value pairing for high compression, or full-statement summaries for low compression. There is also a survey method where you have the LLM fill in and update a questionnaire on every new input, with entries like "what is the goal so far" and "what are the key topics".

It’s essentially like a therapist’s notepad that the model can write to behind the scenes of the session.

This also conveniently lets you run topical and intent analytics on these notepads rather than on the entire conversation.
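To make the survey variant concrete, here is a minimal sketch. Everything in it is hypothetical illustration: `llm` stands in for whatever chat-completion call you actually use, and the questions and helper names are made up, not any product's documented implementation.

```python
# Sketch of the "survey" notepad technique: instead of resending the whole
# transcript, re-answer a fixed questionnaire after each user input and
# prepend only those notes to the next prompt.
# NOTE: llm is a placeholder callable (prompt -> str); swap in a real client.

NOTEPAD_QUESTIONS = [
    "What is the goal of the conversation so far?",
    "What are the key topics?",
]

def update_notepad(llm, notepad: dict, user_message: str) -> dict:
    """Have the model revise each survey answer in light of the new input."""
    updated = {}
    for question in NOTEPAD_QUESTIONS:
        prompt = (
            f"Current answer: {notepad.get(question, '(none yet)')}\n"
            f"New user message: {user_message}\n"
            f"In one sentence, update the answer to: {question}"
        )
        updated[question] = llm(prompt)
    return updated

def build_context(notepad: dict, user_message: str) -> str:
    """Build the next prompt from the compressed notes, not the full history."""
    notes = "\n".join(f"- {q} {a}" for q, a in notepad.items())
    return f"Session notes:\n{notes}\n\nUser: {user_message}"
```

The notepad dict is also what you would point your topical/intent analytics at, since it stays small and structured regardless of conversation length.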


Right, I know the theory of how this can work -- I just don't know who is actually running that trick in production.


I'm curious what summarizing prompts or specific verbs (e.g. concise, succinct, brief, etc.) achieve the best "capture" of the context.


“One sentence” does the trick




