
It uses a sliding context window. Older tokens are dropped as new ones stream in.



I don't believe that's the whole story. Other conversational implementations use sliding context windows, and it's very noticeable when context drops off. ChatGPT, by contrast, seems to retain the "gist" of the conversation much longer.


I mean, I explicitly have the LLM summarize content that's about to fall out of the window as a form of pre-emptive token compression. I'd expect they do something similar.
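
A rough sketch of that kind of pre-emptive compression, purely as illustration (this is not a claim about what OpenAI actually does; the summarize helper and the token math are stand-ins):

    # Before the oldest messages fall out of the window, replace them with a
    # summary so the "gist" survives even as raw tokens are dropped.

    def count_tokens(messages):
        # Stand-in for a real tokenizer (e.g. tiktoken); assumes roughly
        # 4 characters per token, which is only an approximation.
        return sum(len(m["content"]) for m in messages) // 4

    def summarize(messages):
        # Hypothetical helper: in practice this would be another LLM call
        # asking the model to condense these messages. Here it just
        # truncates, which is enough to show where that call would go.
        joined = "\n".join(f'{m["role"]}: {m["content"]}' for m in messages)
        return joined[:500]

    def compress_history(messages, max_tokens=3000, keep_recent=6):
        """Summarize older turns so the total stays under max_tokens."""
        if count_tokens(messages) <= max_tokens:
            return messages
        old, recent = messages[:-keep_recent], messages[-keep_recent:]
        note = {"role": "system",
                "content": "Summary of earlier conversation: " + summarize(old)}
        return [note] + recent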


I feel like we're describing short-term vs. long-term memory.


That’s exactly what it is. It turns out you need very good generalized (or narrowly focused) reasoning to do accurate compression, or else the abstraction and movement to long-term memory misses the most important content, or worse, keeps distracting details.

I’ve been working on short- and long-term memory windows at allofus.ai for about 6 months now, and it’s way more complex than I originally thought it would be.

Even if you can magically extend the context window, the added data confuses and waters down the LLM's reasoning. You have to do layered abstraction and compression with goal-based memory for it to keep reasoning without being distracted by irrelevant data.

It’s an amazing realization, almost like a proof that memory is a kind of layered reasoning-compression system. Intelligence of any kind can’t understand everything forever. It has to cull the irrelevant details, process what remains, and reason on a vector that arises from them.
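
For concreteness, here is one way a two-layer, goal-based memory could be sketched. The structure and the compress/relevance helpers are assumptions made up for illustration, not a description of how allofus.ai or anyone else actually does it:

    # Two layers: a short-term buffer of raw turns, and a long-term store of
    # compressed abstractions filtered against the current goal.

    from dataclasses import dataclass, field

    @dataclass
    class LayeredMemory:
        goal: str
        short_term: list = field(default_factory=list)   # raw recent turns
        long_term: list = field(default_factory=list)    # compressed abstractions
        short_term_limit: int = 20

        def add(self, turn: str):
            self.short_term.append(turn)
            if len(self.short_term) > self.short_term_limit:
                batch, self.short_term = self.short_term[:10], self.short_term[10:]
                abstraction = self.compress(batch)
                if self.relevance(abstraction) > 0.2:    # cull irrelevant detail
                    self.long_term.append(abstraction)

        def compress(self, turns):
            # Placeholder for an LLM summarization call keyed to the goal.
            return " / ".join(t[:40] for t in turns)

        def relevance(self, text: str) -> float:
            # Placeholder: a real system might compare embeddings of the text
            # and the goal; keyword overlap is just the cheapest possible proxy.
            overlap = set(self.goal.lower().split()) & set(text.lower().split())
            return len(overlap) / max(len(self.goal.split()), 1)

        def context(self):
            # What would actually be fed back to the model each turn.
            return self.long_term + self.short_term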


Is it unfair to consider this some kind of correlate of the Nyquist theorem, one that makes me skeptical of even the theoretical possibility of AGI?


I consider GPT-4 AGI, so I'm probably not the right person to ask. It reasons, it understands sophisticated topics, it can be given a purpose and pursue it, it can communicate with humans, and it can perform reasonable tasks given its modalities.

I don't really know what sort of "big leap" beyond this people are expecting; incremental performance improvements, sure. But what else?


I guess for me it needs to have active self-reflection and the ability to act independently/without directions. I'm sure there are many other criteria if I think about it some more, but those two were missing from your list.


This is mostly just that the GPT-4 API/app ships with this disabled, not that the model isn't capable of it.

When you enable it, it is pretty shocking. And it's pretty simple to enable: you just give it a meta-instruction telling it to decide when to message you and what to store to introspect on.
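
The commenter doesn't share their setup, so the following is just one guess at what such a loop could look like; the prompt wording and call_llm are placeholders, not anything from OpenAI's docs:

    # Each tick the model sees its stored notes plus new user messages and
    # decides (a) whether to message the user and (b) what to store for
    # later introspection. call_llm is a placeholder for a real
    # chat-completion call; the meta-instruction text is invented.

    import json, time

    META_INSTRUCTION = (
        "You run continuously. Each turn you receive your stored notes and any "
        "new user messages. Reply with JSON of the form "
        '{"message_user": <text or null>, "store": <note to remember or null>}.'
    )

    def call_llm(system_prompt: str, user_content: str) -> str:
        # Placeholder: swap in an actual chat-completion request here.
        return json.dumps({"message_user": None, "store": None})

    def agent_loop(get_new_messages, send_to_user, interval_seconds=60):
        notes = []
        while True:
            payload = json.dumps({"notes": notes, "new": get_new_messages()})
            decision = json.loads(call_llm(META_INSTRUCTION, payload))
            if decision.get("message_user"):
                send_to_user(decision["message_user"])   # model chose to reach out
            if decision.get("store"):
                notes.append(decision["store"])          # model chose what to remember
            time.sleep(interval_seconds)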


As a frequent user of the OpenAI APIs, I don't really know what you are talking about here. Could you point me to some documentation?


At least in 3.5, it's very noticeable when the context drops. They could use summarization, akin to what they do when detecting the topic of the chat, but applied to question-answer pairs in order to "compress" the information. That would require additional calls into a summarization LLM, though, so I'm really not sure it's worth it. Maybe they just drop tokens or snippets they have on a blacklist, like "I want to", or replace "could it be that" with "chance of".
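
Taking that last guess literally, a toy version of blacklist/replacement "compression" might look like this (the phrase lists are invented; there's no evidence OpenAI does anything of the sort):

    import re

    BLACKLIST = ["I want to", "basically", "as an AI language model"]   # invented examples
    REPLACEMENTS = {"could it be that": "chance of", "in order to": "to"}

    def squeeze(text: str) -> str:
        for phrase in BLACKLIST:
            text = re.sub(re.escape(phrase), "", text, flags=re.IGNORECASE)
        for verbose, short in REPLACEMENTS.items():
            text = re.sub(re.escape(verbose), short, text, flags=re.IGNORECASE)
        return re.sub(r"\s+", " ", text).strip()   # collapse leftover whitespace

    print(squeeze("Could it be that I want to basically rephrase this in order to save tokens?"))
    # -> "chance of rephrase this to save tokens?"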



