Right, I should have been clearer than the phrase "active intelligence." But as one use of this: would an enormous context, say 1 to 10 billion tokens used as a system prompt with "just" 32k left for the user, allow a model to be updated every day in between actual training runs? (This is in a future where training takes only a month, or much less.)

I guess part of what I really don't understand is how context tokens compare to training weights in terms of their influence on the final response. Would a giant context window muddle the value of the weights?

(Maybe what I am missing is the human feedback applied to the training weights? If the giant system prompt I am imagining is garbage, then that would be bad.)




In-context learning (ICL) is a thing. People aren't entirely sure how it works[1].

LLMs are very effective at few-shot learning via ICL, so for all practical purposes, yes: large context windows do allow for a form of continuous learning.

Note that the context needs to be loaded and processed on every request to the LLM, though, so all that additional information has to be "retaught" each time.
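
For example, here's a minimal sketch (in Python, with a hypothetical call_llm stand-in for whatever client you actually use) of few-shot prompting via ICL: the labeled examples live only in the prompt, so they have to be resent and reprocessed on every call rather than being baked into the weights.

    # Minimal sketch of few-shot prompting via in-context learning (ICL).
    # `call_llm` is a hypothetical stand-in for whatever LLM API you use;
    # the point is that the examples ride along in the prompt on every call,
    # so they are "retaught" each request rather than learned into the weights.

    FEW_SHOT_EXAMPLES = [
        ("The movie was a delight from start to finish.", "positive"),
        ("I want my two hours back.", "negative"),
        ("A perfectly serviceable, forgettable film.", "neutral"),
    ]

    def build_prompt(query: str) -> str:
        """Prepend labeled examples to the new query; no weights are updated."""
        lines = ["Classify the sentiment of each review."]
        for review, label in FEW_SHOT_EXAMPLES:
            lines.append(f"Review: {review}\nSentiment: {label}")
        lines.append(f"Review: {query}\nSentiment:")
        return "\n\n".join(lines)

    def call_llm(prompt: str) -> str:
        """Hypothetical LLM call; replace with your provider's client."""
        raise NotImplementedError("wire up your LLM client here")

    if __name__ == "__main__":
        # The full few-shot prompt is rebuilt and resent for every query.
        print(build_prompt("Great soundtrack, thin plot."))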

[1] https://openreview.net/pdf?id=992eLydH8G "These results indicate that the equivalence between ICL (ed: in-context learning) and GD (ed: gradient descent) is an open hypothesis, requires nuanced considerations, and calls for further studies."


Thank you so much for your response. It's amazing what typing a little acronym like "ICL" can do as far as sharing knowledge. This is so cool!

Also, your link appears to address my question exactly. It's late here, but I am very excited to do my best at understanding that paper in the morning.



