
He raises an important point: human learning is continuous and to some extent unavoidable, unlike LLMs (and basically most computing models), which freeze at some point and, as amazing as they can appear, do not update with each interaction. And more or less his point is that even if some system did start updating weights after each interaction, whether that counts as learning from mistakes or is just a model with 1,000,001 examples instead of 1,000,000 is debatable.

But the argument that we should not learn while young because "it's not useful to us then" is a ridiculous premise to argue with. Perhaps I misunderstand it. Then there is the claim that things benefit us far down the road, long after the original moment of learning...

I don't understand this line of thinking at all. Linking this to "educability" as a sort of hidden superpower makes me say "oh, really now?".

Like most scientists, he has to keep his ideas on a track with research directions that expand outward, excite, and open up new directions and conversation. He may not be aware of doing this.

Rare is the day when we see a psychologist studying things like how applying a new learning technique in class A affects all the other classes the student takes.

Rare is it that we study flow state and motivation over the long term and find meaningful ways, at a cohort, intra-country level, to increase happiness at school or final grades across the board, rather than in just the one course the researchers focus on. And when that has been done in Scandinavia, which leads in PISA and in happiness among both adults and teens, scientists in the US by and large ignore the research. It's sad.

This is what I don't like about education research and theories. They can be worthwhile research directions, but they get reduced to theorizing rather than application.




There is nothing stopping you, in principle, from having an LLM keep learning on every interaction. The practical benefits are likely small, though.


Yes there is. LLMs "learn" by training, and every new "fact" moves the weights of older learnings. Without a full cross-training run you cannot be sure this movement has not caused past learnings to be forgotten or pushed to absurd limits. That is why everyone does training in stages and doesn't just let the models learn gradually, day by day. For LLMs, all "knowledge" sits in one big container, to be updated all or nothing.

Humans, on the other hand, learn in different containers. You have your core beliefs, which are not touched every time you read or hear something new; it takes a pretty good shake to make anyone "update" their core beliefs. Beyond that, there are several corpuses of knowledge, more or less isolated from one another, that take varying amounts of "influence" to change but largely don't impact each other. Learning new foreign vocabulary, for example, does not really affect your math knowledge.
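A toy sketch of that first point (plain numpy, nothing like a real LLM; the model and data are made up just to show that one gradient step on a "new fact" moves every weight, so behaviour on old examples drifts too):

    # toy model: all "knowledge" lives in one weight matrix W
    import numpy as np

    rng = np.random.default_rng(0)
    W = rng.normal(size=(4, 4))

    def loss(W, x, y):                  # squared error on one example
        return float(((W @ x - y) ** 2).sum())

    def grad(W, x, y):                  # gradient of that loss w.r.t. all of W
        return 2 * np.outer(W @ x - y, x)

    x_old, y_old = rng.normal(size=4), rng.normal(size=4)  # "learned" earlier
    x_new, y_new = rng.normal(size=4), rng.normal(size=4)  # the new fact

    before = loss(W, x_old, y_old)
    W = W - 0.1 * grad(W, x_new, y_new)  # one step of learning the new fact
    after = loss(W, x_old, y_old)
    print(before, after)                 # the old example's loss moved as well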

Note that an LLM's chat "context" is not learning; it's a temporary effect that is lost as soon as the chat is closed.


What about fine-tuning?


Fine-tuning is not learning, it's controlling the response, and you can see its absurd effects in countless examples where, in the name of being politically correct, the "weights" have been modified for the past as well (the classic example is Gemini's German Nazi "representative" photo, or even more: https://art-for-a-change.com/blog/2024/02/gemini-artificial-...)


The mechanism for fine-tuning and for the original training is exactly the same (gradient descent on the weights). The effects you describe are a result of what data is used for the fine-tuning.
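A minimal sketch of what I mean, assuming a PyTorch-style setup (the tiny linear "model" and the random data are stand-ins, nothing real): the same update step serves as both "pre-training" and "fine-tuning", and only the data stream differs.

    import torch

    model = torch.nn.Linear(8, 8)                       # stand-in for an LLM
    opt = torch.optim.SGD(model.parameters(), lr=1e-2)

    def train_step(x, y):
        opt.zero_grad()
        loss = torch.nn.functional.mse_loss(model(x), y)
        loss.backward()                                  # gradients on all weights
        opt.step()
        return loss.item()

    pretrain_data = [(torch.randn(8), torch.randn(8)) for _ in range(100)]
    finetune_data = [(torch.randn(8), torch.randn(8)) for _ in range(10)]

    for x, y in pretrain_data:   # "pre-training"
        train_step(x, y)
    for x, y in finetune_data:   # "fine-tuning": same mechanism, different data
        train_step(x, y)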


The mechanism is the same (that's why it impacts all weights), but the target of the gradient descent is not the same. In fine-tuning they aren't saying "go down the mountain" anymore, but "go down toward this plateau", and of course that changes the gradient.

is it "learning" in a certain sense? sure, the same way like all indoctrinations are sold as "teaching". the model "learned" to be rapresentative but forgot what 1940 nazi soldier was like...

And fine-tuning is not a scalable approach because it has human feedback in the loop. Could they fix this error with more fine-tuning? Yes, and they tried, but then users simply asked "give me a picture of Viking warriors" and the problem was there again. You can't fine-tune everything, even if we assume the purpose is always noble.


I think all this ideological stuff is completely unrelated to the issue we started talking about. You can fine-tune a large model on whatever data you want, so you can also fine-tune it on the most recent user inputs. You can do unsupervised fine-tuning, by the way; there's no need for a human in the loop. It all depends on what you want to achieve.
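For instance, a rough sketch with Hugging Face transformers (the model name and the sample inputs are only placeholders): unsupervised fine-tuning here just means next-token prediction on raw recent user text, with no human labels anywhere in the loop.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    name = "gpt2"                                       # stand-in for a larger model
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(name)
    opt = torch.optim.AdamW(model.parameters(), lr=5e-5)

    recent_user_inputs = [                              # hypothetical recent inputs
        "My name is Bernard and I work on compilers.",
        "Please answer in French from now on.",
    ]

    model.train()
    for text in recent_user_inputs:
        batch = tok(text, return_tensors="pt")
        out = model(**batch, labels=batch["input_ids"])  # causal LM loss, no labels needed
        out.loss.backward()
        opt.step()
        opt.zero_grad()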


Popular LLMs for public use already do this. ChatGPT with a Pro account remembers context from previous interactions.

It's simple, really: they just have the context inserted as part of the input. It's part of the prompt. The prompt is invisible to you as a user.

input:

   "You are an LLM with long term memory saved as json objects. When talking to the user you will receive a query from the user and some json memory as context. Add to this memory as you communicate. If the json object exceeds 4000 entries trim 100 entries that seem less important. The users query begins now:

   {}

   Hi my name is Bernard. 
   "
Output:

   "
   {"user name": "Bernard"}

   Hi Bernard, How are you?
   "
The context is extracted from the output and given as input to other conversations.
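Roughly like this (call_llm is a placeholder for whichever chat API is used, and the prompt format is only illustrative):

    import json

    SYSTEM = ("You are an LLM with long term memory saved as JSON. "
              "Reply with the updated JSON on the first line, then your answer.")

    def chat(user_msg, memory, call_llm):
        prompt = f"{SYSTEM}\n\n{json.dumps(memory)}\n\n{user_msg}"
        reply = call_llm(prompt)               # the full prompt is invisible to the user
        first_line, _, answer = reply.partition("\n")
        try:
            memory = json.loads(first_line)    # pull the updated memory back out
        except json.JSONDecodeError:
            pass                               # model didn't follow the format
        return answer.strip(), memory

    memory = {}
    answer, memory = chat(
        "Hi my name is Bernard.", memory,
        call_llm=lambda p: '{"user name": "Bernard"}\nHi Bernard, how are you?')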


Remembering context is not the same thing as learning.


There is the concept of in-context learning, though, which in the right scenario maybe could count.
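For example, with a few-shot prompt like the made-up one below, a model that completes the last line with "oiseau" has picked up the task purely from the context, with no weight update, and forgets it the moment the context is gone:

    Translate English to French.
    cat -> chat
    dog -> chien
    bird ->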



