What Does It Mean to Learn? (newyorker.com)
143 points by wallflower 14 days ago | 82 comments




Read Gadamer’s Wahrheit und Methode (unless you have a pathological aversion to continental philosophy in any form). All understanding is hermeneutic and thus infinitely wheeling us around the hermeneutic circle. It’s the entry-point (without an exit) to answering Plato’s questions about questions. It’s how we learn and deepen language understanding. It’s how we interpret texts. It’s how science works, except that science enshrines its Methode as a sine qua non. It’s how we come to understand other people and why we can be continually surprised (or not!) by them.

The real shame is that academic fashion jerked violently in the direction of Derrida and company in the mid-to-late-60s before anyone had time to really dwell with and appreciate the power of Gadamer’s philosophical hermeneutics. His masterpiece was only published in 1960, when he himself was 60! It’s a mature and deep reflection on themes he had been studying for 40 years. Ricoeur grasped its power, of course, but the whole Habermasian-Derridean-Foucauldian critical project flavored Ricoeur’s approach. Richard Rorty gestured in similar directions, with less depth (and certainly far less phenomenological power).

But invest the time to read Gadamer. It’s worth it.


Your words are beautiful, yet devoid of any information. I hope Gadamer writes differently :)

Information I see:

I think all understanding and therefore all learning is hermeneutic.

I was persuaded by Hans-Georg Gadamer’s magnum opus, Truth and Method.

I think his theory, which he called “philosophical hermeneutics,” is a skeleton key for understanding our understanding.

Right around the time he published the book, there was a large shift in academic fashion towards critical theory and deconstructionism, of which Gadamer is neither.

The result is that Gadamer and his work took a backseat to critical theory.

We lost something important in that shift.

Others recognized the importance of Gadamer’s work but diluted it by trying to merge it with their pet theories (Ricoeur with critical theory, Rorty with American-style pragmatism).

——————————

That said, Gadamer’s work is dense but not impenetrable. And it is beautiful. And profound. Go forth and read!


Of course there’s information there. Perhaps - to take a note from the article - you’re not ready for it at this point in your life.

That sounds like the sort of response I'd expect from a mystery religion or cult. As opposed to, say, expanding part of the contained information into a form that's comprehensible to non-initiates.

Here: https://en.m.wikipedia.org/wiki/Hermeneutic_circle

As for the rest, you really do have to read the book. (There’s no Cliffs Notes version.)


Something like (in software, trying to understand) a spaghetti monolith?

Adding cultural context won't make water flow faster through a tube, or allow planes to fly more efficiently, but you can make the pipes and planes really pretty I guess. Definitely something to read after you are 60.

Yes, if you want something with a phenomenological flavor that discusses the impact of technological development on human experience in a progressive manner, I suggest Walter Benjamin, who was Heidegger's classmate and developed his philosophy in opposition to him. Here is a good introductory article about him [0]

[0]https://monoskop.org/images/6/6d/Benjamin_Walter_1936_2008_T...


I… don’t know what point you’re trying to make, but I’m pretty sure it’s not relevant to anything I said above.

Would new developments (or old developments that had gone underappreciated) in an adjacent STEM field count as "added cultural context"?

The postmodern scourge corroded the entire discursive landscape!

All can be torn down, and Foucault is the arms-dealer of modernity


My favorite thing a history professor ever said to me about Foucault was “well we don’t really read him for figuring out the truth. We just read him for provocative ideas.”

I might check it out. I'll stick with Benjamin for now.

He raises an important point: human learning is continuous and to some extent unavoidable, unlike LLMs (and basically most computing models), which freeze at some point and, as amazing as they can appear, do not update with each interaction. (And his point, more or less, is that even if some system did start updating weights after each interaction, whether that is learning from mistakes or just a model with 1,000,001 examples instead of 1,000,000 is debatable.)

But the argument that we should not learn while young because "it's not useful to us then" is a ridiculous premise to argue with. Perhaps I misunderstand it. Then there's the claim that things benefit us far down the road, long after the original moment of learning...

I don't understand this line of thinking at all. Linking this to "educability" as a sort of hidden superpower makes me say "oh, really now?".

Like most scientists, he has to focus his ideas on a track of research directions that expand outward, excite, and provide new directions and conversation. He may not be aware of doing this.

Rare is the day when we see a psychologist studying things like how applying a new learning technique to class A affects all the other classes the student takes.

Rare is it that we study flow state and motivation over the long term and find meaningful ways, at a cohort or intra-country level, to increase happiness at school or final grades across the board, rather than in just the one course the researchers focus on. And when that has been done in Scandinavia, which leads in PISA and in happiness among adults and teens, scientists in the US by and large ignore the research. It's sad.

This is what I don't like about education research and theories. They can be worthwhile research directions, but they get diminished to theorizing rather than application.


There is nothing in principle stopping an LLM from learning on every interaction. The practical benefits are likely small, though.


Yes there is. LLMs "learn" by training, and every new "fact" moves the weights of older learnings. You cannot be sure that this movement hasn't caused past learnings to be forgotten or pushed to absurd limits without a full cross-training pass. That is why everyone does training in stages and doesn't just let the models learn gradually, day by day. For an LLM, all "knowledge" sits in one big container to be updated all or nothing.

Humans, on the other hand, learn in different containers. You have your core beliefs, which are not touched every time you read or hear something new; it takes a pretty good shake to make anyone "update" their core beliefs. Beyond that, there are several corpora of knowledge, more or less isolated from one another, that take varying amounts of "influence" to change but that more or less don't impact each other. For example, learning new foreign vocabulary does not really affect your math knowledge.

Notice that an LLM's chat "context" is not learning; it's a temporary effect that is lost as soon as the chat closes.
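
To make the forgetting effect concrete, here is a toy sketch (assuming PyTorch is installed; a two-task regression stand-in, nothing like production LLM training): a small network is trained on task A, then trained only on task B, and its task-A error blows up because both tasks share the same weights.

    import torch
    import torch.nn as nn

    torch.manual_seed(0)

    def make_task(fn, n=256):
        # 1-D regression task: learn y = fn(x) on [-3, 3]
        x = torch.linspace(-3, 3, n).unsqueeze(1)
        return x, fn(x)

    def train(model, x, y, steps=2000, lr=1e-2):
        opt = torch.optim.Adam(model.parameters(), lr=lr)
        for _ in range(steps):
            opt.zero_grad()
            loss = nn.functional.mse_loss(model(x), y)
            loss.backward()
            opt.step()

    model = nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, 1))
    xa, ya = make_task(torch.sin)  # "old knowledge"
    xb, yb = make_task(torch.cos)  # "new facts"

    train(model, xa, ya)
    before = nn.functional.mse_loss(model(xa), ya).item()
    train(model, xb, yb)           # keep training on task B only
    after = nn.functional.mse_loss(model(xa), ya).item()

    print(f"task-A loss before training on B: {before:.4f}")
    print(f"task-A loss after  training on B: {after:.4f}")  # typically much worse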


What about fine-tuning?

Fine-tuning is not learning; it's controlling the response, and you can see its absurd effects in countless examples where, in the name of being politically correct, the "weights" have been modified for the past as well (the classic example of Gemini's German Nazi "representative" photo, or even more: https://art-for-a-change.com/blog/2024/02/gemini-artificial-...)

The mechanism for fine-tuning and the original training is exactly the same (gradient descent on weights). The effects you describe are the result of what, exactly, the model is fine-tuned on.

The mechanism is the same (that's why it impacts all weights), but the target of the gradient descent is not. In fine-tuning they aren't saying "go down the mountain" anymore, but "go down toward this plateau," and of course that changes the gradient.

is it "learning" in a certain sense? sure, the same way like all indoctrinations are sold as "teaching". the model "learned" to be rapresentative but forgot what 1940 nazi soldier was like...

And fine-tuning is not a scalable approach, because it has human feedback in the loop. Could they fix this error with more fine-tuning? Yes, and they tried, but then users simply asked "give me a picture of Viking warriors" and the problem was there again. You can't fine-tune everything, even if we assume the purpose is always noble.


I think all this ideological stuff is completely unrelated to the issue we started talking about. You can fine-tune a large model on whatever data you want, and so you can also fine-tune it on the most recent user inputs. You can do unsupervised fine-tuning, by the way; there's no need for a human in the loop. It all depends on what you want to achieve.

Popular LLMs for public use already do this. ChatGPT with a Pro account remembers context from previous interactions.

It's simple, really: they just have the context inserted as part of the input. It's part of the prompt. The prompt is invisible to you as a user.

input:

   "You are an LLM with long term memory saved as json objects. When talking to the user you will receive a query from the user and some json memory as context. Add to this memory as you communicate. If the json object exceeds 4000 entries trim 100 entries that seem less important. The users query begins now:

   {}

   Hi my name is Bernard. 
   "
Output:

   "
   {"user name": "Bernard"}

   Hi Bernard, How are you?
   "
Context is extracted from the output and is given as input into other conversations.
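
A minimal sketch of that pattern in Python; `call_model` is a placeholder for whatever LLM client you'd actually wire up, and the memory format here is illustrative, not any vendor's real protocol:

    import json

    SYSTEM = ("You are an assistant with long-term memory stored as a JSON "
              "object. The first line of every reply must be the updated "
              "memory as one JSON object; the rest is your visible answer.")

    def call_model(prompt: str) -> str:
        raise NotImplementedError("plug in a real LLM client here")

    def chat_turn(memory: dict, user_msg: str):
        # The hidden prompt carries the memory; the user never sees it.
        prompt = f"{SYSTEM}\n\nMemory: {json.dumps(memory)}\n\nUser: {user_msg}"
        output = call_model(prompt)
        first_line, _, reply = output.partition("\n")
        try:
            memory = json.loads(first_line)   # updated memory from the model
        except json.JSONDecodeError:
            pass                              # model broke format; keep old memory
        return memory, reply.strip()

    # With a real client: memory, reply = chat_turn({}, "Hi my name is Bernard.")
    # memory might then be {"user name": "Bernard"}, carried into the next turn.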


Remembering context is not the same thing as learning.

There is a concept of in-context learning, though, where in the right scenario it maybe could be.

I don't know what "to learn" means. But I know what it feels like. It feels hard. It feels like sucking at the thing I am learning. It feels like, if I stick at it, I might be good in twenty years. It feels like loving doing things I do badly. YMMV.


> It feels like loving doing things I do badly.

If false confidence gives you the ability to learn, the Dunning-Kruger effect isn't so bad after all.


I do things badly and am self aware enough to know it. Dunning-Kruger would require a lack of that self-awareness and in its place a belief that I had expertise.

It would be odd to say you have a thorough understanding of what you're doing badly though, no?

Being aware of your unknown unknowns is the opposite of the Dunning-Kruger effect, though. That would be denying their existence.

“Knowledge is not free; you have to pay attention” (Feynman)


You have to give up a piece of sanity too. Because knowledge ain't real. And, for the true philo of sophy, there's a tipping point.

(To paraphrase HP Lovecraft.)


We live in an age where learning has become an applied-mathematics problem. The rigor of mathematics brings us closer than ever to a formal definition of what learning actually is.

We do know that modelling learning involves fitting a multidimensional curve to a set of data points. There are multitudes of ways to do curve fitting. Our biggest issue, however, is imitating the way the human brain does it; we still don't know how to learn like a human. But we do know that, in its most basic form, learning can be as simple as fitting a straight line to a set of two-dimensional data points.
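
For instance, that most basic form, fitting a straight line to noisy two-dimensional points, is a few lines with numpy:

    import numpy as np

    rng = np.random.default_rng(0)
    x = np.linspace(0, 10, 50)
    y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=x.size)  # noisy y = 2x + 1

    slope, intercept = np.polyfit(x, y, deg=1)  # least-squares line fit
    print(f"learned: y = {slope:.2f}x + {intercept:.2f}")   # ~ y = 2.00x + 1.00

Everything from there up to deep networks is, in this framing, the same exercise with more dimensions and fancier curves.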


There are many formal definitions of (machine) learning. For example, see:

A Theory of the Learnable, L. G. Valiant, 1984

https://web.mit.edu/6.435/www/Valiant84.pdf

Also:

Statistical Learning Theory, Vladimir N. Vapnik, 1998

https://www.wiley.com/en-us/Statistical+Learning+Theory-p-97...

And:

Language Identification in the Limit, E. Mark Gold, 1967

https://www.sciencedirect.com/science/article/pii/S001999586...
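
As a flavor of how the PAC framework makes "learnable" precise, here is the textbook sample-complexity bound for a finite hypothesis class (a standard corollary of the framework, not code from the paper): m >= (1/eps) * (ln|H| + ln(1/delta)) examples suffice so that, with probability at least 1 - delta, any hypothesis consistent with the sample has true error at most eps.

    import math

    def pac_sample_size(h_size: int, eps: float, delta: float) -> int:
        # m >= (ln|H| + ln(1/delta)) / eps
        return math.ceil((math.log(h_size) + math.log(1 / delta)) / eps)

    # 1M hypotheses, 5% error tolerance, 1% failure probability:
    print(pac_sample_size(10**6, 0.05, 0.01))  # -> 369 examples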


What are those dimensions?

In its most fundamental form, those dimensions represent two quantities. Those quantities can be modelled as numbers in mathematics or given arbitrary symbolic, categorical names.

At a minimum we need two dimensions: at least one for input and at least one for output.


Leslie Valiant was on Sean Carroll's podcast to talk about the same topic. https://www.youtube.com/watch?v=FHW-nBIZc2g


> knowledge almost never arrives at the moment of its application

Anyone who has used StackOverflow (or ChatGPT) to solve a programming problem knows this is completely false. Has the author never heard of “learn by doing”?


Mimicry is not mastery.

A correct answer does not necessarily imply understanding.

Code reviews and incident reports should prove that true.


Mimicry is not mastery, but mimicry can lead to mastery.

Disagree. It's when the mimicry fails and the executor is forced to think about why and then draws a new conclusion - that right there is learning and is what eventually leads to mastery.

How can mimicry fail if the mimicry wasn't attempted? Mimicry again leads to mastery.

Additionally, not all learning requires some deeper understanding. Sometimes the attempt to mimic is all that there is.


Experience vs Education.


Experience is part of education.


At one level, this is word games. I wouldn't personally construct a sentence with "educability" in it very often, but I have read them, and the sense of the word is not a million miles from many of the others cropping up in this thread which go to the capacity to learn, distinct from the amount of knowledge acquired.

Personally, I think there's a set of outcomes here. To acquire knowledge usefully demands both retention and some ability to apply abstraction/synthesis to the knowledge acquired. Simply being a repository of knowledge is not in itself usually useful, unless you want to win a competition for the most digits of pi or the ability to recite the Koran from memory; but latent within that ability are other skills which we often decry as "not intelligent" or "simply parroting facts". The number of facts you can recall isn't useful in the same way that synthesis over those facts is.


Or hey, the courage to even act upon the most basic of conclusions. Being useful is very different from being aware or intelligent. It's sometimes more useful to be ignorant and brave than scholarly and asleep high in an ivory tower.

Knowledge, or its synthesis into derivative knowledge, has nothing to do with utility, I think? It's sometimes pointless to acquire knowledge, and useful to act upon the world. So I don't really understand what you meant by "useful" there.

What is the goal to reach ?


There's interior applicability to knowledge: it feeds thought and the inductive reasoning process. If you have never been exposed to TV, being exposed to TV changes how you think.

There's exterior applicability to knowledge: feed a man a fish vs teach a man to fish.

I see both as "utility" although I suppose achieving nirvana is seen as utility in other people's domains. It's subjective.


Learning by analogy is one of the most powerful tools that humanity has. Using a rod to get a lure where you want it might cross over later into getting a rope across a ravine.

> In his new book, “The Importance of Being Educable,” he argues that it’s key to our success. When we think about what makes our minds special, we tend to focus on intelligence. But if we want to grasp reality in all its complexity, Valiant writes, then “cleverness is not enough.”

Isn't this IQ? "Educable" seems like the latest iteration of 'multiple intelligences' or the 'street smarts' vs. 'book smarts' distinction.


IQ is just the wrong way to approach the subject, to be honest. Metacognition plays the biggest role in learning by far and is trainable. But in order to think properly about your thinking, you need the requisite declarative, procedural, and conditional knowledge.

Most people, including "intelligent" people, have no clue how learning works. Learning scientists actually study students' perceptions of their learning, and learners are terrible at selecting appropriate learning techniques. Even when forced to use effective learning techniques, they rate the effective strategies (as determined by their outcomes) as least effective.

We need to start teaching people how learning works to begin with.


> But isn't the ability to learn also a major component of IQ?

Not sure what you mean by this - it's certainly not something a single-day IQ test could possibly measure! The primary reason IQ is a discredited measure of intelligence is that people are perfectly able to learn how to perform better on IQ tests - any supposed influence of "intelligence" on IQ scores is hopelessly confounded with how much you've practiced similar tests / trivial logic puzzles / etc.

This stuff about "attempts to broaden the definition of intelligence to something that is more inclusive" is backwards. The whole problem is that nobody has managed to scientifically define (let alone measure) intelligence in any vertebrate species. IQ is dangerously misleading precisely because it is so narrow: its precision makes claims about IQ seem quantitatively rigorous when they are qualitatively meaningless.


The ‘ability to learn’ idea may not be so novel, but I'm not sure I agree with calling it ill-defined, or with saying it has an agenda for more inclusivity. Generally, in schools, at least in my experience, it is still mostly abstract thinking and memory capacities that are measured. Curiosity, and the ability to turn that curiosity into new knowledge, also broadens the mind (in addition to abstract thinking and remembering things), and can be an enormous quality that is highly valued on the job market, arguably more than what is mostly measured at school. I think that's the point the author tries to make.


While this article may be making a good point, its construction is quite weak. It reminds me of something an old reporter once told me: "Any article whose headline is a question fails to answer that question." Corollary: if the article had a point, it would have stated the point in the headline. But the writer didn't do the work, so it has no point to make.

Sidenote: it doesn't seem to me that the writer understands AI, its moving frontier, and its likely near-term abilities.


> This sounds chancy and vague, until you reflect on the fact that knowledge almost never arrives at the moment of its application. You take a class in law school today only to argue a complicated case years later; you learn C.P.R. years before saving a drowning man; you read online about how to deter a charging bear, because you never know.

Law students write papers, ER students do CPR on dummies, and when I learn any new concept, framework, or language, I build something with it. Don't toy and learning projects count?


Yes, but that’s like a robot learning in a simulation (a field which is rapidly accelerating now).

No matter how accurate the simulation, the real world is the true test. One could easily see a scenario where a robot is trained to handle a rare event and may or may not ever use that training.


'Educability' sounds a bit like the ancient Chinese saying that only empty cups can be filled:

> The "Empty Cup" or "The Empty Vessel" parable: 'A cup that is already full, whether with knowledge, opinions, or experiences, cannot be filled again. It is only the empty cup that can truly learn and absorb new information.'


And then there's the oft-repeated 'intelligence is compression'.


No, it’s mostly realizing that the cup was not as full as previously thought, either by discarding wrong notions or by discovering new themes to explore. In the end, we understand that there’s no cup at all.


The modern version of this is 'Stay Hungry, Stay Foolish', i.e., hungry enough to work harder and foolish enough to learn more.

This article doesn't do a good job of getting at the main points of Valiant's book Educability, in my view. You can see some of them in e.g. this talk he gave here: https://www.youtube.com/watch?v=W4fIoLGjFtM

He makes various arguments in the book that I disagree with, two of which I've put below. On the whole I think it is directionally correct though, and worth reading.

The first quibble I have is about humanity's most characteristic trait. In the book, he writes: "The mark of humanity is that a single individual can acquire the knowledge created by so many other individuals. It is this ability to absorb theories at scale, rather than the ability to contribute to their creation, that I identify as humanity's most characteristic trait".

I don't think that the ability to acquire knowledge from other people is our most characteristic trait. Creativity is. Learning is a form of knowledge-creation, and it is a creative process. We don't passively "absorb" theories when we learn from someone else. Instead, we actively look for and attempt to resolve problems between our existing ideas and the new ideas, to create something new.

Another thing I disagree with is when he touches on AGI. He makes the argument that "we should not be fearful of a technological singularity that would make us powerless against AI systems". This is because it will "asymptote, at least qualitatively, to the human capability of educability and no more".

This is reminiscent of David Deutsch's argument that people are universal explainers, and AGI will also be a universal explainer; there is nothing beyond such universality, so they will not fundamentally be different from us (at least, there is nothing that they could do that we couldn't in principle understand ourselves).

I think this is true, but it misses something. It doesn't address the point that there is a meaningful difference between a person thinking at 1x speed (biological human speed) and a person thinking at e.g. 100000x speed (AGI running on fast hardware). You can be outsmarted by something that wants to outsmart you, even if you both possess fully universal educability/creativity, if it can generate orders of magnitude more ideas than you can per unit time. Whether we should be fearful or not about this is unclear, but I do think it is an important consideration.

His overall message though, is good and worth pondering: "Educability implies that humans, whatever our genetic differences at birth, have a unique capability to transcend these differences through the knowledge, skills, and culture we acquire after birth. We are born equal because any differences we have are subject to enormous subsequent changes through individual life experience, education, and effort. This capacity for change, growth, and improvement is the great equalizer. It is possible for billions of people to continuously diverge in skills, beliefs, and knowledge, all becoming self-evidently different from each other. This characteristic of our humanity, which accounts for our civilization, also makes us equal."


> there is nothing that they could do that we couldn't in principle understand ourselves

It is trivial to prove otherwise: AlphaGo's move 37. After 4,000 years of gameplay (yes, the game is that old!), we still hadn't reached this level of insight into its strategy.

The core ability of humans might not be learning but search. Creativity is just an aspect of search. AlphaZero was both searching and learning. It's what we do as well, we search and learn. Science advances by (re)search. Art searches for meaningful expression. Even attention is search. Even walking is - where will I place my next foot?

Why is search a better concept than creativity? Because it specifies both the search space and the goal. Creativity, intelligence, understanding, and consciousness - all of them - specify only the subjective part, omitting the external, objective part. Search covers both; it is better defined, even scientifically studied.


AlphaGo was an example of "learn by doing", not education.


That's weird; when I went and got an education, a huge portion of it was 'doing'.

Fair, education is the wrong word, really. That said, there's a difference between learning by exercises (AlphaGo) and learning "in production". Synchronous vs. asynchronous, or something; is there a better name?

> It is trivial to prove otherwise: AlphaGo's move 37. After 4,000 years of gameplay (yes, the game is that old!), we still hadn't reached this level of insight into its strategy.

Are you saying that AlphaZero contains knowledge that we can't understand, even in principle? It is somehow beyond science, beyond all explanation?

> Why is search a better concept than creativity?

Search suggests a fixed set of options, whereas what is crucial is creating new ones.


A very accomplished older professor once told me in grad school that “research” was a process of first intuiting patterns, and then “searching” for further examples of said pattern, and then “re-searching” until you had statistical confirmation.

I very much agree.


You don't keep searching until a point of "statistical confirmation". This implies you have arrived at an infallible truth. Instead you look for ways you could be wrong, and try to correct any errors you find.

For instance, if you guess 'all swans are white', you don't ever get "statistical confirmation" that your guess was right. When you eventually see a black swan, you find out you were wrong. Then it's time to come up with a new theory.


There are very large search spaces. Consider the space of all text documents (Borges's Library of Babel), which includes all research papers and all novels. Also, the space of all mathematical theorems, the space of all images, all videos, all songs, whatever evolution searches over.

These cover many creative activities.

But it's true that some search spaces are less well-defined.
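
For a sense of scale: Borges fixes each book at 410 pages of 40 lines of 80 characters over a 25-symbol alphabet, which pins down the size of that (finite, but absurd) search space:

    import math

    chars_per_book = 410 * 40 * 80                # 1,312,000 characters
    digits = chars_per_book * math.log10(25)      # log10 of 25 ** 1,312,000
    print(f"{chars_per_book:,} characters per book")
    print(f"roughly 10^{digits:,.0f} distinct books")  # ~ 10^1,834,097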


I think one thing we've learned in the past five years or so is how inadequate our vocabulary is for describing all the different aspects of intelligence and consciousness (and really just psychology in general). Everything is so handwave-y. What is educability, exactly, in a formal sense - how could we quantify and measure it? Mostly we've tried to answer questions like that by writing tests and trying to tease out some reliable measure from them, but that requires so many layers of indirection - it would be much better to examine the internal state and activity of a "thinking system" directly, something that is rarely possible in humans.

I think one way to show that the tests are inadequate is to read people's responses when those tests are applied to AI. People insist that the tests simply don't measure in AI what they're supposed to measure in people, and for all anyone knows, they may be right - but _why_? What _exactly_ are those tests measuring, and how could we measure it in a way that _would_ apply to artificial intelligence?

These are philosophical questions that, despite our best efforts, have never really transitioned into a true science, and philosophy has been working on them for thousands of years. We've been hamstrung by the fact that, as far as we knew, we were the only intelligent beings in the universe, so it's extremely difficult to take any aspect of "the way we think" and separate it out by finding some system that thinks in some ways like us but in other ways doesn't. It's really only since we've had large neural networks that anything has approached the way we think in _any_ aspect, so this is probably a once-in-history opportunity to formalize and systematize all of this.


> it would be much better to examine the internal state and activity of a "thinking system" directly

The particular problem here is complexity management. The 'problem' with thinking is that it is a system, very possibly one of the largest and most complex systems we know about. Thinking as we know it in DNA-based life is at least 500 million years old and developed a few bits at a time. There is no unwinding its different components from each other: motor skills, reactions, learning, etc. are all compressed and mixed together in the same code.

So, going from the top down isn't working. But going the other way isn't working either: a system that gets anywhere close to a biological thinking system requires quadrillions of calculations.


> I don't think that ability to acquire knowledge from other people is our most most characteristic trait. Creativity is.

Chimps (and some other animals) can be very creative: https://www.youtube.com/watch?v=fPz6uvIbWZE


“At least qualitatively” is doing a lot of work in that sentence. A computer has the same capability as a human to do any computable algorithm, “at least qualitatively”. But Google search (to name one example) is so far beyond human practical ability that calling it “qualitatively” equivalent is not useful.


This is blogspam for a book:

Leslie Valiant, an eminent computer scientist who teaches at Harvard, sees this as a strength. He calls our ability to learn over the long term “educability,” and in his new book, “The Importance of Being Educable,” he argues that it’s key to our success.


This is commentspam for self-importance.


You're absolutely right, and it's a trollish question to make people think they have a novel definition of something everyone already knows.

People really hate someone pointing out that they have been manipulated by clickbait and cheap marketing though.


They just hate being manipulated.

So by your measure, would Hacker News be spam for websites and blogs?


Admit it, you just wanted an excuse to say blogspam


To learn is to teach one's self!


language blocks learning because it reduces thought into words

shameless plug: AIs don't learn, they get indoctrinated

https://x.com/edulix/status/1827493741441249588




