
Bizarre article. Just a rant from someone incredibly out-of-touch and who is missing the forest for the trees.

"The human mind is not, like ChatGPT and its ilk, a lumbering statistical engine for pattern matching, gorging on hundreds of terabytes of data and extrapolating the most likely conversational response"

We don't know that! It very well could be. Think of all the data that has entered all your senses in your entire lifetime. More than goes into ChatGPT, I'll tell you that. Plus, you synthesize information by being corporeal so you have a tight feedback loop. LLMs could well be a foundational part of AI technology as well as an accurate analog for some of the brain's behavior.

A small part of the point, but bringing up this "hardcoded" response of it not offering political opinions as any kind of evidence of its theoretical capability is beyond silly.




This is arguably a bizarre and out-of-touch comment too, which merely adds fuel to a fire blazing in the comments section of HN, which is not particularly reputable for its opinions on anything except software (and even then is frequently rather questionable).

^ I hasten to add: some snark intended for effect

It’s a NYT Opinion piece, which means it doesn’t come with citations. Let’s not ignore the medium and its conventions here.

It is a bummer that such a weighty argument was in fact conveyed in this citation-free medium, given that Chomsky is engaging with such a weighty subject.

But that is an entirely distant matter.

And it would probably be far more productive to step back and realize the limitations of the medium and instead ask “what are the citations here?” (or seek them out for oneself, or ask for help finding them) and then seek to evaluate them on their specific merits; as opposed to choosing the least charitable interpretation and effectively resorting to an ad hominem (“this man is out of touch; I’m done here.”) or merely saying “we don’t know that!” (ibid.) without any apparent reference to any kind of thoughtful or careful literature regarding the subject at hand.

Unless you too are an established academic with decades of research in a field which is profoundly cognate to neuroscience?


??? I wasn't talking about citations at all?


You are questioning Chomsky’s premise, which is almost certainly supported by implicit citations (that do not appear due to the medium they are presented in); your arguments, though not entirely unreasonable, presumably are not so supported.


s/distant/different


>> Think of all the data that has entered all your senses in your entire lifetime. More than goes into ChatGPT, I'll tell you that.

The question is how much of that was text data, or language at all. The answer is: not that much, really. Chomsky's famous point about "the poverty of the stimulus" was based on research that showed human children learn to speak their native languages from very few examples spoken by the adults around them. They certainly don't learn from many petabytes of text, as in the entire web.

If you think about it, if humans relied on millions of examples to learn to speak a language we would never have learned to speak in the first place. Like, back whenever we started speaking as a species. There was certainly nothing like human language back then, so there weren't any examples to learn from. Try that for "zero-shot learning".

Then again, there's the issue that there are many, many animals that receive the same, or even richer, "data" from their senses throughout their lives, and still never learn to speak a single word.

Humans don't just learn from examples, and the way we learn is nothing like the way in which statistical machine learning algorithms learn from examples.


Thinking about it as "text data" is both your and Chomsky's problem -- the >petabytes of data aren't preprocessed into text. They're streams of sensory input. It's not zero shot if it's years of data of observing human behavior through all your senses.

Other animals receiving data and not speaking isn't a good line of argument, I think. They could have very different hardware or software in their brains, and have completely different life experiences and therefore receive very different data. Notably, animals and humans do have much potentially learned (or learned through evolution) behavior in common, such as pathfinding, object detection, hearing, and high-level behaviors like seeking food and whatever else.


>> Thinking about it as "text data" is both your and Chomsky's problem -- the >petabytes of data aren't preprocessed into text. They're streams of sensory input. It's not zero shot if it's years of data of observing human behavior through all your senses.

I'm a little unsure what you mean. I think you mean that humans learn language not just from examples of language, but from examples of all kinds of concepts in our sensory input?

Well, that may or may not be the case for humans, but it's certainly not the case for machine learning systems. Machine learning systems must be trained with examples of a particular concept, in order to learn that concept and not another. For instance, language models must be trained with examples of language, otherwise they can't learn language.

There are multi-modal systems that are trained on multiple "modalities", but they still cannot learn concepts for which they are not given specific examples. For instance, if a system is trained on examples of images, text and time series, it will learn a model of images, text and time series, but it won't be able to recognise, say, speech.
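
To make that concrete, here is a toy sketch (my own illustration, nothing to do with any real multi-modal system): a supervised model trained only on classes 0 and 1 can never produce a label it was not trained on, whatever input it sees.

    # Toy sketch: a classifier only ever predicts the classes it was trained on.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    X_train = np.array([[0.0], [0.2], [0.9], [1.1]])   # examples of classes 0 and 1 only
    y_train = np.array([0, 0, 1, 1])

    clf = LogisticRegression().fit(X_train, y_train)

    # Even for an input that "belongs" to some unseen third concept,
    # the model can only ever answer with 0 or 1.
    print(clf.predict(np.array([[5.0]])))   # one of {0, 1}, never 2
    print(clf.classes_)                     # [0 1]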

As to whether humans learn that way: who says we do? Is that just a conjecture proposed to support your other points, or is it something you really think is the case and believe, based on some observations, etc.?


I think you’re missing the meat of my point. The stuff LLMs are trained on is in no way similar to what human brains have received. It’s a shortcut to train them directly on text tokens. Because that’s the data we have easily available. But it doesn’t mean the principles of machine learning (which are loosely derived from how the brain actually works) apply only to text data or narrow categories of data like you mentioned. It just might require significantly more and different input data and compute power to achieve more generally intelligent results.

What I believe personally is I don’t think there is any reason to rule out that the basics of neural networks could serve as the foundation of artificial general intelligence. I think a lot of the criticism of this sort of technology being too crude to do so is missing the forest for the trees.

I have a brain and it learns and I’ve watched many other people learn too and I see nothing there that seems fundamentally distinct from how machine learning behaves in very general terms. It’s perfectly plausible that my brain has just trained itself on all the sensory data of my entire life and is using that to probabilistically decide the next impulse to send to my body in the same way an LLM predicts the most appropriate next word.
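
Very roughly, "predicts the most appropriate next word" just means scoring every candidate token and picking the most probable one. A toy sketch with made-up numbers, not a real model:

    import numpy as np

    vocab = ["the", "cat", "sat", "mat"]
    logits = np.array([0.1, 2.3, 0.4, 1.7])    # made-up scores a model might output

    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                       # softmax: scores -> probabilities

    next_token = vocab[int(np.argmax(probs))]  # greedy choice of the next word
    print(dict(zip(vocab, probs.round(2))), "->", next_token)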


>> But it doesn’t mean the principles of machine learning (which are loosely derived from how the brain actually works) apply only to text data or narrow categories of data like you mentioned.

When you say "the principles of machine learning", I'd like to understand what you mean.

If I were talking about "principles" of machine learning, I'd probably mean Leslie Valiant's Probably Approximately Correct Learning (PAC-Learning) setting [1], which is probably the most popular (because the simplest) theoretical framework of machine learning [2].

Now, PAC-Learning theory is probably not what you mean when you say "principles of machine learning", nor is it any of the other theories of machine learning we have, that formalise the learnability of classes of concepts. That's clear because none of those theories are "derived from how the brain actually works", loosely or not.

Mind you, there isn't any "principle" of machine learning, anyway, that I know of that is really "derived" from how the brain actually works, because we don't know how the brain actually works.

So, based on all this, I believe what you mean by "principles of machine learning" is some intuition you have about how _neural networks_ work. Those were originally defined according to the then-current understanding of how _neurons_ in the brain "work". That was back in 1943, by McCulloch and Pitts [3], with the McCulloch-Pitts neuron, the precursor of the Perceptron. That model is not used any more and hasn't been for many years.

Still, if you are talking about neural networks, your intuition doesn't sound right to me. With neural nets, like with any other statistical learning approach, when we train on examples x of a class y, we learn the class y. If we want to learn classes y', y", ... etc., we must train on examples x', x", ... and so on. You have to train neural nets on examples of what you want them to learn, otherwise they won't learn what you want them to learn.

The same goes for all of machine learning, following from PAC-Learning: a learner is given labelled instances of a concept, drawn from a distribution over a class of concepts, as training examples. The learner can be said to learn the class if, with high probability, it can label unseen instances of the class within some degree of error with respect to the true labelling.
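
Here is a toy illustration of that setting (an assumed, made-up concept, not anything from Valiant's paper): draw labelled instances from a distribution, fit a learner on them, and measure its error on unseen instances from the same distribution.

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(0)

    def target_concept(X):
        return (X[:, 0] > 0.5).astype(int)    # the "true" labelling

    X_train = rng.random((200, 2))            # instances drawn from a distribution
    learner = DecisionTreeClassifier().fit(X_train, target_concept(X_train))

    X_test = rng.random((1000, 2))            # unseen instances, same distribution
    error = np.mean(learner.predict(X_test) != target_concept(X_test))
    print(f"generalisation error ~ {error:.3f}")   # small, with high probability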

None of this says that you can train a neural net on images and have it learn to generate text, or vice-versa, train it on text and have it recognise images. That is certainly not the way that any technology we have now works.

Does the human brain work like that? Who knows? Nobody really knows how the brain works, let alone how it learns.

So I don't think you're talking about any technology that we have right now, nor are you accurately extrapolating current technology to the future.

If you are really curious about how all this stuff works, you should start by doing some serious reading: not blog posts and twitter, but scholarly articles. Start from the ones I linked, below. They are "ancient wisdom", but even researchers, today, are lost without them. The fact that most people don't have this knowledge (because, where would they find it?) is probably why there is so much misunderstanding on the internet of what is going on with LLMs and what they can develop to in the long term.

Of course, if you don't really care and you just want to have a bit of fun on the web, well, then, carry on. Everyone's doing that, at the moment.

____________

[1] https://web.mit.edu/6.435/www/Valiant84.pdf

[2] There's also Vladimir Vapnik's statistical learning theory, Rademacher complexity, and older frameworks like Learning in the Limit etc.

[3] https://www.cs.cmu.edu/~./epxing/Class/10715/reading/McCullo...


Not OP, but I'm not convinced by the talking point that a baby takes in an equivalent or greater number of petabytes of data because it is immersed in a sensory world. I can't quite put my finger on it, but my feeling is that that line of reasoning contains a kind of category error. Maybe I'll wake up tomorrow and have a clearer idea of my objection, but I've seen your talking point echoed by many others as well, and this interests me.


What is all the “video” and “audio” and other sensory input but petabytes of data streaming into your brain? Seems like a pretty objectively measurable concept, right?
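
As a back-of-envelope sketch, where every number is an assumption (an often-quoted order of magnitude of about 10 Mbit/s for the optic nerve, 16 waking hours a day, 30 years), it adds up quickly:

    optic_nerve_bits_per_s = 10e6    # assumed order-of-magnitude visual bandwidth
    waking_hours_per_day = 16
    years = 30

    seconds = years * 365 * waking_hours_per_day * 3600
    total_bytes = optic_nerve_bits_per_s * seconds / 8
    print(f"~{total_bytes / 1e15:.1f} petabytes of visual input over {years} years")
    # roughly 0.8 PB for vision alone under these assumptions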


The article was full of cherry-picked examples and straw man style argumentative techniques. Here are a few ways I have used ChatGPT (via MS Edge Browser AddOn) recently:

- Generate some Dockerfile code snippets (which had errors, but I still found useful pointing me in the right direction).

- Help me with a cooking recipe where it advised that I should ensure the fish is dry before I cook it in olive oil (otherwise the oil will splash).

- Give me some ideas for how to assist a child with a homework assignment.

- Travel ideas for a region I know well, yet I had not heard of the places it suggested.

- Movie recommendations

Yes, there are a lot of caveats when using ChatGPT, but the technology remains remarkable and will presumably improve quickly. On the downside, these technologies give even more power to tech companies that already have too much of it.


Yeah, this is actually really ridiculous... the human mind is nothing *but* a pattern matcher. It's like this writer has no knowledge of neuroscience at all, but wants to opine anyway.


>> "the human mind is nothing but a pattern matcher"

wow, tell me you know only a tiny bit of neuroscience without telling me you know only a tiny bit of neuroscience ...

For starters, the myriad info filtering functions from the sub-neuron level up to the structural level are entirely different from pattern matching (and are not in these LLMs)


To be fair, Noam Chomsky is 94 years old now


So what? Henry Kissinger, who is 99, just wrote a fantastic article [0] (along with Eric Schmidt and Daniel Huttenlocher) about the AI revolution recently. (Much more worthwhile than the Chomsky piece.)

[0] https://www.wsj.com/articles/chatgpt-heralds-an-intellectual...


I think I'm being fair – he published this in the New York Times for gosh golly sake.



