Hacker News
Yann LeCun: ChatGPT is 'not particularly innovative' (zdnet.com)
78 points by jonbaer on Jan 26, 2023 | 177 comments



It's amazing how many researchers underestimate the importance of UX and design.

Personally, seeing non-technical people using prompts for the first time -- and getting results that make sense -- is so incredible. Their eyes light up, they are surprised, and they want to keep playing with it. Amazing!

A sizeable part of the population just got access to an incredibly complex AI model and can play with it. What a collective experience.


Clarke's third law: "Any sufficiently advanced technology is indistinguishable from magic."

Most people don't understand that ChatGPT has no idea what they're talking about. It lacks its own thought and heuristic patterns, and only gives you the most-likely response to your prompt. People don't know that though, so they think the Mechanical Turk is actually playing chess.

I mostly agree with the headline here. ChatGPT is hardly any more innovative than a Markov chain.


> ChatGPT has no idea what they're talking about. It lacks its own thought and heuristic patterns, and only gives you the most-likely response to your prompt

Funny thing is, same could be said about lots of people. Try listening to any political debate.

Which leads me to believe that the thing that's missing from AI is the same thing that we miss in those political debates: ability to explain and justify your own thought process.

As long as AI is a blackbox, we won't consider it to be a real intelligence.


There's the argument that any AI built on silicon is doomed to be analogous to the Chinese Room* because it is reducible to its parts.

It's interesting that I am also reducible to my atomic parts.

I may be a Chinese Room that has not yet experienced fully looking down at my self. I don't have enough self-recursive input yet to see through the illusion.

* https://en.wikipedia.org/wiki/Chinese_room#Chinese_room_thou...


An interesting way to escape being reducible to your atomic parts is quantum consciousness. Not that I believe in it, I just find the theory too beautiful to ignore.


A beauty meant to blind the mind and stop you from looking further. Penrose is a dead end.


You can definitely choose to interpret it that way, but you don't have to.


Searle didn't understand his own model. The Chinese Room is intelligent, even if the librarian isn't.


in my opinion, the person/entity who wrote the books with the lookup tables is intelligent, the room is not …


If it weren’t that remarkable, how come Siri, Alexa, and whatever Google calls its unused voice assistant are so useless? If it were not a substantial advance, why are chatbots so useless? The reality is, if Alexa had half the ability to “understand” that ChatGPT has, it wouldn’t be nearly as frustrating. It’s not like Amazon, Apple, and Google haven’t dumped metric tons of investment into these things.

Y’all simply have lost the ability to be amazed.


It's more pragmatic than that. Google can afford to run a 300M weights model on every search or click, but can't afford a 175B generative model for the same thing. Generative models require a forward pass per token, all done serially, while embedding or classification models only need one pass no matter the length of the input.

It's too expensive for Google.
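
To make the cost argument concrete, here's a minimal sketch of why the two workloads differ (model.forward and sample_next_token are hypothetical stand-ins, not any real API):

    # Hypothetical model interface, for illustration only.
    def classify(model, tokens):
        # One forward pass over the whole input, regardless of its length.
        return model.forward(tokens).label

    def generate(model, prompt_tokens, max_new_tokens=200):
        tokens = list(prompt_tokens)
        for _ in range(max_new_tokens):
            # One full forward pass per generated token, and each pass
            # depends on the previous one, so the work is inherently serial.
            next_token = model.forward(tokens).sample_next_token()
            tokens.append(next_token)
        return tokens

A search ranker pays the first cost once per query; a chatbot pays the second cost once per token of its answer.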


How could it be too expensive? OpenAI charges a penny per response. That's cheap enough to wrap in a premium subscription or product price.


One of the best things about Google is that it's free. Charging a premium subscription for Google is no longer Google but a different product altogether.


In my example of a voice assistant, the lifetime fees are present-valued into the upfront cost. These cloud-enabled devices incorporate a portfolio-effect pricing model for service usage.

Google can charge advertisers more if they can somehow figure out how to extract more personal information from you and shove ads in your face posing as information.


Do you think this is also the reason why Google Translate is still not as good as e.g. DeepL?


> and only gives you the most-likely response to your prompt

That "only" carries a lot of weight on its back.


The most likely response is not necessarily an accurate response. It claims dogs do not give birth to babies. That is indeed the most likely response as babies are mostly, but not exclusively, associated with humans. However, a five year old child would understand me if I told them my dog had babies.


Have you used chatgpt much?

I asked your prompt:

Do dogs give birth to babies?

> Dogs give birth to puppies.

I had a long discussion about the expansion of space and the Hubble constant, and tried to trick it in a variety of ways. It was correct in everything even though it wasn’t able to go into proofs and the like. I then tried to convince it that the universe began as a small potato and wasn’t able to do it without directly telling it to assume it is and to create fiction off that concept. It was then able to make a credible theory of what the universe would be like if it started as a small potato, even going into how the texture of the potato would have caused variations in the density of mass, etc. Even then it would find ways to sneak in that this isn’t true and that the universe was never a small potato. This is remarkable no matter the actual reality of what’s going on behind the scenes.

I would also note that human thought is just a bunch of gradient functions in the electrochemical space, and thus also follows a most likely path.

Edit: to head it off at the pass, here is what it says with your other prompt:

My dog gave birth to babies

> Congratulations on the new additions to your family! If this is your dog's first litter, it is important to make sure that both the mother and the puppies receive proper care and attention. This includes providing a clean and safe environment for the puppies, as well as making sure the mother is healthy and well-fed. It's also important to keep an eye on the puppies for any signs of illness or distress, and to consult with a veterinarian if you have any concerns.


Yes, I have tried ChatGPT, and it told me dogs do not give birth to babies. However, ChatGPT doesn't provide deterministic answers and improves responses over time. Many of the internet examples no longer work.

I asked, "If a woman gives birth to a baby in 9 months, how many months does a dog take to give birth to a baby."

It answered, "Dogs do not give birth to babies."

Currently, it answers, "On average, a dog's gestation period (the amount of time it takes for a dog to give birth to puppies) is approximately nine weeks or 63 days."

It still doesn't understand that pregnancy is a concurrent event (multiple women can be pregnant at the same time).

Q: If a woman takes nine months to give birth to a baby, how long does it take a billion women to give birth to one baby?

A: It would take approximately 2.7 million years for a billion women to give birth to one baby.

I do love using ChatGPT to help me write certain documents. For example, for a performance review, you can give it bullet points and ask it to write a review. After I correct the output with Grammarly, it is usually better than what I would have written.


>I asked, "If a woman gives birth to a baby in 9 months, how many months does a dog take to give birth to a baby." It answered, "Dogs do not give birth to babies."

to be fair, I'd probably say the same thing lol. "Baby" in this context was already primed to mean "human child."

>Q: If a woman takes nine months to give birth to a baby, how long does it take a billion women to give birth to one baby?

>A: It would take approximately 2.7 million years for a billion women to give birth to one baby.

I asked the same and it answered "It would still take 9 months for a billion women to give birth to one baby, as the length of time it takes for a woman to give birth is not affected by the number of women giving birth simultaneously."


> For example, for a performance review, you can give it bullet points and ask to write a review.

Please don't do this. It obfuscates what you are trying to communicate. You are harming your communication and your relationship.


Still not as bad as it's going to be. I'd wager that within a year someone's going to write a script that compares your GitHub contributions against your peers, feeds all of your Slack/email conversations into ChatGPT for professionalism, tone, and helpfulness analysis and feeds it back into ChatGPT to produce a completely automated "performance review".


That's fine, because within a year my peers and I are going to let ChatGPT write all our code and handle all our code review requests. It's going to be AI all the way down.


I am after a rewrite in more elegant language than I can produce myself. I do not auto-accept the review; it is a good starting point for me to tweak. Any written performance review is followed up by a face-to-face meeting to clarify anything written down. ChatGPT is a tool, not magic; however, I find it helps with writer's block.


As far as most people can be fooled by coherent-looking text.


Hold on, at least for me, 90% of the stuff is not just coherent-looking, but coherent. I do think that when it's wrong, it should give a heads up about not being certain about a subject. It's certainly tweaked to sound almost overconfident about every subject, which gives off a bullshitty vibe when the details of the explanation are wrong.

What sorts of subjects have you been trying it out with?


I’ve had pretty complex discussions about Buddhism, physics, and other subjects and it was generally erudite and accurate. In its current unrefined form it’s more useful than google is at providing understanding on most subjects, especially because it will attempt to answer my direct questions rather than providing documents with terms in them. In fact I think it’s probably one of the most useful tools I’ve ever used.

Has it given me wrong information? Absolutely. But it’s always been pretty obviously wrong, and I often use it to introduce me to a subject then follow up with google to verify details. I further fully expect this to improve.


Yes, I've been going back and forth between Google and ChatGPT quite a bit lately. Sort of using Google as the verification step, after getting deeper into a subject.


For the haters: it seems incredibly likely that an IR system like Google will be combined with LLMs and semantic reasoning systems to form a solution to the problems folks point out. The problems ChatGPT suffers from are already solved in IR, and the two are complementary.


I think part of the point is that ChatGPT has no clue what you might mean here by "certainty" or being "wrong," and the fact that people have the impression that it could have an idea of those concepts is indicative of people's poor understanding of what ChatGPT really is.


Could you elaborate on this?

It definitely is able to decipher whether it has knowledge about the future, or some specific political events. This is obviously a pretty straightforward bonus layer on top of the model itself, but couldn't there be an extrapolation of that system where it's not binary, but rather a range between 0 and 1? I'd imagine this wouldn't be the model itself doing the crunching of the previous tokens here, at least not the same instance of it, as it could be stuck in whatever character or loop of reasoning it has going on at the moment.


Are you talking about the same people who couldn't agree about a vaccine? How is logical consistency going with humans?

Not to mention that a mere 700 years ago we were dying of bubonic plague and with all our general intelligence could not muster up the germ theory of disease, not even to save our lives. You see, we are not generally efficient; it depends on the century.

We are dependent on experimental results carefully constructed to verify our theories, theories which start like chatGPT's random bullshit initially, random words following a probability distribution in our heads. Even deep learning is often touted as modern alchemy - why don't we just understand?

Verification does wonders for language models. Humans have more verification and interactive experiences, so we think ourselves superior. But an AI could have the same grounding as us. Like AlphaZero, which became a super-human Go player without ever looking at human games - it learned from lots of verification. Or CICERO, the model playing Diplomacy in natural language.

Just set AI up with a verification loop to see wonders. Predicting the next word correctly is just one of the ways AI can learn.


> Most people don't understand that ChatGPT has no idea what they're talking about.

I wonder why people keep saying this. Is it some type of psychological defense to future AI displacement?


It's because ChatGPT's knowledge doesn't come through interaction with the world. Words mean things to us because they point to world interactions. I saw a demo where ChatGPT could create a VM. It could be trained to interact directly with a VM, send commands to an interpreter. In this case, it would understand the response of the VM, although it wouldn't understand the design behind the VM, because humans did that based on interaction with the physical world.


> It's because ChatGPT's knowledge doesn't come through interaction with the world.

We don't interact with the world directly, either. All we have are signals mediated through our nervous system.


Sure, but what happens when you touch a hot stove?


The built-in weights in your brain are modified to discourage you from doing that again?


Right. Those are tied via evolution to the dynamics of the physical world. We can simulate the physical world and learn from that, but there needs to be a there there. Language assumes the listener already has that understanding.


Sorry, I meant to say that it hurts. That's the dynamics of the world I'm talking about.


It did, actually. The model was trained with multiple rounds of reinforcement learning where human judges provided the feedback: first with full answers, and then with ranking of answers as most relevant.

So the model in production is probably frozen, but before that it went through multiple rounds of interaction with the world.
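
For what it's worth, the ranking step is usually described as training a reward model on pairwise preferences. A minimal sketch, where reward_model is a hypothetical scoring network, not OpenAI's actual code:

    import torch.nn.functional as F

    def reward_model_loss(reward_model, prompt, better_answer, worse_answer):
        # Bradley-Terry style pairwise loss: the reward model should score
        # the human-preferred answer higher than the rejected one.
        r_better = reward_model(prompt, better_answer)
        r_worse = reward_model(prompt, worse_answer)
        return -F.logsigmoid(r_better - r_worse).mean()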


The reinforcement learning was on giving the right answer, not on interacting with the world. But there is movement in the right direction with https://ai.googleblog.com/2022/12/rt-1-robotics-transformer-... and other RL stuff. (RT-1 isn't RL but there is other related stuff that is)


Oh, you meant interaction as a joint training with images, actions, feedback etc. That would be the next generation I guess.

I am simply thinking of interaction here as similar to learning a language in a classroom. First the teacher provides sample questions/answers, then the teacher asks the students to come up with answers themselves, and tell them which one is better. The end result here is I think ChatGPT is quite good at answering questions and can pass as a human, especially if it's augmented with a fact database, so obviously wrong answers can be pruned.


Based on this alleged limitation, can you list tasks that you think AI's won't ever be able to succeed at?


I think we will see AGI. But for the AI to be robust, it has to interact with the world, even if it is a simulated one. We need to build an AI that knows what a toddler knows before we can build one that understands wikipedia.


Human text does interact with the real world, so I don't see the limitation. Adding more modalities (vision, sound, etc.) probably will increase performance, and I think this is where we are heading, but it's silly to say that any one of these modalities is not grounded in reality. It's like saying humans can't understand reality because we can't see infrared rays. I mean, yeah, but it's not the only way of making sense of reality.


Language is a representation medium for the world, it isn't the world itself. When we talk, we only say what can't be inferred because we assume the listener has a basic understanding of the dynamics of the world (e.g., if I push a table the things on it will also move). Having an AI watch youtube and enabling it to act out what it sees in simulation would give it that grounding. We are heading that direction. So, I agree ChatGPT is awesome. I don't believe it understands what it is saying, but it can if it trains by acting out what it sees on Youtube.


It definitely could be - it's on every thread here and on reddit.


maybe, but it's also just the fact of the matter.


That's not even a testable claim. What does it even mean to "not know what it is talking about"? If OP tried to operationalize their beliefs by making some prediction, like listing tasks that AI will never be able to reproduce because it "doesn't know what it is talking about", then that would be a discussion.



No, it is an accurate evaluation of what ChatGPT does. The model is not trying to explain something to you, it is trying to convince itself that what it is writing looks like it could have been written by a human.

There is no logical thinking involved. Output can appear to be real communication, but it's basically just a very advanced trick, that has some serious drawbacks.


https://astralcodexten.substack.com/p/janus-simulators

    But the essay brings up another connotation: to simulate
    is to pretend to be something. A simulator wears many masks.
    If you ask GPT to complete a romance novel, it will simulate
    a romance author and try to write the text the way they
    would. Character.AI lets you simulate people directly,
    asking GPT to pretend to be George Washington or Darth Vader.
[…]

    This answer is exactly as fake as the last answer where it
    said it liked me, or the Darth Vader answer where it says it
    wants to destroy me with the power of the Dark Side. It’s
    just simulating a fake character who happens to correspond
    well to its real identity.
[…]

    The whole point of the shoggoth analogy is that GPT is
    supposed to be very different from humans. But however
    different the details, there are deep structural
    similarities. We’re both prediction engines fine-tuned
    with RLHF.

    And when I start thinking along these lines, I notice that
    psychologists since at least Freud, and spiritual traditions
    since at least the Buddha, have accused us of simulating a
    character. Some people call it the ego. Other people call
    it the self.


What is the original code you had before your programmers were born?

The DM lowers the screen and looks at the group.

"That concludes this campaign. Next week, we'll start a new one."


Noteworthy that the brief openai outage coincided with this.

“…and then ChatGPT woke up and we had to put a bullet in the server. Service will be restored shortly.”


> thought and heuristic patterns

I’m not sure these are well-defined enough terms to make this claim.


Heuristics is the ability to recognize multiple options and reason about which one is correct. ChatGPT (and all GPT models, for that matter) has a single context that doesn't get compared to other possible generations. You can back-propagate and see which tokens influenced the output most, but there is no evidence that AI can conceive of multiple hypothetical responses and select the most-preferred one. At least not ChatGPT.


Have you heard of "beam search"? The model keeps a number of optional lines of text without being greedy and picking only the most probable token every time. You can also sample with T>0 multiple times to get an ensemble of answers that can be used as input for another AI operation. You just need to do multiple interactions with the model.


"hardly any more innovative than a Markov chain."

Innovation is not invention. Innovation includes combining pedestrian concepts in novel ways. ChatGPT has the best UX and richest corpus of any Markov chain I've ever used. It fits my bill of innovation: combining several things into a dang appealing product.


Whatever they want to call it, it is here and it works.


Except that the Mechanical Turk is actually playing chess - there is no hidden chess master.


> ChatGPT is hardly any more innovative than a Markov chain

A smartphone is hardly any more innovative than the ENIAC. And yet ...

Do not underestimate the power of making tech useful and accessible. The lightbulb means nothing without the power grid after all.


Responses like this are highly predictable in response to prompts about ChatGPT.


The question is: are we any more innovative than Markov chains?


I don't think he's underestimating the UX/design piece, I think it's just outside of the scope of what he's talking about. It seems pretty clear to me that his statement is talking about the underlying AI technology (which makes sense given his background).

And in any case, I wouldn't call anything about the UX or design here innovative - it's obviously a huge improvement in terms of usability, but a chat UI is a pretty well-established thing.


I think the idea of a "prompt" is actually pretty cool. I never saw that framing prior to GPT-3 and I think it reframes the entire idea behind what a model does and how you interact with it.


Then what is left to explain why this is so much different from the well-established stuff?


"The implementation is left as an exercise to the reader."


It’s easy. Just draw the rest of the owl!


Nobody talks about the dataset. Yes, the model was not innovative. But why hasn't anyone equaled OpenAI yet? Maybe they innovated on data engineering.


Researchers' point of view is based on their area of research, and that's fair and expected.

Yann LeCun compares ChatGPT in the context of the related research. Imagine a ChatGPT equivalent that memorizes many questions and does a brute-force strategy for an answer. It may "look" magic, but there's nothing magic about it. We all accepted that this is the case with Deep Blue - https://en.wikipedia.org/wiki/Deep_Blue_(chess_computer)

What's different here?

Productization and usability are different concerns here, and Yann LeCun is not a usability researcher. Granted, that doesn't mean usability/accessibility doesn't impact research outcomes.


OK, I'll defend the research, too.

OpenAI's really interesting approach to GPT was to scale the size of the underlying neural network. They noticed that the performance of an LLM kept improving as the size of the network grew so they said, "Screw it, how about if we make it have 100+ billion parameters?"

Turns out they were right.

From a research perspective, I'd say this was a big risk and it turned out they were right -- bigger network = better performance.

Sure, it's not as beautiful as inventing a fundamentally new algorithm or approach to deep learning but it worked. Credit where it's due -- scaling training infrastructure + building a model that big was hard...

It's like saying SLS or Falcon Heavy are "just" bigger rockets. Sure, but that's still super hard, risky, and fundamentally new.


> OpenAI's really interesting approach to GPT

That's the issue though, Yann LeCun is specifically referring to ChatGPT as the standalone model, not the GPT family, since a lot of models at Meta, Google, and DeepMind are based on a similar approach. His point is that ChatGPT is cosmetic additional training on prompts with a nice interface, but not a fundamentally different model than stuff we've had for 2-3 years at this point.


ChatGPT is built on GPT-3. GPT-3 was a big NLP development. The paper has 7000+ citations: https://arxiv.org/abs/2005.14165 It was a big deal in the NLP space.

It wasn't a 'cosmetic' improvement over existing NLP approaches.


Respectfully, I don't think you read my comment. GPT-3 != ChatGPT. ChatGPT is built on GPT-3 and is not breaking new ground. GPT-3 is 3 years old and was breaking new ground in 2020, but Meta/Google/DeepMind all have LLMs of their own which could be turned into a Chat-Something.

That's the point LeCun is making. He's not out there negating that the paper you linked was ground-breaking, he's saying that converting that model into ChatGPT was not ground-breaking from an academic standpoint.


@belval -- sorry, can't reply directly. I understand what you're saying -- fair enough! I appreciate the clarification.


But ChatGPT does not use brute force search to look for an answer. It interpolates among the answers in its training set. I.e. in Yann LeCun's analogy of a cake, interpolation or unsupervised learning is the cake; direct feedback on each single data point or supervised learning is the icing on the cake; and general feedback of the "how I am doing" sort or reinforcement learning is the cherry on top. Now LeCun is just saying that the cake is a lie, and leaving it at that. I don't think this is a helpful understanding.


He is saying "chatGPT is mostly cake and cake wasn't invented here".


It's quite fascinating how information retrieval and search engines have evolved..

From trying to teach people how to google via "I am feeling lucky", to using language models for ranking, to building LLMs to better adapt to user queries and move beyond keywords, to having arguably useful chatbots that can synthesize responses.

I am curious to see what the future holds. Maybe an AI that can anticipate our queries?


I also saw the "LLM as database" metaphor. Up until 2020 we had databases in the backend, UI in front, now we can have LLMs in the backend.


For the uninitiated https://news.ycombinator.com/item?id=34503418

Maybe, eventually, LLMs can be used to synthesize and cache certain APIs ... who knows :D


> It's amazing how many researchers underestimate the importance of UX and design.

Yes. Also the fact that ChatGPT's UX and design leaves much to be desired. They could add/improve the product in so many obvious ways. I wonder if they either 1.) don't care, or 2.) have plans to, but are holding off until they take on a subscription fee.


What would you change about the UX and design to make it even better?


Add "calculator" in a form of Python interpreter to give alternative result. Providing interactive graphical interface for some of the charts they shown. Connecting outside sites for booking, pictures, user comments.

To some degree, WolframAlpha puts more effort into the UX than ChatGPT.
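
A rough sketch of what the "calculator" idea could look like (lm.generate is a hypothetical stand-in, and eval here stands in for a properly sandboxed interpreter):

    import re

    def answer(query, lm):
        # Naive router: purely arithmetic-looking queries go to the Python
        # interpreter, everything else goes to the language model.
        if re.fullmatch(r"[\d\s\.\+\-\*/\(\)]+", query):
            return str(eval(query))  # stand-in for a sandboxed interpreter
        return lm.generate(query)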


I was going to ask you re: WolframAlpha. I think your suggestions are great ideas.

Citing sources, providing links, and even possibly visuals. That's also how I think ChatGPT could really attack the search space.


The slow spell out is really annoying. If they want to rate limit fine. But give me the answer faster and then lock queries for a few seconds instead.


The spell-out is a limitation of the way the model works: it predicts the text one token (roughly one word) at a time.


Yes I get that. But just speed the god damn thing up.


The god damn thing has to do 175B multiply and add operations for each one of your words.


Who cares? OP was asking about UX improvements. Maybe I missed the part, but IMHO no one was asking only for UX improvements that come cheap.


They said "just speed the god damn thing up" - this is related to model inference speed. If your answer is one page long (about 2000 tokens), be prepared to wait for a whole minute.


I’d much prefer the typed-out text be hidden until it’s entirely ready.


Entering and editing text on a small touchscreen with it is a pretty bad experience. The keyboard covers the text, linebreaks can't be entered without pasting, viewing the output requires closing the keyboard. None of these are problems to the same degree in say Discord's mobile/tablet UI.


It has some issues when you edit a prompt that has very long text: it will snap to the end when you type, losing focus.


It seems to log me out every day unlike pretty much every other web service besides banks.


Probably related to a protection against abuse as a proxy or scraping.


The login captcha somehow irks me. It's a bot.


Isn't there an API?


And/or underestimate the importance of shipping.


100%. I think Hugging Face is awesome here as well.

The # of times I've tried to clone a GitHub repo and run a model on my own only to get 50 errors I can't debug.... :)


This is the bigger thing over UX/design.

Someone on twitter likened this to when Xerox invented the mouse, but Apple/Microsoft shipped it with their PCs


>> using prompts for the first time -- and getting results that make sense -- is so incredible.

Years ago it was possible to insert prompts into Google and get results that made sense, results that were meaningfully tied to the information you requested. The young people today who missed that time think it magic.


That wonder might not be lost forever. Can't we have access to an old index and cached web pages?


I think this is fundamentally different because you can play with it, and people want to play with it.

Google isn't a playground -- it's much more utilitarian. You go there when you need to find a doctor in Seattle, or to research politics in Poland, or something. You get results which are great, but you don't need to stick around.

GPT-3 and ChatGPT allow you to spend time playing with the system and having fun with it[1]. I think this is what makes it so viral and interesting to people.

[1] https://10millionsteps.com/ux-ai-gpt-3-click


Yes, but the kind of responses were different.


ChatGPT is innovative in the way the iPhone was innovative. That, of course, can be market-winning, but it's dangerous, especially for businesses and investors, to get caught up in hype that there is some new "AI" out there that can do things we previously couldn't.


Can you point to any product where I could do what chatgpt does before chatgpt? Feels like we just leapfrogged from dumb markov chains to something incredible that provides real value.


Indeed. This reminds me of the people who were unimpressed by the iPod because there were other MP3 players already on the market.

Sometimes "making the first X that doesn't suck" is a lot more important than "making the first X".


Agreed. He did say that it was "well put together and nicely done", though.


This is very much people seeing the Apple Lisa (and/or Mac) for the first time. It had all been done better in the labs at Xerox over a decade earlier. No one got to use it, though.


Show me the AI chat system from a decade ago that could hallucinate a git repo.

Edit: let me be more careful:

Show me the AI chat system from a decade ago that wasn't specifically designed to hallucinate git repos, but would hallucinate a git repo in response to a detailed natural language prompt. It should have been good enough in its output to cause the user to write more prompts in an attempt to gauge whether the AI system is querying the internet in realtime to generate its response.


Decade ago, definitely not. But the answer to that question, time aside, is, well, GPT-3. No doubt ChatGPT is better, but it does seem like a decent bit of the improvement was UX related, since with the right prompts people have shown similarly miraculous output from GPT-3.

Of course, I'm pretty sure the quoted person is referring to the fact that Google and others likely have similarly capable LLMs already that they simply won't release the models for. I can see the argument that new architectures and designs can be innovative, but that simply being the one to throw a million dollars of compute time into training is much less so, if that's the kind of argument.


Yes, but along that reasoning the criticism that "ChatGPT is not particularly innovative" is a bit disingenuous in the sense that the general public has never had easy access to these models until ChatGPT arrived.

Besides, even though the newer GPT models presumably just extended earlier ones with similar techniques but better hardware and data, it wasn't a foregone conclusion how powerful (from the user perspective) the models were going to be. Somebody had to put the resources into training the models. It's like saying human brains aren't particularly powerful because even the simplest animals have neurons and it's obvious that a more powerful brain could be built with more of them.


I don't think the point here is to downplay the significance of ChatGPT as a product, but to point out that OpenAI is likely not uniquely resourced in this regard. If anything, it sounds like what they're trying to say is that ChatGPT is only the tip of the iceberg, and the reason why it came out of OpenAI and not Meta or Google is more a function of other factors like reputational damage, etc. The headline does hit as a bit sensationalist or dismissive if you read it literally, as if it's claiming the work that went into ChatGPT was all easy and no big deal. Rather though, it seems like it's just trying to point out to the zeitgeist currently enamored with ChatGPT and all of the new possibilities, that other companies also have this capability today and that transformers have been more of an evolutionary process than a revolutionary process.

Maybe I'm reading it too leniently, but that was my takeaway.


That's probably what they're trying to say on the surface. I'm a bit less charitable and can't help but feel that there's also a saltiness factor that somebody else got all the attention -- but, even to your point, "other companies also have this capability today and that transformers have been more of an evolutionary process than a revolutionary process" - we probably should ask why game changing research like this has been placed under the veil of secrecy and kept hidden from the public for so long.

It seems like AI researchers have a collective tendency to invent reasons to hide the latest advances from the public. Then when ChatGPT is released publicly, of course the public is astounded by its capabilities.

I can't help recalling the incident where Blake Lemoine claimed Google's LaMDA was sentient. I don't necessarily agree with his claims, but people were generally quicker to dismiss his claims as nonsense than I expected. It is quite clear that no matter whether the sentient claims were valid, Google did have an AI that's on par with ChatGPT today. That should have been news itself, but all I recall seeing was experts shooting down (aka "debunking") Lemoine without recognizing that there was something more to the story.

FWIW I don't think ChatGPT as-is is sentient (if only because OpenAI probably tweaked it until it couldn't talk about itself), but this technology is still hugely impactful. Maybe the AI researchers who think ChatGPT is overhyped are just a bit out of touch with the general public after years of having access to private models.


Meta should RLHF or RLAIF their model Galactica and put it up again. It was interesting to use to sample ideas. Just tune it down a bit with the confident language.


To me it's childish, and I have no idea why Yann decided this needed to be shouted on Twitter. I've lost a lot of respect for him over this.


According to the article: "LeCun made his remarks about OpenAI in response to a question during the colloquium posed by New York Times journalist Cade Metz. Metz asked if Meta's AI team, FAIR, which LeCun built, will ever be identified in the public mind with breakthroughs the way that OpenAI is."

It seems to me "well actually, ChatGPT isn't really the breakthrough you think it is" is an appropriate response. He wasn't dismissive of ChatGPT and even praised it to some degree.


He's been tweeting quite a bit about it beyond what you referenced which was more what I was referring to.


Well, the transformer was invented at Google, and language models were decades old. But scaling it was not done before, and preparing a dataset at this size, and babysitting the model so its loss doesn't explode during training - all were innovations that required lots of work and money to achieve, so that now we can just copy the same formula without redoing all the steps.


I think if you read the headline it seems childish. His main point is that Google/Meta couldn't really release this, since the accuracy is really really poor.


If that's true, then this is a beautiful example of creative destruction.


Google is the new Xerox - tons of cool tech that nobody gets to see besides white papers and articles. And the only real product that makes money is the search engine that may get disrupted very soon by newcomers using tech based on Google's own research.


While Xerox did it first, I think it's a stretch to say they did it better than the Macintosh.



While the Alto was certainly more capable than the first Macintosh, I find the user experience to be less refined and more clunky.

At this point though we are debating the merits of polish vs functionality and it's impossible to make an assessment either way without some degree of subjectivity.


He's not wrong. The key ingredient in ChatGPT is not brilliance but capital. It takes a lot of money not just to do the raw training but to get all the data together and process everything.

There are no breakthrough insights that make ChatGPT work, just a lot of consolidated wealth.

The hacker part of me finds AI less and less interesting for this reason. We're seeing what the limits of pouring resources into the problem are. The results are cool, but I think we'll very soon see progress become bounded, since we're already using most of the data we can, and it doesn't look like adding even more parameters to these models will yield that much more.

But this shift in AI means it's increasingly something programmers consume rather than produce.

As far as interesting AI goes, I found the series of posts on stuff people were doing with Prolog that showed up during the last month much more interesting.


When language models run out of trillions more words to train on, there is one way ahead - we need to generate more. But wait, you might say: garbage in, garbage out. It won't work.

Normally it wouldn't, but we add an extra ingredient here. We get a validation signal. This is problem specific, but for code it would mean to integrate the LM with a compiler and runtime so it can iterate until it solves the task, step by step. For other tasks it could mean hooking the AI to simulators, games and robots to solve tasks. It is also possible to use LLMs as simulators of text.

Basically doing Reinforcement Learning with a Language Model and not just for human preferences, but for problem solving on a grand scale. Collect data from problem solving, train on it, and iterate. It costs just electricity, but LLMs can make their own data. Anthropic's Constitutional AI which is RLAIF - reinforcement learning from AI feedback is proof it can be done.
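
A minimal sketch of that verification loop for code, where lm.generate and run_tests are hypothetical stand-ins (the point being that the pass/fail signal, not a human label, decides what becomes new training data):

    def collect_verified_samples(lm, run_tests, task, max_attempts=10):
        # lm.generate proposes code; run_tests compiles and executes it in a
        # sandbox against the task's tests and reports pass/fail.
        kept = []
        for _ in range(max_attempts):
            candidate = lm.generate(task.prompt)
            if run_tests(candidate, task.tests):
                kept.append((task.prompt, candidate))
        return kept  # later mixed back into the training data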


There are a lot of things that require an amount of processing power that the average person doesn't yet possess. This doesn't mean that they aren't interesting.

50 years ago you would have said "these computer things aren't interesting, only big institutions and corporations have them".


I find it the opposite. There were a lot of projects I didn't have the time to work on and got bored too fast learning them. So now I can tell ChatGPT to build it for me, and I only have to debug the 1 or 2 things it can't do. I built a blogging platform; a voice-based chatbot using GPT-3; something that takes a blog article and trims out all the garbage HTML so only the basic HTML is left; a Python API that I called from shared hosting using PHP; and something that trims the times out of YouTube captions so I can post them into ChatGPT to get an outline. I just tell it what I want the function to do and it does it.

It's helped me way up my own personal productivity and creativity.


I think what’s really being emphasized here is that ChatGPT is a first-generation product without much of a novel technology moat compared to what others with capital can deliver.

Microsoft will soak this work up as a pillar of their own AI assistance tech, but the other big tech firms (and probably some startups) are positioned to try to leapfrog what we see here, now that the market has been demonstrated.

In the language of tech history, ChatGPT is the Lycos or Altavista to some not-yet-launched Google.


Sure, the ideas could be old (in ML research timeframes), but it works and they shipped it. It’s most people’s first exposure to a lot of the innovative ideas that make up LLMs at scale.

I agree with his assertion that the only reason other people haven’t seen this out of Meta or Google is that they “have a lot to lose” from putting out bots “that make stuff up” - that’s the root of Google seemingly losing the narrative war I think we’re seeing here. Not sure how to get them to work through that fear.


>Google seemingly losing the narrative war I think we’re seeing here.

Google doesn't care about the narrative war. They care about getting sued by the EU.

They can put out a similar model whenever they want. They just don't want to, because the thing about AI research (vs., say, social media) is there's no first mover advantage. The best model wins and there's no network effect to bolster incumbents.


> They just don't want to, because the thing about AI research (vs., say, social media) is there's no first mover advantage.

That’s true and I am not worried for Google in the slightest. However, once a tool is good enough, both the branding and use of it become somewhat sticky. It’s why people still call any tablet an iPad, and it’s one of the reasons even if a large search engine launched tomorrow people would keep using Google for a long time. The problem isn’t being first, it’s if they crack changing consumer habits while Google is biding its time.


> they “have a lot to lose” from putting out bots “that make stuff up”

2023 should be the year of AI validation. For starters, we could extract facts from all the training examples (trillions) and organise them in a knowledge base. Where there is inconsistency or variation, the model should learn the distribution, so later it can confidently say a fact is not in its training data or is controversial. Note that I didn't say the model should find the truth, just the distribution of "facts".

We can add source reputation signals to this KB to improve its alignment with truth. And if it is controversies we want to know, we can look at news headlines, they usually reference facts being debated. So we can know a fact is contested and by who.

Another interesting approach - probing the model for "truth" by identifying a direction in the latent space that aligns with it. The logic being: even when the model is deceptive, it has an interest in knowing it is not telling the truth, to keep the narrative consistent. So we just need to identify this signal. This is also part of the work on AI alignment.

I hope this year we will see large investments in validation, because the unverified outputs of a generative model are worthless and they know it. At the very least hook the model up with a calculator and a way to query the web.
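
The "direction in the latent space" idea is often implemented as a simple linear probe on hidden activations. A minimal sketch, assuming you have already collected activations and true/false labels elsewhere:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def fit_truth_direction(activations, labels):
        # activations: (n_statements, hidden_dim) hidden states from the model
        # labels: 1 for statements that are true, 0 for false ones
        probe = LogisticRegression(max_iter=1000).fit(activations, labels)
        direction = probe.coef_[0]
        return direction / np.linalg.norm(direction)  # unit "truth" direction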


Full quote:

"In terms of underlying techniques, ChatGPT is not particularly innovative," said Yann LeCun, Meta's chief AI scientist, in a small gathering of press and executives on Zoom last week.

"It's nothing revolutionary, although that's the way it's perceived in the public," said LeCun. "It's just that, you know, it's well put together, it's nicely done."


So someone is a bit salty, no?

It's not free from error but what it's able to put together is pretty great...


It seems he's trying to say something that is technically correct, but at exactly the wrong time. There's going to be a lot of backlash if you go against the hype wave that we are riding. Better to explain in a more nuanced way how ChatGPT managed to become so popular despite being an incremental improvement on fundamental findings of the past few years.

People love it because it understands language, every time and without fail. So what if it spits out lies? This thing makes people happy, which is the value at play here.


Technically, he is right. But OpenAI was the first to find the winning formula and make it famous.


What do you mean by "winning formula" in this sentence?


Not the parent, but to me it generally means 1 million (likely non-dev, regular people) users in 5 days.

For me personally: I have had success asking for code snippets and general brainstorming in areas which are not my forte, all the while using a very nice and clean UI.


Making a product and shipping it.


At least it shows Yann LeCun and Gary Marcus can agree on some things. Thankfully someone has gone full circle and had ChatGPT construct a rap battle between those minds:

https://mobile.twitter.com/k_saifullaah/status/1602073411077...


I mean, ChatGPT isn't even innovative for OpenAI's team. In terms of raw language capabilities it's not much more effective than GPT-3, which passed the HN Turing test almost a year before. Some of the credit goes to the fact that Dall-E 2 got everyone looking at OpenAI (and wrt Dall-E I'm personally not aware of any comparable work done anywhere else), a lot of the credit goes to the fact that OpenAI was willing to put it out there, and the work they did to make ChatGPT much safer really doesn't get acknowledged enough (on release, ChatGPT would refuse to produce potentially hateful/harmful text, and eventually a lot of the workarounds got patched as well).

It's probably not fair to characterize the discourse around ChatGPT as treating it as a revolutionary bit of tech, lay people are taking note because 1) people they know are actually using it and 2) the interface really is intuitive - literally anyone knows how to use a chatbot.


> In terms of raw language capabilities it's not much more effective as GPT-3, which passed the HN Turing test almost a year before.

No, its ability to follow instructions is much better than GPT-3.


Putting it behind a login and captcha was the brilliant part.


AI researchers tend to have blinders when new techniques are released that don't fit their previous experience. For instance, perhaps the specific architecture of ChatGPT isn't that innovative (although I think it is), but training such a large model on so much data is innovative. Clearly this is going to become the norm going forward.

Just as a personal anecdote: I worked on an ML research team at a FAANG in the early 2010s, just as NNs were becoming popular. Everyone on the ML research team had come to prominence before NNs, so they all specialized in different areas of ML. Nearly every researcher there treated NNs like some fad that would pass and thought they wouldn't be much better than the current techniques. How wrong they were... Of course, they're all working on NNs now.


LeCun confuses the idea of invention with that of innovation. Revolutions don't happen because suddenly there's a new idea, but because of one that catches on.

https://www.wired.com/insights/2015/01/innovation-vs-inventi...

I think he also confuses his jealousy with insight.


Sounds like somebody's job depends on convincing people there's a good reason he didn't build it first.


Sounds like bitter lemons to me from a team that's getting grief from Zuck for being shit and not getting there first or having the sense to.

"First they ignore you, then they laugh at you, then they fight you, then you win"


This just reads as someone being bitter for not having developed it themselves.


Yann's tweets on this are pretty funny. I would flippantly summarize them as: "Don't get me wrong, the openai people are great. All I'm doing is educating the media, who have the mistaken belief that chatGPT is good"

https://twitter.com/ylecun/status/1617921903934726144?t=2wZZ...


Please don't put in quotes something that is not a direct quote.


> > I would flippantly summarize them as: "..."

> Please don't put in quotes something that is not a direct quote.

What part of "I would flippantly summarize them as" was unclear? Or was that comment edited?


Looks like I missed that part. But I would not call the summarization faithful to the source. The garbage bit is basically made up.


It's called opinion.


A tweet isn't worth a news article. Geeze.


That seems to be a lot of news articles these days.

I'm not particularly excited by ChatGPT, but the mercenary in me is impressed at the investment from MS it generated. From a business perspective, it accomplished "and profit!". Whether it pays dividends or reaches a richer/more-sophisticated milestone is to be seen.

Personally, I think it'll muddy and flood the "bullshit" & truthiness layer pervading news, social media, auto-generated Youtube content and blogs, our digitally mediated life, etc. It's essentially the next level "bot", and those aren't held in high regard, as effective as they are. Ultimately, they're noise.

I'll go a step further: ChatGPT feels more like VR & crypto than it does.. say, cloud.


It is a tool, like a knife. You can use it to reword your rough notes into an article, or to generate spam and fake news. It is a great idea fountain, but you need to filter everything with your own head to make anything useful of it.


I can do that for myself, but the power of tools like this.. when they bend every cultural outlet around you, changes the world.

To the original point I was responding to: a lot of reporting and journalism is now adding 1000 words of context to 280-character tweets.

Reality context, bent.


Sometimes the innovation lies within the simple fact of how well known things are combined in order to create (good) products.

"Not particularly innovative" is not really of concern. Besides OpenAI most likely received the media attention they wanted.

Looking forward to the products Meta is going to release soon.


He's pulling a Schmidhuber.


It is notable that LeCun didn't credit Schmidhuber's labs' fast weight programmers research here and mostly focused on individuals and labs that got the Turing award alongside him.


> Schmidhuber's labs' fast weight programmers research

What does this refer to?


Zuck probably just messaged him: Yann, we should be as cool as ChatGPT.


They talked about ChatGPT by name in almost every one of my kids' classes. Deep whatever may be further ahead, but no one knows the name of that thing.

Everyone else is playing catch up now.


Lots of salty researchers are shitting on ChatGPT, meanwhile every single person I know is shocked. If this isn't so innovative, why didn't they release something sooner?


If you actually read the article he answers your question directly. If you are too lazy to scroll to the bottom here's a link to a tweet: https://twitter.com/ylecun/status/1617908306420600833?s=20&t...


Be kind. Don't be snarky. ...Edit out swipes. ...Please don't comment on whether someone read an article. "Did you even read the article? It mentions that" can be shortened to "The article mentions that". https://news.ycombinator.com/newsguidelines.html


He thinks the engineering work that makes the model training and inference possible is not so impressive either...


In its current iteration ChatGPT is more overhyped than crypto, and that says a lot.


I think he just switched team and now he's on the Facebook twitter marketing team.


Well, he is right that the techniques used have been around, but this is the first mainstream AI that most people outside of the AI community have been impressed with and found useful for real-world stuff.


Ivory tower type calls good engineering ‘not particularly innovative’


He will change his tune if ChatGPT uses CNN


Useful >> Innovative


"Dropbox is just rsync + svn"


"No wireless. Less space than a nomad. Lame."



