
GPT-3 Is “Mindblowing” If You Don’t Question It Too Closely - EllyFant
https://mindmatters.ai/2020/07/gpt-3-is-mindblowing-if-you-dont-question-it-too-closely/
======
pornel
It doesn't matter if the GPT really "understands" anything. That's like
pondering whether submarines can swim.

The fact is, it is more general-purpose, and generates text that is much more
coherent than what we've had before. A decade or two ago, even that level of
apparent comprehension was science fiction.

~~~
Udik
> That's like pondering whether submarines can swim.

It's an old adage. But it's less smart than it sounds [1]. Swimming is an
action defined by its mechanics: you swim when you do certain precise
movements with your body. Understanding (or thinking, in the original
formulation) is a process that is defined by its consequences. The
consequences of understanding are simply that you respond in an appropriate
way to the original stimulus with speech or actions. To go back to the
original metaphor, it's not about the specific movements of swimming, it's
about going from point A to point B in water. And yes, submarines can do it.
And yes, GPT-3 seems to understand a lot, in that it responds to a lot of
inputs in a meaningful way.

[1] Take that, Dijkstra!

------
gwern
Tired lame criticisms.

For example, Bender's award-winning paper might impress you less if you knew
that GPT-3 already solved their counterexamples which supposedly demonstrated
what language models will never be able to do:
[https://www.gwern.net/GPT-3#bender-koller-2020](https://www.gwern.net/GPT-3#bender-koller-2020)

Van den Broeck is just wrong, there's a lot of incredible scientific value in
the GPT-3 work. The meta-learning, which is the main result of the paper, is
itself a landmark finding, plus all the bonus material about scaling curves.
Dismissing that is a little like dismissing finding the Higgs because 'we all
knew the Higgs existed, it just shows how much money the EU was willing to
throw at the LHC'. Good grief.

And Kevin Lacker's post, which I am apparently doomed to see cited endlessly,
is less than meets the eye; many of Lacker's (and Shane's) failure cases can be
solved with better prompts and sampling settings:
[https://www.gwern.net/GPT-3#common-sense-knowledge-animal-eyes](https://www.gwern.net/GPT-3#common-sense-knowledge-animal-eyes)
and following sections. (I hadn't tested the prompts about 'what number comes
before one thousand' etc., but testing the 10,000 one right now, better
sampling fixes that one as well.)

~~~
randomsearch
Fair point, but perhaps the comparison with the Higgs is somewhat overblown.

------
SpicyLemonZest
A lot of the referenced analysis makes sense, but this article comes across as
pessimistic far beyond what any of the commenters they're quoting said or
would endorse. Identifying the political leader with the most power over the
American colonies demonstrates significant conceptual understanding, even if
that's not quite what "president of the United States" means. And innumeracy
isn't "odd for a computer system" \- there's no fundamental property of
numeracy we'd expect to reach out from the motherboard and make a text-based
AI good at math.

~~~
wombatmobile
Could we be optimistic for a moment?

There's plenty of time left for progress. Imagine if all of the issues raised
in the article were addressed and implemented in GPT-4.

Imagine more iterations, all the way up to GPT-42, sometime in your great
grandchildren's lifetime.

Q: What is the difference between what GPT-42 might do, and what you might do
to answer this question?

~~~
SpicyLemonZest
You'd have to study the GPT architecture to answer that question. For example,
there's a comment upthread describing how its arithmetic capacity may be
artificially low due to the way it parses math problems. I've seen another
analysis, I forget where, explaining that it's fundamentally incapable of
certain kinds of complex reasoning because the sequence of transformations
producing the output is static and non-rewindable.
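
To make the parsing point concrete: GPT's byte-pair encoding chops digit
strings into arbitrary multi-character chunks, so the model never sees a clean
place-value representation of a number. A quick way to see this yourself,
assuming the Hugging Face transformers library (GPT-3 reportedly reuses the
GPT-2 tokenizer):

    from transformers import GPT2Tokenizer  # pip install transformers

    tok = GPT2Tokenizer.from_pretrained("gpt2")

    # The same kinds of quantities are split into different, inconsistent
    # chunks, which makes digit-level arithmetic harder than it needs to be.
    for text in ["2437 + 23 =", "1000", "9999 + 1"]:
        print(text, "->", tok.tokenize(text))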

While the article's extreme skepticism is unwarranted, most experts do seem to
agree that GPT can't scale to general human-level intelligence.

~~~
wombatmobile
Thank you for a considered response, SpicyLemonZest. When I posted that
question I half expected to be howled down as a troll, a naysayer, or off-
topic. I'm glad you did none of that and answered by sharing your thoughts. As
humans in the community of HN, we are richer for it.

> most experts do seem to agree that GPT can't scale to general human-level
> intelligence.

What is it, would you consider, about human-level intelligence that's
unreachable by machines, even machines as advanced as V42 of GPT? Can you put
your finger on it, or make a little start?

~~~
SpicyLemonZest
I doubt there's any property of human-level intelligence that's inherently
unreachable by machines. The question is whether GPT-42 would actually be
tremendously advanced, or whether we're close to the limit of what the
architecture can produce.

The consensus among experts I've read is that known architectural limitations
make the GPT-n series unlikely to scale beyond "sounds very human if you don't
question it too closely". Algebra, to pick a concrete example, is something
that there's reason to believe the GPT-series can't learn effectively - the
output generation mechanism is fundamentally incapable of looping or
recursion, which severely limits how well it can break things down into
subproblems.
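
To illustrate that limitation with a toy sketch of my own (not from any
paper): each generated token comes out of a fixed stack of layers, so the
compute spent per token never grows with the difficulty of the input, whereas
breaking an algebra problem into subproblems naturally calls for
data-dependent recursion.

    import math

    def fixed_depth(x, layers=96):
        # stand-in for a transformer forward pass: always exactly `layers`
        # transformations per token (GPT-3's largest model reportedly has 96),
        # no matter how hard the input is
        for _ in range(layers):
            x = math.tanh(x)
        return x

    def recursive_eval(expr):
        # data-dependent recursion: the amount of work grows with the nesting
        # depth of the expression, something a fixed pipeline cannot do
        if isinstance(expr, int):
            return expr
        op, left, right = expr
        a, b = recursive_eval(left), recursive_eval(right)
        return a + b if op == "+" else a * b

    print(fixed_depth(0.5))                                 # fixed cost
    print(recursive_eval(("+", 2, ("*", 3, ("+", 4, 5)))))  # 29; cost grows with depth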

------
renox
I remember an article which showed that the quality of GPT-3's answers was
very dependent on the previous questions, which I find quite surprising as it
has already been trained on a large number of texts.

And there was an attempt to train GPT-2 with a common-sense database:
[https://www.quantamagazine.org/common-sense-comes-to-computers-20200430/](https://www.quantamagazine.org/common-sense-comes-to-computers-20200430/);
it would be interesting to see the results with GPT-3.

------
lambdatronics
I would love to see someone give GPT-3 some verbal reasoning tests (a la
SAT/ACT).

