
What is GPT-3? written in layman's terms - skylarker
https://tinkeredthinking.com/index.php?id=841
======
solidasparagus
I have some issues with how this article describes GPT and its impact. The
scenarios it describes as using GPT-3 remain the stuff of science fiction
as far as I know. They would require major technological breakthroughs in
areas like autonomous agents and combining language models with
'concept-based' models, plus more mundane things like labelling (can you
imagine how difficult it would be to accurately label all legislation for
'nefarious details', given that even humans can't agree on what is
nefarious?).

More specifically:

- Training on data is a lossy process. In your examples, GPT would actually
have a worse memory than your lawyer or therapist. There is no way to combine
language models and something more abstract like 'facts'.

- GPT has shown zero ability to do anything consistently successfully without
a human in the loop. When it comes to bringing AI models into production, this
matters a lot. There's no way autonomous therapists are coming from GPT-3 when
half the time the model spews out potentially dangerous garbage. You can't
teach GPT-3 not to hurt people because it has no concept of people or of
hurting them. It JUST knows the shape of English.

- GPT is an unsupervised model (in terms of the data-labelling work
required). It has made no breakthroughs in reducing the labelled data
required to fine-tune a model for a specific task, which remains a gigantic
problem for productionizing models. How are you going to build an autonomous
therapist? That data remains as inaccessible and impossible to label as ever.

- Please stop telling people that neural nets are related to brain neurons.
They have essentially no relationship other than the name, and it just
fosters this fear of Terminator and obscures the real issues that need to be
thought about. This is just my personal opinion, but I'm so tired of having
to spend my time telling otherwise smart people who don't know better that we
aren't close to Terminator.

GPT is an impressive technical accomplishment, but its impact on the world
has been exaggerated quite a bit IMO. Some of the demos I've seen are almost
certainly smoke and mirrors, or very carefully chosen, human-in-the-loop
examples.

~~~
innagadadavida
A much more realistic and accurate take on the tech than the article. One
combination that could work is GPT-3 with an unskilled human judge in the
loop: the human discriminates with common sense while the machine generates
good jargon with correct grammar.

~~~
mattkrause
That seems even worse!

Jargon (sometimes) isn't just obfuscation; it carries important shades of
meaning that would be tedious to spell out every time. 'Homicide' and
'murder', for example, are sometimes used interchangeably, but are not
actually legally interchangeable. You would not want your involuntary
manslaughter plea upgraded to murder, a more serious--and therefore more
severely punished--offense.

You could, of course, try to train "unskilled labor" to detect things like
that. By the time you are done, the labor won't be unskilled and you will have
reinvented the paralegal.

------
bozzcl
> GPT-3 continuation: A confused voice came from inside. When I opened the
> door, the person that looked back at me was Hayama Hayato. Why was Hayama,
> who I only shared memories of me playing soccer with, in my room at this
> hour of the night? That question immediately flew out from my mouth.

[I knew that name felt familiar](https://oregairu.fandom.com/wiki/Hayato_Hayama).
Does that mean GPT-3 was trained on an arbitrary, huge database of text? I
wonder how copyright applies here.

~~~
lumost
There's an open question as to whether GPT-3 simply provides a text
completion API for the internet. The original paper discusses the risk that
datasets for specific evaluation tasks (or equivalents) were simply memorized
by the model.

A completion API for the internet could still be an incredibly valuable
component.

~~~
fock
Yes! If a thing with performance similar to GPT-3 could just spew out links
to where its "inspiration" comes from, it would be (super expensive?) real
context-aware search. This could be really great.
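The kind of "show me your inspiration" search described here can be sketched
as a nearest-neighbour lookup over a document collection. This is a toy
illustration only: the corpus and URLs below are made up, and a real system
would use learned embeddings over billions of pages rather than bag-of-words
cosine similarity.

```python
import math
from collections import Counter

# Hypothetical "training documents" keyed by invented source URLs.
corpus = {
    "https://example.com/cats": "the cat sat on the mat",
    "https://example.com/dogs": "the dog chased the ball",
}

def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def cite_sources(generated_text, top_k=1):
    # Rank corpus documents by similarity to the generated text and return
    # the URLs where the "inspiration" plausibly came from.
    query = Counter(generated_text.lower().split())
    scored = sorted(
        corpus.items(),
        key=lambda kv: cosine(query, Counter(kv[1].split())),
        reverse=True,
    )
    return [url for url, _ in scored[:top_k]]

print(cite_sources("a cat sat quietly"))
```

The expensive part fock alludes to is doing this at the scale of a web-sized
training set, not the lookup logic itself.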

------
jcims
One thing I will say after seeing GPT-3 running live is that somehow the
examples that you see trotted about in various articles never really seem to
capture the spirit of the thing. Interacting with it directly and exploring
the universe behind the prompt is one of the more profound computing
experiences I've ever had.

~~~
SaltyLemonZest
Agreed. I've been playing AI Dungeon with their GPT-3 model, and it really
does feel like there's a scatterbrained but human DM on the other side.

------
euske
To me, what GPT-n really tells us is how redundant human text in general is.
It doesn't provide a lot of new information (and when it does, it's
accidental); it more or less wanders around the given topic. It is this
redundancy that gives the system a space to churn.

~~~
cblconfederate
The redundancy is not low. GPT-* are basic "pattern generators" for our
language organ. We have pattern generators for our gait, for example: all
those synchronized movements of so many muscles can be effectively controlled
by a well-understood circuit, to the point that even a deafferented animal
can still be made to recognizably walk.

GPT is the equivalent, but for language: it can be used to phrase thoughts.
Real language, however, carries thoughts, and GPT doesn't have any. People
are impressed by how many responses it has learned, but forget that it
contains many gigabytes of "compressed" text associations. It needs
"something else" to become actually useful.

------
thimkerbell
I have seen comments on Reddit that smelled as though something like this was
used to create them. I don't see how, in future, we are going to be able to
meet up and discuss things constructively with strangers online if this thing
can be used to emulate people.

~~~
WA
Or imagine the amount of useless blog posts for SEO spam. We already suffer
from information overload. This will add a lot more.

~~~
logifail
> We already suffer from information overload

I would suggest this isn't quite right, we actually suffer from _content_
overload, not information overload.

The quantity of useful information in a lot of the content we are offered is
depressingly small.

~~~
gverrilla
I would suggest this isn't quite right either: we suffer from advertisement
overload, and "content" is actually a hype word, itself a product of
advertisement (propaganda).

------
ryanjodonnell
Let me guess. GPT-3 wrote this article??

~~~
vannevar
That was my thought as well. It has a lot of the idiosyncrasies I've noticed
in other GPT-3-generated articles: coherent, but meandering, with a lot of
cliches and rhetorical questions that don't really add to the article. Of
course, you also see that in human writing. :-)

------
noiv
I think GPT-3 will change the assumption that written language or sentences
a priori have meaning, an assumption we hold because experience has taught us
that actions might follow. From now on, determining the author and their
intentions is kind of a survival skill. Just imagine half of the Internet was
written by GPT-3 and nobody knows which half.

------
codingslave
A comment from r/machinelearning:

"It seems obvious from the demos that GPT-3 is capable of reasoning.

But not consistently.

It would be critical, imo, to see if we can identify a pattern of activity in
it associated with the lucid responses vs activity when it produces nonsense.

If/when we have such a pattern, we would need to find a way to enforce it in
every interaction"

And people agree:

"Dunno why you are getting downvoted, I agree with you. It seems like to get
GPT-3 to do good reasoning you have to convince it that it is writing about a
dialogue between two smart people. Talking to Einstein, giving some good
examples, etc. all seem to help. Shaping really seems to matter, but I don’t
think we have enough access to the hidden state to determine if there are
quantitative differences between when it is more lucid and when it isn’t.

It’s like Gwern said: “sampling cannot prove the absence of knowledge, only
the presence of it” (because whenever it fails, maybe with a different
context, different sampling parameters, using spaces between letters, etc. it
would have worked)"

It's interesting that this kind of speculation is entering the conversation.
I think we are on the cusp.

~~~
SaltyLemonZest
I'm not convinced that "capable of reasoning, but not consistently" is a
meaningful claim. The examples seem to primarily consist of people spending
hours trying things, until eventually GPT-3 outputs a chunk of reasoning they
could personally do in seconds. Does that mean that GPT-3 is doing the
reasoning, or does it mean that GPT-3 is an English-based lookup table and
they managed to find a clever sequence of search keys?

The fact that there _could_ be reasoning going on is certainly exciting by
itself. But I don't think it's fair to call it obvious without a compact
specification for how to make GPT-3 perform a general class of reasoning. Less
"here's a script to make it output stuff about balanced parens", more "here's
a strategy to teach it most basic string manipulations".

~~~
TomMarius
Have you seen the database prompt?

[https://www.gwern.net/GPT-3#the-database-prompt](https://www.gwern.net/GPT-3#the-database-prompt)

~~~
SaltyLemonZest
As I mentioned, I don't think any single prompt can demonstrate the presence
of true reasoning. If the prompt isn't shown to broadly generalize, it might
just be doing a text match against something that was said before in the
depths of the internet. You can see this in the next section; Kevin Lacker
gets GPT-3 to demonstrate it knows some basic trivia questions, but it
"knows" _any_ prompt with the same textual structure as a basic trivia
question, even if the prompt is nonsense. This strongly suggests that it's
parsing out key words and doing a lookup on them rather than accessing a
consistent internal model.
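The lookup-table failure mode described here can be illustrated with a toy
sketch. Everything below is invented for illustration (GPT-3's internals are
of course vastly more complex): the point is just that keyword matching
answers a nonsense question exactly as confidently as a real one.

```python
# Hypothetical keyword -> answer table standing in for memorized
# text associations.
trivia = {
    "capital france": "Paris",
    "president 1801": "Thomas Jefferson",
}

def answer(prompt):
    # Parse out content words and look them up, ignoring whether the
    # question as a whole makes any sense.
    words = {w.strip("?").lower() for w in prompt.split()}
    for keys, value in trivia.items():
        if set(keys.split()) <= words:
            return value
    return "unknown"

# A sensible question and a nonsense one with the same keywords get
# the same answer.
print(answer("What is the capital of France?"))
print(answer("What blue capital dreams in France?"))
```

Both prompts share the structure of a trivia question about "capital" and
"France", so the lookup cannot tell them apart, which is exactly the
behaviour Kevin Lacker observed.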

------
cjauvin
> So embeddings are bits of binary code that are associated with the words and
> word-snippets that the computer ‘reads’. These embeddings never change.

I'm not entirely sure, but I think this definition of embedding is wrong:
first, they are not binary, they are floats (like the other
parameters/weights), and they do change, as the error backpropagates through
them. They are simply "swappable parameters" explicitly corresponding to each
word, so it's possible to detach and reuse them for other purposes after
training, which is not necessarily easy (or meaningful) with any other weight
matrix in a model.
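A minimal pure-Python sketch of that correction. The vocabulary, dimension,
and "gradient" below are invented for illustration; real embedding tables
have tens of thousands of rows and are updated by actual backpropagation.

```python
import random

random.seed(0)

# Hypothetical toy vocabulary; real models use large subword vocabularies.
vocab = ["the", "cat", "sat"]
dim = 4

# An embedding table is a matrix of floats: one trainable row per word.
embeddings = {w: [random.uniform(-1, 1) for _ in range(dim)] for w in vocab}

def embed(word):
    # "Lookup" is just row selection; the row is an ordinary parameter vector.
    return embeddings[word]

# During training, the error backpropagates into the rows that were used, so
# embeddings change like any other weights (a pretend SGD step here).
lr = 0.1
pretend_gradient = [1.0, 1.0, 1.0, 1.0]
before = list(embeddings["cat"])
embeddings["cat"] = [v - lr * g
                     for v, g in zip(embeddings["cat"], pretend_gradient)]

assert all(isinstance(v, float) for v in embed("cat"))  # floats, not bits
assert embed("cat") != before                           # and they do change
```

After training, the rows for "cat", "sat", etc. can be detached and reused
elsewhere, which is what makes embeddings "swappable" in a way other weight
matrices aren't.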

------
interestica
> If prompted correctly it responds identically a human.

That sentence appears to be grammatically incorrect, which made me think,
"ok, probably written by a human." But then I also realized that typos and
grammar issues are probably prevalent in the data set. How would they
manifest? Will they reinforce emerging changes in language? (E.g., think of
how the meanings of words have slowly mutated.) And how much of the scraped
content is itself written by an AI?

------
jibbit
Apologies, I realize this is a bit of a tangent, but what I would like and
can't find is an understandable description of how generative neural networks
work. My very basic understanding of neural networks is as classifiers: cat
and dog pictures in, likelihood out. I can't get any intuition about how this
can generate new cat pictures. Is it a search problem? How does that search
work?

~~~
BorisTheBrave
Classifiers are just one example of output, one where the final layer of
output is interpreted to be probabilities of different items. Using a
different loss function (or reward function), you can train for the output to
be different things.

But in fact text generators work basically identically to classifiers. You
train the model to classify texts according to which single word comes next,
then you append a word to the text according to that output, and repeat.
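That classify-append-repeat loop can be sketched with a toy model. The
probability table below is invented for illustration; a real language model
computes these next-word distributions with a neural network (over subword
tokens, not whole words), but the sampling loop is the same.

```python
import random

# A toy "language model": a fixed table of next-word probabilities standing
# in for what a trained network would compute from the full context.
next_word_probs = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "ran": 0.3},
    "dog": {"ran": 1.0},
    "sat": {"down": 1.0},
    "ran": {"away": 1.0},
}

def generate(prompt, max_words=5, seed=0):
    """Classify the context into a next word, append it, and repeat."""
    random.seed(seed)
    words = prompt.split()
    for _ in range(max_words):
        probs = next_word_probs.get(words[-1])
        if probs is None:  # no known continuation: stop
            break
        choices, weights = zip(*probs.items())
        words.append(random.choices(choices, weights=weights)[0])
    return " ".join(words)

print(generate("the"))
```

Sampling from the distribution (rather than always taking the most likely
word) is what lets the same model produce different continuations of the
same prompt.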

------
imranq
Good article. I was half expecting at the end to read that the article itself
was generated by GPT3. Luckily GPT3 can’t explain itself just yet.

------
tylerwince
Just wrote about something similar and what we can learn about GPT-3 inputs.
It's applicable to lots of domains, but written for product managers.

[https://productsolving.substack.com/p/openais-gpt-3-will-change-how-we](https://productsolving.substack.com/p/openais-gpt-3-will-change-how-we)

~~~
jcims
I'm curious how long this prompting UI into language models like GPT-3 will
continue. It's so fiddly and imprecise and unpredictable. It's better than
nothing, of course, but there's a lot of art in it.

------
zxcb1
The idea of garbage in, garbage out is worth considering in the context of
human minds and priming. Advertisements and algorithmic curation could
contribute to hazardous information environments. Perhaps GPT agents could
help reveal this dynamic and simulate the extent and consequences of
manipulation?

------
Causality1
"Generate text that could believably have been written by a human" is a task
we're getting very good at. I'm curious how far it extends toward concrete
applications. When can we start using it to fill gaps between the functions
of a human-coded framework?

~~~
Krollifi
In about the average life span of a redwood tree.

~~~
Causality1
I'm probably going to be dead before Google Assistant can correctly interpret
"take me to the McDonald's beside the Target"

~~~
londons_explore
I reckon that's 3 years away...

Current assistant features rely on fuzzily matching your query to an action
template, e.g. "Set alarm for 10pm" might match the template "Make alarm at
time {TIME}", with $TIME=22:00. Minor transformations of parameters can occur
(converting the time from 10pm to 22:00), but those are mostly hard-coded.

Future assistants will use a neural network to do the matching and parameter
conversion, with the networks big enough to encode world data. So things like
"Mcdonalds beside the target" can be encoded by the neural net so your query
matches "Navigate to ${ADDRESS}", with $ADDRESS="Mcdonalds, 21 Foo Avenue".

It's all do-able today, but nobody has done it yet. Probably 3 years away I'd
guess.
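The current template-matching approach described above might look roughly
like this sketch. The templates, action names, and time parser are invented
for illustration, not any real assistant's internals; the point is that each
slot and conversion is hand-written, which is exactly what a big neural
matcher would replace.

```python
import re

# Hypothetical action templates with named slots.
TEMPLATES = [
    (re.compile(r"set (?:an )?alarm for (?P<time>.+)", re.I), "make_alarm"),
    (re.compile(r"(?:take|navigate) (?:me )?to (?P<place>.+)", re.I),
     "navigate"),
]

def parse_time(text):
    # Hard-coded parameter conversion, e.g. "10pm" -> "22:00".
    m = re.fullmatch(r"(\d{1,2})\s*(am|pm)", text.strip(), re.I)
    if not m:
        return text
    hour = int(m.group(1)) % 12
    if m.group(2).lower() == "pm":
        hour += 12
    return f"{hour:02d}:00"

def match(query):
    # Try each template in turn; return the action and its parameters.
    for pattern, action in TEMPLATES:
        m = pattern.fullmatch(query.strip())
        if m:
            params = m.groupdict()
            if "time" in params:
                params["time"] = parse_time(params["time"])
            return action, params
    return None

print(match("Set alarm for 10pm"))
```

Note that "the McDonald's beside the Target" would only be captured here as
an opaque string; resolving it to an actual address is the part that needs
the world knowledge londons_explore describes.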

------
frsandstone
Looks like they also made a podcast.

"#828 - What is GPT-3?"

[https://podcasts.apple.com/us/podcast/828-what-is-gpt-3/id1397732851?i=1000485603786](https://podcasts.apple.com/us/podcast/828-what-is-gpt-3/id1397732851?i=1000485603786)

------
skylarker
Great introduction to GPT-3 and machine learning for non-technical people.
Interesting implications/possible use cases described at the end, including
therapy and regulatory capture.

~~~
visarga
GPT-3 probably has thousands of different skills like these two; we just
haven't explored them much yet.

------
jadbox
Images in the article are not showing up.

~~~
qwertox
Isn't this a text-only article? My issue is rather with the title of the page,
"Tinkered Thinking", instead of "Tinkered Thinking - What is GPT-3?", which
harms bookmarking.

~~~
pferde
Don't just chuck your bookmarks into a pile. Sort them, label them - curate
them.

Otherwise it's pretty much "garbage in, garbage out".

------
tinus_hn
Sounds like it’s similar to the tool that creates these deep dream images, but
for text.

------
dwaltrip
How long until someone builds GPT-4 with several trillion parameters? How much
would that cost?

~~~
Veedrac
Training GPT-3 costs only a few million dollars. Scaling up is still pretty
cheap. I wouldn't be surprised if we see a quadrillion parameter model in 5
years, given the potential value.

Google has actually already trained a trillion parameter model IIUC [1],
though that was a Mixture of Experts so was way cheaper to train.

[1]
[https://twitter.com/lepikhin/status/1278174444528132098](https://twitter.com/lepikhin/status/1278174444528132098)

------
data4lyfe
When is the typical-Hackernews-commenter GPT-3 response bot coming out?

~~~
sk0g
On common topics like Facebook, neural networks, and China, the comment
section would look no different. Hell, the exact same conversations end up
happening most of the time, too.

------
esseti
Where should one start learning how to leverage the OpenAI API to build some
complex logic? I've seen Gym, but it seems to me that those are predefined
envs where you can just play.

