
> Or maybe it-is-only-about™ gathering and labeling large amounts of data for their training?

Well, GPT-3 isn’t a classifier and it isn’t using labeled data.

As an outsider, it definitely appears that GPT-3 is an engineering advancement, as opposed to a scientific breakthrough. The difference is important because we need a nonlinear breakthrough.

GPT-3 is a bigger GPT-2. As far as we know, there is no more magic. But I think it’s a near certainty that larger models will not get us to AGI alone.



As someone in the deep learning community, I disagree with your assessment that GPT-3 is not a scientific breakthrough. The GPT-3 paper won a best paper award at one of the most prestigious machine learning conferences, after all. GPT-3 didn't make any modeling advances, but it introduced a completely new paradigm with few-shot learning.


Perhaps we have to distinguish between GPT-3-the-model and the GPT-3 paper. IMHO GPT-3 as a model is straightforward engineering, putting a lot of resources into an oversized GPT-2. And while there's significant novelty in the "Language Models are Few-Shot Learners" paper about how exactly you apply these models, that is orthogonal to GPT-3-the-model; the scientific content of that paper applies to any other powerful language model and isn't intimately tied to the specifics of GPT-3.

In essence, I feel that the same people introduced two quite separate things: a completely new paradigm for obtaining few-shot learning from a language model in a way that competes with supervised learning on the same tasks; and the GPT-3 large model, which is used as "supplementary material" to illustrate that new paradigm but is also usable (and used) with the old paradigms, and by itself isn't a breakthrough. And IMHO when the public talks about GPT-3, they really do mean GPT-3-the-model and not the particular few-shot learning approach.
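
To make the distinction concrete, here's a rough sketch of what the few-shot paradigm looks like in practice: instead of fine-tuning on labeled examples, you put a handful of demonstrations directly into the prompt and let the model infer the task. (Purely illustrative; the commented-out `complete` call at the end is a stand-in for whatever language-model completion endpoint you have access to.)

    # Sketch of few-shot prompting: demonstrations go in the prompt;
    # no gradient updates and no labeled training set are involved.
    def build_few_shot_prompt(examples, query):
        lines = [f"Review: {text}\nSentiment: {label}\n" for text, label in examples]
        lines.append(f"Review: {query}\nSentiment:")
        return "\n".join(lines)

    examples = [
        ("A delight from start to finish.", "positive"),
        ("I walked out halfway through.", "negative"),
    ]
    prompt = build_few_shot_prompt(examples, "Surprisingly moving, great cast.")
    print(prompt)
    # prediction = complete(prompt)  # hypothetical call to the model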


Those two are tied together because many of those few-shot capabilities only emerge at scale. If OpenAI had trained a large model and not analyzed it rigorously, it would have had very little scientific value. But it would have been impossible to get the scientific value without the engineering effort.


Agree, just because it took money and scale to do it doesn’t mean it isn’t a breakthrough.


Bigger models might get us to AGI alone. I say that because of the graphs in this paper: https://arxiv.org/pdf/2005.14165v4.pdf

Quality keeps increasing with parameter count. Even now, interfacing with Codex leads to unique and clever solutions to the problems I present it with.
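
For what it's worth, the curves in that paper look roughly like power laws: loss keeps dropping smoothly as parameter count grows, with no plateau in sight yet. A toy illustration of that shape (the constants below are made up for illustration and are NOT the paper's actual fits):

    # Toy power-law scaling curve, loss ~ a * N**(-b).
    # Constants are invented to show the shape, not fitted values.
    a, b = 10.0, 0.07
    for n_params in (1e8, 1e9, 1e10, 1e11, 1.75e11):
        loss = a * n_params ** (-b)
        print(f"{n_params:.2e} params -> loss ~ {loss:.2f}")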


Not in the field, so genuine question: what is the evidence/theory to support the notion that deep learning is at all a reasonable route towards AGI? As I understand it, this is nothing like how actual neurons work - and since they are the only "hardware" that has ever demonstrated general intelligence, hoping for AGI from current computational neural networks feels like a stretch, at best.


Why is AGI the goal instead of continuing to augment human intelligence with better tools?



