I am not a fan of this trend of "Language Models Are X" in recent work particularly out of OpenAI. I think it's a rhetorical sleight of hand which hurts the discourse.

Like, the exact same paper could have instead been titled "Few-Shot Learning with a Large-Scale Language Model" or similar. But instead there seems to be this extremely strong desire to see certain ineffable qualities in neural networks. Like, it's a language model. It does language modeling. Turns out you can use it for few-shot learning and do amazingly well. Beyond that, what does it mean to say it "is" a few-shot learner?

On one hand, it's literally the same claim in a strict sense. On the other hand, it implies something much broader and more sweeping, that language modeling / unsupervised learning as a task over long contexts inherently implies meta-learning ability — which is a statement that is very difficult to properly formulate, let alone back up. But that's the argument that I feel is being slipped under the table by these titles. (And indeed it's very close to what they suggest in the text, though with no more than a wave of the hands.)

Don't get me wrong: their intuition is reasonable, it's super cool that they got this to work, and the results are very impressive on lots of tasks (though there are clear gaps). But as a VERY publicly watched lab, they have a serious duty (which I think they're neglecting) to frame their results more carefully. In particular, there's a sort of religion that if you train a big enough model on big enough data with self-supervision, it will somehow become AGI and/or learn to solve arbitrary problems. Claims like "Language Models are Few-Shot Learners" are clearly designed to fit into that worldview, even though the research doesn't point at it any more than a more conservative interpretation like "Lots of NLP Tasks are Learned in the Course of Language Modeling and can be Queried by Example." They touch on this limitation in their discussion section but I guess flashy titles are more important. I wish they would use their status to set a better example.

For a specific example of how I think their framing is unhelpful: in the LAMBADA evaluation (sec. 3.1), they suggest that one-shot performance is low "perhaps...because all models still require several examples to recognize the pattern." This may be the first thing you'd think of for a few-shot learner, but then why is zero-shot performance higher than one-shot? If you remember that you're working with a language model, there's another possible explanation: the model probably treats the last paragraph as a narrative continuation of the previous ones, and gets confused by the incongruity or distractors. (The biggest model is able to catch on to the incongruity, but only when it's seen it before, i.e., with >1 example.) Of course, this is just one possible explanation, and it's arguable, but the point is I think it's more useful to think of this as a language model being used for few-shot learning, not a few-shot learner where language modeling is an implementation detail.
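To make the "language model being used for few-shot learning" reading concrete, here is a minimal sketch of how a k-shot prompt is typically assembled: the solved examples and the unsolved query are simply concatenated into one piece of text for the model to continue. The function name `build_prompt`, the blank-line separator, and the toy passages are all hypothetical and chosen for illustration; this is not the exact format used in the paper.

```python
def build_prompt(examples, query, k):
    """Join k solved (passage, final_word) pairs and the unsolved query into one text.

    Hypothetical format: each solved example is the passage followed by its
    final word, and blocks are separated by a blank line.
    """
    shots = [f"{passage} {final_word}" for passage, final_word in examples[:k]]
    return "\n\n".join(shots + [query])


# Toy data, invented for illustration only.
examples = [
    ("She looked for her keys and finally found her", "keys"),
    ("He ordered coffee, and the waiter brought him the", "coffee"),
]
query = "They packed the tent and drove off with the"

zero_shot = build_prompt(examples, query, 0)  # the query passage alone
one_shot = build_prompt(examples, query, 1)   # one unrelated passage, then the query
```

With k=1 the model just sees one unrelated passage followed by another, which is consistent with the "narrative continuation" explanation above: nothing in the text itself marks the first passage as a worked example rather than earlier story context.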

> But as a VERY publicly watched lab, they have a serious duty

I was nodding right along with you, and then...

OpenAI has no duty. It doesn't matter if they're publicly watched. What matters is whether the field of AI can be advanced, for some definition of "advanced" equal to "the world cares about it."

It's important to let startups keep their spirit. Yeah, OpenAI is one of the big ones. DeepMind, Facebook AI, OpenAI. But it feels crucial not to reason from the standpoint of "they have achieved success, so due to this success, we need to carefully keep an eye on them."

Such mindsets are quite effective in causing teams to slow down and second-guess themselves. Maybe it's not professional enough, they reason. Or perhaps we're not clear enough. Maybe our results aren't up to "OpenAI standards."

As to your specific point, yes, I agree in general that it's probably good to be precise. And perhaps "Language Models Are Few-Shot Learners" is less precise than "Maybe Language Models Are Few-Shot Learners."

But let's be real for a moment: this is GPT-3. GPT-2 is world-famous. It's ~zero percent surprising that GPT-3 is "something big." So, sure, they're few-shot learners.

In time, we'll either discover that language models are in fact few-shot learners, or we'll discover that they're not. And that'll be the end of it. In the meantime, we can read and decide for ourselves what to think.

I think all researchers and science communicators have a duty to present science in a way which educates and edifies, and doesn't mislead. It's not just that they're successful, but that their publicity gives them a prominent role as science communicators. Science is all about questioning your assumptions and acknowledging limitations. They claim the public interest in their charter. I think it's reasonable to demand integrity from them, at least as much as it is from any other researcher, if not more. And I think OpenAI would agree with me on that point.

It's easy to say they "have a duty to present science in a way which educates and edifies, and doesn't mislead." But sometimes it takes years even for scientists to really understand what they have created or discovered. It's cutting edge, not well known, hard to communicate. How could lay people keep up when not even scientists have grasped it fully?

Of course, if the same scientists were asked about something where the topic has settled, they could be more effective communicators.

> OpenAI has no duty. ...

Of course they do! It's the same duty as every scientist has in advancing the public understanding of science. You seem to be replying to OP as if they said that only big AI research groups have this duty, but this is just not so. Furthermore, when a prominent group of scientists conduct themselves poorly, it is not enough to say that they have no special extra duty due to being famous; they already must communicate properly because they are scientists and part of the scientific community.

I think one reason these conversations get so muddled is because it's all new and really pretty cool, so it becomes hard to tell what's skepticism and what's naysaying.

> Such mindsets are quite effective in causing teams to slow down and second-guess themselves.

Absolutely not, this goes directly against the scientific method. Such "mindsets" of trying to make sure that your results are correct and accurately presented without embellishment are a cornerstone of science. Of course it causes them to slow down! They have more work to do! Second-guessing themselves and their experiments is the whole fucking point.

Well. OpenAI did have their own "Don't be evil" moment.


Given their prophylactic strategy, "AI for everyone", they could argue that hype generates public interest.

Yeah, it's kind of odd that OpenAI uses these weird titles.

"Few-Shot Learning with a Large-Scale Language Model" makes more sense.

Even with their robot hand paper, they titled it along the lines of "we solved a Rubik's Cube" not "a robot hand manipulated the cube and solved it."

It's academic PR. Many disciplines have really cute titles that don't really match the research. My old discipline (psychology) was really known for this.
