
> Is there actual overconfidence?

There isn't. Arvind is just making this up. (Anytime a post leads with many controversial major claims and then handwaves away the entire discussion with "We won’t waste your time arguing with these claims.", while providing almost zero references for anything at all, you should smell something rotten. No no, go right ahead, feel free to 'waste our time', you're not charging us by the word. By all means, do discuss recent DL work on time-series forecasting and why DL supposedly cannot predict, the industrial/organizational psychology literature on predicting job performance, the prediction of recidivism, why you think scaling is now the dominant paradigm in terms of both researcher headcount and budgets, etc. - we're all ears!)

And we know this in part because surveys regularly ask relevant questions like AGI timelines (the median is still ~2050, which implies most respondents believe scaling won't work), or show that AGI & scaling critics are still in the majority, despite posturing as an oppressed minority; most recently, "What Do NLP Researchers Believe? Results of the NLP Community Metasurvey", Michael et al 2022 (https://arxiv.org/abs/2208.12852). If you believe in scaling, you are still in a small minority of researchers, pursuing an unpopular and widely-criticized paradigm. (That it is still producing so many incredible results and appearing so dominant despite being so disliked and so small is, IMO, to its credit, and one of the best arguments for why new researchers should go into scaling - it is still underrated.)




Why is the scaling hypothesis so disliked? I guess "we just need MOAR LAYERS" comes off as a bit naive and implies that a lot of extant research programs are dead ends, so part of it is researchers not wanting to look like AI techbros, and part is mutual backscratching. But it's proven surprisingly productive for the last few years, so I'd have expected more people to jump on board, if only out of hype.
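
(For concreteness: at its most mechanical, "MOAR LAYERS" rests on the empirical observation that test loss falls as a smooth power law in parameter count and training tokens. A minimal sketch of what such a fit predicts, using the constants Hoffmann et al 2022 ("Chinchilla") report for L(N, D) = E + A/N^alpha + B/D^beta; the parameter counts in the loop are arbitrary illustrations, not anything from the paper:)

    # Toy illustration of a neural scaling law; the constants are the
    # published Chinchilla fit (Hoffmann et al 2022), not my own numbers.
    def predicted_loss(n_params: float, n_tokens: float) -> float:
        """L(N, D) = E + A/N**alpha + B/D**beta."""
        E, A, B = 1.69, 406.4, 410.7   # irreducible loss + fit coefficients
        alpha, beta = 0.34, 0.28       # fitted exponents
        return E + A / n_params**alpha + B / n_tokens**beta

    # ~20 tokens per parameter is the compute-optimal ratio from the same paper.
    for n in (1e8, 1e9, 1e10, 1e11):
        print(f"{n:.0e} params -> predicted loss {predicted_loss(n, 20 * n):.2f}")

Nothing in that picture rewards a clever new architecture; you just buy your way down the curve, which is exactly what makes it unappealing as an academic research program.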


> Why is the scaling hypothesis so disliked?

This is a difficult question. I have several hypotheses. When one works with highly productive/prestigious/famous people, one can't help but notice[1] how they instinctively avoid subproblems in their niche which don't let them leverage their intelligence/energy/fame to a sufficient degree, and/or don't accrue enough credit to their personal brand. This dynamic is at play in academia, and in tech corporations as well (though arguably, tech is still more amenable to large-scale engineering). These people live by (the public perception of) credit and have a very fine sense for it.

Scaling could be an inconvenient lever for applying one's academic presence (in recent history, at least; perhaps decreasingly so in the near future, as the obvious breakthrough results become widely known) simply because there is little prestige behind it: at least as understood by academics, it is merely engineering work, which doesn't require true genius.

Naturally, academics are led to the creation of sophisticated theories, or even whole fields (even better, because then it's hard to measure the output), built to preemptively predict and deflect as much criticism as possible - and they come to view such efforts and their conceptual output as the mark of true intelligence. Engineering large-scale systems executing gradient descent doesn't look valuable from such a point of view, because it says little about the brilliance of the chief investigator.

1. http://yosefk.com/blog/10x-more-selective.html



