
> Is there actual overconfidence?

There isn't. Arvind is just making this up. (Anytime a post leads with many controversial major claims and then handwaves away the entire discussion with "We won’t waste your time arguing with these claims.", while providing almost zero references for anything at all, you should smell something rotten. No no, go right ahead, feel free to 'waste our time', you're not charging us by the word. By all means, do discuss recent DL work on time-series forecasting and why DL supposedly cannot predict, the industrial/organizational psychology literature on predicting job performance, the prediction of recidivism, why you think scaling is now the dominant paradigm in terms of both researcher headcount and budgets, etc. - we're all ears!)

And we know this in part because surveys regularly ask relevant questions like AGI timelines (the median is still ~2050, which implies most respondents believe scaling won't work), or show that AGI & scaling critics are still in the majority, despite posturing as an oppressed minority; most recently, "What Do NLP Researchers Believe? Results of the NLP Community Metasurvey", Michael et al 2022 (https://arxiv.org/abs/2208.12852). If you believe in scaling, you are still in a small minority of researchers, pursuing an unpopular and widely-criticized paradigm. (That it is still producing so many incredible results and appearing so dominant despite being so disliked and so small is, IMO, to its credit, and one of the best arguments for why new researchers should go into scaling - it is still underrated.)




Why is the scaling hypothesis so disliked? I guess "we just need MOAR LAYERS" comes off as a bit naive and implies that a lot of extant research programs are dead ends, so part of it is researchers not wanting to look like AI techbros, and part is mutual backscratching. But it's proven surprisingly productive for the last few years, so I'd have expected more people to jump on board, if only out of hype.
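
(For concreteness: at its most mechanical, "MOAR LAYERS" rests on the empirical observation that test loss falls as a smooth power law in parameter count and training tokens. A minimal sketch of what such a fit predicts, using the constants Hoffmann et al 2022 ("Chinchilla") report for L(N, D) = E + A/N^alpha + B/D^beta; the parameter counts in the loop are arbitrary illustrations, not anything from the paper:)

    # Toy illustration of a neural scaling law; the constants are the
    # published Chinchilla fit (Hoffmann et al 2022), not my own numbers.
    def predicted_loss(n_params: float, n_tokens: float) -> float:
        """L(N, D) = E + A/N**alpha + B/D**beta."""
        E, A, B = 1.69, 406.4, 410.7   # irreducible loss + fit coefficients
        alpha, beta = 0.34, 0.28       # fitted exponents
        return E + A / n_params**alpha + B / n_tokens**beta

    # ~20 tokens per parameter is the compute-optimal ratio from the same paper.
    for n in (1e8, 1e9, 1e10, 1e11):
        print(f"{n:.0e} params -> predicted loss {predicted_loss(n, 20 * n):.2f}")

Nothing in that picture rewards a clever new architecture; you just buy your way down the curve, which is exactly what makes it unappealing as an academic research program.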


> Why is the scaling hypothesis so disliked?

This is a difficult question. I have several hypotheses. When one works with highly productive/prestigious/famous people, one can't help but notice[1] how they instinctively avoid subproblems in their niche which don't let them leverage their intelligence/energy/fame to a sufficient degree, and/or don't accrue enough credit to their personal brand. This dynamic is at play in academia, and in tech corporations as well (though arguably, tech is still more amenable to large-scale engineering). These people live by (the public perception of) credit and have a very fine sense for it.

Scaling could be an inconvenient lever for applying one's academic presence (in recent history, at least; perhaps decreasingly so in the near future, as the obvious breakthrough results become widely known) simply because there is little prestige behind it: at least as understood by academics, it is merely engineering work, which doesn't require true genius.

Naturally, academics are led to the creation of sophisticated theories, or even whole fields (even better, because then it's hard to measure the output), built to preemptively predict and deflect as much criticism as possible - and they come to view such efforts and their conceptual output as the mark of true intelligence. Engineering large-scale systems executing gradient descent doesn't look valuable from such a point of view, because it says little about the brilliance of the chief investigator.

1. http://yosefk.com/blog/10x-more-selective.html



