Hacker News
Why are deep learning technologists so overconfident? (aisnakeoil.substack.com)
97 points by leonidasv on Aug 31, 2022 | 153 comments



Is there actual overconfidence?

I think that if, back in 2020, we had surveyed deep learning technologists about whether the capabilities of (for example) the image generation systems discussed in today's other threads would materialize by 2022, more than half would have said it wasn't going to happen that fast, demonstrating underconfidence rather than overconfidence.

For another example, when people in NLP put up benchmarks like the GLUE dataset, a reasonable expectation was that they would remain meaningful for some time, but the previously too-hard tasks became too easy just a year or so later - again demonstrating underconfidence rather than overconfidence.

IMHO the overconfidence appears when people listen to the PR of companies overpromising products, not to the actual technologists doing research on the technology.


I agree. There are two different sets:

1. Actual deep learning researchers, actively doing research. I rarely hear nonsense about impending AGI or mass technological unemployment from these folks. They might express tepid concern about the ethics of various applications of deep learning and tepidly point out that some technological disruption in employment is possible, but in both cases with an emphasis on tepid. If anything, they tend to under-estimate progress. (Self-driving is still very far off and also much further along than I thought it would be 9 years ago. I turned down jobs at self-driving companies because I didn't believe they would make enough progress to even have an ADAS product, but I was clearly wrong, and if I could do it over I'd work on self-driving from 2013 onwards even though L5 is still a long way out.)

2. Deep learning fan-boys, for lack of a better term. The rationalist community in particular has a sub-community of tech-adjacent folks who aren't publishing in major conferences every cycle or running research labs but do talk a lot about AGI/UBI.

IMO it's not that dissimilar from climate science, or, in the extreme, the existence of aliens. Scientists with a lot of real expertise will sort of tepidly talk you through the full complexity. And then some "true believers" who aren't actually experts will sort of run to the extremes of anti-natalism or aliens among us. If that makes sense.


> IMO it's not that dissimilar from climate science or even in the extreme the existence of aliens.

Those may be similar to the contrasting attitudes regarding AI, but the analogy I first thought of for what you describe is the topic of cryptocurrency.

1. On the one hand, most people who have deep knowledge and experience of software development and databases, or professional experience of financial markets and banking, tend to be extremely sceptical, critical, or outright dismissive of the whole idea of cryptocurrency.

2. On the other, people who are technology "enthusiasts" and have some limited or self-taught programming skills, or those who have some moderate knowledge of the basics of finance and investments (often motivated by personal ambition), are much more likely to be cryptocurrency fan-boys.


This is just the hype cycle in action. New technology comes out and its fanboys and shallower (from a usecase perspective) users are adamant it'll change everything everywhere. Deep practitioners understand limitations because they're involved in the work. Eventually the fanboys are proved wrong or gravitate to the next hype and everyone else finds the place for a new technology.

A good example is Go some years ago and these days Rust. But it's the same thing, just the hype cycle at work.


I guess John Carmack falls somewhere in between, but as a complete layperson, when he said he gave AGI by 2030 a 50 percent chance, that at least indicated to me that some really smart people think there is a chance.


1. He predicts "signs of life" by 2030, which is a (probably intentionally) vague statement.

2. He raised $20MM for an AI startup, which is fine and well but also makes him not entirely disincentivized from hype.

3. I wouldn't characterize him as someone in the trenches of deep learning.

More of a meta point: technical depth in more than a few things is impossible in a human lifespan, and just a bit harder once you become a "somebody" since a portion of your life becomes consumed by the fact that you're a "somebody". You end up doing things like raising VC money and starting companies with bold ambitions. Its own time sink.

I had this realization when I had a conversation with Lamport about a niche topic in distributed systems and he expressed a position that was just wrong. It was a minor point that didn't really matter much at all, but he was pretty confident in a conjecture I knew was wrong. To be clear, the fact that no one can be an expert on everything -- even everything within a subfield of CS -- doesn't detract from the fact that geniuses exist. Someone can "forget more than you know" and also not know something that you know. Life is just sadly very short.


That's true, and he seems to have flopped on the rockets startup too, although you might argue AI is still much closer to his wheelhouse than aerospace, so it makes more sense. Before hearing that, I had some vague idea that most AI experts' timelines were significantly longer, and some said never.


> Is there actual overconfidence?

There isn't. Arvind is just making this up. (Anytime a post leads with many controversial major claims and then handwaves away the entire discussion with "We won’t waste your time arguing with these claims.", while providing almost zero references for anything at all, you should smell something rotten. No no, go right ahead, feel free to 'waste our time', you're not charging us by the word. By all means, do discuss recent DL work on time-series forecasting and why DL is unable to predict, the industrial/organizational psychology lit on predicting job performance, prediction of recidivism, why you think scaling is now the dominant paradigm both in terms of researcher headcount and budgets, etc - we're all ears!)

And we know this in part because surveys regularly ask relevant questions like AGI timelines (the median is still ~2050, which implies most believe scaling won't work) or show that AGI & scaling critics are still in the majority, despite posturing as an oppressed minority; most recently: "What Do NLP Researchers Believe? Results Of The NLP Community Metasurvey", Michael et al 2022 (https://arxiv.org/abs/2208.12852). If you believe in scaling, you are still in a small minority of researchers pursuing an unpopular and widely-criticized paradigm. (That it is still producing so many incredible results and appearing so dominant despite being so disliked and small is, IMO, to its credit and one of the best arguments for why new researchers should go into scaling - it is still underrated.)


Why is the scaling hypothesis so disliked? I guess "we just need MOAR LAYERS" comes off as a bit naive and implies a lot of extant research programs are dead ends, so there are elements of researchers not wanting to appear to be AI techbros and of mutual backscratching. But it's proven to be surprisingly productive for the last few years, so I'd have expected more people to jump on if only out of hype.


> Why is the scaling hypothesis so disliked?

This is a difficult question. I have several hypotheses. When one works with highly productive/prestigious/famous people, one can't help but notice[1] how they instinctively avoid subproblems in their niche which don't let them leverage their intelligence/energy/fame to a sufficient degree and/or don't give their personal brand enough credit. This dynamic is at play in academia and in tech corporations as well (though arguably, tech is still more amenable to large-scale engineering). These people live by (public perception of) credit and have a very fine sense for it.

Scaling could be (during recent history, at least; maybe increasingly less so in the near future as the obvious breakthrough results become widely known) an inconvenient lever for applying one's academic presence, simply because there is little prestige behind it, and (at least as understood by academics) it's merely engineering work, which doesn't require true genius.

Naturally, academics are led to the creation of sophisticated theories, or even whole fields (even better, because it's hard to measure the output in that case), preemptively predicting and deflecting as much criticism as possible - and to view such efforts and their conceptual output as a mark of true intelligence. Engineering large-scale systems executing gradient descent doesn't really look valuable from that point of view, because it says little about the brilliance of the chief investigator.

1. http://yosefk.com/blog/10x-more-selective.html


I think the big issue is that we have bad intuition about what works well with machine learning and what doesn't, leading to both under- and overconfidence at the same time. When I was at university I came across machine learning in medical imaging. The job of radiologists seemed like an obvious, classic machine learning problem that would be easily solved by the upcoming techniques. In the end, other stuff materialized instead.


The radiology stuff is happening. My father is an administrator at a hospital. They went from having 3 radiologists on staff to 2 because they use AI to screen, which increases productivity. These things don't have to be all or nothing; the hospital is saving hundreds of thousands a year now because radiologists are so expensive.


The "radiology stuff" is not happening. Data from BLS (https://www.bls.gov/oes/current/oes_nat.htm)

Employed Radiologic Technologists in the US:

2014 - 193,400

2021 - 216,380

The growth is larger than the overall population growth in the same period. And that is in a very outsourceable occupation.


To be fair, Radiologic Technologists are different from Radiologists. The former is a certification requiring a 2-year degree; the latter requires a Doctor of Medicine, which takes at least an additional 4 years on top of 4 years of undergrad. Then you need 4 years of residency and an additional sub-specialty year on top of that to be a Radiologist.

So, very different.


The latter is a subset of the former in BLS statistics and I am using the larger category only because BLS apparently only started counting Radiologists as a separate subcategory in 2019.


Yes. Techs and doctors are different!


Two points (your conclusion that AI hasn't had much effect yet may still be right):

Population growth isn't everything. The uninsured rate dropped from 12% to around 8% in that time period. Not many uninsured people get radiologist services unless it is an emergency.

And cheaper AI-assisted Radiologists could increase the overall employment of Radiologic Technologists.


out of curiosity, which hospital is this?


Do you happen to know what stopped the radiologist stuff from happening?

I know I read press releases ages ago saying "hey look, we can do this machine learning", but I haven't really heard anything about it since.


Part of it is medical regulation.

For example, only a human doctor can make a "diagnosis" or write a prescription. Even just making "treatment recommendations" (e.g., a more effective treatment or one with fewer side effects than the current one), which would then be reviewed and approved by an appropriately licensed physician, is a problem.

Not having worked in the radiology space, my naive hunch is that it has to be framed as a "screening tool" which "helps the physicians find cancers." (Which can be justified to regulators because ML tends to have higher recall than humans on these tasks, albeit lower precision).

In theory, ML helps doctors spot cancers they wouldn't ordinarily see. In practice, it probably means that the physicians still technically "review" all images, but only give a cursory glance to those not flagged by the system.
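
To make that recall/precision point concrete, here's a toy example with made-up numbers (purely hypothetical, not from any real screening study):

    # Hypothetical screening run, for illustration only.
    true_cancers = 100      # scans that actually contain a cancer
    flagged_cancers = 95    # of those, the model flags 95 -> high recall
    benign_scans = 10_000   # scans with no cancer
    false_alarms = 800      # benign scans the model also flags

    recall = flagged_cancers / true_cancers                         # 0.95
    precision = flagged_cancers / (flagged_cancers + false_alarms)  # ~0.11
    false_alarm_rate = false_alarms / benign_scans                  # 0.08

    print(f"recall={recall:.2f}, precision={precision:.2f}, "
          f"false alarm rate={false_alarm_rate:.2f}")
    # A human reader might catch fewer cancers (lower recall) but raise far
    # fewer false alarms (higher precision), which is why the model is framed
    # as a screening aid rather than a diagnostic device.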


I am a doctor and an AI practitioner. A radiologist's job is far, far more than pathology detection. No AI will replace the image recognition aspect of their job in my lifetime. And that's the most automatable aspect of their job. The human and legal aspects of their job are a whole different beast. At best, technologies will allow radiologists to be better or more efficient.


My impression was that medical data access for training and eval was the really big thing holding it back... You wind up with deep learning folks obsessing over datasets with tens of examples.


Agreed, deep learning seems to be exceeding almost all expectations over the last decade. The only time it doesn't is when some VC-funded startup is making insane claims.

GPT-3 and Stable Diffusion are much more capable than most predicted


Much more capable at what?


Maybe the problem is trying to understand this in terms of confidence and not in terms of accuracy. In fact, we might say that some are underconfident, some are overconfident but almost all are inaccurate in predicting the form and/or details of progress. In that case we don't have to care about the technologists who thought the recent wave of image generation would take 10+ years AND we don't have to care about the people who think evil AI will bury us under paperclips in the next 30.


Is there actual overconfidence?

No. This is pretty much made-up bullshit. OK, there may be "overconfidence" among spectators and people on the fringes, commenting from the peanut gallery. But among people actually doing real meaningful research? No, this whole story is a big nothing-burger.


remember that time that lesswrong proved agi would be present by 2014?

you're talking about the people that came up with roko's basilisk, which is religion / d&d tier overconfident


WTF does LessWrong have to do with anything? Is anybody associated with LessWrong an actual researcher, as opposed to just an armchair philosopher? And even if some token number are, they hardly represent any kind of field-wide consensus.


> WTF does LessWrong have to do with anything?

It's an example of the previous discussion: people talking about AI in the ridiculously infinitive, in threads. Nothing there says "practicing researchers in papers."

I even agree with what appears to be your evaluation of LessWrong folks, but the thing being asked didn't gate the discussion to people who /should/ /be/ in the discussion, and AI's discussion is about as weighed down by outsider fakers as politics or earth-shapeology is

I mean, a discussion of the discussion of vaccines has to include the cranks, doesn't it? As much as they shouldn't be there, they very much are, and the impact they're having is very real

In case there's any inclarity, LessWrong are the people that got famous for writing incorrect parables about logic principles in the actual tone of Harry Potter fan fiction, an act I can't even make fun of because it's funnier than any of my jokes about it. I had not intended the HPMoR authors as a north star reference.

I had actually indicated them to remind us that the vast majority of the AI discussion isn't by AI people.

In the meantime, even if you gate it to real AI people, it'd be pretty surprising if you can't name any overconfidence. How many different years has Kurzweil assigned the proven birth of AI to? What did Cyc promise? When did Tesla self-driving first miss its mark? Do you remember the promises they used to make about Siri? Am I the only person who remembers that Cue:CAT was sold as an AI device, as was Teddy Ruxpin? Who here remembers IBM Watson? Remember when AI was going to solve credit card fraud? Remember when AI was going to defeat child pornography?

It's not like it was rare. There was a toothbrush called the Ara that was going to use AI to solve cavities. AIs can see your heart attack five years in advance! AIs can predict psychosis from how you write. AIs can pick out criminals by their faces! AI can tell if you're gay.

And then like, you try it in the field, and chihuahua vs blueberry muffin is still a struggle. Tanks and time of day, you know?

AI has more reliably lampooned itself with public promises and failures than any other industry I can think of offhand except maybe Hollywood, and I live on the planet with "fusion is just 10 years away" as a slogan

Also I'm in AI, and I'm responsible for a few of those failures. Go team! Let's promise the moon next, then deliver them a Lego Challenger Shuttle playset a year late.

Don't get me wrong - Stable Diffusion is literal, actual magic. I'm scared for emad, that Lord Oberon is going to beat down his door one night and take it back, and possibly turn him into a tree. He's violated the "the humans turned that star into a donut and we don't actually know how" rule. https://imgur.com/gallery/wpZ4w

But also, AI was going to have given us self-driving cars before Tesla was founded, and I think people have just forgotten that

I mean we haven't even lived up to the 1990s AT&T commercials yet. It's really weird to be this far in the future and this far in the past at the same time.

It's almost 2023 and I'm still yelling "no" into the phone four times in a row to get the menu to understand a single syllable, and barely half of that is my mobile carrier's fault

.

> Is anybody associated with LessWrong an actual researcher

I have no idea. I don't read that stuff. I just like making fun of Roko's Basilisk. If I was going to bet money, I'd put 100% on no, but also, in a community that size, y've gotta figure they got at least a couple over the years?

I'd bet fewer than a quarter as many as you'd expect from just a random sampling of the populace.

I mean. Also let's remember there have been more than a few questionable researchers over the years. I'd probably get flagged if I Bogdanoved a list here, but, look, if you're at a quack convention, a fringe "researcher" is gonna show up for some credibility trading, right?

And also Gwern just said he was there since 2014 in a different thread, and he's very much the real deal, so I guess the answer is "at least one, apparently."

Anyway, if you can't laugh at Roko's Basilisk, buddy, I feel for you. That's eternal comedy gold. There are 1980s cartoon villains who broke the fourth wall and time travelled to the modern day just to make fun of it. You can write it on two pounds of paper and get the gold exchanged for money at most banks. Gary Larson cannot be reached for comment. If you don't know what it means, pretty please look it up; I promise you that it's worth your time. I didn't say D&D tier for no reason.

They genuinely large-group panicked that a scary AI from the future of an alternate universe was going to come back in time to punish them to death for not dumping all their money into speculatively inventing it. Like I'm not even kidding, that's really what it is.

I was definitely not invoking Harry Potter and the Basilisk of Rationality as a bastion of credibility. Pinky swear. I'm not even spoiling it; that's just the beginning. I didn't get to the funny part. Please, set up some warm candle light, some soft music, a couple of your favorite beverages, and a bucket. You're in for a wild ride. (Two points, Quidditch themed presumably, to anyone who chooses to interpret Basilisk in the Carolingian sense.)

Always remember: the discussion about Foo is not limited to Foo professionals


> I mean, a discussion of the discussion of vaccines has to include the cranks, doesn't it? As much as they shouldn't be there, they very much are, and the impact they're having is very real

> Always remember: the discussion about Foo is not limited to Foo professionals

Fair enough, and I mostly agree. I guess there's just a subtle point there to me, about the distinction between the value/merit of the conversation between the professionals and the peanut gallery. But yes, in the broadest sense, the peanut gallery discussion does matter to some extent.


> remember that time that lesswrong proved agi would be present by 2014?

Can't say as I do, and I was there in 2014, so presumably I would remember everyone being super disappointed. Link?


> Can't say as I do, and I was there in 2014, so presumably I would remember

if you've been there since 2014, you know that all of their rokos get taken down, in the hope of curating mistakes away from being noticed. most of the people there since then don't know about that, or much of the other high comedy, either.

i would also describe them as having proven that a future ai from a different universe is going to quantum reality travel to this one to punish us for not devoting our wallets to pay for its invention, which is basiliskically hilarious. one of my favorite things in the history of the internet, right up there with timecube.

sure, a person might say "Well they didn't prove that, that was just one user and the founder and a couple other easily scared campfire listeners," but that would miss the point of how comedy narratives work, in the effort to disprove a statement that wasn't actually meant to be factual, but was rather (and somewhat obviously) meant as a ... let's see, victorian voice, "cutting jibe." nobody's really motivated by "the fact remains that it was not a proof."

if they subsequently just-asking-questionsed me to prove roko's basilisk was real, i definitely wouldn't spend the time, because it's perfectly obvious what i'm talking about, that reality is in no way a supporting column in what i actually said, that what they're asking me for is a direct contravention of what i actually said, and mostly because i don't want to set myself up for a later unpleasant to listen to false meta-narrative where someone patiently explains things to me that aren't related to what i was actually saying

.

> Link?

if, at the end of my saying "to get to the other side," someone asks me to link them to the road, is sure they don't remember an avenue there, and doesn't remember being disappointed by the lack of available poultry, i'm not going to bother.

that isn't how jokes work. it's just not worth my time

unpopular opinion: 95% of the time, someone asking for a link is just trying to call bullshit without coming out and saying it. much like corporate speak has developed "let's circle back" and "let's take this offline," you know? a way to not feel like you're saying the thing you're actually saying.

this cumbersome failure to interact with natural language does, it seems, feel intentional

at any rate, here i am at the intersection of poe and brandolini. i enjoy your webpage. have a good day


I don't argue with the general claim of this article, but the following statements are just wrong:

> Deep learning entered the sphere of mainstream awareness less than a decade ago, but the science is ancient. This 1986 paper in Nature contains almost all of the core technical innovations that make it work well. (The term deep learning hadn’t been coined yet and neural networks was used instead).

> However, the dataset sizes and compute resources that were available in the 1980s weren’t enough to demonstrate the effectiveness of deep learning, and other machine learning techniques like support vector machines took center stage.

I think there are two glaring errors here. First, multilayer error backpropagation wasn't developed in 1986. It was developed in the early 1970s, but was quashed by the elimination of funding after the Perceptrons papers and book. It was rediscovered several times but didn't take hold until the PDP group (Rumelhart et al.) pushed things through with enough gravitas.

Second, and this is the big one: there were plenty of reasonably large datasets, and frankly plenty of sufficiently large compute resources in the late 1980s and certainly the 1990s to do this research. That wasn't the problem. The problem was that neural networks were largely restricted to two layers due to the vanishing gradient problem, so they had to be fat rather than deep, which wasn't good. That's the whole point of the name deep learning -- by overcoming issues which prevented deep networks, researchers were able to do the stuff they can now. That didn't come until later.


>Second, and this is the big one: there were plenty of reasonably large datasets, and frankly plenty of sufficiently large compute resources in the 1980s and certainly the 1990s to do research. That wasn't the problem. The problem was that neural networks were largely restricted to two layers due to the vanishing gradient problem, so they had to be fat rather than deep, which wasn't good.

Even that was not the decisive problem - given enough compute, multilayer NNs could be trained just fine. The real obstruction was a political failure of deep learners (connectionists) in securing academic prestige and the associated access to large-scale compute. By now it's all but forgotten, but this was a prominent academic political debate of the decade: https://en.wikipedia.org/wiki/Neats_and_scruffies and even though "neats" eventually prevailed, the connectionists were still sidelined by statistical ML practitioners (whose efforts didn't happen to scale, although more than a few weighty theses and books happened to be published as a result).

Perhaps in the near future, when this wave of AI gives us impossible-to-deny, life-changing outcomes (for example, new cures for chronic diseases), we will look back and think about how many years or even decades of progress were postponed, lost for nothing but the academic hubris of a camp that won the funding battle then, only to be swept away a few decades later.

Until we really internalize the Bitter Lesson http://incompleteideas.net/IncIdeas/BitterLesson.html, every bit of progress that goes unrealized is our fault.


> Even that was not the decisive problem - given enough compute multilayer NNs could be trained just fine.

As someone who has worked in deep learning, this isn't true. Figuring out the right recipe is extremely important. The regime in which you can effectively train a neural network with good performance is just very small. If you structure things just slightly wrong, training just won't converge.

IMO we totally could have trained something like a smaller ImageNet with a decently large dataset on a supercomputer in the late 90s. We just didn't know how. In the late 90s and early 2000s, NN researchers didn't know about (or didn't appreciate the importance of) batchnorm, or ReLUs, or which hyper-parameters to use. Convnets had been invented but hadn't yet been popularized.

In the late 90s, you might have tried training an 8-layer deep MLP on a dataset of 50K images on your Cray T3E supercomputer. But you would most likely have failed, because you weren't using convnets, your sigmoid activation function led to a vanishing gradient, and your hyper-parameters were off. You might let this run overnight, but 12 hours later, your loss barely went down, and you were out of compute budget on a supercomputer you were sharing with other researchers. Given the right code, the right recipe, you could have made it work on that 90s supercomputer and with the compute budget you had, but the right recipe just wasn't known at the time.

We needed time to figure out and to publish about the really basic ingredients needed to train bigger neural networks successfully. Having access to more compute can help figure out what the right recipe is through trial and error, but ReLUs are a super basic innovation. They are a very simple function, and essentially a theoretical innovation. Someone needed to sit down and figure out that vanishing gradients were actually a problem we needed to think about.
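
As a rough illustration of the vanishing-gradient point, here's a minimal numpy sketch (the depth, width, and initialization are made up; this is not a reconstruction of any 90s experiment):

    import numpy as np

    # Toy backward pass: push a gradient through `depth` random linear layers,
    # multiplying by the activation derivative at each step.
    rng = np.random.default_rng(0)
    depth, width = 8, 256

    def sigmoid_grad(x):
        s = 1.0 / (1.0 + np.exp(-x))
        return s * (1.0 - s)              # never exceeds 0.25

    def relu_grad(x):
        return (x > 0).astype(float)      # exactly 0 or 1

    def backprop_norm(act_grad):
        grad = rng.standard_normal(width)
        for _ in range(depth):
            w = rng.standard_normal((width, width)) / np.sqrt(width)
            pre_act = rng.standard_normal(width)   # stand-in pre-activations
            grad = (w.T @ grad) * act_grad(pre_act)
        return np.linalg.norm(grad)

    print("sigmoid:", backprop_norm(sigmoid_grad))  # collapses by orders of magnitude
    print("relu:   ", backprop_norm(relu_grad))     # shrinks far less with depth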


> Convnets had been invented but hadn't yet been popularized.

> Given the right code, the right recipe, you could have made it work on that 90s supercomputer and with the compute budget you had, but the right recipe just wasn't known at the time.

We could expect that the best of the best, given access to these powerful computing machines, would surely know their LeNet (1989) http://karpathy.github.io/2022/03/14/lecun1989/ and could have studied the statistics of gradients and activations to implement something like ReLU - which is one line of code. It was low-hanging fruit.

> Someone needed to sit down and figure out that vanishing gradients were actually a problem we needed to think about.

If Sepp Hochreiter could do it, surely someone from Caltech could do it as well, given the caliber of people who study there. Again, it looks like a simple problem of misapplication of the best and brightest (who surely knew what they wanted to work on in life, but were still influenced by academic prestige and by their advisors' advice, which would have repelled them from connectionism at the time).


I took a machine learning class in ~2006 and, granted, the prof wasn't specializing in NNs, but he seemed to have little awareness of the impact of dataset size, convnets were never mentioned, and the suggested approach was two-layer MLPs with sigmoid activation functions. The course was based on the book "Artificial Intelligence: A Modern Approach", 2nd edition, which barely mentions neural networks in passing.

I think there just weren't enough people looking at neural networks back then. At the time, they were still considered a niche machine learning technique, just one tool in the toolbox, and not necessarily the best one.

Also, it's one thing to say that the best of the best had access to computation, but if your university has just one supercomputer, which is shared among everyone, and you have a limited number of hours during which you can use it, that's not increasing your chances at running very thorough experiments. If anything, it's not that we didn't have the compute to do deep learning in the 90s and early 2000s, it's that more access to compute makes a broader range of experimentation a lot more accessible. That makes it a lot easier to find the right training recipe more quickly.

Back in 2009 the enthusiasm around neural nets was starting to grow, as there were some early successes with deep neural nets, but I remember some people were still trying to argue that support vector machines were just as effective for image classification. It seemed laughable to me at the time, but I think that everyone who wasn't a connectionist was starting to feel threatened and wanted to justify their chosen research specialty, which they would keep doing until they couldn't anymore.


In the early days (pre 2015) it was extremely difficult to train models with more than two or three layers.

Batch Norm and Residual connections opened the door to actual depth.

(On edit: ReLU activations were also vital for dealing with vanishing gradients. And there were some real advances in initialization, among other things, along the way as well.)
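
For concreteness, here's a minimal sketch of that kind of block, written with PyTorch purely as an illustration (layer sizes and depth are made up):

    import torch
    import torch.nn as nn

    class ResidualBlock(nn.Module):
        """Linear -> BatchNorm -> ReLU with a skip connection around it.
        The identity path gives gradients a direct route to earlier layers."""
        def __init__(self, dim: int):
            super().__init__()
            self.linear = nn.Linear(dim, dim)
            self.norm = nn.BatchNorm1d(dim)   # keeps activations well-scaled
            self.act = nn.ReLU()              # sidesteps the sigmoid's vanishing gradient

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return x + self.act(self.norm(self.linear(x)))

    # Stacking many such blocks stays trainable in a way a plain deep
    # sigmoid MLP typically did not.
    model = nn.Sequential(*[ResidualBlock(128) for _ in range(20)])
    out = model(torch.randn(32, 128))   # batch of 32 made-up feature vectors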


There's a third too: Moore's law. Few researchers in the world had access to the compute resources to do this stuff in the 1980s and 1990s. Today a high-end desktop can train models large enough to do compelling things in a few hours and a phone or a small laptop can execute a model to render a prediction or do pattern classification in seconds.

For the really huge models we now have commodity cloud compute available when in the past you would have had to be among the elect with access to a supercomputer. You couldn't just go whip out a credit card and rent time on a Cray II or a Connection Machine and commodity hardware back then would have taken years to train something like GPT-3 if it even had enough storage.


The story you're telling was the mainstream story from when "deep learning" first arose, using Boltzmann machines for pretraining, etc. But then the field exploded, more compute power was directed at the problem, and people discovered that yes, in fact, basic MLPs with SGD work well with enough compute.


But the point is that there were super important algorithmic advances that allowed training bigger models. It wasn't all hardware... A big part of it was getting more eyes on the problems (via hype over ImageNet results circa 2012) to figure out how to solve them.


How big was the biggest dataset in the 1980s?

And I'm pretty sure that a modern computer with 4 GPUs beats the best clusters from the 80s


You don't need to develop GPT-3 to make major progress in deep neural networks: but in the 1990s the field had stalled due to other reasons that had nothing to do with either compute power or data set size.


most practitioners claim those were the primary causes of the stall

i'm curious what you believe it was instead


The adage that we overestimate technology in the short term and underestimate it in the long term holds. There is no way we all have personal AGI assistants that drive our cars and work jobs for us this decade. By the end of this century we will all have this technology, barring any one of the possible apocalyptic catastrophes besetting us at present. The question is whether this advance is possible without some irrational exuberance today. We need the cult of AI to inspire us to think it's even possible in the first place.


No, the majority of researchers in the field who have been researching for decades already know deep learning isn't the correct way forward; it's just a vocal minority claiming otherwise, for nothing more than purely financial interests.


> No, the majority of researchers in the field who have been researching for decades already know deep learning isn't the correct way forward

(playing Devil's Advocate here)

That doesn't even make sense. Correct way forward to what exactly? I guess if you assume that the only valid goal of AI research is to eventually achieve AGI then you might have a sort-of valid point. Except that nobody "knows" that deep learning isn't the way forward to AGI. Nobody "knows" that it is either.

But all of that said, current deep learning research has absolutely created systems that do amazing things and create tremendous value for society. So in what sense can we say that it "isn't the correct way forward"? Do you suggest abandoning DL completely? If so, in favor of what?

(OK "Devil's Advocate" portion over)

My own position has long been that "just deep learning" might be sufficient to achieve AGI if we eventually make networks that are big enough to support the right kind of emergent behavior. But I've also long been skeptical that this is the most direct or efficient path to AGI. I'm an advocate of hybrid symbolic/sub-symbolic systems that integrate deep learning with other AI strategies.


> My own position has long been that "just deep learning" might be sufficient to achieve AGI

as soon as you say agi, you've lost the plot.


> as soon as you say agi, you've lost the plot.

What plot? Would you care to elaborate?

Hard to imagine how that term could matter. It's just short-hand for some "thing" that people care to talk about. I'm not interested in quibbling over definitions or whatever. Nor am I interested in the kind of discussions that get all wrapped in what "general" means and reduce to arguing that even humans don't have "general intelligence", etc. To me, going down those rabbit-holes is really "losing the plot."


AGI is what someone says when they want to sound like they're talking about AI, but don't know anything about AI, and still feel the need to sound as if they're wearing a lab coat.

It's like when you're in a room where a bunch of people who can't name a medical school are trying to explain to each other what r0 means in a discussion that they believe is about COVID.

It's a red flag.


You do realize that they were trying to have a rational discussion here, right? They're pretty unambiguously asking for information and trying to understand your viewpoint.

If AGI is the wrong term to use then either get over it, or explain what term they should use. Particularly when they ask you to explain why the term is wrong.

The idea of a self aware AI is a little difficult for most people to define. Even if the term AGI isn't accurate, don't you think it's important that people be able to discuss the topic? Especially when those people are quite explicitly trying to learn more about it.


"You do realize that they were trying to have a rational discussion here, right?"

Rational is another HPMoR red flag word.

All rational means is "follows a rationale," or a rules system. Astrology, anti-vaxxing, belief in Hermetic magic, and chemtrails, while all ridiculous and incorrect, are also fully rational. So is Dungeons and Dragons, or internalizing Star Trek lore.

.

"They're pretty unambiguously asking for information and trying to understand your viewpoint."

And I gave it, clearly and politely.

.

"If AGI is the wrong term to use"

No, it's not the wrong term to use. Swap it with a synonym and you have the same problem.

Look. What if I started rambling about fixing aging? Like, straight up immortality. Would you think I was a compelling medical light, or an outsider with dreams that don't make sense given today's realities?

Flying cars? Pocket fusion devices?

"bUt ThEyRe ScIeNtIfIcAlLy PoSsIbLe"

It doesn't matter if I'm "using the wrong term." I could call them aerial vehicles, floating carriages, hover-Datsuns, whatever you want.

The problem is the idea. Anyone who's plying these ideas has completely missed the boat, and doesn't recognize that they're reciting bad science fiction.

.

"The idea of a self aware AI is a little difficult for most people to define"

I see you're still missing the boat.

.

"don't you think it's important that people be able to discuss the topic?"

It is approximately as important as discussing vampire repellant strategies.


What makes you say that? There's a vocal faction I've heard who essentially say that statistics can never be intelligent. I don't understand the argument. I think it's grounded in the feeling that there should be something somewhere in the complicated system that you can point at and say it understands, but just like the old Chinese room thought experiment, I think the question of what it means to understand isn't the important one. Practically, it's hard to deny that a system of approximation run statistically can do more or less the same things we can, because that appears to be what we are.

I think there's the much more defensible position that current techniques can't scale up to general artificial intelligence. That's pretty clear because we've already had to change details of model architecture a lot to get the models to where they are now.


The autonomous driving question is an interesting one. It's probably some combination of techno-optimism, naive extrapolation of deep learning advances over a fairly short period of time, and--yes--at least some degree of a massive game of topper because everyone else was feeding at the funding trough and saying things were right around the corner. How much of each doubtless depends on the person.

And a lot of other people listened to the vocal "experts" and assumed that they must know what they're talking about.


> majority of researchers in the field who have been researching for decades already know deep learning isn't the correct way forward

Do you have a reference for this? It’s not my experience.


There's also the AI adage that you don't get to the moon by building successively larger ladders... all this deep learning stuff is great and will unlock amazing value (esp if we fund open data+models), but we run the risk of exhausting consumer and government patience if we keep promising the world with this single family of techniques.


That's interesting and something I hadn't heard before.

Any pointers to further reading on why deep learning might be a dead end?


How about a simple explanation of why it seems palpably absurd to some people that it will not be a dead end?

Consider a teratoma or lab-grown organ. Is grafting or engineering another attached sensory mechanism going to bring it closer to being an ordinary organism?

I think whether capabilities are super- or sub- human is a red herring.

Even a really primitive organism is still taking all of its computational capabilities and outputting, implicitly, decisions in one context that is its perceived reality.

A collection of computation and perception modules does not do this, without something else.

I don't think developing "something else" is obviously impossible or would require magic. But I'm not sure anyone sane would want to create it when it inherently creates unlimited risk of running amok. This is what the LessWrong people are afraid of, aren't they?


I think it's important to distinguish between something that's a "dead end" in that it will take you nowhere interesting, and something that's a "dead end" in that it will, after a certain point, cease advancing.

The former would imply that there is no point using deep learning and other similar techniques at all, and is the common implication when people say something is a "dead end".

The latter is what I believe the current generation of machine learning to be: I do not believe it will lead to AGI, and I am skeptical that it can do a great deal more than what it has already done (it can continue to refine the types of things it already does, and I expect it to do so, but I don't think it will open up many more new categories of things it can do). But despite that, it does do some very cool things now, and as they are refined, I think they can be commercially successful and generally beneficial tools.


They didn't say it was a dead end. I think their observation is rooted in gradient descent and how it improves but gets stuck in local optima, when that solution might never be good enough. Hence the ladder: no matter how good you get at ladder making, it won't take you to the moon, but it will get you closer the entire time you improve at it.
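
As a toy illustration of that "stuck in a local optimum" behaviour (a contrived one-dimensional loss, nothing to do with real models):

    def loss(x):
        # Contrived double-well loss: global minimum near x ~ -1.03,
        # shallower local minimum near x ~ +0.96.
        return (x * x - 1.0) ** 2 + 0.3 * x

    def grad(x):
        return 4.0 * x * (x * x - 1.0) + 0.3   # analytic derivative

    x = 0.5                      # start on the wrong side of the hump
    for _ in range(200):
        x -= 0.05 * grad(x)      # plain gradient descent

    print(x, loss(x))  # settles near the local minimum (~0.96), not the global one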


For a contrasting viewpoint, there's always Sutton's Bitter Lesson - http://incompleteideas.net/IncIdeas/BitterLesson.html


Not sure if the analogy holds. Human brains work on the same principles as lizard brains; no change in strategy was needed to go from lizard intelligence to human intelligence. I'm not saying that DL holds this promise though.



Yes, but unlike a ladder and a rocket, they are not fundamentally different. This is why I don't think the analogy holds. Evolution iterated over the same model, never starting again from scratch.


[flagged]


Be kind. Don't be snarky. Have curious conversation; don't cross-examine. Please don't fulminate. Please don't sneer, including at the rest of the community.[1]

[1] https://news.ycombinator.com/newsguidelines.html


There's some real selection bias in that observation: if we overestimate a technology's impact in the long term, nobody's going to remember the technology or the estimates by the time the long term rolls around.


Self-driving cars would be easy if all the cars on the road were self-driving; what makes it difficult is getting them to operate in an environment with illogical human drivers.


Self-driving cars would be easy if all cars were self-driving, all roads were designed for self-driving cars, roads did not degrade, there were no pedestrians or cyclists, trash and other obstructions never appeared on the road, weather did not exist, and the sensory hardware could never get damaged or obstructed.


Very true, but let's pretend we are in the real world, where making all these changes is exponentially harder than only changing the cars. Listening to Geohot, the main problem is to get them to co-exist with human drivers.


And even then... Any sufficiently complex system is chaotic (behaves in surprising, non-linear ways). And traffic networks pretty certainly constitute a sufficiently complex system.


And they already exist under these conditions: automated subway trains.


Very true Haha.


I'm certainly unconvinced that the first part of that statement is true. We've seen self-driving cars run into static obstacles. We still need to deal with emergent behaviour of groups of self-driving cars. There are still unpredictable environmental factors that we rely on humans to work out now.


No, pedestrians and cyclists still exist.


And animals and weather and road damage...


You are as overconfident as they are to say that there is "no way" those things will happen.

Let's try not to make false promises one way or the other; just wait and see.


Because technologists these days are first and foremost salesmen


No, it appears the "technologists" here on HN (such as yourself) are now people who are techno-pessimists through and through, don't see the big deal in new breakthroughs, aren't excited about technology at all really, can't extrapolate a trend a few years into the future, and would rather quit their job and do woodworking.

Is there another forum where all the hackers went, I think I want to go somewhere else.

Edit: I take back the part about woodworking; I think that's a reasonable response to the way today's tech jobs make us feel. I know many people who'd like to quit and just work on side projects (or woodworking), but they don't hate technology itself - that's the difference.


>No, it appears the "technologists" here on HN (such as yourself) are now people who are techno-pessimists through and through, don't see the big deal in new breakthroughs

Yes, it's called experience. After one has seen all those "new breakthroughs" hyped to high heaven and amounting to nothing, they have learned not to take salesmen at face value.

Of course some people remain eternally starry-eyed, as if they've just come off a bus from Nebraska to L.A. to become movie stars. Sadly, they're still not the ones doing the real innovation. Just the ones taken for a ride by one hype after another.


Thing is it gets hard to get excited about every new “breakthrough” after twenty years of paying close attention to them. This inevitably means I’m late to the party on some things, but I don’t care.

Tech is a means to an end; specifically a way to make a living and hopefully buy myself out of the rat race.

If that isn’t “real hacker” enough for you then that’s fine. In my book, lifestyle design is the hacker ethos applied at the meta level.


> If that isn’t “real hacker” enough for you then that’s fine.

"Means to an end", "make a living", "rat race", yeah I don't think that would be considered "hacker" by any definition of the word (I think the word is "corporate"), and this site is called "Hacker News".

You don't have to be excited and bullish about every single new trend, but at least be interested enough to take a closer look. I was interested in crypto a few years ago, now I don't care much about it. Still excited about VR and AI though. I mean, it's a free country, but if this pessimism is the default attitude here on anything, I think it's time to find where all the actual hackers are hanging out.


I'm sorry that you feel you haven't found your tribe. I get it and feel/felt similarly at times.

> I don't think that would be considered "hacker" by any definition of the word (I think the word is "corporate")

Tempered passion != corporate.

> You don't have to be excited and bullish about every single new trend, but at least be interested enough to take a closer look

No, I don't owe trends anything. It's easy to trend, and much harder to deliver value over time. I wait for the wheat to separate from the chaff before investing time. This is not pessimism, this is triage of emotions and attention. It isn't a belief that what we have is best, but rather a preference to be doing the work as much as possible. Time, unfortunately, is zero sum.

I have stopped chasing relevance, and in doing so, found contentment and focus. I'll reiterate: the self-work I do is my current manifestation of the hacker ethos.

I am excited by PL stuff: Rust, Haskell, compilers, LLVM, etc. I write Rust and Haskell at work, so I scratch a lot of that itch on the job. It took a long time to get to that point. I still have too many tabs open on my computers about cool PL stuff like the u-combinator, GC implementations, cache friendliness, and other low-level esoterica.

Yet, life is much better when tech is a part of it, not the centerpiece of identity. Perhaps that is the main difference that is surfacing here.


>if this pessimism is the default attitude here on anything

Ah, if only that was the case... One can dream, however...

Mind you, when real hackers walked the earth (MIT, AT&T, Xerox, and so on), they weren't that hot about hypes either, and they couldn't be further from today's marketing types.

Those were hackers: programmers, scientists, enthusiasts, and so on.

Not today's used-car-salesmen types touting AI, big data, VR, and the rest...


> they weren't that hot about hypes either

Are you really trying to convince me that the original hackers were tech-conservatives? They literally set the trends and built the future.

Who are these marketing types you talk of? I don't dispute there is a lot of fluff and hype going around, but you chose yourself who you listen to. I follow true hackers like John Carmack, Jeri Ellsworth, Andrej Karpathy, George Hotz and prominent scientists like Yann LeCun. If all you find is hype and marketing that's on you. It's like people complaining that Twitter is bad, you decide who to follow!


Subscribed to the answers to this comment.


>can't extrapolate a trend a few years into the future

It's better not to extrapolate S-curves into exponential functions when you don't know what you are talking about, yes. Good advice for both DL researchers and "techno-optimists" who haven't seen unsubstantiated hype they didn't jump on yet.


Discord servers.


What are your favorites?


[flagged]


I've been on HN since about 2012 under various accounts, mostly lurking


Agreed. The billions of dollars thrown at anything claiming “AI” is quite disappointing. It’s a small branch of statistics the same way that statistics is a small branch of math.

At this point, even a rolling average is labeled as AI.


The billions of dollars thrown at anything claiming "AI" that would be lucky to be high-school-level statistics under the hood is disappointing

There, fixed it for you...

However, that is the cycle of society. In the 1990s, all you had to do was put ".com" in your company name and you were a paper millionaire overnight. In the 2000s it was "social", in the 2010s mobile-first. Metaverse, AI, and beyond.


> It’s a small branch of statistics the same way that statistics is a small branch of math.

Is this sarcastic or something? Math understanding is a fractal, the closer you look at a branch the bigger it gets. Statistics in particular seems to have countless real world applications.


Not sarcastic, looking at statistical modeling there is a vast library of modeling techniques and research of which “AI” is a small component.

In my experience, non-AI models tend to outperform the AI ones in many applications.

This is different than how AI is marketed.


Deep learning is statistics on weighted graphs, but graphs are a way to encode all of mathematics.

Which is why things like manifolds and topology are useful for analyzing our models, because all of the ways to encode mathematics are equivalent and you can transport ideas between them.


Out of curiosity, what are some other powerful statistical branches besides those used in "data science"? (Honest question - I just completed an MSc in data science but I must admit I don't know much besides that.)


I’d recommend generalized linear models which are a subset of linear mixed models. These are the driving force for financial services.

The model building process for these translates well to other types of models.

Decision trees and random forests show up too.
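
For anyone who hasn't seen one, here's a minimal sketch of fitting a GLM (a logistic regression) on made-up data with statsmodels, purely as an illustration - the coefficients 0.5 and 1.2 and the "default / no default" framing are invented, not from any real financial model:

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)

    # Made-up data: one feature, binary outcome (e.g., default / no default).
    x = rng.normal(size=500)
    p = 1.0 / (1.0 + np.exp(-(0.5 + 1.2 * x)))   # true logistic relationship
    y = rng.binomial(1, p)

    # Logistic regression as a GLM: binomial family with the default logit link.
    X = sm.add_constant(x)
    result = sm.GLM(y, X, family=sm.families.Binomial()).fit()
    print(result.summary())   # fitted coefficients should land near 0.5 and 1.2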

I think data science is great in that it exposes one to programming and technology more so than a focus in math or stats; I was mostly complaining about the marketing I've seen surrounding AI. I recall listening to a few MBA consultants claiming that they could improve a business process significantly by using ML even though there was only sparse, inaccurate data available and they had no modeling background.

Overall I think AI models are interesting and often useful (although not sure what fits under the umbrella), but am tired of the hype, strong opinions, and amount of claimed expertise.


Statistics is the basis of science. Probability is one of the axioms of reality. Probability just works and is assumed to work, and nobody knows why. It is fundamentally the reason why entropy always goes up.

It is much more than a small branch of math, given its foundational relation to the real world.

Intuitively, AI is much more than just a branch of math. We are getting closer and closer to understanding theoretically how our own brains work. The philosophical connections are profound, and you are using language categories to limit your thinking.


I don't understand this criticism. What matters are the results, not the technology used. If your company is capable of solving a new problem that is worth billions of dollars, then does it matter that it uses basic statistics?

Of course if they overpromise, then reality will catch them and the company will fail.. but this has nothing to do with statistics.


This is as ridiculous as claiming that a neural network is going to make Skynet.

First, statistics is not a small branch of math; probability theory is (not a small branch, by the way, not by a long shot, just a branch, one of the fundamental and very large ones). Statistics is the art and science of dealing with data; it uses probability theory like physics uses math, but is far more than probability theory, just like physics is far more than math.

Second, neural nets and machine learning in general are an extremely powerful way of solving problems of a certain structure and form, and they solve problems not encountered in statistics. Claiming they're a small branch of statistics is like claiming that the entirety of computer science is a small branch of logic.


What are the top 3 things accomplished and implemented by the “extremely powerful” neural networks and machine learning?

How significant are these top 3 things?

Have you ever built a production model for a large company, or do you have a tier-1 PhD? If not, what are your qualifications for labeling a claim ridiculous?


>What are the top 3 things accomplished and implemented by the “extremely powerful” neural networks and machine learning?

Off the top of my head and in no particular order :

1- Playing human games at superhuman performance from raw pixels

2- Recognizing objects and people in images at human or better performance

3- Recognizing human voice in real world noisy environments and parsing it into text at human or better performance

>How significant are these top 3 things?

They generate billions of dollars of profit and beat human knowledge and expertise that took thousands/millions of years to craft and/or evolve, with nothing but a GPU and a lot of electricity.

>Have you ever built a production model for a large company or have a tier 1 phd?

No.

>what are your qualifications for labeling a claim ridiculous.

I have a brain that can spot ridiculous hyperbolic claims, and experience-backed knowledge with neural networks and machine learning that can explain with examples why they are ridiculous and hyperbolic.

If you do have the qualifications you request, start with using them to support your claims. Here's a small challenge that should pose no difficulty to you.

Show me, with derivations and citations, how an RNN is "just" an obscure statistical model. Which statistician first published or applied it? Is it taught in a statistics course you know of? Which ones? Are research statisticians publishing on it right now or ever? In what sense is it part of the science of statistics like, say, the normal distribution is?


>They generate billions of dollars of profit

Number 1 has probably generated ~$0 in profits. It's not immediately clear that number 3, when weighing its contribution apart from everything else in the products it's integrated in, has reached 10 digits in $ profits.

So you're only left with number 2.


Except, of course, when number 2 fails, which is quite often.


I wouldn't be so quick. #1 is usually implemented by Deep Reinforcement Learning systems, RL is an AI paradigm descended from Control Theory and is used extensively elsewhere in Optimization and Operations Research. Here[0] for example is a Nvidia blog post detailing how they used it to obtain circuits with 25% less area at the same performance metrics. (I didn't search for this post except just to get the link, it's very typical of the kind of research I follow and I read it as soon as it was posted several weeks ago.) In another[1] example, Deepmind researchers used DRL to control a fusion reactor with results that exceed the current state of the art.

I used games as examples just because they are the easiest to understand and most popular applications of RL, and also very general. Dismissing them on those grounds would be like saying that CNNs are useless because recognizing cats is not a business; recognizing cats is just one example of the vast array of things CNNs can do. For every optimization algorithm pioneered by RL research in order to play a useless $0-return game, you can bet that it's later implemented and used in other areas to save and generate millions or billions of dollars. Games are just the test playground for new research.

I don't understand how #3 can't be credited with the billions of dollars the products and services based on it generate. According to [2], the global market of offering a speech-to-text API is valued at 1.3 billion dollars in 2019. If you assume that just 20% of that is actual customer value (to account for hype, errors in valuation, etc.), that's about 260 million dollars per year. Even an extremely conservative value estimate tells you the technology generates a billion dollars every 4 years (and that's if it stays constant; the forecast projects the market will reach >$3 billion by 2027). Keep in mind that this does not include any value generated from any other component or product; this is just the profits from the APIs alone, and the generous 20% discount is there to account for hype, analysis errors and bad data.
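
To make the back-of-envelope arithmetic explicit, here it is as a few lines of Python; the 20% "actual customer value" factor is my own assumption, as stated above, not a measured figure.

    market_2019 = 1.3e9            # cited 2019 speech-to-text API market size, USD
    customer_value_fraction = 0.2  # assumed discount for hype / valuation errors
    value_per_year = market_2019 * customer_value_fraction
    print(value_per_year)          # 260,000,000 USD per year
    print(1e9 / value_per_year)    # ~3.8, i.e. roughly a billion every ~4 years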

But you know what? The broader point is that these are just 3 examples I pulled off the top of my head while making a 5-minute comment without consulting Google. And considering the person I'm replying to has never provided a single example or any kind of specific clarification, or answered any of my questions, or otherwise offered any kind of rebuttal, I think it's pretty fair to say the examples hold up nicely.

[0] https://developer.nvidia.com/blog/designing-arithmetic-circu...

[1] https://www.nature.com/articles/s41586-021-04301-9

[2] https://www.fortunebusinessinsights.com/speech-to-text-api-m...


>I used games as examples just because they are the easiest to understand and most popular applications of RL

No, you used it in exactly this context:

" >How significant are these top 3 things?

They generate billions of dollars of profit "

and the claim was incorrect regarding example 1.

I also don't think you understand what profit is and the difference between profit and revenue. Regardless, this exchange has become pretty pointless.


So something that is used to develop something that later generates billions of dollars of profit isn't, in itself, responsible in part for those billions of dollars?

So by this reasoning, something like Linux has exactly $0 of value in monetary terms. Linux is never an application; it never does something that actual users pay for (the "users" who pay for Linux support are just developers and corporations who use it to develop something else).

>I also don't think you understand what profit is and the difference between profit and revenue.

Yes, I don't understand an elementary distinction between terms that anybody with a dictionary understands. This is quite a fair and productive point for you to make.

>this exchange has become pretty pointless.

Probably the one thing you're right about in this: I'm indeed not interested in debating incredibly insecure people fond of accusing others of ignorance (quite often a projection technique) instead of engaging with their arguments or citing specific examples or research.


Regarding the last sentence, you started the argument and haven't displayed the slightest understanding of the topic being argued. I have 15 years of combined experience between academia and industry, excluding undergrad. I'll be the first to admit that there are counter-claims and refinements to my statements, but you just go on and on and on without saying anything meaningful while being addicted to having the last word. So go ahead and talk in circles.

For the sake of analogy, perhaps I should read some articles on equine veterinary practices and start talking down to practitioners in the field, all for the sake of making my internet ego feel better.


It's utterly hilarious how you think the person who cites research and asks specific concrete questions (which, as a reminder, you hadn't answered yet) is the one who talks in circles and strokes his ego.

Yeah, 15 years in industry and academia, and you don't know how statistics and probability theory are different.

May you have a nice, happy day/night, my guy.


>Yes, I don't understand an elementary distinction between terms that anybody with a dictionary understands

Yes, you actually don't understand them. Which is why you said this:

>the global market of offering a speech-to-text API is valued at 1.3 billion dollars in 2019

which is about revenue, when you were previously talking about profits. It's obvious who the insecure one in this exchange is.


Exactly. Remember a couple years ago when just saying "blockchain" would get money thrown at you?


Or more accurately, grifters.

The very same ones did it with 'open source' (hijacking 'free software' and creating companies that develop closed-source products). The same happened with the Web (which has been ruined by the ads industry thanks to Big Tech), IoT (with phones being the worst IoT devices), and blockchains (web3 being yet another hyped-up narrative duping retail investors), and now AI (with deep learning) is once again creating a new grift after the release of Stable Diffusion.

All of these together have been a total disaster for privacy and are, quite frankly, dystopian creations and tools for the technologist grifter.


"We won’t waste your time arguing with these claims"

Err, actually, if you have something substantive to say, please do?

I would really love to be persuaded that unaligned AGI is not something I need to worry about in my kids' lifetimes. I'd much prefer to believe more strongly in my ability to set them up to thrive! But when skeptics repeatedly don't seem to have anything to say beyond sneering, my anxiety is not alleviated.


I would really love to be persuaded that an alien invasion is not something I need to worry about in my kids' lifetimes.

Come on. We can't even build an AGI equivalent to a cockroach yet. Let's get real and stop panicking about things that are 100% hypothetical.


> I would really love to be persuaded that an alien invasion is not something I need to worry about in my kids' lifetimes.

There's good reason to think aliens aren't going to invade in your kids' lifetime. After all, as far as we know, aliens haven't visited Earth before.

If, however, we received a message from an alien civilization saying "we're coming in the next 10-100 years", I would definitely be worried.

There are many good reasons to think that an AGI will be created in the next 10-100 years. There aren't many reasons to think that this won't happen except saying things like it's "hypothetical". Human flight was hypothetical one day, and reality the next. Landing on the Moon was hypothetical, then reality. Unless there's a fundamental reason we can't build AGI or that 100 years isn't enough, this is something we should be worried about.


There are no good reasons to put a specific time period prediction on the arrival of AGI. You are essentially stating an article of religious faith without a clear scientific basis. "Surely I come quickly. Amen."

How much actual money have you bet on your prediction?


I think there are plenty of good reasons to put a time prediction on the arrival of AGI. Saying that it's just an article of religious faith is a pretty bad argument.

I think we'd both agree that we can definitely predict some things. We can put a reasonable prediction on the economy in 10 years' time, on weather, on e.g. processing speeds, etc.

I think we can also put some predictions on a bit more far-reaching advances. E.g. I think it's a reasonable prediction that in 50 years' time, we'll have some form of energy that is in wide use and relatively clean, possibly using new technology. I don't know much about the energy space - maybe my prediction is terrible. But surely experts in the field have reasonable predictions, don't they?

So why, when it comes to AGI, do you think we are unable to make predictions? What's so special about that technology?

I could be completely wrong, but it seems to me like you might be ascribing more magic to AGI than I am, and therefore think predictions about it are stupid. I'm not thinking of AGI in religious terms or anything like that - I'm literally thinking things like "how long until an AI system kind of like Copilot can program as well as a human, for some value of programming? How about doing other, harder tasks? Etc." And I'm assuming that if you extrapolate this out 50-100 years, you eventually get something with incredibly vast potential.

Where am I going wrong?


You are going wrong by making up arbitrary numbers. Extrapolation is not a valid technique for making long-term predictions. The curve is non-linear.


Well, I wasn't making up the AI numbers; the experts who supposedly know something about this gave them. But some follow-up questions:

1. What is a valid technique for making long-term predictions?

2. Do you agree that in theory AGI can be built?

3. Given that some people worry that it has the potential to be harmful, do you think it's worth at least trying to think through when it will be built?


There are no experts on AGI. It's like being an "expert" on unicorns or leprechauns.

There are some experts on advanced statistical techniques which are often classified as AI. Those techniques have real practical value in narrow problem domains, but we have no idea whether any of those techniques will ever lead to a true AGI.

There are no valid techniques for making long-term predictions in poorly characterized, non-linear systems where we don't know what the goal looks like or how to get there. No, it's not worth trying to think through. Let me know when someone manages to build an AGI roughly equivalent to a lizard or mouse or whatever. If it ever becomes a real thing then we'll have plenty of time to think.


I'm not sure what you mean by an AGI equivalent to a cockroach, but let's take it for granted that cockroaches can do things AIs still can't. AIs can also do a lot of things that cockroaches can't, including plenty that until recently seemed far out of reach (translation, superhuman Go play, illustration). Why would we expect intelligence to get smoothly better along a total ordering before it's dangerous? Human capabilities don't work like that.

Re 100% hypothetical: this seems like a pretty useless heuristic. Climate change, the health impact of leaded gas, and the nuclear bomb were all hypothetical, until they weren't. For that matter, if something isn't at least a little bit hypothetical there's no point in worrying at all; it's a done deal!

The alien invasion analogy seems like it suggests the heuristic you're actually using is "wild scifi thing so not worrying". This is a better option than "still hypothetical so not worrying" but still doesn't seem reliable.


We have a lot of dangerous technologies which exist today and are either actively killing people, or could if they are actually used. It's silly to worry about something which doesn't exist, and for which we don't even have a clear path to build. It's like someone in 15th century Milan seeing Leonardo da Vinci's design for a helicopter (aerial screw) and worrying about attack helicopter gunships coming to destroy their city.


That principle would have Einstein laughing off Szilard instead of sending him along to FDR. But sure, if AGI were on the order of five centuries away, dismissal would be the right take. I don't see much reason for your confidence in that but I hope it's there!


I find it more illuminating to look at resolved prediction markets for concrete benchmarks, many of which have resolved faster and more positively than expected. https://bounded-regret.ghost.io/ai-forecasting-one-year-in/

I guess we will have to see what happens, but starting an anti-deep-learning blog after the staggering progress of the last six months seems pretty wild...


The beauty of it all is that it actually seems we are underestimating current methods of AI. The noise (especially in this community) is very high though, so I suggest everyone just look at the experts' predictions from the year before and compare them with what we have achieved.


Here's my cynical take on the state of the art for 99% of AI projects:

1. Take some training data.

2. Do some statistical analysis on it and find relevant variables for the variable you want to predict.

3. Build a model to predict the variable.

4. Put it into production.

5. Run the model and get predictions.

6. Notice the predictions are becoming less reliable.

7. Retrain the model on new data.

8. GOTO step 5.
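
Roughly, as a sketch in Python (every function and number here is a made-up stand-in for a real data pipeline, training job and serving system, not any particular library's API):

    import random

    # Hypothetical stand-ins so the sketch runs end to end.
    def load_data(drift=0.0):
        return [random.gauss(0.5 + drift, 0.1) for _ in range(1000)]

    def fit_model(data):
        return sum(data) / len(data)          # "model" = predict the mean

    def drift_detected(model, fresh_data):
        return abs(model - sum(fresh_data) / len(fresh_data)) > 0.05

    model = fit_model(load_data())            # steps 1-4: data, analysis, model, deploy
    for month in range(6):                    # step 5: keep serving predictions
        live = load_data(drift=0.02 * month)  # the world slowly changes under the model
        if drift_detected(model, live):       # step 6: predictions become less reliable
            model = fit_model(live)           # step 7: retrain on new data
            print(f"month {month}: retrained, model={model:.3f}")
        # step 8: GOTO step 5 (i.e. the loop continues)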


To borrow an analogy from cognitive science, deep learning has pretty much mastered System 1 thinking in independent domains (image, language, audio), and scaling laws suggest we have a clear path to self-supervised multimodal models (video + audio + text) by applying more data and compute. We will likely miss critical world knowledge about dynamics, contact forces, taste, feel, etc. until we figure out a way to efficiently collect datasets for these modalities.

What’s less certain is the path to System 2 thinking that is required for true AGI. We have yet to find evidence that deep learning is enough for AI to learn planning and conscious thinking.


They delivered technology that everyone would have said was sci-fi ~5 years ago. I think some confidence is deserved at this point.

I personally remain skeptical of the claims that the singularity is just around the corner though.


I think the problem is that most of the people working in this field do not seem to have basic knowledge of statistics or of how to evaluate statistics. They somehow think they can cheat it. And they overestimate their own results.

The death of Google Translate was caused by GT starting to learn from itself. The same thing will happen to self-driving cars and so on.

Five years ago, some garbage predictors said developers would become obsolete. I tell you, the amount of work that needs to be done by real people at real companies in the software biz has never been higher.


We have a stock market investment bubble. Deep learning, AI, metaverse, blockchain, quantum computing, self driving cars, and this very website's whole shtick all revolve around fumbling for an answer that isn't there.

"Technology" is set for a long winter of commoditization and consolidation. Technology as a standalone sector will become sleepy with all the excitement being in every other sector of the economy leveraging cheaper and better technology for their own benefit.


That's one way to look at it.

Another is that AI/ML will replace illustrators, musicians, vocalists, and the like. That "metaverse" will take over gaming and lead to powerful new filmmaking techniques. I'm building something in this space and I can tell you that these technologies are powerful because people who aren't "traditionally skilled" are using these tools to create amazing content and excel. This stuff is going to eat the world.

Not sure about the other domains you mentioned, but I'm sure they have ardent believers too.


VR does not add enough value to counteract the UI problem that most of the people who try to use it experience nausea, vomiting, and visceral aversion.

Minecraft is plenty cool and my four year old is making amazing stuff in it. But increasing the resolution, frame rate, and immersiveness of that experience is decidedly evolutionary, not revolutionary.

And there was already no money in music or graphics design before the machines showed up, so I'm not sure how that's going to justify the investment bubble.


The tech stack isn't just VR/AR, but includes the scale and affordability of rendering, increasingly wide range of sensor input types, spatial computing, mocap, photogrammetry etc.

https://youtu.be/RR_3eMIP6YE

https://youtu.be/QM8qOoYK3Fs (mildly NSFW)

https://youtu.be/2-1G8JUWoG8 (same techniques but narrative)

Pretty soon everyone is going to have access to world class creative tools that give them the same capabilities that previously only studios with institutional capital had.

This is the creative decade.


Unlike the other technologies listed, Deep Learning, AI, and to a lesser extent, "partial self driving" exist today, are rapidly advancing, and are producing massive amounts of value. You can play with the AI systems yourself too.

Don't even talk about them in the same sentence as useless shit like the metaverse or blockchain tech.


>are producing massive amounts of value

Point me to a single DL-exclusive company that has an annual turnover of >1 billion dollars and is profitable.


Huggingface


Huggingface doesn't have a revenue of even close to a billion dollars.


In our experience, even deep learning technologists at the top of their game are surprised when we point out that deep learning (or any other form of AI) struggles when tasked with predicting the future.

It seems to do a pretty good job of predicting future stock market performance. But of course the authors have an axe to grind and choose to view the world through their narrow filter, probably because they want to promote their book and tap into the "tech is bad" trope that's going on in the wider culture.

Most significantly, AGI hypers have succeeded in directing a large amount of funding toward the fanciful goal of ensuring that this supposedly imminent AGI will be aligned with humanity’s interests. Crucially, much of this community believes that we already have all the innovations we need to reach AGI; we just need to throw more hardware at the neural networks we already have.

The problem with this argument is that we have no idea what the "warning signs" of imminent AGI are. It could take 2-3 key breakthroughs or it could take a few hundred. Whatever the case is, we should think about investing in a fire alarm system before there's a fire.


> It seems to do a pretty good job of predicting future stock market performance.

Does it? Care to elaborate on that?


Well, the obvious example is RenTech's Medallion Fund, which consistently outperforms human stock traders.

The Medallion Fund has employed mathematical models to analyze and execute trades, many of them automated. The firm uses computer-based models to predict price changes. These models are based on analyzing as much data as can be gathered, then looking for non-random movements to make predictions. [1]

Renaissance Technologies' Medallion fund, which from 1988 to 2018 clocked annualized returns of 66%. After fees, those annualized returns were still remarkable, at 39%.[2]

[1] https://wealthofgeeks.com/medallion-fund/#:~:text=Algorithmi....

[2]https://money.usnews.com/investing/funds/articles/top-hedge-....


> Well the obvious example is RenTech's Medallion fund which consistently out-performs human stock traders.

The article isn't critical of mathematical models; it's specifically critical of deep learning. Is there any evidence that the Medallion Fund uses deep learning? Are there any funds that have said they use deep learning that have a consistent track record of beating the market?

My (admittedly anecdotal) understanding is that many, many people have tried using deep learning to predict the market and I'm not aware of anyone who has had long term, consistent success with it.


The famous Medallion Fund was running for decades before there was (practical) deep learning.

Offtopic: I still bet the "secret sauce" is insider trading :)


They are talking up their own chances of success for the $$$. When they tweet or do a puff piece that a PR person has lined up in the news, of course they will say that their one true technology is going to solve world hunger and bring about world peace.

Why are <insert language, database, framework, technology etc> so overconfident?


>snake oil

>our book

Ctrl+F "scaling" gives 0 results across this essay

Deep learners have their scaling laws, which seem to hold across many orders of magnitude of model size and compute spent on training. Critics sometimes have to resort to building their argumentation along pretty esoteric tangents to avoid naming this elephant in the room.
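
(In case the term is unfamiliar: "scaling laws" here means the empirical observation that loss tends to fall roughly as a power law in model size and training compute. Here's a toy sketch of fitting such a curve, with invented numbers rather than any published benchmark results or coefficients:)

    import numpy as np

    # Invented data points, purely for illustration: loss vs. parameter count.
    params = np.array([1e6, 1e7, 1e8, 1e9, 1e10])
    loss   = np.array([5.0, 3.8, 2.9, 2.2, 1.7])

    # Fit log(loss) = log(a) + slope * log(N), i.e. a power law L(N) = a * N**slope.
    slope, log_a = np.polyfit(np.log(params), np.log(loss), 1)
    print(f"power-law exponent ~ {-slope:.2f}")
    print(f"extrapolated loss at 1e11 params ~ {np.exp(log_a) * 1e11**slope:.2f}")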

Instead of denying scaling, why shouldn't we make it work for our society's benefit? A new generation of thinkers already proposes a whole research program for life science, designed from the scaling and transfer learning fundamentals: https://markov.bio/biomedical-progress/


Deep learning technologists != people who write articles about machine learning making various professions obsolete. I don't think anyone who works with neural networks professionally is drinking the kool-aid. Don't get me wrong, deep learning is an incredible technology, but not in the ways it's made out to be in the press. Like a lot of other technologies, its real value is in solving really boring problems at scale. But that doesn't make for good content, so we end up with a bunch of articles about self-driving cars and nothing about using deep learning to take something like 3D photogrammetry from "passable for artistic purposes" to "industrially valuable."


Uh, the nag to harvest email addresses was weird. It completely covered the article. I almost closed the tab without reading the article.

IMO: Get rid of it. (Or use a normal style that doesn't nag for an email address until you've scrolled a bit, and keeps the page visible while the nag is open.)

Why? People don't always read an article when they open a page. (I certainly don't.) By the time I saw the page I was totally lost as to what it even was, and the only reason I didn't close it was that the title on the adjacent Hacker News tab looked interesting.


Because the field saw exponential progress, which only relatively recently started stalling out. And so now there are two possible explanations:

1) We're reaching near the end of where the current domain can take us.

2) We're making some sort of mistake or otherwise missing something that's the key to unlocking continued exponential progress.

It's always natural, especially for those working within a domain, to assume it's #2. Unfortunately it seems our universe has this habit of always kicking us back to #1 just before the completely game breaking stuff starts happening.


I don't know, man; everything so far seems to show that it scales frighteningly well. Maybe the skepticism over DL is wildly exaggerated?


This feels related to the idea that there is mostly no such thing as "general intelligence." Rather, there's a diverse field of mental/cognitive skills. Many may correlate (or even..causate?) but I believe this is why e.g. AI is overhyped.


Deep learning has two main deficiencies:

1. It depends on pre-existing data. In other words, it might not create something new.

2. It needs a lot of hand-holding to learn. Children have an innate instinct to learn, and deep learning is not (yet) there.


Because they have no idea what they're doing so they compensate for that with extra confidence.

:shrug:


Counter-argument: they know exactly what they're doing, but no one with actual money is going to fund them if they talk about their work in realistic terms instead of boasting and promising the moon, next week.

Academic budgets are one of the biggest jokes in science; the real money comes from venture capitalists, and it goes to those who make the biggest promises the loudest.


So, worse than not knowing, it's plain lies?

The counter-argument makes no one look good. Who's to blame: the game for pushing the gamers, or the gamers who keep the game alive?


If you're asking people to pick just one, you're being far too generous in placing blame.



