Hacker News
Artificial Intelligence is stupid and causal reasoning won't fix it (arxiv.org)
81 points by wallflower 40 days ago | 58 comments

I recall reading about Searle's Chinese Room argument in Daniel Dennett's "Consciousness Explained" about 25 years ago. Maybe the exposition there was more complete than the version given in this article, but the argument of this article is terrible.

The author says that the English speaker who simply manipulates symbols and follows rules will never get the joke written in Chinese, even if the people external to the room understand it and think it was produced by an intelligence that understands the joke.

But that contains the assumption that the human is the consciousness in that arrangement, when in fact the human is just the energy source which drives the hardware. One might as well say that a computer can never create a 3D drawing because its power supply doesn't understand arithmetic.

Ugh. I remember listening to Searle on the BBC in the early 1980s, and becoming quite irate, as a 17 year old does, at his mistake in defining the boundary within which intelligence should be found as the operator not the whole room.

You can make a pretty good argument that the _rulebook_ is the Chinese-speaking intelligence in the room. "But it's an inanimate object, if the human stops following the rules it will be inert". Yeah, and if your mitochondria stop producing ATP you'll be inert too.

Also: people taking Searle's position rarely reckon with just how big that rulebook would be.

Daniel Dennett coined the term “intuition pump” for this, which is a great concept to keep in mind so you don’t get caught by one. Wikipedia: An argument “designed to elicit intuitive but incorrect answers by formulating the description in such a way that important implications of the experiment would be difficult to imagine and tend to be ignored.”

Scott Aaronson has an excellent take on the Chinese room argument in my opinion, namely that the existence of intelligence can be inferred from asymptotic space complexity. If you have a rulebook which lists every possible sentence and optimal response, the size of that book is clearly exponential in the size of the longest possible input.

On the other hand, if you could construct an algorithm which can "understand" Chinese using only polynomial space and runtime, then it seems a lot more intuitively clear that there is something genuinely intelligent happening.
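To make the asymptotics concrete, here is a toy back-of-envelope sketch (the vocabulary size, the function names, and the polynomial bound are my assumptions, not Aaronson's):

```python
# A lookup-table "rulebook" must store a response for every possible input
# string, so its size grows exponentially with the maximum input length,
# while an algorithm that processes input with bounded extra state per
# character only grows polynomially.

VOCAB = 3000  # assumed rough count of common Chinese characters

def rulebook_entries(max_len: int) -> int:
    """Entries a naive lookup-table rulebook needs to cover every
    input of length 1..max_len: sum of VOCAB**n, i.e. exponential."""
    return sum(VOCAB ** n for n in range(1, max_len + 1))

def algorithm_space(max_len: int, program_size: int = 10_000) -> int:
    """Space for a hypothetical 'understanding' algorithm: a fixed
    program plus state growing linearly with input length."""
    return program_size + 100 * max_len

for n in (1, 2, 5, 10):
    print(n, rulebook_entries(n), algorithm_space(n))
```

Even at ten characters the rulebook dwarfs any physically realizable store, which is the intuition the argument formalizes.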

I think the problem with the Chinese room is that if you have something that can "mechanically" translate via algorithm, maybe that means it already understands. It's like saying a blind person cannot understand anything about the world because they cannot see. But the extent to which the blind person can successfully navigate the environment demonstrates a sufficient understanding of that domain.

The “Berkeley ‘systems argument’”, i.e., the room understands, not the human (why Berkeley gets credit for this obvious counterargument I don’t know) is mentioned in the paper, but not refuted other than by some confusing thought experiment involving a joke that just seems to repeat the original argument.

It also mentions Harnad, whose classes I attended many years ago.

It's still unclear to me what the point of this type of philosophy is, though. Perhaps its practitioners find it entertaining? I haven't the faintest idea why we'd fund this more than, say, people who want to make giant stone heads and leave them in the city's parks, or people who make funny videos about their cats. It just goes around in circles, finding new ways to make the same (dualist) argument that I considered and rejected as an infant. I don't think we should ban it (or the giant stone head people) but it seems like it could be done independently, on their own time, without troubling society for funding.

Harnad, at least, has done something else interesting and important unrelated to his philosophical musings: rebelling against the publisher monopoly on research output. Though he taught classes about Cognitive Science, he was partly at the university because of his radical thinking about how scholarly communication should work, in the form of the journal BBS (Behavioural and Brain Sciences). This was of course way ahead of its time; the things BBS struggled to do for small numbers of scholarly articles in the 1970s can now be easily replicated online for any topic you can think of.

The paper is right, all impressive achievements amount to just curve fitting.

But ... maybe that's why the human brain works so well. The tens of billions of neurons in your brain have evolved to adapt to patterns and adjust to external stimuli from sensory input.

Just like compression algorithms can only compress something so far before losing data, maybe being limited by how much processing we are throwing at this problem limits the usefulness of our solutions.

If it was just a matter of horsepower we could perfectly simulate animals with simpler brains already.

"why does OpenWorm need a distributed supercomputer when the thing it's simulating needs about 10 millicalories a day" is one of those categories of question people don't like thinking about too much.

Are you sure the bulk of it is not emulating physics and environment?

I mean how many resources can you even throw at "302 neurons and 95 muscle cells"?

edit: down the wormhole I go

just look at the screenshots of

the "brain": https://github.com/openworm/c302

and the sim: https://github.com/openworm/OpenWorm/blob/master/README.md#q...

Reading further, more physics:

> Some simulators enable ion channel dynamics to be included and enable neurons to be described in detail in space (multi-compartmental models), while others ignore ion channels and treat neurons as points connected directly to other neurons. In OpenWorm, we focus on multi-compartmental neuron models with ion channels.
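A back-of-envelope state count gives a rough sense of why the multi-compartmental approach quoted above is so much more expensive than point neurons (the compartment and channel-gate counts below are invented for illustration, not taken from OpenWorm):

```python
# C. elegans has 302 neurons. A point-neuron model keeps roughly one
# state variable (membrane voltage) per neuron; a multi-compartmental
# model with ion channels keeps, per compartment, one voltage plus one
# variable per channel gate. The multipliers below are assumptions.

N_NEURONS = 302

def point_model_states(n: int = N_NEURONS) -> int:
    return n * 1  # one membrane voltage per neuron

def compartmental_states(n: int = N_NEURONS,
                         compartments: int = 50,
                         gates: int = 4) -> int:
    # each compartment: 1 voltage + `gates` channel-gating variables
    return n * compartments * (1 + gates)

print(point_model_states())    # 302
print(compartmental_states())  # 75500
```

And that is before the fluid-dynamics body simulation, which is where most of the compute reportedly goes.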

Speculation in the absence of information. There are definitely biological organisms that suffer the same mistakes as our curve fitting, e.g. the gull chick that pecks the red dot on its mother's beak. It's selected to peck the biggest red dot, so if you put a giant red dot next to the nest, the chicks will starve. It seems reasonable to believe human brains are lots of very good curve fitters working together. If anything, because this is so successful, maybe it's all we actually need: specific neural nets for specific tasks. Maybe dualism is correct and consciousness is something special, but we have no idea how to test that.

- loaded premise (that AI community thinks causal reasoning, whatever form that may take, is the silver bullet)

- Failure to distinguish b/w narrow and human-level AI

- Zero mention of attention/transformer models

- Zero mention of BERT, GPT, let alone GPT-3

Note-to-self: ignore and file away in the gary-marcus box.

Yeah, was not impressed. Skipping over the (opinionated) survey of recent AI techniques (and their failure modes) and philosophic theories of cognition that is the bulk of the paper, the only bit that claims to be novel (pp 29-31) is an argument I will summarize as follows:

1) any computational AI can be represented as a finite state machine or FSM (the author calls it a finite state automaton; same thing)

2) when said computational AI performs an "act of cognition" it (as a FSM) will iterate through a defined series of states, based on a defined series of inputs

3) It is possible to build a simpler FSM composed of a counter and a lookup table that would take the same series of inputs + the counter as an input, and produce the same state/output as the original computational AI

4) since the response to stimulus is identical, the 2 finite state machines are equivalent.

5) if the state machines are equivalent, they must be equivalently conscious

6) the counter+look-up table is obviously not conscious ("reductio ad absurdum")

7) from (6) and (5) no computational AI can be conscious

To me, this argument fails at step (3): the only way to actually construct the "simpler" finite state machine is to let the computational AI interact with the world first and record its combinations of input and state. There is no way to predict what series of states an arbitrary FSM will go through in response to a particular series of inputs without actually running it; that would be equivalent to solving the halting problem, since any program can be encoded as an FSM, and if you could predict the state sequence of such an FSM, you could tell whether it would ever enter the 'halt' state.
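The objection can be made concrete: the counter+table machine is just a replay of a trace you can only obtain by running the original machine first. A minimal sketch (the toy FSM and all names are mine, not the paper's):

```python
def run_fsm(transitions, outputs, start, inputs):
    """Run an FSM on an input sequence, returning the output at each step."""
    state, trace = start, []
    for sym in inputs:
        state = transitions[(state, sym)]
        trace.append(outputs[state])
    return trace

# A toy 2-state FSM: flips state on 'a', stays put on 'b'.
T = {("s0", "a"): "s1", ("s1", "a"): "s0",
     ("s0", "b"): "s0", ("s1", "b"): "s1"}
O = {"s0": 0, "s1": 1}

inputs = ["a", "b", "a", "a"]
recorded = run_fsm(T, O, "s0", inputs)  # must run the original first

# The "simpler" machine of step (3): a counter indexing a lookup table
# of the recorded outputs. It reproduces the trace while computing nothing.
def counter_machine(table):
    counter = 0
    while counter < len(table):
        yield table[counter]
        counter += 1

assert list(counter_machine(recorded)) == recorded
```

The table is parasitic on the original run, which is exactly why "the two machines respond identically" does not make them equivalent in any interesting sense.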

IMO this is analogous to arguing that:

1) the animatronic band at Chuck E. Cheese could be programmed to play identical music to that which has been (previously) performed by a human band (and recorded in perfect detail).

2) because they produce identical outputs the 2 bands are equivalent

3) if they are equivalent, they must equally be said to create original music

4) the animatronic band obviously doesn't create original music

5) from (3) and (4) no band can create original music

He also elides any discussion of whether or not actual human intelligence manages to avoid the failure modes he uses to conclude that neural networks are not intelligent - e.g. he mentions adversarial examples fooling visual classifier networks without mentioning that "optical illusions" exist and people will reliably misperceive certain images in certain ways too.
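For context, adversarial examples of the kind he mentions are typically built by nudging the input against the gradient of the classifier's score. A minimal sketch on a hand-set linear classifier (the weights and inputs are invented for illustration; real attacks target trained networks, but the sign-of-gradient mechanism is the same):

```python
# Toy linear classifier: label 1 if w.x + b > 0, else 0.
w = [1.0, -2.0, 0.5]   # assumed weights, not from any trained model
b = 0.1

def score(x):
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict(x):
    return 1 if score(x) > 0 else 0

def sign(v):
    return 1.0 if v > 0 else -1.0 if v < 0 else 0.0

x = [0.5, 0.1, 0.2]    # scores 0.5, so classified as 1
eps = 0.3              # small perturbation budget

# The gradient of the score w.r.t. x is just w, so step each coordinate
# a tiny amount against sign(w) to push the score down and flip the label.
x_adv = [xi - eps * sign(wi) for xi, wi in zip(x, w)]

print(predict(x), predict(x_adv))  # 1 0
```

The perturbation is small in every coordinate, which is what makes such examples look "the same" to a human while flipping the machine's answer, much as an optical illusion exploits quirks of our own classifier.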

I actually agree that neural nets as they currently exist are aggressively stupid, but the author concludes way too much.

TL;DR, author starts from a premise that there is something uniquely special about human consciousness that machines can't duplicate, and reaches the conclusion that there is something uniquely special about human consciousness that machines can't duplicate.

Also, depending on how physics and the mind works, it's possible that human minds would be representable as finite state machines. If this argument were valid, it's an argument against human consciousness too in that case.

Ignoring the lack-of-any-progress-towards-AGI elephant in the room is presumably the main reason the AI hype train happily speeds on its pointless route, throwing out its cute tricks and tools with no actual destination in sight.

Best not to ignore.

I would definitely argue there's been substantial progress towards AGI. We've been seeing increasingly impressive AI results with increasingly more general architectures. Which criteria do you have for progress towards AGI that aren't being met?

The problem is that AI doesn't actually understand anything. We don't know what understanding is, but we know that an AI doesn't. But because we don't know what understanding is, we don't know how to make an AI do it.

We're seeing AIs that can kind of hide their lack of any kind of understanding by means of more and more sophisticated behavior. But unless understanding = "not understanding, but in a really sophisticated way", that road isn't going to get us to AGI, ever.

Personally, my bet is that understanding != "sophisticated non-understanding". I can't prove it, of course, because I don't know what understanding is either...

[Edit: I suppose the other alternative is that understanding is real, but AGI doesn't actually require understanding. That seems improbable to me, but it is at least theoretically possible.]

We have all been surprised a few times in the last years by AI advancements, not even experts could predict AlphaGo or GPT. Why are you so sure you won't be surprised again, this time by human level understanding?

The missing piece for real understanding is embodiment - we can act on our environment while GPT-3 like models can only see a fixed training set. This means an AI could construct its own hypothesis and test it if it were embodied, but can't do that otherwise. Understanding comes from playing 'the same game' with us: sharing the same environment and having aligned rewards.

Imagine if a scientist were kept locked up all her life and only had access to a video feed showing the world. Would she gain anything by coming out of the cave and actually seeing and interacting with the world?

My concern is, if we don't know what understanding is, how do we know it's not becoming a god of the gaps? Playing Go and writing poetry were canonical examples of things AI couldn't do without true understanding, right up until we wrote AIs that could do them.

Go was never thought of in that way. Search Google books for "Go chess artificial intelligence computer" and use the 20th century time filter.

It was only believed that Go was impervious to the brute-force search strategies employed by the most successful chess programs. The alternative was never "true understanding", but rather believed to be the application of knowledge engineering, expert systems, etc, none of which could be equated with "true understanding".

Regarding poetry, the counter argument is that a modern neural network can't generate poetry any more than a Xerox machine can generate poetry. IOW, it can only replicate styles, not invent new ones. Though, that's often good enough as far as the vast majority of readers are concerned.

>any more than a Xerox machine can generate poetry

You assume, but are you sure...?


Playing Go was a big deal because it could be done by a program that started from zero.

Did somebody create a program that can write poetry without first consuming basically everything humans have ever written? Or even after consuming roughly what a well-read human might have read?

I think it's reasonable to allow for the fact that humans must have a certain amount of information encoded by evolution, but I can't believe it's either qualitatively or quantitatively comparable to a database of all the text on the internet.

perhaps understanding = sophisticated non-understanding + (being unaware that one does not understand || having a belief that one understands)

I know what understanding is...

I don't entirely disagree that we're progressing towards AGI. For me, the most disappointing part of machine learning is the complete lack of systems that can extrapolate.

To be honest I struggle with causal reasoning myself. I can intuit on some level that one thing causes another, and that in the process of causing the "other" the "thing" itself is reflexively brought about. This leaves me stuck in a place trying to make sense of things infinitely far apart in space and time all influencing each other instantaneously.

You can't send information faster than c, so if causality == communication, it's not instantaneous.

"all the impressive achievements of deep learning amount to just curve fitting"

Yeah, and then the same Judea Pearl also admits that "we didn't expect curve fitting to work so well".
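For readers wondering what "curve fitting" means concretely here, a minimal sketch (toy example of mine, not Pearl's): gradient descent adjusting parameters to fit observed input/output pairs is the core mechanism deep learning scales up.

```python
# Fit y = a*x + b to samples drawn from the line y = 3x + 2
# by gradient descent on mean squared error.
data = [(x / 10, 3 * (x / 10) + 2) for x in range(11)]

a, b = 0.0, 0.0   # model parameters, initialized at zero
lr = 0.1
for _ in range(2000):
    # gradients of MSE with respect to a and b
    ga = sum(2 * (a * x + b - y) * x for x, y in data) / len(data)
    gb = sum(2 * (a * x + b - y) for x, y in data) / len(data)
    a, b = a - lr * ga, b - lr * gb

print(round(a, 2), round(b, 2))  # converges to (3.0, 2.0)
```

A deep network does the same thing with millions of parameters and a far more expressive curve, which is both why it works so well and why Pearl calls it "just" curve fitting.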

That's because their mental model of AI is not good enough. Apparently humans don't really 'understand AI', the irony.

By the way, causal reasoning is not the product of just one human, it is based on experiments, observations and careful model building by the whole human society over long spans of time.

We didn't understand even basic things such as infections and the role of hygiene until recently. What does that say about our causal reasoning powers? That we were stupid?

We know how COVID spreads and many people are still exposing themselves without care, sometimes causing their own demise or somebody else's. Why isn't causal reasoning working for us all the time?

I think humans can only do causal reasoning when they have a very good model of the thing they are trying to understand. Causal intelligence is not in our brains naturally, it depends on having access to specific models.

And God is just a function. Who is there to tell that curve fitting is insufficient for AGI?

... so says a curve fitting machine operating in a hyperplane in 100 trillion dimensions

I feel like the author has unreasonably strict expectations with some of his examples. Humans sometimes crash cars, misunderstand grocery lists, or learn to say things just as racist as Tay did - surely he wouldn't argue that any human who does dumb things lacks phenomenal consciousness.

Not to say that any dumb person lacks consciousness, but you might be surprised how many cognitive scientists think not everyone is “conscious.” There could be zombie/automatons living all around us.

> There could be zombie/automatons living all around us.

So now we have 'consciousness of the gaps'. It's an ever-retreating concept: as AI advances, what we call consciousness recedes into these gaps. Now we're discussing how some humans are not really 'conscious'; what next? Maybe in the end the only remaining 'conscious' people will be philosophers who don't believe in AI.

>but that AI machinery - qua computation - cannot understand anything at all

Ugh, here we go. I swear this was all gone over with a very similar post just a week ago where it was pointed out that if an author says physical things can't 'understand' or whatever else, they are implying some non-physical soul-spark in humans.

Right. Same old bad arguments. Undecidability, Chinese Room, Penrose, etc. There's a problem, but that stuff doesn't help.

Machine learning as currently done clearly has limits. The big problem is lack of an underlying model of the real world. Systems which do have real-world knowledge tend to store it as something like predicate form. Cyc is the classic example of that.

A useful question: what should knowledge about the real world look like? More specifically, is there some low-level form in which info about the real world could be represented that makes it useful as training data for machine learning? Skin contact and muscle tension, perhaps?

Then it was pointed out wrongly; this isn't spiritual talk. If you are able to retrofit new knowledge into the observations and categories you've previously formed, correcting what you already know and changing perspective so that new inputs fit without causing contradictions, then you understand it. Do you think curve fitting does this?

I'm not sure what you mean by "curve fitting" here. I would definitely say that a deep neural net has observations and categories, and I think it's pretty reasonable to characterize the training process as retrofitting new knowledge into it by correcting its knowledge and changing its perspective.

This is the reply I was expecting, and I think you are right, but a CNN doesn't take into account any contradictions. It doesn't test the input for correctness, it just tries to categorize and fucked if the results make no sense.

I'd agree, but I'm not sure how much this is an inherent problem. To what degree does it just reduce to needing a good conceptual framework for a "makes no sense" category?

I would be interested in the kind of model which will tell you something is wrong when you show it a black square and call it a black square, then show it a white square and call it a black square. Not the kind of model which would average the results and adapt to nonsense.
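A least-squares learner makes the averaging complaint concrete (the toy setup is mine): shown the same white square with contradictory labels, it settles on the average, and nothing in the loss flags the contradiction.

```python
# Encode "is it black?" as a target: 1.0 = black, 0.0 = white.
# Input feature: 0.0 = black square, 1.0 = white square.
data = [
    (0.0, 1.0),  # black square, called "black"  (consistent)
    (1.0, 1.0),  # white square, called "black"  (the contradiction)
    (1.0, 0.0),  # white square, called "white"
]

# One-feature model y = a*x + c, fit by brute-force grid search on MSE.
def mse(a, c):
    return sum((a * x + c - y) ** 2 for x, y in data) / len(data)

a, c = min(((i / 100, j / 100)
            for i in range(-200, 201)
            for j in range(-200, 201)),
           key=lambda p: mse(*p))

print(round(a * 1.0 + c, 2))  # prediction for a white square: 0.5
```

The fitted model cheerfully answers "half black" for a white square; a model of the kind the parent describes would instead report that the training signal is inconsistent.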

You didn't read the part in the abstract where it says "- qua computation -". This author is trotting out the same old reflex that a computer (any computer, with any algorithm) can never "understand" something because it's "not conscious" because it's "not alive". It absolutely is philosophically equivalent to positing a spirit, because when you start drilling down into what's different between a brain and a (general) computer... that's all you'll have left.

The article is about state of the art computation, which does not model understanding as per my definition. Not spiritualism.

I can't stand the unquestioned use of "understanding" here and there - did the author give a more precise definition of it? (I couldn't find any in the paper.) It reads more like a ranty blog piece than an academic article. It's sad that some people will read this and get confused.

Thanks for the link. It will take quite a while to read the paper.

We all know it but somehow ... 'it is not so much that AI machinery cannot grasp causality, but that AI machinery - qua computation - cannot understand anything at all.'

> Figure (8) shows a screen-shot from an iPhone after Siri, Apple’s AI ‘chatbot’, was asked to add a ‘litre of books’ to a shopping list; Siri’s response clearly demonstrates that it doesn’t understand language

So his conclusion is based on Siri, an AI assistant I would agree is 'stupid', but not representative of SOTA. It's unfair to judge AI by Siri: Siri is a mass-produced system with scaling costs, and Apple can't host GPT-3 for everyone yet. Not even Google can use the latest and greatest neural nets in mass-produced AI systems, because they don't have the hardware and it would not make economic sense.

> the ability to infer causes from observed phenomena

So 'reading the room'. In social settings you can't follow logic and rationale blindly because there are these things on two legs full of meat and organs that don't like it.

Relevant: Youtube's algorithm blocked a video on a popular chess channel with 1800 videos, supposedly for "racism" https://www.youtube.com/watch?v=KSjrYWPxsG8

What is your point? People make bad calls all the time too. I got a ticket for parking my car behind a sign saying "no parking beyond this point."

As a secondary point, I suspect youtube's classifier is a bag of heuristics they are constantly fiddling with. Any failures of it are no indictment of the futility of developing AGI.

1. I don't need "a point" to post relevant information.

2. But I indeed have one - giving algorithms so much power without an appropriate checking and appeal process is clearly wrong.

3. This doesn't imply we shouldn't do science or develop AI systems.

I'm not saying this is the case, but a chess podcast can be racist. It's presented by people.

Sure, it can be. Or it could just talk about black and white, and, say, about how black has a clearly inferior position, and be talking about nothing but chess, and still trigger an algorithm.

It's like the conspiracy theory that Americans didn't land on the Moon. Maybe a few decades will pass and people will be surprised to learn that, on a leading technical forum, people entertained ideas of an absence of progress.
