- I share the skepticism towards any progress towards 'general AI' - I don't think that we're remotely close or even on the right path in any way.
- That doesn't make me a skeptic towards the current state of machine learning though. ML doesn't need to lead to general AI. It's already useful in its current forms. That's good enough. It doesn't need to solve all of humanity's problems to be a great tool.
I think it's important to make this distinction and for some reason it's left implicit or it's purposefully omitted from the article.
The more you learn about neurobiology, the more apparent it is that there are so many levels of computation going on - everything from dendritic structure, to cellular metabolism, to epigenetics has an effect on information processing. The idea that we could reach some approximation of "general intelligence" by just scaling up some very large matrix operations just seemed like a complete joke.
However, as you say, that doesn't mean what we've done in ML is not worthwhile and interesting. We might have over-reached in thinking ML is ready to drive a car without major forthcoming advancements, but use-cases like style transfer and DLSS 2 are downright magical. Even if we only made marginal improvements in current ML, I'm sure there is a ton of untapped potential in applying this tech to novel use-cases.
The way a plane flies is quite different from the way a bird flies in terms of complexity - they share an underlying mechanism, but planes don't need to flap their wings.
It's possible that scaling up does lead to generality and we've seen hints of that.
Also check out GPT-3’s performance on arithmetic tasks in the original paper (https://arxiv.org/abs/2005.14165)
Pages: 21-23, 63
That shows some generality: the best way to accurately predict an arithmetic answer is to deduce how the mathematical rules work. The paper shows some evidence of that, and that's just from a relatively dumb predict-what-comes-next model.
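A rough sketch of how one might probe that (this is not from the paper - it just generates addition problems of increasing digit length, where the longer ones are very unlikely to appear verbatim on the web; the model-querying step is left as a hypothetical helper):

```python
# Generate arithmetic probes of increasing digit length: 2-digit sums are
# plentiful online (so possibly memorized), 5-digit sums almost certainly not.
import random

def make_addition_probes(n_digits: int, n: int = 100, seed: int = 0):
    """Return (prompt, expected_answer) pairs for n_digit + n_digit addition."""
    rng = random.Random(seed)
    lo, hi = 10 ** (n_digits - 1), 10 ** n_digits - 1
    probes = []
    for _ in range(n):
        a, b = rng.randint(lo, hi), rng.randint(lo, hi)
        probes.append((f"Q: What is {a} plus {b}? A:", str(a + b)))
    return probes

def score(model_answer: str, expected: str) -> bool:
    # Hypothetical helper: `model_answer` would come from whichever LM you query.
    return model_answer.strip() == expected

if __name__ == "__main__":
    for d in (2, 3, 5):
        print(d, make_addition_probes(d, n=2))
```

If accuracy holds up as the operands get larger and rarer, that looks more like rule induction than lookup.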
It’s hard to predict timelines for this kind of thing, and people are notoriously bad at it. Few would have predicted the results we’re seeing today in 2010. What would you expect to see in the years leading up to AGI? Does what we’re seeing look like failure?
Few had predicted a reasonably capable text-writing engine or automatic video face replacement, but many had predicted that self-driving cars would be readily available to consumers by now and that semi-intelligent helper robots would be around.
Just because unforeseen advancements have been made does not mean that foreseen advancements come true.
It's about very good general problem solving software that's way beyond the capabilities of humans while not being aligned with human interests. Not because the software is evil, but because aligning values is an unsolved problem (humans aren't even totally aligned - and values also change).
If you have an intelligence that's very good at achieving its goal and you don't have a good way to align its goal with human goals, you can very quickly get into trouble if that intelligence thinks much faster than you do.
Also, self-driving cars were mostly hyped up by companies, FSD is quite obviously a hard problem, much closer to general intelligence than the average NN application.
When I was a kid, I remember wondering how Soviets were obsessed with faking photos. A few years later, I saw Terminator 2 and realized that faking videos was also a thing. The tools for it would clearly get better and better over time. When I studied ML in the early 2000s, it seemed obvious that pattern recognition tasks such as image manipulation would be "easy" for computers, once we found the right approach and made the ML systems big enough. In the end, I decided not to pursue ML, because jobs were still scarce and I found discrete problems more interesting. That was probably the worst career mistake I've ever made.
Or, watch the latest Veritasium
#2 see linked video, he sits in the backseat, there is no driver and no one to take control.
These are just overhyped drive assist tools that market themselves immorally as something they aren’t.
If you ask it to compute something that would never have been seen on the internet it’s likely to fail. E.g. add 2 extremely large/rare numbers together
A rough calculation: a human can feasibly consume maybe 3-4 GJ per year of food energy, of which a lot is going to go into motion or whatever. And even if it were all devoted to mental activity, it has a shelf life of ~70 years before they die.
A distributed computer might theoretically burn TJ/hr and once the weights are known they may as well be in a permanent record. The upper limits of what computers can learn and get good at are much higher than what humans can. They won't need as much implementation trickery as biology to get results.
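Back-of-the-envelope on that, assuming ~2,500 kcal/day, a 70-year span, and (purely for illustration) a 10 MW training cluster:

```python
# Back-of-the-envelope comparison (assumed figures, not measurements):
# a human on ~2,500 kcal/day over ~70 years vs. a cluster drawing ~10 MW.
KCAL_TO_J = 4184

human_daily_j = 2500 * KCAL_TO_J              # ~1.05e7 J/day
human_yearly_j = human_daily_j * 365          # ~3.8e9 J  (~3.8 GJ/year)
human_lifetime_j = human_yearly_j * 70        # ~2.7e11 J (~0.27 TJ total)

cluster_power_w = 10e6                        # assume a 10 MW training cluster
cluster_hourly_j = cluster_power_w * 3600     # ~3.6e10 J/hour (~36 GJ/hour)

print(f"human lifetime energy budget: {human_lifetime_j / 1e12:.2f} TJ")
print(f"cluster energy per hour:      {cluster_hourly_j / 1e12:.3f} TJ")
print(f"hours for cluster to match a human lifetime: "
      f"{human_lifetime_j / cluster_hourly_j:.1f}")
```

On those assumed numbers, a single modest cluster burns through a human lifetime's worth of energy in well under a day.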
And a plane is a vastly simpler machine than a bird!
I think this for a couple reasons:
1. The current gap in complexity is so huge. Nodes in an ANN roughly correspond to neurons, and the brain has somewhere on the order of 100 billion of them.
Even if we built an ANN that big, we would only be scratching the surface of the complexity we have in the brain. Each synapse is basically an information processing unit, with behavioral characteristics much more complicated than a simple weight function (some rough numbers are sketched below).
2. The brain is highly specific. The structure and function of the auditory cortex is totally different to that of the motor cortices, to that of the hypothalamus and so on. Some brain regions depend heavily on things like spike timing and ordering to perform their functions. Different brain regions use different mechanisms of plasticity in order to learn.
Currently most ANN's we have are vaguely inspired by the visual cortex (which is probably why a lot of the most interesting things to come out of ML so far have been related to image processing) and use something roughly analogous to net firing frequency for signal processing. I would consider it highly likely that our current ANNs are just structurally incapable of performing some of the types of computation we would consider intrinsically linked to what we think of as general intelligence.
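To put point 1 in rough numbers (commonly cited ballpark figures, and treating one ANN weight as one synapse, which is itself generous):

```python
# Rough orders of magnitude (commonly cited figures, not precise measurements).
brain_neurons = 86e9            # ~86 billion neurons
synapses_per_neuron = 1e4       # ~10,000 synapses per neuron (rough average)
brain_synapses = brain_neurons * synapses_per_neuron   # ~8.6e14

gpt3_params = 175e9             # GPT-3 weights, treating one weight ~ one synapse

print(f"brain synapses  : {brain_synapses:.1e}")
print(f"GPT-3 parameters: {gpt3_params:.1e}")
print(f"ratio           : ~{brain_synapses / gpt3_params:,.0f}x")
# ...and that is before counting the processing inside each synapse,
# which is far richer than a single scalar weight.
```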
To make the airplane analogy, I believe we're probably closer to Leonardo da Vinci's early sketches of flying machines than we are to the Wright Brothers. We might have the basic idea, but I would wager we're still missing some of the key insights required to get AGI off the ground.
edit: it looks like you added some lines while I was typing, so to respond to your last points:
> it’s hard to predict timelines for this kind of thing, and people are notoriously bad at it. Few would have predicted the results we’re seeing today in 2010. What would you expect to see in the years leading up to AGI? Does what we’re seeing look like failure?
I totally agree that it's hard to predict, that technology usually advances faster than we expect, and that tremendous progress is being made. But the road to understanding human intelligence has been characterized by a series of periods of premature optimism followed by setbacks. For instance, in the 20th century, when dyes were getting better, and we were starting to understand how different brain regions had different functions, it may have seemed like we were close to just mapping all the different pieces of the brain, and that completing the resulting puzzle would give a clear insight into the workings of the human mind. Of course it turns out we were quite far from that.
As far as what we can expect in the years leading up to AGI, I suspect it's going to be something that comes on gradually - I think computers will take on more and more tasks that were once reserved for humans over time, and the way we think about interfacing with technology might change so much that the concept of AGI might not seem relevant at some point.
As to whether the current state of things is a failure - I would not characterize it that way. I think we're making real progress, I just also think there is a bit of hubris that we may have "cracked the code" of true machine intelligence. I think we're still a few major revelations away from that.
That's hardly accurate - didn't Musk and Co. promise self-driving cars by 2012? We're in 2020, and SDCs are great for making YouTube videos, but not any good at piloting a vehicle without human intervention.
Since the 90s it has been clear that the only thing holding back what we have today is limited processing power. While there may be some new insights and directions in AI, they are not "general" and they require 3 orders of magnitude more processing power for a lot smaller improvement in performance.
What has been clear since 2010 is that this field has passed the point of diminishing returns already. We throw vastly more computational power at problems that we ever did before, and then call the result an improvement.
Deep Blue beat the best human at chess using 11.8 GFLOPS of computational power. AlphaGo beat the best human at Go using 720,000 GFLOPS. The complexity difference between chess and Go is within a single order of magnitude - a 10x to 99x difference (https://en.wikipedia.org/wiki/Game_complexity). The difference in AI processing power needed to beat the best human between chess and Go is between 4 and 5 orders of magnitude (10,000x to 100,000x).
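Spelling out the arithmetic on those figures:

```python
# The ratios from the figures above, spelled out.
deep_blue_gflops = 11.8        # chess, 1997
alphago_gflops = 720_000       # go, 2016

compute_ratio = alphago_gflops / deep_blue_gflops
print(f"compute ratio: ~{compute_ratio:,.0f}x")   # ~61,000x, i.e. between 1e4 and 1e5

# Taking the complexity gap between chess and go as ~10x-100x (as argued above),
# the compute spent grew roughly 3 orders of magnitude faster than the problem.
complexity_ratio = 100
print(f"compute growth per unit of added complexity: ~{compute_ratio / complexity_ratio:,.0f}x")
```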
This does not look like a success to me - it looks like a brute-force approach. If you spend 10000x more resources for a 10x more benefit, you're at the point of diminishing returns.
Here's a great paper that should be written (but won't be) - plot the improvements in AI and the usage of computational power for AI on the same chart.
From the 90s (https://en.wikipedia.org/wiki/History_of_self-driving_cars#1...):
"The robot achieved speeds exceeding 109 miles per hour (175 km/h) on the German Autobahn, with a mean time between human interventions of 5.6 miles (9.0 km), or 95% autonomous driving."
Yup, 95% autonomous. Today we have 95.x% autonomous with roughly 10,000x the computing power thrown at the problem.
So, yeah, your assertion that "Few would have predicted the results we’re seeing today in 2010." is wildly off mark, we predicted more than what we see today because we did not expect to hit a point of diminishing returns quite so quickly.
The people who did the 95% SDC in 1997 would have been disbelieving if anyone told them, in 1997, that even with 10000x more processing power thrown at the problem and new sensor hardware that was not available to them, it won't get much better than what they had.
- FSD is one example, but the improvement of computer vision since 2015 has been massive and deep learning approaches to general problem solving too. This wasn’t something people were predicting in 2010.
- The Deep Blue and Stockfish-style approaches vs. the AlphaGo or AlphaZero approaches are categorically different - the latter being a lot more interesting and closer to general learning, vs. the older approach, which is more like brute force.
- GOFAI was a bad approach and the optimism in the 60s was wrong. Today’s looks more promising. Being wrong in the 60s doesn’t necessarily mean people are wrong now. It’s hard to know: https://intelligence.org/2017/10/13/fire-alarm/
For the AGI bit I’d recommend reading some of Eliezer Yudkowsky’s writing or Bostrom’s book (though I find Bostrom’s writing style tedious). There’s a lot of good writing about take offs and AGI/goal alignment that’s worth reading to get a base level understanding of the concepts people have thought through.
AGI doesn't need to be human-like to be dangerous - it can be good at general problem solving with poorly aligned goals and just act much faster. Brains exist everywhere in nature, simpler than human brains. A lot of that computation in training could be an analog of the genetic "pre-training" of evolution for humans that gets our baseline, which could be one reason humans don't seem to require so much. There was a massive amount of "computation" over time via natural selection to get to our current state.
> plot the improvements in AI and the usage of computational power for AI on the same chart.
Would that be meaningful? I mean, I use infinitely more computational power for writing a letter than people did 100 years ago, and still produce more or less the same results.
A lot of birds don't need to flap their wings either.
And while our brain is objectively very impressive, I don't see how our complex abilities are anything but emergent features.
It's probably true that you could imagine a "perfectly designed" brain which could perform better on some tasks with less complexity, but I think it's also true that there's been a lot of selection pressure towards increased intelligence, so this is probably fairly well optimized.
> why don’t we have a normal abstraction for sending signals? Instead, we have like 10s of slightly different ones with different failings each, but each having many repetitive machinery leading to inefficient “spaghetti code”.
What do you mean exactly by this? Like different neurotransmitter systems? Because I think it's actually quite elegant how the properties of different neurohormones lead to different processing modalities. It's like we have purpose-built hardware on the scale of individual proteins specialized for different purposes. I'm not so sure a more homogenized process for neural signaling would be an improvement.
As for how optimized the human brain is - well, good question. I think not even a single biological cell is close to efficient; at most it is at a local optimum. The reason is perhaps that "worse is better" in terms of novel functionality. But I don't think there was much evolutionary pressure on intelligence once a sufficient level emerged - it is sort of first-past-the-post-wins-all.
I have to be honest, I would take any such comparison from the 1950s with a huge pinch of salt. I think perceptions about how "dumb" an individual neuron is as a processing unit have shifted quite a bit since then.
> Also, I don’t think that comparing the training of a neural network to the brain is fair from an energy usage point of view - compare the usage of the final NN with it.
I'm not considering this in terms of the training efficiency, I'm looking at it in terms of the ratio between operational utility and energy used. There's no trained ANN with anything remotely close to the overall utility of the human brain at any scale, let alone one that weighs 3lbs, fits into a human skull and runs on 20 Watts of power.
The fundamental difference between our current approach and biological brains is just as much a hardware one as a theoretical one. CPUs and GPUs are simply not the best fit for this sort of usage - a "core" is way too powerful for what a single neuron does (even granting the more correct belief that neurons are not as dumb as we first thought), even if each core can calculate multiple neurons simultaneously. I'm not sure of the specifics, but couldn't we print a pre-trained NN to a circuit that could match/beat a simple biological neural network in both speed and power efficiency? Cells are inefficient.
I just don't think this is a meaningful comparison, and I'm not convinced it's evidence of the "limitations" of biological computation.
Silicon beats biology at binary computation because it is a single-purpose machine built for that task. But a brain is capable of serving as a control system to operate millions of muscle fibers in parallel to navigate the body smoothly through unpredictable 3D space, while at the same time modulating communication to find the right way to express thoughts and advance interests in complex and uncertain social hierarchies, while at the same time forming opinions about pop culture, composing sonnets, falling in love and contemplating death.
For me to buy the argument that ANNs can be more efficient than biology, you'd have to show me a computer which can do all of that using fewer resources than the human brain. Currently we have an assembly line for math problems.
> a “core” of them is way too powerful for what a single neuron can do
I just think you're vastly under-counting the complexity of what happens inside a single neuron. At every synapse, there's a complex interplay of chemistry, physics and biology which constitutes the processing of the neurotransmitter signal from the presynaptic neuron. To simulate a single neuron accurately, we actually need all the resources of a very powerful computer.
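For a sense of scale, here's a leaky integrate-and-fire sketch - about the crudest standard point-neuron model there is (parameters are textbook-ish values, not tuned to anything) - and it already has to be stepped through time as a differential equation, while discarding all the dendritic structure, chemistry and metabolism mentioned above:

```python
# A leaky integrate-and-fire neuron: about the simplest standard point-neuron
# model, integrated with Euler steps. Biophysical models (Hodgkin-Huxley,
# multi-compartment dendrites) are orders of magnitude heavier, and even they
# ignore most of the intracellular machinery.
import numpy as np

def simulate_lif(input_current, dt=1e-4, tau=0.02, v_rest=-0.065,
                 v_thresh=-0.050, v_reset=-0.070, resistance=1e8):
    """Integrate dV/dt = (-(V - v_rest) + R*I) / tau; emit a spike at threshold."""
    v = v_rest
    spikes = []
    for t, i_in in enumerate(input_current):
        dv = (-(v - v_rest) + resistance * i_in) * (dt / tau)
        v += dv
        if v >= v_thresh:
            spikes.append(t * dt)   # record spike time in seconds
            v = v_reset
    return spikes

# 200 ms of a constant 200 pA input current
current = np.full(2000, 200e-12)
print(simulate_lif(current))
```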
So it may be the case that we can boil down intelligence to some kind of process which can be printed in silicon. But I think it's also entirely likely that the extreme parallelism (vast orders of magnitude greater than the widest GPU) of the brain is required for the kind of general intelligence that humans express, and the "slowness" of biological computation is a necessary trade-off for the flexibility we enjoy. If that's the case, it's going to be very hard for a serial computer to emulate intelligence.
Also, while indeed we can't simulate a whole neuron, why would we want to do that? I think that is backwards. We only have to model the actually important functions of a neuron. If we had a water computer, would it make sense to simulate the fluid dynamics instead of just the logic gates? Due to the messiness of biology, some hard-to-model factors will indeed affect things (in the analogy, water will be spilt/evaporated), but we should overlook the ones that have minimal influence on the results.
Yeah so I think this is where we fundamentally differ. It seems like your assumption is that neurobiology is fundamentally messy and inefficient, and we should be able to dispense with the squishy bits and abstract out the real core "information processing" part to make something more efficient than a brain.
So if that's your assertion, what would that look like? What would be the subset of a neuron that we could simulate which would represent that distillation of the information processing part?
Because my argument would be, the squishy, messy cellular anatomy is the core information processing part. So if we try to emulate neural processing with the assumption that a whole neuron is the base unit, we will miss a lot of that micro-level processing which may be essential to reaching the utility and efficiency achieved by the human brain.
I'm not against the idea that whatever brains we happened to evolve are not the most efficient structure possible. But my position would be, we're probably quite far, in terms of current computing technology, from being able to build something better. I would imagine we might have to bioengineer better neurons if we really want to compete with the real thing, rather than trying to simulate it in software.
But I feel I may be misrepresenting your point now. To answer your question, maybe a sufficient model (sufficient to reproduce some core functionality of the brain, e.g. making memories) would be one that incorporates a weight for each sort of signal (neurotransmitter) it can process, complete with a fatigue model per signal type; we could perhaps also add the notable major interactions between pathways (e.g. activation of one temporarily decreasing the weight of another - though in a way, bias is sort of this in very basic NNs). But to be honest, such a construction would be valuable even with arbitrary types of signals - no need to model it exactly on existing neurotransmitters. I think most properties interesting from a GAI perspective are emergent ones, and whether dopamine does this or that is an implementation detail of human brains.
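Something like this toy sketch, maybe (all class/field names and constants are invented purely for illustration):

```python
# Toy sketch of the unit described above: one weight per signal type, plus a
# per-signal "fatigue" that temporarily depresses a pathway after heavy use.
from dataclasses import dataclass, field

@dataclass
class MultiSignalUnit:
    weights: dict            # e.g. {"A": 1.2, "B": -0.4} - one weight per signal type
    fatigue_rate: float = 0.2
    recovery_rate: float = 0.05
    fatigue: dict = field(default_factory=dict)

    def step(self, inputs: dict) -> float:
        total = 0.0
        for sig, x in inputs.items():
            f = self.fatigue.get(sig, 0.0)
            total += self.weights[sig] * x * (1.0 - f)        # fatigued pathways contribute less
            self.fatigue[sig] = min(1.0, f + self.fatigue_rate * abs(x))
        for sig in self.fatigue:                              # all pathways slowly recover
            self.fatigue[sig] = max(0.0, self.fatigue[sig] - self.recovery_rate)
        return max(0.0, total)                                # ReLU-ish output

unit = MultiSignalUnit(weights={"A": 1.0, "B": -0.5})
for t in range(5):
    print(unit.step({"A": 1.0, "B": 0.2}))   # output decays as the "A" pathway fatigues
```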
You only need to accurately simulate the input and the output.
Frankly, if that can’t be done with a Markov process I’d be very surprised, and we already know that Markov chains can be simulated with ANNs
For instance one of those is spike-timing dependent plasticity. Basically the idea is that the sensitivity of a synapse gets up-regulated or down-regulated depending on the relative timing of the firing of the two neurons involved. So in the classic example, if the up-stream neuron fires before the down-stream neuron, the synapse gets stronger. But if the down-stream neuron fires first, the synapse gets weaker.
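To make that concrete, here's the standard pair-based exponential STDP rule (the constants are typical textbook-style values, nothing specific to any real circuit):

```python
# Pair-based STDP: the weight change depends on the sign and size of
# (t_post - t_pre). Constants are illustrative, not measured values.
import math

def stdp_dw(t_pre_ms: float, t_post_ms: float,
            a_plus=0.01, a_minus=0.012, tau_ms=20.0) -> float:
    dt = t_post_ms - t_pre_ms
    if dt > 0:    # pre fired before post -> potentiation (synapse strengthens)
        return a_plus * math.exp(-dt / tau_ms)
    elif dt < 0:  # post fired before pre -> depression (synapse weakens)
        return -a_minus * math.exp(dt / tau_ms)
    return 0.0

print(stdp_dw(10, 15))   # pre before post: positive weight change
print(stdp_dw(15, 10))   # post before pre: negative weight change
```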
Another one is synchronization. It appears that the firing frequency of groups of neurons which are - for instance representing the same feature - become temporally synchronized. I.e. you could have different neural circuits active at the same time in the brain, but oscillating at different frequencies.
Another interesting mechanism is how dopamine works in the Nucleus Accumbens. Here you have two different types of receptors at the same synapses: one of them is inhibitory, and is sensitive at low concentrations of dopamine. The other is excitatory, and is sensitive at high concentrations. What this means is, at a single synapse, the same up-stream neuron can either increase or decrease the activation of the down-stream neuron: if the up stream neuron is firing just a little, the inhibitory receptors dominate. But if it's firing a lot, the excitatory receptors take over, and the down-stream neuron starts to activate more. Which kind of connection weight in an ANN can model that kind of connection?
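To sketch what that question is asking for - here's a toy rate-dependent "weight" whose effective sign flips (the crossover point and gains are invented for illustration), which is exactly the kind of thing a single scalar weight doesn't capture:

```python
# Toy illustration of the concentration-dependent sign flip described above:
# the same presynaptic input inhibits at low firing rates and excites at high
# ones. Crossover point and gains are invented.
def effective_drive(presynaptic_rate_hz: float,
                    crossover_hz: float = 20.0,
                    inhibitory_gain: float = -0.5,
                    excitatory_gain: float = 1.5) -> float:
    if presynaptic_rate_hz < crossover_hz:
        return inhibitory_gain * presynaptic_rate_hz          # low dopamine: net inhibition
    # high dopamine: excitatory receptors dominate
    return inhibitory_gain * crossover_hz + excitatory_gain * (presynaptic_rate_hz - crossover_hz)

for rate in (5, 15, 25, 40):
    print(rate, effective_drive(rate))   # drive goes from negative to positive as rate rises
```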
My overall question would be: do you think back-propagation and Markov chains are really sufficient to account for all that subtlety we have in neural computation, especially when it comes to specific timing and frequency-dependent effects?
To boil it down, if you really want to argue that the behaviour of a neuron can’t be simulated by an ANN, you’re arguing that a neuron is doing something non-computable. At which point you might as well argue it’s magical.
1. Can ANNs (in their current iteration) achieve general intelligence?
2. Can they do it more efficiently than a biological brain?
It certainly has not been established that a Turing machine can achieve general intelligence.
The biggest thing is that computing is just now finally able to marshal enough data and large enough networks to really start to create more generalized models. With a computer 20 years ago you might be able to squeeze out simple pattern recognition, but every layer and every node you add to a neural network makes the model more complex and able to fold more edge cases into itself.
Take a look at the universal approximation theorem and how, with enough nodes in a neural network, you can approximate pretty much any function given the right weights.
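A toy illustration of that idea (plain numpy, arbitrary width and learning rate - it demonstrates approximation capacity on one function, nothing more):

```python
# A single hidden layer of tanh units fit to sin(x) with plain gradient descent.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-np.pi, np.pi, 200).reshape(-1, 1)
y = np.sin(x)

hidden = 30
W1 = rng.normal(0, 1, (1, hidden))
b1 = np.zeros(hidden)
W2 = rng.normal(0, 0.1, (hidden, 1))
b2 = np.zeros(1)

lr = 0.1
for step in range(5000):
    h = np.tanh(x @ W1 + b1)                 # forward pass
    pred = h @ W2 + b2
    err = pred - y
    # backprop through the two layers
    dW2 = h.T @ err / len(x)
    db2 = err.mean(0)
    dh = err @ W2.T * (1 - h ** 2)
    dW1 = x.T @ dh / len(x)
    db1 = dh.mean(0)
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print("final mean squared error:", float((err ** 2).mean()))
```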
I also find it improbable that intelligence will emerge from modern ML without some major leap. But you have added nothing to the discussion, beyond some impressions from undergrad, when we are talking about something that is a very active and evolving research area. It's insulting to researchers and practitioners who have devoted years to studying ML to just dismiss broad areas of applicability because you took a course once.
Do you disagree substantively with anything I have said, or do you just think I could have phrased it better?
If your conclusion is that current gradient based methods probably won't scale up to AGI, you're probably right. But if you want to get involved in the discussion of why this is true, what ML actually can and can't do, etc. I would encourage you to learn more about the subject and the current research areas, and draw on that for your discussion points.
Otherwise, it comes across as "I once saw a podcast that said..." type stuff that is hard to take seriously.
No doubt I come across as condescending, please take what I say with the usual weight you'd assign to the views of a random guy on the internet :)
If you can’t characterize the technical problem that creates a limitation then you are just expressing an uninformed opinion.
Even if you were an expert!
Once we finally had the tools to even start trying, in the late 80s/early 90s, it took us a very long time to "calibrate" these general ideas and figure out the "devils in the details" that were necessary to make certain ideas viable (for example, neural networks were discarded as a dead end in the 80s, and only considerably later were we able to discover that multi-layer networks essentially "salvaged" the idea).
Machine learning without the era of "modern computers" was a bit like flight before we'd really mastered the internal combustion engine - we understood quite a bit about it, and had theories about a lot of stuff (like the basic shape of a wing), and could successfully build gliders and such. Contrary to a lot of propaganda, the Wright Brothers didn't just arrive in the world like "lightning from a clear sky", but ... it had to become practical to do for us to then move on to putting the ideas through the paces, and all of the established theory from beforehand ran into the usual treatment of "no plan of battle survives contact with the enemy".
However until the hardware and software support for mainstream massively parallel execution became available it was a niche tool.
So the level of adoption, experimentation, deployment and research resources available are multiple orders of magnitude greater than 20 or 30 years ago.
As a practitioner (for my entire career) the field still operates as a new field, with enormous areas for new experimentation and interesting new creative advances happening quickly.
So we are still at the beginning.
Perhaps, CAI for inference or insight would express it more fairly.
Alternatively, AI could've stood for 'automated inference', but sure it's all too late to rebrand.
We humans are still not clear about the nature of our own intelligence, yet we already claim to be able to manufacture it.
There were a few new things, like ignoring the model and choosing the right variables - whatever was available was thrown into the equation, and if it was clear that such a "model" was over-fitted, there were methods to overcome that by adding some random coefficients to smooth it a little.
So the naming is there... it could be modified by adding some clarification that we don't care that much about understanding the model we plan to use.
> I share the skepticism towards any progress towards 'general AI' - I don't think that we're remotely close or even on the right path in any way.
This isn't how science works though. Quoting the wikipedia page for Thomas Kuhn's "The Structure of Scientific Revolutions" (https://en.wikipedia.org/wiki/The_Structure_of_Scientific_Re...):
"Kuhn challenged the then prevailing view of progress in science in which scientific progress was viewed as "development-by-accumulation" of accepted facts and theories. Kuhn argued for an episodic model in which periods of conceptual continuity where there is cumulative progress, which Kuhn referred to as periods of "normal science", were interrupted by periods of revolutionary science."
I think this is the accepted model in the philosophy of science since the 1970s. That's why I find this argument about AI so strange, especially when it comes from respected science writers.
The idea that accumulated progress along the current path is insufficient for a breakthrough like AGI is almost obviously true. Your second point is important here. Most researchers aren't concerned with AGI because incremental ML and AI research is interesting and useful in its own right.
We can't predict when the next paradigm shift in AI will occur. So it's a bit absurd to be optimistic or skeptical. When that shift happens we don't know if it will catapult us straight to AGI or be another stepping stone on a potentially infinite series of breakthroughs that never reaches AGI. To think of it any other way is contrary to what we know about how science works. I find it odd how much ink is being spent on this question by journalists.
I think, in a way, Doctorow is making that same argument for the current state of ML: "I don't think that we're remotely close or even on the right path in any way". In other words, general thinking that ML will lead to AGI is stuck in a rut and needs a new approach, and no amount of progressive improvement on ML will lead to AGI. I don't think Doctorow's opinion here is especially insightful, he's just a writer so he commits thoughts to words and has an audience. I don't even know whether I agree or not. But I do think this piece comes off as more in the spirit of Kuhn than you're suggesting.
And of course you can interpret Kuhn however you want. I don't think Kuhn was saying you shouldn't use/apply the tools built by normal science to everyday life. But he, subtly, argues that some level of casting off entrenched dogmatic theories, in the academic domain, is a requirement for revolutionary progress. Kuhn agrees that rationalism is a good framework for approaching reality, but also equates phases of normal science to phases of religious domination that predated it. Essentially truly free thought is really really hard because society invents normals (dogma) and makes it hard to deviate. Academia is no exception. Science, during periods of normals, is (or can become) essentially over-calibrated and over-dependent on its own contemporary zeitgeist. If some contemporary theory that everyone bases progressive research off of is not quite right, it kinda spoils the derivative research. Not always true because sometimes the theories are correct.
I felt like the part that wasn't in line with Kuhn was the idea that there was something wrong with a field if incremental improvement couldn't lead to a breakthrough like AGI. You're right. He's arguing Kuhn's point. But he seems to use it to conclude that machine learning is a dead end when it comes to AGI. Further, he seems to think this means AGI won't happen any time soon.
But, if I'm not misinterpreting Kuhn again, knowing that a revolution is necessary to overturn the current dogma (which I would argue is deep learning) doesn't tell us anything about when the revolution will occur. It could be tomorrow or 50 years from now or never. So, specifically, it doesn't tell us anything about machine learning in general, whether AGI is possible, or when AGI will happen.
We skeptics aren't skeptical that AI is possible, we're skeptical of specific claims. I think it's perfectly reasonable to be skeptical of the optimistic estimates, since they really are little more than guesses with little or no foundation in evidence.
I agree that one would think that Science Fiction writers would have enough of an imagination to be able to consider alternate futures (Cory CYA's by saying such a scenario would make a good SF story) - but there are already promising approaches to AGI: Minsky's "Society of Mind", Jeff Hawkins' neuro-based approaches, the fairly new Hinton idea GLOM: https://www.technologyreview.com/2021/04/16/1021871/geoffrey... .
“By 2029, computers will have human-level intelligence,” Kurzweil said in an interview at SXSW 2017.
Time to get to work, eh? https://www.timeanddate.com/countdown/to?msg=Kurzweil%20AGI%...
1993 - Vernor Vinge predicts super-intelligent AIs 'within 30 years'.
2011 - Ray Kurzweil predicts the singularity (enabled by super-intelligent AIs) will occur by 2045, 34 years after the prediction was made.
So until his revised timeline of 2029, the distance into the future at which we would achieve strong AI (and hence the singularity) was, according to its most optimistic proponents, receding by more than 1 year per year.
I wonder what it was that led him to revise his timeline so aggressively. I think all of those predictions were unfounded; until we have a solid concept for an architecture and a plan for implementing it, an informed timeline isn't possible.
Perhaps, but "philosophy of science" has never been something the majority of practicing scientists consider relevant, care about, or are influenced by.
I actually think that AGI is deceptively simple. I don't have a proof, but I have a (rather embryonic, frankly) theory of how it's gonna work.
I believe AGI is an analogue of third Futamura projection, but for (reinforcement) learners and not compilers.
So the first level is: you have a problem and a learner, and you teach the learner to solve the problem. The representation of the problem is implicit in the learner.
The second level is that you have a language, which can describe the problem and its solution, and a (2nd level) learner, and you teach the 2nd level learner to create (1st level) solvers of the problem based on the problem description language. The ability to interpret the problem description language is implicit in the 2nd level learner.
The third level is, you have a general description language that is capable of describing any problem description language, and you teach the 3rd level learner to take a description of the problem description language, and produce 2nd level learners that can use this language to solve problems created in it.
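In very rough type-signature form (all the names here are invented, just to restate the three levels):

```python
# Very rough type-signature sketch of the three levels; all names are invented.
from typing import Callable

Problem = object          # an instance of some task
Solution = object
Language = object         # a formal description language
Description = object      # a problem stated in some language

Solver = Callable[[Problem], Solution]

# Level 1: a learner trained on one problem *is* a solver; the problem is implicit.
Level1Learner = Solver

# Level 2: given a problem description (in a fixed language), produce a solver.
Level2Learner = Callable[[Description], Solver]

# Level 3: given a description *of a description language*, produce a
# level-2 learner that can interpret problems written in that language.
Level3Learner = Callable[[Language], Level2Learner]
```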
Now, just like in Futamura projections, this is where it stops. You have a "generally intelligent" creature on the 3rd level. You can talk to them on level of how to effectively describe or solve problems (create a specialized language for it) and they will come all the way down with the way to attack (solve) them.
In humans, the 3rd level, general intelligence (AKA "sentience"), evolved eventually from the 2nd level, and it was a creation of the general internal language (which probably co-evolved to be shared). The 2nd level is an internal representation of the world that can be manipulated, but only ever refer to the external world, not itself, so it allows creatures to make conscious plans, but lack the ability to reflect on the planning (and also learning) process itself. The "bicameral mind" is a theory how we acquired 3rd level from the 2nd, and the 3rd level is why "we are strange loops".
Anyway, the problem is, the higher you go up the chain, the harder it becomes to create the learner - it's a much more general problem. But I think the ladder must be, and should be, climbed. I believe that DeepMind (and RL research) has solved the 1st level, is now working on the 2nd level, but they already somewhat dimly see the 3rd level.
I beg to disagree. They clearly state your opinion at the end of the piece, using the metal-beating analogy: great things were done by blacksmiths beating metal, but an internal combustion engine was never going to be one of them.
Why I'm pro-AI: Neural nets.
I worked on object detection for several years at one company using traditional methods, predating TensorFlow by a few years. We had a very sophisticated pipeline that had a DSP front end and a classical boundary detection scheme with a little neural net. The very first SSDMobileNet we tried blew away 5 years worth of work with about two weeks of training and tuning.
Other peers of mine work in industrial manufacturing, and classification and segmentation with off the shelf NN's has revolutionized assembly line testing almost overnight.
So yes, DNNs absolutely do some things vastly better than previous technology. Hands down.
Why I'm Anti-AI: hype
The class of problems addressed by recent developments in NN/DNN software has failed horribly in scaling to even modestly real-world, rational multi-tasking. ADAS level 5 is the poster child. When hype master Elon Musk backs away, that is telling.
We're on the bleeding edge here, IMHO we NEED to try everything. There's no telling which path has fruit. Look at elliptic curves: half a century with no applications, now they are the backbone of the internet. Yes, there will be BS, hype, snake oil, vaporware, but there will also be some amazing tech.
I say be patient and skeptical.
It's explicitly right there in the essay...
> Machine learning has bequeathed us a wealth of automation tools that operate with high degrees of reliability to classify and act on data acquired from the real world. It’s cool!
> Brilliant people have done remarkable things with it.
You seem to be in agreement with the article but don't realize it.
As a side note, I'd like to say humanity's own intelligence is actually able to come up with solutions to its problems; we don't need AGI for that. Humanity is unable to implement those solutions for reasons beyond the technical. How an AGI would get over those hurdles, I have no idea.
Racial bias in facial recognition: error rates up to 34% higher for dark-skinned women than for lighter-skinned males. "Default camera settings are often not optimized to capture darker skin tones, resulting in lower-quality database images of Black Americans." https://sitn.hms.harvard.edu/flash/2020/racial-discriminatio...
Chicago’s “Heat List” predicts arrests, doesn’t protect people or deter crime: https://mathbabe.org/2016/08/18/chicagos-heat-list-predicts-...
And that's the problem with ML in general: its failure to recognize the implicit biases in the choice of dataset and training, and the resulting problems - of which Microsoft's racist chatbot Tay is merely the most blatantly ludicrous.
It's fine, these are not complicated problems, and they are much easier to spot and fix than most problems in software engineering at scale. Don't be fooled by the negative PR campaigns and clickbait, there's no reason to be skeptical about ML in general because of this.
Also, Tay attempted to solve a much harder problem than image classification. It's hard to build a safe hyperloop. It's no longer hard to build a safe microwave oven.
Another issue is that for many of these systems a racial bias in the error rate has mild business impact, which makes fixing it harder to prioritize. A final issue is that the work needed to fix this tends to be less interesting than most of the other work that goes into making the system.
So overall, the main issues are a lack of good open-source fair datasets with loose licensing, the cross-organizational effort needed to solve it (engineers cannot just code up a fair dataset), and business prioritization.
edit: Also, "solve" here means getting accuracy across races to be close, not getting errors to zero. ML models will always have an error rate, and if your goal is zero errors related to racial factors, that is extremely hard. Modeling is about making estimates of data, not knowing the truth of that data.
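A minimal sketch of the kind of per-group check I mean (the data here is made up):

```python
# Compare error rates per group and report the gap, rather than expecting
# zero error overall. Records are invented for illustration.
from collections import defaultdict

def error_rates_by_group(records):
    """records: iterable of (group, prediction, label) tuples."""
    errors, totals = defaultdict(int), defaultdict(int)
    for group, pred, label in records:
        totals[group] += 1
        errors[group] += int(pred != label)
    return {g: errors[g] / totals[g] for g in totals}

records = [("A", 1, 1), ("A", 0, 1), ("A", 1, 1), ("A", 1, 1),
           ("B", 0, 1), ("B", 0, 1), ("B", 1, 1), ("B", 1, 1)]
rates = error_rates_by_group(records)
gap = max(rates.values()) - min(rates.values())
print(rates, "gap:", gap)   # the goal is a small gap, not zero errors overall
```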
No doubt there are still plenty of other issues with ML that haven't (yet) made it to popular attention, and the people employing it aren't making decisions based on social value or common good, but simply invoking free markets and capitalism as their guiding philosophies.
Of course, if you don't take account of the difficulties that come with using the tool, then you might be acting with racial bias, but that's different. Otherwise, all cameras/eyes/visual imaging methods are "racist".
Bringing up quantitative vs qualitative analysis is just silly, since science has had this problem way before AI. Hume famously described it as the is/ought problem†. And that was a few hundred years ago.
Finally, dropping the mic with "I don't think we're anywhere close to consciousness" is just bizarre. I don't think that any serious academic working in AI/ML has made any arguments that claim machine learning models are "conscious." And Strong AI will probably remain unattainable for a very long time (I'd argue forever). This is not a particularly controversial position.
† Okay, it's not the same thing, but closely related. I suppose the fact–value distinction might be a bit closer.
> We don’t have any consensus on what we meant by “intelligence,” but all the leading definitions include “comprehension,” and statistical inference doesn’t lead to comprehension, even if it sometimes approximates it.
So now the semantic shell game is stuck on defining "comprehension". In the next paragraph he starts to suggest it has something to do with generalization -- but that's a concept around which ML practitioners are constantly innovating in formalizing, and using those formal measures to good effect.
Also, "comprehension" is absent in plenty of definitions of intelligence. Take Oxford's "the ability to acquire and apply knowledge and skills". Huge parts of the world work around notions of intelligence demonstrated through action, not a philosophical abstraction.
I'll never understand the "ML won't make my version of AGI" crowd's view on science in general. "This won't work in ways I refuse to define" isn't scientific criticism, and doesn't show any particular curiosity or interest in advancing the state of the art. It's just a rhetorical pose that seems aimed at building up a platform for the next time there's some AI pratfall to point out.
I am not aware of any ML in flight controls. Being black box and probabilistic by nature, these things won’t get past industry standards and regulations (at least for a while).
(Hah, I accidentally wrote "self-landing cars," fixed). But yeah, I guess I was thinking more of drones, I'm not exactly sure what ML (if any) is in the guts of a commercial or military airplane.
We have a system now, TCAS II which is reliable but has a lot of false positives and has a big limitation: both aircraft need to have TCAS in order to detect and resolve a conflict. It's also an environment that is very simple to simulate and model mathematically compared to virtually anything else AI will ever be applied to. TCAS II will be around for a very long time, like two decades, while ACAS X is deployed. So the AI system will also have a backup that already works.
This is really the perfect target for AI: clear backups, we just need some extra capabilities and warnings, easy to simulate, false positives are acceptable. That's basically unique.
Even then nothing has been deployed yet. We're half a decade away or so at best.
I never get tired of the fact that first jet airliners with fully automated landing systems had been developed and were going through certification for regular use by the time first microcontrollers popped up. Intel 4004 came in 1971, here's Hawker Siddeley Trident landing in a 1968 promo movie: https://www.youtube.com/watch?v=flVcxfOnWi0&t=9s
It might be obvious for most people here on HN that we are very far away from true artificial intelligence, but most normal people aren’t and the marketing bullshit around calling statistical models "artificial intelligence" paints the wrong picture. This article shows why.
- The task many people seem to be benchmarking against is not just a measure of general intelligence, but a measure of how well AI is able to emulate human intelligence. That's not wrong, but I do find it amusing. Emulating any system within another generally requires an order of magnitude higher performance.
- The degree to which human intelligence fails catastrophically in each of our lives, on a continuous basis, is way too quickly forgotten. We have a very selective memory indeed. We have absolutely terrible judgment, are super irrational, and pretty reliably make decisions that are against our own interests, whether it's with regard to tobacco use, avoidance of physical exercise, or refusal of life-saving medications or prophylactics. We avoid spending time learning maths and science because it's not cool, and we openly display pride in our anti-intellectual behaviours and attitudes. We're all incredibly stupid by default.
- AI researchers need to work more closely with neuroanatomists. The main thing preventing AI from behaving like a human is the different macro structure of human NNs vs artificial NNs. Our brains aren't random assortments of randomly connected neurons: there's structure in there that explains our patterns of behaviour, and that is lacking in even the most modern AI. We can't expect AI to be human if we don't give it human structures.
This is a really bad argument - human intelligence is not highly rational, but it is deeply nuanced, using social cues, emotions, instincts and a myriad of other things.
Computers can never be anti-knowledge because they lack the free will and social behavior of humans - they didn't chose to be pro knowledge either.
The human body also functions like a machine; there is no magic, just new stuff built upon very old stuff.
These things aren't magical properties of a "higher" intelligence; they're phenomena that emerge from structure. Give a robot a hindbrain and it will pick up on those kinds of things.
We totally do emulate organisms on that scale. The real challenge is simulating the sensory inputs and the feedback loop between the outputs, the environment as the body acts on it, and the new inputs.
Disembodied simulations of neural networks don't work. They are part of a body, an environment, and all the feedback loops that come with them.
It sounds like you really just want to see an ML algorithm have a body to learn in. Why we would ever expect AGI to happen without letting an ML algorithm learn by interacting with a "real" reality seems strange to me. By all means, keep making glorified optic nerves and expecting them to "wake up".
> We totally do emulate organisms on that scale.
There is no evidence those emulations actually emulate those organisms. They just built a neural net with the same structure and assumed the cells don't matter. But cells are really smart and can navigate environments on their own - they are intelligent beings in their own right - and building a flea out of a thousand of those is very plausible compared to doing it with a neural net of similar size.
And yes, in order to prove that we actually emulated those organisms, you need to show that the emulation does the same things in the same scenarios. You don't even need to do everything - just a simple thing like being able to move around, gather material and build a home in a physics engine would be huge.
While technically true, I actually think this is way more difficult than it sounds, bordering on practical impossibility.
I think the other commenter was making a really important point. The simulated environment would need to be incredibly rich, to a point as to almost defy imagination.
Consider what happens to a human mind when confined in a box (prison) with limited opportunities for stimulation. There’s a room, a gym, other people with which to socialize, food, walls, an outdoors enclosure... And yet someone who spends their entire life in this type of environment will certainly be facing serious neurodevelopmental issues.
For human/mammal order of AI, I would even argue that simulating adequate inputs might actually be a more difficult problem than building the AI that responds to them!
This article is the opposite. He's treating ML as basically a simple supervised architecture that doesn't allow any domain knowledge to be incorporated and simply dead-reckons, making unchecked inferences from what it learned in training. Under these constraints, everything he says is correct. But there is no reason ML has to be used this way, in fact it is extremely irresponsible to do so in many cases. ML as part of a system (whether directly part of the model architecture and learned or imposed by domain knowledge) is possible, and is generally the right way to build an "AI" system.
I think ML has its limitations and will be surprised to see current neural networks evolve into AGI. But I also don't think the engineers working in this space are as out to lunch as the author seems to imply, and would not write off the possibilities of what contemporary ML systems can accomplish based on the flaws pointed out in relation to a very narrow view of what ML is.
Are you at all close to this space? It sounds you may be underestimating corporate politics and the lack of rigour and ethical thought with which these systems are applied. The example Cory puts on policing -- and the many other examples you can find in Evgeny Morozov's book or "The End of Trust" -- are solid proof of this.
- The DeepMind-style teams, who actually know what they're doing and the boundaries of what they are working with
- The "AI-washing" startups and corporate groups, who know they are faking it and that what they're doing is extremely limited
- The corporate project-team types, who are just doing random tool play and honestly don't understand what they are doing - absolutely clueless, with no self-awareness at all
I’ve worked with all three and they really are just totally different things that are all being lumped together. They also are listed in terms of increasing proportion. For every self-aware AI-washer team I’ve seen 50 “we are doing AI” Corp team types spinning out one trivial demo after another to execs who know zero.
You’re observing that they aren’t doing a perfect job, which is true, but my grouping isn’t related to perfection of results.
You claim that they "know what they are doing and the boundaries of what they are working with" -- and yet they recklessly make public a racist vision product?
It is easy to have a theory of what is going on, to model the processes of how things are playing out inside the system, to make external predictions of the system, and to be utterly wrong.
Not because your model is wrong, but because either the boundary conditions were unexpected, or there was an anti-pattern in the data, or because the underlying assumptions of the model were violated by the data (in my case, this happened once when all the data was taken in the Southern Hemisphere...)
In all these cases, you can know what you're doing, you can know the boundaries of what you're working with, and you can get results that surprise you. It's called "research" for a reason.
The model can also be ridiculously complex. Some of the equations I was dealing with took several lines to write down, and then only because I was substituting in other, complicated expressions to reduce the apparent complexity. It's easy to make mistakes - and so you can know what you're doing, and the boundaries that you're working with, and still have a mistake in the model that leads to a mistake in the data ... garbage in, garbage out.
In short, this shit is hard, yo!
If instead it was a fuckup, well that seems adequately covered by “this shit is hard”.
If you are instead complaining about a lack of oversight, I don’t have a horse in that race. Ask someone else, I don’t care about the politics, I’m here for the technology.
And why would they have assumed in the first place that the model _would_ generalize across human races, or any other factor for that matter?
When this technology gets into their hands with a dev leash it will be recklessly implemented and people will die.
> The example Cory puts on policing
My most upvoted comment on this website was discussing this exact scenario. https://news.ycombinator.com/item?id=23655487
Could you perhaps clarify the generalization you're making about me and people like me so I can understand it?
Also, the lack of thought and accountability that I mention above I think is fairly general from my experience, even outside of policing. That is why I don't generally agree with the lunch statement. Guys are having a hell of a party as far as I can tell -- at the expense of horror stories suffered by the victims of these systems.
That is all part of engineering to me, so by definition, I think many in the field are in fact, out to lunch.
"Don't say that he's hypocritical
Say rather that he's apolitical
'Once the rockets are up, who cares where they come down?
That's not my department!' says Wernher von Braun
Some have harsh words for this man of renown
But some think our attitude
Should be one of gratitude
Like the widows and cripples in old London town
Who owe their large pensions to Wernher von Braun"
Many people are very resistant to the idea that their particular work can have a negative impact or that they should take responsibility for it. See Yann LeCun quitting Twitter (https://syncedreview.com/2020/06/30/yann-lecun-quits-twitter...)
Other people are very aware of the dangers of their work. But, when the money gets big enough, they take their concerns to the bank and their therapist. See Sam Altman's concerns about the dangers of machine intelligence before he invested in OpenAI (https://blog.samaltman.com/machine-intelligence-part-1) Contrast that with his decision to become the CEO, take the company private and license GPT-3 exclusively to Microsoft. (https://www.technologyreview.com/2020/02/17/844721/ai-openai...) He had reasons. He posts here. He might defend himself. But to me it seems like the kind of moral drift I've seen happen when people in silicon valley have to make hard choices about money and power.
There are also applications of ML that are generally safe and can be of benefit to society. See the many medical uses including cancer detection.(https://www.nature.com/articles/d41586-020-00847-2) Most of the work being done to expose the risks and biases of ML is being done by researchers who are at least somewhat within the field. In my math and computer science program, two and a half of the 25 students are doing their thesis in safe ML. (I'm giving myself a half because I'm working on logic based ML.) I don't think it's fair to believe that every person working in ML is participating in something negative for society.
Ultimately, I think we need some reasonable regulation and a lot more funding for research into safe ML. Corporations and governments want ML for purposes that can be unethical. Unfortunately they also control a lot of the research grants. So they have a disincentive to fund AI ethics or safe ML over pushing the boundaries of what ML can accomplish.
Finally, I think many engineers would like their work to be positive for society. Unfortunately, with what we know now, a lot of the edge cases we run into are unfixable. When Google Photos started classifying black people as gorillas, Google just removed primates from the search terms. Years later, they hadn't fixed it. (https://www.wired.com/story/when-it-comes-to-gorillas-google...) I'm sure most engineers on the project knew that was a hack. When faced with an unfixable issue like that, the engineer either tries to get the company to stop using ML for that problem, compartmentalizes and ignores the issue, or they quit. Where do you draw the ethical line? It's good to hold people accountable but it's unrealistic to expect that to solve the problem.
Your third and fourth points I think are linked. I am not exactly sure where or how you would draw the line, but I kind of think of these ML/AI applications as something that could be export-controlled or regulated along those lines, just like certain pieces of hardware are export-controlled on the grounds that they could be used for harm, and weapons, of course (and I mean, add some salt here because governments will cause harm regardless, but hopefully the point comes across). Once the regulations are in place, and corporations take _substantial_ economic hits for their errors (unlike, say, GDPR violations, which Google just factors into their OPEX), those corporations will rapidly start effecting real change. Corporations understand the language of (economic) violence surprisingly well; it's an effective tool for change. But like you said, it is precisely the same governments and corporations driving the research and exercising economic and political power, so I am not entirely sure how that would start shaping into place. Like almost everything else in life, the first step will probably be to keep raising social awareness; change will emanate from us at the bottom -- if we can direct our anger correctly and if the climate catastrophe that is upon us does not wipe us all out first.
I don't think this is an example of a straw man, given that his audience is readers of Locus, a science fiction magazine. While researchers and practitioners in ML understandably hold a more nuanced, informed view, the position he's arguing against is pretty common among the general public, and certainly common in science fiction.
What deep learning seems to step into more and more is time-based statistical inference.
AGI is not:
- seeing that a girl has a frown on her face
- seeing that a girl has a frown because someone said "you look fat"
- seeing that a girl has a frown because her boyfriend said "you look fat"
- seeing that Maya has generally been upset with her boyfriend, who also most recently told her she is fat
But keep going and going and going and we might get somewhere. Do we have the computer power to keep going? I don't know.
That executive and arranging function is the unknown. From whence cometh that characteristic of Dasein? That preponderance of concern with the act of being as Being?
It's a tough nut to crack, even in philosophical circles. To think that we're going to artificially create it by any means other than accident or luck is hubris of the highest order.
This meta-comment of restatement in various contexts, with various amounts of story-telling and technical detail, brings up the educational burdens of communication - to be effective you have to reach readers where they are today, in terms of assumptions, technical learning, and focus of topic. Since this is such a fast-moving and wide subject area, it's super easy to miss the distinction between "low value, high volume audio clip recognition" and "life and death medical diagnosis for fewer than 100 patients". Hint: that matters a lot in the tech chain AND the legal structure, and therefore, combined, in the "do-ability".
An ML model, and most of what we currently call AI, is a statistical model. It predicts things, and in the process of "training" this model we can also learn about the world it interacts with.
Anyone who has ever spent any amount of time on the issue knows that the idea of "generalizing" or "adapting" in ML corresponds directly to causal inference. This is not some secret to be uncovered by every other pundit; it's a direct consequence of ML models being... statistical models. If our AI can understand when statistical relationships hold and when they do not, then it is inferring causality from data. Currently, we are dealing with ML algorithms adapting or generalizing: learning which regularities to "trust" and which to "discard" when the DGP (data-generating process) changes. This sounds nebulous and difficult, but that's only the case because the models are complex.
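A minimal sketch of that "which regularities to trust" problem (all numbers and variable names are invented, nothing from the article): an ordinary least-squares model trained where a spurious feature happens to track the target keeps leaning on it after the data-generating process changes.

    import numpy as np

    rng = np.random.default_rng(0)

    def make_data(n, spurious_corr):
        """y depends causally on x1; x2 tracks y only under one DGP."""
        x1 = rng.normal(size=n)
        y = 2.0 * x1 + rng.normal(scale=0.5, size=n)
        # Under the training DGP, x2 correlates with y; under the shifted DGP it's pure noise.
        x2 = spurious_corr * y + (1 - abs(spurious_corr)) * rng.normal(size=n)
        return np.column_stack([x1, x2]), y

    # Train where the spurious regularity holds, test where it doesn't.
    X_tr, y_tr = make_data(5000, spurious_corr=0.9)
    X_te, y_te = make_data(5000, spurious_corr=0.0)

    # Ordinary least squares puts most of its weight on the spurious feature.
    w, *_ = np.linalg.lstsq(np.c_[X_tr, np.ones(len(X_tr))], y_tr, rcond=None)
    pred = np.c_[X_te, np.ones(len(X_te))] @ w

    print("learned weights (x1, x2, bias):", np.round(w, 2))
    print("test MSE under shifted DGP:", round(float(np.mean((pred - y_te) ** 2)), 2))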
Nevertheless, the underlying statistical problem is old. Ancient. Talked to death in every scientific field that tries to run an experiment, to infer some causal parameter from observational data, or to construct counterfactuals to guide policy. Similarly, the divide between quantitative and qualitative approaches is decades (if not centuries) old.
These problems are so well understood that we can state very precisely what needs to happen to make our inference causal. The catch is that, first, it depends on the DGP and our model thereof, and further, that none of these things can be proven to be true within the same framework. Whether the DGP is as we need it to be, or whether the model identifies what we want, is something we can be confident about given a set of assumptions, but it is not something we can know to be true. And guess what: if the model is complex, then so is reasoning about its inferential capabilities.
The discussion seems tired, because it is. It's not a deep philosophical issue. It's a practical one. That doesn't imply there is a good solution to it either, but the basic issue hasn't changed for a long time. Maybe we are not trying to predict a treatment's effect on some randomly selected population, maybe we are instead trying to land a plane under conditions we can barely foresee. But what needs to happen for these two predictions to be unbiased, consistent, low variance, whatever... has not changed.
Researchers are frustrated with these articles, because pundit after pundit claims to have uncovered some general problem with AI or ML when it collides with reality. But it is not about AI. It's about statistics, and we all want to say: "Yes, we know. Now what?"
Clearly, no biologist claims that humans evolved from modern primates, just like no modern AI researcher seriously thinks that current machine learning methods will lead to "True AI".
The author seems to have missed or excluded reinforcement learning and planning algorithms in this definition.
My criticism of AI criticism in general is that no one admits that, at the root of it, we do not understand thinking (or "consciousness"). We are merely the "recipient" or enjoyer of the process, which is opaque. Just as AlphaGo, even if it is just a facsimile of a Go player, could beat a human at Go, it is probable that an AI could at some point produce a passable facsimile of thinking. Its mechanisms would be as opaque as human thinking (even to itself), but the results would be undeniable. AGI is a possibility.
IMO the main difficulty is that humans have terrible self-awareness or self-insight. We want to believe we're special, we want to believe we're intelligent, we want to believe we're different than machines. We're in denial about that.
Our brains aren't any more special than computers, other than that it's really quite formidable that we evolved them by chance in this universe of chemical soup we find ourselves in. At the end of the day, however, a computer is a computer, and "thinking" and "consciousness" simply do emerge from low-level computations given some special structures.
My comment is loosely based on a general appreciation of textbook-level neuroanatomy and recent advances in AI/comp sci.
> But the idea that if we just get better at statistical inference, consciousness will fall out of it is wishful thinking.
I'm a mostly disinterested spectator in current AI research, and even I know that it's not all about that. Just google "AI alignment" for an example, and god only knows what's going on in private research.
(On topic: blacks could commit more crime than average. They could do it because of systemic racism, but in this case the _arrests_ are not racist.)
> Okay, you’ve all told us that progress won’t be all that fast. But let’s be more concrete and specific. I’d like to know what’s the least impressive accomplishment that you are very confident cannot be done in the next two years.
Just today I was rather astonished by https://moultano.wordpress.com/2021/07/20/tour-of-the-sacred... -- try digging up something comparable from mid-2019.
That's what makes me excited about our recent advances in ML. Finally, we are getting around to modeling the lower levels of our cognitive system, the fuzzy pattern recognition part that supplies our consciousness with something recognizable to reason about, and gives us learned skills to perform in the world.
We still don't know how to wire all that up. Maybe a single ML model can achieve AGI if it is adaptable enough in its architecture. Maybe a group of specialized ML models need to make up subsystems for a centralized AGI ML-model (like a human's visual and language centers). Maybe we need several middle layers to aggregate and coordinate the submodules before they hook into the central unit. Maybe we can even use the logic, planning or expert system approach from before the AI winter for the central "consciousness" unit. Who knows?
But to me it feels like we've finally got one of the most important building blocks to work with in modern ML. Maybe it's the only one we'll need, maybe it's only a step of the way. But the fact that we have not, in a handful of years, managed to go from "model a corner of a reptile brain" to "model a full human brain" is no reason to call this a failure or predict another winter just yet. We've got a great new building block, and all we've really done with it so far is basically to prod it with a stick, to see what it can do on its own. Maybe figuring out the next steps toward AGI will take another winter. But the advances we've made with ML have convinced me that we'll get there eventually, and that when we do, ML will be part of it to some extent. Frankly, I'm super excited just to see people try.
Nearly everyone here knows, when they see or hear "AI", that it's mostly a crock. Nearly everyone here knows ML is applied statistics done with a computer, but this is not common knowledge and it really should be.
There's been plenty of progress in the last 15 years re-interpreting many ML methods as regression (any optimization is a regression if you set up the right likelihood function).
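For readers who want the canonical example of that re-interpretation (a textbook identity, not anything specific to the work mentioned above): minimizing squared error is exactly maximum-likelihood estimation under an assumed Gaussian noise model.

    y_i = f_\theta(x_i) + \varepsilon_i, \qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2)

    \hat\theta_{\mathrm{MLE}}
      = \arg\max_\theta \sum_i \log \mathcal{N}\!\left(y_i \mid f_\theta(x_i), \sigma^2\right)
      = \arg\min_\theta \sum_i \left(y_i - f_\theta(x_i)\right)^2

Swap the Gaussian for another likelihood and you recover other familiar losses, which is the sense in which "any optimization is a regression if you set up the right likelihood function."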
But many important results and techniques -- including today's ubiquitous deep nets -- originated and had successful applications way before they had statistical interpretations. They came from fields like compression theory, database design, or even biological interpretations.
The term Machine Learning was introduced to re-focus the field on a measurable objective: algorithms that improve with more data. The "Learning" part was not an abstract term to tug on your imagination, but included formal definitions of how algorithms improve that involved slightly fewer assumptions than statistical learning (which is a subfield).
This lineage isn't that important today, but that focus on how learning is measured is still the most important guidepost both for ML research and for sorting out marketing BS from realistic claims. Certainly, state of the art work using deep nets for tasks like NLP, image and video recognition aren't designed by reasoning about the statistical interpretation, or tested by applying typical statistical tests. Popularizing this work as Statistical Inference or Regression wouldn't give any added intuition and wouldn't really describe the way ML research proceeds, or how ML systems succeed or fail.
Putting stats right there in the name is vastly, vastly more informative than "Learning", which for 99% of people carries the connotation of something requiring intelligence, and is therefore misleading. Hence the AI cons all pop up as soon as there are some public ML wins called "Learning."
Generalizing from data is actually what statistics does. It's what ML is. People like Hinton, Wasserman, Tibshirani et al. seem to agree that ML is statistics, but even that isn't what I'm talking about here.
The term "machine learning" fits the field, but the venn diagram of "what those two words could mean in english" versus "what the term means in the field" is a huge circle enclosing a tiny subset.
It's way too broad, and a term that naturally lent itself to a far more narrow interpretation by people first finding it wouldn't have this problem.
It's fascinating to me, as someone that works with (rudimentary, non-ML) game AI, that until recently nobody really even tried building game AIs that "trained their heuristics". Like, I get how AIs couldn't form a general plan or any of that, but I was shocked, as an adult, to learn that e.g. FPS AIs were too dumb even to take "guesstimate" values like how much they needed to lead a shot (i.e. honing ballistics calculations) and train that aiming value based on inputs and a success/failure criterion. As a kid, the obviousness of the idea, and the triviality of how much effort it ought to take (surely a couple of hours, tops?), had me convinced that of course everybody was doing that.
Once I became an adult, I learned the bitter truth that even banally simple ideas are shockingly difficult to put into practice. The devil's in the details.
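For what it's worth, the kid version of the idea really is only a few lines; the devil is in everything around it (game-state plumbing, moving targets, not making the bot unbeatable). A toy sketch of the "train the lead from hit/miss feedback" loop, with every number made up:

    import random

    # Hypothetical setup: the "true" lead the bot needs is unknown to it;
    # after every shot it nudges its lead coefficient based on how far it missed.
    TRUE_LEAD = 0.8          # what perfect leading would require (unknown to the bot)
    lead = 0.0               # the bot's current guess
    learning_rate = 0.2

    for shot in range(50):
        noise = random.gauss(0, 0.05)        # per-shot aim jitter
        miss = (TRUE_LEAD + noise) - lead    # signed miss distance along the target's path
        lead += learning_rate * miss         # nudge the guess toward the observed error
        if shot % 10 == 0:
            print(f"shot {shot:2d}: lead={lead:.2f}, miss={miss:+.2f}")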
People have an idea of what a statistical analysis is and what basing decisions on it looks like -- e.g. gambling. That is what ML /is/. It's not some incredible computer-brain thinking-learning magic pixie dust. You know that. I know that. Everybody who knows what ML is knows that. But that's a minute proportion of the world. This is the data we need to learn from.
See ML as distinct from stats all you like, go nuts. Take it up with Hinton, Wasserman, Murphy, Tibshirani & Hastie and so on. Your understanding is different from theirs, which could well make your textbook a groundbreaking best seller.
Within a single problem space (or sub-space) past performance can generalise quite well.
There's a problem with scaling solutions and expecting performance to continue to increase exponentially: growth that we perceive as exponential is often just the early stretch of a long-lived S-curve.
We've seen this in silicon, where what appears to the layman to have been exponential growth has in fact been a sequence of more limited growth spurts bound by the physical limits of scaling within whatever model of design was active at the time.
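A toy illustration of why this is hard to spot from the inside (the growth rate and ceiling are made up): the early stretch of a logistic curve tracks an exponential almost perfectly, and only diverges as the ceiling approaches.

    import numpy as np

    # Early on, an S-curve is nearly indistinguishable from exponential growth.
    t = np.arange(0, 20)
    exponential = np.exp(0.5 * t)
    ceiling = 200.0
    logistic = ceiling / (1 + (ceiling - 1) * np.exp(-0.5 * t))

    for step in (2, 5, 8, 15):
        print(f"t={step:2d}  exp={exponential[step]:8.1f}  s-curve={logistic[step]:8.1f}")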
The question of where the bounds to the problem domains are, and when new ideas or paradigms are required is much more difficult in AI than it has been in microprocessors.
It's easy enough to formulate the question "how small can this be before the changes in physical characteristics at scale prevent it from working?", if rather more difficult to answer.
AI is so damned steeped in the vagaries of the unknown that I can't even think of the question.
The current approach to machine learning is not going to go towards general-purpose AI with steady steps and gradual innovations. Things like GPT-3 seem amazingly general at first. But even it will quickly plateau towards the point where you need a bigger and bigger model, more and more data, and training for smaller and smaller gain.
There need to be several breakthroughs, similar to the original deep learning breakthrough, that move away from statistical learning. I would say it's 4-7 Turing awards away at a minimum. Some expect less, some more.
Being able to calculate that inputs a, b, c… z add up to outcome X with a probability of 75% still won’t tell you whether arrest data is racist, whether students will get drunk and breathe on each other, or whether a wink is flirtation or grit in someone’s eye.
Except if information about what we consider racist etc. also passes through the same inference engine (feeding it with information on arbitrary additional meta levels).
So, sure, an AI which is just fed crime stats to make inferences can never understand beyond that level.
But an AI which is fed crime stats plus cultural understanding about such data (e.g. one which is fed language, like a baby is, and which is then fed cultural values through osmosis - e.g. news stories, recorded discussions with people, etc.) could go further.
In the end, it could also happen through actual socialization: you put the AI into a portable human-like body (the classic sci-fi robot) and have it feed its learning NN by being around people, same as any other person.
Yes, an ML model that infers B from A might not "understand" what A or B are....yet. But what is it to "understand" anyway? Just a more complex process in a different part of the machine.
If the human brain is just a REALLY large, trained, NN, there's no reason that we won't be able to replicate it given enough computing power.
I think one clear sign that the human mind is more than just a big NN is how large neural networks are already.
Take GPT-3, which was trained on 45 terabytes of text and has 175 billion parameters. Contrast that with the human brain, which has around 86 billion neurons and is able to do much of what GPT-3 can do with only a tiny fraction of the training data. And it has to be said that while GPT-3 has more competency than an average human at some text-generation tasks, the average human brain is vastly more capable than GPT-3 at any non-text-related task.
So for neural networks to approach human level capability we would need a whole stack of GPT3-ish size networks for all the other non-text related things the human brain can do: speech, vision, motor control, social interactions, and so on. By that point the amount of training data and parameters is so astronomical, there can be no question that the functioning of human brains must be significantly different than that of contemporary computer neural networks.
To be clear, I am also a materialist and subscribe to the computational theory of mind, but just based on the size of training data alone, it seems obvious that human brains work differently than neural networks.
If I read all day everyday from the moment I was born until now, I couldn't have read 45 terabytes of text.
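A rough back-of-envelope supports that, assuming generous made-up numbers for reading speed and waking hours:

    # Order-of-magnitude only; every figure here is an assumption.
    words_per_minute = 300
    hours_per_day = 16
    years = 30
    bytes_per_word = 6          # average English word plus a space

    words = words_per_minute * 60 * hours_per_day * 365 * years
    terabytes = words * bytes_per_word / 1e12

    print(f"~{words / 1e9:.1f} billion words, ~{terabytes * 1000:.0f} GB")
    print(f"GPT-3's corpus is roughly {45 / terabytes:.0f}x larger")

That works out to roughly 19 GB of text in 30 years of nonstop reading, versus 45 TB for GPT-3 -- a gap of three orders of magnitude.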
I guess if we want to reduce the comparison to something vague like "the human brain and neural networks both developed over many iterations," I could agree with that, but that doesn't seem very interesting.
If the two were actually comparable, it would look more like:
1. Use NN (or GP) to develop set of hyper parameters
2. Give said HP to neural nets that, thanks to the HP, can be trained on a very limited set of data (a toy sketch of this two-stage shape follows below).
I am not aware of any successes with methods like these.
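To make the two-stage shape concrete, here is a toy sketch -- not a claim that it works at scale, and not the method described above: ridge regression stands in for the "neural nets", a grid search over synthetic tasks stands in for the NN/GP outer loop, and every number is invented.

    import numpy as np

    rng = np.random.default_rng(1)

    def sample_task():
        """A family of related tasks: noisy linear functions with small random weights."""
        w_true = rng.normal(scale=0.5, size=5)
        def make(n):
            X = rng.normal(size=(n, 5))
            return X, X @ w_true + rng.normal(scale=0.3, size=n)
        return make

    def ridge_mse(make, lam, n_train=8, n_test=100):
        X_tr, y_tr = make(n_train)                 # "very limited" training data
        X_te, y_te = make(n_test)
        w = np.linalg.solve(X_tr.T @ X_tr + lam * np.eye(5), X_tr.T @ y_tr)
        return float(np.mean((X_te @ w - y_te) ** 2))

    # Stage 1 ("evolution"): search for a prior -- here just a ridge penalty --
    # that works well across many tasks when only 8 training points are available.
    candidates = [0.01, 0.1, 1.0, 10.0]
    scores = [np.mean([ridge_mse(sample_task(), lam) for _ in range(200)]) for lam in candidates]
    best_lam = candidates[int(np.argmin(scores))]

    # Stage 2 ("a lifetime"): a brand-new task, tiny data, but the inherited prior helps.
    print("inherited penalty:", best_lam)
    print("test MSE on a new task:", round(ridge_mse(sample_task(), best_lam), 3))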
While I also don't expect that AGI will emerge solely through optimizing statistical inference models, I also don't think "improvements to the machine learning field" consist only of such optimizations. Surely further insights, paradigm shifts, etc., will continue to play a role in advancing AI.
Perhaps it's more a matter of semantics and a bad analogy; "machine learning" seems far more broad a field than "horse-breeding." Horse-breeding is necessarily limited to horses. Machine learning is not limited to a specific algorithm or data model.
Even calling it a "statistical inference tool", while not wrong, is deceptive. What exactly does he or anyone expect or want an AGI to do that can't be understood at some level as "statistical inference"? One might say: "Well, I want it to actually understand or actually be conscious." Why? How would you ever know anyway?
The moment you have a paradigm shift, sure, it can be considered "learning done by machines", but it's not "Machine Learning™" anymore.
This is why the author put it in quotes: since it's a term comprehensible to anybody, it has the unfortunate side effect that people outside the field take the "plain English" meaning of it rather than realizing it's loaded with extra, specific meaning for practitioners in the field.
In my experience, "machine learning" is more broad than "convolutional neural networks", aligning with Wikipedia's definition: "the study of computer algorithms that improve automatically through experience and by the use of data." https://en.wikipedia.org/wiki/Machine_learning
The Antikythera mechanism was built 1800 years before the first metal lathe. It is a fantastically sophisticated clockwork with dozens of gears, concentric shafts, and brilliant, practiced fabrication. It is not a unique device. It was built by someone who knew what they were doing and had made this thing many times. It is obvious in the same way that you can tell when code was written from the start knowing how the finished product would look.
The device displays the relative positions of stars and planets from their underlying orbits, and was built with bronze hammers and some small fragments of steel. All that to say, you can do incredible things with practice, care, and tools that are thousands of years too primitive.
I think part of the problem is the belief that human or animal intelligence is somehow more mystical.
People who think like this will see an ML implementation solve a problem better and/or faster than a human and counter "well, it's just using statistical inference or pattern recognition" and my response is "so?" Humans use the same processes and parlor tricks to understand and replay things.
Where humans excel is in generalizing knowledge. We can apply bits and pieces of our previous parlor tricks to speed up comprehension in other problem spaces.
But none of it is magic. We're all simple machines.
Ooof. Premed dropout here, so admittedly not an expert in human biology but this is a wild statement. A neuron is simple in the same way a transistor is simply a silicon sandwich doped with metals.
A parlor trick is something that once you understand, is straightforward to implement on your own. Are you arguing that anyone now or in the foreseeable future could simply recreate the abilities of a human? If so, what evidence could you show me to support that?
There's a bias toward the marvel of human intelligence that causes some people to dismiss ML for the same underlying reasons we don't try to put a square peg in a round hole after infancy.
Side note: disagree all you like but starting a rebuttal with "oof" is the kind of dismissive language that lets people know you'll be taking a very reductionist approach in your reply.
Until ML/AI can perform a single one of those parlor tricks without the constant direction of human intelligence, there’s no reason to stop marveling.
Great. Prove it. Build the simple machine that acts as a human does. Should be simple, right?
Personally, I don't think there's any magic. But it's not "simple" either.
I’d also point out that not all models are “theory-free”, as he describes it. I specifically do work in areas where we combine “theory” and machine learning, and it works very well.
And finally, his point about comprehension does not really fly for me. There is no magical comprehension circuit in our brain. It’s all done via biological processes we can study and emulate. Will that end up being a scaled up version of current neural nets? Will it need to arise from embodied cognition in robots? Will it be something else? I don’t know, but it’s certainly not magic, and we’ll get there eventually. Whether that’s 10 years or 1000, who knows.
Are current paradigms going to lead to AGI? Frankly, I’d just be guessing if I even tried to answer that. My gut instinct is no, but again, that’s just a guess. Can current methods evolve into better constrained systems with more generalizable results and measurable fairness? Absolutely.
I'll just note that while you start off saying Doctorow has no idea what he's talking about, you finish by pretty much fully agreeing with the essay.
My read of the discussion here is that there is lots of idle speculation by people who don't have any real experience with ML research / engineering, that overwhelms a minority who actually know what they are talking about and are calling CD out on this, or at least challenging aspects of his arguments.
There are a lot of systems out there where people's lives are changed forever "because the machine said so".
I'd argue that since machine learning learns only from its data (produced by its human creators), it becomes a great tool for baking unconscious biases into a completely opaque system and amplifying those biases.
Much easier to tell if a human seems to be biased than if the dataset fed to an AI algorithm is biased.
If someone says a person is biased they can just say "No I'm not" and then how do we tell? We can't really make a person do a million judgments and find statistical evidence. We can't dive into how the possibly biased person came to have or not have biases.
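That cuts both ways, though: unlike a person, we actually can make a model do a million judgments and look for statistical evidence, even if its internals stay opaque. A minimal sketch with an entirely hypothetical black-box model and made-up numbers:

    import numpy as np

    rng = np.random.default_rng(42)

    def opaque_model(income, group):
        """Stand-in for a black-box scoring model we want to audit."""
        # Hypothetical: the model leaks a group effect learned from biased data.
        score = 0.02 * income - 0.5 * (group == "B")
        return score > 1.0

    # Run the model on a million synthetic cases and compare decision rates.
    n = 1_000_000
    income = rng.normal(60, 15, size=n)
    group = rng.choice(["A", "B"], size=n)
    approved = opaque_model(income, group)

    for g in ("A", "B"):
        print(f"group {g}: approval rate {approved[group == g].mean():.3f}")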
1) Regarding the example of qualitative data via drunk students attending eye-licking parties. The author doesn't explain how this is qualitative. To my mind it's a gap in the model. The modelers could have included parameters to account for students behaving impulsively or irrationally, but they didn't.
2) Considering the nebulous nature of terms like consciousness and comprehension, and the ensuing challenges of measurement: can it be proven that structures which could underpin behavior that would generally be recognized as conscious or comprehending do not already exist as an emergent but undetected property of the Internet? And is it reasonable to suppose that if such a structure existed, and if it possessed or embodied consciousness or comprehension, it might work toward remaining unknown?
The term “AGI” is also good: “artificial general intelligence” describes the long-term goals.
My impression of Silicon Valley types like Ray Kurzweil in "The Age of Spiritual Machines" is that if we wire up enough transistors, consciousness will somehow arise out of the material world. The somehow is not explained. Materialism is a dead end in my opinion. I am more interested in theories about consciousness as a field and our brains as receivers.
You also seem to have just kicked the can down the road. "Consciousness arises from a field somehow, and the brain acts as a receiver somehow. The somehow is not explained."
A self aware intelligent organism or machine needs three key components: a material foundation that's sufficiently organized (a large net of neurons, a silicon crystal, etc.), a material fluid-like carrier to control the foundation (that's always electricity and magnetism) and the immutable immaterial principle to constrain the carrier (math rules, physical laws, software algorithms). That's the core idea of occultism rephrased in today's terminology.
The "conscious field" would be identical with the magnetic field here and neurons don't need any magical properties to receive this field: they just need to be conductive, like transistors. I think the reason the AI progress has stalled is because 0-1 transistors are too primitive and too rigid for the task. I guess that superintelligence is only different in the performance and connectivity degree of the material foundation: instead of slow neurons with 10k of connections it would be fast quasi crystal like structure with billions of connections that needs to move very little matter around (but it has to be material and consist of atoms of some sort).
Do you think that animals receive a conscious field? Could we create an accurate representation of a mouse's brain just from modelling its neurons? If a mouse brain can't receive a conscious field, but a human brain can, then what relevant physiological differences are there between the two, other than size?
I'm not sure what level of modelling you'd accept, but we appear to be close:
> Mouse brains aren't even on the horizon of what we can do.
I would say that they are "on the horizon", given that the mouse brain connectome has already been published:
I have no doubt that Kurzweil's timelines and outcomes are wrong, as have the predictions of just about every prior futurist. I don't see what that has to do with materialism being a dead end.
Its key contributions are about the mainstream domination of quantitative vs. qualitative methods, especially in this paragraph:
> Quantitative disciplines are notorious for incinerating the qualitative elements on the basis that they can’t be subjected to mathematical analysis. What’s left behind is a quantitative residue of dubious value… but at least you can do math with it. It’s the statistical equivalent to looking for your keys under a streetlight because it’s too dark where you dropped them.
and also of note is the "veneer of empirical facewash that provides plausible deniability", for discrimination, and for doing a poor job but continuing to be rewarded for it.
If I had to summarize it would be:
- The ML/AI community, which includes the researchers, practitioners, and the evangelists, are broadly utopian in what they think they can achieve. They are overconfident even in the domain of detecting the face of potential burglars in a home security camera, never mind in terms of creating new life with AGI. I think Doctorow's critique equally applies to "algorithms" even only as complex as a fancy Excel sheet, but he focuses on ML/AI as the most common source of this excess of optimism, that recording data and running it through a model is almost certainly the _most sensible thing to do_ for any given problem.
- If there is a manufactured consensus that the almost purely quantitative approach is the _most sensible thing to do_, then any failures or short-comings can be hand-waved away. Say sorry, "the model/algorithm did it", and just ignore the issue or apply a minor manual fix. This is a huge benefit for decision-makers wishing to maintain their status/livelihoods in both the public and private sector. Crucially, this excuse works if you're just ineffective, or if you're a bad actor.
Note that this is a critique of CEOs and government officials, more than of engineers -- we would only be complicit by association. If there is a critique for engineers, it's that we provide fodder for the excess of optimism in summary point 1 because we love playing with our tools, and that we allow ourselves to be the scapegoat for summary point 2.
This would suggest that "general" AI is impossible.
ON THE OTHER HAND
There is a variety of general AI, called an "optimizer". It starts with something better than a void. Maybe that's the path we should be looking at.
To talk about the models some more...
There's this big mass of models. And it's got all kinds of sections. Special sections that we learn about in school. Special sections called "science". Sections that we invent ourselves. Sections that we inherit from our parents, religion, etc. It's partially biological. Partially cultural. A massive library of models, mostly inherited.
You move in relationship with the mass in different ways.
You can create new models. That's what basic science is. Extending the edge of the mass. Naming the nameless.
You can operate freely from the mass. Creating your own models or maybe operating model-less. Artists, mystics, weirdos.
You can operate completely within the mass. Never really contending with unmodelled reality. The map and territory become one. Like in a videogame. I think that's the most popular way.
Via aesthetics etc.
Or, in the case of the optimizer, I think the human equivalent would be desire.
I view A.I. as dual to "neoliberal M.B.A. culture". Just as the business schools taught that managers should be generalists without craft knowledge, applying coarse microeconomics, the A.I. we have created is the ultimate pliant worker that also knows nothing deep and works from statistics. In a business ecosystem where analytics and presentations are more important than doing things, they are a perfect match. Of course, a bunch of statistician-firms chasing each other in circles is going to exhibit the folly, not wisdom, of crowds.
I think the solution is to face the reality that more people need to learn programming, and more domain knowledge needs to be codified old-school. I thus think https://www.ma.imperial.ac.uk/~buzzard/xena/ is perhaps the best application of computing, ever.
Training A.I. to be a theorem-prover tactic is a great way to make it better: if we can't do theory and empiricism at the same time, we can at least do meta-empiricism on theory building!
I think once we've codified all the domains like that and been running A.I. on the theories, we'll be better positioned to go back to the general A.I. problem, but we might also decide that a "manually programmed fully automated society" is easier to understand and steer, and thus less alienating, and we won't even want general A.I.
Surely semantic meaning is qualitative, but look at word replacement in Google search. That’s entirely based on statistics, thesaurus graphs, and other ultimately quantitative data.
The neat thing about neural nets is that they are ultimately making a very, very complicated stepwise function. Brains are not neural nets, but are they doing anything other than create a very complex, entirely numerical, time and state dependent function? No matter which way you try to understand something, ultimately you are relying entirely on statistical inference.
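"Stepwise" is roughly right; for ReLU networks the more precise statement is that the computed function is piecewise linear, with each pattern of active units defining one linear piece. A tiny illustration with random, made-up weights:

    import numpy as np

    rng = np.random.default_rng(3)

    # A tiny, randomly initialized one-hidden-layer ReLU network. Whatever the
    # weights are, the function it computes is piecewise linear in its input.
    W1, b1 = rng.normal(size=(8, 1)), rng.normal(size=8)
    W2, b2 = rng.normal(size=8), rng.normal()

    def net(x):
        hidden = np.maximum(0.0, W1 @ np.atleast_1d(x) + b1)   # ReLU kinks create the "pieces"
        pattern = "".join("1" if h > 0 else "0" for h in hidden)
        return float(W2 @ hidden + b2), pattern

    for x in np.linspace(-3, 3, 13):
        value, pattern = net(x)
        print(f"x={x:+.1f}  f(x)={value:+.3f}  active ReLUs={pattern}")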
I think the real difference is that in qualitative data the numerical representation does not mean anything. Sure, the names of the archangels can be represented digitally (quantitative) but that is just a change of representation - the bit strings' numerical value carries no theological meaning.
Changing the representation of the names doesn’t matter, but attempting to understand the meaning behind the names is ultimately quantitative. The numbers are run in the giant black box that is your brain and then your consciousness receives other qualitative answers.
Asking for an AI without statistical inference or quantitative data is asking for consciousness without a brain.
To respond to your edit: That is not brown... there is a whole science of color perception, have a look.
You see the color "brown" as a reception of a photon of a certain wavelength onto your retina, which is sent to your brain.
Visual perception can be equated to camera perception, but the data isn't represented the same way. Humans are just much better at "analog input" than computers are.
When you see things around you, it's natural to try to classify it at multiple levels. In our early formative years, we learn shapes, lines, colours, etc.
This said, "Brown" is just a label we put on a the perception of a photo's wavelength, which is quantitative on the light spectrum. All raw data is quantitative. Qualitative data are just the labels we put on it, but if we're going to general intelligence, we can't shortcut learning this way.
So how would you quantify "good" or "bad"? You can't unless you also answer what "good" + "bad" should be. In psychology they just assume that mapping those onto 1 and 5 makes sense, so "good" + "bad" = 5 + 1 = 6, but that doesn't make sense since it would imply that "good" is the same as "bad" + "bad" + "bad" + "bad" + "bad". You get similar but different issues if you start including negative numbers, or if you just use relative measures and don't have a proper zero, no matter what you do numbers doesn't properly represent feelings as we know them.
Consider that "autopilot" was invented in 1914, long before digital computers. From this perspective, Artificial Intelligence might even be seen as an ancient human practice— present whenever humans have used artifacts to govern complex systems.
A very simple example. If we ask our classical computer this question "are people currently supportive of COVID-19 vaccines?", then it would probably give us a straight answer of either a "yes" or "no" based on statistical inference of the percentage of total people who have received vaccinations at this point.
At its most fundamental level, classical computers just cannot comprehend a reality that could resolve that answer to both "Yes" and "No" in a single statement, which btw is possible in a quantum computing environment under its superposition state.
In our reality, some people may not be fully supportive of the vaccines, but under special circumstances they may be forced to receive them because of workplace requirements, pressure from their loved ones, etc...
Train it with as many images as you want and as long as a good enough face shows up, the model is going to have a positive match. The entire problem is it’s missing that upper level of intelligence that evaluates “that looks like a face, could it actually be a human?”
Is there? Humans used to think the gods were literally watching them from the sky and the constellations were actual creatures sent into the night. So this seems learned behavior from data rather than some inherent part of human thinking.
>Train it with as many images as you want and as long as a good enough face shows up, the model is going to have a positive match.
So will a human if something is close enough to a face. A shadow at night for example might look just like a human face. Children will often think there's a monster in the room or under their bed.
Humans do not need to be trained on billions of images from around the globe to semantically understand where human faces are not expected to appear. Modern AI can certainly recognize faces really well with that level of training now, but it still doesn’t even understand what a face is (i.e. no model of reality to verify its identification against).
(I was thinking of this when I was driving in a new place. Suddenly it looked like the road ended abruptly and I got ready to act, but of course it didn't end and I realized that just a split second later.)
This really is a god-of-the-gaps answer to the concerns being raised.
For consideration, our brains start with architecture and connections that have evolved over a billion years (give or take) of training. Then we are exposed to a lifetime of embodied experience coming in through 5 (give or take) senses.
ML is picking out different things, but it's not obvious to me that models are actually getting more data than we have been trained on. Certainly GPT has seen more text, but I don't think that comparing that to a person's training is any more meaningful than saying we'll each encounter tens of thousands of hours of HD video during our training.