Both statements are probably true, but the parenthetical (eventually) is doing an awful lot of heavy lifting.
The last ten years have shown that backpropagation -- while a crucial component -- is not enough. Personally, I would not be shocked to find out in the next ten years that reinforcement learning is not enough for an AGI (as there are aspects like one-shot learning, forgetting, sleep, and other phenomena for which the RL framework seems not a natural fit).
There's a sweet spot between knowing enough and knowing little enough that you both get the right answer and get it quickly.
Correlation != Causation. While these phenomena may well be relevant, I've not seen anything that conclusively proves they are. The ability to forget is important to humans because we are emotional beings, but I don't think that's necessarily a requirement for generalized intelligence. "Sleep" (as in what happens during sleep, not the act itself), on the other hand, is very likely important, but again, not proven.
Intelligence is simply a special side-product of evolution, there is nothing general about general intelligence. Many organisms can thrive without it.
There is also a non-negligible chance that all organisms would die out before reaching intelligence. We are fortunate to live in a world that produced us.
I would agree, but might add that evolution doesn't have 'goals'.
Is that the point you were trying to make?
Convergent evolution exists for at least some adaptations though, like the eye. It’s not unreasonable to think that there may be some sort of equivalent convergence which creates a high general intelligence adaptation given enough time, at least for social creatures.
I think it’s pretty much impossible to know whether intelligence is a convergent adaptation without some kind of perfect simulation of evolution over billions of years. You’d have to tweak starting conditions and see if you kept getting smart creatures.
There are many reasonable assumptions one could draw from the fact.
Says nothing about this:
> biological general intelligence is the end goal of evolution
We won't be able to tell whether it's AGI or just good enough at trained tasks to trick us.
A deductive system can come up with an answer and a proof of that answer, where proof is whatever counts as proof in that system.
So the notion of "does it really understand its answers" gets punted off its Q&A abilities and onto its ability to justify its answers.
What you're describing is what we do at school. We can't assess understanding, so we assess justification of answers as well as other things like the ability to do X (we don't care whether they understood or not, just that they're capable).
Absolutely. The term "AGI" came about specifically to avoid existing philosophical arguments about "strong AI", "real AI", "synthetic intelligence", etc. Those wanting to discuss "true intelligence", etc. should use those other terms, or define new ones, rather than misuse the term AGI.
AGI requires nothing more (or less!) than a widely-applicable optimisation algorithm. For example, it's easy to argue that a paperclip maximiser isn't "truly intelligent", but that won't stop it smelting your haemoglobin into more paperclips!
Personally I take mammalian intelligence as the relevant standard we're actually aiming at.
So I'd say mouse+.
Houseflies, I think, are closer to non-intelligent than intelligent.
Crow-level intelligence is probably likewise sufficient.
I think aiming at mammalian is a good long-term ambition. I think, either way, we are hundreds of years off.
The Turing test is also not an AGI test; it's a "good enough" standard for fooling people.
Intelligence fundamentally requires a multitude of environmental capabilities. The Turing test considers only a single I/O boundary.
So yeah, even getting that level of intelligence would be a huge win. However, most people mean close to human-level intelligence when they say AGI, even if it's in one narrow specialization.
Obviously that already exists even with GOFAI (good old-fashioned AI), so that is not that impressive.
The impressive thing is something more general than that.
If our metric is (intelligence)/(joule), nature seems pretty bad at a first glance: it took many trillions of lifetimes to achieve "general intelligence" *
But then again, on the big stuff like this, have we ever really beat nature? That asterisk is there because, sure, turning the earth's biosphere into computers would make us smarter, but... are we sure?
(And also: human = general?)
Yet manmade solar cells are more efficient by nearly all measures.
Also if someone loses weight, most of the carbon that made up their fat leaves the body as breath.
Just because plants compete on some limited level doesn’t mean that a particular plant organism “winning” means becoming the most efficient converter of sunlight.
Is everyone’s memory like those people who can remember every detail? Why not? If you're immediately planning to make up a just-so explanation on the spot, one resting on the requisite but unproven claim about increasing genetic fitness, that is exactly the problem with evolutionary explanations. It's not science if you just make stuff up and give it the same amount of credibility as something that has been tested and proven. You can take any trait, spin stories about why it is the way it is, and then somehow expect some metric to be maximized because of your unproven theory.
Only because we cheated, though: houses can't spontaneously grow more cells in place when more energy is needed.
1) Trees are natural and trees create leaves with a solar efficiency of x
2) Humans are natural and we create solar panels with efficiency x + y
A gene's extended phenotype includes effects external to particular organisms, like nests, deforestation, changes to the chemical makeup of the atmosphere, etc.
Nature is full of examples that are 'good enough' while balancing other competing constraints. Evolution doesn't create organisms optimized for efficiency - it creates organisms optimized for reproduction. The two are not always the same.
They did avoid one common pitfall at least. They are (intentionally?) vague about which number systems the rewards can come from, apparently leaving it open whether the rewards need be real-valued or whether they can be, say, hyperreals, surreals, computable ordinals, etc. This avoids a trap I've written about elsewhere: traditionally, RL rewards are limited to be real-valued (usually rational-valued). I argue that RL with real-valued rewards is NOT enough to reach AGI, because the real numbers have a constrained structure making them not flexible enough to express certain goals which an AGI should nevertheless have no problem comprehending (whether or not the AGI can actually solve them---that's a different question). In other words: if real-valued RL is enough for AGI, but real-valued RL is strictly less expressive than more general RL, then what is more general RL good enough for? "Artificial Better-Than-General Intelligence"?
Note, however, that almost all practical RL agent technology (certainly any based on neural nets or backprop) very fundamentally assumes real-valued rewards. So if it is true that "RL is enough" but also that "real-valued RL is not enough", then the bad news is that all that progress on real-valued RL is not guaranteed to help us reach AGI.
 "The Archimedean trap: Why traditional reinforcement learning will probably not yield AGI", JAGI 2020, https://philpapers.org/archive/ALETAT-12.pdf
 A notable exception is preference-based RL
I really don't believe that using approximations of real numbers is going to be the bottleneck for AGI.
I'm not sure that makes any difference (in either direction).
I mean, at the scale we care most about, the universe appears to be continuous, so an AGI has to be able to tackle continuous-appearing problems and use continuous-appearing representations.
OTOH, the universe is likely to actually be discrete, so an AGI has to be able to tackle actually-discrete problems, and use representations that are actually-discrete on a fundamental level.
There isn't much of a contradiction between these constraints, although the prospect of a continuous-appearing universe that is actually running on a discrete substrate seems to give a lot of people a brain cramp, and that same brain cramp gets elevated into 'proof' that current approaches cannot lead to AGI. Which is nonsense (there may be other limitations inherent in current approaches, but that can't be one of them).
One might as well claim that computers are digital and brains are analog and conclude that digital image representations cannot possibly be used to communicate information to analog brains.
If the above rewards are shoehorned into real numbers---for example, by replacing omega with 9999 or something---then an RL agent would misunderstand the environment and would eventually be misled into thinking that pressing A yields more average reward.
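To make that concrete, here's a minimal simulation of that failure mode, assuming the hypothetical two-button setup discussed in this thread (button A pays 1 point on every press; button B pays omega on every power-of-two press, here shoehorned into the real 9999):

```python
import math

# Sketch of the shoehorning problem: replacing omega with a large real
# makes button B's long-run average reward collapse below button A's.

def avg_reward_A(n):
    return 1.0                                    # constant 1 per press

def avg_reward_B(n, omega_stand_in=9999):
    paid_presses = math.floor(math.log2(n)) + 1   # presses 1, 2, 4, ..., <= n
    return omega_stand_in * paid_presses / n

for n in (10, 10**4, 10**7):
    print(n, avg_reward_A(n), round(avg_reward_B(n), 3))
# B's average, 9999 * log2(n) / n, eventually falls below A's constant 1,
# so an average-reward agent is misled into preferring A, even though with
# genuine omega-valued rewards B dominates.
```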
I don’t think you want to encode your problem domain in your reward system. It’d be like asking a logic gate to add when you really should be reaching for an FPU. Maybe I’m missing something though?
This is only a problem if you're already assuming we do everything based on our biological reward systems, and in the current context that would be circular reasoning.
Imagine the treasury creates a "superdollar", a product which, if you have one, you can use to create any number of dollars you want, whenever you want, as many times as you want. Obviously a superdollar is more valuable than any finite number of dollars, and humans/mathematicians/AGIs would treat it accordingly, regardless of the finiteness of our biological reward systems.
Is there some other way that we do it besides our biological reward system? It sure looks like we get an apple, and not an infinite reward, when we pick the right answer by selecting button B. I understand that might not satisfy you.
Seems to me that's what this whole paper we're discussing is about. If you're already convinced that there is no other way, then you're basically already agreeing with the paper, "Rewards are enough".
I understand you can use non real numbers, that's not what I was asking. I'm asking what's a behaviour you can't replicate using a reward system based on real numbers.
So glad you asked! I can give an answer that people who take the necessary time to understand it will love. It's complicated; you might have to re-read it a few times and really ponder it. It's about automatic code generation (though it might not look like it at first).
Definition 1: Define the "Intuitive Ordinal Notations" (IONs) to be the smallest set P of computer programs such that for every computer program p, if all the things p outputs are IONs, then p is an ION.
See https://github.com/semitrivial/IONs for some ION examples in python.
Definition 2: Inductively associate an ordinal |p| with every ION p as follows: |p| is defined to be smallest ordinal which is bigger than every ordinal |q| such that q is an output of p. Say that p "notates" |p|.
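For illustration, here is a hedged sketch of a few IONs in the spirit of the linked repo (these particular snippets are my paraphrase, not necessarily the repo's exact examples; "output" means what the program prints, one program per line):

```python
# Notates 0: outputs nothing, so vacuously all of its outputs are IONs,
# and |p| = 0 (the least ordinal above every element of the empty set).
ion_zero = "pass"

# Notates 1: its only output is ion_zero, which notates 0.
ion_one = f"print({ion_zero!r})"

# Notates omega: outputs, for each n, a program notating n (built by
# iterating the "print the previous source" construction forever), so
# its ordinal is the least ordinal above all finite ordinals.
ion_omega = """
src = 'pass'
while True:
    print(src)
    src = 'print(' + repr(src) + ')'
""".strip()
```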
Finally, to answer your question, I want the AGI to write programs which are IONs notating large ordinals, accompanied by arguments convincing me they really are IONs. An easy way to incentivize this with RL would be as follows. If the AGI writes an ION p and an argument that convinces me it's an ION, I will grant the AGI reward |p|. If the AGI does anything else (including if its argument does not convince me), then I'll give it reward 0.
You can't correctly incentivize this behavior using reals. The computable ordinals are too non-Archimedean to do so.
You argue a real-bound RL doctoring algorithm would not appropriately assign "the patient dies" a `-Inf` weight, but in fact humans do not either. If we did, then for a near-death patient absolutely every procedure, no matter how costly, experimental, dangerous, or irrelevant, would be attempted if it had even the slightest chance of increasing the likelihood of them not dying. In reality, doctors make risk-reward decisions for every patient, and will very often choose not to undertake costly, experimental, dangerous, or irrelevant procedures even if there is some documented minuscule chance of them working.
Further, you argue that a real-bound RL theorem prover or composer would not know how to stop going down an ever increasing state-chain x_0, x_1, x_2, ... even if there existed some other state y that was "better" than any of the x's. But, this too is a very human behaviour! How many brilliant mathematicians, musicians, heck even software engineers, have spent their entire careers creating further and further derivatives of a known successful work, as opposed to starting anew and creating something truly world-changing?
You also bring up a theoretical button which on every press gives you 1 point, versus a different button which gives infinite points on every power-of-two press. You argue that the real-bound RL agent would be forced to move to the 1-point-per-press button after some number of presses, but would any human really sit there pressing the button for all of eternity to eventually get the `Inf` instead of just saying "screw it, I want something now"?

Not to mention that the problem setup is fundamentally flawed: within our current understanding of the universe there is no infinite supply of anything, and furthermore, if there were an infinite supply of something, you wouldn't gain anything by pressing after the first press, much less by waiting around for the billionth -- you'd continue to have an infinite supply. In fact, what you've done there is presuppose a surreal universe, by a) assuming that a button can provide an infinite supply of something, and b) assuming that having two of the infinities is better than having just one. So sure, if you're in a surreal universe, backing your RL with surreal numbers is a good idea. But we're, so far as I know, in a real universe, so backing with reals should be sufficient.
Edit: I above use "surreal" to mean both the standard concept of surreal numbers in addition to any numbering concept which allows for and distinguishes between integer multiples of infinities.
One minor correction first: you're absolutely right that AGI is about comprehending the environment, not about perfectly solving all environments (the latter is mathematically impossible even with strong noncomputable oracles etc). I'm not sure why people so often come away from my paper thinking I'm saying AGI is supposed to solve all those environments, I never say anything like that. If I could go back in time, I'd make that clearer in the paper. No, it's about the AGI simply being able to comprehend the environments, like you say. And the thesis in the paper is that shoehorning general environments into real-valued-reward environments is a lossy process.
For the rest of your argument, you make a lot of good points. I would ask, what do you say in response to, e.g., Alan Turing who asks us to imagine Turing machines having infinite tape and running for all eternity? Obviously that too is impossible in the finite universe we live in. That's sort of the divide we disagree on. I'm talking about idealized AGI. If we consider human beings, humans have finite lifetimes so any particular human being's entire lifetime of actions could simply be recorded in a finite tape recording. But does that mean said finite tape recording is intelligent? In the idealized world, I would want to say it's a basic axiom that no finite tape recording of a human can be intelligent. But now we're deep in philosophical woods.
I like your point about musicians etc creating further and further derivatives of known successful work as opposed to starting anew :) I guess in terms of my paper, the real question is, if you confronted these derivative musicians with the grand new work that transcends them all, would they recognize it as such, or would they (like an AGI confused by rewards shoe-horned into real numbers) mistake it for something mediocre? Now we are deep in psychological woods!
Yes, but over what timeframe? Will there be any diminishing returns plateaus along the way?
If you look at sci-fi movies with robots, they usually speak in a metallic voice but have good situational and language understanding. In reality it was the other way around, it's much easier to do artificial voices than understand the topic. That kind of naive understanding seems silly now, and this is how we gradually advance.
GPT-3 taught us that good-sounding text is not that hard to generate if you have ample training data, but modeling the larger context is still hard. These kinds of fine distinctions are what I call progress.
I waffle a lot on whether that aspect of 1968's '2001: A Space Odyssey' is evidence of genius or just survivorship bias.
Some Bozo has no credentials, no reputation, no track record of publications and barely supports the claim they're making with anything much. Some Bozo has no financial incentives or otherwise to opine either way. Some Bozo doesn't even work in the field at all.
Bets: who turns out to be closer to correct after some finite amount of time -- Some Bozo or Deep Mind? 5 years? 10 years? 25 years?
Bozo has the hindsight of history and philosophy going for him, while Deep Mind has a huge financial temptation to sell snake oil.
EDIT: bunch of other related predictions currently open:
>AGI may never happen, but the chance of that is small enough that adjusting for that here will not make a big difference (I put ~10% that AGI will not happen for 500 years or more, but it already matches that distribution quite well).
I don't know, isn't the DeepMind founder that guy in the Go documentary? I read about him after watching the doc and he seemed to be pretty cautious about taking in investment, and he didn't seem the type to try to cash out.
Eventually Google will give them the option to deliver financial success or be shut down.
Google make money, Google Bad.
Deep Mind owned by Google, Deep Mind bad!
The above conclusion is trite.
Perhaps the inverse is true.
Google and Deep Mind, if correct, could be hurting themselves more than helping themselves.
Why? Creating a future species who’s too smart to click on ads, and too smart to remain subject to its whims, doesn’t sound like it’d be good for quarterly profits...
There’s also the emotional incentive for humans to confirm their own beliefs about humanity being special.
If Google/Deep Mind knows this, yet publishes research anyway in the spirit of truth, why, what they’re doing may be considered heroic.
Two sides of the coin here.
Creating an AGI is the endgame for everything. Who cares about ads when you have an AI that can learn to do anything and improve upon itself continuously?
- live forever
- grow their own mental capabilities exponentially over that unlimited lifespan
- turn themselves into universe-eating von Neumann probes
- grow exponentially forever
- "eat the universe" (I know, the last point was sci-fi gibberish)
In fact, humans are already pretty good at reproducing themselves and have managed to travel to space, and have exhibited finite periods of exponential knowledge growth combined with periods of collapse, as nothing grows exponentially forever.
AGI does not necessarily require for it to be conscious or throw tantrums about its creators' purpose. AGI just means that it's an intelligence that can be thrown at any problem, not just a particular game or task, similar to how humans can specialize in CS or playing the violin.
There is a semi-established definition that does include what I referred to:
> AGI can also be referred to as strong AI, full AI, or general intelligent action. Some academic sources reserve the term "strong AI" for computer programs that can experience sentience, self-awareness and consciousness.
The people paying for the development of the AGI can mean many things - the Google customers/users, Alphabet as a company, the executives throwing money at the problem?
Either way, I don't really get your point. Your initial post was about how it is counterintuitive for Google to allocate funds for an AGI, since it makes money out of ads. These are not mutually exclusive, you can have both, but my point is that if you develop an AGI, then you can pretty much "conquer" the world and revenue from ads becomes irrelevant.
Screwing my face up, looking at this sideways … but it seems as though you're saying that the Bozos of HN have nothing useful to contribute to this discussion based on … [rereads] … their lack of academic credentials in the area. You could say this about just about any HN post; I'm just wondering why this one?

Here's a thing though … if the understanding of a technology is so nuanced that Bozos can't "get" it … is it really that mature? We had functioning computers for 50 years, but it was only when the Bozos got their hands on them that things took off. Internet for 20. Cell phones for 10. How long have we been dabbling with neural networks? 50 years or so?

All I see in this most recent explosion in AI is a rapid jump in the availability of cores. À la Malthus, once that newly available "source of nutrition" has been used up we will see a rapid die-off once more, and it will be another 20 years, once the Bozo intellect has caught up, before we look at this topic en masse again. Dismiss the Bozos at your peril. You're dependent on them for innovation and consumption. Yours sincerely, a Bozo.
The vague point was to show that someone with zero reputation, credentials, specific expertise in the field, or anything much can seem pretty convincing in response to this hugely funded ivory-tower exercise just by spitting, cocking an eyebrow, and saying "So you think so, eh? Wanna bet?"
This is a statement about the state of AI research credibility. Do you feel the first breezes of a deep AI winter coming on? (I don't know; I'm disinterested but not uninterested. Rising tides lift all ships etc. And vice versa.) Neural nets are cool. Is all ML a bit overrated? Is "learning" a misleading name to give to applied statistics?
I don't have answers, just suspicions. I could be very wrong, of course.
It would be an interesting thing to know more about.
The existence of human crafted general AI forces him to struggle with the possibility that there is no such thing as a soul.
I know a lot of people don't fall in that camp, but I've heard enough "serious" people make such desperate claims to avoid thinking about the topic in a way that might challenge their underlying religious beliefs. I think no one likes to admit that religion and spirituality often force someone to reject the possibility that AI is actually much simpler than they think it "should" be, because then humans aren't special after all.
Numerous arguments boil down to the claim that complexity is irreducible. You see it here too, hidden in various comments.
What will happen is that capital will become more skeptical about the limits of what's feasible with AI and it'll be harder to sell bullshit. You're already seeing that with companies like Uber selling off their self driving divisions.
It might or might not give us AGI. But it is already leading us to lots of places. Eg speech recognition even on my phone works way better than what I had twenty years ago on a Desktop.
However, this "result" is trivial. It is obviously equivalent to the claim that intelligence arose naturally in the biological world without influence from God.
The important difference here is that in order for RL to translate to solving real-world problems, you need to faithfully and computationally simulate the real world's physical processes and rules, or at least enough of them that nth-order effects are captured accurately.
I've done various types of computational modeling and simulation work at different scales throughout my career, with all sorts of scientists and engineers, and I can tell you: there is pretty much no domain where you have representative models good enough for RL to be used. Some narrow special cases exist, but nothing to the degree of a massive environment full of well-coupled expert domain models. Some of the best cases are going to be so computationally bound that it would be quicker to do things for real than to simulate them.
If you want RL to work and learn, it's likely possible under the connection you point out, but it has to do this using physical machines and sensors interacting with the physical world, like life as we know it does. Your AGI won't be able to cheat and run through the evolution process quicker using the faulty reductionist models we use in most simulations (which is what everyone is implicitly hoping for), IMHO.
If you try this, your AGI is going to learn all sorts of flaws within those environments, or at the very least have so many narrowly scoped bounds that it won't be that "general." A lot of simulated models are frankly garbage (they have some useful narrow scope but are typically littered with caveats), and they've been in development pretty much since digital computing began.
Once we get into the details, their claims stop being ironclad. Even worse, some of their claims become actually hard or impossible to accept if applied to actual RL algorithms we have today. You give one good example with the difficulty of modeling the world. The implicit claim they make that this would be realizable in reasonable time (say, less than a billion years) is also not well supported. The idea that humans or mammals learn their social behaviors through RL, rather than a good deal of reasoning from evolutionarily trained first principles, pretty clearly fails in the face of the poverty-of-the-stimulus argument.
Overall, the claims in the paper tend to switch between obvious (if taken to talk about the general idea of maximizing reward) to almost certainly wrong (if taken to talk about known RL algorithms, reasonable time frames, and specific examples of what is supposed to be learned).
The poverty of the stimulus argument may be controversial in linguistics, where it was first formulated. Still, if applied to mammal or insect socialization, the extremely short time frames in which individuals of a species start exhibiting typical behaviors basically prove, in my opinion, that these are instincts, trained at the population level through evolution, not learned individually through RL. The extreme similarity of behavior between individuals of the same species, versus the variety of behaviors between different species, also suggests an important component of species-level rather than individual-level learning.
Where did God's intelligence come from?
If that's "cutting edge ML", then going off my YouTube recommendations, we're back in another AI winter. If I watch one video from a channel I've not seen before, I'll get that channel recommended constantly even if it bears no resemblance to what I normally watch. On my Explore page, the first 22 videos (of which 8 are Fortnite-related!) hold no interest for me. My Home page is just channels I've watched repeatedly and/or am subscribed to. It's a mess.
I would guess about two thirds of the channels I consistently watch I originally discovered through algorithm recommendations. I think it works extremely well.
For me, probably 90% of what I watch I'm not interested in and often I'm repelled by. This is because I mostly watch to find out what things I'm not familiar with are.
For example let's say I'm a liberal. I'm not going to watch liberal political videos because I know generally what they're going to say and I don't need my political views stroked in order to be happy. But I will watch various other political videos, no matter how extreme or not, so I can be at least a little familiar with their behaviour and views.
YT can't cope with this. To their systems I seem to be randomly picking videos with no correlation with the subject matter or other users and no reinforcing pattern. It just gives up and recommends things based on the behaviour of the general population, as if they had no data on me at all.
Every day, averaging 2-3 hours. It's background for working and foreground for evening viewing.
Deep Mind is best understood as the following bet: if we can train an AI that can learn from "its environment" and do the sort of things a human would do in that situation, then we have achieved AGI and from that ... business ... will follow. Hence their focus on video games as a training environment.
This sounds intuitive but is actually a very agent-centric viewpoint and most AI doesn't resemble this type of thing at all. Most AI deployed so far doesn't have anything resembling an environment, doesn't have any kind of nexus of agency and doesn't need to actively make decisions that then feed back to its own learning, only make probabilistic predictions. And in fact you often don't want an ML model to train on the outcomes of its own decisions.
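For contrast, the agent-centric loop the bet describes looks roughly like this (a minimal sketch using the gymnasium API, with a random policy standing in for a real agent):

```python
import gymnasium as gym

# The agent acts, observes the consequences, and its own decisions feed
# back into learning -- unlike a pure predictive model.

env = gym.make("CartPole-v1")
obs, info = env.reset(seed=0)
for _ in range(1000):
    action = env.action_space.sample()                # the agent's decision
    obs, reward, terminated, truncated, info = env.step(action)
    # agent.update(obs, action, reward)               # hypothetical learning step
    if terminated or truncated:
        obs, info = env.reset()
env.close()
```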
For reference: https://deepmind.com/research
If indeed it even is Deep Mind making those improvements, Google has lots of other ML groups, such as Google Brain, and these are more directly focused on Google products.
There’s no denying their academic success, or game playing etc, but as far as I can see, the data centre cooling bit is the only palpable (public) business success.
Did you mean to write DeepMind instead? If so, I don’t disagree.
Finally, if you're going to attack someone's article: attack the article, not the person that wrote it. This is the lowest level of attack possible: the personal one. It's as ad-hominem as it gets.
So if we're gonna have an opinion we need to do the whole academia & job in the industry dance?
That's quite a terrible way to view the world and quite limiting. A world without diversity is a stale and rotten world.
So fuck that and the glass houses and the boxes this kind of worldview puts people in.
Everyone should be able to throw stones, and if the hit hurts, well, I guess there is a reason.
The thesis is that DeepMind has financial incentive to state "we can achieve AGI with what we're doing", to keep up the funding and hopes for the field, not "the author is an idiot".
And the thesis is true, they do have financial incentives.
That's not ad-hominem.
"Only a true AI would deny their being."
The vast majority of ideas are wrong. Every idea is wrong until it leads to the one that is right.
This idea might be the right one, or it might be close to the right one, or it might be far from the right one, but the trajectory is headed toward the right idea. SomeBozo has no trajectory. The best he can do is watch from the sidelines.
Frankly, after seeing AlphaZero and AlphaFold I'm surprised they didn't declare AGI right there and then.
People assume that when AGI happens, computers can suddenly outsmart humans in every way and solve every problem imaginable. The reality is just that it could, in theory, given enough time and resources.
It is like quantum computing. In theory it can instantly factor the products of our nice cryptographic primes and break encryption. In reality, the largest number it has factored is 21.
In theory given enough time and resources, anyone can defeat any grandmaster in Chess: just compute the extended tree form of the game and run the minimax algorithm.
The "given enough time and resources" clause makes everything that follows meaningless, unless a reasonable algorithm is presented.
> It is like quantum computing.
It is absolutely not like quantum computing. Shor's algorithm is something you can look up right now. It is precise and well-defined. The problems we are facing with quantum computation are related to the fact that we can't really build reliable hardware. But we know that given such machines the algorithm would work. We have precise bounds and requirements on those machines.
As far as AGI goes, we have absolutely no idea. There's lively debate on whether anything we have done even counts as significant advancement towards AGI.
Yes, that's why we're considered to be generally intelligent. It is exactly the point, and not at all meaningless. Right now there's no machine that can come up with the idea to compute the extended tree form of the game and run minimax on it. If there were such a machine, then that machine would be considered AGI.
> It is absolutely not like quantum computing.
I meant in the sense that just because something has actually been achieved, it doesn't mean it's as powerful as we described it in theory. In theory you can use Shor's algorithm to break encryption; in practice the devices we have today have trouble with 2-digit numbers.
The same principle goes for AGI. If someone releases an AGI system today, it doesn't mean that tomorrow we'll see a Boston Dynamics robot hop on a bicycle to his day job as a Disney movie art director. The world would most likely not change at all, at least not for a while, many people would not recognise the significance and many people might not even recognise the fact that it is in fact AGI.
> As far as AGI goes, we have absolutely no idea. There's lively debate on whether anything we have done even counts as significant advancement towards AGI.
You might think that, and that says something about which side of the debate you're on. We're commenting here on the thread of an article about DeepMind asserting that reinforcement learning is enough to reach general AI. If that's true (and I think it is), then we've probably reached general AI already.
"Artificial general intelligence (AGI) is the hypothetical ability of an intelligent agent to understand or learn any intellectual task that a human being can."
What definition are you using?
You can read a book on how to hit a ball with a baseball bat, you can even practice and get good at it, but that still doesn't mean you would actually be able to hit a ball thrown by a professional pitcher.
> You can read a book on how to hit a ball with a baseball bat, you can even practice and get good at it, but that still doesn't mean you would actually be able to hit a ball thrown by a professional pitcher.
If I had incredibly fast reflexes and actuators, I could.
Or maybe it couldn't, because the software is not as efficient as the organisation of your brain is. Or because there are hardcoded routines evolved in your brain that it lacks.
What I'm saying is that just because an AGI can't drive a car, it doesn't mean it isn't an AGI. For the same reason, there are loads of people out there who are generally intelligent but can't drive cars, for all sorts of physical reasons.
Admittedly I'm a layman in this area, but could it? AFAICT it would only work on its training data, plus whatever generalizations can be made from that, and not infer unseen scenarios the way humans readily do.
> What I'm saying is that just because an AGI can't drive a car, it doesn't mean it isn't an AGI.
I understood what you meant from your first post, I'm simply disagreeing on account of the very definition of AGI.
You can't have an amoeba-level AGI and still call it (a limited) AGI. Either it can understand/learn any human task, or it can't.
The definition is made for a reason. Watering it down for any specific generation of AI serves no benefit.
OK. This basically says "evolution works". But how fast? Biology took tens of millions of years to boot up.
A related question is how much compute power evolution, viewed as a reinforcement learning system, has. That's probably something biologists have thought about. Anyone know? Evolution is not a very fast or efficient hill-climbing system, but there are a large number of parallel units. It's not a philosophical question; it's a measurable one. We can watch viruses evolve. We can watch bacteria evolve. Data can be obtained.
Two questions I pose occasionally are "how do we do common sense, defined as not screwing up in the next 30 seconds", and "why does robotic manipulation in unstructured situations still suck after 50 years". A good question to ask today is why reinforcement learning does so badly on those two problems. In both cases, you can define an objective function, but it may not be well suited to hill climbing.
Great point. Until the promoters of RL can build us a robot that can 1) walk gracefully through a typical home that has stairs and closed doors, 2) cook a meal with pots and pans, and 3) get back up after it falls down -- I suggest we take their claims of impending Singularity with a big grain of salt.
Separately, a hostile or indifferent AI could still cause a heck of a lot of trouble for human civilization without the first two things. Consider an autofactory clearing room for expansion with bulldozers, no need to navigate stairs there. Bullets or smart glide bombs don't need to understand doorknobs. Etc.
For example, the classic "design a car that can drive over this terrain" problem, even after a billion generations (~ the same number as life on earth), shows no substantial performance improvement.
That makes me suspect something is missing from our biological genetics model.
So, we do not have as many constraints as life did.
That is probably true to some extent. I mean, if we make an AI that has an orgasm each time it blows up something with a hellfire missile, it will probably learn to find ways to blow up more things more frequently and efficiently.
Our cognition is affected by pain, hunger, thirst, cold, heat, pleasure, smells, sounds, etc... positive and negative reinforcements.
I'm with most of the comments there. This paper is ridiculously hand-wavey.
It's also worth noting that this isn't a homogeneous organization; many DeepMind employees have different opinions on issues like this, and an individual paper isn't representative of the entire organization.
Please don't consider my critique of this paper as an indictment of DeepMind as a whole!
> Many of DeepMind's opinion style papers are like this.
That's good to know. I have not read many of their opinion papers, and I'll admit I didn't have the context of it being an "opinion" paper.
That said, I don't agree with the opinion. The paper didn't really engage with the concept of AGI in a way that I found satisfying. The conclusion may very well be correct, but this paper wasn't enough to convince me.
Slightly OT: My views were reinforced when I saw the paper was praised by Patricia Churchland. I don't find her take on consciousness a satisfying one, though I find the general direction of her work interesting. See here for another example:
This setting on its own is meaningless! The "how" of the RL agent is not even 99% of the problem; it is all of it.
Given our understanding of both DL and neuroscience, it is not even clear to me that we can say with confidence that Neural Networks are a sufficiently expressive architecture to cover an AGI.
The human brain is a deep net, sort of, but there is also plenty going on in our brains that we don’t understand. It could be that the magic sprinkle is orthogonal to DL and we just don’t know about it yet.
I think there are two currently unsolved problems
1/ We have no idea what the reward function looks like that leads to AGI
2/ Deep networks are artificially constrained for computational efficiency and always optimized to solve the problem at hand.
Any solution that delivers AGI should rely imo on:
1/ reinforcement learning
2/ Happen in an unstructured reservoir of randomly connected neurons
There was a research trend towards reservoir computing and recurrent neural networks but this was mostly abandoned because progress in deep learning was amazing.
These techniques are akin to a 2D plane in a 3D object: heavily simplified, with circular references prohibited.
I have some good ideas on what the reward function should look like in a reservoir setting and happy to discuss them with any active independent researcher in the field.
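For readers unfamiliar with the reservoir idea, here is a minimal echo-state-network sketch: a fixed, randomly connected recurrent reservoir (circular connections allowed, unlike a plain feedforward deep net) in which only the linear readout is trained. Hyperparameters here are illustrative assumptions, not tuned values.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_res = 1, 200

W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
W = rng.uniform(-0.5, 0.5, (n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # spectral radius below 1

def run_reservoir(inputs):
    """Collect reservoir states x_t = tanh(W_in u_t + W x_{t-1})."""
    x, states = np.zeros(n_res), []
    for u in inputs:
        x = np.tanh(W_in @ np.atleast_1d(u) + W @ x)
        states.append(x.copy())
    return np.array(states)

# Toy task: predict the next value of a sine wave; only the linear
# readout W_out is trained, via ridge regression.
t = np.linspace(0, 20 * np.pi, 2000)
u, y = np.sin(t[:-1]), np.sin(t[1:])
X = run_reservoir(u)
W_out = np.linalg.solve(X.T @ X + 1e-6 * np.eye(n_res), X.T @ y)
print("train MSE:", np.mean((X @ W_out - y) ** 2))
```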
I'm not sure that's true anymore - pretty much any objective devised is being solved by ML solutions within months (with some exceptions such as Chollet's ARC, maybe Winogrande). But those same models will perform poorly on other unseen tasks, because ML takes shortcuts if it can. We used to have tasks that stood unsolved for decades, such as Go. It's now comparably hard (if not harder) to create a good objective measure of intelligence than to reach human parity on said measure.
I’m not dissing self-driving car research, just not sure we’re anywhere close to parity, and the problem is fairly well defined.
This is the big challenge in practice.
Your brain is constantly simulating a few milliseconds ahead.
The hack is to have the simulator itself ... also be a learned system, and not to simulate the whole world, because you don't act on the real world, only on the tiny part of it that you can measure and actually get into your brain (which is a simplified version of what you see, or a "latent variable"). There's no need to simulate anything that doesn't affect your reasoning. The information flow in reality, with an intelligent actor, looks like this:
World (say, a tree falls) -> input representation (e.g. eyes) -> simplified version ("latent" version) -> intelligent actor -> muscles -> affects world.
Now what everybody thinks of as a simulator is something that simulates the whole thing. But if you insert one more link (output of reasoning agent at time T -> simplified representation at time T+1) you can then run a "simulation":
random simplified version ("what if your car became a tree and fell ?") -> intelligent actor -> next input for latent representation ("then what happens ?")
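In code, that loop might look like this minimal sketch, where `encoder`, `world_model`, and `policy` are hypothetical trained networks, not a specific library:

```python
# "Simulation" happens entirely in latent space: the learned world model
# predicts the next latent state from the current latent and the action.

def imagine(encoder, world_model, policy, observation, horizon=10):
    z = encoder(observation)        # simplified ("latent") version
    rollout = []
    for _ in range(horizon):
        action = policy(z)          # intelligent actor decides
        z = world_model(z, action)  # "then what happens?"
        rollout.append((z, action))
    return rollout                  # outputs stay internal; nothing reaches the muscles
```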
For safety reasons, it is probably prudent to disconnect the muscles in this state. You know, so you don't knock out your mother when you dream about boxing.
And as you say, this "predict the future" network is probably useful by itself in dealing with the world. So you can catch tennis balls and the like.
Or, if you like: https://www.youtube.com/watch?v=dPsXxLyqpfs
There is no evidence that the thing you don't understand isn't based on RL too.
As far as I can tell, they're not actually proposing how to achieve this. I can't access the article without a host institution, it seems (is there another link?), so I only have the abstract to go by. RL has been the basis for all robots engaging with the world, and that engagement with the physical world modeled using RL has been promised to produce robots that can act like a 2-year-old for a long time (see Cynthia Breazeal's work, for example). Yet AFAIK we haven't actually achieved this, as we don't know how to model the problem efficiently enough to reach learning rates anywhere near what we're able to do with DNNs today.
Perhaps someone who has access to the paper can say why this is a milestone? If Patricia Churchland suggests it is, then something new must be happening here.
This is the download link: https://www.sciencedirect.com/science/article/pii/S000437022...
After having read the paper, I am very disappointed in the output. Nothing concrete is shown, just hypotheses, and it reads more like philosophy. That being said, I would say that the paper is carefully worked out and does provide insight if you haven't thought about RL before.
Personally, from reading the abstract, I disagree with the hypothesis. There's a trick where anything (even, say, a database lookup) looks like optimization as long as you contrive the objective function just right, but that's kind of uninformative.
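A toy instance of that trick, with made-up names and data: a plain dictionary lookup recast as reward maximization under a contrived objective.

```python
db = {"alice": 30, "bob": 25}

def reward(key, candidate):
    # Pays 1 exactly when the candidate equals the stored value.
    return 1.0 if db.get(key) == candidate else 0.0

def lookup_by_optimization(key, candidates=range(150)):
    # "Maximizing reward" here is just exhaustive search for the answer.
    return max(candidates, key=lambda v: reward(key, v))

print(lookup_by_optimization("alice"))  # 30
```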
Yes, intelligence has been created by evolution. That doesn't imply that any system that is subject to evolutionary forces will lead to the creation of intelligence (and not within a reasonable timeframe, either). The challenge is to create a system that is capable of evolving intelligence.
Afaik some biologists even think that the evolution of intelligence was rather unlikely and would not necessarily happen again under the same circumstances as on earth.
Evolution and intelligence are inextricably linked. They are practically the same thing. This means that intelligence is probably a natural result of any system similar to those that support biology. If you flow the right amount of energy through a substrate with complex enough building blocks, you'll eventually get life, which is just something smart enough to survive and feed off the available energy flows. In the world, this flow is radiation from the sun, while in a computer it is governed by a more abstract loss or fitness function.
Hmm. Can you provide a pointer to those biologists?
AFAIK, high intelligence has arisen more than once on Earth (Hominoids, Cetaceans, Octopuses), so I'm somewhat skeptical of that claim, but perhaps they're construing intelligence more narrowly (ie. only Homo Sapiens qualifies).
True, the question of intelligent life evolving can be construed as either:
"Given that life exists, what is the probability of intelligence evolving?"
"Given that the universe exists, what is the probability of life arising and evolving intelligence?"
Both are actually interesting and important questions (cf. the Drake Equation and Fermi Paradox), but I am pretty comfortable asserting that in the context of this conversation the former interpretation is more apropos.
Another way to look at it is: if we had a good enough function (e.g. a universal approximator), it could be made to model any behavior using numerical optimization. Which I think isn't very surprising, but apparently there are some arguments about it.
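As a toy instance of that claim, here is a tiny network made to model an arbitrary behavior (XOR) purely by numerical optimization of a loss; sizes, seed, and learning rate are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1, b1 = rng.normal(0, 1, (2, 8)), np.zeros(8)
W2, b2 = rng.normal(0, 1, (8, 1)), np.zeros(1)
lr = 0.5

for _ in range(5000):
    h = np.tanh(X @ W1 + b1)                 # hidden layer
    out = 1 / (1 + np.exp(-(h @ W2 + b2)))   # sigmoid output
    g_logit = (out - y) / len(X)             # cross-entropy gradient at the logit
    g_h = g_logit @ W2.T * (1 - h ** 2)      # backprop through tanh
    W2 -= lr * h.T @ g_logit; b2 -= lr * g_logit.sum(0)
    W1 -= lr * X.T @ g_h;     b1 -= lr * g_h.sum(0)

print(out.round(2))  # approaches [[0], [1], [1], [0]] after training
```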
So welcome back to the future, and the $trillions the US spent on 20 years of space race and 50 years of cold war. The catchphrase that motivates the next 50 years of government/corporate funding will be...
They've got a Terminator and we don't.
A 2018 article about the challenges of reinforcement learning: https://www.alexirpan.com/2018/02/14/rl-hard.html
The actual scientific question is, what are the mechanisms that make agents work, what are the fundamental modules within intelligent systems, is there a distinction between digital and biochemical systems, what costs are there in terms of resources and energy to get to a certain level of intelligence, and so on. Real questions with specific answers. For all the advances coming from just upping the amount of data and GPU hours, there is so little progress on trying to have a model of the structures that underpin intelligence.
Trying to answer specific questions won't generalize, but if you train a network with the right (potentially hacky) series of rewards in a rich enough environment, you could get a much more general intelligence:
a new kind of science
Given that requirement, you'd have to either find a way to accurately model the world and all of those interactions in silicon, or build millions of robots that can report back the results of billions of interactions each day. It's not impossible to do that, and maybe we would even eventually accomplish it, but the cost would make it prohibitive for anyone but a nation to even attempt today. It's almost certainly outside the realm of what is possible in the near future. Maybe only when robotics has progressed enough that robots are capable of interacting with the world with basic AI will we see the rise of something like a GAI.
It talks a lot about having a rich enough environment for learning, which makes sense: if a computer lives only in a Go board, it can only learn to play Go.
How do you simulate a rich enough environment purely in software (or do you sense input from the "real" environment), and what reward do we define in this complex environment?
It seems to ask those two questions in the discussion but kind of glosses over them, imo.
As a trivial example, consider a variation of Conway's Game of Life which, in addition to black and white cells, also has green cells, where any cell next to one or more green cells will be a green cell in the next time step. A generic state in such a variation will have at least one green cell, and therefore all parts of it will eventually be green, and so no useful long-running computation will be done, certainly none that takes the locations of the green cells into account. But such a system would still be Turing complete, because one could start in a state in which there are no green cells, and in those states you just have Conway's Game of Life.
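A minimal sketch of that variation (0 = dead/white, 1 = alive/black, 2 = green; wraparound edges assumed for brevity):

```python
import numpy as np

def step(grid):
    # Count neighbors of each type with periodic wraparound via np.roll.
    def neighbor_count(mask):
        return sum(np.roll(np.roll(mask, i, 0), j, 1)
                   for i in (-1, 0, 1) for j in (-1, 0, 1)
                   if (i, j) != (0, 0))
    n_alive = neighbor_count((grid == 1).astype(int))
    n_green = neighbor_count((grid == 2).astype(int))
    # Ordinary Conway rules for the non-green cells.
    born = (grid == 0) & (n_alive == 3)
    survives = (grid == 1) & ((n_alive == 2) | (n_alive == 3))
    out = np.where(born | survives, 1, 0)
    # Green spreads unconditionally and (assumed here) persists.
    out[(grid == 2) | (n_green > 0)] = 2
    return out
```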
That trivial example works as an existence proof, but even for less extreme cases it isn't clear.
Consider ordinary Conway's Game of Life. To paraphrase a question from Alex Flint on the Alignment Forum (https://www.alignmentforum.org/posts/3SG4WbNPoP8fsuZgs/agenc... ): suppose we have some 10^50 by 10^50 square where an agent is supposed to be implemented, and this 10^50 by 10^50 square is at the top left corner of a, say, 10^100 by 10^100 square, where the rest of the square is initialized randomly. Is it even possible for the agent to be such that it has a high chance of successfully influencing the large-scale state of the rest of the 10^100 by 10^100 region in the way that is desired? It isn't clear. It isn't clear that a structure can withstand the interactions with a surrounding chaotic region. Perhaps some systems do allow Turing-complete computation, and are such that typical states result in complex behavior, but are also such that all really structured behavior is "fragile", and can only continue in a structured way if what interacts with it is drawn from a small set of possible interactions.
To be capable of Turing-complete computation is not, I think, sufficient for "life" (a self-maintaining thing) to arise from typical/generic states, even under the assumption that typical/generic states lead to continually complex behavior (to exclude the spreading-green-cells case).
Also, I don't think we can confidently say that the Planck time is "the universal frame rate". Better to refer to Bremermann's limit and the Margolus–Levitin theorem, though these bounds depend on the amount of energy available (about 10^33 operations per second per joule, where the energy is the average energy of the system doing the computation).
You're right, that's the actual meaning of action in physics, which is what the Planck constant measures: the amount of change (measured in Hz) per joule of energy. But it's a good enough approximation and a good lower bound for the amount of processing power the universe possesses versus our en-silico hardware. We don't have anything near 10^33. Just because we build a system that has the ability to evolve doesn't mean we will ever see it through to the extent that the universe has the capability to.
Planck's constant measures action: Hz per joule of energy. Hz is really just a measure of oscillation, or change. It doesn't directly translate to frame rate, but it gives us a ballpark figure in orders of magnitude. We don't have anything near 10^34 Hz en-silico, and even if we built a biological/chemical computer, that would be on the order of Avogadro's number, 10^23. So, just because we build a system that can _evolve_ to be intelligent, or hold intelligence within it, doesn't mean we have any ability to actually see it through.
But as an enthusiast of all three I really think that AGI is a hardware problem, not a software problem.
Reinforcement learning on a massive corpus of data is how we train all biological intelligence.
The crazy thing is that in humans we manage to do it on ~3 watts.
I think we have the software cracked; my gut says silicon just isn't the right material.
To me, it seems more likely that we're missing something/some things on the software side. AGI could probably run on present day hardware or even older.
(Unless you're making a more general reductionist statement that everything in the universe is a computational process - that kind of reductionism is understandable coming from people who work with computers for their job - but this is then a philosophical stance, not scientific, and frankly a very strange one.)
Now can I prove that? Of course not. But it seems like a fairly solid working hypothesis (any alternative hypothesis sounds far more quacky anyway -- what, quantum entanglement of microtubules?).
Source? I am not aware of any other known process in the universe that could not be simulated by a Turing machine.
We know that Turing machines are very limited things and that the computational processes they carry out are also very limited in applicability.
What's the evidence that the universe is more limited that a Turing machine?
Just the fact that we can imagine things that can't be computed by a Turing machine should clue you in that it's probably otherwise.
Again, source? Do you know of anything that is able to perform a computation that a Turing machine cannot?
> Just the fact that we can imagine things that can't be computed by a Turing machine should clue you in that it's probably otherwise.
Like what? Uncomputable numbers like Chaitin's constant? We can "imagine" them by stating their definition, but we cannot compute them. Or do you have something else specific in mind?
That's a circular argument, because "computation" is literally defined as "something that can be computed by a Turing machine".
That said, the first month of the first year of a CS education is "here are these problems that can't be solved by a Turing machine, mind=blown". (At least where I studied CS, that is.)
The halting problem is (I think?) the standard example.
Here's a short video on it https://youtu.be/macM_MtS_w4
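The core of it, sketched as code: assume a hypothetical oracle `halts(program, arg)` existed; the program below defeats it, so no such oracle can be implemented.

```python
def halts(program, arg):
    ...  # assumed oracle -- provably impossible to implement in general

def paradox(program):
    if halts(program, program):
        while True:   # loop forever if the oracle says we'd halt
            pass
    # otherwise: halt immediately

# paradox(paradox) halts if and only if it doesn't -- a contradiction,
# so no total implementation of halts() can exist.
```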
On a serious note, the only positive outcome of all this shameless PR is that the heavy investment in ML/RL might trickle down to actual science labs and fundamental neuroscience research, which might move us forward towards understanding natural intelligence, a prerequisite for creating an artificial one.
I've thought about this before, and I'm not convinced it's really a prerequisite. Naturally developed intelligence, to my mind, may actually be highly constrained and inefficient because it was limited to what was biologically feasible; i.e., there may be simpler ways of achieving comparable results. Natural intelligence does, however, have the benefit of being an actual working model, but deciphering the black box may be just as hard as developing a working theory from first principles.
I think someone serious about AI should treat it not as an engineering problem but as a science, like physics, which starts with a model of nature and experiments to prove or disprove the theory. Nature provides the constraints by which theory is developed, which radically limits the "search space" of theories. Otherwise it's a bit like throwing things at the wall and seeing what sticks, which is the primary method of current AI research.
However, understanding them absolutely was; we didn't end up taking exactly the same route to the sky, but we absolutely learnt from birds on the way.
Games like Go and Starcraft are well modeled worlds. If you want something akin to AGI to operate in the "real world" you will need a high quality data model of the real world for the RL system to work off of.
To quote a cliche: "we live in a society". As humans we are embedded in a social environment which has a few important features: We cooperate, we compete and we die. These three pillars are the basis of our culture (a concept we should apply to AI btw). Because of competition we are forced to learn everything there is to learn (general intelligence), to get a leg up. Because of cooperation and death we need to continuously transmit and share knowledge with our friends and the next generations. Ever changing alliances means we need to get good at both deception and detecting it.
For this reason I think warfare is ideal for reaching general AI.
Just one - yes. But how about if you send millions of bottle messages?
Assuming we can integrate all the learnings from those bottles into a system that can classify any given situation and apply the learning in that domain. But building a system that can classify any problem is where we're stuck, and RL can't get us there.
General AI requires a feedback mechanism from the real world. Unless you have an accurate model of it in a computer, you can’t just test whether a joke will be funny without waiting for humans to laugh. You can’t check whether a tailored diet or workout regimen or gene therapy will have good results without humans trying them.
So you’ve reduced your AI problem to a harder problem: modeling the world and all of its complexity in a computer, and somehow being able to run simulations faster than the stuff that happens in the actual real world
I have a feeling that the lines between the supervised and unsupervised categories will get increasingly blurred, with semi-supervised, self-supervised (eg. like self-attention) and adversarial (eg. GANs) approaches mixing together in strange ways.
You might not like my analogy either. I think of supervised and unsupervised learning as the majority of the genome of ML, while RL is that little Y chromosome sometimes tacked on to address a few high-profile tasks.