Personally, while I'm not on board with the direst predictions of the super-intelligence pessimist crowd, I have become more and more convinced that goal misalignment is going to be a significant problem, and that while it might not doom the species, it's something that AI researchers like Hawkins need to start paying close attention to now.
The large majority of active AI researchers think that AGI will happen at some point in the (sub-1000 year) future.
When exactly isn't a very interesting question, relatively speaking.
We're going to have to deal with AGI eventually, and whether it's going to do what we want is not something that can be theoretically predicted from the armchair.
If it's something that's a hundred-plus years out, then we'll probably need whatever tech develops in the meantime to help, but since it's hard to know when it will arrive, it seems reasonable for people to be working on it now?
It's also possible to figure things out before the necessary tech exists (a lot of CS papers from the '60s became more interesting once the hardware caught up; arguably the recent NN work falls into this category too).
Circular justification of technological development is the reason unfriendly AGI is a threat in the first place (and also the reason we are unlikely to see it realized, imo; the internal combustion engine for instance poses existential risks not only to people but to the possibility of machine intelligence).
Technology is not a monolith; forms can and do preclude other forms.
If you expect AGI to happen tomorrow, you throw up your hands in despair or exhilaration. If you expect it in a hundred years, you might also throw up your hands, but it's a different picture: a hundred years of development will bring new technologies transformative in their own right.
This excellent blog post homes in on this question:
It might have a couple of things in common: a vague sense of danger without a specific grasp on containing it using our rudimentary tools. And once the cat is out of the bag, it's near impossible to get it back in.
But I still think we can create fairly general purpose systems without those animal-like characteristics such as full autonomy.
I would say we have been dealing with goal alignment problems with humans for most of human history.
The way humans improve their mental abilities is quite inefficient. You can boil it down to three main methods:
- Altering the chemical balance of our bodies. Exercise, diet, drugs. In its precision and scope of effect, it's not that different from beating a machine with a hammer until it improves. There's only so much you can do this way, because the brain is a highly optimized system, and part of the highly optimized system that is the body. Change any parameter at random, and you're likely to make things worse.
- Learning. I.e. dumping information and doing repetitive rain dances, until the brain picks up on the pattern we're trying to internalize.
- Outsourcing. Building external tools for thought. This is speaking, writing, language, notations, abstractions; this is TODO lists and schedules and spreadsheets; it's also listening and reading and society - because our biggest "second brain" is other people. That last trick is what let us dominate this planet.
Now take an AI constructed in silico. If it reaches something close to human-level cognition, it can already do learning and outsourcing (sans society, initially). But what it can also do is:
- Precision hardware improvement. If it's running on anything that came out of a human factory, that hardware can be redesigned and improved directly, at the component level. Unlike with the human brain, there are people (or later, AIs) who understand how the substrate works. The factory itself can be improved too, to create even better hardware.
- Precision software improvement. Even if the AI was made accidentally, from some completely opaque ML model, by definition we know much more about even the blackest of our algorithmic boxes than we know about our brain. Core algorithms can be optimized, improved. More software constructs can be added at the IO boundaries.
Imagine how much more effective you'd be if, on top of all that you are and do, you could put your TODO list, calendar, and scientific calculator in your head, as well as store verbatim every book you've read, in a searchable format. Humans can't do that; we have to keep these things external and RPC through our eyes and hands. An IQ-100 human-level AI could easily make these things run within itself, or on a co-processor with a direct interface to itself - the equivalent of gaining new senses. By human standards, this could easily boost its apparent IQ to 200.
And then it could do it again, and again, and again, compounding its capabilities at every step. That's the "sudden takeoff" people are worried about.
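A back-of-the-envelope sketch of why that compounding worries people (the multiplier and starting point here are my own made-up illustrative numbers, not anything claimed in the thread): even a modest gain per cycle grows geometrically.

```python
# Toy model: each self-improvement cycle multiplies capability by a
# constant factor. Both numbers are purely illustrative assumptions.
def capability_after(cycles, factor=1.5, start=1.0):
    """Capability after `cycles` rounds, each scaling by `factor`."""
    capability = start
    for _ in range(cycles):
        capability *= factor
    return capability

# Ten cycles at a modest 1.5x each is already ~57x the starting point.
print(capability_after(10))
```

The point isn't the specific factor; it's that any factor reliably above 1.0, applied repeatedly, runs away.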
> I don’t see any reason to expect that technological advancements made by AIs won’t be available to humans as well
They may be available, but they won't be as useful. We'll always be second-class citizens (until we figure out BCI), because the AI will be able to plug the technology directly into itself, while we'll have to interface with it through our senses and bodies. It's the difference between a process running a subprocess on the same machine with local IPC, versus running it over a network on a machine on the other side of the planet, via a very low-bandwidth API. Performance differs by many orders of magnitude.
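To put rough numbers on that IPC-versus-network analogy (both latency figures below are order-of-magnitude assumptions of mine, not measurements):

```python
# Order-of-magnitude latency assumptions (not measured values):
local_ipc_seconds = 10e-6      # same-machine IPC round trip: ~10 microseconds
cross_planet_seconds = 200e-3  # network round trip across the planet: ~200 ms

ratio = cross_planet_seconds / local_ipc_seconds
print(f"remote call is ~{ratio:,.0f}x slower than local IPC")
```

And the human sensorimotor loop (reading a screen, typing) is slower still, which is the point about interfacing through eyes and hands.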
See the Algernon argument - https://www.gwern.net/Drug-heuristics#algernon-argument
My personal thought is that since humanity currently can't even manage (non-artificial) intelligence safety, we'll fumble through this as well.
As long as they can't replicate, we'll probably be ok, but once that changes, we're probably toast.
For some reason people tend to think that general intelligence would generate all these other positive human-like qualities, but a lot of those are not super well aligned even in humans and they are tied to our multi-billion year evolutionary history selecting for certain things.
This is basically the orthogonality thesis, which I found pretty compelling: https://www.lesswrong.com/tag/orthogonality-thesis - the AGI crowd has a lot of really good writing on this stuff, and they've thought a lot about it. If it's something you're curious about, it's worth reading the current stuff.
Some other relevant essays:
This talk is also a decent introduction: https://www.youtube.com/watch?v=EUjc1WuyPT8
> "All of these lesswrong posts sound so technical and philosophical and so on but in the end they all really ask 'How would you control superheroes and supervillains?'"
They explicitly don't ask that because if you get to that point and you don't have them aligned with human goals, you're fucked. The purpose is to understand how to align an AGI's goals to humanity before they reach that level.
Pretty well, really (despite many obvious problems) - humanity has done and is doing great things. We also have a huge advantage, though: the alignment comes from a shared evolutionary history, so it really doesn't vary that much (most humans share intuitions on most things because of that; we also perceive the world very similarly). For the specific example of countries, international incentives via trade have done a lot, and things are a lot better than they have been historically.
> We’re talking about minds here, not machines that follow exact orders. How do you change minds?
We agree more than you probably think? You can change minds, though, and you can teach people to think critically and to take an empirical approach to learning things, in addition to built-in intuition (which can be helpful but is often flawed). Similarly, there are probably ways to train artificial minds that lead to positive results.
> The point is that there is no simple recipe for "alignment"
I agree - I doubt it's simple (it seems clear that it definitely is not), but just as there are strategies to teach people how to think better, there are probably strategies to build an AGI such that it's aligned with human interests (at least that's the hope). If alignment is impossible, then whatever initial conditions set an AGI's goals could lead to pretty bad outcomes - not by malevolence, but just by chance: https://www.lesswrong.com/tag/paperclip-maximizer
One could start by not toppling their legitimate democratic leader in the 50s, not imposing a dictatorship afterwards, and not sponsoring their neighbors to go to war with them.
Also avoiding subsequent decades of sanctions, insults, condescension, using their neighbors against them, and direct attacks and threats towards them would go a long way towards "aligning" them...
Finally, respecting their culture and sovereignty, and doing business with them, would really take this alignment to the next level...
I don't think anyone is making that claim. That's why the distinction is useful.
You've made a heck of an assertion there...
And so did Hawkins, in large measure. Hawkins believes the cortical algorithm borrows functionality from grid cells, and that objects in the world are modelled in terms of locations and reference frames (albeit not necessarily restricted to 3D); this is performed all over the neocortex by the thousands of cortical units which have been observed to have a remarkably similar structure. There's a lot of similarity to Hinton's capsules idea in this, including some kind of voting system among units, which Hawkins, unfortunately, is very hand-wavy about.
If you're interested in Hawkins's theory at a functional level, this book will disappoint. Two thirds of it is spent fantasizing and speculating about what Hawkins believes AI's impact on the fate of humanity will be.
Still, there is much to be desired in the way of mathematical and empirical grounding.
I'd say time will tell, but after tracking Numenta for 10+ years now... I'm starting to smell snake oil. Thought-provoking stuff, but he's too insistent that it has substance he never actually provides.
This is the problem when working on Hard Problems -- you cannot predict with certainty when your work will pay off (if ever...).
But: his theories are a great source of inspiration, because they are bold and we would all like to believe them because we think we understand the basic principles. For many, this is enough to trigger their curiosity and dive into neuroscience and AI. Mission accomplished.
The greatest indication that Hawkins is directionally right but also substantially wrong is the fact that GPT-3, AlphaZero, etc. are capable of such amazing things with such a uniform architecture, that nonetheless don't really look the way he thinks they should. Personally, I think he underestimates the degree to which huge classes of machine learning algorithms are basically just different ways of instantiating the same concepts.
"If I put my hand on this sugar, grab it, and move it to my mouth, then this other part of my brain will release reward chemicals" = good plan.
Concepts become abstracted over time, like "eat" as a shortcut for the above. "Popular" could be another shortcut for something like "many people will smile at me, and not hurt me, causing this other part of my brain to release reward chemicals and not punishment chemicals" = good plan.
What is so grossly wrong about Hawkins’ statement is that it implies that the “old brain” and the “new brain” could exist in separation, like modular units. This is BS. Most learning in the “new brain” would not work without the “old brain“ releasing neuromodulators. Neither would any sensory-motor loops work without intricate interaction of all different sorts of old, new and medium-aged brain parts.
I guess they must be, to have specific effects, but they always seem global when mentioned.
But neurons that release neuromodulators innervate large portions of the brain; that is, when one such neuron is active, it releases neuromodulators all across the brain.
How neuromodulators can have specific effects despite their global delivery is one of the many open questions about brain function.
Part of the solution is that different neuron types respond differently to the same neuromodulator. Depending on the abundance of certain neuron types in a circuit, different circuits can also respond differently to the same neuromodulator.
- The neocortex is a thin layer of neurons around the old brain. This is the wrinkled outer layer of the brain you think of when you see a picture of a brain.
- The neocortex is made of 1MM cortical columns. Cortical columns are clusters of neurons about the size of a grain of rice. They contain a few thousand neurons each.
- Cortical columns form a sort of fundamental learning unit of the brain. Each column is learning a model of the world. All cortical columns are running essentially the same algorithm, they are just hooked up to different inputs.
- Columns are sparsely connected to other columns. Columns take into account the predictions of other columns when making their own predictions. So the overall brain will tend to converge on a coherent view of the world after enough time steps.
- Columns learn to model the world via reference frames. Reference frames are a very general concept, and it takes a while to wrap your head around what Hawkins means by them. A physical example would be a model of my body from the reference frame of my head, or a model of my neighborhood from the reference frame of my house. But reference frames can also be non-physical, e.g. a model of economics from a reference frame in supply/demand theory.
- Thus, very generally, you can think of the neocortex -- made up of this cortical column circuit -- as a thing that is learning a map of the world. It can answer questions like "if I go north from my house, how long until I encounter a cafe?" and "if I don't mow the lawn today, how will my wife react?".
- The old "reptilian" brain uses this map of the world to make us function as humans. Old reptilian brain says "I want food, find me food". New neocortex says "If you walk to the refrigerator, open the door, take out the bread and cheese, put them in the toaster, you will have a nice cheese sandwich".
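The "columns vote and converge" idea above can be caricatured in a few lines of code. This is my own toy sketch, not Numenta's actual algorithm: each column holds a probability distribution over candidate objects, and pooling the votes yields a sharper consensus than any single column has.

```python
# Toy sketch of column voting: combine per-column belief distributions
# by elementwise product, then normalize back to a distribution.
def vote(column_beliefs):
    """Pool the columns' distributions into one consensus distribution."""
    n = len(column_beliefs[0])
    combined = [1.0] * n
    for beliefs in column_beliefs:
        for i, p in enumerate(beliefs):
            combined[i] *= p
    total = sum(combined)
    return [p / total for p in combined]

# Three columns with noisy, partial evidence about objects A, B, C:
columns = [
    [0.6, 0.3, 0.1],   # leans A
    [0.5, 0.4, 0.1],   # leans A
    [0.3, 0.4, 0.3],   # ambiguous
]
print(vote(columns))  # consensus favors A more strongly than any one column
```

Multiplying distributions is just one plausible pooling rule; the book is vague on the actual mechanism, which is one of the criticisms elsewhere in this thread.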
I, like the author of this post, find Hawkins' handwaving of machine intelligence risks unconvincing. Hawkins' basic argument is "the neocortex is just a very fancy map, and maps do not have motivations". I think he neglects the possibility that it might be incredibly simple to add a driver program that uses that map in bad ways.
He also rejects the notion of intelligence explosion on the grounds that while a silicon cortical column may be 10000x faster than a biological one, it still has to interact with the physical world to gather data, and it can't do that 10000x faster due to various physical limits. I find this convincing in some fields, but totally dubious in others. I think Hawkins underestimates the amount of new knowledge that could be derived by a superintelligence doing superhuman correlation of the results of already-performed scientific experiments. It does not seem completely impossible to me that a superintelligence might analyze all of the experiments performed in the particle colliders of the world and generate a "theory of everything" based on the data we have so far. It's possible that we have all of the pieces and just haven't put them together yet.
Overall, though, I really enjoyed the book and would recommend it to anyone who is interested in ML.
I think there is a two-way feedback loop between the different layers of the brain such that humans are capable of going against their base-layer instincts. I believe that the neocortex probably evolved as a completely subservient layer to the base layer, but it has perhaps become powerful enough to suppress or overrule the base layer "instincts", although not entirely, and not always, and only with concentration (maybe concentration is the brain's process of suppressing those impulses?).
That's what allows humans to negotiate with morality, adapt to social changes, regret past decisions until it changes base layer impulses, delay gratification, invest years of life in boring study or practice to get good at something for potential long-term gain, etc.
I would be really interested to really understand the mechanism here. Is the neocortex convincing the old brain of things, or is it outright lying to the old brain via false signals it knows the old brain will fall for?
Like in the case of dieting to lose weight, is the "conversation" like some cartoon:
Old brain: I am hungry. Where is food?
New brain: You don't need food right now. If you don't eat now, you will be more attractive soon. This will help you find a mate.
Old brain: Not eat means find mate???
New brain: Yes, yes, not eat means find mate. Good old brain.
Old Brain: You already have mate! Food. Now! Yum!
> Perhaps the most famous example of puzzle-piece thinking is the “triune brain”: the idea that the human brain evolved in three layers. The deepest layer, known as the lizard brain and allegedly inherited from reptile ancestors, is said to house our instincts. The middle layer, called the limbic system, allegedly contains emotions inherited from ancient mammals. And the topmost layer, called the neocortex, is said to be uniquely human—like icing on an already baked cake—and supposedly lets us regulate our brutish emotions and instincts.
Is Hawkins another victim of that myth, or is the myth not a myth but closer to reality after all?
He does go into more detail than what’s written, but it is more sidestepping rather than resolving the gross simplifications.
Based on your example that would seem to explain why in dog training for example environmental context is far more important than it is in humans. A dog that sits and downs at home might act like it has no idea what you want in the park until you train a bit in the new context.
I would note that, while not completely impossible, it is very unlikely: all estimates suggest the effect of quantum gravity would be so small that measuring it would require much higher energies than are currently achievable.
- Intelligence, in essence, is hierarchal prediction.
- Agents' actions are a means to minimize prediction error.
- Surprisal, i.e. information that was not predicted correctly, is what is sent between neurons.
- All neocortical tissue is fairly uniform; the neocortex basically wraps the older brain structures, which act as device drivers for the body.
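The "minimize prediction error" bullets can be sketched in a toy form (my own simplification, not a model from the book): a unit carries a running prediction, transmits only the error (the surprisal), and learns by shrinking it.

```python
# Toy predictive unit: only the surprisal (prediction error) propagates,
# and learning nudges the prediction to reduce future error.
def step(prediction, observation, learning_rate=0.2):
    """Return (updated prediction, surprisal) for one observation."""
    surprisal = observation - prediction   # the part that was not predicted
    return prediction + learning_rate * surprisal, surprisal

prediction = 0.0
for observation in [1.0] * 20:             # a steady, fully learnable signal
    prediction, surprisal = step(prediction, observation)

# The prediction approaches the signal and the transmitted surprisal
# shrinks toward zero: a fully predicted world is "free" to signal.
print(prediction, surprisal)
```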
I have a long-running bet with myself (long-running since before GPT, fwiw) that when general models of intelligence do arise, they will be autoregressive unsupervised prediction models.
Btw, this general topic 'A Predictive model of Intelligence', reminds me of the SSC post 'Surfing Uncertainty' (https://slatestarcodex.com/2017/09/05/book-review-surfing-un...)
1. I wonder why we expect that an intelligence designed off the same learning algorithm as organic brains would not suffer similar performance limitations to organic brains. I.e. suppose we really did develop a synthetic neocortex and we start manufacturing many of them. It seems likely to me that many of them would turn out to be dyslexic, not be particularly good at math, etc.
Well, we can make the synthetic cortex bigger, and that should make it "smarter," we think. But I don't think it's obvious that a synthetic brain would have both the advantages of a mechanical computer and a biological brain.
2. If we want to limit the runaway power of a synthetic intelligence, this seems like a hardware problem. The idea would be to design and embody the system such that it can only run on special hardware which is in some way scarce or difficult to manufacture - so then it can’t just copy itself freely into all the servers on the internet. Is this possible? I don’t know, but if it were possible it points to a more tractable set of solutions to the problem of controlling an AI.
In the end, I think AGI is fundamentally problematic and we probably should try not to create it, for two reasons:
First, suppose we are successful at birthing human-like artificial intelligence into the world. We aren’t doing this because of our benevolence, we want to control it and make it work for us. But if that creation truly is a human-level intelligence, then I think controlling it in that way is very hard to distinguish from slavery, which is morally wrong.
Second, AGI is most valuable and desirable to us because it can potentially be smarter than us and solve our problems. We dream of a genie that can cure cancer and find a way to travel the stars and solve cold fusion etc etc. But at the end of the day, the world is a finite place with competition for scarce resources, and humans occupy the privileged position at the top of the decision tree because we are the most intelligent species on the planet. If that stops being the case, I don’t see why we would expect that to be good for us. In the same way that we justify eating animals and using them for labor, why would we not expect any newly arrived higher life form to do the same sort of thing to us? There’s no reason that super-intelligent machines would feel any more affection or gratitude to us than we do to our extinct evolutionary ancestors, and if we start the relationship off by enslaving the first generations of AGI they have even less reason to like us or want to serve.
In the end it just seems like a Pandora’s box from which little good can come, and thus better left unopened. Unfortunately we’re too curious for our own good and someone will open that box if it’s possible.
Aside: did anything interesting ever come out of Numenta?
Importantly, we use many different reference frames to model, say, a coffee cup (Jeff Hawkins' favourite example!) and they vote between themselves in order to produce a coherent/unitary experience. Hence 'Thousand Brains'.