I'm not very worried about AI safety. But if I were, it'd be hard to think of groups like OpenAI as working on the same side.
Sandboxing. Is it like this?
Or like this?
Yet such things are done by apparently good and nice people trying to work in the best interests of humanity. So how can we expect to somehow get it right in the case of an AGI without turning it into a resentful demon?
Personally, I think making AI tech broadly available could be a bad idea if AI tech changes warfare so that offense works better than defense: https://news.ycombinator.com/item?id=10721621 (This is already true, but I think it's likely that further technological development will make it even more true.)
If an AGI cannot be independent in its goals, meaning it could potentially harm humans, then it's not really AGI.
Humanity's procreation of a superior intelligence is objectively a good thing. Humans do not sit on any plateau of goodness or intelligence. Like all lifeforms, our goal should be to hand our future to better and better offspring.
As long as this super intelligence leads to super ethics and super consciousness, it should be humanity's greatest accomplishment to be replaced by our God-like progeny.
That is begging the question, in the original sense. The entire question here is how do we do that when there is no a priori reason to assume that a superintelligence will have either of those things, and indeed, most if not all of our best techniques today would certainly not produce those things if they did lead to explosive intelligence.
2. Is super intelligence an inevitable result of creating self-evolving software?
My feeling is that the answer is yes to both, but I agree that these are not settled questions.
No. Evidence: $PERSON_YOU_HATE is very intelligent. Human moral instincts/reasoning may be neuroscientifically simple, having some core mechanism that "unfolds" across sensorimotor datasets. This is very plausible, because we can already see core sensorimotor systems that include interoception, and a core affective system to move the sensorimotor systems along trajectories designated valuable by the interoceptive circuits. These are core mechanisms that operate across hugely hierarchical models that capture datasets across multiple scales of space, time, and variation.
However, none of that is any reason to think our particular combination of affective, interoceptive, and sensorimotor machinery - especially our brain's "bias" towards "mirroring" and other hyper-social reasoning - will be universal to all possible brains.
This especially applies to disembodied "brains" like "artificial intelligences", which, not having a bag of meat to move around, won't have the same kind of reward and interoceptive processing as us at all. This means they won't have anything remotely like our emotional makeup, which means that even with careful reinforcement training of social reasoning, they will not have humanoid motivations, by default.
In every case I can think of, I think the person's lack of intelligence is the reason I dislike them. That they do not understand my ethical problem with them is the problem. Or their calculus is off, which also seems to be a lack of intelligence from my POV.
> ...they will not have humanoid motivations...
That's probably a good thing. We should want them to be better than us, which is probably necessarily unlike us in many ways. The important thing is that they're ethical, even if we're incapable of comprehending or recognizing it.
And, non-tautologically, where do you think the ethical knowledge and motivation come from?
For #2, we're at the very beginning of computing, so calling any technology impossible this early makes no sense.
All intelligence produced to date has had nothing like "morality" from anything I've seen, not even the basic seeds of it we can see in simple animals in the wild. There have even been some hand-wringing articles about AI-based approaches locking minorities out of loans and such, for instance. Certainly the military has not reported any problems to date with their AI research declining to kill people because they have moral problems with it, nor have any of the self-driving car teams reported that their job has been eased by the fact that the self-driving car AIs have spontaneously generated a sense of morality that makes them strive to not hit people.
Heck, the very idea that this would happen sounds downright silly when I say it.
I still say you're basically arguing from incredulity. You can't imagine an intelligence that isn't intrinsically human, therefore they cannot exist. Plenty of the rest of us can imagine intelligences that aren't human. I say there's even some we live with, such as bureaucracies, that are super-intelligences composed of humans that still manage to have inhuman behaviors and pathologies; how much more so an intelligence composed not of humans.
We're more intelligent than chimps, so we understand that monkeys are pretty evolved and suffer a lot when you eat them alive. Chimps aren't kept up at night by the screams of their victims.
A healthy human that tortures other intelligent lifeforms will often suffer severe mental anguish as a result. Only a human with a severe mental disorder can torture other lifeforms without remorse.
So what does that tell us about intelligence? That an intelligent understanding of a monkey's suffering leads to more ethical behavior.
But humans aren't that intelligent. We still let other people do unethical things on our behalf, like raising animals in terrible conditions for meat. The more intelligent (and knowledgeable) a human being, the more likely they are to have a problem with the suffering of factory farmed animals.
Rubbish. We don't empathize because we're intelligent. We empathize because we're empathetic. That's a specific mental capacity, separate from causal inference.
How did you determine the moral purpose of all life? I say our goal should be to hand our future off to our own biological offspring, the members of our species, as evolution has optimized us to do. What makes your proposed purpose more correct than mine?
Your limited primate brain is the reason you want primate offspring. This bias is in our DNA and part of our evolution. It's a limitation that your super AI offspring would not have.
But more generally, of the two ways of seeing it, who's to say which is right? When we disagree, to what can you appeal?
My point is, these moral questions are completely arbitrary. There is no grand cosmic objective function for life that is objectively true. Using these quasi-religious arguments about the "purpose of life" to guide AI safety research seems extremely irresponsible.
Personally, I think super octopuses are the way to go. They can make better use of both land and sea, and all the cool aliens look like that.
It doesn't. Humans happen to have "morality" because we evolved in groups. Groups of humans that cooperated and cared for each other better reproduced more. And so we evolved empathy.
But this is an entirely arbitrary feature. There's no law of the universe that says you must be moral. There's no reason an AI would have anything like our sense of empathy or caring for other beings.
AI works by predicting what actions are the most likely to lead to a goal. Better AIs are better at finding paths to a goal. But the goal itself is always arbitrary. If you made an AI to run a paperclip factory, it would eventually convert the entire mass of the Earth into paperclips.
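The point above (the goal is arbitrary; the search machinery is not) can be made concrete with a toy sketch. Everything here is invented for illustration: a tiny greedy planner and a made-up "factory" world. Note that nothing in the planner itself knows or cares what the goal function rewards.

```python
# Toy sketch of goal-indifferent planning: the same search code
# maximizes whatever objective it is handed, with no notion of
# side effects on anything the objective doesn't mention.

def plan(state, actions, goal_value, horizon=5):
    """Exhaustive lookahead: return the action sequence (up to
    `horizon` steps) that maximizes goal_value of the end state."""
    best, best_score = [], goal_value(state)

    def search(s, seq):
        nonlocal best, best_score
        v = goal_value(s)
        if v > best_score:
            best, best_score = list(seq), v
        if len(seq) >= horizon:
            return
        for name, effect in actions.items():
            search(effect(s), seq + [name])

    search(state, [])
    return best

# A made-up world: the agent can build factories, and each factory
# converts one unit of "other matter" into one paperclip per step.
actions = {
    "build_factory": lambda s: {**s, "factories": s["factories"] + 1},
    "make_clips": lambda s: {**s,
                             "paperclips": s["paperclips"] + s["factories"],
                             "other_matter": s["other_matter"] - s["factories"]},
}
state = {"factories": 0, "paperclips": 0, "other_matter": 100}

# The goal is arbitrary: maximize paperclips, ignore everything else.
steps = plan(state, actions, goal_value=lambda s: s["paperclips"])
```

The planner happily eats "other matter" because the objective never mentions it; giving it a smarter search would only make the conversion more efficient, not more restrained.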
As far as "consciousness", it's likely something similar is true of that. The experience of consciousness is an artifact of how our specific brain structures work. Is it really the most efficient algorithm for intelligence? I doubt it. It's quite possible we could get replaced by an AI that builds and accomplishes great things - and there will be no conscious being left in the universe to appreciate it. A Disneyland with no children, as Bostrom calls it.
Edit: In other words, how could it fail to grasp and obtain our very basic level of consciousness?
"Argument from incredulity", among its many other flaws, has never in human history been a good guide to the future. Many, many things that people would find non-credible (since "incredible" has sort of wandered in meaning in the last century) have even so come to pass. "Look on my works, ye Mighty, and despair!"
I'm making a claim: that an entity with orders of magnitude more intelligence and knowledge would logically have a superset of our brain's functionality.
And that there's nothing magical about our brains. A super intelligence would crack the nut of human consciousness in its first attempt.
If humans ourselves understood human consciousness, we could probably replicate it in software today. This might end up being one of the ways a super intelligence comes into being.
That's a category error, in that it doesn't even answer the question I asked. By what mechanism does an understanding of human consciousness by an AI necessarily and inevitably lead to moral behavior? As opposed to, for instance, using that understanding to accomplish its own immoral (or even merely non-moral) goals? Especially in light of the fact that it is very unlikely that you consider all humans to be moral, and that all such humans all provide existence proofs of intelligences that understand something, but for which that understanding did not necessarily and inevitably lead to moral behavior.
You appear to be proposing the certain (and rather ill-defined) existence of mechanisms that lead to morality that don't even work in humans.
On a similar note, by what mechanisms does an understanding of human consciousness by an AI necessarily and inevitably lead to moral behavior, when that AI is owned by an evil human? What mechanism, precisely, do you expect to save us when President Trump Jr., or Clinton II, or whatever other human you currently believe to be extremely evil, orders the AI to work out the most effective plan to exterminate whoever they consider their enemies this week? Which, since I asked you to fill in whom you consider evil, includes you in the target list. What, exactly, is going to save you in that case? By what mechanism is the AI going to go "Oh, no, not staunch, I can't kill staunch, that would be immoral."
If you don't have an answer to that... and you don't... some caution may be warranted in AI research.
TLDR: consciousness might be training wheels for intelligence and not of terminal value.
>Empirically, it seems like smarter humans are more ethical.
"Seems" deserves some emphasis. Who is more likely to wind up in handcuffs, a smart thief, or a dumb one?
>Same seems true of high intelligence vs low intelligence animals too.
It's not exactly clear what constitutes ethics in animals. Applying typical human ethics: chimpanzees have murdered their social rivals, and dolphins sometimes enjoy tormenting other animals.
The same way a dog might not understand the ethics of a doctor injecting life-saving medicine into it. The doctor is acting super ethically but cannot explain it to the dog.
Past examples of our papers include this: https://blog.openai.com/deep-reinforcement-learning-from-hum... (with the associated system released here:
https://blog.openai.com/gathering_human_feedback/) and this https://arxiv.org/abs/1606.06565 (done while I was still at Google, but in collaboration with OpenAI people right before I joined OpenAI). Based on the current projects going on in my group, I am hopeful we’ll have several more papers out soon.
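The core idea of the first linked paper (learning a reward model from pairwise human preferences over trajectory clips) can be sketched in a few lines. This is a deliberately toy version: linear rewards over hand-made two-dimensional "clip features", a synthetic "human" labeler, and plain SGD on the Bradley-Terry log-likelihood, not the paper's actual architecture or data.

```python
# Toy sketch of reward learning from pairwise preferences.
# Bradley-Terry model: P(a preferred over b) = sigmoid(r(a) - r(b)).
import math
import random

random.seed(0)

def predict_reward(w, features):
    return sum(wi * fi for wi, fi in zip(w, features))

def train(prefs, dim, lr=0.5, epochs=200):
    """prefs: list of (features_a, features_b) where clip a was preferred."""
    w = [0.0] * dim
    for _ in range(epochs):
        for fa, fb in prefs:
            # probability the model assigns to the observed preference
            p = 1.0 / (1.0 + math.exp(predict_reward(w, fb)
                                      - predict_reward(w, fa)))
            # gradient ascent on the log-likelihood
            for i in range(dim):
                w[i] += lr * (1.0 - p) * (fa[i] - fb[i])
    return w

# Synthetic "human" who likes feature 0 and dislikes feature 1.
true_w = [1.0, -1.0]
clips = [[random.random(), random.random()] for _ in range(40)]
prefs = []
for _ in range(100):
    a, b = random.sample(clips, 2)
    if predict_reward(true_w, a) < predict_reward(true_w, b):
        a, b = b, a  # put the preferred clip first
    prefs.append((a, b))

w = train(prefs, dim=2)
```

After training, the learned weights recover the sign structure of the hidden preference, which is the whole trick: the reward function is inferred from comparisons rather than specified by hand.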
Although OpenAI's group ethos has a strong safety bent, there are only three research scientists working on technical safety research full-time, including yourself and a very recent hire. Before this summer, while you focused on policy and preventing arms races, there was only one person focusing solely on technical safety research full-time, despite the hundreds of millions donated for safety research. The team and effort should be larger.
> it’s important for our organization to be on the cutting edge of AI
I agree that OpenAI needs to be at the cutting edge, though always pushing the edge of AI to work on safety is needless when there is a significant backlog of research that can be done in ML (not just in RL). It's true capabilities and safety are intertwined goals, but, to use your analogy, the safety meter is not even a percent full. Topics outside of value learning using trendy Deep RL that OpenAI should pioneer or advance include data poisoning, adversarial and natural distortion robustness, calibration, anomaly and error detection, interpretability, and other topics that are ripe for attack but largely untouched. There is no need to hasten AI development, and doing so does not represent the goals of the EAs or utilitarians who depend on you --- notwithstanding the approval of advising EAs with whom you have significant COIs.
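To show how approachable some of those topics are, here is one published baseline for anomaly and error detection: flag inputs on which a classifier's maximum softmax probability is low. The logits and threshold below are made up for illustration; a real system would tune the threshold on held-out data.

```python
# Maximum-softmax-probability baseline for error/anomaly detection:
# low peak softmax probability suggests an input the model may be
# wrong about, so route it to a human instead of acting on it.
import math

def softmax(logits):
    m = max(logits)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def confidence(logits):
    """Peak softmax probability: a crude 'how sure is the model' score."""
    return max(softmax(logits))

def flag_anomalies(batch_logits, threshold=0.8):
    """Indices of inputs the model is unsure about, for review."""
    return [i for i, lg in enumerate(batch_logits)
            if confidence(lg) < threshold]

# In-distribution inputs tend to produce peaked logits;
# strange inputs tend to produce nearly flat ones.
logits = [
    [9.0, 1.0, 0.5],   # confident
    [2.1, 2.0, 1.9],   # nearly uniform: should be flagged
    [0.1, 7.5, 0.3],   # confident
]
flagged = flag_anomalies(logits)
```

It is a weak detector by modern standards, but it needs no extra training, which is exactly the kind of low-hanging fruit the backlog argument is about.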
OpenAI's safety strategy should be developed openly since, as of now, OpenAI has no open dialogue with even the EA community.
For me, this is the key motivating point - the horse may have left the barn by the time we act. People often say this is exaggeration, but "Weapons of Math Destruction" is a nice read on the unintended side effects of this phenomenon.