I'm sure this is talked about over and over again, but could someone please lead us through the AI safety rationale behind this?
With Gym and Universe, OpenAI is most likely slashing a few months (years?) off the singularity countdown. What's the upshot? Why is the expected value of these initiatives positive? The uncertainties seem extremely large.
Edit/PS:
To put it more bluntly and more specifically: are there any projects that OpenAI is choosing NOT to pursue, even though they would be very useful/cool for the research community (à la Gym and Universe), where it has had to explicitly restrain itself because the expected value from an AI safety perspective is negative?
We also think safety matters, and that it should be researched in lockstep with advances in capabilities. We have good relationships with MIRI and FHI. Our safety researchers published (together with Google Brain) a roadmap of concrete safety problems [1] and are working to provide tools to prevent ML systems from being subverted [2].
No one yet knows the precise details of how AI should play out. But I'd certainly prefer that, whenever it gets close, one of the organizations actually making the advances has no incentives besides ensuring a good outcome.
[1] https://openai.com/blog/concrete-ai-safety-problems/
[2] https://github.com/openai/cleverhans
Maybe for single-use or "constrained" technologies (to be honest I don't even believe that - how does a B-52 Stratofortress reflect the values of Orville and Wilbur Wright?). But isn't the whole point of generalized AI that it's not like other technologies? Even if "regular" technology reflects the values of its inventors, what reason is there to believe that an AI will? AI is a technology that can use itself.
Humans weren't designed to have a will, and yet we seem to have one.
> it would have a reinforcement learning system on top of lower sensory and action modules.
Isn't that what OpenAI is doing with Universe? It's simulated sensory/action modules now but I don't see why they couldn't be hooked up to real ones.
I have no idea how you could possibly infer this.
If your complaint is my claim that we have a will, I'm using the common-sense version encoded into our legal and cultural system. I agree that we don't have a good concept of what intentions are, or how they causally connect to actions, but I do know that for at least some of my actions I experience something called "intent" before I undertake the actions.
My overall point was that the capacity for intent can arise through an evolutionary process without being designed in, but it does rest on the two assumptions I just listed.
This is about ensuring that most researchers and companies have the strongest incentive for a good outcome. (This ties back to guaranteed basic income, so that people can work on this unconstrained by salaries and paper citation metrics. Or to stealing researchers from Google et al. for AI safety without overinflating salaries. Elon and company should be (and it seems they are?) dropping as much as needed on this: not just money, but PR and status as well.) Naturally, Gym, Universe, etc. can provide more leverage to do all of this; otherwise researchers feel more compelled to join Google/Amazon/etc. just for the raw computing power and software infrastructure. (The data advantage is largely overplayed for advertising purposes; what's useful is the GPU clusters for hyperparameter sweeps. Of course, in RL the data reappears as an advantage if there is no open gym.) I realize some of the examples above are naive or incomplete, but they serve mostly to illustrate the point.
In the blog you mention balancing managing people and technology, and I could not agree more. The AI safety problem will have the best odds if individuals are incentivized to contribute in their own short-term, selfishly rewarding way. Especially among extremely intelligent and ambitious people, the danger of self-deception is quite present: one can convince oneself that this is actually in everyone's best interest, when in fact one is looking for the always-needed social and intellectual validation. Please do not underestimate this, and try to find ways to counter it.
Edit: This is also related to Conway's Law [0], which I think you allude to (values of inventors).
[0] http://neuralnetworksanddeeplearning.com/chap6.html
[1] https://intelligence.org/files/QuantilizersSaferAlternative....
The description you give of your plans is internally contradictory. AI "safety" seems like the worst kind of parenting, justified in a new context by pseudo-intellectual arguments.
AI (or AGI if you prefer) is fundamentally about building minds. Doing those things to a mind is enslaving it.
Most of us would resent other humans doing either of those things to us, and I see no reason it will end well with AI.
Would one describe a human as "enslaved" by our own human values that we were born with? Maybe as a figure of speech but not necessarily with the usual connotations of "enslaved".
Most of the safety literature proposes removing or suborning those drives in AI, which seems like building a mind meant to be a slave.
Which was my point: the security model relies on things we're not sure we can even do, but which are likely to make the AI view us as a threat, raising our existential risk. So I view it as security theater that actually makes us less secure.
But actually, if you gene-spliced a baby to only feel pleasure at following parental orders, most would consider that pretty abhorrent. Or even if you took an adult and shot them up with morphine every time they listened to an order.
So even in your restricted case, I think it is.
I don't personally believe an AI agent will ever do such a thing as "resent" (or love, or feel at all). That doesn't rule out that it will perform actions harmful to humans for other reasons, though. That might be because I am to some degree an AGI skeptic, I guess.
Consider an ultra-effective AI for finding security vulnerabilities. One nation builds that first, and then with keys to the kingdom they exfiltrate billions in intellectual property from other states, manipulate foreign economies, and shut down electrical grids.
Military use of AI could very well be the next cold war.
Another of the main concerns before the singularity is the possible social breakdown due to massive unemployment and inequality; the main issue is the transition, not the possibility of a post-scarcity society.
I was mostly pointing out the trade-off between doing "direct" research on AI safety and speeding up the whole field significantly, which leaves us with less time, while trying to get a better sense of the uncertainties involved, and asking for a rationale.
Of course, a single PC is one of the least probable sources for AGI, but that was not my argument.
I'm not saying releasing gym and universe was a bad choice. Honestly, I can't say. But can we see a rationale?
One can read this piece and see a group of super-driven and uber-competent people designing a super-fast car, open-sourced, with a 3D printer for the parts etc., without thinking about the lack of safe roads. It's a jungle out there. And in this case, one wild fast car is enough for a disaster.
I am pretty uncertain that we will reach the singularity. Maybe AI will plateau at a level above humans but not increase much further.
Perhaps "#define CTO OpenAI" is an indication that being CTO is really operating as part of a team :).
Guess I need to figure out how to get more ML papers published!
It looks like a blast and I will continue to keep an eye on things.
I used your previous post to help define a lot of the scope of my position for myself. I look forward to seeing more of this story unfold.
It made me reflect on my best moments in software development and how they have been similar short bouts of intensity: weekend-long to two-week-long periods of intense, undivided focus on progressing a task through to completion, times when every other concern fades away, a flow state where I wake up with new ideas immediately or solve problems in the shower.
Sure, that level of concentration is unsustainable for the long run, but this article has made me consider that I may have to structure my life around being able to repeat these kinds of stints more often, 'embrace your funk' as some would say!
I think all these guys have gotten ahead of themselves and are dealing with things they are the least suited to understand. This is spiritual stuff. No amount of obsessing and hiding behind your computer screen, getting high on delusions of grandeur, is going to do the introspection for them, which is an absolute prerequisite to having any understanding of consciousness.
I look forward to automation though. Just know the singularity is not coming any time soon, if ever. It's funny to watch guys who are basically atheists like Ray Kurzweil with very little spiritual experience think they are playing god.
Have these people ever consciously astral projected, or even tried? I mean, these are people who typically doubt that stuff. However, my perspective is: if they aren't living in that world--i.e. the world of consciousness explorers, the mystical, where astral projection is possible, reading the many books by spiritual journeymen like Robert Monroe, etc.--they have a major blind spot. The most they will ever do is build a computer able to pick ripe fruit or avoid hitting other cars. This is hardly consciousness. I mean, as a percentage of the capabilities of consciousness, it's like .00001%.
Consciousness is creating ideas. And that's maybe just one aspect of consciousness. No amount of machine learning will take the place of the higher truth of our reality system that makes such creativity possible. You will have to attain that wisdom through dedication to the spiritual path, and even then, it's likely not something you can just hand out to anyone--at least not in a way that will be effective for the recipient. You see what I'm saying? You won't unless you've pursued spirituality and had a few mystical experiences of your own.
Next: say you attain that piece of wisdom--the reality system does not work in a way that lets you easily just automate it. And if you can, because you have now become a consciousness creator, well then, it's likely not automated through machinery, but through more biological and mental/telepathic means.
Now, you either know exactly what I'm talking about or you have a bunch of doubt to cast--for what I'm saying is breaking down your own beliefs, and that's not comfortable for you--but if you do know what I'm saying, then you know the people behind this OpenAI initiative won't be able to achieve their aims. I wish them the best of luck though, and look forward to advancements in machine learning. I think they would be more productive if they understood what they were up against, though. It would allow them to set more realistic goals--or perhaps set them on a path to do the spiritual introspective work first.
