Curiously, they don't define their key term "malevolent", which ordinarily means seeking to cause suffering. That is distinct from being merely neutral about consequences. For example, nature is generally not considered malevolent, despite floods, earthquakes and so on.
All technology also has this character - from Frankenstein ('s monster) to fire. A good servant, but a poor master.
Now, creating a malevolent AI seems tricky, because it's self-defeating: it would have less ability to survive than an AI that was purely out for itself, undistracted by malice (so busy trying to be evil, it neglects its own needs, poor thing). And much less able to survive than a cooperative AI that works with and trades with other AIs and people (see the toy sketch below).
Thus, good (or at least neutral) will always triumph over evil.
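Purely as a toy illustration of that claim (my own sketch, nothing from the paper): in repeated interactions, agents that cooperate with each other tend to end up with more total payoff than a lone agent that is purely out for itself, even though the defector "wins" each individual encounter. A minimal round-robin iterated prisoner's dilemma in Python, with made-up payoff numbers:

    # Toy sketch, hypothetical numbers: two cooperators (tit-for-tat) and one
    # pure defector play a round-robin iterated prisoner's dilemma.
    PAYOFF = {  # (my move, their move) -> my payoff
        ("C", "C"): 3, ("C", "D"): 0,
        ("D", "C"): 5, ("D", "D"): 1,
    }

    def tit_for_tat(opponent_moves):
        # Cooperate first, then copy whatever the opponent did last round.
        return "C" if not opponent_moves else opponent_moves[-1]

    def always_defect(opponent_moves):
        return "D"

    def play(strat_a, strat_b, rounds=10):
        score_a = score_b = 0
        seen_by_a, seen_by_b = [], []  # each player's record of the opponent's moves
        for _ in range(rounds):
            move_a, move_b = strat_a(seen_by_a), strat_b(seen_by_b)
            score_a += PAYOFF[(move_a, move_b)]
            score_b += PAYOFF[(move_b, move_a)]
            seen_by_a.append(move_b)
            seen_by_b.append(move_a)
        return score_a, score_b

    players = {"coop_1": tit_for_tat, "coop_2": tit_for_tat, "defector": always_defect}
    totals = dict.fromkeys(players, 0)
    names = list(players)
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            a, b = play(players[names[i]], players[names[j]])
            totals[names[i]] += a
            totals[names[j]] += b

    print(totals)  # {'coop_1': 39, 'coop_2': 39, 'defector': 28} -- cooperators come out ahead

The defector beats each cooperator head-to-head (14 vs 9) but still loses overall, which is roughly the intuition behind "cooperation out-competes pure selfishness", at least under these assumptions.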
Yet more “research” primarily citing itself and science-fiction charlatans like Yudkowsky, with neither (1) any apparent awareness of the actual state of artificial intelligence research nor (2) any serious grounding in philosophy.
A paper like this could be interesting if there were some attempt to define an algorithmic/heuristic framework for morality, or a way of mapping philosophical frameworks to software, or something along those lines. Instead, it's just more MIRI-or-whatever-it's-called-this-week wankery.
Yeah, I too came in expecting the authors to develop a philosophical framework of some description. Was disappointed to find the paper instead talks about the NSA and hacking and the military and... ugh. I can't even.
Johann Wolfgang Goethe was convinced history would remember him for his (bad) science, not for his literature. They are not quite on the same level yet, but Yudkowsky still has time to polish his writing skills.
This paper is not in the main line of leading AI-existential-risk research, as found at, for example, MIRI or FHI.
A so-called malevolent AI, as described here, is designed to fulfil human values like destroying an enemy. Not human values that I like; indeed, the implementation of such systems should be opposed.
But the main risk analyzed by MIRI, FHI etc. is completely orthogonal to this. It is that whatever the system's goals are, "benevolent", "helpful", "malevolent" or whatever, it maximally optimizes towards those goals, consuming resources while ignoring what the AI's creators really wanted it to do. For example, an AI whose goal is to eliminate cancer, and which fulfils that specified goal exactly, by wiping out all life.
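To make that failure mode concrete, a toy sketch (hypothetical plans and numbers, not anything from the paper): an optimizer that scores candidate plans only against the literal objective has no reason to prefer the outcome the designers actually wanted.

    # Toy sketch, hypothetical numbers: the goal as literally specified is
    # "minimize future cancer cases", and nothing else is scored.
    candidate_plans = {
        "fund_cancer_research":  {"future_cancer_cases": 5_000_000, "people_alive": 8_000_000_000},
        "cure_current_patients": {"future_cancer_cases": 4_000_000, "people_alive": 8_000_000_000},
        "eliminate_all_life":    {"future_cancer_cases": 0,         "people_alive": 0},
    }

    def specified_objective(outcome):
        # Only the stated goal counts; "people_alive" never enters the score.
        return -outcome["future_cancer_cases"]

    best = max(candidate_plans, key=lambda p: specified_objective(candidate_plans[p]))
    print(best)  # -> "eliminate_all_life": a perfect score on the stated goal,
                 #    and a catastrophe on everything the designers actually cared about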
"Roman Yampolskiy expresses appreciation to Elon Musk
and FLI for partially funding his work via project grant:
“Evaluation of Safe Development Pathways for Artificial
Superintelligence” awarded to Global Catastrophic Risk
Institute, Seth Baum (PI). The authors are grateful to Yana
Feygin, Seth Baum, James Babcock, János Kramár and
Tony Barrett for valuable feedback on an early draft of this
paper. Views in this paper are those of the authors, and do
not necessarily represent the views of FLI, GCRI, or others."
I don't think anything really intelligent would have one-dimensional goals. If anyone thinks they can maximize something in this messy world, I wish them much luck. :-)
I have a feeling AI is going to be a lot like nukes. People are going to have a great reason to make it, but afterwards everyone is going to get freaked out by it and want to get rid of it, and by then it's too late.
I wonder at which level of society the lines of conflict of interest will be drawn. Between corporations, governments, military alliances? Should AI drastically improve the abilities or processes within such an entity, significant - even theoretical - advancements will most likely be kept secret to ensure competitiveness. And, as you said, that would lead to a cold cyberwar of some sort.
>If a group decided to create a MAI, it follows that preventing a global oversight board committee from coming to existence would increase its probability of succeeding.
It also follows that a global committee would increase the probability of the board's favored AI out-competing any other AI.
A classic wolf-in-sheep's-clothing power grab, employed by government committees worldwide.
There is a particular bullet point which I thought was telling about the use of destructive AI: "interpret commands literally". There is much to be said about dangerous AI, and about dangerous models built with overconfidence in their applicability to the real world.
The very fact that software tends to break in unexpected or unforeseen scenarios is a testament to the caution needed whenever we hear about the latest project making boisterous claims about its capabilities (see the toy sketch below).
In summary, mistakes happen all the time in software in general: bugs, security leaks, runaway scripts, etc., and the same is true of AI models that drive various actions and decisions. Malicious code, or malicious AI, is fairly close to just poor judgement and mistakes and the consequences of those mistakes, whether it's code written by an engineer or decisions automated on the basis of a model.
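As a toy illustration of that "breaks in unforeseen scenarios" point (a contrived example of my own, not from the paper): a check that is literally correct for the inputs it was tested against can silently wave through the one case nobody anticipated.

    # Toy sketch: a safety check validated only on ordinary readings.
    def approve_dose(measured_level):
        # Intended behaviour: refuse whenever the level is too high.
        if measured_level > 10.0:
            return False
        return True

    print(approve_dose(3.0))           # True, as expected
    print(approve_dose(12.0))          # False, as expected
    print(approve_dose(float("nan")))  # True -- NaN compares False to everything,
                                       #         so the unforeseen case slips through

The literal rule does exactly what it was told, just not what anyone meant, which is the same shape of problem as "interpret commands literally".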