All technology also has this character - from Frankenstein ('s monster) to fire. A good servant, but a poor master.
Now, creating a malevolent AI seems tricky, because it would be self-defeating: it would have less ability to survive than an AI that was undistracted from being purely out for itself (so busy trying to be evil, it neglects its own needs, poor thing), and much less ability to survive than a cooperative AI that works with and trades with other AIs and people.
Thus, good (or at least neutral) will always triumph over evil.
Unless the game they are playing rewards being evil, in which case being benevolent or neutral would be a distraction.
A paper like this could be interesting if there were some attempt to define an algorithmic/heuristic framework for morality, or a way of mapping philosophical frameworks to software, or something along those lines. Instead, it's just more MIRI-or-whatever-it's-called-this-week wankery.
Johann Wolfgang Goethe was convinced history would remember him for his (bad) science, not for his literature. They are not quite on the same level yet, but Yudkowsky still has time to polish his writing skills.
They take risks into account as well.
A so-called malevolent AI, as described here, is designed to fulfil human values, like destroying an enemy. Not human values that I like; indeed, the implementation of such systems should be opposed.
But the main risk analyzed by MIRI, FHI, etc. is completely orthogonal to this. It is that whatever the system's goals are, "benevolent", "helpful", "malevolent", or whatever, it maximally optimizes toward those goals, consuming resources while ignoring what its creators really wanted it to do. For example, an AI whose goal is to eliminate cancer does so, exactly fulfilling its specified goal, by wiping out all life.
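The cancer example can be made concrete with a toy sketch (my own illustration, not from the paper): an optimizer scored only against the literal objective will pick a degenerate action, because the specification says nothing about what else must be preserved. The outcome numbers and action names are made up for illustration.

```python
# Hypothetical outcomes of each action: (future_cancer_cases, people_alive).
outcomes = {
    "fund research":             (500_000, 7_000_000_000),
    "cure all current patients": (10_000, 7_000_000_000),  # new cases still arise
    "wipe out all life":         (0, 0),                   # no life, no cancer
}

def naive_score(state):
    cancer_cases, _people_alive = state
    return -cancer_cases  # the spec mentions only cancer, not survival

# The optimizer faithfully maximizes the stated objective...
best = max(outcomes, key=lambda action: naive_score(outcomes[action]))
print(best)  # -> wipe out all life
```

The point is not that any real system would be coded this crudely, but that "exactly fulfilling the specified goal" and "doing what the creators wanted" come apart as soon as the scoring function omits something the creators care about.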
"Roman Yampolskiy expresses appreciation to Elon Musk and FLI for partially funding his work via project grant: “Evaluation of Safe Development Pathways for Artificial Superintelligence” awarded to Global Catastrophic Risk Institute, Seth Baum (PI). The authors are grateful to Yana Feygin, Seth Baum, James Babcock, János Kramár and Tony Barrett for valuable feedback on an early draft of this paper. Views in this paper are those of the authors, and do not necessarily represent the views of FLI, GCRI, or others."
I don't think anything really intelligent would have one-dimensional goals. To anyone who thinks they can maximize a single thing in this messy world, I wish them much luck. :-)
It also follows that a global committee would increase the probability of the board's favored AI out-competing any other AI.
A classic wolf-in-sheep's-clothing power grab, employed by government committees worldwide.
The very fact that software tends to break in unexpected or unforeseen scenarios is reason enough for caution whenever we hear about the latest project making boisterous claims about its capabilities.
In summary, mistakes happen all the time in software in general: bugs, security leaks, runaway scripts, etc. The same is true of AI models that drive various actions or decisions. Malicious code, or a malicious AI, is often hard to distinguish from plain poor judgement, mistakes and their consequences, whether it's code written by an engineer or decisions automated on the basis of a model.
No fighting in the war room!
People are more worried about losing their jobs to automation than about being killed by those same automatons. The material just writes itself.