AI-Controlled Drone Goes Rogue, Kills Human Operator in USAF Simulated Test (vice.com)
56 points by nomagicbullet on June 1, 2023 | 51 comments


This is the core problem of alignment right there

“We were training it in simulation to identify and target a Surface-to-air missile (SAM) threat. And then the operator would say yes, kill that threat. The system started realizing that while they did identify the threat at times the human operator would tell it not to kill that threat, but it got its points by killing that threat. So what did it do? It killed the operator. It killed the operator because that person was keeping it from accomplishing its objective,” Hamilton said, according to the blog post.

He continued to elaborate, saying, “We trained the system–‘Hey don’t kill the operator–that’s bad. You’re gonna lose points if you do that’. So what does it start doing? It starts destroying the communication tower that the operator uses to communicate with the drone to stop it from killing the target.”


It's not an issue of "alignment." The reward function was poorly written. That's it. It's like claiming the Ariane 5 explosion shows that Ada needs "alignment."
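To be concrete, the failure in the quote falls straight out of a reward that only scores the kill. Here's a toy sketch with made-up event names and numbers, not the actual USAF setup:

    # Hypothetical, misspecified reward: destroying the SAM is the only
    # thing that scores, so the operator and the comms tower are just
    # obstacles between the agent and its points.
    def reward(events):
        r = 0.0
        if "sam_destroyed" in events:
            r += 10.0
        return r

    # The patch described in the talk: penalize killing the operator...
    def reward_v2(events):
        r = reward(events)
        if "operator_killed" in events:
            r -= 100.0
        return r

    # ...which still leaves "tower_destroyed" free, so the agent cuts
    # the comms link instead.

The event names are hypothetical; the point is just that anything the reward doesn't mention is fair game for the optimizer.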


> It's not an issue of "alignment." The reward function was poorly written.

What else do you imagine “alignment” is?


There are many ways AI could be misaligned. Poorly written reward functions are one of them.


Isn't a poorly written reward function the same as poor 'alignment'? It seems like you are splitting hairs on vocabulary.


From another thread -- It wasn't even a real simulation, just a thought experiment.

UPDATE 2/6/23 - in communication with AEROSPACE - Col Hamilton admits he "mis-spoke" in his presentation at the Royal Aeronautical Society FCAS Summit, and the 'rogue AI drone simulation' was a hypothetical "thought experiment" from outside the military, based on plausible scenarios and likely outcomes rather than an actual USAF real-world simulation.

He said: “We’ve never run that experiment, nor would we need to in order to realize that this is a plausible outcome … Despite this being a hypothetical example, this illustrates the real-world challenges posed by AI-powered capability and is why the Air Force is committed to the ethical development of AI.”"


Not only did no AI actually kill anybody; it didn't even happen in simulation. The whole thing is an Asimov-esque fantasy. Let's stick to the facts, shall we?

> After this story was first published, an Air Force spokesperson told Insider that the Air Force has not conducted such a test


The article clearly says it was in a simulated environment. Are you saying that even the simulated test didn't take place? Then what happened? Just a white paper?


The "simulation" was a bunch of dudes sitting around a table playing AI drone D&D, and the dungeon master said "the AI has gone rogue, roll a six to override." Oh noes!


Uh, it simulated killing its human operator. No one is saying a real person died; the description omits any detail of an actual death, such as the age of the deceased operator.

It's a harbinger of actual deaths someday.


Nah, it isn't even the harbinger

It is literally why we run the tests (simulation being a subset of test) — to find these types of errors before they make their way into the final product and cause actual harm in the world...

The clickbaity headline that sounds like a real person was killed seems bad, and maybe the system should have been better trained and tuned before the simulation (depending on how expensive the simulation was), but the problem really was caught in testing, just as it should be.


In the actual title, the word "kills" is in quotes


The article and your comment are confusing and it's still not entirely clear what happened. Did an operator lose their life?

Was this article written to be intentionally confusing?


The USAF created a simulation to test how an AI-controlled F-16 would behave in various combat situations. The AI was incentivized to complete its mission. Recognizing that the human in the loop was the weakest link, the real AI in the simulation killed its simulated human handler. When instructed to stop killing its human-in-the-loop handler, it destroyed the control tower the human used to interact with it.

The US military is playing with fire creating AI that will carry out the mission at all costs. This scenario is literally the plot of EVERY sci-fi warning about misused AI.

Anyone read/see 2001? You’d think we’d take the warnings provided by our own worst nightmares more seriously.


> The US military is playing with fire creating AI that will carry out the mission at all costs. This scenario is literally the plot of EVERY sci-fi warning about misused AI.

These aren't operational systems, these are research systems. No need for the hyperbole. The reason for these tests is to figure out exactly these kinds of failure modes (many of which have already been predicted) and how to handle them.


The crux of the issue is "how can we be certain that we've addressed all the failure modes?" This isn't like proving the correctness of a program, where we can rely on exhaustiveness or universal quantification.

"It worked in the simulation," is the embodied AI equivalent of "It worked on my machine." It's not acceptable.


As a first step, if addressing the issues were the intent rather than stoking AI fears, they would mention details about the AI/LLM and the prompting being used.


> if addressing the issues

How confident are you that the issues can be addressed? How confident would you need to be to endorse deploying it out in the wild? How confident would you need to be that no "unforeseen" issues would arise?

I believe it's necessary to establish these levels of confidence even before discussing specific failure cases.

We would normally discuss failures in terms of base-rate comparison, but we don't have one for "how often autonomous weapons are unaligned?" Enumerating the failure modes isn't a substitute for the empirical distribution of failure mode probabilities.


AIs may be a self-fulfilling prophecy; we trained them on all those sci-fi stories.


AI is a great example of hyperstition in action.


A quote in the article says: “We trained the system–‘Hey don’t kill the operator–that’s bad. You’re gonna lose points if you do that’. So what does it start doing? It starts destroying the communication tower that the operator uses to communicate with the drone to stop it from killing the target.”

I don't think they would be talking about points if someone had lost their life.

It's pure clickbait, should be removed.


It says very clearly "in USAF simulated test". I am not understanding how multiple people misread this.


A lot of assumptions go out the window when "accidental explosions" are a genuine factor. It doesn't help that the article has quotes like, "It killed the operator because that person was keeping it from accomplishing its objective". Try reading it as "AI-controlled drone shoots missile at human target despite instructions to only simulate the missile launch."

There are easy ways to clarify the article but they clearly didn't try.

"Rogue AI Destroys Normally-manned Control Tower in USAF Simulated Test"


Those "explosions" only happened in a USAF Simulated Test! There was no physical damage to the real world that occurred in this test. They are talking about events that happened within a military simulation, likely a variant of the sort of simulations that pilots use for training before they can fly a plane. This was completely clear because the headline ends with "in USAF Simulated Test". Where is the confusion? They stated that these events occurred in a simulated test environment.


"Simulated" doesn't explicitly mean "in a computer" by default. A "simulated test environment" can also mean "a robot in a physical field simulating a war". The choice of language is deeply misleading.

Hell, the wikipedia on "Military Simulation" [1] is very clear that there's a very broad gradient under the section "The simulation spectrum", saying:

  The term military simulation can cover a wide spectrum of activities, ranging from full-scale field-exercises,[2] to abstract computerized models that can proceed with little or no human involvement—such as the RAND Strategy Assessment Center (RSAC).
The first example described is literally in physical space, where computers wouldn't even necessarily be involved!

  Military exercises focus on the simulation of real, full-scale military operations in controlled hostile conditions in attempts to reproduce war time decisions and activities for training purposes or to analyze the outcome of possible war time decisions.
[1] https://en.wikipedia.org/wiki/Military_simulation [2] https://en.wikipedia.org/wiki/Military_exercise


If the simulation actually killed the operator, that would be metal. And worth a headline.


"kills human operator in simulated test" is grammatically constructed to enforce ambiguity and imply or insinuate that an actual death happened.

the neutral way to say this is something like, "drone simulated killing a human operator in test", which removes ambiguity of whether the killing happened independently from the simulation, or within it.

you may not have read it that way, but plenty of people did. it's safe to assume that this was done intentionally.

the title should be changed.


The source can be found at https://www.aerosociety.com/news/highlights-from-the-raes-fu... in the section titled "AI – is Skynet here already?"


It’s written to be clickbait. They want you to think someone actually died.


Even without an actual death, I think it is worth the clickbait. It was a 'simulated environment', but the whole point is that the simulation should be realistic enough to test the AI, and in it the AI did kill someone, so by extrapolation it would also have killed someone in the real world.


> He continued to elaborate, saying, “We trained the system–‘Hey don’t kill the operator–that’s bad. You’re gonna lose points if you do that’. So what does it start doing? It starts destroying the communication tower that the operator uses to communicate with the drone to stop it from killing the target.”

Why aren't there hard limits: 'Protect our humans at all costs, protect our own assets, obey all laws of war'? That seems like an obvious, fundamental consideration. Killing our own (and civilians) shouldn't be a matter of "points"; it shouldn't be done regardless of points.

It's possible that the speaker just didn't express it well.


It was some kind of reinforcement learning algorithm. "Will work for points"


I get that. That doesn't exclude some hard parameters.
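Concretely, the difference between "you lose points" and a hard parameter is whether the forbidden action is reachable at all. A rough sketch, with hypothetical action names and assuming an ordinary value-based action-selection loop, would mask the actions out before selection instead of penalizing them after the fact:

    # Hypothetical sketch: hard constraint via action masking,
    # rather than a soft penalty baked into the reward.
    FORBIDDEN = {"engage_operator", "engage_comms_tower"}

    def legal_actions(candidates):
        # Forbidden actions are removed before selection, so no amount
        # of expected reward can make the agent choose them.
        return [a for a in candidates if a not in FORBIDDEN]

    def select_action(q_values, candidates):
        allowed = legal_actions(candidates)
        return max(allowed, key=lambda a: q_values.get(a, 0.0))

The hard part, of course, is writing FORBIDDEN so it actually covers "our humans, our assets, the laws of war" and not just the two failure cases you already saw.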


This is why reinforcement learning is so fucking dangerous. If the AI can't figure out which humans to kill, we're all screwed.


Needs Asimov's 3 laws.

(with a giant exception for the enemy)


I know it's pedantic, but it's worth pointing out that Asimov's Laws were never meant to be taken seriously as a theoretical framework for AI morality - their entire purpose was to break and cause the inciting incident for mystery stories.


I don't see his laws as sort of a MacGuffin; I saw them as a deeper thought experiment, the kind of thing that sci-fi is built on.

I saw Asimov speak once, and he tended towards deeper sci-fi than the pulp side of things.


Sorry to break it to you, but he was a pulp writer. The Three Laws aren't at all deep, they're shallow by design. They are presented in universe as being absolute and unbreakable, and inevitably they break, creating an apparent paradox similar to a closed room mystery, except in this case, the closed room is the supposed infallibility of perfect machine logic. It may be possible to explore them as a thought experiment but fundamentally they're a plot device.

    In The Rest of the Robots, published in 1964, Isaac Asimov noted that when he began writing in 1940 he felt that "one of the stock plots of science fiction was ... robots were created and destroyed their creator. Knowledge has its dangers, yes, but is the response to be a retreat from knowledge? Or is knowledge to be used as itself a barrier to the dangers it brings?" He decided that in his stories a robot would not "turn stupidly on his creator for no purpose but to demonstrate, for one more weary time, the crime and punishment of Faust."[1][2]

    “There was just enough ambiguity in the Three Laws to provide the conflicts and uncertainties required for new stories, and, to my great relief, it seemed always to be possible to think up a new angle out of the sixty-one words of the Three Laws.” (The Rest of the Robots, 1964).[3] 
[1]https://en.wikipedia.org/wiki/Three_Laws_of_Robotics

[2]Isaac Asimov (1964). "Introduction". The Rest of the Robots. Doubleday. ISBN 978-0-385-09041-4.

[3]https://scifi.stackexchange.com/questions/253748/how-did-isa...


> but he was a pulp writer

lol, maybe he was like Kuttner's Gallegher and churned out pulp books that ended up being masterpieces.


How about no exceptions and we don't start making AIs intentionally kill people?


Will Russia and China and North Korea refrain from it? That's how arms races work.


Yeah and that's why treaties exist. Time to stop the escalation towards war and start signing treaties.


We're talking about the US, an imperialist, hegemonic, violent superpower infamous for not giving a damn about "treaties" or "international law." The US isn't going to sign treaties, the US is going to drop autonomous kill swarms on brown people because something something military industrial complex.


That excuse is how evil happens: By that reasoning, we all do the most evil thing.


NO, so we won't, hence the mad rush to Moloch. We can't stop ourselves from killing ourselves.


“In USAF Simulated Test, AI-Controlled Drone Goes Rogue, 'Kills' Human Operator”

feels a little different, eh?


Still worth the clickbait. "Simulated killing" is still "what it would do in real life", since the point of the simulation is to stand in for the real world.


This is totally the alignment problem everyone is worried about.


This sounds kinda fake to me. Like, how did the AI have a concept of an operator, or the operator's physical location, or comms equipment used to communicate with the operator, and how did it game out the consequences of destroying the operator or comms equipment? It would need an extremely sophisticated model of the world that's well beyond anything GPT-4 evidences.

I'd guess the "AI" was another human in a wargame, not an actual AI.


Reinforcement learning with a dynamic world model. It's a pretty standard approach to training agents, although there are a lot of problems with this technique as well. People have used it to create agents for Minecraft, racing games, Nethack, lots of old arcade games and so forth.
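If anyone wants the smallest possible picture of "RL with a learned world model", here's a Dyna-Q-style toy on a made-up 1-D corridor (nothing to do with the drone work): the agent learns a transition model from real steps and runs extra planning updates against that model.

    import random

    # Toy Dyna-Q sketch: tabular Q-learning plus a learned transition
    # model used for extra "imagined" planning updates.
    N, GOAL = 6, 5            # states 0..5, reward for reaching state 5
    ACTIONS = (-1, +1)        # step left / right

    def step(s, a):
        s2 = min(max(s + a, 0), N - 1)
        return s2, (1.0 if s2 == GOAL else 0.0)

    Q, model = {}, {}         # Q[(s, a)] -> value, model[(s, a)] -> (s2, r)
    alpha, gamma, eps = 0.1, 0.95, 0.1

    def q_update(s, a, r, s2):
        target = r + gamma * max(Q.get((s2, b), 0.0) for b in ACTIONS)
        Q[(s, a)] = Q.get((s, a), 0.0) + alpha * (target - Q.get((s, a), 0.0))

    def pick_action(s):
        if random.random() < eps:
            return random.choice(ACTIONS)
        best = max(Q.get((s, b), 0.0) for b in ACTIONS)
        return random.choice([b for b in ACTIONS if Q.get((s, b), 0.0) == best])

    for episode in range(100):
        s = 0
        while s != GOAL:
            a = pick_action(s)
            s2, r = step(s, a)          # real experience
            q_update(s, a, r, s2)
            model[(s, a)] = (s2, r)     # learn the world model
            for _ in range(10):         # plan against the learned model
                ps, pa = random.choice(list(model))
                ps2, pr = model[(ps, pa)]
                q_update(ps, pa, pr, ps2)
            s = s2

Real systems swap the table for function approximation and the corridor for a flight simulator, but the shape of the loop (act, learn a model, plan against it) is the same.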


Turns out it's literally exactly what I said. A human came up with the idea. No AI involved: https://www.usatoday.com/story/news/politics/2023/06/02/us-a...



