That’s the alignment problem. We don’t know what the actual goals of a trained neural net are. We know what criteria we trained it against, but it turns out that’s not at all the same thing.
I highly recommend Rob Miles’s channel on YouTube. Here’s a good one, but they’re all fascinating. It turns out training an AI to have the actual goals we want it to have is fiendishly difficult.