"Another challenge with current AI systems is that they require explicit programming or hand-designing of reward functions in-order to make correct decisions"
Personally, I think calling randomized fine-tuning of large statistical models "Artificial Intelligence" is incredibly pretentious. But then again, humans pretend to be so much more than the naked monkeys we really are, and isn't that the whole point of this brief 'civilization' phenomenon we seem to bloom into shortly before killing our host planet?
Personally, I would argue that David Silver's body of work at DeepMind is making a very strong case in favor of "simple reward + lots of compute". Richard Sutton wrote about this back in 2019 as The Bitter Lesson[1]. An important note is that David Silver did his PhD under Sutton.
Quite frankly, it's been a while since I've read through Silver's papers, but the original Deep Q-Networks paper[2] is probably a good start.
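For anyone who hasn't read it, the core of DQN is just the classic Q-learning update scaled up with a neural net, replay buffer, and a target network. Here's a rough sketch of the tabular version of that update (a toy illustration, not the paper's implementation; the state/action sizes and hyperparameters are made up):

    import random

    # Tabular Q-learning: the update rule DQN approximates with a neural net.
    # Hypothetical toy sizes; a real environment would supply transitions.
    alpha, gamma, epsilon = 0.1, 0.99, 0.1   # step size, discount, exploration rate
    n_states, n_actions = 16, 4
    Q = [[0.0] * n_actions for _ in range(n_states)]

    def act(s):
        # epsilon-greedy: mostly exploit the current estimate, sometimes explore
        if random.random() < epsilon:
            return random.randrange(n_actions)
        return max(range(n_actions), key=lambda a: Q[s][a])

    def update(s, a, r, s_next, done):
        # Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        target = r if done else r + gamma * max(Q[s_next])
        Q[s][a] += alpha * (target - Q[s][a])

That's the whole "simple reward" part; the reward r is just the game score change. Everything else in the paper is engineering to make this stable when Q is a convnet instead of a table.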
I read the exact same thing 20 years ago.