Exploring this topic is currently my primary hobby. Specifically, I've been using OpenAI's Gym Retro (Sonic, Contra, Mario, Donkey Kong and, more recently, F-Zero) and comparing the ancient NEAT with more fashionable stuff like DQN, PPO, A3C and DDPG.
In my extremely limited experience, NEAT seems to outperform all of these other algorithms. I believe the advantage is its potential for strange/novel network structures.
And the best part is that NEAT doesn't require a powerful GPU.
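For anyone curious, the wiring is simpler than it sounds. A minimal sketch using neat-python with Gym Retro looks roughly like this (assuming a standard NEAT config file named "neat-config" whose num_inputs/num_outputs match the downsampled frame and the 12-button Genesis pad; the downsampling factor and generation count are arbitrary choices, not tuned values):

    import retro
    import neat

    env = retro.make(game="SonicTheHedgehog-Genesis", state="GreenHillZone.Act1")

    def eval_genomes(genomes, config):
        for genome_id, genome in genomes:
            net = neat.nn.FeedForwardNetwork.create(genome, config)
            obs = env.reset()
            genome.fitness, done = 0.0, False
            while not done:
                # Crude downsample: grayscale, then every 8th pixel, so the
                # input layer stays manageable (224x320 -> 28x40 = 1120 inputs).
                gray = obs.mean(axis=2)[::8, ::8].flatten() / 255.0
                # One network output per button on the Genesis pad.
                action = [1 if o > 0.5 else 0 for o in net.activate(gray)]
                obs, reward, done, info = env.step(action)
                genome.fitness += reward

    config = neat.Config(neat.DefaultGenome, neat.DefaultReproduction,
                         neat.DefaultSpeciesSet, neat.DefaultStagnation,
                         "neat-config")
    pop = neat.Population(config)
    pop.add_reporter(neat.StdOutReporter(True))
    winner = pop.run(eval_genomes, 50)

The fitness function is just the game's default reward summed over an episode; in practice you'd shape it (e.g. horizontal progress for Sonic), but that's per-game tuning.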
Apologies for the shameless plug, but here's a link to a series I made on YouTube about using Retro and NEAT together to play Sonic. https://www.youtube.com/watch?v=pClGmU1JEsM&list=PLTWFMbPFsv...
(Edit) One of the funniest parts I remember is that they had to leave it running on a Pentium III for like a month or something.
My PhD thesis in 2000 already used genetic algorithms for seeds, and it was hardly new then.
Much of recent "AI" development very often boils down to finding a local extremum using some sort of ski-down-the-slope optimization procedure (aka "training"). These techniques very rarely tackle global optimization, and when they do, they bundle it up under the moniker "hyper-parameter tuning".
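To make that distinction concrete, here's a toy comparison on the multimodal Rastrigin function (my choice of test function and hyper-parameters, purely for illustration): plain gradient descent slides into whichever local minimum is nearest, while even a crude mutation-only GA samples broadly enough to usually land near the global optimum at 0.

    import numpy as np

    rng = np.random.default_rng(0)

    def rastrigin(x):
        return 10 * len(x) + np.sum(x**2 - 10 * np.cos(2 * np.pi * x))

    # "Ski down the slope": plain gradient descent from a random start.
    def descend(x, lr=0.002, steps=2000):
        for _ in range(steps):
            x = x - lr * (2 * x + 20 * np.pi * np.sin(2 * np.pi * x))
        return x

    # Crude mutation-only GA with truncation selection.
    def ga(pop_size=50, gens=200, dim=2):
        pop = rng.uniform(-5.12, 5.12, size=(pop_size, dim))
        for _ in range(gens):
            fit = np.array([rastrigin(p) for p in pop])
            parents = pop[np.argsort(fit)[: pop_size // 2]]
            pop = np.vstack([parents, parents + rng.normal(0, 0.3, parents.shape)])
        return min(pop, key=rastrigin)

    x0 = rng.uniform(-5.12, 5.12, size=2)
    print(rastrigin(descend(x0)))  # typically stuck in a local minimum
    print(rastrigin(ga()))         # typically close to the global optimum, 0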
A good example of something that falls under "global optimization" and isn't often tackled in deep learning would be finding the correct deep net architecture for a given problem.
The problem doesn't lend itself very well to local optimization, but might yield to GA-type optimization.
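A sketch of what that could look like, with everything hypothetical: the chromosome encodes only a list of layer widths, and `fitness` is a toy placeholder where a real run would build the candidate net, train it briefly, and return validation accuracy.

    import random

    random.seed(0)
    WIDTHS = [16, 32, 64, 128, 256]

    def random_genome():
        return [random.choice(WIDTHS) for _ in range(random.randint(1, 5))]

    def fitness(genome):
        # Placeholder score: in practice, train the architecture briefly
        # and return validation accuracy instead.
        return sum(1.0 / (1 + abs(w - 64) / 64) for w in genome) - 0.1 * len(genome)

    def crossover(a, b):
        cut_a, cut_b = random.randint(0, len(a)), random.randint(0, len(b))
        return (a[:cut_a] + b[cut_b:]) or [64]  # never return an empty net

    def mutate(genome):
        g = [w if random.random() > 0.2 else random.choice(WIDTHS) for w in genome]
        if random.random() < 0.1:
            g.append(random.choice(WIDTHS))  # occasionally grow a layer
        return g

    pop = [random_genome() for _ in range(20)]
    for gen in range(30):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[:10]
        pop = survivors + [mutate(crossover(random.choice(survivors),
                                            random.choice(survivors)))
                           for _ in range(10)]

    print("best architecture found:", max(pop, key=fitness))

Note that variable-length chromosomes fall out naturally here, which is exactly the kind of structural search that gradient methods can't express.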
*a very limited one: imagine 2D, no inertia, 5 sensors 15° apart, each reporting the distance to the nearest "grass", with the road 20px wide, drawn by hand in gray on green and fitting the screen. The network was 5x5 neurons, with all the weights part of the genetic code. The car never made it past the fourth corner, let alone did a lap.
GAs allow optimization of parameters without a differentiable loss function - the lack of one is a major problem when evaluating the behavior of a neural model, for example.
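A concrete illustration (toy data and GA settings of my own): classification accuracy is piecewise constant, so its gradient is zero almost everywhere and gives gradient descent no signal, but a GA only needs to be able to *evaluate* it.

    import numpy as np

    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 5))
    true_w = rng.normal(size=5)
    y = np.sign(X @ true_w)

    def accuracy(w):  # non-differentiable fitness: flat almost everywhere
        return np.mean(np.sign(X @ w) == y)

    pop = rng.normal(size=(30, 5))
    for _ in range(100):
        fit = np.array([accuracy(w) for w in pop])
        parents = pop[np.argsort(fit)[-15:]]  # keep the better half
        pop = np.vstack([parents, parents + rng.normal(0, 0.2, parents.shape)])

    print("best accuracy:", max(accuracy(w) for w in pop))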
But GAs could also benefit from ML/DL: predicting loss from a chromosome representation (to save computing time), learning to select promising pairs, and even learning crossover operators.
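A minimal sketch of the first idea, with a quadratic-feature least-squares model standing in for the learned predictor and a toy function standing in for the expensive true evaluation: the surrogate pre-screens offspring so only the most promising ones get evaluated for real.

    import numpy as np

    rng = np.random.default_rng(2)

    def expensive_fitness(x):  # stand-in for e.g. training a model
        return -np.sum((x - 1.5) ** 2)

    def features(x):           # quadratic features + bias term
        return np.concatenate([x, x ** 2, [1.0]])

    def fit_surrogate(X, y):   # cheapest possible "learned" predictor
        A = np.array([features(x) for x in X])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        return lambda x: features(x) @ coef

    pop = rng.normal(size=(20, 4))
    archive_X, archive_y = [], []
    for _ in range(30):
        fit = np.array([expensive_fitness(x) for x in pop])
        archive_X += list(pop); archive_y += list(fit)
        surrogate = fit_surrogate(np.array(archive_X), np.array(archive_y))
        parents = pop[np.argsort(fit)[-10:]]
        # Generate many offspring, but only keep those the surrogate likes.
        offspring = np.repeat(parents, 5, axis=0) + rng.normal(0, 0.3, (50, 4))
        scores = np.array([surrogate(c) for c in offspring])
        pop = np.vstack([parents, offspring[np.argsort(scores)[-10:]]])

    print("best found:", max(expensive_fitness(x) for x in pop))

The archive of past evaluations is what makes the surrogate improve over generations; the same pattern would apply with a neural predictor in place of least squares.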