Yup, there's an increasing amount of GPU use these days, mostly related to soup searching -- see https://catagolue.hatsya.com/home for the software and a tabulation of results from the last several years of collaborative searching.
Caching is very heavily used for running the biggest universes, which are truly mind-bendingly large. Golly's "HashLife" algorithm can in practice handle patterns that are over a trillion cells in each dimension.
Patterns with interesting behavior are usually full of repeating substructures, with the interesting activity happening as complex interactions between those predictable pieces. HashLife capitalizes on remembering interactions it has seen before, so basically the more memory your computer has available, the better HashLife will do in the long run at simulating that type of pattern.
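To give a feel for the memoization idea, here's a deliberately tiny sketch (nothing like Golly's real quadtree implementation, which caches recursively at every scale): the evolved center of a small block is cached, so any block that has been seen before costs one lookup instead of a recomputation. All names here are my own invention for illustration.

```python
from functools import lru_cache

def life_rule(alive, neighbors):
    # Conway's Life: birth on exactly 3 neighbors, survival on 2 or 3.
    return neighbors == 3 or (alive and neighbors == 2)

@lru_cache(maxsize=None)
def center_after_one_step(block):
    """block: a 4x4 tuple of tuples of 0/1 cells.

    Returns the 2x2 center of the block one generation later.
    Because the function is cached, an identical block is never
    recomputed -- this is the (vastly simplified) HashLife idea.
    """
    result = []
    for r in (1, 2):
        row = []
        for c in (1, 2):
            n = sum(block[r + dr][c + dc]
                    for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                    if (dr, dc) != (0, 0))
            row.append(1 if life_rule(block[r][c], n) else 0)
        result.append(tuple(row))
    return tuple(result)

# A horizontal blinker inside a 4x4 block:
blinker = (
    (0, 0, 0, 0),
    (1, 1, 1, 0),
    (0, 0, 0, 0),
    (0, 0, 0, 0),
)
print(center_after_one_step(blinker))  # the blinker flips vertical
```

The real algorithm also memoizes results many generations deep, which is why it can leap across astronomical timespans on repetitive patterns.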
"Soup searching" generally means not looking for anything in particular. It just involves setting up a random initial configuration, letting it run until it stabilises ("goes boring") and then takes a census of what's sitting around in the ashes of the burned-out pattern.
Mostly, of course, the census just reports piles and piles of blinkers and blocks and beehives and boats and everything else that you almost always see when you run a random scribble -- but every now and then something turns up that has never been seen before in the history of Life, and that turns out to be useful in building new mechanisms that weren't possible before.
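The whole soup-search loop fits in a few dozen lines if you settle for a very crude census. This sketch is not apgsearch (which actually identifies each object against a catalogue); it just runs a random soup until the state repeats, then reports a histogram of connected-cluster sizes. All function names are made up for the example.

```python
import random
from collections import Counter

def step(cells):
    """One Life generation on a set of live (x, y) cells."""
    counts = Counter((x + dx, y + dy) for (x, y) in cells
                     for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                     if (dx, dy) != (0, 0))
    return {c for c, n in counts.items()
            if n == 3 or (n == 2 and c in cells)}

def run_soup(size=16, density=0.5, max_gens=2000, seed=0):
    """Random soup -> run until a state repeats -> return final cells.

    Returns None if it hasn't settled within max_gens; an escaping
    glider, for instance, keeps the state strictly aperiodic on an
    unbounded grid. (Storing every state is memory-hungry; real
    searchers use smarter stabilisation checks.)
    """
    rng = random.Random(seed)
    cells = {(x, y) for x in range(size) for y in range(size)
             if rng.random() < density}
    seen = set()
    for _ in range(max_gens):
        key = frozenset(cells)
        if key in seen:
            return cells
        seen.add(key)
        cells = step(cells)
    return None

def census(cells):
    """Crude census: histogram of connected-cluster sizes (8-neighbor).
    A real census would match each cluster against known objects."""
    todo, sizes = set(cells), Counter()
    while todo:
        stack, n = [todo.pop()], 0
        while stack:
            x, y = stack.pop()
            n += 1
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    p = (x + dx, y + dy)
                    if p in todo:
                        todo.remove(p)
                        stack.append(p)
        sizes[n] += 1
    return sizes

ash = run_soup(seed=42)
if ash is not None:
    print(census(ash))  # e.g. lots of size-4 (blocks) and size-3 (blinkers)
```

The anything-interesting filter in a real searcher is essentially "did the census turn up an object that isn't in the standard catalogue?" -- which is where the once-in-millions-of-soups discoveries come from.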
Are they run on GPUs now?
Has anyone looked into ASICs?
Is caching heavily used for optimization?