KenLM is not a neural network but a purely statistical n-gram model, so it's no surprise that it is faster on a CPU in many cases. However, as soon as you have to deal with noisy data, KenLM gets blown out of the water by DL architectures like LSTMs and, more recently, Transformers. There's a reason why purely statistical models have seen very little progress in the last 10 years (KenLM was published 11 years ago): this "noise" is basically just a consequence of the central limit theorem applied to data with a huge amount of nuance, much more than any human-coded feature vector could ever account for.
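
To make "purely statistical" concrete, here's a rough sketch of a count-based bigram model. This is not KenLM's code or API: KenLM estimates modified Kneser-Ney models, while this toy uses add-one smoothing to stay short, and the names `train_bigram` and `prob` are made up for illustration. The point is just that scoring is a table lookup plus arithmetic (cheap on a CPU), and that a noisy token the model never saw, like a typo, only gets whatever mass the smoothing hands out.

```python
from collections import Counter

def train_bigram(sentences):
    """Count contexts and bigrams over whitespace-tokenized sentences."""
    contexts, bigrams = Counter(), Counter()
    vocab = {"</s>", "<unk>"}
    for s in sentences:
        tokens = ["<s>"] + s.split() + ["</s>"]
        vocab.update(tokens[1:])
        contexts.update(tokens[:-1])             # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))  # (previous, next) pairs
    return contexts, bigrams, vocab

def prob(word, prev, model):
    """Add-one smoothed P(word | prev); unseen words fall back to <unk>."""
    contexts, bigrams, vocab = model
    if word not in vocab:
        word = "<unk>"
    return (bigrams[(prev, word)] + 1) / (contexts[prev] + len(vocab))

model = train_bigram(["the cat sat", "the dog sat"])
p_seen = prob("cat", "the", model)  # word observed after "the"
p_typo = prob("czt", "the", model)  # noisy token, never observed
print(p_seen, p_typo)
```

The model has no notion that "czt" is close to "cat"; every unseen string lands in the same `<unk>` bucket. That's the kind of nuance a learned representation can pick up and a count table can't.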