MD simulations rely on FFTs, but I'm not sure how much of that is typically (or can be) done on the GPU. For example, NAMD uses cuFFT on the GPU in some cases. (https://aip.scitation.org/doi/10.1063/5.0014475)
He is not wrong: a convolution between an image and a small kernel can be done faster by direct multiplication than by padding the kernel and performing FFT + iFFT. This is what tensor cores aim to do really fast. However, a convolution between an image and a kernel of similar size is the general use case for the convolution theorem, and that is what is currently implemented in VkFFT.
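To make the comparison concrete, here is a minimal CPU sketch in NumPy (not VkFFT's API, and not how VkFFT does it internally). The function names `direct_conv2d` and `fft_conv2d` are made up for illustration. Both compute the same full linear convolution; the direct version costs roughly image size times kernel size, while the FFT version costs roughly N log N in the padded size, which is why it only pays off once the kernel gets large.

```python
import numpy as np

def direct_conv2d(image, kernel):
    """Full linear 2D convolution by direct multiply-accumulate."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih + kh - 1, iw + kw - 1))
    # Each kernel tap adds a scaled, shifted copy of the image.
    for i in range(kh):
        for j in range(kw):
            out[i:i + ih, j:j + iw] += kernel[i, j] * image
    return out

def fft_conv2d(image, kernel):
    """Same result via the convolution theorem: pad, FFT, multiply, iFFT."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    # Zero-pad both inputs to the full output size so the circular
    # convolution computed by the FFT equals the linear one.
    shape = (ih + kh - 1, iw + kw - 1)
    spectrum = np.fft.rfft2(image, s=shape) * np.fft.rfft2(kernel, s=shape)
    return np.fft.irfft2(spectrum, s=shape)
```

With a 3x3 kernel the direct loop does only 9 passes over the image, but with a kernel as large as the image the number of passes grows with the kernel area, and the FFT route wins.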
Fluid flow, heat transfer, and other such physical phenomena that you might want to simulate.
Phase correlation in image processing is another example. (https://en.wikipedia.org/wiki/Phase_correlation)
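For anyone who hasn't seen it, phase correlation can be sketched in a few lines of NumPy. This is a minimal version assuming a pure circular integer shift between two same-sized images; `phase_correlation_shift` is a made-up name, and real implementations usually add windowing and subpixel peak interpolation.

```python
import numpy as np

def phase_correlation_shift(reference, moved):
    """Estimate the (row, col) shift that maps `reference` onto `moved`."""
    f_ref = np.fft.fft2(reference)
    f_mov = np.fft.fft2(moved)
    # Normalized cross-power spectrum: keep only the phase difference.
    cross_power = f_mov * np.conj(f_ref)
    cross_power /= np.maximum(np.abs(cross_power), 1e-12)  # avoid divide-by-zero
    # The inverse FFT of a pure phase ramp is a peak at the shift.
    correlation = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)
    # Peaks past the midpoint correspond to negative shifts (wrap-around).
    return tuple(
        int(p) if p <= n // 2 else int(p) - n
        for p, n in zip(peak, correlation.shape)
    )
```

The whole thing is two forward FFTs, an elementwise multiply, and one inverse FFT, which is why it maps well onto a GPU FFT library.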