What does this mean in practice? Let's take a simple gaussian blur for images. A single output pixel is formed by overlapping the gaussian kernel on top of the image, multiplying then pointwise, then summing the result. Repeat for every pixels. What you can also do is take FFT of the gaussian kernel and multiply it with the FFT of the image and inverse transform and you will get the same result as actually calculating it for every point separately. Is this faster than doing it point by point? Depends on the blur radius.
You can do awesome things with this blazingly fast. As an example a simple water wave simulation can be made by simply taking a fourier transform, multiplying it with the dispersion relation of the water waves and then doing an inverse transformation. http://www.youtube.com/watch?v=MTUztfD2pg0 Just like what is done here. Normally this convolution would take O(N^2) amount of operations where N is amount of vertices but with FFT it's O(N log N).
FFT is for convolution what quicksort is for sorting. Imagine how limited would you be if all your sorts would take O(N^2) time. The examples I gave are quite limited in scope, going trough all the applications of convolution would take textbooks. It's probably one of the most important concepts in electrical engineering.
Oh yeah, convolution is just like cross correlation except in another case the function is reversed. So you can imagine the applications in data mining etc.
All in all Fourier Transform, and related ones, is an extremely huge and massively important concept, it's hard to overstate it's usefulness.
Personally I can say that I've used filter design tools to make a really smooth accelerometer data processing function. It does not jump around like a raw signal does nor does it lag a lot just like the standard exponential smoothing does.