> The sound wave produced by a piano note is a simple sine wave.
No, it's not. A piano note is a complicated stack of overtones (some of which are harmonic and some of which aren't) and transients. If it was just a sine wave, it would sound like a sine wave and not like a piano.
This is part of why things like Shazam are so difficult: musical notes aren't just a single frequency in the FFT, they are a stack of them.
> You could just tell them a handful of numbers—the sizes of the different circles in the picture above.
This is actually the exact same number of numbers as in the time-domain series. Taking the FFT on its own doesn't reduce the amount of data, it's about discarding or compressing some of the frequencies after you do.
> The really high notes aren’t so important (our ears can barely hear them), so MP3s throw them out, resulting in added data compression.
This is only part of what MP3's do, and at high bitrates this really is inaudible. The main source of compression is that the precision of the numbers used to represent the amplitude of the sine waves is reduced.
When you have a loud sound and a quiet sound at the same time, the loud sound will "drown out" the quiet one (called "auditory masking"). You won't be able to hear the quiet one, or one be able to hear it precisely. MP3 and other audio codecs take advantage of that by encoding quieter frequencies with less fidelity when there are other louder frequencies at the same time. You don't notice the loss of precision since it's buried under louder sounds.
> Just as MP3s throw out the really high notes, JPEGs throw out the really tiny circles.
This is off too. If JPEG discarded high-frequency signals, you would just be blurring the entire image. It would be exactly like saving it scaled down and then scaling it back up with some smooth interpolation.
Obviously, JPEGs don't appear to be stretched out thumbnails, so that isn't what happens. Instead, it's not that high-frequency signals are discarded, it's that their precision is reduced.
Human eyes are quite good at detecting sharp edges and fine details. What they aren't good at is detecting how sharp an edge is. We can definitely see a break between two colors, but we can't accurately detect the magnitude of that the difference.
JPEG takes advantage of that by rounding off those high-frequency variances to nearby values. That means there are fewer possible values at high frequencies, so fewer bits are needed to encode them.
I realize I'm being a negative Nancy here. I really liked your post and I agree 100% on how awesome the Fourier transform is. It's also quite hard to describe it in an approachable way, and you've done an admirable job. I just get bugged when simplifications for a lay audience are actually off the mark.
> No, it's not. A piano note is a complicated stack of overtones (some of which are harmonic and some of which aren't) and transients. If it was just a sine wave, it would sound like a sine wave and not like a piano.
This is really such a fundamental* aspect of frequencies and sound, it should have been the starting point of the article. "Look how noisy and squiggly this wave is, but with FFT we can see that it really only consists of waves at multiples of the same frequency".
* no pun intended
While not a perfect analog, it is similar to how vision works. If you are in a bright room and the lights go out, it takes you a little time to readjust to the darker room before you can see things again. While your irises were adjusting, the chair and take were still there, and light reflecting from those objects were still reaching your eyes, but you were unable to see them. It works the other way too. If you are still in that dark room and the lights come on, it will take your eyes a little time to readjust, less time then it took your eyes to adjust to the dark, but still your eyes are effectively discarding that information.
The psychoacoustic model in MP3s is even more bizarre. It turns out that some frequencies, when heard BEFORE another frequency, your brain will throw that first frequency away. It is unintuitive, but that has been proven in laboratory settings. Knowing how the majority of humans discard the same auditory information under different conditions, MP3s are compressed by throwing away the same information that the brain would normally discard. The Fourier transform is a significant part, but it isn't the whole story.
So, Nice enough article for slipping into the topic - now give me more! harder! faster!
If you are interested, see http://groups.csail.mit.edu/netmit/sFFT/ .
In a way, it's twice as many, since they have both real and imaginary components.
The "naive" Fourier Transform has a pleasing mathematical simplicity and makes it very useful for deriving all sorts of theoretical properties for scientific analysis, but in practice it has a computational complexity of O(N^2), that would quickly make all those applications you name in your article utterly infeasible.
Instead, only with the discovery of the Fast Fourier Transform algorithm roughly halfway the 20th century, lowered the complexity to a very manageable O(N log N). And only then all these applications and properties of the Fourier Transform came within computational reach. Without the FFT, the Fourier Transform would mostly just be useful on paper for functional analysis and math proofs, things like that, but not in practice and we'd be missing out on all those MP3s, JPGs and speech recognition.
It's definitely not an overstatement when the WP article says: Fast Fourier transforms have been described as "the most important numerical algorithm[s] of our lifetime".
 or 1805, depending: https://en.wikipedia.org/wiki/Fast_fourier_transform "The basic ideas were popularized in 1965, but some FFTs had been previously known as early as 1805."
1.) In the first part of the article talking about decomposing a periodic signal into sinusoids of different frequencies - it would be more correct to refer to it as the Fourier _Series_, which is different from the transform.
2.) The image showing three sinusoids summing together into a non-period signal made me cringe (the sum is also periodic - that's the whole point of the Fourier Series).
2.) It will only be periodic if there exist some frequency for which the frequencies of the sinusoids are integer multiple.