
"With pictures, high frequencies give you detail, while low frequencies show you structure."

I have a vague idea of what this means but can someone please explain it in a bit more detail?




In short, a wave with a high frequency has a short wavelength, and vice-versa.

The Discrete Cosine Transform is a close relative of the Fourier Transform, and for images you use its 2-dimensional version. The 1-D version of a Fourier Transform is what we use to break a signal, like a sound wave, into its constituent frequencies. It takes as input the wave amplitude at various times, and returns amplitudes for various frequencies. If you were to take waves of those frequencies and amplitudes, and add them together, you would get back the original sound wave you started with. (I'm hand-waving away a bunch of details like phase, boundary conditions, undersampling, and overtones--but this is the general idea.)
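To make the "amplitudes at times in, amplitudes at frequencies out" idea concrete, here is a minimal sketch of a naive discrete Fourier transform in plain Python. The test signal (a strong 3-cycle wave plus a weaker 10-cycle wave) and all the names are made up for illustration; real code would use an FFT library rather than this O(n^2) loop.

```python
import cmath
import math


def dft(samples):
    """Naive O(n^2) DFT: time samples in, complex frequency coefficients out."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def inverse_dft(coeffs):
    """Add the waves back together to recover the (real) time samples."""
    n = len(coeffs)
    return [
        sum(coeffs[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


n = 64
# Hypothetical signal: amplitude 1.0 at 3 cycles, amplitude 0.5 at 10 cycles.
signal = [
    1.0 * math.sin(2 * math.pi * 3 * t / n) + 0.5 * math.sin(2 * math.pi * 10 * t / n)
    for t in range(n)
]

spectrum = dft(signal)
# For 0 < k < n/2, a sine of amplitude A shows up as |X_k| = A * n / 2.
# (The k = 0 term would need a factor of 1 instead of 2; it is zero here.)
amplitudes = [2 * abs(c) / n for c in spectrum[: n // 2]]
dominant = sorted(range(n // 2), key=lambda k: amplitudes[k], reverse=True)[:2]

rebuilt = inverse_dft(spectrum)
max_error = max(abs(a - b) for a, b in zip(signal, rebuilt))

print("dominant frequencies:", dominant)
print("round-trip error:", max_error)
```

The two dominant frequencies are the two waves that were mixed in, and summing the waves back up reproduces the original samples up to floating-point rounding.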

You can make the Fourier Transform and its relatives deal with images the same way as sound, by pretending that the image is periodic, i.e. that you are tiling an infinite wall with copies of that image. You could create this same wall by overlaying waves of color on top of each other. The Fourier Transform will find these waves, the same way it found the frequencies for the sound.

With sound, low frequency = slow vibration = long wavelength (imagine an oscilloscope). High frequency = rapid vibration = short wavelength. So if you were to try to draw a picture using waves instead of a brush, you would use low frequencies for large things like a head. You would use medium frequencies to add smaller objects like eyes. You would use high frequencies to give small details, like hair or freckles, or the specific shape of a specific person's head.


Ok, I'll take a shot at this.

Let's work in one dimension rather than two dimensions. It's easy enough to extend later.

You know that any signal is the sum of a (potentially infinite) number of sine waves. For example, a square wave is the sum of ever higher-frequency (but smaller-amplitude) sine waves.

The higher frequencies are necessary to get the sharp edges.

If you strip the high frequencies, the sharp edges disappear, leaving only the larger motions of the lower-frequency (but bigger-amplitude) waves.
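A rough way to check this numerically, using the standard Fourier series of a unit square wave (odd harmonics k with amplitude 4/(pi*k)). The `rise_width` helper and its 0.9 threshold are arbitrary choices made for this sketch: it measures how far past the jump at x = 0 you have to go before the partial sum gets close to the top of the square.

```python
import math


def square_partial_sum(x, harmonics):
    """Sum of the first `harmonics` odd sine harmonics of a unit square wave."""
    return (4 / math.pi) * sum(
        math.sin((2 * i + 1) * x) / (2 * i + 1) for i in range(harmonics)
    )


def rise_width(harmonics, step=1e-4):
    """Distance from the jump at x = 0 until the sum first reaches 0.9."""
    x = 0.0
    while square_partial_sum(x, harmonics) < 0.9:
        x += step
    return x


# Fewer harmonics means the high frequencies were stripped away.
for count in (1, 5, 50):
    print(count, "harmonics -> rise width", round(rise_width(count), 4))
```

The more high-frequency terms you keep, the narrower the rise, i.e. the sharper the edge. With only the lowest term you are left with a single smooth hump.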

So the low frequencies are the hill, and the high frequencies are the grass.

Does that make sense?

Edit: Here's an image: http://cnx.org/content/m0041/latest/fourier4.png


Thanks, that is a bit more helpful.


another way i like to think of it (someone please correct me if i'm wrong) is that high-frequency means high-detail (highly frequently needing information to specify how it looks) whereas low-frequency means low-detail

(is that completely off or is it an analogous transform?)


What is misleading when one talks about the Fourier transform for pictures is that it has nothing to do with the light waves emitted by colored objects and received by our eyes. It is about the spatial distribution of intensities.

Applied to sound, this "frequency view" is much more natural: we hear a sound, and there is a low and a high part of it. That's because our ears really do real-time frequency analysis, a kind of biological Fast Fourier transform.

From what I remember, doing this transformation is just a matter of taking the original signal s, getting its level n of the lowest frequency f, computing s - n × f, and recursing on the result with the next frequency. The theorem proves that in the limit you get two equivalent representations of the signal: the wave itself, s = f(t), and its "spectrum", s1 = f(freq) (a function of the frequencies).
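One possible reading of that procedure, as a sketch rather than a definitive algorithm: for each frequency, measure how much of a cosine and a sine at that frequency is in what remains of the signal, subtract it, and move on. This assumes a sampled signal and whole-number frequencies, so that the waves are mutually orthogonal and each "level" is just a projection. The function name `peel_frequencies` and the test signal are hypothetical.

```python
import math


def peel_frequencies(signal):
    """Subtract one sinusoid at a time, lowest frequency first.

    Returns the measured levels and whatever is left of the signal.
    """
    n = len(signal)
    residual = list(signal)
    levels = {}
    for k in range(n // 2 + 1):
        for name, wave in (("cos", math.cos), ("sin", math.sin)):
            basis = [wave(2 * math.pi * k * t / n) for t in range(n)]
            energy = sum(b * b for b in basis)
            if energy < 1e-9:
                # The sine is identically zero at k = 0 and k = n/2; skip it.
                continue
            # "Level n of frequency f": projection of the residual on the wave.
            level = sum(r * b for r, b in zip(residual, basis)) / energy
            # "s - n x f": remove that component before the next frequency.
            residual = [r - level * b for r, b in zip(residual, basis)]
            levels[(k, name)] = level
    return levels, residual


n = 32
# Hypothetical signal: constant offset, a 2-cycle cosine, a 5-cycle sine.
signal = [
    2.0
    + 1.5 * math.cos(2 * math.pi * 2 * t / n)
    + 0.7 * math.sin(2 * math.pi * 5 * t / n)
    for t in range(n)
]

levels, residual = peel_frequencies(signal)
leftover = max(abs(r) for r in residual)
print(levels[(0, "cos")], levels[(2, "cos")], levels[(5, "sin")], leftover)
```

Once every frequency has been peeled off, nothing is left of the signal, which is the sense in which the list of levels and the original samples are two representations of the same thing.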

For many purposes, f(freq) is much more convenient than f(t), including comparisons, frequency shifting, extraction, compression, etc.

It applies equally well to images, but for me the frequency representation of a picture is not perceptually useful, maybe because our eyes are not Fourier transforming what we see.

All of that is an old story for me (I studied acoustics at IRCAM), so please correct me if my memory is wrong.


indeed. it took me forever to wrap my head around fft of an image.

one thing you can do is read how JPEG works; the DCT is a close cousin of the FFT.


I believe he's referring to the DCT, in which the image is interpreted as a mixture of sinusoidal waves. Low frequencies then correspond to differences between separate areas of the picture, while high frequencies correspond to texture.
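A small sketch of that split on a single row of pixels, using a hand-written orthonormal DCT-II in plain Python. The row (dark left half, bright right half, with a fine alternating texture on top) and the cutoff of 4 coefficients are invented for the example; JPEG itself works on 8x8 blocks in two dimensions, which this does not attempt.

```python
import math


def dct(row):
    """Orthonormal 1-D DCT-II of a list of pixel intensities."""
    n = len(row)
    out = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        out.append(
            scale
            * sum(row[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n))
        )
    return out


def idct(coeffs):
    """Inverse of dct(): add the cosine waves back together."""
    n = len(coeffs)
    return [
        sum(
            math.sqrt((1 if k == 0 else 2) / n)
            * coeffs[k]
            * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for k in range(n)
        )
        for i in range(n)
    ]


# Hypothetical row: two areas (50 vs 200) plus a +/-10 pixel-to-pixel texture.
row = [(50 if i < 8 else 200) + (10 if i % 2 else -10) for i in range(16)]
coeffs = dct(row)

cutoff = 4  # arbitrary split between "low" and "high" frequencies
low_only = idct([c if k < cutoff else 0.0 for k, c in enumerate(coeffs)])
high_only = idct([c if k >= cutoff else 0.0 for k, c in enumerate(coeffs)])

left_mean = sum(low_only[:8]) / 8
right_mean = sum(low_only[8:]) / 8
print("low frequencies only:", [round(v) for v in low_only])
print("high frequencies only:", [round(v) for v in high_only])
```

The low-frequency part alone still shows a dark area next to a bright area, only with a blurred boundary. The high-frequency part carries the rest: the texture and the sharpness of the edge. Adding the two parts gives back the original row.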

I'm not a big fan of the periodic transforms, but they do have that nice perceptual interpretation.


Also, the DCT is what is used for JPEG, so maybe that helps you think about this a bit more intuitively: http://en.wikipedia.org/wiki/JPEG



