An interactive guide to the Fourier transform (2012) (betterexplained.com)
221 points by uticus on Dec 15, 2023 | 74 comments



I don’t like the casting of the frequency domain view as the ‘recipe’ and the time domain view as the ‘product’. The point of Fourier is that you can switch between these perspectives losslessly - they contain equivalent information. The ‘smoothie’ metaphor of ‘unmixing’ the smoothie to get the ingredients, and then blending it to get the smoothie back conjures the impression that Fourier transformation is some sort of entropy reversing process, which is misleading.

A time domain function has a frequency domain interpretation - that doesn’t mean the frequency domain function is what ‘made’ the time domain function. It’s a chicken and an egg - both make each other.

Just like a function has a derivative, and you can recover the original function (modulo its y-offset) by taking its antiderivative - that doesn't make the derivative the 'recipe' for the function.


the best analogy is a basis change in a vector space: you can have the same data (the same vector) viewed in different ways if you look at it using a different basis. for example, in physics, the numerical values of an object's coordinates change in different frames of reference

and this is not just an analogy: in the vector space of functions the fourier transform is indeed a basis change (to be more precise, a rotation). and from this arises the fractional fourier transform, which is a halfway change: if the fourier transform is a 180-degree rotation, a fractional transform is something in between

https://en.m.wikipedia.org/wiki/Fractional_Fourier_transform

and this also explains why the fourier transform is the inverse of itself: two 180-degree rotations get you back to where you started


> the best analogy is a basis change in a vector space:

It's closer than an analogy. The FT is essentially a map of the basis into a dual space, but this dual is non-uniquely isomorphic to the original, so given an arbitrary choice of isomorphism you can turn it into "just" a change of basis.


The FT is a 90-degree rotation, not a 180: the FT of the FT of a function is the mirror image about the origin, not the function itself.
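You can sanity-check that mirror-image fact numerically. A minimal numpy sketch (numpy's unnormalized DFT convention contributes the extra factor of N):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 8
x = rng.standard_normal(N)

# Applying the unnormalized DFT twice mirrors the signal about the origin:
# FFT(FFT(x))[m] = N * x[(-m) mod N]
y = np.fft.fft(np.fft.fft(x))
mirrored = N * np.roll(x[::-1], 1)  # x[(-m) mod N]
print(np.allclose(y, mirrored))  # True
```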


Ok you're right! I originally wrote down 90 degrees, but then it seemed to conflict with the FT being its own inverse, so I reasoned it must be 180 degrees.

So the fourier transform of the fourier transform isn't the same as the inverse fourier transform (ignoring the scaling bits that can be normalized, I think)? So I've been lied to?

Anyway here is a funny pair of questions

https://math.stackexchange.com/questions/1472528/why-is-the-...

https://math.stackexchange.com/questions/3922412/why-isnt-th...


A 90-degree rotation? That would imply that the fourier transform is orthogonal to the original function.


It is, because wavenumber and position are distinct variables and are orthogonal to each other. The FT turns position into wavenumber (positional frequency) and wavenumber into negative position:

[ 0  1] [x]   [ ω]
[-1  0] [ω] = [-x]

see also https://en.wikipedia.org/wiki/Linear_canonical_transformatio...

The rotation matrix [[0 1], [-1 0]] is a 90-degree rotation.


Ok, now I see it!


fractional FT is the uncertainty principle then? you can know position locally, or momentum locally, or you can know them both blurrily with a rotated basis in between


Yes, but that's because the momentum-space wavefunction is the Fourier transform of the position-space wavefunction, and vice versa

So there's an uncertainty principle for signal processing, entirely because of the Fourier transform and not because of quantum mechanics

See this video https://www.youtube.com/watch?v=MBnnXbOM5S4
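That tradeoff is easy to see numerically. A sketch (the grid and widths here are arbitrary choices for illustration): squeeze a Gaussian in time and its spectrum widens, with the time-frequency spread product pinned at the Gaussian's lower bound of 1/(4π):

```python
import numpy as np

N = 4096
t = np.linspace(-50, 50, N)
f = np.fft.fftshift(np.fft.fftfreq(N, t[1] - t[0]))

for sigma in (0.5, 1.0, 2.0):
    x = np.exp(-t**2 / (2 * sigma**2))
    X = np.fft.fftshift(np.fft.fft(x))
    # treat |x|^2 and |X|^2 as probability distributions, measure their spreads
    pt = np.abs(x)**2 / np.sum(np.abs(x)**2)
    pf = np.abs(X)**2 / np.sum(np.abs(X)**2)
    spread_t = np.sqrt(np.sum(pt * t**2))
    spread_f = np.sqrt(np.sum(pf * f**2))
    print(sigma, spread_t * spread_f)  # ~0.0796 = 1/(4*pi) every time
```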


>the fourier transform is the inverse of itself

Not exactly. Almost.


Agreed - and in image processing the 'time domain' label doesn't really fit anyway, since the relevant variable is spatial extent rather than time.


https://youtu.be/spUNpyF58BY?si=dM8J8Df5U7DTV9ls

This is the definitive, the last Fourier transform guide you’ll ever need. It’s so intuitive and simple to understand. I watched this video *once* six years ago, and I can still rebuild the Fourier formula from memory.

If only this had been around during my digital signal processing coursework in undergrad.


3b1b’s video is excellent, agreed.

But I also want to give a shout out to this video: https://youtu.be/ToMyB5Hk06w?si=yJLDbb82JireH4Q9

It does a really solid job of building intuition for how the underlying mathematical machinery works, using 'inner products' as its mode of thinking rather than complex analysis. I think it's a great complement to Grant Sanderson's explanation, and for some people probably more useful.


> If only this had been around during my digital signal processing coursework in undergrad.

I hear that.

At my school the course in Fourier Series & Boundary Value Problems was better known to students as "Mystery Math". This was because there was typically a single moment of enlightenment... where suddenly the entire content of the course became pud.

If this moment of mathematical satori came before the final exam, you were in luck.


The short explanation of the DFT that I like the most is that it projects the signal vector onto the vector room with the exponential functions as basis vectors. Then you can see how much of the signal each basis vector, corresponding to a frequency, explains.

It's intuitive to see if you start with a 2D vector space (the regular Euclidean plane) and a 2D vector, and then you can expand the definition of vector spaces to include orthogonal functions as bases.


> vector room

It's Swedish - "rum" means both "space" and "room". In English, "vector space" is the usual term.


Minor point, it's not a projection because you don't lose dimensions.


nit: the identity matrix is a projection

P^2 = P is the definition - losing dimensions isn't required


Nit to your nit, which is incorrect w.r.t. parent reply:

A Fourier transform is not a projection, it's a change of basis represented by a unitary transformation.


The w-th Fourier coefficient F(w) is the dot product of f with an exponential function, `e_w • f`, and is in that sense a projection. The inverse Fourier transform writes the original function as a sum of the projected components: `f = sum_w (e_w • f) e_w = sum_w F(w) e_w`. This is exactly how writing an "arrow" style 2- or 3-D vector as a sum of orthogonal projections works.
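A finite-dimensional numpy sketch of exactly that picture, with the orthonormal DFT basis standing in for the e_w:

```python
import numpy as np

N = 16
rng = np.random.default_rng(1)
f = rng.standard_normal(N) + 1j * rng.standard_normal(N)

# Columns of E are the orthonormal basis vectors e_w[n] = exp(2*pi*i*w*n/N)/sqrt(N)
n = np.arange(N)
E = np.exp(2j * np.pi * np.outer(n, n) / N) / np.sqrt(N)

F = E.conj().T @ f   # F(w) = e_w . f, the projection onto each basis vector
f_rebuilt = E @ F    # f = sum_w F(w) e_w, the sum of projected components
print(np.allclose(f, f_rebuilt))  # True
```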


True, but missing the point as you would not normally refer to the identity matrix as a projection.


What was your point? An incorrect characterization of projections?


Is that a question of math terminology? The way I think of it, it's still a projection, just with zero residual.


previous submission with 79 comments

https://news.ycombinator.com/item?id=27229836


Thanks! Macroexpanded:

An Interactive Guide to the Fourier Transform (2012) - https://news.ycombinator.com/item?id=27229836 - May 2021 (79 comments)

An Interactive Guide to the Fourier Transform - https://news.ycombinator.com/item?id=10635075 - Nov 2015 (18 comments)

An Interactive Guide To The Fourier Transform - https://news.ycombinator.com/item?id=4948082 - Dec 2012 (25 comments)


It is super easy to find resources on HOW the FT works. But I find it very difficult to find resources on WHY we need it and WHERE it is useful.

Does anybody have some good sources that explain the practical applications and how it is useful in real-world usage?


From TFA:

Stop. Here's where most tutorials excitedly throw engineering applications at your face. Don't get scared; think of the examples as "Wow, we're finally seeing the source code (DNA) behind previously confusing ideas".

If earthquake vibrations can be separated into "ingredients" (vibrations of different speeds & amplitudes), buildings can be designed to avoid interacting with the strongest ones.

If sound waves can be separated into ingredients (bass and treble frequencies), we can boost the parts we care about, and hide the ones we don't. The crackle of random noise can be removed. Maybe similar "sound recipes" can be compared (music recognition services compare recipes, not the raw audio clips).

If computer data can be represented with oscillating patterns, perhaps the least-important ones can be ignored. This "lossy compression" can drastically shrink file sizes (and why JPEG and MP3 files are much smaller than raw .bmp or .wav files).

If a radio wave is our signal, we can use filters to listen to a particular channel. In the smoothie world, imagine each person paid attention to a different ingredient: Adam looks for apples, Bob looks for bananas, and Charlie gets cauliflower (sorry bud).

The Fourier Transform is useful in engineering, sure, but it's a metaphor about finding the root causes behind an observed effect.


Yes I saw that. But it still hides the "meaty" part. Where are the articles that explain those processes?

For example I would love to find an article that starts with "let's make a wav file smaller". And then somewhere in the middle it just says "and here we will use FT to achieve X".


Here's a pretty good lecture on exactly that subject: https://www.rose-hulman.edu/~bryan/invprobs/jpegtalk2.pdf

Honestly though, the Fourier transform is useful anywhere that it's easy to think in terms of frequency instead of time. 90% of use cases in the real world come about from trying to do something that's easy to express in terms of signal frequency and difficult to express in terms of signal time/space. You can just use the Fourier transform to convert between them efficiently.

To give a morbid, but non-obvious example, I once did some work with a charity that was concerned with migrant deaths. They wanted to figure out where the migrants most at risk were working to approach the farmers hiring them directly. We got a dataset of when migrant bodies were found and did an FFT (among other processing) to find the periodicity of the crops that they were coming to harvest. There's only a few major crops and they tend to have distinct growing periods, so this is enough information to pinpoint certain crops like strawberries.


The FT breaks a cyclic signal down into the frequencies present and their intensities. The longer the sample of the signal, the more precisely the real frequencies present in it can be detected.

It is mainly a tool for obtaining information that will later feed other algorithms. It can also be used as a signal filter together with the inverse FT.

If it helps understanding: it is analogous to repeatedly applying several band-pass filters to try to recover the root signals, but without needing adjustment, far faster to compute, simpler, and with better results.

So your question is: what are the uses of knowing the main elements that make up a signal?

That information is useful for analyzing sounds or images, to detect components of the signal outside the expected range.

Also for pattern detection, since you can give your data the form of a signal and analyze the frequency peaks, for example.

For compression, both of the above apply: knowing the main cyclic elements and their intensities lets you determine which ones can be omitted later (for lossy compression), or which elements should be replaced by parameters that the decompression algorithm will use to reconstruct the signal.

The point made in the other answer is also important: the FT does not tell you the moment in time/space at which a component occurs; it only tells you that the component (the frequency) is present in the signal.


Take a look at [0] to get a quick visual demonstration of how that works. As others have said, lossy compression often throws away high frequency data. For sound, that corresponds to high pitches you can barely hear; for images, to fast-changing parts of a picture - think crisp edges, fine detail (e.g. hair or stubble), and noise/film grain. So a basic lossy compression algorithm might use an FT and then only store the lower frequencies. When you do the inverse transform without the high frequency data, you lose details.

Note that the spectrograms in [0] use what's called a short time fourier transform[1]. Basically, split your song into windows (e.g. 1 second each), and do a fourier transform on each window. This lets you get a picture of the spectrum over time.

[0] https://sound.stackexchange.com/questions/38709/320kbps-mp3-...

[1] https://en.wikipedia.org/wiki/Short-time_Fourier_transform
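A hand-rolled version of that windowing idea, as a sketch (the toy signal and window length here are made up for illustration):

```python
import numpy as np

# Toy signal: 2 s at 8 kHz, a 440 Hz tone that jumps to 880 Hz halfway through
fs = 8000
t = np.arange(2 * fs) / fs
x = np.where(t < 1.0, np.sin(2 * np.pi * 440 * t), np.sin(2 * np.pi * 880 * t))

# Split into windows and FFT each one
nperseg = 1024
frames = x[: len(x) // nperseg * nperseg].reshape(-1, nperseg)
spectra = np.abs(np.fft.rfft(frames * np.hanning(nperseg), axis=1))

freqs = np.fft.rfftfreq(nperseg, 1 / fs)
for i, frame in enumerate(spectra):
    print(f"window {i}: peak near {freqs[frame.argmax()]:.0f} Hz")
# early windows peak near 440 Hz, later ones near 880 Hz
```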


It's not an article, but I found The Scientist and Engineer's Guide to Digital Signal Processing book to be very comprehensive. There are a couple of chapters on applications, but not much code.

http://www.dspguide.com/pdfbook.htm


If you understand the fourier transform, you should have some idea of its inverse. And that's exactly how (one part) of compressing an image or audio works.

You take a signal, take its Fourier transform, and then cut off the frequencies above some threshold you don't care about (I'm ignoring certain complicating factors for simplicity). Let's say your original signal has frequencies up to 22 kHz. If you're only interested in moderate fidelity human voices talking, maybe you cut off everything below 100 Hz and above 3.4 kHz. You just toss that information. Then on the receiving end, you do an inverse DFT and reconstruct the signal.

JPG and MP3 do something like this, along with a pile of other tricks.
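A minimal numpy sketch of that cut-and-invert step (noise as a stand-in signal; real codecs use windowed transforms and perceptual models rather than one giant FFT):

```python
import numpy as np

fs = 44100  # 1 second of "audio" at 44.1 kHz
signal = np.random.default_rng(2).standard_normal(fs)

spectrum = np.fft.rfft(signal)
freqs = np.fft.rfftfreq(len(signal), 1 / fs)
spectrum[(freqs < 100) | (freqs > 3400)] = 0  # toss everything outside the voice band
voice_band = np.fft.irfft(spectrum, n=len(signal))
```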


Your cell phone wouldn't exist without the Fourier transform - or the discrete Fourier transform, to be precise. Image compression is another application, albeit a 2-dimensional version. Software defined radios, or SDRs, are completely dependent on the DFT. Radar processing. Earthquake analysis. The list goes on and on. Basically, our technological society would not exist in its current form without the Fourier transform. To me, it is one of the key mathematical algorithms of the 20th century.


> Your cell phone wouldn't exist without the Fourier transform - or the discrete Fourier transform, to be precise

Yes that is great to know. But where can I find an article that explains how exactly FT helps in my cell phone? What exactly do we do with FT in a cell phone?


For example, to create a JPEG, part of the process is removing the "high frequency" parts of your image, since those take the most information to store. Here, high frequency refers to noise, or any large difference between neighboring pixels, as opposed to low frequency parts, which are averages over larger groups of pixels. So, at the extreme, if you average the whole image, you only have to store one pixel, so it's obviously less data to store, vs storing info about every single pixel. JPEG (and other lossy compression formats) tries to find a good middle ground between storing every pixel perfectly and storing just one pixel.

So, how do you remove the high frequency parts? You apply a Fourier transform to the image (in this case, it's a "Discrete Cosine Transform", which is extremely similar to a DFT and has no differences for the purpose of this explanation) and get a 2d array. This 2d array has the low frequency parts in the upper left corner, and the high frequency parts everywhere else (to the right and down for horizontal and vertical frequencies). So your compression algorithm will simply zero out the high frequency parts so you don't have to store them. In the simplest case, this is equivalent to a simple blur of the image, but there are some heuristics about how much to remove (zero out) to minimize image degradation.

To decode the image, you take the 2d array and apply the inverse FT to get the original image (now slightly blurry because it's been compressed).

More details here: https://en.wikipedia.org/wiki/JPEG#Discrete_cosine_transform
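A sketch of that zero-out-and-invert step on a single 8x8 block using scipy's DCT (real JPEG quantizes coefficients with perceptually tuned tables rather than hard-zeroing a corner, but the shape of the idea is the same):

```python
import numpy as np
from scipy.fft import dctn, idctn

rng = np.random.default_rng(3)
block = rng.integers(0, 256, (8, 8)).astype(float)  # one 8x8 grayscale block

coeffs = dctn(block, norm="ortho")  # low frequencies land in the upper left

# Keep only the low-frequency corner; zero out the rest
keep = 4  # arbitrary cutoff for illustration
r, c = np.indices(coeffs.shape)
coeffs[r + c >= keep] = 0

decoded = idctn(coeffs, norm="ortho")  # blurrier version of the original block
print(np.abs(decoded - block).max())   # error introduced by the dropped detail
```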


We went from mostly fearing vibrations in a mechanical world to harnessing them in a chemical and electrical world. It entirely reshaped our relationship to the world and perhaps so fast that our brains haven't entirely caught up yet. I wonder how many well controlled oscillations of mostly electrical charges happened between me swyping this and you hearing it in your head. We rely on them instead of fearing them now.


I don't think there's a satisfying concise answer to this question. You're asking to boil down multiple semesters of engineering and mathematics courses to explain something that isn't simple.

The "why" we need it is because it's a convenient mathematical tool to simplify complex problems dealing with sequences/series. It decomposes sequences of numbers (*) into another sequence of numbers that represents a summation of repeating patterns in the original sequence.

The "where" we use it is anywhere that knowing about which patterns show up in sequence of numbers might be more convenient than looking at the original sequence.

There are other nice properties, too. There's a mathematical operation called convolution that's very useful in domains operating with sequences of numbers (machine learning, control systems, audio and image processing, etc). It happens that convolution in the original signal's domain is equivalent to multiplication in the Fourier domain. Why that's useful is that convolution is an O(N^2) algorithm, but we can compute the Fourier transform in O(N log N) complexity (**) and the pointwise multiplication is cheap. We can even do deconvolution: taking an output signal that we know was produced by convolving two inputs and extracting one of them.
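That identity is a one-liner to verify with numpy (zero-padding turns the FFT's circular convolution into ordinary linear convolution):

```python
import numpy as np

rng = np.random.default_rng(4)
a, b = rng.standard_normal(1000), rng.standard_normal(1000)

direct = np.convolve(a, b)  # O(N^2) in the signal domain

n = len(a) + len(b) - 1     # pad length for linear (not circular) convolution
fast = np.fft.irfft(np.fft.rfft(a, n) * np.fft.rfft(b, n), n)  # O(N log N)

print(np.allclose(direct, fast))  # True
```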

Another nice property of the Fourier transform is that most sequences of numbers do not have lots of energy (meaning magnitude) at high frequencies (meaning rapidly repeating patterns in the original domain). The FT of a sequence of numbers will naturally compact most of the information of the sequence into a few places. We can exploit this for lossy compression (***).

Ultimately, it's a way of taking information that is hard to grok and transforming it to a domain that's more meaningful and operations/analysis are easier. Both for humans and machines.

* And it works on continuous functions too, but let's not get into that

** Technically this is a variant called "circular" convolution

*** The FT is not the most convenient tool for this job, so very closely related transforms like the DCT can be used, which have the same computational advantages as the FT.


Here's a nice example with the Fast Fourier Transform (an algorithm for computing the Discrete Fourier Transform more efficiently):

Source: PCA and Fourier Analysis (2010) J Banfelder, Weill Cornell Medical College

> "A beautiful example of how this knowledge can be used in medicine is found in the cochlear implant. This device is used in patients with inner ear damage. The entire mechanical transduction mechanism is bypassed when the device is implanted. Instead, a microphone worn on the outer ear records sound that is digitized and sent to a signal processor. Here an FFT and an array of bandpass filters are applied. Results are passed to the implanted device, which electrically stimulates the neurons in the cochlea. Typical devices divide the frequency range of 0 to 4 kHz into about 15 or 20 bands, and stimulate neurons accordingly. However, profoundly deaf patients have recovered their hearing and have been able to understand speech even when as few as five bands are used."

See also:

https://allsignalprocessing.com/lessons/the-four-fourier-rep...


Some of the more well-known examples are:

Low-bitrate audio compression (MP3) use Fourier transforms.

Image and video compression use Fourier transforms.

Less-well-known:

AC electricity has to run at (almost) exactly 60 Hz / 50 Hz. The power plants use Fourier transforms to measure the exact frequency. (And then use the measurement to adjust accordingly.)

Some telecommunications techniques may use Fourier transforms: e.g., the dial-up modems used in the 1990s used Fourier transforms to interpret the analog signal and determine what the bits were.

This article explains using a Fourier transform to remove the "dots" from an image that was printed in a book: https://matzjb.se/2015/08/08/smoothing-a-halftone-photo-usin...


Disclaimer: I wrote this, but it's a little more from an engineering POV than a Math POV, and covers some applications: jezzamon.com/fourier

Hacker news folks seem to enjoy it :)

Another thought on the "why": many things in the real world operate on frequencies - sound being the most obvious, but also electrical signals, mechanical systems like springs, etc. Doing the analysis with the signal represented as a bunch of frequencies makes more sense, and Fourier transforms let us do that.


Making convolution faster.


This is circular logic.

So what real-world problems do we have right now that depend on making convolution faster?


Literally everything. The list of things that are not some type of convolution is really short. Name 5 concepts and I'll try to connect them to convolution.


Request for someone to make an intuitive explanation of why the Fourier transform is (almost) its own inverse. I know the math proof from taking analysis, but the formula is too pretty and symmetrical for the explanation to be so technical.

Same for why it preserves L2 norm.


Same answer for both. It's an orthogonal transform, aka a change of basis. You're conceptually rotating the function/series to an equivalent one that's orthogonal to the original. The magnitude/energy hasn't changed (Parseval's theorem is a more succinct definition). And to perform the inverse transform you need to conceptually rotate it back to the original, which should mirror the original transform very nicely.

If it didn't preserve energy then there wouldn't necessarily be an inverse transform, since that implies information was lost.
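A quick numerical check of that energy preservation (Parseval), with numpy's unnormalized convention putting a factor of N on the frequency side:

```python
import numpy as np

x = np.random.default_rng(5).standard_normal(1024)
X = np.fft.fft(x)

print(np.sum(np.abs(x)**2))           # energy in the time domain
print(np.sum(np.abs(X)**2) / len(x))  # same energy in the frequency domain
```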


I really appreciate this reply, since this is something I've always been curious about. If you have time, I would really appreciate it if you could elaborate on this point (maybe with some equations), but you've already given me a ton to think about, thank you!

Also, is the new basis orthogonal to the original, or just another orthonormal basis? I don't see why it would be orthogonal to the original.


It's an orthonormal basis. See my comment here[0] for more on why the forward and inverse transforms look similar (I've written e_w to mean e^iwt, but we want to think of it as a vector). The forward direction is doing a dot product with an exponential with a specific frequency to get the transform at that frequency (i.e. the function's projection/component at that frequency), which is a sum over the entire time basis. The inverse transform sums over all components/projections of the function at each frequency to rebuild the function. This is how you do an expansion in terms of orthogonal projections in any dimension.

This also explains why the forward transform has a minus and the inverse doesn't: a complex dot product sums over complexConjugate(v_i)*w_i, while the inverse/reconstruction just sums over the basis exponentials scaled by the Fourier coefficient for that frequency.

You can also prove that the sum over all imaginary exponentials is Dirac delta[1], so you could think of the inverse transform as being a dot product with delta to get the function at a specific time, and that dot product is a sum over the frequency basis.

[0] https://news.ycombinator.com/item?id=38658312

[1] https://math.stackexchange.com/questions/1343859/why-does-in...


It is a least squares fit of sine waves. (A^T)A = I because sines are orthogonal.

Use Euler's eqn to convert e^jw into sin + cos and just work through the algebra.
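A sketch of that orthogonality using the normalized DFT matrix (complex exponentials, which Euler's formula unpacks into those sines and cosines):

```python
import numpy as np

N = 8
n = np.arange(N)
A = np.exp(-2j * np.pi * np.outer(n, n) / N) / np.sqrt(N)  # normalized DFT matrix

# The columns are orthonormal, so A^H A = I: the least-squares "fit"
# is just a matrix multiply, with no normal-equations solve needed
print(np.allclose(A.conj().T @ A, np.eye(N)))  # True
```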


Just a nit. The pair of equations the author shows at the beginning of the article are not the equations of the Fourier Transform and its inverse. The Transform is a continuous function operating on an infinite input; its equation involves an integral taken over +/- infinity. What is shown, using the summation operator, is a discrete form of the Transform where the input is a limited time series.

A good alternative explanation can be found on Grant Sanderson's 3 Blue 1 Brown channel https://youtu.be/spUNpyF58BY?si=uqq2OOSATYcWmaG8


The Fourier transform can be defined for any locally compact abelian group. The integers modulo n are one such group.


You actually don't need abelian. e.g. the group of 1d affine transformations T(a,b)(x) = ax+b gives a variation of a wavelet transform. But you no longer have 1-D irreducible representations, so your Fourier coefficients become operators instead of numbers or something like that.


What is Haar measure in this case?


Apparently the left invariant measure is 1/a^2 dadb, and the right invariant measure is 1/a dadb. The book I have just sticks to left measures, and is an engineering book so it doesn't really get into details that a math book would (the whole group theory chapter is 31 pages out of a 1500 page book and the wavelet portion is ~2 pages, though it also has an 8 page section on wavelets in a previous chapter). It's very much a "what are groups, what are representations, and why do they matter for physics and signal processing" kind of thing. For reference it's Barrett & Myers Foundations of Image Science.


Given that monads are monoids in the category of endofunctors, I'm perfectly fine with it.


There is no Royal Road to mathematics. There is also no shortage of people convincing themselves they understand some particular aspect of it by watching a YouTube video.

It is rare when a mathematician, or anyone, can write a book on a topic that no other expert in the field can top. Walter Rudin did that with Fourier analysis.


One of the guys he thanks is Steve Lehar, who is a hero of mine. Steve had incredible intuitions about Harmony and Resonance and was crushed down by normies in psychology. But, indeed, the world is made of waves and resonance/harmony does govern pretty much everything! He says it better than me:

https://scholar.google.com/citations?user=UmJanU0AAAAJ&hl=en


Just seen this - I was playing with complex Fourier series yesterday. Here's < 100 LOC of Python that calculates the coefficients and draws the circle animation: https://gist.github.com/kaleidawave/bdaf8649e7917152b6cdd624...

Mind-blowing that there might be a solar system out there with a collection of orbiting objects whose final object could have a square orbit


A square orbit would require infinite acceleration - specifically, a Dirac delta.


Approximate square.


It would also be nice to have an understanding of its relation to the Laplace transform, something more than saying that the real component goes to zero.


+1. There are already plenty of pop-sci explanations of the frequency domain and Fourier transform, but does anyone know if there's a similar physical interpretation or visualization of the Laplace transform and the s-domain?


Maybe this? https://youtu.be/iP4fckfDNK8

It shows how the FT is a 2D slice of the 3D LT in the s-domain.



This article incorrectly labels the DFT as a Fourier transform. To people who use these tools the distinction is important, and misnomers like this will make learning harder for people starting out. The article should be updated to correct this error.


Adding epicycles to the circular orbits - an example before Fourier!

https://en.m.wikipedia.org/wiki/Deferent_and_epicycle



There was an old Macintosh application that had an interactive FFT. I remember seeing it at the Haus der Musik in Vienna and have always looked for it.


I find all these Fourier explanations bad. The basic idea is very simple:

Fourier takes a function, and converts it into the reciprocal domain.

So if your x axis is time, t, then Fourier gives you 1/t. What's 1/t if t is a unit of time? Frequency.

If you want a nice intuitive example of this (hand waving begins): a lens will give you a Fourier transform at its back focal plane if all the light coming in is parallel.

Don’t quote me on that I need to double check the precise conditions.

But you can send laser light through an image printed on transparency and manually apply filters to clean up small artefacts.


4F Optical Correlator: https://youtu.be/wcRB3TWIAXE

A simple optical computer.


Yep. It’s a great experiment because you can have spatial frequency as well which gives you a physical demonstration of a Fourier transform.

Then you can do things like create physical filters and reform the original image to see if your filter worked.


Reading this I realise Bartosz Ciechanowski's work has ruined my definition of "interactive" for blogs.


Exactly what I was thinking!



