I over-engineered a Fast Fourier Transform for Arduino (klafyvel.me)
294 points by psychphysic on Nov 25, 2022 | 84 comments



There’s an alternate way to save space: the discrete Hartley transform.

https://news.ycombinator.com/item?id=27386319

https://twitter.com/theshawwn/status/1400383554673065984

  dht(x) = fft( (1 + 1j)*x ).real / area(x)**0.5

  area(x) = np.prod(np.shape(x))
The dht is its own inverse, which is a neat trick. No need for an inverse fft. In other words, dht(dht(x)) == x.
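The recipe above is easy to check in runnable NumPy (a sketch: this `dht` normalizes by sqrt(N), which is what makes the transform its own inverse):

```python
import numpy as np

def dht(x):
    # Discrete Hartley transform via the identity quoted above:
    # the real part of fft((1+1j)*x), normalized by sqrt(N) so that
    # dht is an involution (its own inverse).
    x = np.asarray(x, dtype=float)
    return np.fft.fft((1 + 1j) * x).real / len(x) ** 0.5

rng = np.random.default_rng(0)
x = rng.random(16)
roundtrip = dht(dht(x))   # should equal x
```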

Notice that the output is the real component of the fft. That means the storage required is exactly identical to the input signal. And since the input signal is copied to the real and imaginary components (1 + 1j)*x, the intermediate representation size is also equal to the input signal size. (Just take sin and cos of the input signal.)

The dht has almost all the same properties of the fft. You can blur an image by taking a hamming window and multiplying. You can resize an image by fftshift + pad with zeros + fftshift. Etc.

In practice the dht seems to have all the benefits of fft with very few downsides. I’m amazed it isn’t more popular.


The convolution theorem is much uglier for the Hartley transform. I guess this is the main reason. You can see the Fourier transform as a variant of the Hartley transform where linear filtering can be done directly in the frequency domain, without all the fuss of Hartley-domain convolutions.

Another advantage of Fourier over Hartley is that derivatives become products with polynomials. Thus the Fourier transform turns linear PDEs into trivial algebraic equations. For the Hartley transform you get more complicated functional equations, so it's not as useful.

> You can blur an image by taking a hamming window and multiplying.

It doesn't seem like you can? (Unless your image has some symmetries).


Here's an image: https://i.imgur.com/hNCwisy.png

Here's fftshift(dht2(img)): https://i.imgur.com/EB6yIb4.png

Here's a hamming window: https://i.imgur.com/NsGKb0X.png

Here's a hamming window raised to power of 20: https://i.imgur.com/ejCviD7.png

Here's the result of multiplying that with the transformed image: https://i.imgur.com/trHdWZS.png

And here's dht(fftshift(that)): https://i.imgur.com/of4bvhv.png

Presto, blurry image. No symmetries required.

We can increase the blur factor by changing the power from 20 to 200: https://i.imgur.com/yqkR3qb.png

And 2000 (megablur): https://i.imgur.com/7h9patv.png

The code I used to blur:

    def blur(img, factor):
        x = dht2(img)
        x = fftshift(x)
        x = x * hamming(img) ** factor
        x = fftshift(x)
        x = dht2(x)
        return x
There doesn't seem to be any limitations -- anything you can do with FFT, you can do here. And since the DHT isn't complex, you don't need to deal with phase/amplitude.

It's kind of like working with wavelet coefficients in a jpg transform, except wavelets are a lot harder to work with compared to fft signals (which is why people use fft instead of wavelets everywhere). But dht is the best of both worlds -- all the power of fft, none of the hassle of complex values.
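For reference, here's a self-contained NumPy version of that blur. This is my reconstruction, not the poster's code: `dht2` and `hamming2d` are guesses at the helpers the snippet assumes, with a separable 2-D Hamming window standing in for `hamming(img)`.

```python
import numpy as np

def dht2(img):
    # 2-D DHT via the fft trick; /sqrt(size) makes it self-inverse.
    return np.fft.fft2((1 + 1j) * img).real / img.size ** 0.5

def hamming2d(shape):
    # Separable 2-D Hamming window (my stand-in for `hamming(img)`).
    return np.outer(np.hamming(shape[0]), np.hamming(shape[1]))

def blur(img, factor):
    x = np.fft.fftshift(dht2(img))
    x = x * hamming2d(img.shape) ** factor
    return dht2(np.fft.fftshift(x))

img = np.random.default_rng(1).random((32, 32))
identity = blur(img, 0)   # factor 0: window**0 == 1, so this is a pure round trip
```

With `factor = 0` the window is all ones, so the result should reproduce the input exactly (for even image dimensions, where a double fftshift is the identity).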


> There doesn't seem to be any limitations -- anything you can do with FFT, you can do here.

There's certainly no limitations if all you want to do is to convolve your image with symmetric kernels. But you asked for advantages of the Fourier transform, if any. Convolution of two non-symmetric images is one of them ;)

There's also the fact that derivatives have an easier relationship with the DFT than the DHT. If you are interested in discrete linear pde, that's a big advantage.

For signal processing, the main advantage of the DFT is that the amplitudes and phases are cleanly separated. For the applications where you only want the amplitude, it is already there in the DFT, but you have to untangle it from the DHT, because the DHT intermingles the phases and amplitudes.

An example: imagine that you have an image affected by periodic noise, and you want to remove this noise. This periodic noise appears in the DFT as a frequency with very large amplitude (and an arbitrary phase, corresponding to the phase of the periodic noise, that you don't really care about). Thus, it is easy to locate and filter-out this periodic noise using the DFT: just find the frequency with large amplitude and set it to zero.

Now, for the DHT, the frequency that corresponds to the noise period is not necessarily large (in magnitude). It may even be zero, depending on the phase of the noise! To find that frequency, you have to essentially recover the DFT and compute the amplitudes.
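That entanglement is easy to demonstrate numerically (a sketch, using the fft-based DHT from upthread): a cosine whose phase is pi/4 has a large FFT magnitude at its frequency bin, but an almost exactly zero DHT coefficient there, because its cos- and sin-projections cancel in the cas = cos + sin kernel.

```python
import numpy as np

def dht(x):
    return np.fft.fft((1 + 1j) * x).real / len(x) ** 0.5

n, k = 256, 10
t = np.arange(n)
# Periodic "noise" whose phase is chosen adversarially for the DHT:
noise = np.cos(2 * np.pi * k * t / n + np.pi / 4)
fft_mag = abs(np.fft.fft(noise)[k])   # ~ n/2, large regardless of phase
dht_mag = abs(dht(noise)[k])          # ~ 0 at this particular phase
```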

Of course the DHT and the DFT are equivalent and there's an easy conversion between both, so you can do exactly the same things with either. The difference is just a matter of convenience. It's certainly useful to know both.

If you are scared of complex numbers for some reason (e.g., if your programming language makes it difficult to work with them) then you may prefer the DHT. Otherwise, the DFT neatly separates the phases and amplitudes and it's slightly easier to interpret.


On this point, for images much of the structure is encoded in the phase. This fact was very surprising to me when I learned it.

See the images in this stack overflow for an example [0].

[0] https://stackoverflow.com/a/40387839


Good morning!

> If you are scared of complex numbers for some reason (e.g., if your programming language makes it difficult to work with them) then you may prefer the DHT.

It’s not about that. The DHT cuts memory bandwidth in half. With FFT it’s a complex number, so it’s doubled.

FFT on a real signal can be cut in half too, but it requires doing weird shuffling of the values. (See submission’s blog post on the topic.) It’s unnecessary complexity unless you really need a full FFT for some very specific reason (like the case you mention). Most of the time, people just want to do some DSP like a low pass filter or resizing.


That’s impressive, thanks.

We did Fourier transforms as a lab exercise using lasers for my Masters; that was one of quite a few amazing and memorable experiments.

Another was using impedance to pulse lasers, which set me thinking about impedance as a metaphor in software development.

Chemistry A-Level covered rates of reaction in physical chemistry, with a few key steps being critical in a reaction.


What was the Masters in? Maths or physics or something else? Sounds fun, I wish there'd been more lasers in mine :)


Laser Engineering


To explain further, the HT cannot represent complex-valued functions - it is only a complete system for real-valued functions, that is, R -> R. So naturally it's simpler, because it's less expressive.

Let me wow you since HT is not even on the map of exotic or alternative wavelet decompositions.

For example, Hermite transform: https://en.wikipedia.org/wiki/Hermite_transform

or if looking in the discrete space: https://en.wikipedia.org/wiki/Hadamard_transform

See: https://en.wikipedia.org/wiki/Orthogonal_functions

I'm amazed that folks think Fourier series is so special when it's just one of many in a deluge of complete function systems.


the fourier transform is very special indeed because of what its basis functions are the eigenvectors of


So you use one special thing, the Laplacian, to connote the specialty of another? I can go ahead and do the same duality for those other transforms and their respective basis functions. Specialness in mathematics is a fruitless and fealty affair - G.H. Hardy


that's not it


Genuinely curious - How does this save space? If the input was real, the first operation (inside the fft call) makes it complex, which doubles the storage space. Sure the output is real again, but the intermediate storage was 2x the input and output.


Since the input is copied to the real and imaginary components, you don’t actually need to copy it at all.

Think of it like having a pointer (in the C sense) to the real and imaginary image buffers. Set both pointers to the input signal, then run your FFT algo on that.

The key is that it minimizes memory bandwidth, not computation. It’s highly cache coherent.


Thanks, so the real and imag components always map to the same value during the processing, i.e. after every butterfly the real and imag will be the same value? Because if the real and imag components ever differ, it means more storage is needed.


Correct, they are identical by definition.


What is `area(x)` here? Some kind of numerical integration like the trapezoid rule? Can someone please provide example code which can actually be tested?


The author links to another blog post called "A nice approximation of the norm of a 2D vector" [1]. I recently learnt from YouTube about the alpha max plus beta min algorithm [2], which is a better approximation: it uses one max and a linear combination to achieve a 3.96% maximum error, compared to the blog post's approximation, which uses two maxes and has 5.3% error. I also wrote a comment on his blog, so hopefully he will see it.

[1]: https://klafyvel.me/blog/articles/approximate-euclidian-norm... [2]: https://en.wikipedia.org/wiki/Alpha_max_plus_beta_min_algori...


From wikipedia it looks like both the article's approximation and this method have octagonal level sets. I was wondering what the actual difference was.

Looking into it a bit, I came across the error graph [2] from the article [1]. The main difference seems to be that the octagon used by the article is 'too small': the maximum error would be smaller with a larger octagon. Currently there is an undershoot of 5.2% and an overshoot of 2.5%; to minimize the maximum error, the undershoot and overshoot need to be equal.

Another notable difference is that the article's octagons have a different rotation than the alpha max plus beta min algorithm's. The article's octagons cross the axes perpendicularly, whilst the alternative algorithm's are rotated by 360/16 = 22.5 degrees, so they have vertices at the axes. That certainly changes at which angles the approximations are better, which might matter in practice.

What I wasn't able to find quickly is the average error for the article's approximation. Would be interesting to compare that to the 2.41% for the better algorithm.

[1] https://klafyvel.me/blog/articles/approximate-euclidian-norm... [2] https://klafyvel.me/assets/blog/articles/approximate-euclidi...


the difference is error metric. the article is minimizing L2 error instead of maximum error.


Little known fact: the FFT is an O(n lg n) algorithm, but the incremental FFT is O(n), meaning that if you want to compute the FFT over the past n points of a stream of samples, it only takes O(n) to update it for every new incoming sample.


O(n) for every new sample is O(n^2) when seen over time/samples?


Yes but you have no choice for streaming/real-time applications or anytime you need results for each sample, where doing the full recompute would be O(n^2 log n) total.


No, it is O(mn) if your stream is m samples long, compared to O(m n lg(n)) for the alternative.


@utopcell do you have a reference for incremental FFT? I'm not sure I understand what you're implying.

Given a polynomial f(X), obtaining f's evaluations over the n-th roots of unity takes O(nlogn) using FFT. Are you stating that obtaining g(X) != f(X) evaluations over the n-th roots of unity should take O(n) time assuming some precomputation derived in the FFT for f?


If X_k is the coefficient for the sample window (x_0, x_1, .., x_{n-1}) and X'_k is the coefficient for the window (x_1, .., x_n), then X'_k = (X_k + x_n - x_0) exp(2*pi*i*k/n).
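That update is often called a sliding DFT. A sketch of it in NumPy (my implementation of the recurrence, checked against a full recompute of each window):

```python
import numpy as np

def sliding_dft(stream, n):
    # Sliding-window DFT: one O(n log n) FFT for the first window, then
    # each new sample updates all n bins in O(n) using
    #   X'_k = (X_k - x_oldest + x_newest) * exp(2*pi*1j*k/n)
    stream = np.asarray(stream, dtype=float)
    X = np.fft.fft(stream[:n])
    twiddle = np.exp(2j * np.pi * np.arange(n) / n)
    yield X.copy()
    for t in range(n, len(stream)):
        X = (X - stream[t - n] + stream[t]) * twiddle
        yield X.copy()

x = np.random.default_rng(2).random(40)
windows = list(sliding_dft(x, 8))   # one spectrum per window position
```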


One tidbit I got out of this that is not directly related to the content of the post is that there are apparently some high-quality instructables out there, and that people spend time writing them!

99% of the instructables I’ve come across in the last 10 or so years have been low-effort junk that aren’t even worth reading. The few the author linked are quite great.


The Oogoo articles on Instructables were eye-opening for me.


Maybe should have done it in a GitHub README.


It would be interesting to know what the side project that required FFT on the Arduino was. I'd also be interested in which other applications the FFT algorithm has on Arduino given the performance shown. Could you create a real time guitar tuner for example?


You wouldn't use an FFT for a guitar tuner, you'd just time zero crossings.

I did use an Arduino to detect "key tones" on a wireline radio controller, sampling at 16kHz (ADC running flat out on one channel) and implementing Goertzel's Method to detect the 2175Hz guard tone and 1950Hz and 1850Hz keying tones, and a twin-T notch filter to remove the 2175Hz tone from the audio. The controller sent a short burst of 2175Hz followed by a short burst of one of the lower tones, to key up the transmitter on the correct channel, then the presence of any audio on the line (including the "low level guard tone", 2175Hz but much quieter) kept the transmitter on until you'd finished speaking.
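For anyone curious, Goertzel's method is only a few lines. This is a generic floating-point sketch (not the poster's AVR code): it computes the power at one target frequency in O(N), with no full FFT.

```python
import math

def goertzel_power(samples, fs, f):
    # Goertzel's algorithm: a second-order IIR resonator whose final
    # state gives the spectral power near frequency f.
    coeff = 2.0 * math.cos(2.0 * math.pi * f / fs)
    s1 = s2 = 0.0
    for x in samples:
        s1, s2 = x + coeff * s1 - s2, s1
    return s1 * s1 + s2 * s2 - coeff * s1 * s2

fs = 16000
block = [math.sin(2 * math.pi * 2175 * n / fs) for n in range(400)]
guard = goertzel_power(block, fs, 2175)   # strong: the tone is present
other = goertzel_power(block, fs, 1850)   # weak: this tone is absent
```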

It was sensitive enough to detect the low-level guard tone even when the audio over it was quite loud.

As far as I can tell, it's been running for a good ten years now and I've never been called back in to fix it.

Edit: just to be clear, the notch filter was analogue, in the audio path! I did make a similar device to process audio from an RTP stream between devices linking radios that did everything digitally, but that required a Raspberry Pi with a massive amount of CPU horsepower and an almost infinite amount of RAM - it used almost 128MB!


Apart from the fact that FFT is way more complex than needed for a guitar tuner, I also believe it’s not even really possible to use a Fourier Transform for a useful guitar tuner. Because you won’t achieve good enough frequency resolution.

I would be interested if I am missing any possible tricks here. But the low E string on a guitar is about 82.4 Hz. For a useful basic tuner you would need a precision of maybe 5 cents (a cent is 1/100 of a semitone), or roughly 0.25 Hz (roughly 1 Hz for the high E string). Frequency resolution with the FFT improves when you analyse a longer signal: for 0.25 Hz resolution, you need more than two seconds of audio. Two seconds where the pitch doesn’t change. I don’t think you’d get anywhere with such a design.

I’m not sure if this could be changed with a higher sample rate. With a higher sample rate, you have more samples per second to feed into the FFT, but also the upper boundary for frequency increases, so I believe frequency resolution just stays the same. I’d need to look up the formulas, though. Maybe with extreme zero padding? Or any other tricks?


It's possible to interpolate between FFT bins if you're assuming the input is a reasonably pure tone (because a pure tone in between two FFT bins will fill both of them in proportion to how close it is to each of them). This can give you extremely good frequency resolution (and is also good at rejecting noise far away from the tone of interest).
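A sketch of that interpolation (one common variant: fit a parabola to the log-magnitudes of the peak bin and its two neighbours; the Hann window and test tolerances are my choices):

```python
import numpy as np

def estimate_tone_hz(x, fs):
    # Find the peak FFT bin, then refine it by parabolic interpolation
    # over the log-magnitudes of the peak and its two neighbours.
    mag = np.abs(np.fft.rfft(x * np.hanning(len(x))))
    k = int(np.argmax(mag[1:-1])) + 1            # ignore DC and Nyquist
    a, b, c = np.log(mag[k - 1 : k + 2])
    delta = 0.5 * (a - c) / (a - 2 * b + c)      # sub-bin offset in (-0.5, 0.5)
    return (k + delta) * fs / len(x)

fs = 8000
t = np.arange(fs // 4) / fs                      # 0.25 s: raw bins are 4 Hz apart
low_e = estimate_tone_hz(np.sin(2 * np.pi * 82.4 * t), fs)
```

With only 0.25 s of signal the raw bin spacing is 4 Hz, yet the interpolated estimate can land within a fraction of a hertz of the true 82.4 Hz, which is what makes an FFT-based tuner feasible at short window lengths.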


You're absolutely right. Even on a bass where your low E is about 40Hz (or a 5-string where you have a low B at 31Hz, give or take) you'd only need a handful of cycles to track pitch very accurately.

The trick works because you're only trying to tune one string at a time. An FFT approach would presumably be able to track the pitch of all the strings regardless of what notes you were playing, with a bit of thought, although if you played an A on the 6th string 5th fret and an open 5th string it wouldn't be able to tell which was "out".


> good enough frequency resolution

phase vocoder, little man


s/little man/my friend/

the deprecatory reading was unwarranted


One example is audio visualisers, though you can buy chips like the MSGEQ7 that will output amplitudes at a range of fixed frequencies.

See for example: http://elm-chan.org/works/akilcd/report_e.html - Elm Chan's fftavr library has been around for ages now.


Yeah, I know Arduino is really low power and cheap, but these days you can spend like $35 to get a Rasp. Pi (or equivalent) small computer for actual compute girth for an application. Alternatively, a UART-to-wireless link could connect your data-stream to a server somewhere for stronger processing.

My guess is that you need something with incredibly small power usage, which must run on the Arduino only.

A lot of FFT projects I'm thinking of are just... crappy SDRs. Maybe audio is "slow" enough that an Arduino's FFT is fine, but most applications are probably doable with just regular digital filters.


Think about using the Despain algorithm, derived by recursively decomposing an N-point FFT into FFTs of size sqrt(N). You end up with a bunch of 4-point FFTs where the twiddle factors are rotations by multiples of pi/2, which can be implemented by swapping the real/imaginary parts and changing a sign. Non-degenerate twiddle factors still occur between those 4-point FFTs. In this way the number of maths operations can be reduced. The cost is that the input and output are both in a weird order.
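The multiplier-free 4-point building block looks like this (a sketch: multiplying by -1j is just a real/imaginary swap plus a sign flip, so no true multiplications are needed):

```python
import numpy as np

def fft4(x):
    # 4-point DFT with no true multiplications: every twiddle factor
    # is a rotation by a multiple of pi/2, i.e. +/-1 or +/-1j.
    a, b, c, d = x
    t0, t1 = a + c, a - c
    t2, t3 = b + d, -1j * (b - d)   # -1j rotation: swap re/im, negate
    return np.array([t0 + t2, t1 + t3, t0 - t2, t1 - t3])

out = fft4([1.0, 2.0, 3.0, 4.0])
```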


Really this should be titled 'for AVR' - Arduino covers a broad range of microcontrollers with very different capabilities. As much as some people deliberately try to suggest Arduino is a microcontroller, it isn't.


True, but 'Arduino' sounds better and drives clicks, as it is well known.

Most readers, even on HN, don't know exactly what AVR is. It could very well be Automatic Voltage Regulator.


> abhilash_patel's instructable is a Great instructable

Christ, I hate Instructables. A site with such shitty presentation that should be so much better.

Years ago I tried using an FFT library for the Arduino to convert audio to frequency domain in order to "bucket" the audio spectrum and (hint: music visualizer) light up some LEDs.

The Arduino+FFT were not however up to the task and I ended up going with a dedicated IC to do the audio frequency bucketing.

I suspect these days a better FFT or hardware like the Teensy could likely handle the job.


> Christ, I hate Instructables. A site with such shitty presentation that should be so much better.

Yeah, the proper way of using Instructables is to download the PDF version of the article.


This AVR FFT is potentially faster.

http://wiki.openmusiclabs.com/wiki/ArduinoFFT

There's also a Hartley available there.

Some good links too. Like http://www.elektronika.kvalitne.cz/ATMEL/necoteorie/transfor...


Pretty cool stuff.

For people who know the capabilities of the Arduino: is it possible to do real-time image processing using OpenCV or a similar library on it?


Probably not, unless you're willing to stretch the definition of 'real time' or 'image'. The classic Arduino Uno processor, aka the ATmega328, does about 1 million 8-bit instructions per second per MHz of clock [0]; the Uno has a 16MHz crystal. Even for the contrived example of a 1MP webcam that outputs uint8 grayscale values, you'd only be able to read it at about 16 frames per second, max, much less do anything with it. If you limited yourself to 64x64 boolean images, maybe; you could probably run a simple NN to categorize MNIST examples or do blob detection.

If you want to mess around with image processing within the realm of the Arduino framework, you could pick up a super cheap ESP32-CAM dev board [1], which has a small camera and a dual core microprocessor with much less anemic specs. You can of course program it using the Arduino IDE or, as I prefer doing it, using PlatformIO [2] which is a CLI tool and VSCode extension that allows you to use the [Arduino, ZephyrRTOS, mbed] framework with a zillion different dev boards and architectures.

[0]: https://en.m.wikipedia.org/wiki/ATmega328

[1]: https://www.amazon.com/HiLetgo-ESP32-CAM-Development-Bluetoo...

[2]: https://platformio.org/


Indeed, the RP2040 and ESP32 are where the world has moved to. What would be the point of ever using ATmegas anymore, when the RP2040 has two Cortex-M0+ 133MHz cores, a bunch of peripherals, and only costs $1?


The RP2040 itself 'only costs $1' without the flash chip it needs. Then what if you need an analog comparator? Another chip. Now what if you want a medium current PWM output? Another chip. AVRxt can output as much current on a single pin as an RP2040 can through its entire package. The two are simply not competing in the same space. AVR isn't strong in computation but is a great general purpose brain for many simple real world interfacing machines.


> Now what if you want a medium current PWM output

mmbt2222, 2n7000, uln2003, tip122, irf540/9540, sheesh

its inputs have a schmitt trigger mode. if you can deal with 200mv hysteresis you can maybe use three resistors instead of an analog comparator


All options and all costing you more money on top of the 'only $1'.


an mmbt2222 costs 1.6 cents and handles three times the current and six times the voltage of an avr

technically yes it's a chip but not an ic

resistors are under a cent


Looks like you've cracked the code. AVR is obsolete after all.


has been for ten years


Given that's the case, do you have an explanation for Microchip's development of numerous new AVR product lines since their acquisition of Atmel in 2016?


Comfort and familiarity. They’re comfortable chips you can learn inside and out, with a fairly huge ecosystem around the parts/arch/IS (importantly, one that is approachable because it’s also powered by people in a similar position and with similar priorities/concerns to your own). You don’t have to read scary-looking 600-page datasheets (then realize there’s a separate datasheet for the core vs the rest), worry about soldering TQFN or TFBGA packages with your (t)rusty old Weller iron, learn the distinction between the Cortex core and the manufacturer-specific everything else around it, worry about things like initializing timers and peripheral buses, or concern yourself too much with power states. If the chip can’t do DMA and can’t do hardware-accelerated x, y, or z, it takes a lot of pressure off when it comes to nerd-sniping yourself into doing something in a better way, or even just deciding “this isn’t possible, might need to pick a different approach/problem/solution altogether.”


yeah, good points

also they do have adcs and comparators and can drive 200ma. and they are reasonably tough

but you don't need much adc precision or speed or much drive current before you want something better


people buy them


The RP2040 is for people who aren’t comfortable leaving the relative safety and comfort of Raspberry Pi, and are thus prevented from exploring other, better options.

I’ll take an STM32 over an RP2040 any day. (The “Black Pill” dev/breakout board is a dollar and fifty cents on AliExpress, and I’ve seen it sell for a dollar or less.)


STM32 with integrated OpAmps probably is more useful to more people (especially as your "default pick" for most electronic projects), than almost any other microcontroller.

-----

EDIT: Speaking of which, I'd have to imagine that most FFT algorithms for low-power / simple electronics work would rather be implemented in analog OpAmps + Analog Filters rather than digital logic (or digital filters).


There's also the OpenMV cam: https://openmv.io/


A little bit faster at 20MHz, but not that it would help much.

No reason i see to use a microprocessor for image processing. You either go for an FPGA, or something beefier like a raspberry pi


I think you meant "microcontroller", the Pi 4B for instance is powered by a 64-bit quad-core ARM A72 running at 1.5 GHz. That is most certainly what most people would call a (rather beefy) microprocessor.


yeah, definitely meant microcontroller. I'm targeting my statement at hobbyists and people working on solo projects.

If you're some company that has the R&D budget to make it work with a beefy microprocessor, then that makes perfect sense.


You’d be surprised how far you can get with an ESP32-S3, the Xtensa LX7 is a weird little chip, but surprisingly can be quite powerful. Though I’m biased of course because we use them to their limits at work!


haha, if you can get an ESP32 to do image processing, then you're not an average hobbyist


I guess, but it’s also pretty open code these days now that the S3’s LX7 has some nice vector extensions and other additions for ML applications :)

https://github.com/espressif/esp-who

It’s not out of reach for hobbyists, I think, though getting it to run very well might be. And I would certainly tell anyone interested to start with something more powerful first!

I’m currently experimenting with porting Arraymancer to the S3, just to see how it can run, outside of work time mind you.

https://github.com/mratsim/Arraymancer

Slightly orthogonal to what we are discussing above, but more to show that these weird micros have some surprising grunt if you know how to make them sing.

Though I still wish ESP-IDFs malloc was better behaved.


There are many reasons: power usage, size, cost, availability being some of the main ones. I've been using microcontrollers for wearables and, as much as I'd like the power and ease of use of a Raspberry Pi, the size and power requirements can be too much to deal with. The Teensy 4.0 has proven to be a much better alternative for me - much smaller, lower power consumption, and still plenty of performance including a floating point unit which has been a game-changer.


I was going to quibble with your use of "microprocessor", given that you seem to be implying that it wouldn't include more powerful CPUs, but that got me thinking: given the change in transistor sizes since that name was coined should we be calling them "nanoprocessors" these days?


Agh, I meant to say microcontroller. But you're right, microprocessor is too broad of a term these days when it can refer to both a top-flight AMD workstation chip and a tiny stm8. Nanoprocessor sounds about right for the latter :)


Wouldn't the Pi be a problem, considering its core OS, and especially the graphics-processing "OS", is closed source?


Face recognition is possible on a tiny and cheap ESP32-CAM[0] using Espressif's own ESP-WHO[1] framework. Apparently it can process images at around 3 fps. I have an ESP32-CAM on order to try this out as part of a side project of mine[2], so I can make the eyes look at faces :) I would also love to get hold of a Person Sensor[3], which runs custom firmware specifically designed for face detection and would be absolutely perfect for this, but they're not in stock currently.

There's other libraries[4][5][6] that look like they could work too and I plan to try. As far as microcontrollers goes, the Teensy 4.0[7] should be powerful enough for reasonably fast processing (and it has a floating point unit!), though I'm yet to find a good library for doing so.

[0] https://robotzero.one/face-tracking-esp32-cam/

[1] https://github.com/espressif/esp-who

[2] https://github.com/chrismiller/TeensyEyes

[3] https://usefulsensors.com/person-sensor/

[4] https://github.com/ezelioli/Face-Detection-on-Microcontrolle...

[5] https://github.com/nenadmarkus/pico

[6] https://www.reddit.com/r/embedded/comments/g77exq/recommende...

[7] https://www.pjrc.com/store/teensy40.html


This is super cool - but why use a micro with no FP unit and only 2048 bytes of RAM? Something much more capable likely wouldn't be more expensive.


For one, AVRs are really cheap and neat, and a few cents can add up, e.g. if you plan to sell 10k+ devices. Sometimes it's also just what you already have lying around, so better to use that than to let it get covered in dust.

Some anecdotes for when I assisted on a µC course at my university where we used AVRs:

We also had one practical assignment with FFT: reading out the RPM of a fan via a microphone input sampled by an ADC, to detect blocking of the fan (as a fallback for early detection of cooling failures in a simulated processing plant), with a user interface displayed on a GLCD on top of it. It was quite a pleasure to see the tiny thing handle all that in real time and still provide a fluid user interface.

Another assignment was creating a lunar-lander-like game controlled by Wii nunchucks, with sound (read from an SD card over SPI and then routed to an external sound module over SPI; DMA would have been great, so to say), and high scores sent over an external SPI-attached Ethernet module.

The game required fixed-point arithmetic for its physics too, and it truly taught students that parallelism and concurrency are two different things, and that one simple core can already do a lot of cool stuff.


Same reason stories are about running Doom on underpowered devices and not on GPUs. For the challenge.

Running FFT on powerful devices with FP would not be a story.


Fair enough, that’s a good answer!


For the same reason as I implemented a VA synth on an Arduino with an antialiased oscillator and 2-pole resonant filter - because building a Ship In A Bottle is fun!


Is it open source? I'm curious.


The oscillator bit is, but I don't appear to have uploaded the one with the filter which I will need to find.

https://github.com/ErroneousBosh/slttblep

The filter is just an SVF, with a precomputed expo scale "bent over" at the top to correct for quantisation. From that, the two SVF coefficients ω/Q and ω*Q are calculated every time there's a control update, and of course, because you can't divide on an Arduino, it uses a lookup table of reciprocals, just as for the blep. I could probably use a "wider" table of 16-bit values for better precision.
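In floating point the SVF core is tiny. Here's a generic Chamberlin-form sketch in Python, not the poster's fixed-point AVR code, just to show the two-integrator loop the coefficients feed (the poster's ω/Q and ω*Q parameterization differs slightly from the f and 1/Q coefficients used here):

```python
import math

def svf_lowpass(samples, f, q, fs):
    # Chamberlin state-variable filter; returns the low-pass output.
    fcoef = 2.0 * math.sin(math.pi * f / fs)
    qcoef = 1.0 / q
    low = band = 0.0
    out = []
    for x in samples:
        low += fcoef * band
        high = x - low - qcoef * band
        band += fcoef * high
        out.append(low)
    return out

# A DC input should pass through the low-pass path with unity gain
# once the filter settles:
dc = svf_lowpass([1.0] * 5000, 1000.0, 0.7, 44100.0)
```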


where would be the use cases for FFT ? I'm a newbie.


If you have 26 minutes spare, Veritasium's video [0] on the history of FTs and FFTs is quite good.

Effectively, a Fourier transform can turn a signal into its underlying frequencies using a neat property of maths. For a piece of music, say, you could produce a spectrogram [1] - Foobar2000 [2] has a nice one built in - to visualise the frequency content of the music over time. This has all sorts of useful applications any time you have a signal: electronics, climate physics, astrophysics, detecting underground nuclear test explosions, and so on. It's essential to modern medical imaging, and a whole bunch of differential equations can be solved a lot faster by using it.

The FFT takes the naive O(N^2) algorithm and makes it O(N log N) by bunching together repeated calculations, so if you have a million samples, that's some 70,000 times faster. At the time of its discovery it was critical for making Fourier analysis actually reasonable to do on a computer, taking days instead of years.

[0] https://www.youtube.com/watch?v=nmgFG7PUHfo [1] https://en.wikipedia.org/wiki/Spectrogram [2] https://www.foobar2000.org
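The O(N^2) vs O(N log N) point is easy to make concrete: the naive DFT is just a dense matrix-vector product, and the FFT computes exactly the same thing.

```python
import numpy as np

n = 256
x = np.random.default_rng(3).random(n)
k = np.arange(n)
F = np.exp(-2j * np.pi * np.outer(k, k) / n)   # dense N x N DFT matrix
naive = F @ x                                  # O(N^2) multiply-adds
fast = np.fft.fft(x)                           # same result, O(N log N)
```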


The FFT (“Fast Fourier Transform”) accelerates the computation of the Fourier transform from the naive O(N^2) algorithm to O(N lg N) using divide and conquer. Because of the convolution theorem, which says “convolution in the time domain is multiplication in the frequency domain”, the FFT allows you to compute the convolution of two vectors by performing (1) two O(N lg N) FFTs, (2) one O(N) product in the spectral domain, and (3) one O(N lg N) inverse FFT. This is an asymptotic speed-up over the naive O(N^2) convolution.

Convolution is the mathematical operation used to compute linear filters, correlate two signals, and multiply polynomials.

For example, if you have a string of length M and want to find copies of it in another string of length N, the FFT blows away the O(NM) naive approach (for big enough M).
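A minimal demonstration of the convolution theorem in NumPy (note the zero-padding to the full linear-convolution length, so circular wrap-around doesn't corrupt the result):

```python
import numpy as np

def fft_convolve(a, b):
    # Linear convolution via two FFTs, one pointwise product, one inverse FFT.
    n = len(a) + len(b) - 1
    return np.fft.irfft(np.fft.rfft(a, n) * np.fft.rfft(b, n), n)

rng = np.random.default_rng(4)
a, b = rng.random(50), rng.random(20)
spectral = fft_convolve(a, b)
direct = np.convolve(a, b)   # O(NM) reference
```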


As has been pointed out, a music visualizer is an obvious use-case.

Audio data (via microphone) digitized in real time and sampled by the Arduino is necessarily in the "time domain". To bucket the audio into frequency bands (perhaps your visualizer is simple and just wants to light up LEDs relative to the amplitude of the low, mid and high frequencies), you need to run that time-domain data through an FFT to get frequency-domain data out of it.


Veritasium has a recent video that goes into the algorithm a little bit. So does 3blue1brown. Basically anything that processes signals digitally will probably use the FFT somewhere. But even outside of that, you can use it for some kinds of matrices in linear algebra, and probably many other places :)



