Using Fourier Transforms to Multiply Numbers (robertelder.org)
208 points by signa11 42 days ago | 43 comments



Intuitive explanation: if you interpret a time series as the coefficients of a polynomial, the Fourier transform is just polynomial multiplication (this makes sense because polynomial multiplication is just convolution). Now we can write base-b numbers as polynomials with single-digit coefficients, evaluated at x = b. After multiplying two such numbers/polynomials, the coefficients can be too large to be single digits. The only thing remaining is to carry the overflows to the next digit (much like the school algorithm for adding numbers).
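A minimal Python sketch of that digit/carry pipeline, in case it helps (it uses a naive O(n^2) convolution rather than the FFT, and the function name is my own):

  # Treat base-10 digits as polynomial coefficients, convolve them
  # (= polynomial multiplication), then carry the overflows.
  def multiply_via_convolution(a, b, base=10):
      da = [int(d) for d in str(a)[::-1]]  # least-significant digit first
      db = [int(d) for d in str(b)[::-1]]
      coeffs = [0] * (len(da) + len(db) - 1)
      for i, x in enumerate(da):
          for j, y in enumerate(db):
              coeffs[i + j] += x * y       # convolution of the digit sequences
      result, carry = 0, 0
      for k, c in enumerate(coeffs):
          carry, digit = divmod(c + carry, base)
          result += digit * base ** k      # carrying, like school addition
      return result + carry * base ** len(coeffs)

  assert multiply_via_convolution(1234, 5678) == 1234 * 5678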


The Fourier transform isn't polynomial multiplication. (It operates on one thing, not two.) What the Fourier transform does is to translate between two representations of a polynomial: (1) its coefficients, (2) its values at a suitable set of points. And then the thing that makes FFT multiplication work is that pointwise multiplication of values is cheaper and simpler than the convolution operation that gives you the coefficients of the product of two polynomials.
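To make that concrete, here's a small sketch of the evaluate / pointwise-multiply / interpolate pipeline (numpy and the function name are my own choices; coefficients listed lowest-degree first):

  import numpy as np

  def fft_poly_multiply(p, q):
      n = len(p) + len(q) - 1        # number of coefficients in the product
      P = np.fft.fft(p, n)           # (1) coefficients -> (2) values at roots of unity
      Q = np.fft.fft(q, n)
      values = P * Q                 # pointwise multiplication of values: cheap
      coeffs = np.fft.ifft(values)   # (2) values -> (1) coefficients of the product
      return np.rint(coeffs.real).astype(int)

  # (2 + x) * (4 + 3x) = 8 + 10x + 3x^2
  print(fft_poly_multiply([2, 1], [4, 3]))  # [ 8 10  3]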


Every once in a while I pick up "The Scientist and Engineer's Guide to Digital Signal Processing" (freely downloadable: http://www.dspguide.com/) and it's always so satisfying to see the different relations between domains and how you can think about it. Your example is no different.

But since I never get to actually use that stuff in my job, I always forget the details.


Thanks for that alternate way of thinking about it. As an engineer, half of my intuition about the FFT comes from the notion of the "frequency domain", and the other half from "change of basis".

The explanation you gave, of FFT as an alternative representation of a polynomial through its value at selected points, gives that nice intuition about pointwise multiplication, without appealing (directly) to the "convolution-turns-into-multiplication" catchphrase.


Ouch, good point! I was at work and a bit too eager to have the first post, I think ;)


For anyone who's interested in a more long-form, intuitive explanation of the Fourier transform and how you can use it to multiply polynomials, I'd highly recommend Erik Demaine's lecture on the topic: https://www.youtube.com/watch?v=iTMn0Kt18tg.


A very interesting look at large integer multiplication algorithms, including the FFT method, that finds underlying mathematical connections between them all, is given in the paper "Multidigit Multiplication for Mathematicians" by Daniel J. Bernstein [1].

He ties them all to exploiting certain ring homomorphisms, either to compute r s with the help of f(r) f(s), or to compute f(r) f(s) with the help of r s. Karatsuba, Toom, FFT, Schönhage–Strassen...all are using ring homomorphisms to go between what you want to be doing and something that is easier.

Also see his later "Fast multiplication and its applications" [2].

[1] https://cr.yp.to/papers/m3.pdf

[2] https://cr.yp.to/lineartime/multapps-20080515.pdf


I wrote a much more in-depth analysis of Schönhage–Strassen and why exactly it's an O(N log N log log N) algorithm here: https://psun.me/post/fft2/

It's pretty interesting: it's actually a recursive multiplication algorithm with log log N levels of recursion and N log N work at each level. And it uses another fast multiplication algorithm within it, the Karatsuba algorithm.


Man, is there anything Fourier Transforms can't do? For a music project in college, I created a book of Hammond Organ presets by taking slices of samples from actual instruments and running FTs on them, then finding the best approximation in Hammond drawbar notation. The fact that it worked so well was such a satisfying experience that I managed to make 100+ combinations.


I sort of did the reverse for a song once: I used an FFT to analyze a recording of a drawbar organ I wanted to mimic, then turned that into a ZynAddSubFX additive synthesis preset.


This is where I started, actually! I wondered how far back I could go though, and the Hammond B3 worked extremely well.


Wow so you made an organ sound like a bunch of other instruments? You can't just post this and not share the sounds...


That is exceedingly cool. Is there anything available about this online? Sound clips, source code, etc.?


Most were typed up, but the source files are on a floppy disk somewhere in my closet. I can scan them in/upload them when I get a chance.


floppy disks what are those Kappa


I have to admit that I've never gotten around to sitting down and thinking through why fast FFT multiplication works, so I have a question that maybe someone here could answer. Some background:

If a number is a multiple of two (i.e. n = 2k + 0), we call it even, or if n = 2k + 1 then it's odd. This is interesting because evenness and oddness form an algebraic structure - an amoeba of an algebra with only two elements: even * odd = even, odd * odd = odd, etc...

But a moment's reflection reveals there is nothing special about being a multiple of two. We can just as well consider multiples of three and divide the universe into three groups: if a number is a multiple of three (i.e. n = 3k + 0) we call it 3-even, or if n = 3k + 1 then it's 3-odd, or if n = 3k + 2 then it's 3...blern! Again this is interesting because we have a mini-algebraic structure, this time with three elements. For example, a 3-odd times a 3-blern is (3k+1) * (3j+2) = 3(3kj + 2k + j) + 2, which is a 3-blern.

Similarly we have multiples of four: 4-even, 4-odd, 4-blern, 4-pop. Making up these names is of course absurd, so we instead invent a notation: modular arithmetic. Traditional even and odd are arithmetic modulo 2 (mod 2 for short); 3-even 3-odd 3-blern are arithmetic mod 3, etc. The mini-algebras are the cyclic groups or the multiplicative groups of integers modulo n.

The key word there being cyclic: we've identified periodic structures in the integers with different 'wavelengths' or periods, which is how long it takes for the even/odd/blern cycle to repeat:

  (mod 2) 0 1 .. 2 3 .. 4 5 .. 6 7 .. 8 9 .. 10 11 ..
  (mod 3) 0 1 2 .. 3 4 5 .. 6 7 8 .. 9 10 11 ..
  (mod 4) 0 1 2 3 .. 4 5 6 7 .. 8 9 10 11 ..
 
----------

Back to the topic of discussion. Typically we represent a number 'locally' with coordinates given wrt a base system, but it seems we might describe a number instead nonlocally, using these cyclic groups. That is, instead of saying 1 * 10^0 + 1 * 10^1 + 0 * 10^2 ... (11), I would say

  2-odd:   11 = 2 * 5 + 1
  3-blern: 11 = 3 * 3 + 2
  4-pop:   11 = 4 * 2 + 3
  5-odd:   11 = 5 * 2 + 1
So we're doing a kind of spectral decomposition: examining the number modulo 2, then mod 3, mod 4, etc. The key observation is, if two numbers are represented this way, how do I compute their product? Well, it's very easy, because I know what happens when I multiply, say, an odd times a blern... I get a blern! Multiplication becomes a pointwise operation. My question is this: is this an equivalent encoding to the one being done by the complex roots of unity in the FFT?


You're essentially describing a residue number system, except the different moduli need to be coprime for things to be well-defined.

FFT multiplication is basically a residue number system product on polynomials. Evaluating a polynomial p(x) at a point w_i is the same as computing p(x) modulo (x - w_i), where w_i here is one of the nth roots of unity. So the FFT computes p(x) mod (x - w_0), p(x) mod (x - w_1), ..., multiplies each residue individually, and the inverse FFT recovers the result modulo (x - w_0)(x - w_1)...(x - w_{n-1}) = x^n - 1.
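Here's a minimal Python sketch of the integer analogue (moduli chosen coprime, with product larger than the answer; the CRT reconstruction plays the role of the inverse FFT; the function name is my own):

  from math import prod  # Python 3.8+ for prod and pow(x, -1, m)

  def rns_multiply(a, b, moduli=(7, 11, 13)):
      residues = [(a % m) * (b % m) % m for m in moduli]  # pointwise products
      M = prod(moduli)                  # 1001 here; must exceed a * b
      x = 0
      for r, m in zip(residues, moduli):
          Mi = M // m
          x += r * Mi * pow(Mi, -1, m)  # CRT: pow(Mi, -1, m) is the inverse mod m
      return x % M

  assert rns_multiply(23, 31) == 23 * 31  # 713 < 1001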


Thank you!


One interesting application of NTT polynomial multiplication is in fully-homomorphic encryption libraries, such as Microsoft's SEAL [1], where the ciphertexts are polynomials of very high degree. (Disclaimer: I work at MSR using SEAL for machine learning.)

[1]: https://github.com/Microsoft/SEAL


This is well known in competitive programming; usually it's the second-best alternative, after Karatsuba.


Generally Fourier methods are asymptotically better than Karatsuba, but with much higher constants, which makes them kind of impractical until you get to very big numbers. Anyway, do competitive programmers tend to program everything from scratch, or is there another reason they don't just use GMP?


> Anyway, do competitive programmers tend to program everything from scratch, or is there another reason they don't just use GMP?

I think directly using algorithms and programs implemented by others would be considered cheating (and these libraries are not present on the server which checks your answer anyway). In competitive programming, participants compete on their own ability and knowledge to implement algorithms.


In ACM ICPC, we could bring printouts of library code and manually transcribe them into the computer. IIRC, FFT was a part of our team's library code, but you wouldn't use GMP since it would require too much typing.


In C++, thanks to std::complex, the implementation is short; see https://ideone.com/tbHUN7


C99 has complex, too. I think most competitive programming contests are now using C99, finally.


This property has helped make matrix profile computations feasible for motif discovery. https://www.cs.ucr.edu/~eamonn/MatrixProfile.html


I don’t think I’ve ever come across an algorithms book that discusses FFT that doesn’t discuss fast polynomial multiplication when introducing the algorithm.

That’s literally what the FFT does: fast polynomial multiplication.


If you come at it from an engineering approach (e.g., electrical engineering), it doesn't look like that at all. In fact, the focus is on filters and spectral analysis. It looked something like this when I was at Georgia Tech:

First, you learn about the convolution operation that, given any input function x(t), gets you the output y(t) of any linear time-invariant circuit. The convolution integral is nasty; professors make you feel the pain a bit. Then they introduce the Laplace transform to make the computation much easier. Then go to the continuous-time, continuous-frequency Fourier transform. Talk about the frequency domain a bunch, learn filter topologies, etc. Circuits class over.

Now comes a signal processing class where you learn about discrete-time signals. First, talk lots about sampling. Then, get introduced to discrete-time convolution. Now, learn the z-transform and the discrete-time Fourier transform (DTFT) for a discrete-time, continuous-frequency signal. Mostly discuss filtering and spectral analysis. Intro to signal processing class over.

Learn about sampling the DTFT in the frequency domain. This is the DFT, which is usually presented as a sum that would require O(n^2) operations to compute. Learn that this corresponds to circular convolution, and learn about zero padding for traditional convolution. Finally, get presented with the Cooley-Tukey FFT algorithm for base 2. The focus is still signal and spectral analysis. Talk lots about windowing. You may get a mention that convolution corresponds to polynomial multiplication here. Or maybe they talk about grade-school multiplication; it's really the same thing as polynomial multiplication with a carry. Senior-level signal processing class over.
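The circular-convolution / zero-padding point is easy to see in a few lines (numpy assumed):

  import numpy as np

  x, h = np.array([1., 2., 3.]), np.array([4., 5.])

  # Length-3 DFT product: circular convolution (the tail wraps around).
  circ = np.fft.ifft(np.fft.fft(x) * np.fft.fft(h, len(x))).real

  # Zero-pad to len(x) + len(h) - 1: ordinary (linear) convolution.
  n = len(x) + len(h) - 1
  lin = np.fft.ifft(np.fft.fft(x, n) * np.fft.fft(h, n)).real

  print(np.round(circ))  # [19. 13. 22.]      (the trailing 15 wrapped onto the 4)
  print(np.round(lin))   # [ 4. 13. 22. 15.]  matches np.convolve(x, h)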


I've worked with filters and spectral analysis, and in CS you would often do this in Digital Image Processing.

The point isn't what you use the FFT for but what the FFT does. The FFT does fast polynomial multiplication.


Yes, that's true. But if you come at it as a sampled continuous signal and spectrum, you don't think of the FFT as fast polynomial multiplication. Neither viewpoint is wrong. Just different lenses for looking at different problems.


I think I still think of it that way. In EE you are convolving a filter with a signal in the time domain, which you can think of as a polynomial. Doing the FFT, then a pointwise multiply + IFFT, accomplishes the same thing.


How can you think of a continuous-time signal as a polynomial, though? I don't know of a continuous-time analogy. It seems like the polynomial only works for the discrete case. And FIR filter convolution is rarely computed by FFT in FPGAs (in streaming applications).

Look at a system doing DSP, for example. Let's say you have an input RF signal that goes into an analog band-pass filter, a mixer, an analog low-pass filter, an ADC, and an FPGA. The FPGA implements a FIR filter - let's say a Hilbert transform using a distributed arithmetic algorithm. You are getting envelope information out of the FPGA. There's no FFT anywhere in the FPGA to compute this filter output.

You are seeing some weird noise in your DSP output. You dig a little by looking at the output of the ADC and then at the log magnitude of its FFT. You see some suspicious spurs popping up, but at really strange discrete frequencies. You show the spectrum to an RF engineer, then you two go to the lab and put an analog spectrum analyzer on the input to the ADC. Sure enough, there are mixing products of your input carrier, mixer frequency, and sampling clock frequency. The weird spur frequencies make sense now - you are seeing folding of the mixing products from the sampling. You tweak the sampling clock to move the spurs' digital frequencies.

In that long but real-world example, I don't see anywhere that thinking of things in terms of polynomials would be helpful. It's also going to lead to confusion of terminology with your RF engineer - they tend to think continuous.


Sorry, I think I misspoke. I didn't mean that I think of it in those terms normally. Working with RF every day, I think of it as time and frequency domain in the sense that you were describing. But mathematically I can see how the polynomial makes sense. Also, I was only thinking about discrete time, since we are talking about the FFT.


So, interesting thing, that. While I was introduced to the FFT in the same manner (as an algorithm for fast polynomial multiplication, in a class from the CS department), my colleagues with electrical engineering backgrounds are completely unaware of this use of the FFT. They use it as a change of basis to directly observe and manipulate frequency. The EEs I work with are much more familiar with the relationship between the Fourier transform of a function and what the original function looks like.


Bioengineer here; I've basically also just used Fourier and the FFT for signal analysis, never heard of it being used for multiplication.


Combinatorialist here; I mainly think of Fourier transforms as an efficient way to shuffle cards.

https://statweb.stanford.edu/~cgates/PERSI/papers/aldous86.p...


Neat, I've spent years with the Fourier transform and have never seen this!


Yep. Grade school multiplication of two large numbers is the same thing (except for the carries) as convolution of two signals in the time domain. The numbers are the signals. You already know that pointwise multiplication in the frequency domain is equivalent to (and faster than) convolution of those signals in the time domain. So that's what FFT multiplication is about.

It's only useful for VERY large numbers (thousands of digits), which is why most people never encounter it.
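An end-to-end sketch of the above in Python (numpy's floating-point FFT, so only safe up to modest sizes; serious implementations use exact transforms instead, e.g. a number-theoretic transform; the function name is my own):

  import numpy as np

  def fft_multiply(a, b, base=10):
      da = [int(d) for d in str(a)[::-1]]  # the digits are the signals
      db = [int(d) for d in str(b)[::-1]]
      n = len(da) + len(db) - 1
      # Convolve the digit sequences via pointwise multiplication of FFTs.
      coeffs = np.rint(np.fft.ifft(np.fft.fft(da, n) * np.fft.fft(db, n)).real)
      result, carry = 0, 0
      for k, c in enumerate(coeffs):
          carry, digit = divmod(int(c) + carry, base)
          result += digit * base ** k      # now handle the carries
      return result + carry * base ** n

  assert fft_multiply(31415926, 27182818) == 31415926 * 27182818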


Note that a “signal” in this context is just a trigonometric polynomial over a periodic interval.

If you think of your periodic interval as representing angle measure, and the points in the interval as points on the unit circle in the complex plane, then your trigonometric polynomial can alternately be thought of as a Laurent polynomial in the complex plane. https://en.wikipedia.org/wiki/Laurent_polynomial

What the FFT does is convert between the values of your function at n roots of unity in the complex plane and the coefficients of the Laurent polynomial interpolating those values.


Hmm, I'd bring convolution to this salad. I think in terms of polynomial products when thinking about convolution.


I'm guessing most people aren't introduced to the FFT in an algorithms course, though; they're introduced to it as the computational way to calculate an FT.


You're probably right; and even when I saw it elsewhere, it was still within the CS department.

For instance, we glossed over it during my Digital Image Processing course. The professor did introduce it as a form of fast polynomial multiplication, but we didn't dive too deep into Cooley-Tukey because frequency analysis was the focus of the work.


If anyone's interested in writing and testing their code for this - https://www.spoj.com/problems/MUL/



