Hacker News new | comments | show | ask | jobs | submit login
Loefflers Discrete Cosine Transform Algorithm in Futhark (pepijndevos.nl)
45 points by fulafel 7 days ago | hide | past | web | favorite | 8 comments

Very nice! I wasn't aware of this algorithm at all, but it seems to be an early application of the lifting scheme to the DCT filter, 8 years before this method was introduced in the more general context of perfect reconstruction filters.

There's a great expository paper about these kinds of filter implementations by Daubechies and Sweldens in the general case: https://9p.io/who/wim/papers/factor/factor.pdf

Interestingly enough, Sweldens does not seem to cite Loeffler's paper, so it's likely that they came up with the same method independently of one another.

Doing DCT the naive way, the formula is as easy as the DFT. But computing the DCT quickly (like in O(n log n) time) is much harder to understand than the FFT.

I read a couple of papers and implemented the fast DCT algorithms. The Arai, Agui, Nakajima 8-point DCT uses 13 multiplications. The Lee DCT algorithm is recursive and works on any length that is a power of 2. https://www.nayuki.io/page/fast-discrete-cosine-transform-al...

> papers, I found the original [...] after days of struggling, I almost understood it


> From this code and staring at the paper, I learned a few things. First of all figure 1 is wrong.

Probably anyone who has implemented a non-trivial algorithm from a paper will feel a sympathetic twinge ;-)

I'm always sad that scipy and numpy do not use good FFT implementations due to ridiculous political concerns. However, in that case, there was a very happy outcome with this beautiful article!

I Googled around a bit and didn't find anything on political concerns other than the GPL thing (which would cover linking, not reimplementation of math). Is that the political concern in question or is there also something else?

Yes, I was referring to that. They refuse to use the standard and state-of-the-art FFTW3 library because it's distributed under the GPL. (Actually, it's dual-licensed GPL/commercial).

FFTW3 is where near close to state of the art anymore actually... At least on a few platforms, where GPGPU is available.

in any case it's an order of magnitude faster than scipy's implementation. Can you point me to a good free software implementation of the fft in gpu?

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact