
No, I don't use autocorrelation. After all, it isn't pitch detection software.



It's a cool project. I'm just curious, because I haven't bumped into anyone else who's dabbled with the CQT before.

So in what way is it variable Q? I thought the innovation of the CQT over a regular FFT is that the bins represent each note step with the same 'Q', making notes more distinguishable? (whereas an FFT has more 'Q' in the high frequencies)

I think having an autocorrelation option on the visualization could be cool, as it could reduce the spikes from the overtones (and also show a missing fundamental). But I think you'd need a different convolution for each instrument.


linear STFT's window length in time domain = k (constant)

CQT's window length in time domain = k/f

showcqt-js window length in time domain = a * b / (a / c + b * f / (1 - c)) + a * b / (b * f / c + a / (1 - c))

where a = 384, b = 0.33, c = 0.17

I also apply an asymmetric window to reduce latency before doing the VQT.
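As a rough illustration of the three rules above (a sketch, not code taken from showcqt-js), the functions below evaluate each window length, assuming f is in Hz and the result is in seconds. The function names and the k values used for the STFT and CQT cases are hypothetical, picked only so the curves are comparable; a, b and c are the values quoted above.

```python
def stft_window(f, k=0.33):
    """Linear STFT: the same window length at every frequency (k is a hypothetical value)."""
    return k


def cqt_window(f, k=384.0):
    """CQT: window length inversely proportional to frequency (k is a hypothetical value)."""
    return k / f


def showcqt_js_window(f, a=384.0, b=0.33, c=0.17):
    """Window length from the formula quoted in the thread, with a, b, c as given there."""
    term1 = a * b / (a / c + b * f / (1 - c))
    term2 = a * b / (b * f / c + a / (1 - c))
    return term1 + term2


if __name__ == "__main__":
    # Print the three window lengths side by side for a few frequencies.
    for f in (20.0, 100.0, 1000.0, 10000.0):
        print(f, stft_window(f), cqt_window(f), showcqt_js_window(f))
```

Taking the limits of the quoted expression, it tends to b as f goes to 0 and to a / f as f grows, so it behaves like a fixed-length window at low frequencies and like a CQT at high frequencies.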

> I think having an autocorrelation option on the visualization could be cool, as it could reduce the spikes from the overtones (and also show a missing fundamental). But I think you'd need a different convolution for each instrument.

Research needed. Doing autocorrelation on simple single-instrument monophonic audio is probably easy. But doing it on complex multi-instrument audio isn't. It probably needs some sort of machine learning.
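For the easy monophonic case, a minimal sketch of the idea, assuming a synthetic tone rather than real audio: the signal below contains only harmonics 2, 3 and 4 of a 200 Hz fundamental, yet the autocorrelation still peaks at the fundamental's period. All names and parameters here are hypothetical, and nothing in it uses showcqt-js.

```python
import math


def autocorrelation(x, max_lag):
    """Unnormalized autocorrelation for lags 0..max_lag over a fixed-length window."""
    n = len(x) - max_lag  # same number of terms for every lag
    return [sum(x[i] * x[i + lag] for i in range(n)) for lag in range(max_lag + 1)]


def estimate_period(x, min_lag, max_lag):
    """Return the lag (in samples) with the largest autocorrelation in [min_lag, max_lag]."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])


sample_rate = 8000      # Hz, assumed for this example
f0 = 200.0              # fundamental that is deliberately absent from the signal
harmonics = (2, 3, 4)   # only 400, 600 and 800 Hz are present
num_samples = 2060      # 2000-sample summation window plus the 60-sample maximum lag

signal = [
    sum(math.sin(2 * math.pi * h * f0 * t / sample_rate) for h in harmonics)
    for t in range(num_samples)
]

period = estimate_period(signal, min_lag=8, max_lag=60)
estimated_f0 = sample_rate / period

if __name__ == "__main__":
    print(period, estimated_f0)
```

With several instruments playing at once, the peaks of the different notes overlap, which is where this simple approach stops being enough.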


> showcqt-js window length in time domain = a * b / (a / c + b * f / (1 - c)) + a * b / (b * f / c + a / (1 - c))

> where a = 384, b = 0.33, c = 0.17

So this is some sort of compromise to increase speed? Have you thought about implementing in WASM?

> Probably, it needs some sort of machine learning.

Agreed, but for a visualisation, it could just be a parameter for the user to mess with.


> So this is some sort of compromise to increase speed?

Partially. The main purpose is to increase time domain accuracy at low frequencies. If you want to experiment with the window length (in time domain), use the ffmpeg showcqt filter (https://ffmpeg.org/ffmpeg-filters.html#showcqt). The showcqt-js window length is hardcoded to tc=0.33 attack=0.033 and tlength='st(0,0.17); 384*tc / (384 / ld(0) + tc*f /(1-ld(0))) + 384*tc / (tc*f / ld(0) + 384 /(1-ld(0)))' (https://github.com/mfcc64/mpv-scripts/blob/a0cd87eeb974a4602...).

> Have you thought about implementing in WASM?

It is implemented in WASM.

> Agreed, but for a visualisation, it could just be a parameter for the user to mess with.

If you want to experiment with it, showcqt.js exposes the intermediate color data cqt.color[]. It can be modified arbitrarily, including by applying autocorrelation. If some day I experiment with it and find the result satisfying, maybe I'll include it in YouTube Musical Spectrum.


> The main purpose is to increase time domain accuracy at low frequencies

I thought so. The main issue with the regular CQT is latency at low frequencies due to the excessive window length required. It makes it unsuitable for real-time applications, as you know. Thanks for the insights.



