
Building Spectro: a Real-Time WebGL audio spectrogram visualizer - shakes
https://github.com/calebj0seph/spectro/blob/master/docs/making-of.md
======
Lichtso
Also try using the Wavelet-Transform instead of short time FFT (with
overlapping windows).

It is easier to configure (fewer parameters, no need for a window
function), offers more flexibility (exponential frequency bands, e.g. for music
scales) and can reach the Gabor-Heisenberg uncertainty limit without
artifacts.

The only downside is that you need to know the entire signal in advance, so it
can only be used for recordings.

Shameless self-promo of my implementation:
[https://github.com/Lichtso/CCWT](https://github.com/Lichtso/CCWT)

~~~
IshKebab
You don't need to know the entire signal in advance - you can just window the
wavelets at some reasonable size.

~~~
Lichtso
Yes, but then you are back to time windows, a window function, overlap and
artifacts, which defeats the purpose.

~~~
jimbo1qaz
I'm more familiar with Fourier transforms and have limited experience with
wavelets. But if each wavelet intrinsically falls off at a Gaussian curve,
cutting it off (possibly with a window) at 3-4 sigmas won't change the wavelet
substantially. Maybe for some use cases, the wavelet will be narrower at high
frequencies (short delay), and wider at low frequencies (high delay). I don't
know how you'd perform incremental updates of a plot drawn with non-uniform
delays though...
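
A rough way to check the 3-4 sigma claim, as a sketch rather than anything taken from CCWT or Spectro: for a Gaussian envelope exp(-t^2 / (2 sigma^2)), the energy discarded by cutting at +/- k sigma has the closed form erfc(k). The function names below are made up for illustration.

```python
import math

def gaussian_tail_energy(k: float) -> float:
    """Fraction of the envelope's energy lying outside |t| > k * sigma."""
    # Energy is the integral of the squared envelope, exp(-t^2 / sigma^2),
    # and the fraction of that integral beyond k * sigma is erfc(k).
    return math.erfc(k)

def envelope_at_cut(k: float) -> float:
    """Envelope amplitude, relative to the peak, at the truncation point."""
    return math.exp(-0.5 * k * k)

for k in (3.0, 4.0):
    print(f"cut at {k:.0f} sigma: amplitude {envelope_at_cut(k):.2e}, "
          f"discarded energy {gaussian_tail_energy(k):.2e}")
```

The discarded energy is tiny, but at 3 sigma the envelope still sits at roughly 1% of its peak, so a hard cut leaves a small step at the edges. That is presumably where the optional window mentioned above would help.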

~~~
Lichtso
The signal will continue over the seam between two windows, meaning you will
cut the wave in the signal "in half". Mathematically, waves are always
infinite and to cut them you would actually introduce overtones (higher
frequencies) to model the sharp end / start of the base wave. These then
result in artifacts regardless of what method is used for the transformation
(Fourier or Wavelet).
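
The effect described here can be reproduced with a small experiment. This is only a sketch, using a naive DFT and a hypothetical `leakage` metric: a sine that fits the frame with a whole number of cycles lands in one bin, while one that gets cut mid-cycle spreads energy over many bins.

```python
import cmath
import math

def dft_magnitudes(x):
    """Naive O(n^2) DFT magnitudes for the first n/2 bins; fine for a tiny demo."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
            for k in range(n // 2)]

def leakage(cycles: float, n: int = 64) -> float:
    """Fraction of spectral energy outside the bin(s) nearest the tone."""
    # Rectangular window: the frame simply starts and stops abruptly.
    x = [math.sin(2 * math.pi * cycles * t / n) for t in range(n)]
    energy = [m * m for m in dft_magnitudes(x)]
    near = {math.floor(cycles), math.ceil(cycles)}
    outside = sum(e for k, e in enumerate(energy) if k not in near)
    return outside / sum(energy)

print(f"8 whole cycles in the frame: {leakage(8.0):.2e}")
print(f"8.5 cycles (cut mid-wave):   {leakage(8.5):.2e}")
```

A window function that tapers the frame edges reduces this spreading at the cost of a wider main lobe, which is the trade-off the thread is arguing about.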

~~~
jimbo1qaz
> The signal will continue over the seam between two windows, meaning you will
> cut the wave in the signal "in half".

You can window the wavelet, then slide the finite-duration wavelet by a few
samples at a time, even if the wavelet is hundreds to thousands of samples
long. This is possible in STFT as well (each part of the original signal shows
up in many separate FFTs).

Again, I don't know the implementation details of wavelet transforms. Maybe
I'll look into your repo when I have time. What's your asymptotic and
practical runtime?

~~~
Lichtso
O(n log n) for the x axis (time samples) and O(n) for the y axis
(frequencies).

But you can downsample the signal in the frequency domain, meaning you will pay
mostly for the output resolution.

------
npollock
the demo is fun to try:
[https://calebj0seph.github.io/spectro/](https://calebj0seph.github.io/spectro/)

~~~
skykooler
The "Record from microphone" option doesn't seem to work in Firefox for some
reason; it just spins a loading icon endlessly.

~~~
calebj0seph
Very strange! Did you get the permission prompt from Firefox after it started
spinning? If not, you might have denied microphone access for all sites,
which would explain why the prompt didn't come up.

------
davidy123
Record from mic works for me. Listening to
[https://www.youtube.com/watch?v=FATTzbm78cc](https://www.youtube.com/watch?v=FATTzbm78cc)
in one window with mic recording does the expected thing at the end of the song:
[https://www.magneticmag.com/2012/08/the-aphex-face-visualizi...](https://www.magneticmag.com/2012/08/the-aphex-face-visualizing-the-sound-spectrum/)

~~~
calebj0seph
Cyberdemon from the DOOM soundtrack is another fun track to put through a
spectrogram.

------
xchip
Here is another spectrogram visualizer but with a twist, the frequency bins
are the notes of a piano and hence you can use it to tune instruments or your
voice.
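
For readers wondering how bins can line up with piano keys, here is a minimal sketch. It is not taken from GuitarTuner; it assumes 12-tone equal temperament with A4 = 440 Hz on an 88-key piano, and the function names are hypothetical.

```python
import math

A4_KEY = 49     # A4 is the 49th key on a standard 88-key piano
A4_HZ = 440.0   # assumed concert pitch

def key_frequency(key: int) -> float:
    """Centre frequency in Hz of piano key `key` (1 = A0, 88 = C8)."""
    return A4_HZ * 2.0 ** ((key - A4_KEY) / 12.0)

def nearest_key(freq_hz: float):
    """Return (key number, offset in cents) of the key closest to freq_hz."""
    exact = A4_KEY + 12.0 * math.log2(freq_hz / A4_HZ)
    key = round(exact)
    return key, 100.0 * (exact - key)

# One bin centre per key, spaced a semitone apart.
centres = [key_frequency(k) for k in range(1, 89)]
print(f"lowest bin {centres[0]:.2f} Hz, highest bin {centres[-1]:.2f} Hz")
print(nearest_key(445.0))
```

The cents offset is what a tuner would display: positive means the input is sharp of the nearest key, negative means flat.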

The project:
[https://github.com/aguaviva/GuitarTuner](https://github.com/aguaviva/GuitarTuner)

Online demo:
[https://aguaviva.github.io/GuitarTuner/GuitarTuner.html](https://aguaviva.github.io/GuitarTuner/GuitarTuner.html)

~~~
zeroxfe
Just FYI, that's not a spectrogram, it's a frequency histogram. :-)

------
GistNoesis
A little plug for something similar I developed a year ago, Wisteria:
[https://gistnoesis.github.io/](https://gistnoesis.github.io/). It renders a
real-time spectrogram using tensorflow.js on the GPU. It also runs some
transformer neural networks in real time to transcribe the notes into a
piano roll.

------
emmanueloga_
Looks really cool. Sounds like a similar approach could be used to render
audio waveforms. I wonder why a project like this one [1] decided to use
server side waveform generation instead.

1:
[https://waveform.prototyping.bbc.co.uk/](https://waveform.prototyping.bbc.co.uk/)

~~~
calebj0seph
I had a look into peaks.js - it looks like it supports both server-side and
client-side waveform generation these days. Server-side generation still makes
sense in some cases imo - like if you have a very long audio file such as a
podcast that you don't want users to download the entirety of just to display
a waveform.

------
est31
Awesome, I love spectrograms!

Why did you implement your own FFT instead of using WebAudio?

[http://arc.id.au/Spectrogram.html](http://arc.id.au/Spectrogram.html)

~~~
calebj0seph
Hey, cool demo and article! You clearly have more experience with DSP than me
haha

I was considering using an AnalyserNode since it's implemented natively by the
browser and therefore a lot faster than using an FFT implementation in
JavaScript. My biggest issue with AnalyserNode though is that there's no way
to control the window function or overlap amount between windows. While I'm
sure you could make a decent spectrogram with an AnalyserNode (as you've
done!), I think implementing the FFT yourself lets you do more fine-tuning.
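
To make the two knobs concrete, here is a minimal sketch of the framing step that sits in front of a hand-rolled FFT. It is written in Python for brevity and is not Spectro's actual code; `hann` and `frames` are hypothetical names, and each yielded frame would then be passed to whichever FFT routine is in use.

```python
import math

def hann(n: int):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]

def frames(signal, size: int, overlap: float, window=hann):
    """Yield windowed frames; consecutive frames share `overlap` (0 <= overlap < 1) of their samples."""
    hop = max(1, round(size * (1.0 - overlap)))  # samples to advance per frame
    coeffs = window(size)
    for start in range(0, len(signal) - size + 1, hop):
        yield [s * c for s, c in zip(signal[start:start + size], coeffs)]

signal = [1.0] * 16
print(len(list(frames(signal, size=8, overlap=0.5))))   # hop of 4 samples
print(len(list(frames(signal, size=8, overlap=0.75))))  # hop of 2 samples
```

Swapping `window` for another function and changing `overlap` are exactly the adjustments that a fixed native analyser does not expose.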

When I get some time I might make Spectro use a Wasm FFT implementation like
PulseFFT ([https://github.com/AWSM-WASM/PulseFFT](https://github.com/AWSM-WASM/PulseFFT))
for better performance. At the moment I'm using jsfft
([https://github.com/dntj/jsfft](https://github.com/dntj/jsfft)) inside a web
worker, which definitely isn't as efficient as a native implementation.

~~~
est31
It's not my code btw, only found the post on the internet. Your points about
the restrictions of AnalyserNode make sense. A wasm solution is indeed the
ideal way to solve it if you want full flexibility.

------
eg312
Very cool! Are you the author?

~~~
shakes
I am not. I just found it and thought it was super interesting.

Caleb Joseph is the original author:
[https://github.com/calebj0seph](https://github.com/calebj0seph)

~~~
calebj0seph
Thanks for posting to HN, blown away by how much interest there's been!

