Speaking as an audio developer: building a metronome on timers is the wrong approach. Still, many metronome apps with this design flaw exist, and they perform badly.
(That's why prescheduled audio events work best...)
Timers in general will jitter.
Audio processing usually involves a callback to process an audio block (array of audio samples... floats).
To simplify things, this callback runs on a realtime thread and copies the block to the desired output. You're guaranteed the output will be continuous between calls.
(You can cause an underrun/stutter if you hold the callback for too long.)
With timer-based approaches (`setTimeout`/`setInterval`) each callback WILL fire with some offset.
But if you need to synchronize things or your metronome needs to run with other audio, it WILL be jerky.
That's why a proper metronome would also be 'sample accurate'.
WebAudio / AudioContext was made exactly to solve that!
Similar to common audio processing APIs, you end up getting a callback, and you usually need WebAssembly/C++ under the hood to keep the golden DSP rule: never allocate in the block-processing callback.
btw, on Firefox 69 (macOS) only the pre-scheduled audio made a sound :)
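For anyone wondering what "pre-scheduled" looks like in practice, here is a minimal sketch, not the article's code. The names `beatTimes` and `scheduleClicks`, the 1 kHz click, and the 0.1 s start delay are my own assumptions; the only Web Audio calls used are `createOscillator()`, `connect()`, `start(when)`, `stop(when)` and `currentTime`.

```javascript
// Pure helper: start times (seconds, on the audio clock) for `count` beats.
function beatTimes(startTime, bpm, count) {
  const secondsPerBeat = 60 / bpm;
  const times = [];
  for (let i = 0; i < count; i++) {
    // Multiply rather than accumulate, so rounding error does not build up.
    times.push(startTime + i * secondsPerBeat);
  }
  return times;
}

// Pre-schedule short clicks on an AudioContext (or anything shaped like one).
// clickSeconds and the 1000 Hz pitch are arbitrary illustration values.
function scheduleClicks(ctx, bpm, count, clickSeconds = 0.03) {
  // Start slightly in the future so the first click is not already late.
  const times = beatTimes(ctx.currentTime + 0.1, bpm, count);
  for (const t of times) {
    const osc = ctx.createOscillator();
    osc.frequency.value = 1000;
    osc.connect(ctx.destination);
    osc.start(t);                // the audio engine honours this time, not a JS timer
    osc.stop(t + clickSeconds);
  }
  return times;
}

// In a browser (after a user gesture): scheduleClicks(new AudioContext(), 120, 16);
```

The point is that no JS timer is involved in when a click sounds; the times are handed to the audio engine up front.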
This is completely true. I just want to add that the problem with currently standard WebAudio is that the audio callback still goes into the main js context/thread, so any custom callbacks via ScriptProcessorNode can be pre-empted by pretty much anything. That means that the target latency is the upper time bound for any js callback or work unit, and it gets really iffy when you approach real-ish time (10ms or so).
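To put rough numbers on that time bound: the budget for one callback is just the block length divided by the sample rate. A tiny sketch (the function name `blockDurationMs` is made up for illustration):

```javascript
// Time budget (ms) for one audio callback: the next block has to be ready
// before the current one finishes playing.
function blockDurationMs(bufferSize, sampleRate) {
  return (bufferSize / sampleRate) * 1000;
}

// A few power-of-two block sizes at 44.1 kHz.
for (const size of [256, 512, 1024, 4096]) {
  console.log(size, blockDurationMs(size, 44100).toFixed(2) + " ms");
}
```

At 44.1 kHz a 512-sample block gives roughly 11.6 ms, so anything on the main thread that runs longer than that (layout, GC, another script) can cause a dropout.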
The next attempt is AudioWorklet, which runs in a separate js context and on the audio thread. Sadly still exclusively in Chrome, so I can't be bothered to invest heavily in it.
Very true. Another upside of deriving all timing from actual outputted samples: If your application has a sequencer which does "bouncing" (i.e. generating a recorded file, or similar, from the project) you can just run the sequencer as fast as possible in a loop which just writes audio to a file without any concerns about real time, etc.
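A minimal sketch of that idea, assuming a click-only "sequencer": every beat position is computed in samples, so the loop can run as fast as the CPU allows and the result is still exactly in time. `renderClickTrack` and its defaults (1 kHz burst, 441-sample click) are illustrative assumptions, and writing the buffer out as a WAV file is left out.

```javascript
// Render a click track into a Float32Array with no real-time clock involved.
function renderClickTrack(bpm, beats, sampleRate = 44100, clickSamples = 441) {
  const samplesPerBeat = (60 / bpm) * sampleRate;
  const out = new Float32Array(Math.ceil(samplesPerBeat * beats));
  for (let beat = 0; beat < beats; beat++) {
    // Beat position in samples: this is what "sample accurate" means.
    const start = Math.round(beat * samplesPerBeat);
    for (let i = 0; i < clickSamples && start + i < out.length; i++) {
      const env = 1 - i / clickSamples; // linear fade-out
      out[start + i] = env * Math.sin((2 * Math.PI * 1000 * i) / sampleRate);
    }
  }
  return out;
}

const track = renderClickTrack(120, 4); // 4 beats at 120 bpm = 2 seconds of audio
```

The same beat-to-sample arithmetic can drive a real-time callback; the only difference is who asks for the next block.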
Not directly on topic, but I'm very interested in getting started in audio development. Any good suggestions for resources for someone with a strong math and programming background?
I’d start with JUCE (https://www.juce.com) just because it’s easy to get started. It’s similar to Qt but oriented towards audio and can produce VST3/AU or audio apps pretty much out of the box.
Once you get the hang of it it’s all about DSP. FFT, filter design etc.
Also, most actual DSP code is C or C++, depending on deployment needs.
Another approach is to try Faust or PureData. Look those up and see what looks like a good starting point for you.
This is a really great writeup -- it's concise and to the point, the demos are good.
It's not just audio; timing in general is kind of terrible in Javascript. Spectre/Meltdown made it worse, because now some browsers introduce arbitrary jitter for security purposes. The advice I've always seen is to try and avoid ever doing your own timing in Javascript -- use `requestAnimationFrame` instead of `setTimeout`, use web audio scheduling instead of `setTimeout`. Basically, avoid `setTimeout` for most high-precision things if possible.
I don't know what the status is with shared array buffers and WASM. At one point I was very excited about the potential of WASM to get around some of the problems above, but last I checked, this was being partially held back by the same security concerns.
The desktop has spoiled everybody. Everybody wants web apps to act like and have the control of desktop apps. So far, it's not possible without top IT rocket scientists, and the next browser release will probably break the rocket anyhow.
I don't think it was concise. It could be shortened to "timers in js are extremely imprecise and not useful for real-time audio. use the audio API, which has real-time scheduling for audio." and then maybe go deeper and show examples of that.
I guess my next question then is... what are you doing in javascript that you need this level of precision? If you need this level of precision, there are so many other things that are going to mess with you, other than timing problems.
I'm experiencing a lot of jitter on your page using Chrome 76 on Android.
- At 60bpm it's clearly noticeable.
- At 120bpm every 12-15th tick is noticeably delayed.
- At 300bpm it's playing two ticks at very close to the same time about every 5th tick.
your music apps are great. wondering if you'd be interested in putting them in front of hundreds of thousands of musicians from around the world via the online leader in guitar instruction (TrueFire, my company) - let me know if you want to learn more - zach at truefire dot com - great work!
I want to weigh in about this whole perceivable jitter thing.
I think it’s important when making music related programming decisions to recognize there’s a whole area of perception in between actually conscious “hey that metronome is off!” and actually being imperceptible. In that area the feel and impact of music can be altered while no one can pinpoint why.
For safety’s sake I think sub-millisecond timing in controllers and things like metronomes needs to be the standard.
The ideal should be audio rate accuracy, when it can be reasonably achieved.
It’s really bad to fall in the trap of thinking that just because no one can point out a problem, it isn’t having an effect. Especially with audio where people have trouble explaining what they are hearing.
Keep in mind that the testing methodology may have already accounted for your concerns. I would guesstimate that mechanical vinyl and tape players have rate variations on the order of milliseconds.
Good point, but jitter happens within a phrase; smoothly speeding something up and down is entirely different from discontinuous, random timing variation.
I am not sure exactly what the tests and methodologies are that different people refer to, but I do know that when I have brought up this concern to instrument and software developers who are operating at the cutting edge of this stuff and really should know the answer, they never bother to debate it until you get to the difference between sub-millisecond and audio-rate.
It is axiomatic that anything that has less resolution than audio rate can be perceived, under the correct circumstances.
For example, if you had two metronomes which each played the same wide-frequency burst and had independent jitters on their start time, the combined sound would likely shift in timbre due to the phase relationship of the summed waves.
If you had those two outputs going to a stereo output, one on the left channel, and one on the right, the resulting effect should be that the "click" will randomly pan around the soundfield in the listener's perception.
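The timbre shift in the summed (mono) case can be made concrete with an idealized model: two identical clicks x(t), one delayed by τ relative to the other. This ignores everything else in the signal chain, so treat it as a sketch of the effect rather than a measurement.

```latex
y(t) = x(t) + x(t - \tau)
\quad\Longrightarrow\quad
|H(f)| = \left| 1 + e^{-j 2\pi f \tau} \right| = 2\,\left| \cos(\pi f \tau) \right|

\text{nulls at } f_k = \frac{2k + 1}{2\tau}, \qquad k = 0, 1, 2, \dots
```

For τ = 1 ms the first null sits at 500 Hz; for τ = 0.1 ms it sits at 5 kHz. With independent jitter, τ is different on every click, so the notches land somewhere else each time, which is why the click would seem to change colour from tick to tick.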
So, I guess it also depends on not only your use case but how much of a hassle it is to get the right resolution. I would be really sad if I subtly messed up some musician's sense of time for years in the future because they were practicing diligently with some jittery metronome I made.
> In that area the feel and impact of music can be altered while no one can pinpoint why.
This has happened to me on many occasions. Having a part that needs to be in-the-pocket accidentally shifted (due to MIDI latency on weird gear or bad sample editing) just throws everything off in a way that's really hard to even identify.
The age-old solution to this problem continues to work just fine - have jitter-tolerant low frequency scheduler callbacks that schedule events using an accurate clock (sample time) a short way (100ms or so) into the future. That way you get both accurate audio and timely response. Been doing this kind of thing for 20+ years now .. web or noweb.
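A rough sketch of that pattern, not taken from any particular library: the factory name `createLookaheadScheduler`, its option names, and the 50 ms start delay are all hypothetical. The clock and the "play a click at time t" action are passed in, so the same logic works with `AudioContext.currentTime` or any other sample-derived clock.

```javascript
// Lookahead scheduler: a sloppy timer only wakes us up; the accurate clock
// decides exactly when each click plays.
function createLookaheadScheduler({ getAudioTime, scheduleClick, bpm, lookahead = 0.1, startDelay = 0.05 }) {
  const secondsPerBeat = 60 / bpm;
  let firstBeatTime = null;
  let beatIndex = 0;
  return function tick() {
    const now = getAudioTime();
    if (firstBeatTime === null) firstBeatTime = now + startDelay;
    // Schedule every beat that falls inside the lookahead window.
    let next = firstBeatTime + beatIndex * secondsPerBeat;
    while (next < now + lookahead) {
      scheduleClick(next);
      beatIndex++;
      next = firstBeatTime + beatIndex * secondsPerBeat;
    }
  };
}

// Browser usage (assumes an existing AudioContext `ctx` and a playClickAt(t) helper):
// const tick = createLookaheadScheduler({
//   getAudioTime: () => ctx.currentTime, scheduleClick: playClickAt, bpm: 120 });
// setInterval(tick, 25); // wake-ups may jitter; the scheduled times do not
```

The timer interval needs to be comfortably shorter than the lookahead. If a wake-up arrives later than the lookahead window, beats end up scheduled in the past, and a longer lookahead trades that risk against slower response to tempo changes.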
I use an encapsulated "steller" library to do this with JS on the web. Steller can sync graphics with audio too.
Really wish the joint "get time stamp" function that gives current DOMHiresTimestamp and the synchronized AudioContext currentTime were available on all browsers.
Nicely done. I wonder: what is the use case that prompted this investigation? 40-50 milliseconds is correct in terms of latency that musicians notice. If you're aiming at any realtime musical collaboration over the internet, the Big Problem is network latency. Unpredictable and just too long. And, if you're not working over the internet, why use a browser?
Edit: https://hello-magenta.glitch.me seems to be the use case. Cool and inevitable that this would be worked-on, but after spending years with both algorithmically-supported composition and all-human composition, I'm skeptical. There is a special sauce that machines will never grok. Or, I'm wrong.
I wrote a bad metronome using setInterval at some point (http://server.saagarjha.com/metronome) and had somewhat different issues: if you moved the page to the background, my browser would throttle it and coalesce the events so it wouldn’t actually “tick” at the right rate anymore. And for whatever reason, in WebKit setting 60 bpm just doesn’t make any noise at all…
Hard to imagine an environment with worse input response and timing behavior than web browsers. Buffering audio output would be the easiest way to compensate for callback jitter, but shortening the delay between user input and a note playing seems unresolvable.
Cool article, but there's one additional experiment worth trying: use recursive `requestAnimationFrame()` calls. On most browsers this will give you a relatively steady 60fps/17ms callback rate.
This is intended for game loops but it is a nice timer that targets iterations per second (FPS) using requestAnimationFrame. This is my preferred solution when I cannot just read the system clock on a recursive setTimeout.
And to think that 20 years ago people were laughing at MS Windows and considering it a completely non-viable platform for pro audio because of its 10ms-accurate timer...
This is similar to a problem I ran into when building Bezie. I needed to keep track of the current tempo to send out MIDI clock events. I ended up using a web worker since `setInterval` was unreliable in backgrounded tabs. Here's the worker: https://github.com/jperler/bezie/blob/master/app/workers/tic...
This brings back so many vivid memories of me trying to write a hybrid mobile app a few years ago which had a metronome feature. I wrestled with a pure Javascript metronome clicker for so long, and eventually built that component using WebAudio which was much more accurate and reliable.
I did start writing a blog post about it but never finished it. This article is a much better and more in depth analysis of the problem anyhow.
Don't know about JS, but speaking of metronomes in general, for Android there's a few of them on F-Droid, and last time I checked, the only one that worked consistently is the one that's integrated in an old app Practice Hub. I don't get it, my phone has a clock that seems to work ok, how hard can it be to click in time?
I enjoyed reading this. Playing with a problem is such a great way to understand it. The charts are hard to read though - I wish they showed something like absolute error, or at least omitted the initial zero point so the y-axis could be centered around the target.