
Four common mistakes in audio development (2016) - PascLeRasc
http://atastypixel.com/blog/four-common-mistakes-in-audio-development/
======
jcelerier
For less Apple-centric guidelines: [http://www.rossbencina.com/code/real-
time-audio-programming-...](http://www.rossbencina.com/code/real-time-audio-
programming-101-time-waits-for-nothing)

~~~
bitwize
Name me a working professional musician who doesn't use Apple gear (assuming
they use computers at all to make music).

~~~
atoav
I worked in a studio (with 8 workspaces and a Dante network) and there was
not a single Apple machine. Of all the musicians I know, maybe two use Apple;
most use Windows.

The whole “musicians use Apple” thing is really not true, at least in my
subjective sample.

~~~
Jerry2
I used to work for a well-known audio plugins company (if you're a musician,
you've heard of this company) and 70% of our customers were Mac users (that
was back in 2016).

------
cyberferret
As an Audiobus user since pretty much day 1 (and also a Loopy app user), I'd
just like to shout out to the author of this article with a "Thank You" for
revolutionising music creation on Apple's small devices.

I still remember the first time I managed to combine some of my favourite
sound generator & synth apps and create a mix all on my iPad without having to
export things to GarageBand/Logic [0], and it was a real game changer on that
platform.

If anyone knows about audio development on iOS devices, this author is the 'go
to' guy!

[0] - [https://soundcloud.com/cyberferret/industrial-ants-
wav](https://soundcloud.com/cyberferret/industrial-ants-wav)

~~~
jacobolus
> _favourite sound generator & synth apps_

Any you’d specifically recommend?

~~~
PascLeRasc
AudioKit Synth One is fantastic, and it's all open-source too.

~~~
djmips
Thanks for the link. To close the loop, the audio kernel documentation in
AudioKit Synth One references the OP.

[https://github.com/AudioKit/AudioKitSynthOne/blob/master/Aud...](https://github.com/AudioKit/AudioKitSynthOne/blob/master/AudioKitSynthOne/DSP/README.md)

It's interesting in that it's written in Objective-C++, which in Xcode is
indicated by the .mm suffix. This is for performance reasons, I assume.

I didn't know about Objective-C++, so here is a link about it. Has it been
pushed into a dusty closet? I could only find an archive.org link.

[https://web.archive.org/web/20101203170217/http://developer....](https://web.archive.org/web/20101203170217/http://developer.apple.com/library/mac/#/web/20101204020949/http://developer.apple.com/library/mac/documentation/Cocoa/Conceptual/ObjectiveC/Articles/ocCPlusPlus.html)

------
CamperBob2
_Let’s say your GUI thread is holding a shared lock when the audio callback
runs. In order for your audio callback to return the buffer on time it first
needs to wait for your GUI thread to release the lock. Your GUI thread will be
running with a much lower priority than the audio thread, so it could be
interrupted by pretty much any other process on the system, and the callback
will have to first wait for this other process, and then the GUI thread to
finish and release the lock before the audio callback can finish computing the
buffer — even though the audio thread may have the highest priority on the
system. This is called priority inversion._

I guess I fundamentally don't understand priority inversion. I've done a lot
of realtime programming but have never had to confront that particular
buzzword.

If my realtime-priority I/O completion routine tries to take a lock that might
be held by an idle-priority GUI thread, why is that a problem? If I'm waiting
on a lock, then _by definition_ I'm not running at realtime priority while I'm
doing so. Thread priorities should apply only to threads that are actually
doing stuff. It would be crazy for the kernel to fail to schedule any other
threads while I'm waiting for one of them to release the lock I need. My
realtime thread is effectively suspended during that wait, so it can't keep
any others from running.

Yes, I might have to wait for other threads and even other processes to be
scheduled. Too bad. If I didn't want that to happen, I shouldn't have tried to
take a lock in a realtime thread. What exactly should I have expected to
happen?

~~~
saagarjha
Something else might start running instead of your realtime-priority thread or
GUI thread.

~~~
CamperBob2
Yes, that's an occupational hazard when you're waiting on a lock. Someone else
might do something. In fact, someone else had _better_ do something, or you
won't get your lock.

IMHO, if your system is designed such that a GUI or other low-priority thread
is contending for the same resources as a realtime I/O thread, in a situation
where the lock might be held for so long that the I/O thread starves, then you
have made a really elementary mistake... one so fundamental that it doesn't
deserve an abstract-sounding term like "priority inversion." I still feel like
there's something I'm missing here.

I mean, I understand that the GUI thread in question might be heavily pre-
empted by medium-priority threads, but eventually the GUI thread _is_ going to
be given enough time to finish whatever it was doing with my lock, and life
will go on. Is this behavior somehow unexpected or counterintuitive?

If kernels are designed such that lower-priority threads _never_ get a time
slice until higher-priority threads block on something or explicitly sleep or
yield, well, that seems like a really foolish call on the part of the kernel
designer. You literally might as well have a cooperative-multitasking OS at
that point.

~~~
Khoth
> I mean, I understand that the GUI thread in question might be heavily pre-
> empted by medium-priority threads, but eventually the GUI thread is going to
> be given enough time to finish whatever it was doing with my lock, and life
> will go on. Is this behavior somehow unexpected or counterintuitive?

The mistake people make is, they think "my high-priority thread has to be
active every 10ms, and the low-priority thread doesn't do much work with the
lock held, 1ms max, so I'm totally fine, there's 9ms of safety factor there".
Priority inversion is why that logic is wrong.

------
pier25
Previous discussion:
[https://news.ycombinator.com/item?id=11907887](https://news.ycombinator.com/item?id=11907887)

------
saagarjha
> Behind Objective-C’s message sending system (i.e. calling Obj-C methods) is
> some code that does a whole lot of stuff — including holding locks.

Only if your method is not found in the IMP cache. Trying to acquire a lock
for every objc_msgSend would be quite slow.

> I decided not to mount an expedition into the source code because I didn’t
> know what to look for, or where to look for it, and I wasn’t even sure I’d
> even find it given that iOS and OS X are both pretty closed-off systems. So,
> we’ll have to take the word of those more knowledgeable than us, at least
> for now.

macOS's libmalloc is open source (if not updated very often):
[https://opensource.apple.com/source/libmalloc/libmalloc-166....](https://opensource.apple.com/source/libmalloc/libmalloc-166.251.2/)

~~~
jjoonathan
The post is about chasing the long tail of latency risk. A high -- even very
high -- probability of not blocking on a lock is not good enough. Investigating
libmalloc and discovering that somehow it had strictly bounded, suitable
execution time wouldn't be good enough either, because it could plausibly
change without notice.

~~~
vnorilo
To get an idea of the length of the tail:

Suppose a deadline of 6ms to produce an audio buffer. If 0.01% of the
callbacks choke on a lock, we get a dropout once a minute, which is quite a
lot. This means we're interested in the 99.99th percentile.

------
NieDzejkob
> It can be helpful to know that, on all modern processors, you can safety
> assign a value to a int, double, float, bool, BOOL or pointer variable on
> one thread, and read it on a different thread without worrying about
> tearing, where only a portion of the value has been assigned by the time you
> read it.

What? NO! This is only relevant when you're writing assembly. In C, you _have
to_ use atomics (or possibly volatile, I'm not sure), or this is just
undefined behavior.

~~~
coryrc
If you are assigning something that fits in a single register, how would you
get tearing? (Probably shouldn't have double in that list, in case it's in the
special FPU registers)

~~~
cesarb
The compiler is smarter than you think. Suppose one thread alternates between
writing 32 and 33 to a 4-byte variable, while another thread alternates
between writing 1024 and 1025 to the same 4-byte variable. You might think
that the only possible values are 32, 33, 1024, 1025; but then you read the
variable and find a 1056 or 1057 there. Why? Because the compiler noticed that
the first thread, after its initial write of 32 or 33, only modifies the least
significant byte of the value; so it generated a single-byte write to
overwrite only the modified part of the value.

~~~
reitzensteinm
There's zero benefit to just writing one byte. I'm having a hard time
believing that a compiler capable of such analysis would use it for this
purpose.

~~~
nitrogen
I think the point is that unless the language guarantees atomicity (and the
compiler implements the guarantee correctly), counterintuitive behavior is
permissible and somewhat common, and compiler behavior varies across and
within CPU architectures.

~~~
reitzensteinm
Actually, I agree with the general point: OP is right that you should use
volatile. I was just a bit taken aback by the specific example above.

However, as PeCaN has pointed out, I'm probably suffering from tunnel vision
from x86 monoculture.

~~~
vnorilo
Volatile is, well, volatile. You can basically only rely on the fact that the
compiler respects source order and keeps all reads and writes (instead of
hoisting and reordering).

I believe MSVC used to put in memory barriers; not sure if it still does.
These days you are better off ignoring the entire keyword and using the
properly specified atomic operations.

~~~
jcelerier
> I believe msvc used to put in memory barriers. Not sure if it still does.

It does, but now there's a flag to disable them: /volatile:iso
([https://docs.microsoft.com/en-
us/cpp/build/reference/volatil...](https://docs.microsoft.com/en-
us/cpp/build/reference/volatile-volatile-keyword-
interpretation?view=vs-2019)).

------
rstuart4133
Wow, is all this really necessary now? Decades ago I recall programmers being
immensely proud of their Z80 queue functions that didn't disable interrupts.
They were needed for the same reasons the article spells out: dealing with
hard real-time events. But all it took to break them was another programmer
leaving interrupts off for a wee bit too long in another piece of code, so we
replaced these things with hardware queues that grew bigger and bigger over
time - even UARTs had them.

I am sort of amazed that audio recording hardware doesn't provide at least 100
milliseconds worth of buffering. It's all of 600 bytes of memory on a 48k
bit/sec stream, so it would cost them nothing.

~~~
Doxin
The main problem with audio is that in many cases 100ms of latency is
unacceptable. Having a large queue is a fine solution when latency isn't a big
problem, but in a lot of cases for audio you want to get as close to no
latency as possible.

~~~
rstuart4133
That's true in things like phone calls, but I did read the article looking for
cases like that and didn't see any. It is all about post processing, where
latency isn't an issue. All you need for that case is an accurate clock
embedded in the stream.

It really isn't rocket science, whereas here we have an entire article on
keeping latency down in multithreaded code, which should tell you it's both
time-consuming and requires a pretty skillful programmer. The rocket science
bit starts after it's been written, and someone who doesn't have the big
picture on where all the latencies are starts touching the code. The program
doesn't break in an obvious way, or even deterministically. Instead, the
first symptom you are likely to notice is heisenbugs occurring in production,
with the frequency gradually increasing as the code ages. Tracking that down
and fixing it _is_ rocket science.

Compared to that the price of adding a buffer plus a clock in the next re-
design is less than peanuts. We've been doing this stuff for decades now and
gone through plenty of re-designs, which is why I'm amazed it hasn't happened.

~~~
Doxin
But what makes you think sound cards _don't_ have buffers? They do, plenty of
them. You can easily cause any modern OS to have multiple seconds of audio
lag by cranking the buffer size. People just generally don't want that from
their system, which is what makes all this somewhat tricky.

Of course if you're doing batch processing you hardly care about latency and
the problem is trivial except for the fact that if you hit the play button in
your post processing software you _still_ expect instant audio output,
especially for deciding cut points and whatnot.

------
vemv
> Don’t use Objective-C/Swift on the audio thread.

His argument is that Objective-C uses locks internally.

But isn't it possible that those locks (as opposed to _application-level_
locks) cause just negligible pauses, because there's no contention on them?

I guess those internal locks exist to make ObjC's runtime dynamism feasible.
One would expect the typical application not to leverage that dynamism (how
many apps continuously redefine methods under normal operation?).

~~~
Gibbon1
The problem with locks on a non-real-time OS is that the OS may decide to
interrupt your thread while it's holding the lock, no matter how briefly it's
held. I do embedded work, and I've had similar things happen when the 'lock'
is held for just a few instructions.

Tip: If you suspect crap like this, adding spin delays will make it happen
often enough to debug.

~~~
proverbialbunny
By any chance do you have any references or information on how to do this
(adding a spin delay)? It sounds interesting. -thanks :)

~~~
Gibbon1
In C it's very simple:

    volatile int cnt = 0x1000;
    while (cnt--)
        ;

~~~
hoseja
I've grown so paranoid of optimizing compilers this wouldn't even cross my
mind. But it is correct, right?

~~~
AlbertoGP
Not OP, but the “volatile” qualifier should prevent the compiler from
optimizing it out.

~~~
Gibbon1
Yeah should.

Multicore superscalar processors make me nervous about my assumptions. I
wouldn't put it past one of them to realize that the result is unused and
no-op it.

It's always good to check.

------
Nimitz14
Let's say you are recording and processing audio in realtime. You have one
function available which runs on a separate thread and provides you with the
samples (fixed number every call). As I understand this article (and similar
ones), you are not supposed to malloc on that thread (as that thread hanging
could result in dropped samples). But if you want to save the audio, how else
are you going to save it without appending to some array (for which you need
to malloc)?

~~~
bsder
> But if you want to save the audio, how else are you going to save it without
> appending to some array (for which you need to malloc)?

I don't know why everybody is making this so hard: You use a lock-free
statically allocated circular buffer for all communication to or from a real-
time audio thread. Period. Full stop. The audio thread wins under all
contention scenarios. The audio thread _ONLY_ ships data from the circular
buffer to the hardware system or vice versa-- _IT DOES NO OTHER TASK_.
Everything else talks to the circular buffers.

Nothing else works.

The big questions, even if you do that, are: 1) how do you keep the circular
buffer from starving, even with your other threads working to fill it? and 2)
if the circular buffer starves, what do you do?

~~~
caf
_The audio thread ONLY ships data from the circular buffer to the hardware
system or vice versa--IT DOES NO OTHER TASK._

All you have done in this case is push the problem up one level. The interface
with the hardware already works this way.

The interesting problem is actually doing something more substantial - eg.
deciding which audio to play, or processing it in some way - in a way that
lets you keep feeding that circular buffer on time, every time.

~~~
bsder
> All you have done in this case is push the problem up one level.

Actually, using a circular buffer has transformed a "hard real-time" problem
into a "soft real-time" problem. As long as what you want to do has an average
time shorter than the time between audio packets and you have enough audio
packets buffered, you can ride across the occasional schedule miss because the
system was off doing something else (like GC).

For example, VoIP quite often uses a "jitter buffer" for exactly this task.

Now, you still need to keep the circular buffer fed and that may not be easy.
However, it's a _lot_ easier than being 1:1 with a hard audio thread.

> The interface with the hardware already works this way.

Most of the time when I'm interacting with low-latency audio threads, they
generally don't allow me to specify the buffer semantics with very much
flexibility.

------
skybrian
I'm wondering if anyone here does embedded audio synthesis. Is a Teensy the
way to go?

~~~
moron4hire
It's a little slow. I've used a Mega, but all around prefer the Adafruit
Feather line.

~~~
skybrian
That's what I bought the first time, but unfortunately there don't seem to be
good audio libraries for the Feather line, and the Teensy Audio library looks
quite sophisticated. Or did I miss something?

~~~
moron4hire
They should be compatible. It's all Wiring at the end of the day. But IDK, I
just write my own.

------
adamnemecek
Current OSes are fundamentally unsuitable for audio work: macOS < Windows <
Linux < other OSes. None of them really do real-time processing.

It's really bad, I want a new OS where I can actually tell the scheduler what
to do.

Also, check out this music composition app I'm launching soon
[http://ngrid.io](http://ngrid.io).

~~~
tfolbrecht
Why is the realtime Linux patchset + JACK audio server insufficient for what
you do?

~~~
adamnemecek
A patchset is not an OS. Like I can make it work for myself but good luck
making it work for users.

~~~
iainmerrick
I thought you wanted it for yourself, though?

If you wanted something that lots of other people could use, in principle you
could make your own Linux distribution with a particular set of patches.

~~~
adamnemecek
You can't expect people to use your OS.

------
amos19870630
Excellent article. Great explanation. Thanks Michael!

------
amelius
> Rendering live audio is quite demanding: the system has to deliver n seconds
> of audio data every n seconds to the audio hardware. If it doesn’t, the
> buffer runs dry, and the user hears a nasty glitch or crackle: that’s the
> hard transition from audio, to silence.

I think you can do better when a buffer runs dry. Instead of outputting
silence, you could output the same frequency spectrum you were producing
right before the event. That way you will not hear any cracks or pops.

And of course you can fade out the effect when the buffer stays dry for more
than a few seconds.

Obviously, you'd have to do some filtering when the audio continues because
that too can introduce cracks.
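A simpler trick in the same spirit, and a common one, is to ramp the last rendered buffer down to zero instead of cutting straight to silence, which removes the hard discontinuity that produces the click. A sketch (the function is mine, not from the article):

```c
#include <stddef.h>

// On underrun, apply a linear fade-out to the tail of the last
// rendered buffer so the signal reaches zero smoothly instead of
// being truncated mid-waveform.
void fade_out(float *buf, size_t n) {
    for (size_t i = 0; i < n; i++) {
        // gain ramps from just under 1.0 down to exactly 0.0
        float gain = 1.0f - (float)(i + 1) / (float)n;
        buf[i] *= gain;
    }
}
```

A matching fade-in when samples resume handles the restart crack mentioned above.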

~~~
mbell
There is really no such thing as 'instantaneous frequency response'. For any
frequency to meaningfully exist, you need data for the corresponding period.
e.g. if the audio contains content down to 20 Hz, you need at least 1/40th to
1/20th of a second of data for that to materialize.

Put another way: what you are proposing is looping the buffer, which is what
some devices do. Portable CD players were kinda notorious for it, and it
doesn't sound much better than cracks or pops. Computers also have a tendency
to fall into buffer looping when the system hangs (which is likely the
failure mode of Realtek codecs).

~~~
amelius
> There is really no such thing as 'instantaneous frequency response'

Yes that's true, I'm proposing something that uses an approximation of it.

Consider it from a different angle: the inner ear essentially performs a
Fourier transform. At every moment the "instantaneous" spectrum determines
which hair cells are triggered. Now what I propose is to keep triggering those
same hair cells (and not any others) when the buffer runs dry. The exact way
of accomplishing this is left as an exercise (though using short windows where
you take an FFT could be a good approximation).

~~~
human20190310
> The exact way of accomplishing this is left as an exercise

Perhaps you should undertake this exercise and let us know how it sounds :)

EDIT: In my experience with audio, when I have a bug that introduces even the
slightest discontinuity (or even just a cusp) in the audio, well short of a
pop to silence, I can still hear a "weirdness". Ears are pretty attuned to
things that sound unnatural. I'm not confident that essentially "forging" the
audio is going to sound natural.

~~~
StavrosK
What if you train a deep neural network on the song so far, so it can generate
plausible-sounding music whenever the buffer drops?

You can even hang intentionally to generate original music!

(/s, please don't)

