
I don’t know who the Web Audio API is designed for - gulbanana
http://blog.mecheye.net/2017/09/i-dont-know-who-the-web-audio-api-is-designed-for/
======
symstym
I've spent quite a lot of time working with the Web Audio API, and I strongly
agree with the author.

I got pretty deep into building a modular synthesis environment using it
([https://github.com/rsimmons/plinth](https://github.com/rsimmons/plinth))
before deciding that working within the constraints of the built-in nodes was
ultimately futile.

Even building a well-behaved envelope generator (e.g. that handles
retriggering correctly) is extremely tricky with what the API provides. How
could such a basic use case have been overlooked? I made a library
([https://github.com/rsimmons/fastidious-envelope-
generator](https://github.com/rsimmons/fastidious-envelope-generator)) to
solve that problem, but it's silly to have to work around the API for basic
use cases.
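For the curious, the heart of the retriggering problem is that a new attack must start from wherever the envelope currently is, which means cancelling pending automation and re-anchoring the curve first. A minimal sketch of that pattern (the wrapper function is mine; the method names are the real AudioParam ones, and the param is duck-typed so it can be exercised against a mock):

```javascript
// The tricky part of retriggering: you must cancel pending automation and
// re-anchor the curve at the param's current value, or the attack ramp jumps.
// `param` is duck-typed (an AudioParam in the browser, a mock in tests);
// the method names are the real AudioParam API, the wrapper is mine.
function retriggerAttack(param, now, peak, attackTime) {
  param.cancelScheduledValues(now);
  param.setValueAtTime(param.value, now); // anchor so the ramp starts here
  param.linearRampToValueAtTime(peak, now + attackTime);
}
```

Even this sketch glosses over real-world wrinkles (e.g. `value` reporting the last computed value, or the newer `cancelAndHoldAtTime`), which is the parent's point: a basic envelope should not require this much care.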

Ultimately we have to hold out for the AudioWorklet API (which itself seems
potentially over-complicated) to finally get the ability to do "raw" output.

~~~
bsder
The API is frustrating because it is meant to hide the fact that Android audio
sucks giant hairy donkey balls.

If you give Web developers access to raw samples, they are going to expect it
to work. When it doesn't on Chrome on Android, lots of people are going to
start complaining and filing bugs.

So, instead of fixing the audio path, they decided to bury its crappiness
under a "higher-level" API which has fuzzier latency and can be built with
hacks in the audio driver stacks themselves.

~~~
code_duck
Android audio is truly terrible for instrument apps. I don't understand how it
suffices for things like games. I also don't understand why people even bother
to make things like pianos and drum set… The latency is so extreme and
inconsistent that even on recent phones they are useless. In contrast, iOS has
had excellently playable instruments at least as far back as the iPod Touch 4.

~~~
dbrgn
Here's an interesting video on this topic back from 2013:
[https://youtube.com/watch?v=d3kfEeMZ65c](https://youtube.com/watch?v=d3kfEeMZ65c)

------
greghendershott
My first lesson in this was the Roland MPU-401 MIDI interface. It had a "smart
mode" which accepted timestamped buffers. It was great... if you wanted a
sequencer with exactly the features it supported, like say only 8 tracks. It
was well-intentioned, because PCs of that era were slow.

The MPU-401 also had a "dumb" a.k.a. "UART" mode. You had to do everything
yourself... and therefore could do anything. It turned out that early PCs were
fast enough -- especially because you could install raw interrupt service
routines and DOS didn't get in the way. :)

As a sequencer/DAW creator, you really want the system to give you raw
hardware buffers and zero latency -- or as close to that as it can -- and let
you build what you need on top.

If a system is far from that, it's understandable and well-meaning to try to
compensate with some pre-baked engine/framework. It might even meet some
folks' needs. But....

~~~
subwayclub
IIRC games made pretty good use of MPU-401 intelligent mode to drive the MT-32
module. The first really elaborate game scoring work on the IBM platform came
through the MT-32 (Sierra picked it up and everyone else followed - it was a
good target for composers, but in practice most people heard the music on
Adlib/SB), so I would consider it successful in that niche.

And on that note, what I _think_ Web Audio tried to be was a drop-in kit for
game engines. Getting the full functionality of Unreal into the browser
motivated the requirement for audio processing. But the actual implementation
was muddled from the start: basic audio playback remains challenging (try to
stream a BGM loop instead of load+uncompress and discover to your woe that
it's not going to loop gaplessly, even when the codec is designed to allow
that), and my hobby stab at an independent implementation ran out of gas when
I tried to get their envelope model working. The spec has a lot of features
but not enough detail, and my morale sank further when I looked at how Chrome
did it (stateful pasta code). I got something half-working, put it aside and
never came back.

OTOH I had also tried Mozilla's system. That was very simple, and I got a
synth working in no time at all with decent performance and latency.
Optimizing from that point would have been the way to do it, but something in
browser vendor politics at that time led to it being dropped.

~~~
CamperBob2
_IIRC games made pretty good use of MPU-401 intelligent mode to drive the
MT-32 module_

Very few games used the MPU-401's intelligent mode, actually. Never mind how I
know, that was a long time ago...

------
roca
Here's my take on the history here: [http://robert.ocallahan.org/2017/09/some-
opinions-on-history...](http://robert.ocallahan.org/2017/09/some-opinions-on-
history-of-web-audio.html) From the beginning it was obvious that JS sample
processing was important, and I tried hard in the WG to make the Web Audio API
focus on that, but I failed.

~~~
mov
Back then I followed the discussion a bit, along with your alternative spec,
which was really interesting. If I remember well, it would take lots of work,
and you were the only one working to get it implemented in FF. Are there any
plans to get back to that? Maybe as an independent API for JS sample
processing by workers only, in parallel with WA? Congratulations on your past
efforts and thanks in advance for your answers.

~~~
roca
Audio Worklets are the future of JS audio processing.

------
fenomas
I have to cast a vote in opposition here.

I've been heavily into procedural audio for a year or two, and have had no big
issues with using Web Audio. There are solid libraries that abstract it away
(Tone.js and Tuna, e.g.), and since I outgrew them, working directly with
audio nodes and params has been fine too.

The big caveat is, when I first started I set myself the rule that I would not
use script processor nodes. Obviously it would be nice to do everything
manually, but for all the reasons in the article they're not good enough, so I
set them aside, and everything's been smooth since.

So I feel like the answer to the article's headline is, _today as of this
moment_ the Web Audio API is made for anyone who doesn't need script nodes. If
you can live within that constraint it'll suit you fine; if not it won't.

(Hopefully audio worklets will change this and it'll be for everyone, but I
haven't followed them and don't know how they're shaping up.)

~~~
Jasper_
If you're a creative mind and you constrain yourself to the effects available
in Web Audio, I'm sure you'll be right at home.

The effects are useful in one setting: hobbyist and toy usage, where you
really don't have that many constraints and can play with whatever cool things
are around. That said, I'm sure you'd actually get a lot more mileage out of a
library of user-made script nodes, rather than whatever the browsers have
built for you.

If you're trying to build something production-ready, or port an existing
system to the web, most of the fun toys seem like just that: toys.

AudioWorklets don't look like they would improve things for me, but that's a
topic for another blog post.

~~~
fenomas
I didn't say I was making a trivial toy that isn't production-ready. :| I just
said I'm not using script nodes, and I think that's what TFA boils down to -
half of it is about script nodes not being usable and the other half is about
sample buffers not being suitable replacements for script nodes.

And obviously not having raw script access isn't a _good_ thing. Nonetheless,
the other nodes mostly work as advertised, in my limited experience so far, so
the stuff that you'd expect to be able to do with them (e.g. FM/AM synthesis)
seems to work pretty well.

> AudioWorklets don't look like they would improve things for me, but that's a
> topic for another blog post.

AFAIK worklets are supposed to be script processor nodes that work
performantly. They wouldn't solve the sample rate problems mentioned in TFA
but apart from that I'd think they should be pretty usable if they someday
work as advertised.

~~~
bigbadotis
Agreed that you can do much more than "trivial toys" with the current WAAPI!
But you can only do a small part of FM without feedback (unless you're just
talking about vibrato as opposed to canonical FM synthesis). Look at the
modulation paths (aka algorithms) of original Yamaha FM synths...
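To make the feedback point concrete: a single feedback operator is just per-sample phase modulation where the previous output leaks back into the phase, something the node graph can't express because the loop would have to be shorter than a 128-sample block. A plain-JS sketch (function and parameter names are mine):

```javascript
// Per-sample phase modulation with operator feedback: the previous output
// sample is fed back into the phase, scaled by `beta`. This is the topology
// the built-in nodes can't express, since the feedback path would need a
// sub-128-sample loop in the audio graph.
function fmFeedback(freq, beta, sampleRate, n) {
  const out = new Float32Array(n);
  let prev = 0;
  for (let i = 0; i < n; i++) {
    out[i] = Math.sin((2 * Math.PI * freq * i) / sampleRate + beta * prev);
    prev = out[i];
  }
  return out;
}
```

The classic Yamaha algorithms chain several such operators, with feedback usually on one of them; without per-sample feedback you only get the modulator-into-carrier ("vibrato-like") subset.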

------
kowdermeister
Stopped reading at: "Something like the DynamicsCompressorNode is practically
a joke: basic features from a real compressor are basically missing, and the
behavior that is there is underspecified such that I can’t even trust it to
sound correct between browsers. "

Then if you look into it:

    
    
    dictionary DynamicsCompressorOptions : AudioNodeOptions {
        float attack = 0.003;
        float knee = 30;
        float ratio = 12;
        float release = 0.25;
        float threshold = -24;
    };

Which are indeed the basics that you need and totally enough for most use
cases.
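For what it's worth, the spec does define how threshold/knee/ratio combine into a static gain curve (attack/release then smooth that gain over time). A sketch of the soft-knee curve in plain JS, using those defaults (the helper name is mine):

```javascript
// Static soft-knee compression curve, all values in dB: unity below the
// knee, a quadratic interpolation inside it, and ratio-limited above it.
// This is only the static curve; the real node also applies attack/release
// smoothing to the computed gain.
function compressDb(x, { threshold = -24, knee = 30, ratio = 12 } = {}) {
  if (2 * (x - threshold) < -knee) return x; // below the knee: unity gain
  if (2 * Math.abs(x - threshold) <= knee) { // inside the soft knee
    const d = x - threshold + knee / 2;
    return x + ((1 / ratio - 1) * d * d) / (2 * knee);
  }
  return threshold + (x - threshold) / ratio; // above the knee
}
```

The article's complaint is less about these parameters and more about everything a hardware compressor has around them (sidechain input, makeup gain, detector modes) and the underspecified smoothing behavior.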

Check out a vintage compressor that has a dozen implementation as VST plugins:

[http://media.uaudio.com/assetlibrary/t/e/teletronix_la2a_car...](http://media.uaudio.com/assetlibrary/t/e/teletronix_la2a_carousel_1_1.jpg)

~~~
Jasper_
Where's the sidechain input?

~~~
kowdermeister
You can access the reduction property and connect it to a gain node of another
source.

[https://developer.mozilla.org/en-
US/docs/Web/API/DynamicsCom...](https://developer.mozilla.org/en-
US/docs/Web/API/DynamicsCompressorNode/reduction)

~~~
Jasper_
That's using the output of the compressor to drive another node. A sidechain
compressor has two inputs.

~~~
kowdermeister
I know what sidechain compression is; yes, you need to code a little bit more
:)

------
fzzzy
Mozilla had a competing api that just worked with sample buffers.
Unfortunately it didn't win the standardization battle.

[https://wiki.mozilla.org/Audio_Data_API](https://wiki.mozilla.org/Audio_Data_API)

~~~
EdSharkey
I always thought some browser vendors who own mobile app stores wouldn't
appreciate gamers having access to a distribution channel for great games on
their platform that they didn't control. You can't have great games without
great sound, so their mucking up the Sound API would be a nice way to stall
the emergence.

It's a conspiracy theory, I know. Reality is probably far more boring and
depressing. :/

Like the blog poster, I cut my teeth on the Mozilla API, and I was able to get
passable sound out of a OPL3 emulator in a week's time. Perhaps Mozilla could
convince other browser vendors to adopt their API _in addition to_ Web Audio
API?

~~~
andrewguenther
My theory is that Google used its influence to hinder the API so they could
work around the problems with Android's audio stack. They pushed for an API
they knew they could get to work on Chrome for Android, rather than fixing
Android (which is supposedly improved in 8.0).

~~~
roca
I doubt that theory. Chris Rogers @ Google drove the Web Audio API design, and
he was recently ex Apple's Core Audio team, and probably neither knew nor
cared about Android. More history: [http://robert.ocallahan.org/2017/09/some-
opinions-on-history...](http://robert.ocallahan.org/2017/09/some-opinions-on-
history-of-web-audio.html)

~~~
mort96
Some person working at Google didn't know or care about Android? It doesn't
seem all too unlikely that while he personally didn't care, his corporate
overlords told him to work within the constraints of Android.

~~~
roca
Chris Rogers started implementing his API in 2009, about three years before
Chrome for Android was first released.

And do you seriously think in 2009 some corporate overlord said to Chris
Rogers, "Android is going to be big, and so is Chrome for Android, but we've
decided Android will have a crappy audio stack for several years, so you need
to design around that"?

~~~
mort96
I didn't know it was in 2009. I don't think it would've been too unlikely if
he started it in, say, 2014.

------
mcbits
I tried making a simple Morse code trainer using the Web Audio API, which
seemed perfectly suited to the task, but I ran into two major problems:

1. Firefox always clicks when starting and stopping each tone. I think that's
due to a longstanding Firefox bug and not the Web Audio API. I could _mostly_
eliminate the clicks by ramping the gain, but the threshold was different for
each computer.

2. This was the deal-breaker. Every mobile device I tested had such terrible
timing in JavaScript (off by tens of milliseconds) that it was impossible to
produce reasonably correct-sounding Morse code faster than about 5-8 WPM.

I found these _implementation_ problems more frustrating than the API itself.
At this point I'm pretty sure the only way to reliably generate Morse code is
to record and play audio samples of each character, which wastes bandwidth and
can be done more easily without using the Web Audio API at all.
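For context on why tens of milliseconds is fatal here: under the standard "PARIS" timing convention, element lengths shrink quickly with speed, so the jitter is soon as long as the elements themselves. The arithmetic (helper name is mine):

```javascript
// Standard ("PARIS") Morse timing: at W words per minute, one dit lasts
// 1.2 / W seconds. A dah is 3 dits; gaps are 1, 3 and 7 dits.
function ditSeconds(wpm) {
  return 1.2 / wpm;
}
// At 20 WPM a dit is only 60 ms, so timer jitter of "tens of milliseconds"
// is on the order of an entire element -- enough to garble characters.
```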

~~~
kobeya
> Firefox always clicks when starting and stopping each tone. I think that's
> due to a longstanding Firefox bug and not the Web Audio API. I could mostly
> eliminate the clicks by ramping the gain, but the threshold was different for
> each computer.

You sure it is not due to the sound files you are using not having a
normalized start?

~~~
mcbits
There were no sound files. I used an oscillator and changed the gain with
`setTargetAtTime` to briefly fade in and out. That should prevent any
clicking, but in Firefox it required an excessive amount of time.
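The reason `setTargetAtTime` needs a surprising amount of time is baked into its definition: the spec curve is a decaying exponential that never actually reaches the target, so you must wait several time constants before the residual is inaudible. A sketch of that math (helper names are mine):

```javascript
// Value of a setTargetAtTime curve, per the Web Audio spec:
// v(t) = target + (v0 - target) * exp(-(t - t0) / timeConstant)
function targetAtTimeValue(v0, target, timeConstant, dt) {
  return target + (v0 - target) * Math.exp(-dt / timeConstant);
}

// The curve never quite reaches the target; to push the residual down by
// `attenuationDb` you need about timeConstant * ln(10^(dB/20)) seconds
// (~6.9 time constants for 60 dB) before e.g. stopping the oscillator,
// or the truncation itself clicks.
function fadeTimeFor(timeConstant, attenuationDb) {
  return timeConstant * Math.log(10 ** (attenuationDb / 20));
}
```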

------
Tanner
This article focuses on emscripten examples and for good reason! The effort to
resolve the differences between OpenAL and Web Audio has been ongoing, and is
exacerbated by Web Audio's API churn, deprecations and poor support.

That said, this current pull request on emscripten is a fantastic step forward
and I'm very excited to see its completion:
[https://github.com/kripken/emscripten/pull/5367](https://github.com/kripken/emscripten/pull/5367)

~~~
jpernst
I'm one of the authors of this PR, and yes, WebAudio's baffling lack of proper
consecutive buffer queuing has been no small source of frustration. They seem
to have put so much effort into adding effects nodes and other such things,
but something as simple as scheduling one sound to play gaplessly after
another can't be (easily) done. Requests for such support have been batted
aside as unnecessary, which is funny considering where all the effort is going
instead.

To do it properly would require just giving up on WebAudio's features
completely and doing all the mixing in software via WebAssembly. Honestly
though, if you're going to do that, you may as well just compile OpenAL-Soft
with emscripten and use that, so I opted to just try to get the best out of
WebAudio that I could. Hopefully it's good enough.

------
aaroninsf
Plus one for sure.

I put some weekends into trying to build a higher-level abstraction framework
of sorts for my own sound art projects on top of Web Audio, and it was full of
headaches for similar reasons to those mentioned.

The thing that I put the most work into is mentioned here, the lack of proper
native support for tightly (but prospectively dynamically) scripted events,
with sample accuracy to prevent glitching.

Through digging and prior work I came to a de facto standard solution using
two layers of timers: one in Web Audio (which supports sample accuracy but
gives you no hook to e.g. cancel or reschedule events), and one using coarse
but flexible JS timers. Fugly, but it worked. But why is this necessary...!?
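For readers who haven't seen it, the two-timer workaround looks roughly like this: a coarse JS timer wakes up every few tens of milliseconds and commits, on the sample-accurate audio clock, every event that falls inside a short lookahead window; events still in the queue can be cancelled or rescheduled. A sketch with the context injected so it can run against a fake clock (all names are mine):

```javascript
// Two-clock lookahead scheduler: `ctx` only needs a currentTime property
// (an AudioContext in the browser, a fake in tests), and `scheduleEvent`
// does the actual sample-accurate call, e.g. source.start(ev.time).
function makeScheduler(ctx, scheduleEvent, { lookahead = 0.1 } = {}) {
  const queue = []; // events { time, payload } not yet committed to the audio clock
  return {
    add(time, payload) { queue.push({ time, payload }); },
    tick() { // call from setInterval(() => s.tick(), 25) in a real app
      const horizon = ctx.currentTime + lookahead;
      for (let i = queue.length - 1; i >= 0; i--) {
        if (queue[i].time < horizon) {
          scheduleEvent(queue[i]); // committed: now sample-accurate, but final
          queue.splice(i, 1);
        }
      }
    },
  };
}
```

The lookahead is the trade-off knob: shorter means more responsive cancellation, longer means more tolerance for JS timer jitter.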

There's a ton of potential here, and someone like myself looking to implement
interactive "art" or play spaces is desperate for a robust cross-platform web
solution; it'd truly be a game-changer...

...so far Web Audio isn't there. :/

Other areas I wrestled with:

  * buffer management, especially with CORS issues and having to write my own
stream support (preloading then freeing buffers in series, to get seamless
playback of large resources...)

  * lack of direction on memory management; in particular, what the
application is obligated to do to release resources and prevent memory leaks

  * the "disposable buffer" model makes perfect sense from an implementation
view but could have easily been made a non-issue for clients. This isn't GL;
do us some solids yo.

Will keep watching, and likely, wrestling...

------
cyberferret
I had a discussion on Twitter recently about a possible use case for WebAudio:
sound filters, in pretty much the same way as Instagram popularised image
filters for popular consumption.

One thing that really irks me at the moment is the huge variation in sound
volume of the increasing plethora of videos in my social media feed. If there
was some way we could use a real time WebAudio manipulation on the browser to
equalise the volume on all these home made videos, so much the better. Not
just volume up/down, but things like real time audio compression to make
vocals stand out a little.

Add delay and reverb to talk tracks etc. for podcasts.

EQ filters to reduce white noise on outdoor videos etc. would also help.
People with hearing difficulties in particular ranges, or who suffer from
tinnitus etc., would be able to reduce certain frequencies via parametric
equalisation.

It would be intriguing to see a podcast service or SoundCloud etc. offer real
time audio manipulation, or let you add post processing mastering effects on
your audio productions before releasing them in the wild.

------
trejj
Curiously, reading through the Web Audio API bug tracker turns up items such
as [https://github.com/WebAudio/web-audio-
api/issues/1305](https://github.com/WebAudio/web-audio-api/issues/1305) and
[https://github.com/WebAudio/web-audio-
api/issues/938](https://github.com/WebAudio/web-audio-api/issues/938), which
echo the point from the article quite well. Oh dear...

~~~
kevingadd
The huge set of deficiencies in this API were communicated to the designers
from the very beginning, and unfortunately most of them went unresolved for a
long time (or indefinitely). It's a real bummer.

For a while there was a huge footgun that made it easy to synchronously decode
entire mp3 files _on the ui thread_ by accident. Oops (:

Even better, for a while there was no straightforward way to pause playback of
a buffer. It took a while for the spec people to come around on that one,
because they insisted it wasn't necessary.

------
joshontheweb
I'm running a SaaS built on the back of the Web Audio + WebRTC APIs. While it
isn't perfect, it is still pretty impressive what progress has been made in
the last few years, allowing you to do all kinds of audio synthesis and
processing right in the browser. It seems to me that it is a pretty general-
purpose API in intent. The approach seems to be to do the easy, low-hanging
fruit first and then get to the more complicated things. This doesn't satisfy
any single use case quickly but progress is steady. No doubt it would be nice
if it was totally capable out of the gate but I'm simply happy that even the
existing capabilities are there. Be patient, it will improve vastly over time.

EDIT: I should also add that the teams behind the apis are quite responsive.
You can make an impact in the direction of development simply by making your
needs/desires known.

~~~
j_s
"Record in Lossless WAV" as feature #2 made me smile. Success to you with your
SaaS!

~~~
joshontheweb
Thank you ;)

------
cyberferret
I worked on a (now abandoned) project a while back using the Web Audio API,
but it was NOT for audio at all - in fact, it was to build a cross-platform
MIDI controller for a guitar effects unit.

As someone mentioned elsewhere on this thread Android suffered from a crappy
Audio/MIDI library. iOS's CoreMIDI was great, but not transportable outside of
iOS/OSX. The Web MIDI API seemed a great way to go - just build a cross-
platform interface as an Electron app and use the underlying API to fire off
MIDI messages.

Unfortunately, at the time of developing the project, the Web MIDI SysEx spec
was still too fluid or not completely defined, so I had trouble
sending/reading SysEx messages via the API, and thus shelved the project for
another day.

~~~
pitaj
I think I know what you mean, but in my mind, MIDI is very much Audio.

~~~
BHSPitMonkey
It's more about expressing events in time than audio necessarily... Sometimes
it's used just to keep other devices in sync with a tempo, sometimes it's used
to control lights, and -sometimes- it tells an actual audio synthesizer when
to start and stop making noise.

------
gwbas1c
> "16 bits is enough for everybody"

Not really; the full range of human hearing is over 120 dB. Getting to 120 dB
within 16 bits requires tricks like noise shaping. Otherwise, simple rounding
at 16 bits gives about 80 dB and horrible-sounding artifacts around quiet
parts.

It's even more complicated in audio production, where 16 bits just doesn't
provide enough room for post-production editing.

This is why the API is floating-point. Things like noise shaping need to be
encapsulated within the API, or handled at the DAC if it's a high-quality one.
(Edit) There's nothing wrong with consumer-grade DACs that are limited to
about 80-90 dB of dynamic range, but the API shouldn't force that limitation
on the entire world.
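As a point of reference, the usual rule of thumb for an ideal N-bit quantizer (full-scale sine over quantization noise) is about 6.02·N + 1.76 dB; undithered truncation sounds worse than that figure suggests near quiet passages, which is where the noise-shaping tricks come in. A throwaway helper (name is mine):

```javascript
// Rule-of-thumb SNR / dynamic range of an ideal N-bit quantizer, in dB:
// a full-scale sine measured against uniform quantization noise.
const quantizerDynamicRangeDb = (bits) => 6.02 * bits + 1.76;
// 16 bits -> ~98 dB in theory; the >120 dB range of human hearing
// therefore needs more bits, or noise shaping, as described above.
```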

~~~
Jasper_
In the same vein, sure, the audio production nodes should use floating point,
but for simple playback, which I'd argue is the 90% case, it shouldn't require
me to use floats. Real-time audio toolkits like fmod and wwise all work in
fixed-point formats on device, because the cost of floats is too expensive for
realtime audio.

The floats are only required if you have a complex audio graph -- with a
sample-based API, you can totally do the production in floats, and then have a
final mix pass which does the render to an Int16Array. All in JavaScript.
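The final mix pass being described is tiny; a sketch, assuming float samples already mixed into [-1, 1] (the function name is mine):

```javascript
// Final mix pass: clamp float samples to [-1, 1] and quantize into an
// Int16Array. No dithering here -- this is the bare minimum render step.
function mixToInt16(floatSamples) {
  const out = new Int16Array(floatSamples.length);
  for (let i = 0; i < floatSamples.length; i++) {
    const s = Math.max(-1, Math.min(1, floatSamples[i])); // hard clip
    out[i] = Math.round(s * 32767);
  }
  return out;
}
```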

~~~
gwbas1c
> because the cost of floats is too expensive for realtime audio

Round(Sample * 32767) is really that slow?

If you're doing integer DSP, you still need to deal with 16 -> 24, or 24 -> 16
overhead; and then the DAC still is converting to its own internal resolution.
(Granted, 16 <-> 24 can be simple bit shifting if aliasing is acceptable.)
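The bit-shifting version of the 16 <-> 24 conversion mentioned in that parenthetical is about as cheap as DSP gets (no dither or scaling correction here, which is exactly the aliasing trade-off being granted):

```javascript
// 16 <-> 24 bit conversion as plain shifts: widen by placing the 16-bit
// sample in the top of 24 bits, narrow by truncating the low 8 bits.
const to24 = (s16) => s16 << 8; // 16-bit sample into the top of 24 bits
const to16 = (s24) => s24 >> 8; // truncate 24 bits back down to 16
```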

------
captn3m0
I once attempted to do RS232 decoding in WebAudio (A speedstack timer of mine
does RS232 over aux) and faced these exact issues before giving up.

------
cromwellian
Isn't Web Audio based on Mac OS X's audio API?

I think the whole point is that JavaScript used to be slow, and using the CPU
as a DSP to process samples prevents acceleration. Seems to me what is needed
is something like "audio shaders", equivalent to compute/pixel shaders, that
you farm off to an OpenAL-like API which can be compiled to run on native HW.

Even if you grant that emscripten produces reasonable code, it's still
bloated, and less efficient on mobile devices than leveraging OS-level DSP
capability.

~~~
iainmerrick
_Isn't Web Audio based on Mac OS X's audio API?_

I hadn't heard that, but some of the "processor node" stuff does sound
familiar.

What OS X _also_ has, though, is proper low-level low-latency sound APIs. And
that's why there are so many Mac (and iOS) music apps.

~~~
kevingadd
It is more or less based on OS X audio, yes. The author of the Web Audio spec
previously was an architect on Core Audio at Apple. He basically moved over to
Google, implemented his chosen subset of Core Audio in webkit, and shipped it
prefixed. Then the evangelism group got big players like Rovio to ship apps
that depended on the half-baked prefixed API so it was the de-facto standard
for game SFX on the web.

~~~
iainmerrick
Interesting! It's surprising to me that Apple (or Google?) is behind such a
poor API, after Apple did such a good job with both Canvas and CSS animation.

------
barryhoodlum
How to play a sine wave:

    
    
      const audioContext = new AudioContext();
      const osc = audioContext.createOscillator();
      osc.frequency.value = 440;
      osc.connect(audioContext.destination);
      osc.start();
    

"BufferSourceNode" is intended to play back samples like a sampler would. The
method the author proposes of creating buffers one after the other is a
bizarre solution.

~~~
Jasper_
I picked a 440Hz sine wave because I didn't want to write a more complex demo
example, knowing full well someone would nitpick this.

Please use your imagination and try to imagine one of infinitely many other
streams that I could make at runtime that are not easily made with the built-
in toy oscillators.

~~~
barryhoodlum
It's a higher level API and you're deliberately ignoring all of its higher
level features and concentrating on the part that clearly is underdeveloped.
Maybe you should use your imagination instead of putting a square peg in a
round hole?

~~~
Jasper_
> It's a higher level API

As the title of the post asks: "I don't know who the Web Audio API is designed
for".

The high-level nodes are not featureful enough for professional audio
production, and too slow and underspecified for game engines like FMOD /
Wwise.

The low-level bits fall somewhere between impractical, deprecated, and
useless.

Who is it designed for?

~~~
barryhoodlum
I think you're missing the point entirely. It's like a modular synthesiser.
It's not "serious business" but this is the browser after all.

Plug a few oscillators into each other and you have an FM synth. Feed delays
into each other, etc, etc. You can do that in a few lines of code with no
dependencies.

To me, that's a huge potential audience.

If you want an array of samples and depend on dozens of JS libs for
functionality, well, I'm sure the AudioWorkers will catch up eventually.

~~~
Jasper_
So, the answer to "Who is the Web Audio API designed for", according to you,
appears to be "people who want primitive FM synthesizers".

Not game developers, not professional audio production people. It's... not an
answer I was expecting, but I suppose it's valid.

~~~
Lazare
Talk about damning with faint praise.

------
jancsika
Just from skimming the spec, the AudioWorklet interface looks very close to
what is needed to build sensible, performant frameworks for audio profs and
game designers.

So the most important question is: why isn't this interface implemented in any
browser yet?

That a BufferSourceNode cannot be abused to generate precision oscillators
isn't very enlightening.

~~~
symstym
It's been under development for a very long time in Chromium:
[https://bugs.chromium.org/p/chromium/issues/detail?id=469639](https://bugs.chromium.org/p/chromium/issues/detail?id=469639)

I think there were some false starts where previous specs were written and
then found to have issues.

~~~
kevingadd
For bonus points, prep work done for the eventual rollout of AudioWorklet in
Chromium shipped a bug to release channel Chrome that breaks all uses of Web
Audio. The bug wasn't caught in beta/canary channels because it only affects
some user machines, and they can't revert the bug because of architectural
dependencies. A basic way to summarize it is that AudioWorklet required the
threading structure of Web Audio to change for safety reasons, and this
results in a sort of priority inversion that can cause audio mixing to fall
behind forever. Even simple test cases where you play a single looping sound
buffer will glitch out repeatedly as a result.

So basically, Web Audio is unusable in release Chrome on a measurable subset
of user machines, for multiple releases (until the fix makes it out), all
because of AudioWorklet. Which isn't available yet.

I am being a little unfair here, because this bug isn't really the fault of
any of the people doing the AudioWorklet work. But it sucks, and the blame for
this horrible situation lies largely with the people who designed WebAudio
initially. :(

~~~
jancsika
> But it sucks, and the blame for this horrible situation lies largely with
> the people who designed WebAudio initially

Just skimming it a bit, it seems like they tried to make the same kind of
"managed" framework for an audio graph that the SVG spec provides for vector
graphics.
And even if SVGs are janky the static image still succeeds in serving a
purpose. But if you get dropouts or high latency in audio, there isn't much
more it can be used for. (Aside from browser fingerprinting :)

------
diminish
>> Can the ridiculous overeagerness of Web Audio be reversed? Can we bring
back a simple “play audio” API

To be frank, the graphics world has had a standard of sorts (OpenGL) next to
DirectX for a long time, so WebGL had a good example to follow. In the audio
world, however, we haven't seen a cross-platform quasi-standard spec covering
Mac, Linux and Windows. So IMHO non-web audio also lacks common standards for
mixing, sound engineering and music-making. That's why web audio appears to
lack a use case. IMHO, that smells like opportunity.

I use Web Audio in canvas/WebGL-based games where music making is needed. I
understand the issues - we definitely need more than "play" functionality.

~~~
mikepurvis
Whether the API could be used to play MOD files is a good litmus test of its
suitability for a variety of purposes: it covers repeatedly playing samples
at differing volumes and pitches, simultaneously.
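The per-channel core of a MOD player is essentially this: read one stored sample at an arbitrary pitch ratio and volume, interpolating between neighbours. A plain-JS sketch of that inner loop (names are mine; real players add looping and per-tick effects on top):

```javascript
// Play back `sample` at a pitch `ratio` (>1 = higher pitch, consumes the
// source faster) and a `volume` scale, with linear interpolation between
// neighbouring source samples. Stops at the end of the sample (no loop).
function resample(sample, ratio, volume, nOut) {
  const out = new Float32Array(nOut);
  for (let i = 0; i < nOut; i++) {
    const pos = i * ratio;
    const j = Math.floor(pos);
    if (j + 1 >= sample.length) break; // ran off the end of the source
    const frac = pos - j;
    out[i] = volume * ((1 - frac) * sample[j] + frac * sample[j + 1]);
  }
  return out;
}
```

A tracker does this for every channel on every tick, then sums the channels, which is exactly the "repeatedly, at differing volumes and pitches, simultaneously" workload.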

I'd rather have a comprehensive API that someone can dumb down than one that's
so crippled as to be unusable beyond very basic functionality.

~~~
rzzzt
A FastTracker 2 player was discussed on HN quite some time ago (spoiler: uses
ScriptProcessorNode):
[https://news.ycombinator.com/item?id=10538791](https://news.ycombinator.com/item?id=10538791)

------
raphlinus
I think things will get a lot better when the underlying enabling technology
is in good shape. The audio engine needs to be running in a real-time thread,
with all communication with the rest of the world in nonblocking IO. There are
lots of ways to do this, but one appealing path is to expose threading and
atomics in wasm; then the techniques can be used for lots of things, not just
audio. Another possibility is to implement Worker.postMessage() in a
nonblocking way. None of this is easy, and will take time.

If we _had_ gone with the Audio Data API, it wouldn't have been satisfying,
because the web platform's compute engine simply could not meet the
requirement of reliably delivering audio samples on schedule. Fortunately,
that is in the process of changing.

Given these constraints, the complexity of building a signal processing graph
(with the signal path happening entirely in native code) is justified, if
those signal processing units are actually useful. I don't think we've seen
the evidence for that.

I'd personally be happy with a much simpler approach based on running wasm in
a real-time thread, and removing (or at least deprecating) the in-built
behavior. It's very hard to specify the behavior of something like
DynamicsCompressorNode precisely enough that people can count on consistent
behavior across browsers. To me, that's a sign perhaps it shouldn't be in the
spec.

Disclaimer: I've worked on some of this stuff, and have been playing with a
port of my DX7 emulator to emscripten. Opinions are my own and not that of my
employer.

~~~
Jasper_
> If we had gone with the Audio Data API, it wouldn't have been satisfying,
> because the web platform's compute engine simply could not meet the
> requirement of reliably delivering audio samples on schedule.

1. I'm not convinced this is the case. From what I see, GC pauses constitute
the big blockers, rather than event processing and repaints. Introducing an
API that's friendlier to GC would be a huge win here.

2. We have WebWorkers. What would have prevented a WebWorker from calling new
global.Audio() for the Audio Data API?

~~~
raphlinus
1. This is going to depend a lot on the app; doing an actual DAW is going to
require some pretty heavy processing. It also depends on the performance goal.
Truly pro audio would be a 10ms end-to-end latency, which is extremely
unforgiving.

2\. Some form of WebWorker is obviously where we're going. But does
postMessage() have the potential to cause delay in the worker that receives
it? (There are ways to solve this but it requires some pretty heavy
engineering)

------
Pxtl
So it sounds like the Web Audio API is exactly as good at modeling audio as the
web GUI API is at modeling GUIs.

------
throwa34943way
For reference: [https://www.audiotool.com/app](https://www.audiotool.com/app)
(using Flash).

You just can't do that with the same tightness of rhythm on low-end hardware
with web tech today. Flash was bad, yet Flash also opened up insane
possibilities on the web for multimedia applications that just can't be
matched with web tech. ASM.js might fill the gap, but I haven't seen any
equivalent yet.

------
roma1n
I briefly tried Web Audio to implement a Karplus-Strong synthesizer (about the
simplest thing in audio synthesis I guess?).

Without using ScriptProcessorNode, there was no way of tuning the synthesizer,
because any loop in the audio graph incurs a delay of at least 128 samples.

Maybe a more "compilation-oriented" handling of the audio graphs (at the
user's choice) could help overcome this?
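For reference, the core of Karplus-Strong really is tiny: a noise-seeded delay line plus a two-point average. That is exactly why the 128-sample minimum loop delay hurts, since the delay line itself often needs to be shorter than that. A plain-JS sketch of the textbook algorithm (not Web Audio wiring; the 0.5 averaging coefficient is the classic choice):

```javascript
// Karplus-Strong plucked string: a white-noise burst circulates through a
// delay line whose output is averaged with the next sample (a gentle
// low-pass that makes the string decay). Pitch = sampleRate / delayLen.
function karplusStrong(frequency, durationSamples, sampleRate) {
  const delayLen = Math.round(sampleRate / frequency);
  // Seed the delay line with noise (the "pluck").
  const delay = new Float32Array(delayLen);
  for (let i = 0; i < delayLen; i++) delay[i] = Math.random() * 2 - 1;

  const out = new Float32Array(durationSamples);
  let pos = 0;
  for (let i = 0; i < durationSamples; i++) {
    const next = (pos + 1) % delayLen;
    const sample = 0.5 * (delay[pos] + delay[next]);
    out[i] = sample;
    delay[pos] = sample; // feed the filtered sample back into the loop
    pos = next;
  }
  return out;
}
```

Note that a 440 Hz string at 44.1 kHz needs a delay line of only ~100 samples, well under the 128-sample minimum a Web Audio graph cycle imposes.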

------
k__
Are there any good audio APIs out there?

Video AND audio? They got you all covered with nice APIs!

Just audio? You're screwed!

~~~
sporkologist
Audio always seems to be neglected and underfunded compared with video. BeOS
seems like one of the few systems that did it right.

------
_pmf_
Now step back and honestly think about which web API is actually powerful and
nice to use and makes the impression that it has been carefully crafted by a
domain expert.

I cannot think of one.

------
derefr
Question: is the "point" of Web Audio to expose the native hardware-
accelerated functionality of the underlying audio controller, through a
combination of the OS audio driver + shims? Or is it more an attempt to
implement everything in userspace, in a way equivalent to any random C++ DSP
graph library? I've always thought it was the former.

~~~
Jasper_
Most consumer-grade audio hardware really only does playback. We've been doing
software audio since around the turn of the century.

In Chrome's implementation, none of the mixing, DSP, etc. go through the
hardware, and I'm more than certain that's the case for every other browser
out there.

~~~
derefr
Audio controllers do at least do hardware-accelerated _decoding_ of audio
streams in e.g. H.264, though, yes?

But my question was more like: is Web Audio a mess mostly because it's an
attempt to _expose_ the features of the twenty-odd different OS audio backends
on Windows/Mac/Linux, where the odd inclusions and exclusions map to the
things that all the OS audio backends happen to share that Chrome can then
expose?

~~~
symstym
> is Web Audio a mess mostly because it's an attempt to expose the features of
> the twenty-odd different OS audio backends

That is a good guess, but no. The main features of the Web Audio API (built-in
nodes, etc.) are not backed by any kind of OS-level backend, it's all
implemented in software in the browser. The spec design was based on what
someone thought were useful units of audio processing. It's not a
wrapping/adaptation of some pre-existing functionality.

------
mncharity
Web API standardization for VR/AR is _currently_ a work in progress. And it's
been... less than pretty.

So if you've been wanting to try some intervention to make web standards less
poor, or just want to observe how they end up the way they do, here's an
opportunity.

------
creatonez
>you can’t directly draw DOM elements to a canvas without awkwardly porting it
to an SVG

This is not a wart, this is a security feature. Of course, it wouldn't be a
necessary limitation if the web wasn't so complicated, but the web is
complicated.

~~~
Jasper_
What's the security issue in play here?

~~~
creatonez
Just one example. The canvas API can grab the image data on the canvas. If you
could rasterize arbitrary DOM nodes then you could very easily fingerprint
users by, say, checking which fonts are installed. You could also load
external resources such as images and iframes bypassing same-origin policy, so
if your bank's website was configured incorrectly, a malicious site could
steal information by taking screenshots of a canvas.

~~~
Jasper_
You can already draw non-same-origin images to the canvas using drawImage.
This marks a special "origin-clean" flag which is checked when someone tries
to call toDataURI or getImageData on the canvas [0] I would be OK if drawing
any DOM node to the canvas cleared the origin-clean flag.

[0]
[https://html.spec.whatwg.org/multipage/canvas.html](https://html.spec.whatwg.org/multipage/canvas.html)

------
pcsanwald
Kinda tangential to the thread, but what's the best book for an introduction to
audio programming for an experienced, language agnostic coder (java, c, c++,
obj-c, etc)?

~~~
soundwave106
I'm not sure about "best", but I got a lot out of Will Pirkle's two books on
programming synthesizer / effect plugins
([http://www.willpirkle.com/about/books/](http://www.willpirkle.com/about/books/)).

For a more generic guide I've heard a lot of good things about a free (in
electronic form) book called DSPGuide
([http://dspguide.com/](http://dspguide.com/)). Haven't had a chance to dive
into this one, though.

------
rl3
> _It [WebGL] gives you raw access to the GPU ..._

Not to be semantic, but that's technically incorrect. Indeed, if WebGL were to
be supplanted by a lower-level graphics API, that would make a lot of people
happy.[0]

As far as the author's thesis concerning the Web Audio API: I agree that it's
a total piece of shit.

[0]
[https://news.ycombinator.com/item?id=14930824](https://news.ycombinator.com/item?id=14930824)

~~~
rl3
Not to be _pedantic_ , rather.

I've come to suspect that my phone's autocorrect functionality, HN's two-hour
edit window, and my own brain routinely conspire against me to paint a picture
of total idiocy.

------
dmitriid
One word: w3c.

I've said it before, I'll say it again: it exists in a vacuum, and is run by
people who have never done any significant work on the web, with titles like
"Senior Specifications Specialist". Huge chunks of their work are purely
theoretical (note: not academic, just theoretical) and have no bearing on the
real world.

~~~
pcwalton
The Web Audio API was the work of the Chrome team, not the W3C in isolation.
I'm right there with you as far as W3C criticism is concerned, but they don't
deserve the blame in this case.

~~~
roca
It was really the work of Chris Rogers. If more core Chrome people had been
involved, I suspect things would have worked out better.

------
irascible
I disagree with a lot of the assertions in this blog. You have to suspend some
of your expectations since this is all JS. You can't have a JS loop feeding
single samples to a buffer; JS isn't deterministic to that level of
granularity. But overall it's fast enough to generate procedural audio in
chunks if you manage the timing. If you check out some of the three.js 3D
audio demos you can see some pretty cool stuff being done with all those crazy
nodes the author is decrying. Hell, I wrote a Tron game and did the audio
using audio node chains, and managed to get something really close to the real
Tron cycle sounds, without resorting to sample-level tweaking, and with >16
bikes emitting procedural audio. I think more focus on the strengths than the
weaknesses is in order. And if you really want to peg your CPU, you can still
use emscripten/wasm or similar to generate buffers, if that's your thing.

~~~
Jasper_
> JS isn't deterministic to that level of granularity

Why not? I linked a test app [0] in my post that generates PCM data on demand,
and fast. It works deterministically on all the browsers. Mozilla certainly
implemented the Audio Data API back in 2011, and it was fast enough for them.

> He'll I wrote a tron game and did the audio using audio node chains and
> managed to get something really close to the real tron cycle sounds, without
> resorting to sample level tweaking

Why couldn't this be a high-level userspace library like three.js? Yes, with a
lot of creative energy, you can recreate a lot of sounds, I'm willing to
believe that. But I think a low-level API would have been more useful from the
get-go.

[0]
[http://magcius.github.io/spc.js/spc.html](http://magcius.github.io/spc.js/spc.html)
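Generating PCM on demand amounts to filling a Float32Array per callback while carrying phase across chunks so they join without clicks. A minimal sketch of such a generator core (the browser-only ScriptProcessorNode wiring is shown in comments; the 1024 buffer size and 440 Hz are just example values):

```javascript
// Returns a function that fills a buffer with the next chunk of a sine
// oscillator, carrying phase across calls so chunks join seamlessly.
function makeSineGenerator(frequency, sampleRate) {
  let phase = 0;
  const step = (2 * Math.PI * frequency) / sampleRate;
  return function fill(out) {
    for (let i = 0; i < out.length; i++) {
      out[i] = Math.sin(phase);
      phase += step;
      if (phase > 2 * Math.PI) phase -= 2 * Math.PI; // keep phase bounded
    }
    return out;
  };
}

// In a browser, this plugs into a ScriptProcessorNode callback:
//   const ctx = new AudioContext();
//   const node = ctx.createScriptProcessor(1024, 1, 1);
//   const gen = makeSineGenerator(440, ctx.sampleRate);
//   node.onaudioprocess = e => gen(e.outputBuffer.getChannelData(0));
//   node.connect(ctx.destination);
```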

------
megamindbrian
Web audio can be used with this: [https://wavesurfer-js.org/](https://wavesurfer-js.org/)

~~~
CharlesW
I'm not making the connection between the link and the point you're trying to
make. Can you elaborate?

~~~
megamindbrian
I was expressing my excitement over something neat, which implicitly answers
the question in the title: it was designed for me, people who think wave
analysis is neat. There are at least 3 other posts on here saying the basic
wave-analysis features are just fine. Thanks for the down votes!

------
Camillo
The font on this page is insanely thin on my browser (Chrome, OS X).

------
revelation
The major problem with this API is that they couldn't just copy something
designed by people with actual knowledge, as with WebGL. So instead it was
design-by-committee: it does so much that the application should handle
itself, yet its core capabilities are so deficient that no application can
rectify any of it.

~~~
stupidcar
Nope. Web Audio was designed almost entirely by a single person, Chris Rogers,
an engineer with a long history of working on audio for Google, Apple and
Macromedia[1]. Whatever Web Audio's problems, design-by-committee is not their
cause.

[1]
[https://www.linkedin.com/in/diagonal/](https://www.linkedin.com/in/diagonal/)

~~~
bsder
> Web Audio was designed almost entirely by a single person, Chris Rogers, an
> engineer with a long history of working on audio for Google, Apple and
> Macromedia[1].

Who at Apple beat him with a stick to get audio right? Can we get that person
to design the audio APIs for the Web and Android?

(Edit: I realized that this was an unfair comment born of my frustration with
Audio APIs from Google.

The real issue driving this is that audio is _still_ a dumpster fire on
Android. So, if he gives web developers access to audio samples, everybody is
going to expect it to work. And, on Android, it will fail miserably. So,
better to isolate audio functions, give them "fuzzy" latency which you can
bury in C code drivers, and hide the fact that audio on Android is a flaming
pile of poo rather than piss off even more developers and get even more bugs
filed against Android's shitty audio.)

------
arc_of_descent
The Web Audio API is designed for web developers who want to integrate sound
into their web apps. Notifications, etc.

Like that pre-browser era where we had sounds for everything: minimize window,
user logged in, logged out, all that crap.

The API also has good support for visuals, e.g. spectrum analysis. That makes
it pretty good for an introductory course on sound processing.

I wouldn't use it for anything serious like a DAW.

~~~
kevingadd
This explanation doesn't fit, because <audio> already solved all the scenarios
you're describing. Web Audio attempts to solve other problems, and does a bad
job of it.

~~~
arc_of_descent
Yes, <audio> does work. But the Web Audio API has more to offer for the
simpler stuff: simple games, timing, effects, detune, etc.

