
Songbird: Spatial Audio Encoding on the Web - runesoerensen
https://google.github.io/songbird/
======
drewbitllama
Hello everyone, thanks for checking out the new repository. I've resolved the
CDN link issues, but feel free to file any more issues at
[https://github.com/Google/songbird/issues](https://github.com/Google/songbird/issues).
Looking forward to seeing all the great stuff you all make with it. :)

~~~
toomim
Thanks! I'm confused about what this is, exactly, from the description:

> Songbird is a JavaScript API that supports real-time spatial audio encoding
> for the Web using Higher-Order Ambisonics (HOA). This is accomplished by
> attaching audio input to a Source which has associated spatial object
> parameters. Source objects are attached to a Songbird instance, which models
> the listener as well as the room environment the listener and sources are
> in. Binaurally-rendered ambisonic output is generated using Omnitone, and
> raw ambisonic output is exposed as well.

My confusion is that the Web Audio API [1] also supports real-time spatial
audio for the Web [2]. It looks like Ambisonics is a format that encodes
spatial audio into a fixed set of audio channels, rather than just playing
audio into PannerNodes directly.

Some questions: (1) Is Songbird indeed an alternative to the PannerNode API
like I'm suspecting? (2) If so, why would you want to downmix your audio into
a set of intermediate channels, rather than play each source directly into a
PannerNode? (3) Is there any advantage to using Omnitone, which I suspect does
the HRTFs, rather than using a PannerNode and its HRTFs directly?

[1] [https://developer.mozilla.org/en-US/docs/Web/API/Web_Audio_A...](https://developer.mozilla.org/en-US/docs/Web/API/Web_Audio_API)

[2] [https://developer.mozilla.org/en-US/docs/Web/API/Web_Audio_A...](https://developer.mozilla.org/en-US/docs/Web/API/Web_Audio_API/Web_audio_spatialization_basics)

~~~
drewbitllama
Hi Toomim,

Thanks for your interest! Let me try to clarify and answer your questions:

1\. Songbird is indeed an enhanced alternative to PannerNode.

2\. It internally works with ambisonics, but outputs stereo (we use Omnitone
internally to render the multichannel audio down into a stereo track).

The general reason people use ambisonics instead of direct HRTF rendering is
that ambisonics allows for head rotation prior to rendering, so the user
can easily turn their head, etc. without you having to adjust all the incoming
sources' HRTFs.
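To make the head-rotation point concrete, here is a minimal sketch of first-order ambisonic encoding and a yaw rotation of the mixed sound field. It is not Songbird's or Omnitone's code. The function names are hypothetical, and it assumes the AmbiX convention (ACN channel order W, Y, Z, X with SN3D gains), with azimuth measured counter-clockwise from straight ahead.

```javascript
// First-order ambisonics sketch (assumed AmbiX convention: channels W, Y, Z, X).
// Angles are in radians; azimuth is counter-clockwise from straight ahead.
function encodeFirstOrder(sample, azimuth, elevation) {
  const cosEl = Math.cos(elevation);
  return [
    sample,                              // W: omnidirectional component
    sample * Math.sin(azimuth) * cosEl,  // Y: left/right
    sample * Math.sin(elevation),        // Z: up/down
    sample * Math.cos(azimuth) * cosEl,  // X: front/back
  ];
}

// Rotate the whole sound field counter-clockwise about the vertical axis.
// W and Z are unaffected; X and Y rotate like a 2D vector.
function rotateYaw([w, y, z, x], yaw) {
  const c = Math.cos(yaw);
  const s = Math.sin(yaw);
  return [w, x * s + y * c, z, x * c - y * s];
}

// Any number of sources is summed into the same four channels.
function mix(a, b) {
  return a.map((value, i) => value + b[i]);
}

const field = mix(
  encodeFirstOrder(1.0, 0, 0),            // source straight ahead
  encodeFirstOrder(0.5, Math.PI / 2, 0)   // quieter source to the left
);

// The listener turns 90 degrees to the left, so the field is rotated by the
// opposite angle. This is one operation on four channels, however many
// sources were mixed in.
const headYaw = Math.PI / 2;
const rotated = rotateYaw(field, -headYaw);
```

After the rotation, the source that was on the left sits on the front axis and the source that was ahead sits on the right, without touching either source individually.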

The reason we feel Songbird is an upgrade to PannerNode is three-fold:

One, you can control the quality of the localization/spatialization effect by
adjusting ambisonicOrder (1st to 3rd, atm).

Two, PannerNode is costly... 2 convolutions per source, while Songbird is a
fixed number of convolutions regardless of the number of sources, so it ends
up allowing you to get more for less.

Three, PannerNode doesn't support any sort of room modelling and Songbird
produces spatialized (ambisonic) room reflections and reverberation.
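The cost argument in the second point can be sketched as simple arithmetic. The channel count of (order + 1)^2 is standard for ambisonics. The decoder cost below is an assumption: a naive binaural decoder that convolves every ambisonic channel once per ear. Real decoders, Omnitone included, may need fewer convolutions, and all function names here are hypothetical.

```javascript
// Per-source HRTF panning: one convolution per ear for every source.
function perSourceConvolutions(numSources) {
  return 2 * numSources;
}

// An ambisonic sound field of order N has (N + 1)^2 channels.
function ambisonicChannelCount(order) {
  return (order + 1) ** 2;
}

// Assumption: a naive binaural decoder convolving each channel once per ear.
// The cost depends on the order only, not on how many sources were encoded.
function ambisonicConvolutions(order) {
  return 2 * ambisonicChannelCount(order);
}

// Smallest number of sources for which per-source panning needs more
// convolutions than the fixed ambisonic decode (under the assumption above).
function breakEvenSources(order) {
  return Math.floor(ambisonicConvolutions(order) / 2) + 1;
}

for (const order of [1, 2, 3]) {
  console.log(
    `order ${order}: ${ambisonicConvolutions(order)} convolutions, ` +
    `cheaper than per-source panning from ${breakEvenSources(order)} sources`
  );
}
```

Under these assumptions the fixed cost only pays off once a scene has more sources than ambisonic channels. Encoding each source into the field still costs a handful of gain multiplications per source, which this count ignores.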

Hope this helps clarify things! :)

Cheers, Drew

~~~
toomim
That's exactly the information I needed! Thank you! It would be great to have
this on the homepage.

~~~
drewbitllama
Noted! Will add a better explanation to the README. :)

------
TheRealPomax
Every time I see something like this rolling out of Google, all I can think is
"if only this was a real project, and not yet-another-Google-shop-project".
And no, that's not fair towards the people who make it, but then the entire
thing is clearly marked as copyright Google, not copyright the people who
deserve the credit, so Google doesn't even want me to think of this as
something cool made by cool people, but another library pumped out by Google
for the betterment of a market position.

~~~
voiper1
What's bothering you? Google apparently allows people to make awesome stuff
during business time, and then released it under Apache License 2.0, which
seems to be a pretty permissive license
([https://github.com/google/songbird/blob/master/LICENSE](https://github.com/google/songbird/blob/master/LICENSE))

It may not get the attention it deserves, but no project is guaranteed to have
a maintainer-for-life.

~~~
TheRealPomax
Mostly the part where it stays a Google project, and despite Apache licenses,
no one can contribute without signing a CLA that discriminates against any
developer without a phone or a permanent address. "The lawyers insisted" means
I don't like the way you think you're releasing useful code into the world
when you're really presenting something you locked down so hard people need to
give you their direct personal information before you let them help make it
better (Facebook does the same thing, which is why I don't contribute to their
projects despite loving quite a few of them and using them daily).

------
igorgue
OT: This brings me back...
[https://en.wikipedia.org/wiki/Songbird_(software)](https://en.wikipedia.org/wiki/Songbird_\(software\))

~~~
oatmealsnap
Yea...I thought that was gaining a second life. :(

------
bowmessage
Hmm. None of the examples are working for me. "The HTMLMediaElement passed to
createMediaElementSource has a cross-origin resource, the node will output
silence." in the console.

macOS Sierra 10.12.5, Firefox 55 64-bit
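For reference, that console message generally appears when a media element loads audio from another origin without CORS. A common remedy, sketched below, is to request the file in CORS mode by setting `crossOrigin` before `src`. The server must also send a suitable `Access-Control-Allow-Origin` header. Whether this was the cause in the Songbird demos is an assumption. The helper name and the example URL are hypothetical.

```javascript
// Hypothetical helper: prepare a media element for use with
// createMediaElementSource when the audio file lives on another origin.
// It only sets properties, so any HTMLMediaElement-like object works.
function prepareCrossOriginMedia(element, url) {
  // Set crossOrigin first so the very first fetch is made in CORS mode.
  element.crossOrigin = 'anonymous';
  element.src = url;
  return element;
}

// Browser usage (not executed here; the URL is a placeholder):
//   const audio = prepareCrossOriginMedia(new Audio(), 'https://cdn.example.com/clip.wav');
//   const sourceNode = audioContext.createMediaElementSource(audio);
```

If the server does not allow the requesting origin, the load fails outright instead of silently producing silence, which at least makes the problem easier to spot.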

~~~
PaulHoule
Worked OK on Microsoft Edge.

The spatial effect is not that bad, and that is speaking as someone who (1)
loves 5.1 and other surround in games, movies and music, and (2) is usually
unimpressed with headphone surround.

~~~
drewbitllama
:)

------
jhurliman
This is great to see! I have a side project to port the old Peep Network
Auralizer to a web-based project [1] and the code already has the concept of
mono sounds originating at different points in a 3D space, so this should be
relatively straightforward to integrate. I was aiming for Android/iPhone
compatibility though; is there any multichannel audio support on mobile yet or
would I need a fallback?

[1]
[https://github.com/jhurliman/webpeep](https://github.com/jhurliman/webpeep)

~~~
drewbitllama
Hi!

Songbird renders stereo-out using Omnitone internally, so Android/mobile is
certainly supported. Feel free to file any issues you have at
[https://github.com/Google/songbird/issues](https://github.com/Google/songbird/issues).

Cheers, Drew

------
toomim
How is this better than PannerNode, which is built into HTML5? I can hear the
difference in the demo, but can someone put words to it?

~~~
moolcool
I think part of it is room simulation. I might be wrong, but I don't think
PannerNode supports things like room materials, as in the 2nd demo.

------
briankwest
FreeSWITCH can do this in mod_conference using OpenAL, it's pretty sweet.

------
adzm
I've had lots of trouble with WebAudio and rendering to a file as output,
though that was in the infancy of the relevant technology. I can't seem to
find any information on this for Songbird or Omnitone, but if it
works it opens up a bunch of possibilities.

------
mnsc
Had my headphones on backwards and got confused when I moved the S and L
around. The S was obviously the "Subject"/me, but I couldn't figure out what
the L was for. I settled for "Loud thing".

~~~
drewbitllama
Haha. Actually, you might have had your headphones on correctly all along. S
was "Source" and L was "Listener". I've added some clarification to the
examples. Thanks for testing it out!

------
dharma1
Are there any FOSS tools for capturing custom HRTFs? With photogrammetry for
instance? I found a paper using a Leap Motion for a custom HRTF capture, but
they didn't publish any code

[https://home.deib.polimi.it/antonacc/pubs/ICASSP_16_2.pdf](https://home.deib.polimi.it/antonacc/pubs/ICASSP_16_2.pdf)

------
macawfish
Woah!!! This is perfect for a project I'd like to do....

------
sabujp
Ahh, this is like the thing in the Realtek audio mixer that lets me change the
room type, but in my browser and in realtime

------
olleromam91
This is neat!

Can you share what techniques you are using for reverberation processing?
Early and diffuse reflections?

~~~
drewbitllama
We're using a room acoustics model that captures early and late reflections
based on the acoustic properties (dimensions and materials) of the room. :)

~~~
science404
Worth mentioning that this is just a box-shaped room. How many early
reflections are calculated?

~~~
drewbitllama
Hi Science404, yes indeed we're launching with the standard shoebox for now,
but obviously we're thinking about the future too. :) Currently we calculate
listener-based 1st-order reflections, optimized for performance. Once the
ecosystem out there gets faster, we can explore fancier methods. ;) You can
see this in EarlyReflections.js

Cheers, Drew
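As a rough illustration of first-order reflections in a shoebox room, here is a sketch of the textbook image-source method: mirror the source across each of the six walls and treat each mirror image as a delayed, attenuated copy. This is not taken from EarlyReflections.js, and Songbird's listener-based approach may differ. The function name, the single `reflectionCoeff` for all walls, and the 1/distance gain are assumptions made for the example.

```javascript
const SPEED_OF_SOUND = 343; // metres per second, roughly, at room temperature

// The room spans [0, width] x [0, height] x [0, depth].
// `source` and `listener` are [x, y, z] positions in metres inside the room.
function firstOrderReflections(room, source, listener, reflectionCoeff = 0.8) {
  const dims = [room.width, room.height, room.depth];
  const reflections = [];
  for (let axis = 0; axis < 3; axis++) {
    for (const wall of [0, dims[axis]]) {
      // Mirror the source across the wall plane to get its image.
      const image = source.slice();
      image[axis] = 2 * wall - source[axis];
      // The reflected path length equals the image-to-listener distance.
      const distance = Math.hypot(
        image[0] - listener[0],
        image[1] - listener[1],
        image[2] - listener[2]
      );
      reflections.push({
        axis,
        wall,
        distance,
        delaySeconds: distance / SPEED_OF_SOUND,
        gain: reflectionCoeff / distance, // assumed 1/r spreading loss
      });
    }
  }
  return reflections;
}

const reflections = firstOrderReflections(
  { width: 4, height: 3, depth: 5 },
  [1, 1.5, 2.5], // source
  [3, 1.5, 2.5]  // listener
);
console.log(reflections);
```

Each of the six entries could then drive a delay line and a gain stage. Per-wall materials would replace the single coefficient with one value per wall.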

------
mycall
I'm so glad ambisonics is finally getting its day.

