
Show HN: Working on a new network transport for PulseAudio and ALSA - gavv42
https://gavv.github.io/articles/new-network-transport/#
======
gavv42
A brief summary.

I'm working on Roc, a toolkit for real-time streaming over the network. Among
other things, it provides command-line tools and PulseAudio modules that can
be used for home audio. It can be used with PA, with bare ALSA, and with
macOS CoreAudio.

The main difference from other transports, including PulseAudio TCP and RTP
streaming, is the better quality of service when the latency is low (100 to
300 ms) and the network is unreliable (Wi-Fi). The post explains why and
provides some comparison, usage instructions, and future plans.

There is still a long way to go, and now we're looking for people's thoughts and
feedback. Do you find the project useful? How would you use it? What features
would you like to see?

~~~
autopoiesis
I see you've got Opus on your to-do list. I would really appreciate that! I
find Opus (appropriately configured) to be audibly indistinguishable from CD
audio, and it would really help with the bandwidth requirements.

I've always been really excited by the possibilities implied by PulseAudio's
network capabilities, but disappointed by their latency and bandwidth
requirements. Roc + Opus would be amazing.

~~~
zneveu
Check out [https://github.com/eugenehp/trx](https://github.com/eugenehp/trx)
for Opus streaming inspiration, I've played around with their code and found
it easy to work with. Opus would be great with ROC because in case of buffer
over/under runs the codec provides features to mask dropouts based on previous
content. This is critical when using Wi-Fi.

~~~
gavv42
Thanks.

> Opus would be great with ROC because in case of buffer over/under runs the
> codec provides features to mask dropouts based on previous content. This is
> critical when using Wi-Fi.

Are you talking about its PLC or FEC? I haven't tested it yet, and I'm curious
whether people use both of them with music.

BTW, it would also be interesting to combine our FECFRAME support with Opus.
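
For readers unfamiliar with the distinction: PLC (packet loss concealment) lets the decoder guess at missing audio from what it has already played, while FEC (forward error correction) sends redundant data so that a lost packet can be rebuilt exactly. The sketch below illustrates only the FEC idea, using a single XOR parity packet per block. It is a toy example, not the scheme Roc or Opus uses; the function names `make_parity` and `recover` are made up for illustration, and all packets in a block are assumed to have the same length.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings (assumed to be of equal length)."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets: List[bytes]) -> bytes:
    """Build one repair packet as the XOR of all source packets in a block."""
    return reduce(xor_bytes, packets)


def recover(received: List[Optional[bytes]], parity: bytes) -> List[bytes]:
    """Restore at most one lost packet (marked as None) from the parity packet."""
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) > 1:
        raise ValueError("one parity packet can repair only one loss per block")
    restored = list(received)
    if missing:
        present = [p for p in received if p is not None]
        # XOR-ing the parity with every surviving packet leaves the lost one.
        restored[missing[0]] = reduce(xor_bytes, present, parity)
    return restored


block = [b"aaaa", b"bbbb", b"cccc"]
parity = make_parity(block)
repaired = recover([b"aaaa", None, b"cccc"], parity)
```

A single parity packet repairs only one loss per block, which is why practical schemes use stronger codes that tolerate several losses in a row.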

------
summm
Sounds great! What about encryption? If that is used in any environment not in
complete control of the user that should be mandatory. E.g. in an open wifi, a
shared wifi, or directly over the internet. As the protocol is using RTP
anyway, it should be easy to slap on SRTP or DTLS? For the beginning, it may
even be sufficient to use a static symmetric key. Or directly use WebRTC which
has that already included, but what about the FEC then?

~~~
summm
Update: I just discovered SRTP in the advanced features list. Awesome!

~~~
gavv42
Yeah, SRTP is in the list :) It's not the highest priority right now, but I'll
get to that sooner or later (sooner if somebody asks for it).

------
kylek
This is cool! I've recently been playing with streaming local audio to another
system (a Raspberry Pi with a DAC). I tried PulseAudio's built-in way of
streaming, but lag was pretty bad. I found JACK[0] to work well (<20ms) once I
got it configured correctly. Kind of a complicated setup (including getting it
all to be "automatic", systemd unit files and all, realtime kernels, etc), and
not particularly stable. Fortunately, the latency makes up for it.

[0] [http://www.jackaudio.org/](http://www.jackaudio.org/) (is this link dead?
here's a few more-

[https://github.com/jackaudio/](https://github.com/jackaudio/)

[https://en.wikipedia.org/wiki/JACK_Audio_Connection_Kit](https://en.wikipedia.org/wiki/JACK_Audio_Connection_Kit)
)

------
gimes4dieni
This is great stuff! Finally there is an Open Source initiative to create a
robust transport for sync audio streaming. Existing Open Source solutions that
come to my mind (like SlimProto, SnapCast, ffmpeg) focus on providing 'a
product' rather than a reusable 'transport'. A question: how do you
'capture' the PCM stream in the case of ALSA? It is straightforward to create a
PA sink and plug it into the PA configuration, but I am wondering about pure ALSA.
Disclaimer: I am an author of
[https://www.github.com/gimesketvirtadieni/slimstreamer](https://www.github.com/gimesketvirtadieni/slimstreamer)

~~~
gavv42
Thanks.

> A question: how do you 'capture' the PCM stream in the case of ALSA? It is
> straightforward to create a PA sink and plug it into the PA configuration,
> but I am wondering about pure ALSA.

Good question :) Roc does not implement any special capturing code for ALSA,
it just reads from the given device (using SoX currently). The user is
supposed to use something like snd-aloop.

It would be possible to create a custom ALSA plugin I guess, but we have no
plans for that currently.

You're right about the transport vs product part. I would prefer to work on
the transport. And an ALSA plugin would be a product on top of it so it should
be a separate project ideally. Actually, the same is true for our PulseAudio
modules. I hope later we will either submit them to upstream or separate into
a standalone project.

> Disclaimer: I am an author of
> [https://www.github.com/gimesketvirtadieni/slimstreamer](https://www.github.com/gimesketvirtadieni/slimstreamer)

Interesting, didn't see it before.

------
matthew-wegner
Network audio is pretty nifty! I run a Snapcast[0] setup at home, tied into
Home Assistant[1] automation for multi-room audio. Some notes:

\- I have six total audio zones, including my desktop computer.

\- Audio for a room turns on/off with a room. It's neat to walk from my office
into the kitchen, and have the kitchen lights come up and audio follow me in
when the motion detectors fire. Some speakers don't mute when "off", but
change source to a text-to-speech-only channel (e.g. for door/window contact
notifications and other messages).

\- Every zone except my desktop (macOS) is a speaker connected to a Raspberry
Pi via a USB DAC.

\- One of my motivations here was multi-room audio, but a big one was to
connect a Linux VM's audio output directly to the speakers so I could use the
official Spotify client, instead of a 3rd-party library that will eventually
break.

\- Snapcast is really quite DIY for config, but I could set up other sources--
an Airplay target, a line in target with a cable hanging off the server so
people could plug in devices at a party, etc. I've seen setups online where
people do this, and someone in a room can change that room's "channel" to
another source.

\- Spotify's DRM-as-feature is nice here, because I just use the Spotify
client on my desktop normally, with output coming out elsewhere. I run 700ms
of buffer, which is just low enough that clicking play/pause doesn't feel
broken. I could probably drop it more, since everything is hardwired in the
house.

\- Before Snapcast, I just toggled Spotify's source when I walked between
rooms, but there's quite a bit of dead air there, and it's a hassle to set up,
plus multi-room audio sync is nice with people over.

[0] [https://github.com/badaix/snapcast](https://github.com/badaix/snapcast)

[1] [https://www.home-assistant.io/](https://www.home-assistant.io/)

~~~
gavv42
Interesting, thanks for sharing.

------
Frans-Willem
Can this also play to multiple devices (at the same time), while keeping the
audio synchronized ?

~~~
gavv42
Not yet. It is on our roadmap, however.

~~~
iforgotpassword
This would be very cool. I've been pondering how to do this every now
and then (but never worked on any real time stuff, so...)

If you ever get around to adding this I will start building my el cheapo
raspberry pi based sonos clone. ;-)

~~~
nitrogen
Regarding realtime stuff there's a project out there to implement the Ethernet
AVB (audio/video bridging) standard on BeagleBone using PTP (precision time
protocol) for synchronization.

Some of the older network synchronized transports like CobraNet and Dante
might also be interesting for anyone wanting to learn more about this stuff.

------
rayrrr
Looks awesome! Functionally speaking, it reminds me of Snapcast.
Compare/contrast?

~~~
gavv42
Thanks, I didn't know about this project and will definitely look at the
implementation.

Their documentation says they use TCP, which usually means that it won't
handle low latencies on Wi-Fi due to packet losses.

On the other hand, they have service discovery, remote control, and multi-room
synchronization. All three features are planned but not yet supported in Roc.
We'll add the first two in upcoming releases, but multi-room support
requires serious research.

Their documentation also says the client can correct time deviations by
playing faster or slower. We use resampling for that instead. I'm wondering
how they can avoid glitches without using a resampler.

One more difference is that they use their own protocols (both for streaming
and control) while Roc relies on standard RFCs.
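
To illustrate the resampling approach mentioned above: when the sender's and receiver's clocks drift apart, the receiver can stretch or shrink the stream by a tiny ratio so its buffer neither drains nor overflows. The following is a minimal sketch under stated assumptions, not Roc's actual resampler. It uses plain linear interpolation for brevity (production resamplers typically use band-limited filters), and the name `resample_linear` and the 0.1% drift figure are hypothetical.

```python
from typing import List


def resample_linear(samples: List[float], ratio: float) -> List[float]:
    """Resample by linear interpolation.

    ratio is the input step per output sample: a value slightly above 1.0
    consumes input faster (fewer output samples), slightly below 1.0 slower.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    out = []
    pos = 0.0
    last = len(samples) - 1
    while pos <= last:
        i = int(pos)
        frac = pos - i
        # At the final sample there is no right neighbour; reuse the sample.
        nxt = samples[i + 1] if i < last else samples[i]
        out.append(samples[i] * (1.0 - frac) + nxt * frac)
        pos += ratio
    return out


# One second of a 48 kHz ramp, played on a device whose clock is assumed
# to run 0.1% slow: the stream is shortened by roughly 48 samples.
one_second = [float(n) for n in range(48000)]
corrected = resample_linear(one_second, 1.001)
```

In a real receiver the ratio would not be a constant; it would be adjusted continuously, for example from the observed fill level of the playback buffer.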

------
kbumsik
This is a cool project! I always wonder how to measure the latency of
sound-related programs. Are there any tools to benchmark latency?

------
dmos62
You've mentioned PA, ALSA and macOS CoreAudio. That's Linux and Mac. Will
Windows users be able to use this as well somehow?

~~~
gavv42
Currently, no. A Windows port is on our roadmap but not a priority right now.
However, if someone wants to maintain it, I'm ready to accept PRs and
help with porting.

