I'm working on Roc, a toolkit for real-time streaming over the network. Among other things, it provides command-line tools and PulseAudio modules that can be used for home audio. It can be used with PA, with bare ALSA, and with macOS CoreAudio.
The main difference from other transports, including PulseAudio TCP and RTP streaming, is the better quality of service when the latency is low (100 to 300 ms) and the network is unreliable (Wi-Fi). The post explains why and provides a comparison, usage instructions, and future plans.
There is still a long way to go, and we're now looking for people's thoughts and feedback. Do you find the project useful? How would you use it? What features would you like to see?
IMO "low latency" should mean low enough that it's very unlikely to be noticed, which most musicians seem to accept as 5 ms. (Theoretically, even microsecond-level delays can be noticed if the delayed audio is mixed with the original signal, because of comb filtering.)
Many audio streaming apps require 1-2 seconds of latency (especially on Wi-Fi), which is why I called the 100-300 ms range "low". 100 ms is the minimum I've seriously tested on Wi-Fi so far. 300 ms is, roughly, the maximum UI delay that feels acceptable (you press "play" and hear the sound).
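To put numbers on the comb filtering point: when a signal is summed with an equal-gain copy of itself delayed by τ, cancellation happens at f = (2k + 1) / (2τ). Here is a minimal sketch of that arithmetic; the function name and the 20 kHz cutoff are my own choices for illustration.

```python
def comb_notch_frequencies(delay_us, max_hz=20000.0):
    """Notch frequencies (Hz) when a signal is mixed with an equal-gain
    copy of itself delayed by delay_us microseconds."""
    notches = []
    k = 0
    while True:
        # Cancellation occurs where the delay equals an odd number of half periods.
        f = (2 * k + 1) * 1_000_000 / (2 * delay_us)
        if f > max_hz:
            break
        notches.append(f)
        k += 1
    return notches

print(comb_notch_frequencies(100))        # 100 us delay: notches at 5 kHz and 15 kHz
print(len(comb_notch_frequencies(1000)))  # 1 ms delay: 20 notches below 20 kHz
```

The shorter the delay, the higher the first notch, so very small delays mostly affect the top of the spectrum.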
I'll think about the wording..
> Some media players can delay their audio to account for playback delay in the audio device, if the audio stack supports that. Does Roc support that, or if not, is it on your roadmap?
We have an open issue for implementing correct latency reports in PA modules. Once we fix it, players that support that feature should automatically start taking the latency into account.
Thanks for reminding me, I'll test this feature specifically.
I've always been really excited by the possibilities implied by PulseAudio's network capabilities, but disappointed by their latency and bandwidth requirements. Roc + Opus would be amazing.
> Opus would be great with ROC because in case of buffer over/under runs the codec provides features to mask dropouts based on previous content. This is critical when using Wi-Fi.
Are you talking about its PLC or FEC? I haven't tested it yet, and I'm interested in whether people use both of them with music.
BTW it would also be interesting to combine our FECFRAME support with Opus.
1) Would this make any difference?
2) Does it currently support online plug-unplug the way RTP works without restarting pulseaudio?
If you have no issues with 1) latency, 2) packet losses, and 3) clock differences, there would be no difference, at least until Roc can offer some new encodings.
(If you're using PA, it handles the clock difference for you. Its RTP transport sometimes behaved strangely for me, but its "native" tunnels handled it well.)
> Does it currently support online plug-unplug the way RTP works without restarting pulseaudio?
Roc sinks and sink inputs may be loaded and unloaded at any time without restarting PA. But there is no service discovery yet, which means that 1) when a remote sink input appears, a sink is not automatically added, and 2) when a remote sink input disappears, the sink is not automatically removed. (We will add this in upcoming releases.) Currently the remote sink input can appear and disappear at any time, and the local sink will just continue streaming packets to the specified address.
> How far are you with supporting multiple sampling rates
Roc currently supports arbitrary input/output rates but only a single network rate (44100). If the network rate differs from the input/output rate, Roc performs resampling.
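To illustrate what converting between an input/output rate and the 44100 Hz network rate involves, here is a minimal linear-interpolation sketch. It is only an illustration of the idea, not Roc's resampler; real resamplers use better filters, and the function name is made up.

```python
def resample_linear(samples, in_rate, out_rate):
    """Convert mono samples from in_rate to out_rate by linear interpolation."""
    if not samples:
        return []
    n_out = int(len(samples) * out_rate / in_rate)
    step = in_rate / out_rate  # input samples consumed per output sample
    out = []
    for i in range(n_out):
        pos = i * step
        idx = int(pos)
        frac = pos - idx
        # Clamp the neighbour index at the end of the buffer.
        nxt = samples[min(idx + 1, len(samples) - 1)]
        out.append(samples[idx] * (1.0 - frac) + nxt * frac)
    return out

# 10 ms of 48000 Hz input (480 samples) becomes 441 samples at 44100 Hz.
print(len(resample_linear([0.0] * 480, 48000, 44100)))
```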
We're now finishing the 0.1 release, and I'm planning to add support for more network encodings, including more rates, in 0.2. Feel free to file an issue or mail us with a list of the encodings/rates you need.
> and multiple receivers?
No support yet. If you use a multicast address, it would probably just work though.
Again, feel free to file an issue and describe what you would expect from such support. I'll be happy to implement it if someone needs it.
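For reference, this is roughly what "using a multicast address" means on the receiving side: each receiver joins the group and gets its own copy of every datagram. This is a generic UDP sketch using the standard socket API, not Roc code; the function names, and the group address and port in the comment, are placeholders.

```python
import socket
import struct

def build_membership_request(group, iface="0.0.0.0"):
    """Pack an ip_mreq: multicast group address followed by the local interface address."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(iface))

def open_multicast_receiver(group, port, iface="0.0.0.0"):
    """Open a UDP socket that receives datagrams sent to the given multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # Allow several receivers on the same host to bind the same port.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    build_membership_request(group, iface))
    return sock

# Hypothetical usage: sock = open_multicast_receiver("239.0.0.1", 10001)
```

Note that this only covers delivery to several receivers; it does nothing to keep their playback in sync.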
Another question is how Roc will interact with your sync part. How do you perform synchronization?
> Does it support h323?
No, and there are no plans yet. But we can probably add support if someone needs it.
I haven't done serious testing at latencies below 100 ms yet. I've added an item to my todo list to investigate the minimum supported latency.
 http://www.jackaudio.org/ (is this link dead? here's a few more-
> Few questions: how do you 'capture' PCM stream in case of ALSA? It is straightforward to create a PA sink and plug it into PA configuration, but I am wondering about pure ALSA.
Good question :) Roc does not implement any special capturing code for ALSA; it just reads from the given device (using SoX currently). The user is supposed to use something like snd-aloop.
It would be possible to create a custom ALSA plugin I guess, but we have no plans for that currently.
You're right about the transport vs product part. I would prefer to work on the transport. An ALSA plugin would be a product on top of it, so ideally it should be a separate project. Actually, the same is true for our PulseAudio modules. I hope that later we will either submit them upstream or split them out into a standalone project.
> Disclaimer: I am an author of https://www.github.com/gimesketvirtadieni/slimstreamer
Interesting, didn't see it before.
- I have six total audio zones, including my desktop computer.
- Audio for a room turns on/off with a room. It's neat to walk from my office into the kitchen, and have the kitchen lights come up and audio follow me in when the motion detectors fire. Some speakers don't mute when "off", but change source to a text-to-speech only channel (e.g. for door/window contact notifications and other messages).
- Every zone except my desktop (macOS) is a set of speakers connected to a Raspberry Pi via a USB DAC.
- One of my motivations here was multi-room audio, but a big one was to connect a Linux VM's audio output directly to the speakers so I could use the official Spotify client, instead of a 3rd-party library that will eventually break.
- Snapcast is really quite DIY for config, but I could set up other sources--an Airplay target, a line in target with a cable hanging off the server so people could plug in devices at a party, etc. I've seen setups online where people do this, and someone in a room can change that room's "channel" to another source.
- Spotify's DRM-as-feature is nice here, because I just use the Spotify client on my desktop normally, with output coming out elsewhere. I run 700ms of buffer, which is just low enough that clicking play/pause doesn't feel broken. I could probably drop it more, since everything is hardwired in the house.
- Before Snapcast, I just toggled Spotify's source when I walked between rooms, but there's quite a bit of dead air there, and it's a hassle to set up, plus multi-room audio sync is nice with people over.
If you ever get around to adding this I will start building my el cheapo raspberry pi based sonos clone. ;-)
Some of the older network synchronized transports like CobraNet and Dante might also be interesting for anyone wanting to learn more about this stuff.
> is there a reason to duplicate the work ?
I don't know yet. When it comes to implementation, we'll see whether we can re-use either the code or the ideas, or maybe instead integrate Roc into GStreamer as a network transport (actually I've been thinking about it already and there is an item in the roadmap for it).
Their documentation says they use TCP, which usually means that it won't handle low latencies on Wi-Fi due to packet losses.
On the other hand, they have service discovery, remote control, and multi-room synchronization. All three features are planned but not yet supported in Roc. We'll add the first two in upcoming releases, but the multi-room support requires serious research.
Their documentation also says the client can correct time deviations by playing faster or slower. We use resampling for that instead. I'm wondering how they can avoid glitches without using a resampler.
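To make the resampling approach more concrete, here is a toy sketch of the general idea: nudge the resampler ratio slightly above or below 1.0 depending on whether the receiver queue is longer or shorter than the target latency. This is a plain proportional controller written for illustration, not Roc's actual algorithm; the function name, gain, and limit are made-up values.

```python
def update_scaling(target_latency, measured_latency, gain=1e-6, max_dev=0.005):
    """Return a resampler scaling factor close to 1.0.

    Latencies are in samples. A factor above 1.0 means the receiver consumes
    queued input slightly faster (queue too long); below 1.0 means slower.
    """
    error = measured_latency - target_latency
    scaling = 1.0 + gain * error
    # Keep the correction small so the pitch change stays inaudible.
    return max(1.0 - max_dev, min(1.0 + max_dev, scaling))

# Target is 200 ms at 44100 Hz (8820 samples); the queue has grown by 1000 samples.
print(update_scaling(8820, 9820) > 1.0)
```

Because the ratio changes smoothly and by a tiny amount, the correction does not drop or repeat samples.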
One more difference is that they use their own protocols (both for streaming and control) while Roc relies on standard RFCs.