You can read more about how it works here: https://sub.live - but essentially the latency we achieve means I can jam between the UK and Denmark with <20ms!
I also added video because it really helps to see the other person.
If you're interested, the tech stack is:
* Electron embedding a Svelte app
* Websockets to a node server to handle the users/room
* C++ sub-process that handles the UDP networking and audio
BTW, it's also free to use on both Windows and Mac.
Some friends and I get together roughly weekly using Jamulus, playing mostly jazz standards. We have gotten latency down into the 20 ms range when we are all in the same town - that's the ping time. Add another ~40 ms that seems to be eaten up by my local computer and wired home network.
With that said, online jamming is hard work. I'm the bassist. Fortunately I can set my double bass aside and use my electric bass, so I'm not hearing my acoustic sound and the delayed sound at once. The wind instruments and singer are not so lucky. For the bassist it's a constant chore to hold the tempo, leaving little room for anything expressive.
I'm glad to be doing this - it beats not playing at all - but there is still no substitute for playing together in person.
(Full disclosure: Weepy and I played with Jamulus then he had a crack at building sub.live)
Maybe you have a weak computer? Having 40ms eaten up by your local network would be absolutely bananas; the latency should be closer to 1-3ms, if not lower. If I ping 22.214.171.124, it's not until the switch outside the city I live in that the latency rises above 10ms, so something sounds wrong or broken in your setup if your local LAN really has 40ms of latency.
We're all analog musicians playing alcohol powered instruments. ;-) Some of us are techies, others not. While I'm a techie, that side of my life has always been somewhat separate from the musical side, so I haven't paid that much attention to the digital technology until the pandemic came along. But I'm happy to learn, especially if something simple can make it work better.
Yes, sub-ms latency is hard. But the infrastructure exists.
The developer states that no other software was suitable, and also that it's the first of its kind. Neither of those statements is accurate: there is nothing innovative in sub.live that isn't in any of the apps listed below. Props to the developer, though, for scratching an itch.
As stated in the post, the audio uses a custom C++ UDP solution. As far as I know it's the first video calling app with very low latency audio.
- Given the score and what each person has just played, predict what the next few sounds are going to be
- Given a high frame rate video stream of a person, predict what the next note to be played will be
In the same way that Nvidia has extremely low bandwidth but high resolution video enabled by face keypoint tracking and facial reconstruction / puppeteering maybe there's a place for prediction and/or sound reconstruction from extremely low bitrate streams.*
* Obviously not the exact use case here, since a premium is being placed on not processing, but still fun to think about.
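As a toy illustration of the "predict the next note" idea (a minimal sketch with made-up MIDI note numbers; a real system would condition on the score and the audio stream, not just note-to-note counts):

```python
from collections import Counter, defaultdict

class NotePredictor:
    """Order-1 Markov model: predict the next MIDI note from the current one.
    A toy stand-in for the score/stream-conditioned models suggested above."""
    def __init__(self):
        self.transitions = defaultdict(Counter)

    def train(self, notes):
        for a, b in zip(notes, notes[1:]):
            self.transitions[a][b] += 1

    def predict(self, note):
        # Most frequently observed successor, or None if this note is unseen.
        nexts = self.transitions.get(note)
        return nexts.most_common(1)[0][0] if nexts else None

# Ascending C major fragment played twice: after 60 the model has only seen 62.
p = NotePredictor()
p.train([60, 62, 64, 65, 67, 60, 62, 64, 65, 67])
```

Even a crude predictor like this hints at how a client could synthesize a plausible "future" frame while waiting for the real packet.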
"We introduce a system in which a humanoid robot executes commands before it actually receives them, so that the visual feedback appears to be synchronized to the operator, whereas the robot executed the commands in the past. To do so, the robot continuously predicts future commands by querying a machine learning model that is trained on past trajectories and conditioned on the last received commands. In our experiments, an operator was able to successfully control a humanoid robot (32 degrees of freedom) with stochastic delays up to 2 seconds in several whole-body manipulation tasks, including reaching different targets, picking up, and placing a box at distinct locations."
I'm going to share this with my musician friends and see what they have to say. I know they had challenges jamming at the start of the pandemic lockdown that stemmed from a) variance in internet connection strengths and b) innate latency in Zoom.
All the best & kudos for launching!
"Lost packets can be replaced with loss concealment by calling the decoder with a null pointer and zero length for the missing packet."
I'm curious about the networking/UDP side of this. How are you handling retransmits? Do you treat this like a video game would and just keep sending the latest data? Or doing something more advanced like forward error correction?
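For what it's worth, the "video game" strategy the question describes can be sketched in a few lines (illustrative Python, not sub.live's actual C++; the class name and API are invented for this example):

```python
class LatestOnlyReceiver:
    """Game-style UDP strategy: never retransmit, just play the newest
    audio frame and drop anything that arrives out of date."""
    def __init__(self):
        self.last_seq = -1

    def on_packet(self, seq, frame):
        # Stale or duplicate packet: a retransmit would arrive too late
        # to be useful anyway, so discard it.
        if seq <= self.last_seq:
            return None
        self.last_seq = seq
        return frame

rx = LatestOnlyReceiver()
```

The key property is that a lost packet costs one frame of audio, never a round trip.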
(and that's likely because it's libre software and the developer doesn't really do marketing)
That said, I'm not sure if you plan to start charging.
Really I just want to see how people use it and figure it out from there.
Open sourcing is more to enable people to be able to build the thing in the future with new libraries. OSes move forward every day and something that runs/builds today might not be able to run/build tomorrow, so open sourcing makes sure it's possible to change libraries/API usage if needed.
I turned off FEC as it adds latency.
Jitter buffer right now is just user controlled. A bit lame - I should make it automatic, but I need to get the right heuristic. With a LAN connection, the buffer can be as low as one or two packets.
I don't use any packet redundancy either, and there's no ARQ: if you have to ask for a retransmit you've already lost the war!
How about you?
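One possible automatic heuristic (purely a sketch - the function, its name, and the parameters are made up for illustration, not the app's actual code): size the buffer from the spread of recent packet inter-arrival times, in the spirit of RTP's interarrival-jitter estimate.

```python
import math
import statistics

def suggest_buffer_packets(arrival_ms, packet_ms=2.5, k=2.0):
    """Pick a jitter-buffer depth from observed packet arrival times (ms).
    Heuristic: cover k standard deviations of inter-arrival jitter,
    rounded up to whole packets, never below one packet."""
    gaps = [b - a for a, b in zip(arrival_ms, arrival_ms[1:])]
    jitter = statistics.pstdev(gaps)  # std dev of inter-arrival gaps
    return max(1, math.ceil(k * jitter / packet_ms))
```

On a steady LAN this collapses to the one-packet minimum; on a jittery path it grows the buffer just enough to absorb the typical variation.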
The system seems to have the settings right for realtime collab. In fact I found it much better for chatting too if you can guarantee they have a headset and good mic - I hate current-gen AEC algorithms. Interested to see how it fares in a 3+ participant setting.
I had a go at jamming with my brother yesterday. We had to adjust down the delay to 2ms (I think default was 5ms) in order to counteract the "lagging" effect you get when you lock into the remote beat. Once we had that tuned it worked really well. We would occasionally suffer burst losses, but you can play through it and it's still synced afterward which is the best you can do with an unreliable network.
Some feature requests:
- Join room by name (we were confused that there was a different code than the name we chose)
- Auto register new audio devices when you plug them in/edit in settings.
- An easier way to join with multiple audio inputs. We had trouble setting up our instrument, but also adding a mic for chatting.
- Local recording of your end with some shared sync markers, that you could manually or automatically sync up post-hoc
The use case I use WebRTC for is slightly different, where quality >> latency, so we have it tuned to the other end and run the full gamut of FEC, ARQ, packet redundancy, etc. Interesting to hear about others' solutions though!
You can also send the backing track ahead of time a little so that you both hear it at the same time!
Do you mean that you want to mix the inputs from multiple devices? This is possible - but you do get a little bit of extra latency, of course.
It's supposed to refresh the audio devices when you "mouseenter" the select box, so perhaps there's a bug there.
Multitrack recording with a bigger buffer would be great.
Good you found the buffer settings. I'm surprised that you found a significant difference though between 2ms and 5ms?
Also DAWs don’t typically allow you to interface with multiple audio devices. There are ways of doing it but they have major downsides.
Solved the problem by setting up a Jamulus server on AWS in Montreal. Both ISPs provide low-latency connections to Montreal, much better than one mile across town!
Of course each participant has to use ethernet rather than WiFi and has to use a low-latency audio device, not a laptop sound card.
"The only clear benefit to it..."? As I suggested, a significant benefit of the Jamulus model is that any of the clients can be "thin"; most of the computation is done on the server. You may be able to improve on this but disparaging it with silly criticisms isn't going to help you; there are thousands maybe millions of satisfied Jamulus users around the world.
But if any of the O(N^2) P2P links has unacceptable latency, eliminating the server would be counter-productive because the client-to-server connection may have less latency if the server is located appropriately.
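To make that trade-off concrete, here's a toy comparison (all numbers and names are hypothetical, not from any real deployment): with one bad P2P link, relaying through a well-placed server can beat the full mesh.

```python
def worst_latency(p2p, via_server):
    """Compare a full P2P mesh against relaying through a server.
    p2p: {(a, b): one-way ms per link}; via_server: {client: one-way ms
    to the server}. Through the server, a->b costs a->server + server->b."""
    mesh = max(p2p.values())
    clients = list(via_server)
    relay = max(via_server[a] + via_server[b]
                for a in clients for b in clients if a != b)
    return mesh, relay

# One bad P2P link (40ms) but everyone is ~10ms from a well-placed server.
mesh, relay = worst_latency(
    {("A", "B"): 40, ("A", "C"): 12, ("B", "C"): 15},
    {"A": 10, "B": 9, "C": 11},
)
```

The band is only as tight as its worst link, so it's the maximum that matters, not the average.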
If you've ever tried playing a processing-heavy software synth on a PC in real time, you'll have experienced how unplayable it is as soon as latency goes beyond 10-20ms. You can't play music if there is a noticeable delay between your fingers and your ears - we are much more sensitive to sound latency - and it would be the same problem trying to play in time with each other.
I like the idea, but feel like the internet isn't there yet for the majority of users, and latency hasn't exactly been improving at a great pace. I expect it will still be useful in internet rich areas like city to city.
With fiber, you can hit the exchange point in 1-2ms. So I suppose this tool would work fine over fiber, assuming everyone is in the same city. I've even heard of people successfully using protocols like Dante (professional audio over IP intended for LAN) over gig fiber lines as well.
From my experience, a couple milliseconds of latency is fine for keyboard or guitar processing, but anything more than that starts to mess me up. There are artists who insist on using a full analog chain for monitoring for that reason, and refuse to use digital mixers or digital wireless systems that can add several ms of latency.
I thought that ISPs generally run fiber for most of the distance, and coax is only used for the last few hundred meters. The speed of signal propagation is actually faster in copper than fiber, but nobody (afaik?) does long runs over copper so it’s a moot point.
Some differences in what I was planning and accompanying thoughts:
Clojurescript. I do like that it’s using Svelte, I just wanted more idiomatic support for datalog stuff for the purpose of building metadata-driven music theory tools. Svelte is super cool though, and is my go to JS tool right now. There’s always Datalevin, a portable datalog implementation that I found recently. Currently I’m using a locally running XTDB instance for development, but for the final shippable I may switch to Datalevin. If anyone is interested in doing something similar you could try XTDB over http or figure out a nice way to interface with Datalevin from other languages.
Electron -> Tauri. Better native feel and the ability to hit Rust code directly. May not be worth it for this project since it seems like C++ is being used for some stuff. But for me Rust is a better fit. As a side note I think the Tauri team is working on support for interchangeable back ends, so soon you could replace Rust with Go or whatever. Tauri also makes including accompanying binaries easy. Not that I’m saying electron doesn’t, I have no idea.
Capturing ASIO streams. Super important for getting good sound for most people, allowing people to play audio through interfaces and mixers while still capturing it. I’m no expert in ASIO or audio streaming, but from my understanding capturing ASIO streams directly is tricky. ReaStream (a Reaper plugin) is the only thing I’ve found that lets this happen, and sadly it doesn’t work well with other DAWs. Why this would be useful IMO: people can stream audio while still listening to the processed output through whatever means they already do. Guitarists could process audio in a DAW or plugin and both listen to and stream that audio. People using DAWs can stream the output of the DAW's master channel without compromising how they listen to it. I’m not saying sub.live doesn’t accomplish that, I just think it’s important either way. Typically this is the missing link that makes other methods of audio streaming difficult.
Open source. Makes me sad that it isn’t. Could have been a good building block and I definitely would have tried to be involved right away. Feel free to correct me if this is actually open source and I just misunderstood.
I haven't ruled out opensourcing, but honestly I already have limited time and in my experience open source takes _more_ time commitment (I get that you will get free help eventually).
I'm making a VST plugin to stream output from a DAW.
Problem with Tauri is that you have to support the native browser, rather than just chrome, so it's more work to build and maintain.
Good luck with your project!
It’s pretty solid given that light itself would take 3.3ms to travel that same distance.
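As a back-of-envelope check (a hedged sketch; the 1000 km distance and the fiber velocity factor are rough assumptions, not measurements):

```python
def propagation_ms(distance_km, velocity_factor=1.0):
    """One-way propagation delay in milliseconds. Light in optical fiber
    travels at roughly 2/3 of c, so pass velocity_factor ~0.67 for fiber."""
    c_km_per_ms = 299.792458  # speed of light, km per millisecond
    return distance_km / (c_km_per_ms * velocity_factor)
```

For ~1000 km that's about 3.3 ms in vacuum and about 5 ms in fiber, one way - so a sub-20ms UK-to-Denmark figure leaves only a modest budget for routing, buffering, and codec delay.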
Streaming at home via WiFi is already crappy - how is this ever supposed to work over more distance?
Otherwise, find friends that you have line of sight to and set up a P2P WiFi link via antennas/radios - you'll get insanely good latency.
Ping is also just a single data point. Ping between two mediums can differ by only 1 ms yet have a huge effect on throughput and jitter, which affect streaming over the Internet a great deal despite not being so obviously linked.
WiFi is indeed a convenient tech used all around the world... for convenience, not performance. If you're gaming, if you need security, if you need reliable and consistent performance, Ethernet is pretty ubiquitous. I have a couple of wires running from the Bell router to my gaming and music computers at home, and WiFi for other usages.
Combined with your extremely sarcastic tone toward honest attempts to assist... what are you actually trying to achieve/learn here?
Trying to convince everyone else that something as simple as plugging in a wire - which they do every day or so - is horribly onerous is probably going to be an uphill battle, especially when it's for something people want. Where jamming with friends over the Internet ranks in your list of desires, versus never having to commit this horrible act of plugging in a wire, is up to you.
There's any number of things WiFi isn't the perfect answer for.
Ultimately, in today's world, if you want to jam with musicians not in your room, some non-zero effort will be required. Presumably you have set up your audio interface, music interface, microphones, software, mixer, MIDI and ASIO, etc. Honestly, for me at least, Ethernet is the least inconvenient or nerdy or annoying aspect of a home studio :-)
For me, cables are a huge issue, because my PC is in a room that's not directly connected to my apartment, where my internet connection is located.
"Here's some ways in which it can work using a slightly different but still as widely used commodity technology" (wired LAN)
"No, that is unreasonable, wires are unreasonable"
Like, it's SUCH a simple solution: use wires. That's how I can stream games from my computer to my parents' house at 1080p60 with only about 30ms latency over a domestic broadband connection, for example.
Also, sometimes you can get really lucky with wireless on 5ghz - see streaming in VR as a prime example. The developer of Virtual Desktop for the Quest has done some great work getting a really solid low latency VR feel; I've been able to stream games from rent-a-VM ShadowPC servers without feeling the lag.
OK, so pro audio is a different beast and is even more sensitive to latency than VR, but here people are having fun with this tech and clearly it's working for them!
Indeed. Perhaps we don't worry about those people anymore. If you are interested in a high quality experience but don't want to satisfy the prerequisites, you are not entitled.
WiFi is never going to provide a reliable, ultra-low-latency streaming experience for the average person. This isn't a "we can but we won't" type of problem where, with just a bit of extra effort, we can somehow magically solve all of the deficits.
Yes, there are scenarios where WiFi is indistinguishable from ethernet, but those are exceedingly rare in my experience.
I have 14ms between my Quest/PC and it feels sluggish.
> We've been able to achieve sub 20ms latency from UK to Denmark - more than 1000km !
That is pretty good latency for what it does - it's not just ICMP packets but sending/receiving audio and all that comes with it.
How are you connecting your Quest and PC? That sounds awful - my latency to friends outside the city is lower than that, and they are more than 10km away from me.
All I've seen in shorter distances was usually worse :(
1 mile away, both of us have AT&T fiber, we're seeing 10-15ms response times (packet size ~1400).
100 miles away, both on AT&T fiber clocked in around 18-20ms
1 mile away, one on AT&T fiber the other on Comcast gigabit had response times around 24ms
100 miles away, one on AT&T fiber the other on Comcast gigabit had response times around 35ms
I'm hard pressed to believe you will find an American ISP that prioritizes low latency. Verizon 5G Home actually did some magic with their non-standard hardware where I was seeing 9ms to my in-town datacenter and 16ms to my out-of-state datacenter. When they replaced the hardware with the standards-compliant generation, my latency increased to 30ms in-town (they also changed towers, so that could be part of the problem, plus congestion during prime time).
I just moved and am now 6 blocks away from the friend I did the 1 mile test with. I will retest this afternoon and update this comment.
Otherwise, Ethernet plus a fiber connection should be good enough for most. Only if you really want to lower the latency should you invest in additional hardware.
Instead of trying to reduce latency (which you simply cannot push below the speed-of-light limit), Endlesss implements a shared "multiplayer" 8-track looper.
Jammers add layers to the loop, which has a clock and supports sync via Ableton Live, external audio input, etc.
Plus, every addition to the looper is saved/versioned. You can move backwards through loops and export the audio stems. It's a great creative tool and a quick way to build on ideas with friends.
It’s way better than any of the live internet jamming software I’ve used, by far.
> The NINJAM client records and streams synchronized
> intervals of music between participants. Just as the
> interval finishes recording, it begins playing on
> everyone else's client. So when you play through an
> interval, you're playing along with the previous
> interval of everybody else, and they're playing along
> with your previous interval.
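The interval trick described above can be modelled in a few lines (a toy sketch of the idea, not NINJAM's actual code; the function and data layout are invented for illustration):

```python
def ninjam_mix(intervals_by_player, player, t):
    """NINJAM-style sync: during interval t you play live while hearing
    everyone else's interval t-1, so nobody ever waits on the network.
    intervals_by_player maps player name -> list of recorded intervals."""
    if t == 0:
        return {}  # nothing recorded yet in the very first interval
    return {p: ivals[t - 1]
            for p, ivals in intervals_by_player.items() if p != player}

recorded = {
    "alice": ["a0", "a1", "a2"],
    "bob":   ["b0", "b1", "b2"],
}
```

The cost is musical rather than technical: latency is traded for being exactly one loop behind everyone else, which suits loop-based jamming far better than, say, jazz standards.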