Firstly, thanks for your work on what is really a great project. Can we set stereo=1 in the SDP and also the bandwidth constraint? That would make it ideal for this use case.
For music-quality WebRTC you need three things: disabling audio processing, stereo=1 in the SDP, and a way to limit bandwidth usage so it doesn't saturate the available bandwidth and cause errors.
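To make the SDP part concrete, here is a minimal sketch of what "munging" an offer could look like. It is not how Jitsi does it; the function name `tune_opus_sdp` and the 128 kbps default are my own assumptions. It only touches an Opus `a=fmtp` line that already exists, and it assumes each audio section carries its own `c=` line (as browser-generated SDP usually does).

```python
import re


def tune_opus_sdp(sdp: str, max_kbps: int = 128) -> str:
    """Add stereo=1 to the Opus fmtp line and a b=AS cap to audio sections.

    Hypothetical helper for illustration; max_kbps is an assumed default.
    """
    lines = sdp.replace("\r\n", "\n").strip("\n").split("\n")

    # Find the payload type mapped to Opus, e.g. "a=rtpmap:111 opus/48000/2".
    opus_pt = None
    for line in lines:
        match = re.match(r"a=rtpmap:(\d+) opus/48000", line, re.IGNORECASE)
        if match:
            opus_pt = match.group(1)
            break

    out = []
    in_audio = False
    for line in lines:
        if line.startswith("m="):
            in_audio = line.startswith("m=audio")
        # Drop any existing audio b=AS line so the cap is not duplicated.
        if in_audio and line.startswith("b=AS:"):
            continue
        # Append stereo=1 unless a stereo parameter is already present.
        if opus_pt and line.startswith(f"a=fmtp:{opus_pt} ") and "stereo=" not in line:
            line += ";stereo=1"
        out.append(line)
        # Bandwidth lines follow the connection line within a media section.
        if in_audio and line.startswith("c="):
            out.append(f"b=AS:{max_kbps}")
    return "\r\n".join(out) + "\r\n"
```

The `b=AS` value is in kilobits per second. Running the function twice on the same SDP gives the same result, so it should be safe to apply on renegotiation.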
Disabling video is also the best thing to do when recording, for the same reason (bandwidth saturation), and Chromium will give you a much better experience. Safari and Firefox aren't quite there yet: Safari doesn't let you choose your output device and lacks some other useful features, and Firefox doesn't yet seem to allow stereo Opus, though maybe that's changed since I tested. Microsoft Edge is now Chromium-based, so you're good to go.
Firefox has supported stereo Opus for a very long time (four years at least?). We know it works: it's used by medical professionals for their job, and they wrote us a message a few months ago thanking us for this feature, which doesn't seem to work in other browsers (according to them, but I do see open tickets on Chromium).
Of course the whole chain has to be stereo, that goes without saying: the input signal is stereo, negotiation has been done in stereo, there is enough bandwidth (otherwise Opus goes mono), and playback is on stereo hardware (but that's the easy part).
We hold regular flute meetings and play together. In this quarantine time we wanted to meet online, but if we all play at once, it seems I cannot hear everyone else at the same time. I guess it is as if everyone were shouting over everyone else, which is not the case in a normal meeting, where usually only one person speaks at a time.
Will this also fix this issue? So everyone will be able to hear everyone?
You won't be able to play together because of latency.
You will think you are in time with someone, but you will react when you hear/see them on your screen, which is maybe .15 seconds after they actually made the sound/movement. And then they will hear/see your reaction .15 seconds later again.
If all participants have good internet and are geographically close it should theoretically be possible to have delay not much greater than rtt/2 for everybody.
With rtt < 20ms that should make musical performances possible. After all, sound only travels less than four meters in 10ms. So this is just like singing in a choir (with more visual delay - but that can be solved by having a conductor).
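The choir comparison is easy to check with a bit of arithmetic. A rough sketch, assuming a speed of sound of 343 m/s (dry air at about 20 °C); the function names are mine:

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed: dry air at about 20 °C


def one_way_delay_ms(rtt_ms: float) -> float:
    """Best-case one-way network delay, taken as half the round-trip time."""
    return rtt_ms / 2.0


def acoustic_distance_m(delay_ms: float) -> float:
    """Distance sound travels through air in the given time."""
    return SPEED_OF_SOUND_M_PER_S * delay_ms / 1000.0


# An rtt of 20 ms gives 10 ms one way, which is like standing ~3.4 m apart.
print(acoustic_distance_m(one_way_delay_ms(20)))
# The 0.15 s delay mentioned earlier is like standing ~51 m apart.
print(acoustic_distance_m(150))
```

This only covers the network part of the delay; anything added by codecs, buffers or hardware increases the equivalent distance.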
Unfortunately I'm not aware of any software making that a practical reality, even with FTTH.
You're assuming that network latency is the only latency that's involved here, but a huge latency source is the audio codec. Opus adds ~20ms latency, and that's the lowest-latency codec that's widely supported at the moment. You can see a comparison here: https://www.opus-codec.org/comparison/
There are all sorts of other sources of latency that need to be taken into consideration too, and unfortunately in practice they add up to live music being unplayable on pretty much any network.
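To show how quickly these add up, here is a back-of-the-envelope budget. Only the 20 ms rtt and the ~20 ms codec figure come from this thread; the capture, playback and jitter-buffer values are placeholders I made up for illustration, not measurements.

```python
def mouth_to_ear_ms(
    rtt_ms: float,
    codec_ms: float,
    capture_buffer_ms: float,
    playback_buffer_ms: float,
    jitter_buffer_ms: float = 0.0,
) -> float:
    """Sum the one-way delay contributions between one player and another."""
    network_ms = rtt_ms / 2.0  # best case: half the round trip
    return (
        network_ms
        + codec_ms
        + capture_buffer_ms
        + playback_buffer_ms
        + jitter_buffer_ms
    )


total = mouth_to_ear_ms(
    rtt_ms=20,              # figure from the thread
    codec_ms=20,            # the thread's estimate for Opus
    capture_buffer_ms=10,   # placeholder
    playback_buffer_ms=10,  # placeholder
    jitter_buffer_ms=20,    # placeholder
)
print(total)  # 70.0
```

Even with an optimistic network, the 10 ms of one-way network delay ends up being a small part of the total.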
There's a really interesting project called NINJAM https://www.cockos.com/ninjam/ which is designed for live music jam sessions. It flips this fundamental constraint on its head - instead of being real-time, it streams everyone else's output delayed by one bar (theoretically any interval >RTT I guess?). I haven't tried it, but it's a really cool idea.
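For a feel of the numbers behind the "delay by one bar" idea, here is a small sketch. It is not NINJAM's code; the function names are mine, and the only assumption is that the interval is a whole number of beats at a fixed tempo.

```python
def interval_seconds(bpm: float, beats_per_interval: int) -> float:
    """Length of one shared interval at the given tempo."""
    return beats_per_interval * 60.0 / bpm


def interval_covers_rtt(bpm: float, beats_per_interval: int, rtt_ms: float) -> bool:
    """True if one interval is longer than the round-trip time."""
    return interval_seconds(bpm, beats_per_interval) * 1000.0 > rtt_ms


# One 4-beat bar at 120 BPM lasts 2 seconds, far longer than a 300 ms rtt.
print(interval_seconds(120, 4))          # 2.0
print(interval_covers_rtt(120, 4, 300))  # True
# A single beat at 240 BPM lasts 250 ms, which is too short for a 300 ms rtt.
print(interval_covers_rtt(240, 1, 300))  # False
```

With intervals this long compared to typical round-trip times, the network delay stops mattering; the trade-off is that you always play along to what the others did one interval ago.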
Of course there's a ton of other potential sources of delay that make my fantasy hard to achieve, probably already starting at the typical USB microphones (in headsets/cameras).
20ms rtt through e.g. Opus on a loopback network interface is already decidedly non-trivial to achieve with "normal" hardware. When you do have low-latency devices, it becomes easy, but not everyone has those.
Musicians building digital audio workstations commonly have to replace the whole software stack to get audio latency down to an acceptable (<10ms) level: JACK instead of PulseAudio, a Linux kernel recompiled with custom options for low latency, other software reconfigured to use the JACK APIs, and so on. Sometimes they can't use whatever standard audio hardware. (And remember that USB polling frequency is normally only 100 Hz: 10 ms worst-case by itself.)
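The arithmetic behind those numbers is simple: a buffer holds a fixed number of sample frames, so its latency is frames divided by sample rate, and a polling loop can be late by up to one polling interval. A minimal sketch; the function names and the example buffer sizes are my own, not recommended settings:

```python
def buffer_latency_ms(frames_per_period: int, periods: int, sample_rate_hz: int) -> float:
    """Latency contributed by an audio buffer of periods * frames_per_period frames."""
    return 1000.0 * frames_per_period * periods / sample_rate_hz


def polling_worst_case_ms(polling_hz: float) -> float:
    """Worst-case wait for the next poll at the given polling frequency."""
    return 1000.0 / polling_hz


# Two periods of 128 frames at 48 kHz stay under the 10 ms target.
print(buffer_latency_ms(128, 2, 48000))
# Two periods of 256 frames at 48 kHz already exceed it.
print(buffer_latency_ms(256, 2, 48000))
# Polling at 100 Hz can add up to 10 ms on its own.
print(polling_worst_case_ms(100))
```

This is why low-latency setups end up with small buffers, and why small buffers need a system that can reliably service them in time.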
Minimizing latency is certainly technically feasible, it's just hard for stupid reasons.
I haven't tried it yet, but sofasession.com seems optimized for this. Using wired Ethernet instead of WiFi can go a long way, from what I've heard. Has anyone here tried it?
It depends on the type of music: something slow and choral can easily deal with high latencies, while something quick, rhythmic and precise can be much harder to deal with.
Has anyone tried Mumble for this? It's very low latency, but I can't find exactly how low the latency is. Of course it also depends on your internet connection and other settings, but there is a base latency that comes from buffering the sound before sending. Mumble also has lots of settings for sound quality and different sound formats, so it might work for music if you try all the settings.
Mumble has a setting for the audio buffer size, and in fact they make you set it during initial configuration. It works great, has low latency and doesn't use much bandwidth (I hosted a server on a 1Mbps DSL connection for several people back in the day).
Latency is not that big a problem, I'd say. We play music where it does not matter that much, sometimes just playing one long tone for the length of everyone's breath.
I just would like to hear everybody at the same time, but what I hear is always one person's sound getting preference over others. Or sounds just alternate randomly based on the volume, I'd guess.
Musicians already deal with that kind of issue when doing particular kinds of performance (e.g. famously at Wagner's festival opera house, where the orchestra is in a deep pit below the singers).
That's not how it works. In fact there are multiple algorithms depending on the browser, it's not defined in the spec. The most used one currently would be AEC3 from Google, which is quite a bit more advanced than what you describe.
https://meet.jit.si/YourRoonNameHere#config.disableAP=true
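If you want to generate links like this for a group, a tiny helper is enough. This is a sketch with a hypothetical function name; it assumes that overrides go in the URL fragment as `config.<name>=<value>` and that several of them can be joined with `&`. The only override taken from this thread is `disableAP`.

```python
from urllib.parse import quote


def jitsi_room_url(room: str, overrides: dict, base: str = "https://meet.jit.si") -> str:
    """Build a room URL with config overrides in the fragment (hypothetical helper)."""
    fragment = "&".join(
        f"config.{name}={'true' if value else 'false'}"
        for name, value in overrides.items()
    )
    url = f"{base}/{quote(room)}"
    return f"{url}#{fragment}" if fragment else url


print(jitsi_room_url("YourRoonNameHere", {"disableAP": True}))
# https://meet.jit.si/YourRoonNameHere#config.disableAP=true
```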