PipeWire: The Linux audio/video bus (lwn.net)
457 points by darwi on March 3, 2021 | 200 comments

I've been trying out the latest master builds of pipewire recently and have been pretty impressed with it:

* My bluetooth headset can now use the HFP profile with the mSBC codec (16 kHz sample rate) instead of the terrible CVSD codec (8 kHz sample rate) with the basic HSP profile.

* Higher quality A2DP stereo codecs, like LDAC, also work.

* AVDTP 1.3 delay reporting works (!!) to delay the video for perfect A/V sync.

* DMA-BUF based screen recording works with OBS + obs-xdg-portal + pipewire (for 60fps game capture).

For my use cases, the only things still missing are automatic switching between A2DP and HFP bluetooth profiles and support for AVRCP absolute volume (so that the OS changes the headset's hardware volume instead of having a separate software volume).

If you use Arch Linux, it's really easy to just drop this right in from official packages as a replacement for Pulse/ALSA [0] and start using it. I've been running it for about a month and everything seems to work exactly as I expect it to. I honestly notice no difference other than the pulse audio input/output picker extension I had been using seems confused now (the native GNOME sound control panel applet works just fine though).

On the video front I use obs-xdg-portal for Wayland screen capture as well - finally there's a good story for doing this! You even get a nifty permission dialogue in GNOME. You have to launch OBS in forced wayland mode with 'QT_QPA_PLATFORM=wayland obs'

[0] https://wiki.archlinux.org/index.php/Pipewire#Audio

Interesting... I had the exact opposite experience just a few days ago. After multiple reboots, reinstalls, deleting all configs and every other troubleshooting step I could think of, pipewire-pulse still showed no devices whatsoever. Switching back to pulseaudio brought everything back immediately.

As for xdg-desktop-portal screensharing, while I'm glad to see at least some standard way of screen capture on Wayland and, in theory, permissions are cool, it's still a bad situation. Because each window capture needs explicit permission, dynamically capturing windows is basically impossible and proper movable region capture is tedious and confusing at best. (Also, D-Bus just feels... gross... but that's obviously very subjective.)

Thanks for this, I was wondering if a future Arch update would just auto install this and I would be left wondering what happened when it broke. I am going to remember your post here and try to upgrade to it soon!

Did you get 2-way (in & out) 16 kHz Bluetooth to work? Am I right that this isn't possible?

I believe when the HFP profile is used and mSBC is supported, then mSBC is used for both the input and output.

What in particular do you want/expect from AVRCP here?

I can't tell, but it sounds like the two things it might do for me are:

1. Allow my software stack to remember and restore levels on a hardware device, so maybe my big on-ear headset is super loud compared to the cheap earbuds I use; let's have the software notice they're different and put the hardware output levels back where they were on each device when it sees them. This avoids the device (which presumably is battery powered) needing to correctly remember how it was set up. I would like this.

2. Try to avoid noise from analogue amplifier stages in the headset by only using as much amplification as is strictly needed for current volume settings, which in turn involves guessing how linear (or not) that amplifier's performance is, or makes my level controls needlessly coarse. I don't want this, I'll put up with it but mostly by finding settings where it's not too annoying and swearing every time it trips up.

For me, I'd like/expect pipewire to lock the software volume at 100% and adjust only the hardware volume if AVRCP absolute volume is supported by the headset. (This seems to be how Android and Windows behave.) I mainly care about keeping everything in sync so that no matter if I adjust the volume using the keyboard, GUI volume slider, or buttons on the headset itself, it'll always behave the same. A lot of the time, I end up needing to hit both the volume up keyboard shortcut and the volume up button on my headset because one of the two is already at the max.

There's some work on getting AVRCP absolute volume implemented here: https://gitlab.freedesktop.org/pipewire/pipewire/-/issues/46...

I believe your first point should automatically work once this is implemented. Pipewire seems to already support remembering the volume per bluetooth device (though it's just the software volume that's being remembered right now).
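To make the "lock software volume at 100%, move only the hardware volume" idea concrete, here is a minimal Python sketch. It is not how PipeWire does it; it only assumes that AVRCP absolute volume is a 7-bit value (0 to 0x7F, with 0x7F meaning 100%). The function names are made up for illustration.

```python
AVRCP_MAX = 0x7F  # AVRCP absolute volume is a 7-bit value; 0x7F means 100%


def percent_to_avrcp(percent: float) -> int:
    """Map a 0-100% UI volume to an AVRCP absolute volume step (0..127)."""
    clamped = max(0.0, min(100.0, percent))
    return round(clamped / 100.0 * AVRCP_MAX)


def avrcp_to_percent(step: int) -> float:
    """Map an AVRCP step reported by the headset back to a 0-100% UI volume."""
    clamped = max(0, min(AVRCP_MAX, step))
    return clamped / AVRCP_MAX * 100.0


def split_volume(percent: float, hw_supported: bool):
    """Return (software_percent, hardware_step or None).

    With hardware volume available, software stays at 100% and the headset
    does all the attenuation; otherwise fall back to software volume only.
    """
    if hw_supported:
        return 100.0, percent_to_avrcp(percent)
    return max(0.0, min(100.0, percent)), None
```

With a single source of truth like this, the keyboard shortcut, the GUI slider and the headset buttons would all end up changing the same number, which is the "always in sync" behaviour described above.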

"My bluetooth headset can now use the HFP profile with the mSBC codec (16 kHz sample rate) instead of the terrible CVSD codec (8 kHz sample rate) with the basic HSP profile."

What configuration did you have to do to get this to work? I'm also using pipewire built from the latest master (PipeWire 0.3.22).

In `/etc/pipewire/media-session.d/bluez-monitor.conf`, I uncommented:

    bluez5.msbc-support = true
    bluez5.sbc-xq-support = true

You'll also need to make sure you're on a kernel supporting the WBS fallback patch[1], which should be the case if you have the latest 5.10 kernel or the 5.12 development kernel.

You can check if it's working by running `info all` in pw-cli. It'll mention if the bluez codec is "mSBC".

[1] https://patchwork.kernel.org/project/bluetooth/patch/2020121...
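If you'd rather script the check than read the `info all` dump by eye, something like the following Python sketch might do. It only looks for the string "mSBC" anywhere in the output; it assumes `pw-cli info all` can be run non-interactively and makes no assumption about the exact property names in the dump. The helper names are hypothetical.

```python
import shutil
import subprocess


def mentions_msbc(dump: str) -> bool:
    """True if any line of the dump mentions the mSBC codec (case-insensitive)."""
    return any("msbc" in line.lower() for line in dump.splitlines())


def pw_cli_info_all() -> str:
    """Run `pw-cli info all`; return '' if pw-cli is missing or the call fails."""
    if shutil.which("pw-cli") is None:
        return ""
    try:
        result = subprocess.run(
            ["pw-cli", "info", "all"],
            capture_output=True,
            text=True,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return ""
    return result.stdout
```

Calling `mentions_msbc(pw_cli_info_all())` while a call is active on the headset should then tell you whether mSBC was negotiated.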

Would there be improvements for remote audio streaming over pulseaudio with ssh?

I'm not sure about this one. I haven't tried streaming audio over the network with either pulseaudio or pipewire.

It appears Fedora 34 (next version) will make it default.

Why would this affect bluetooth codecs?

I'm not super familiar with the pipewire internals, but I believe pipewire is the daemon responsible for talking to bluez/bluetoothd and ensuring that the audio stream is encoded with a codec that the headset supports.

For example, this is the PR that enabled mSBC support for the HFP profile: https://gitlab.freedesktop.org/pipewire/pipewire/-/merge_req...

Is that X11 or Wayland?

I just installed Pipewire in Arch Linux running GNOME on Wayland, and I finally have working screensharing in Firefox: I can share e.g. GNOME Terminal (a Wayland-native non-Xwayland app) in a video meeting, which I wasn't able to do without Pipewire.

This was all with Wayland. I haven't tried using pipewire with X11.

> including the raw Linux ALSA sound API, which typically allows only one application to access the sound card.

If the raw in that sentence is not meant as a special qualifier and this is meant as a statement about ALSA in general, this is wrong. I recently read up on this to confirm my memory was right when reading a similar statement. In fact, ALSA had just a very short period where typically only one application accessed the sound card. After that, dmix was enabled by default. Supporting multiple applications was actually the big advantage of ALSA compared to OSS, which at the time really did support only one application per soundcard (without hardware mixing, which broke away at that time). I'm not sure why this seems to be remembered so wrongly?

> Speaking of transitions, Fedora 8's own switch to PulseAudio in late 2007 was not a smooth one. Longtime Linux users still remember having the daemon branded as the software that will break your audio.

This wasn't just a Fedora problem. Ubuntu, when making the switch, also broke audio on countless systems. I was active as a supporter in an Ubuntu support forum at that time and we got flooded with help requests. My very own system did not work with Pulseaudio when I tried to switch, and that was years later. I still use only ALSA because of that experience. At that time Pulseaudio was garbage; it should never have been used then. It only got acceptable later - but still has bugs and issues.

That said, PipeWire has a better vibe than Pulseaudio did. It intends to replace a system that never worked flawlessly, seems to focus on compatibility, and the apparent endorsement from the JACK-developers also does not hurt. User reports I have seen so far have been positive, though I'm not deep into support forums anymore. Maybe this can at least replace Pulseaudio, that would be a win. I'm cautiously optimistic about this one.

The ALSA interface can actually refer to two different things:

1. The ALSA kernel interface

2. The interface provided by libasound2

The former is basically the device files living in /dev/snd, this interface is hardware dependent and whether you can or cannot send multiple streams to the sound card all depends on the actual underlying hardware and driver support.

The latter is actually a shared library that, when linked into your application, exposes "virtual" devices (such as `default`, or `plughw:0` ...); these devices are defined through plugins. The actual configuration of these virtual devices is defined in `/etc/asound.conf` and `~/.asoundrc`. This is typically where dmix is defined/used. Which means that if you have any application that does not use libasound2 or uses a different libasound2 version, you are in trouble.

p.s. Pulseaudio implements alsa API compatibility by exporting an alsa device plugin to reroute audio from all applications making use of libasound2 (except itself).

I've found https://www.volkerschatz.com/noise/alsa.html to be a good overview of ALSA's architecture (though many of the things mentioned, like ogg123, are no longer in common use).

> In fact, ALSA had just a very short period where typically only one application accessed the sound card.

For some definition of 'short period'. Software mixing via dmix worked for me, but at the time I heard for years that dmix was broken for many other people. Not sure whether things are better nowadays.

The breakage seems to have been caused by hardware bugs. Various authors took the stance that they refuse, on principle, to work around hardware bugs. I guess I understand the technical purism, but as a user that attitude was unhelpful: there was no way to get sound working other than to ditch your laptop, and hope that the next one doesn't have hardware bugs. In practice, it seems a lot of hardware had bugs. Are things better nowadays?

> For some definition of 'short period'.

According to https://alsa.opensrc.org/Dmix, dmix has been enabled by default since 1.0.9rc2. https://www.alsa-project.org/wiki/Main_Page_News shows that was 2005. The ALSA 1.0.1 release was in 2004. So it's only short when counting from then on; the project started in 1998. But https://www.linuxjournal.com/article/6735, for example, called it new in 2004, so I don't think it was much of a default choice before then.

I had a sound card that would not work with OSS in 2002ish, so I guess define "default choice." Even though it was technically disabled by default, I had to enable it to get sound working.

> I guess define "default choice."

Default choice: The choice that is made with no user configuration.

> My very own system did not work with Pulseaudio when I tried to switch, that was years later. I still use only ALSA because of that experience. At that time Pulseaudio was garbage, it should never have been used then. It only got acceptable later - but still has bugs and issues

In the interest of balance, PulseAudio was a huge improvement for me.

I remember playing SuperTux on my laptop. After the switch to PulseAudio, the sound was flawless. Before that, on ALSA, the audio was dominated by continuous 'popping' noises—as if buffers were underrunning.

> the apparent endorsement from the JACK-developers also does not hurt.

Indeed, it seems a better UX to only require one sound daemon, instead of having to switch for pro work.

My understanding back in the mid 2000s when I first got into Linux was that OSS only let you play one audio stream at once, and the whole point of ALSA was that it let multiple applications access the sound card.

I guess I could be remembering that wrong, but I know I was listening to multiple audio streams long before PulseAudio came onto the scene.

OSS in non-free versions supported multiple sources. Linux sound guys decided that instead of fixing the free OSS they would write ALSA. They never really worked out all the bugs around mixing before pulseaudio took over.

It's more than 20 years later and still I don't understand these complaints. ALSA was designed to have a broader API than OSS, and it has supported OSS emulation for quite some time. What else could have been done when OSS went non-free?

The same thing FreeBSD has done: keep developing Open Source OSS. One implementation going non-free doesn't affect other implementations of the same API.

Didn't they do that by developing ALSA OSS emulation? That is effectively another implementation of the same API.

Libalsa is there in FreeBSD ports, but it's only for backward compatibility with Linux, and it's only userspace parts. The kernel implements OSS API.

I mean Linux, it does the same but reversed: the kernel implements the ALSA API, and libaoss is provided for backwards compatibility in userspace with OSS. What else should they have done? The ALSA API is not the same as OSS and has a different set of features.

Feature set doesn’t really depend on the API, FreeBSD implements all that functionality despite using OSS.

It sounds like Linux and BSD both have an implementation for both APIs, only implemented in different ways? If so, I don't much see where the problem is with this solution.

They don’t. And the problem is that replacing a popular API with a proprietary one, like ALSA, creates a lot of work for everyone, for no reason other than NIH.

PulseAudio and PipeWire embody the correct approach. The problem with OSS is, if you're using anything resembling modern audio formats, it risks introducing floating-point code into the kernel, which is a hard "no" in mainline Linux. So if you need software mixing, it should be done in a user-space daemon that has exclusive access to the sound card.

OSS was the last time I had Linux audio in a state that I would call "basically working".

Jop, definitely. I still remember this clearly because I was into gaming. And while I could play some games with wine, Counter-Strike iirc, my friends used Teamspeak. Teamspeak had a proprietary linux version, but it used OSS. Before `aoss` became a thing (or maybe just known to me) there was no way of having teamspeak on and ingame sound, and teamspeak needed to be started before the game.

Only using alsa fixed this, mumble I think then became a good alternative for a short while.

> I'm not sure why this seems to be remembered so wrongly?

It didn't work reliably on all chipsets/soundcards.

I don't remember this at all, but this might explain that. Or maybe a distribution like debian stable shipped with an outdated ALSA version, taken from the short period between release and dmix. Or just disabled dmix. Would love if someone remembered specifics.

I kinda assumed people mix up Alsa and OSS or don't remember anymore what actually did and what did not work before Pulseaudio was introduced.

In the early 00's (before PulseAudio), my desktop had an old SoundBlaster Live PCI card that was pretty common around the turn of the millennium. ALSA dmix Just Worked with that one.

Any other hardware I encountered required some kind of software mixing, IIRC. Not that my experience was extensive, but I got the impression that hardware or driver support for dmix wasn't that common.

> Any other hardware I encountered required some kind of software mixing, IIRC.

Yes, that was dmix :) And it fits the timeline, hardware mixing was killed off back then by soundcard vendors/microsoft, iirc.

Hardware mixing was killed because it turned out to be more efficient to mix several streams on the CPU and move just a single one over the bus. It was also more flexible and without weird limits - for example, the GUS could mix 14 streams at 44.1 kHz, and if you went above that (up to 32), the frequency of each stream went down.
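To put rough numbers on that GUS trade-off, here is a small Python sketch. It assumes the per-voice rate simply scales inversely with the number of active voices, anchored at the "14 streams at 44.1 kHz" figure from the comment; the constant and the function name are my own, not taken from any driver.

```python
# Assumption: rate * voices is constant, anchored at 14 voices -> 44.1 kHz.
GUS_RATE_TIMES_VOICES = 44_100 * 14  # = 617_400


def gus_rate_hz(active_voices: int) -> float:
    """Approximate per-voice playback rate for a given number of active voices."""
    if not 14 <= active_voices <= 32:
        raise ValueError("this sketch only covers 14 to 32 active voices")
    return GUS_RATE_TIMES_VOICES / active_voices


if __name__ == "__main__":
    for voices in (14, 16, 24, 32):
        print(voices, round(gus_rate_hz(voices)))
```

Under that assumption, running all 32 voices would leave each one at well under half the 44.1 kHz rate, which is the kind of "weird limit" software mixing doesn't have.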

Dmix was that software mixing; in the early 00's there were some cards capable of mixing in hardware and dmix was not used for them.
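For anyone wondering what "software mixing" boils down to: conceptually it is just summing samples and clamping the result. Below is a minimal Python sketch of that idea for signed 16-bit streams at the same sample rate. It is not dmix's actual implementation (which also has to deal with shared memory, resampling and timing); the function name is hypothetical.

```python
from array import array

INT16_MIN, INT16_MAX = -32768, 32767


def mix_int16(streams):
    """Mix equal-rate signed 16-bit streams into one, saturating on overflow.

    Streams may differ in length; a stream that has ended contributes silence.
    """
    length = max((len(s) for s in streams), default=0)
    mixed = array("h", [0]) * length
    for i in range(length):
        total = sum(s[i] for s in streams if i < len(s))
        # Clamp instead of wrapping around, so loud passages clip rather than
        # turning into noise.
        mixed[i] = max(INT16_MIN, min(INT16_MAX, total))
    return mixed
```

The single mixed buffer is what gets handed to the sound card, which is why only one stream has to cross the bus.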

Okay, I guess I've got it wrong then. Thanks for the clarification.

Indeed. It also glitches like hell in case of any system load.

That's why PulseAudio and PipeWire run as real-time. Or they should.

Back when Pulse was new and I was running Gentoo I used to help other Gentoo users get their real-time settings correct. I believe we used rtprio in limits.conf. I don't recall when RTKit became a thing.

If your sound daemon is running as real-time and still missing deadlines then there's something wrong with your system hardware. Or I suppose, the sound source feeding the pulseaudio daemon is not getting enough CPU time.

I think they indeed should run at a (low) real-time priority, but only if they are limited to a fraction of total available CPU power, with say cgroups or similar.

Otherwise they can easily lock up the system, and that should not be the default configuration.
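For reference, asking for a low real-time priority looks roughly like this from user space. This is a minimal Python sketch of the general mechanism, not what PulseAudio or PipeWire actually do; whether the call succeeds for an unprivileged user depends on limits such as the rtprio setting mentioned above. It does not cap CPU usage in any way, so the lock-up concern still applies. The function name is made up.

```python
import os


def try_low_realtime_priority(priority: int = 1) -> bool:
    """Try to switch the calling process to SCHED_FIFO at a low priority.

    Returns True on success, False if the platform lacks the API or the
    request is refused (for example, no rtprio allowance for this user).
    """
    if not hasattr(os, "sched_setscheduler"):
        return False  # platform without the POSIX scheduler API
    lowest = os.sched_get_priority_min(os.SCHED_FIFO)
    highest = os.sched_get_priority_max(os.SCHED_FIFO)
    clamped = max(lowest, min(highest, priority))
    try:
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(clamped))
    except OSError:  # includes PermissionError
        return False
    return True
```

A daemon would typically fall back to normal scheduling (and accept the occasional glitch under load) when this returns False.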

> My very own system did not work with Pulseaudio when I tried to switch, that was years later. I still use only ALSA because of that experience. At that time Pulseaudio was garbage, it should never have been used then. It only got acceptable later - but still has bugs and issues.

I remember the transition to PulseAudio. Initially, most things were broken, and we still had some applications that worked only with OSS, so the whole audio suite in Linux was a mess. I remember that I had already switched from Fedora (very broken)/Ubuntu (slightly less broken) to Arch Linux, and for some time I kept using ALSA too.

Eventually, between switching Desktop Environments (I think Gnome already used PulseAudio by default then, while in KDE it was optional but also recommended(?)), I decided to try PulseAudio and was surprised how much better the situation was afterwards (also, OSS eventually died completely in Linux systems, so I stopped using OSS emulation afterwards).

With time it got better and better until PulseAudio just worked. And getting audio output nowadays is much more complex (Bluetooth, HDMI Audio, network streaming, etc). So yeah, while I understand why PipeWire exists (and I am planning a migration after the next NixOS release that will bring multiple PipeWire changes), I am still glad that PulseAudio was created.

I’m using pipewire in NixOS unstable and it’s working very well. I know they are working on integrating the new pipewire configuration with Nix configuration.

I am just waiting for the release of the next NixOS stable version, since the integration in the current stable version (20.09) is still lacking some important features.

Which features do you have in mind? I use NixOS as well, and I’ve been thinking of switching to PipeWire.

I can't switch without SELinux support.

I used the PulseAudio simple API only to find out that it didn't work. Only when I changed to the normal API it started working.

Trivia: Some sound cards (eg pas16) showed up as multiple devices using OSS too and you could output pcm audio to 2 simultaneously.

Trivia, FreeBSD looked at Alsa and decided they were using a rewrite as an excuse to not fix OSS and so they dug in and fixed OSS so it worked.

You know how some people say C thinks every computer is just a fancier PDP-11?

OSS thinks every computer's audio playback devices are just fancier CT Soundblasters.

If you've got a typical generic PC with Intel HDA or AC'97 (or indeed, an actual PCI Soundblaster) and what you'd like to have happen is that you can run your VoIP software, and listen to MP3s and also have new mails go "bing!" then "all the world is a Soundblaster 16" is close enough.

But if you've got a USB headset you plug in to take those VoIP calls, and a pair of bluetooth earbuds for listening to music, but you want new mail noises on speakers - you will start to struggle. These things aren't enough like a Soundblaster, neither in terms of how they're actually working nor in terms of how you expect to use them.

PulseAudio does a much better job there, because it's the right shape, and now it seems PipeWire is a better shape for modern environments where it's not necessarily OK for the software that plays smooth jazz from the Internet to be technically capable of listening in on your video conference calls.

PulseAudio sits above the level of OSS/ALSA to do that magic. Layers are often a good thing when they separate concerns and make the whole easier to handle.

Another point against dmix: I would be surprised if it worked with sandboxes like Flatpak. That may be another reason why major desktop-oriented distros like Fedora Workstation haven't embraced it.

How do dmix and pulseaudio do IPC?

Dmix seems rather limited and doesn't come automatically setup for all audio devices. [1]

[1]: https://alsa.opensrc.org/Dmix

That's a very old wiki page with decades-old workarounds for decade-old issues. I'm not saying you are wrong, but if you take this impression solely from that wiki page you are likely misled. Afaik this always works and has for many years - but I might be wrong and have only been lucky with all the systems where I tested it?

I think you've hit one of the most painful topics of ALSA: Its documentation.

Check out the "Perfect Setup" for Pulse Audio: https://www.freedesktop.org/wiki/Software/PulseAudio/Documen...

Sound on Linux is a shitshow.

Yeah I'm aware that PA has relatively a lot of documentation, but I'd still like to see ALSA documented. Even if a tiny bit more.

Yes, you can do an incredible amount of very useful stuff with ALSA and its asoundrc files.

Sadly that logic is quite opaque, poorly documented, and produces mysterious error messages or, worse, no error at all.

Just tried it on NixOS, had no idea it was so fleshed out already! Thought it'd be full of bugs but was pleasantly surprised, it just worked. No issues with compatibility, extremely low latency and has JACK and PulseAudio shims, so everything works out of the box, including pro audio stuff like Reaper and Ardour. And thanks to the JACK shim I can patch around the outputs with qjackctl. This is compared to JACK, which I never managed to properly combine with PulseAudio.

But does it work with the OSS shim for alsa shim for pulseaudio shim for jack shim for pipewire?

Jokes aside, my first reaction upon hearing about pipewire was "oh no, not yet another Linux audio API" but maybe a miracle will happen and it'll be the Chosen One.

I know that audio is hard but man the situation on Linux is such a terrible mess, not in small part because everybody reinvents the wheel instead of fixing the existing solutions. Jack is definitely the sanest of them all in my experience (haven't played with pipewire) but it's also not the most widely supported so I often run into frustrating issues with the compatibility layers.

> not in small part because everybody reinvents the wheel instead of fixing the existing solutions.

I'm using all of these reinventions. Wayland, systemd, flatpak, btrfs and soon pipewire. I'm absolutely loving Linux right now. Everything works so nicely, in a way it never will on a distro with legacy tools. Some of these projects like flatpak have a few rough edges but the future is very bright for them and most problems seem very short term rather than architectural.

I also liked JACK the best. But was frustrated with compatibility layers.

Pipewire lets me use all the JACK tooling, but without needing a special compat layer to manage it. So for now, I'm pretty excited.

Still haven't figured out how to get anything working for video.

I gave JACK a try more than a decade ago, and I remember how cool it was to be able to pipe the audio from one application to the input of another (unrelated) app, possibly adding effects or transformations in between. But JACK never became "mainstream" so I never got to use it for anything serious, but I miss the flexibility it offered even for non professional use-cases. What I wonder is if PipeWire will allow this kind of routing or patching of audio streams as well.

It does, exactly the same way as JACK, and you can even do it with pulseaudio apps! I could pipe audio from a firefox tab through guitarix (guitar amp emulator) into a second firefox tab if I wanted to. With just JACK or just Pulse this wouldn't be possible. And if I understand it correctly, it should work for video streams too. I'm imagining piping a screenshare through OBS before going into discord or something, should be very useful.

You can do that in Pulse, by using null sinks and monitor sources.
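For the curious, the Pulse approach amounts to two `pactl load-module` calls: one creating a null sink, one looping that sink's monitor source to wherever the audio should go next. The Python sketch below only builds those command lines; the sink names are placeholders, and moving an application's stream onto the null sink (e.g. in pavucontrol) is left out.

```python
def null_sink_route(sink_name: str, target_sink: str):
    """Build pactl commands for a null sink whose monitor feeds target_sink.

    sink_name and target_sink are caller-chosen placeholders; nothing is
    executed here, the commands are only returned as argv lists.
    """
    return [
        # Virtual sink that applications can be pointed at.
        ["pactl", "load-module", "module-null-sink", f"sink_name={sink_name}"],
        # Every sink gets a "<name>.monitor" source; loop it to the target.
        [
            "pactl",
            "load-module",
            "module-loopback",
            f"source={sink_name}.monitor",
            f"sink={target_sink}",
        ],
    ]


if __name__ == "__main__":
    for argv in null_sink_route("fx_in", "my_headphones"):
        print(" ".join(argv))
```

Each returned list can be passed to `subprocess.run(argv, check=True)` on a machine running PulseAudio.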

I see what you mean, still a lot more complicated than just dragging a "wire" in qjackctl though.


It's not super polished, but you can do similar wire dragging here.

I'm running PipeWire on Arch and I can do it through `pw-jack carla`[1]. You can do surprisingly advanced stuff through the JACK compatibility.

[1] https://i.imgur.com/EFUxR41.png

Hey, it looks nice! But, are you required to be running JACK? I mean, isn't this routing capability part of the PipeWire's core?

No, I don't have jackd running. Launching something with `pw-jack` (AFAIK) makes that use the PipeWire libraries instead of the normal JACK ones. I think PipeWire has its own internal graph that's compatible with ALSA, PulseAudio, and JACK, and that's how it can work with any program. Here's the output of `pw-dot` which is a PipeWire tool that dumps the graph: https://i.imgur.com/OMZCcmC.png

Thanks for the datapoint. I've been following https://github.com/NixOS/nixpkgs/issues/102547 and considering trying it out for a while.

Did you just set services.pipewire.pulse.enable=true?


My major concern is that I use PulseEffects as a key component of my setup so I'll need to check if that works well with PipeWire. But the only way to be sure is to try it!

I also have Pipewire running on NixOS. This is what I recommend configuring:

  services.pipewire = {
    enable = true;
    alsa.enable = true;
    alsa.support32Bit = true;
    jack.enable = true;
    pulse.enable = true;
    socketActivation = true;
  };

That allows me to run pretty much any application that uses ALSA, JACK, or PulseAudio.

That gave me "services.pipewire.alsa" does not exist on 20.09; does this require unstable?

I believe so, I am on the unstable channel.

https://github.com/wwmm/pulseeffects#note-for-users-that-did... PulseEffects supports only PipeWire since version 5; PulseAudio support lives on in the legacy PulseAudio branch.

I didn't get to try it under PipeWire on my Arch laptop before that died the other day, but a friend had said PulseEffects is no longer such a massive CPU hog under PipeWire, so much so that they run it all the time now.

Interesting. I have PulseEffects running all of the time on PulseAudio and don't notice much CPU usage. However maybe that is because I only apply effects to the mic and it seems to disable itself when nothing is recording.

I just tried it on NixOS and the new version is already packaged and working.

> extremely low latency

How low is "extremely low", especially compared to JACK that I'm currently using when doing music production?

What latency do you get using JACK?

Not in front of the right computer, but I think it's around 15ms for a full roundtrip.
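When comparing latency figures like this, it helps to know how much of it comes from buffering alone: period size times number of periods, divided by the sample rate. Here is a small Python sketch of that arithmetic; the example settings are illustrative, not anyone's measured setup, and hardware or converter delays are ignored.

```python
def buffer_latency_ms(frames_per_period: int, periods: int, sample_rate: int) -> float:
    """Latency contributed by one direction's buffer, in milliseconds."""
    return frames_per_period * periods / sample_rate * 1000.0


def roundtrip_estimate_ms(frames_per_period: int, periods: int, sample_rate: int) -> float:
    """Rough round trip: one capture buffer plus one playback buffer."""
    return 2 * buffer_latency_ms(frames_per_period, periods, sample_rate)


if __name__ == "__main__":
    # Illustrative settings: 256 frames/period, 2 periods, 48 kHz.
    print(round(buffer_latency_ms(256, 2, 48_000), 2))
    print(round(roundtrip_estimate_ms(256, 2, 48_000), 2))
```

So a meaningful comparison between JACK and PipeWire needs the same period size and sample rate on both sides.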

>This is compared to JACK, which I never managed to properly combine with PulseAudio.

Yeah making PulseAudio play nice with JACK seems to be tricky. Over time I configured it in four different environments (different Linux Distributions and/or Versions) and for each of them I had to do things (at least slightly) differently to get them to work.

Using JACK apps to route between PulseAudio apps under PipeWire is magic, as is being able to turn on external DACs after login and still be able to use them with JACK apps without restarting any software. Also PulseAudio not randomly switching to the wrong sample rate when I open pavucontrol is a blessing. (And it's so easy to set up, at least on Arch Linux.)

I have come to describe PW as like a superset of JACK and PulseAudio.

Also to note, #pipewire is very active on freenode, and wtay regularly drops into #lad.

I first read about PipeWire about two months ago and I'd really love to try it. But! My setup is working and I'm really not the kind of person who likes to unnecessarily tamper with a smoothly running system. So I'll probably try it the next time I need to do a fresh install. Promise!

> Yeah making PulseAudio play nice with JACK seems to be tricky.

for me https://github.com/brummer10/pajackconnect has worked flawlessly... but I've switched to pipewire and I'm not looking back !

I found it tricky at first but got a mostly smooth setup on two machines now with wide variety of uses.

Definitely for day-to-day use the Ubuntu Studio app has actually been the most helpful (direct control / visibility into the Jack <-> PA bridging is great), or a combo of qjackctl and Carla for more Audio-focused stuff.

It's great to see pipewire coming along; pulseaudio development seems (to a spectator) to have been a little... well...



While that seems like a huge cluster, it does kind of seem that the patch rejections come from a set of principles. If the patches would improve Bluetooth audio at the expense of breaking existing features, saying "we don't break existing features" is a valid position to hold.

I don't think the issue was breaking existing features; it was a classic 'perfect being the enemy of good' situation. PA maintainers wanted to support dynamically loading the involved codecs due to potential (but not particularly well demonstrated) concerns about licensing and inclusion in certain distros. But they didn't really have the manpower to actually do this (and especially they disagreed with the person actually doing the work on how to go about doing this), so they just sat on an MR for ages while the situation improved for no-one (worst case it gets merged and then disabled at compile time by some distros). Meanwhile frustrations arose because the contributor wanted to help users and the maintainers just seemed like a roadblock to doing this, and the maintainers utterly failed to de-escalate the situation.

Isn't this probably because most developers have shifted focus on PipeWire? I thought both came from more-or-less the same community?

This is so sad. I had no idea.

Well, whenever I report issues about PulseAudio, the response I get is "this is fixed on pipewire". Seems like the development community has moved on and it's time for the users to move too.

> Second, D-Bus was replaced as the IPC protocol. Instead, a native fully asynchronous protocol that was inspired by Wayland — without the XML serialization part — was implemented over Unix-domain sockets. Taymans wanted a protocol that is simple and hard-realtime safe.

I'm surprised to read this; I was under the impression that D-Bus was the de jure path forward for interprocess communication like this. That's not to say I'm disappointed - the simpler, Unix-y style of domain sockets sounds much more in the style of what I hope for in a Linux service. I've written a little bit of D-Bus code and it always felt very ceremonial as opposed to "send bytes to this path".

Are there any discussions somewhere about this s/D-Bus/domain socket/ trend, which the article implies is a broader movement given Wayland's similar decision as well?

D-Bus isn't suitable for realtime. Getting that to work would require additional changes within the D-Bus daemon to add realtime scheduling, and even with all that, it would still introduce latency because it requires an extra context switch from client -> dbus-daemon -> pipewire. Maybe they could have re-used the D-Bus wire format? That's the only bit that might have been suitable.

To this day I still don't understand why messages are routed through dbus-daemon instead of just using FD-passing to establish the p2p connection directly. I remember we were using D-Bus on WebOS @ Palm & a coworker rewrote the DBus internals (keeping the same API) to do just that & the performance win was significant (at least 10 years ago).

Among other things, pushing everything through the message bus allows for global message ordering, and security policies down to the individual message. Rewriting the internals would work in an embedded situation like that where every application is linking against the same version of libdbus, but that is not really the case on a desktop system, where there are multiple different D-Bus protocol implementations.

If applications have hard performance requirements, most D-Bus implementations do have support for sending peer-to-peer messages, but applications have to set up and manage the socket themselves.

Also lets you restart either end of the connection transparently to the other end.

With fd passing, if the daemon I'm talking to dies or restarts, my fd is now stale and I have to get another.

Also allows starting things on demand similar to inetd.

Also allows transparent multicast.

So yeah, fd passing would be faster, but routing through the daemon is easier.
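
For anyone who hasn't seen fd passing in practice, here is a minimal sketch of the idea discussed above, assuming Python 3.9+ on a Unix system (for `socket.send_fds` / `socket.recv_fds`). The names `broker`, `client`, `peer_a` and `peer_b` are made up for illustration; this is not how dbus-daemon or PipeWire actually do it, just the underlying SCM_RIGHTS mechanism.

```python
import socket


def pass_fd_demo():
    """Hand a client one end of a direct channel over a Unix socket."""
    # Hypothetical roles: "broker" stands in for the bus daemon,
    # "client" for an application connected to it.
    broker, client = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    # The direct peer-to-peer channel that the broker hands out.
    peer_a, peer_b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with broker, client, peer_a, peer_b:
        # Send a normal message plus the descriptor as ancillary data.
        socket.send_fds(broker, [b"here is your peer"], [peer_b.fileno()])
        msg, fds, _flags, _addr = socket.recv_fds(client, 1024, 1)
        # The kernel gave the client its own duplicate of the descriptor.
        direct = socket.socket(fileno=fds[0])
        with direct:
            # From here on, traffic bypasses the broker entirely.
            peer_a.sendall(b"ping")
            return msg, direct.recv(4)
```

After the handoff the broker is no longer in the data path, which is where the speedup comes from. It is also why a restart of the other end leaves the client holding a stale descriptor.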

I didn't mention those because in theory a lot of that could be done by the library, or done by the daemon before passing off the fd for a peer-to-peer connection. (If a connection dies, the library would transparently handle that by sending a request back to the daemon for another connection, etc) But of course another thing that having a message bus allows you to do is reduce the amount of fds that a client has to poll on to just one for the bus socket.

That sounds great! D-Bus always seemed to me like a byzantine, overcomplicated design, and I was pleasantly surprised when I saw the Wayland protocol with its simple design.

Maybe you haven't read Havoc Pennington's posts on why dbus was designed the way it was and the problems it solves. Start here: https://news.ycombinator.com/item?id=8649459

Dbus is about the simplest approach that solves the issues that need addressing.

Well, D-Bus was originally designed to solve the problem of... a message bus. So you can pass messages down the bus and multiple consumers can see it, you can call "into" other bus services as a kind of RPC, etc. Even today, there's no real alternative, natively-built solution to the message bus problem for Linux. There have been various proposals to solve this directly in Linux (e.g. multicast AF_UNIX, bus1, k-dbus) but they've all hit various snags or been rejected by upstream. It's something Linux has always really lacked as an IPC primitive. The biggest is multicast; as far as I know there's just no good way to write a message once and have it appear atomically for N listeners, without D-Bus...

Now, the more general model of moving towards "domain sockets" and doing things like giving handles to file descriptors by transporting them over sockets, etc can all be traced back to the ideas of "capability-oriented security". The idea behind capability oriented security is very simple: if you want to perform an operation on some object, you need a handle to that object. Easy!

For example, consider rmdir(2). It just takes a filepath. This isn't capability-secure, because it requires ambient authority: you simply refer to a thing by name and the kernel figures out if you have access, based on the filesystem permissions of the object. But this can lead to all kinds of huge ramifications; filesystem race conditions, for instance, almost always come down to exploiting ambient authority.

In contrast, in a capability oriented design, rmdir would take a file descriptor that pointed to a directory. And you can only produce or create this file descriptor either A) from a more general, permissive file descriptor or B) on behalf of someone else (e.g. a privileged program passes a file descriptor it created to you over a socket.... sound familiar, all of a sudden?) And this file descriptor is permanent, immutable, and cannot be turned into "another" descriptor of any kind that is more permissive. A file descriptor can only become "more restrictive" and never "more permissive" — a property called "capability monotonicity." You can extend this idea basically as much as you want. Capabilities (glorified file descriptors) can be extremely granular.

As an example, you might obtain a capability to your homedir (let's say every process, on startup, has such a capability.) Then you could turn that into a capability for access to `$HOME/tmp`. And from that, you could turn it into a read-only capability. And from that, you could turn it into a read-only capability for exactly one file. Now, you can hand that capability to, say, gzip as its input file. Gzip can now never read from any other file on the whole system, no matter if it was exploited or ran malicious code.
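
A rough sketch of that narrowing chain using plain POSIX descriptors, assuming Python 3 on Linux or another Unix with `dir_fd` support. The directory layout and the helper name `narrow_to_readonly_file` are hypothetical. Note the big caveat: ordinary descriptors are not real capabilities, since a path like `..` can still escape the directory. Something like Capsicum on FreeBSD or `openat2` with `RESOLVE_BENEATH` on Linux is needed to actually enforce the "only more restrictive" rule.

```python
import os
import tempfile


def narrow_to_readonly_file(dir_fd, name):
    """Derive a read-only descriptor for one file from a directory descriptor."""
    return os.open(name, os.O_RDONLY | os.O_NOFOLLOW, dir_fd=dir_fd)


def demo():
    with tempfile.TemporaryDirectory() as home:  # stands in for $HOME
        os.mkdir(os.path.join(home, "tmp"))
        with open(os.path.join(home, "tmp", "input.txt"), "w") as f:
            f.write("hello")

        # Homedir handle -> $HOME/tmp handle -> read-only handle on one file.
        home_fd = os.open(home, os.O_RDONLY | os.O_DIRECTORY)
        tmp_fd = os.open("tmp", os.O_RDONLY | os.O_DIRECTORY, dir_fd=home_fd)
        file_fd = narrow_to_readonly_file(tmp_fd, "input.txt")
        try:
            data = os.read(file_fd, 100)
            try:
                os.write(file_fd, b"x")  # must fail: opened read-only
                writable = True
            except OSError:
                writable = False
            return data, writable
        finally:
            for fd in (file_fd, tmp_fd, home_fd):
                os.close(fd)
```

In the gzip scenario, only `file_fd` would be handed to the worker process (for example over a Unix socket), so the worker never sees a path at all.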

For the record, this kind of model is what Google Chrome used from the beginning. As an example, rendering processes in Chrome, the process that determines how to render a "thing" on the screen, don't actually talk to OpenGL contexts or your GPU at all; they actually write command buffers over sockets to a separate process that manages the context. Rendering logic in a browser is extremely security sensitive since it is based exactly on potentially untrusted input. (This might have changed over time, but I believe it was true at one point.)

There's one problem with capability oriented design: once you learn about it, everything else is obviously, painfully broken and inadequate. Because then you start realizing things like "Oh, my password manager could actually rm -rf my entire homedir or read my ssh key, and it shouldn't be able to do that, honestly" or "Why the hell can an exploit for zlib result in my whole system being compromised" and it's because our entire permission model for modern Unix is built on a 1970s model that had vastly different assumptions about how programs are composed to create a usable computing system.

In any case, Linux is moving more and more towards adopting capability-based models for userspace. Such a design is absolutely necessary for a future where sandboxing is a key feature (Flatpak, AppImage, etc.) I think the kernel actually has enough features now to where you could reasonably write a userspace library, similar to libcapsicum for FreeBSD, which would allow you to program with this model quite easily.

Capabilities are an intensely underused design pattern.

> Well, D-Bus was originally designed to solve the problem of... a message bus.

I guess no one has ever told me what the "message bus problem" actually is. I get sending messages -- a very useful way of structuring computation, but why do I want a message _bus_?

I get wanting service discovery, but don't see why that means a bus. I get wanting RPC, but don't see why that means a bus.

Heck, I don't even know what "have it appear atomically for N listeners" means if you have less than N CPUs or the listeners are scheduled separately. Or why that's a good thing. Did you just mean globally consistent ordering of all messages?

> The biggest is multicast; as far as I know there's just no good way to write a message once and have it appear atomically for N listeners, without D-Bus...

I once wrote a proof of concept that uses the file system to do this. Basically, writers write their message as a file to a directory that readers watch via inotify. When done in a RAM based file system like tmpfs, you need not even touch the disk. There are security and permission snags that I hadn't thought of and it may be difficult if not totally infeasible to work in production, but yeah... the file system is pretty much the traditional one-to-many communication channel.

If you use shared memory, then you can use interprocess futexes for signaling, no need for inotify. That's pretty much how posix mq_open is implemented.

> There's one problem with capability oriented design: once you learn about it, everything else is obviously, painfully broken and inadequate.

oh yes!

I wonder if localhost-only UDP multicast can be a usable substitute for the missing AF_UNIX multicast.

I don't know of any discussions on it, but I like it. Client-server architectures seem like a Good Thing, and I'm growing to like the idea of a small handful of core "system busses" that can interoperate with each other.

The problem with these busses is that each hand rolls its own security primitives and policies.

This sort of thing is better handled by the kernel, with filesystem device file permissions. As a bonus, you save context switching into the bus userspace process on the fast path. So, “the unix way” is simpler, faster and more secure.

File permissions are completely insufficient to achieve the kind of design that PipeWire is aiming for. None of the problems outlined in the article regarding PulseAudio (e.g. the ability of applications to interfere or snoop on each other, or requesting code being loaded into a shared address space that have unlimited access) can be easily handled with file permissions at all. The model is simply not expressive enough; no amount of hand-wringing about context switching will change that. This is one of the first things addressed in the article and it's very simple to see how file permissions aren't good enough to solve it.

That won't work here, the design of pipewire is to allow for things like a confirmation dialog appearing when an application tries to use the microphone or webcam, the application can then get temporary access to the device. That is a security policy that isn't really easy to do with filesystem device file permissions.

The fact that PipeWire has the potential to replace both PulseAudio (for consumer audio) and Jack (for pro audio) with a unified solution is very exciting.

Particularly if you are on the pro audio side. Consumer audio can ignore pro-audio for the most part. However everyone on pro-audio needs to do something consumer audio once in a while, if only to run a web browser.

Exactly. If you're working on a project in Ardour and want to quickly watch a video on YouTube, stopping Jack and starting PulseAudio was a bit of a pain the last time I did that.

Yes, you can configure PulseAudio as a Jack client, but the session handling is also a bit messy. (I used to have a PA -> Jack setup on my work computer just so I could use the Calf equalizer / compressor plugins for listening to music. I dropped it again after a while, because session handling and restoring wasn't always working properly. But that was around 6-7 years ago, maybe it would work better nowadays.)

What is pro audio? I'm naive when it comes to this.

Setups for professional music production. This usually requires JACK (https://jackaudio.org/) as audio server, which allows synchronizing and connecting multiple applications.

For example, you can have Ardour (https://ardour.org/) as DAW, but use another application like Hydrogen (http://hydrogen-music.org/) for creating drum samples. JACK connects the two applications using a virtual patchbay that allows using Hydrogen as an Input for Ardour. Essentially any application can be an input and/or an output.

JACK also provides synchronization using a "master clock", so that Hydrogen starts playing as soon as you hit the "record" button in Ardour.

Many people also use a Linux kernel optimized for low latency audio.

With PulseAudio, these things are not possible. On the other hand, consumer applications like web browsers don't usually offer direct JACK support. So bridging is necessary, by using PulseAudio as a JACK input.

This is very informative. Thank you for clearing some of this up for me. I didn't realize audio was this complex.

This is gonna be huge imo. I'm a linux veteran at this point and I can't get JACK to work without an hour of fiddling every time.

>JACK applications are supported through a re-implementation of the JACK client libraries and the pw-jack tool if both native and PipeWire JACK libraries are installed in parallel

>unlike JACK, PipeWire uses timer-based audio scheduling. A dynamically reconfigurable timer is used for scheduling wake-ups to fill the audio buffer instead of depending on a constant rate of sound card interrupts. Beside the power-saving benefits, this allows the audio daemon to provide dynamic latency: higher for power-saving and consumer-grade audio like music playback; low for latency-sensitive workloads like professional audio.

That's pretty interesting. It sounds like it's backwards compatible with jack programs but uses timer based scheduling similar to pulseaudio. Can you actually get the same low levels of latency needed for audio production without realtime scheduling?

JACK's used over pulse for professional audio typically because of its realtime scheduling. How does pipewire provide low enough latency for recording or other audio production using timer based scheduling?

Does anyone have any experience using pipewire for music recording or production?

It would be nice to have one sound server, instead of three layered on top of each other precariously, if it works well for music production.

Without audio buffer rewinding, you're going to have to suffer random stutters and jumpiness every time your system comes under heavy load. Your system does an NMI because you plugged the power cable in? Your audio will glitch. It will also mean you won't be able to sit with an idle CPU while playing music - the audio daemon will have to wake up to reload buffers multiple times per second, killing battery life unacceptably for playing audiobooks on a phone...

Saying rewindable audio is a non-feature might simplify the codebase, but if it makes it work badly for most use cases, it ought to be rethought.

One of the goals is low latency realtime audio to take the place of Jack. That requires small buffers frequently filled. I doubt that's very power hungry on today's systems. Also handling other tasks can be pushed on another core. So far it works better for me than Pulse did.

It did talk about being adaptive. So if you are just listening to music it should be able to use large buffers. However if you switch to something with low-latency demands it can start using smaller buffers.

My main concern is that without rewinding, how can you handle pressing play or pause? Sure, that music isn't realtime and can use large buffers, but if I start playing something else or stop the music, I still want it to be responsive, which may require remixing.

A 100ms delay between hitting pause and the pause happening is plenty fast for that use case. The same delay for pro audio mixing is way too long.

Except a typical desktop system is usually a mix of low latency and high latency audio streams. You're playing music, and you're typing on a 'clacky' virtual keyboard. The user doesn't want 100ms of lag with each finger tap till they hear the audible feedback. Yet when no typing is happening, the CPU doesn't want to be waking up 10x per second just to fill audio buffers.

The solution is to fill a 5 minute buffer with 5 minutes of your MP3 and send the CPU to sleep, and then if the user taps the keyboard, rewind that buffer, mix in the 'clack' sound effect, and then continue.

In some sense it’s worse on modern systems. Modern systems are pretty good at using very little power when idle, but they can take a while to become idle. Regularly waking up hurts quite a bit.

I'm not sure I understand this. Why can't you just increase buffer sizes and write more data to them to avoid frequency of wake ups?

Edit: does this help? https://gitlab.freedesktop.org/pipewire/pipewire/-/wikis/FAQ...

Because sometimes latency matters:

- You want a "ding" sound within Xms of the time a user performs some action.

- You want a volume change to happen within Yms of the user pressing the volume up/down keys

Without buffer rewinding, your buffer size when playing music cannot be longer than the smallest of such requirements.

With buffer rewinding, your buffer size can be very long when playing media, and if a latency-sensitive event happens, you throw away the buffer. This reduces wakeups and increases the batch size for mixing, which is good for battery life.
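
To make the rewind idea concrete, here is a toy sketch assuming samples are plain integers in a Python list and ignoring clipping and hardware details. The function name and parameters (`rewind_and_mix`, `play_pos`, `margin`) are invented for illustration; this is not PulseAudio's actual rewind code, which has to deal with far messier cases.

```python
def rewind_and_mix(queued, play_pos, margin, effect):
    """Re-mix an already queued buffer so `effect` starts soon after play_pos.

    queued   -- samples already handed to the (pretend) device
    play_pos -- index of the sample currently being played
    margin   -- safety gap in frames, so we never touch what is about to play
    effect   -- samples of the latency-sensitive sound (the "ding")
    """
    start = play_pos + margin
    out = list(queued)
    for i, sample in enumerate(effect):
        j = start + i
        if j < len(out):
            out[j] += sample   # mix into audio that was already queued
        else:
            out.append(sample)  # effect runs past the end of the queue
    return out


def added_latency_frames(queue_len, play_pos, margin, rewind):
    """Frames the "ding" waits before it is heard, with or without rewinding."""
    return margin if rewind else queue_len - play_pos
```

With a 5-second queue at 48 kHz and a 1 ms margin, `added_latency_frames(240000, 0, 48, True)` gives 48 frames, versus 240000 frames if the new sound has to wait behind everything already queued.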

The PipeWire people seem fairly smart, so they are probably aware of this, but I'd like to see power numbers on say a big-little ARM system of PW compared to PA.

Low latency is a goal. From video conferencing to music recording it's very important.

Except that professional music recording is a domain in which low latency always trumps power consumption. So if a system based on such a tradeoff fails even once due to complexity of buffer rewinding or whatever, the professional musician loses.

Hell, Jack could be re-implemented as power-hungry, ridiculous blockchain tech and if it resulted in round-trip latency / 2 professional musicians would still use it.

Edit: added "complexity of" for clarification

In practice it can't though. You cannot do complex computation and still meet low latency as complex computation takes time which adds to latency. Also pros intend to use their computer, so something that complex leaves less CPU free for the other things they are trying to do.

In practice the only way this comes into play is that pros are willing to fix their CPU frequency, while non-pros are willing to suffer slightly longer latency in exchange for their CPU scaling speed depending on how busy it is. It is "easy" to detect if CPU scaling is an allowed setting and if so increase buffer sizes to work around that.

Even for non-pros, low latency audio is important. You can detect delays in sound pretty quickly.

I think supporting the low latency usecase is a goal, but not the only one. As far as I understand it pipewire provides configurable latency.

Eh, just process all audio in a once-a-day batch job at 2am, it'll be great!

> Why can't you just increase buffer sizes and write more data to them to avoid frequency of wake ups?

Because then software volume will take too long to apply, and a new stream will have to wait (again, too long) until everything currently in the queue has been played. PulseAudio tried to solve this with rewinds, but failed to do it correctly. ALSA plugins other than hw and the simplest ones (that just shuffle data around without any processing) also contain a lot of code that attempt to process rewinds but the end result is wrong.

Here's a decent (and corny) explanation video from Google on the matter:


> Saying rewindable audio is a non-feature might simplify the codebase, but if it makes it work badly for most use cases, it ought to be rethought.

Rewindable audio is a non-feature because nobody except two or three people in the world (specifically, David Henningsson, Christopher Snowhill and maybe Georg Chini) can write non-trivial DSP code that correctly supports rewinds.

PipeWire has worked very well for me both as a drop-in replacement for PulseAudio and to enable screen sharing on Wayland.

It also worked as drop in replacement for PulseAudio for me, except all my audio now had stutters and pops. I ended up going back to Pulse.

I got suggestions that I could go tweak buffer sizes stuff in a config file somewhere, but for my simple desktop use case I'd rather my audio just sounds right out of the box.

Hopefully this sort of thing gets straightened out, because having to muck with config files to make my sound server actually work is like going back to working directly with ALSA or OSS.

I had a couple little issues as well when I switched over a couple months ago, but they just fell away over the ensuing weeks of updates until there's nothing left at the moment. Give it another try sometime soon.

I can confirm that an issue causing my audio to completely drop at random points was resolved about 1 month ago and now everything works perfectly.

FWIW I had issues with it on debian 10, until I built and installed it from master. It was a bit smoother on debian 11 but I couldn't get bluetooth to work.

This looks promising for Linux audio. I spent some time investigating the state of Linux audio servers a while back while diagnosing Bluetooth headset quality issues and ultimately opened this bug: https://bugs.launchpad.net/ubuntu/+source/pulseaudio/+bug/18...

Sounds like a lot of lessons have been learned from JACK, PulseAudio etc that have been factored in to the architecture of PipeWire. Maybe it really is the true coming of reliable Linux audio :)

Nah, another audio daemon is not what Linux needs IMO. This should be merged into the kernel, especially since process isolation is one of the stated goals. Running hard realtime stuff in a user space that is designed to not provide useful guarantees related to hard deadlines is brave, but ultimately somewhat foolish.

I know that there are arguments against having high quality audio rate resampling inside the kernel that are routinely brought up to block any kind of useful sound mixing and routing inside the kernel. But I think that all necessary resampling can easily be provided as part of the user space API wrapper that hands buffers off to the kernel. And the mixing can be handled in integer maths, including some postprocessing. Device specific corrections (e.g. output volume dependent equalization) can also fit into the kernel audio subsystem if so desired.

AFAIK, Windows runs part of the Audio subsystem outside the kernel, but these processes get special treatment by the scheduler to meet deadlines. And the system is built in a way that applications have no way to touch these implementation details. On Linux, the first thing audio daemons do is break the kernel provided interface and forcing applications to become aware of yet another audio API that may or may not be present.

This is just my general opinion on how the design of the Linux audio system is lacking. I am aware that it's probably not a terribly popular opinion. No need to hate me for it.

[End of rambling.]

Resampling in userspace and then sending it to the kernel is how it already works.. in ALSA. The only real problem with how ALSA does things is that you can't just switch the output (for example sound card to hdmi) for a running stream. PA solves this by basically being a network packet router (bus, switch, "sound daemon", however you want to call it). PulseVideo^H PipeWire, from what little I cared to look, is basically the same thing.

Another problem with ALSA, as well as PA, is that you can't change the device settings (sampling rate, bitrate, buffer size and shape) without basically restarting all audio. (note: you can't reeealy do it anyway as multiple programs could want different rates, buffers, and such)

In my opinion, the proper way to do audio would be to do it in the kernel and to have one (just one) daemon that controls the state of the system. That would require resampling in the kernel for almost all audio hardware. Resampling is not a problem really. Yes, resampling should be fixed-point, and not just because the kernel doesn't want floating point math in it. Controlling volume is a cheap multiply (or divide), mixing streams is just an addition (but with saturation, ofc).
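
As a small illustration of the "multiply for volume, saturating add for mixing" point, here is a sketch in Python using integer-only math on signed 16-bit samples. The Q15 gain format (32768 = unity gain) and the function names are assumptions made for this example, not anything ALSA or the kernel defines.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def apply_volume_q15(sample, gain_q15):
    """Scale a 16-bit sample by a Q15 fixed-point gain (32768 means 1.0)."""
    return (sample * gain_q15) >> 15


def mix_saturating(a, b):
    """Add two 16-bit samples, clamping instead of wrapping on overflow."""
    return max(INT16_MIN, min(INT16_MAX, a + b))


def mix_streams(stream_a, stream_b, gain_a_q15, gain_b_q15):
    """Apply per-stream volume, then mix sample by sample."""
    return [
        mix_saturating(apply_volume_q15(a, gain_a_q15),
                       apply_volume_q15(b, gain_b_q15))
        for a, b in zip(stream_a, stream_b)
    ]
```

For example, `apply_volume_q15(20000, 16384)` halves the sample to 10000, and `mix_saturating(30000, 10000)` clamps to 32767 rather than wrapping around to a negative value, which would be heard as a loud click.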

Special cases are one program streaming to another (ala JACK), and stuff like bluetooth or audio over the network. Those should be in userspace, for the most part. Oh, and studio hardware, as they often have special hardware switches, DSP-s, or whatever.

Sincerely; I doubt i could do it (and even if i could, nobody would care and the Fedoras would say "no, we are doing what ~we~ want"). So i gave up a long while ago. And i doubt anybody else would fight up that hill to do it properly. Half-assed solutions usually prevail, especially if presented as full-ass (as most don't know better).

PS Video is a series of bitmaps, just as audio is a series of samples. They are already in memory (system or gpu). Treating either of them as a networking problem is the wrong way of thinking, IMO. Only thing that matters is timing.

PPS And transparency. A user should always easily be able to see when a stream is being resampled, where it is going, etc, etc. And should be able to change anything relating to that stream, and to the hardware, in flight via a GUI.

Putting this into the kernel won't solve anything that isn't already solved with things like the Linux realtime patch. The way this works is that the applications themselves need to have a realtime thread to fill their buffer, and the audio daemon has to be able to schedule them at the right time, so it's not just the daemon that needs to have special treatment from the scheduler.

Also keep in mind that these audio daemons work as an IPC to route sound between applications and over the network, not just to audio hardware. Even if you put a new API in the kernel that did the graph processing and routing there, you would still likely need a daemon for all the other things.

It would solve the needless IPC, cache trashing, priority scheduling (since it becomes a kernel thread, instead of a userspace thread), and other busywork.

Would it? Linux does support realtime priority scheduling, JACK has worked this way for years. The thing is you need userspace realtime threads because that is what the clients need to use, it's not enough to change just the mixing thread into a kernel thread.

But one of the goals of this is to be able to handle video and audio together. (This enables an easier API for ensuring audio and video remain in sync with each other, which can be tricky in some scenarios when both use totally separate APIs.)

The other main goal is to simultaneously support both pro-audio flows like JACK, and consumer flows like PulseAudio without all the headaches caused by trying to run both of those together.

Lastly PipeWire is specifically designed to support the protocols of basically all existing audio daemons. So if the new APIs provide no benefit to your program, then you might as well just ignore it, and continue to use PulseAudio APIs or JACK APIs or the ESD APIs or the ALSA APIs or ... (you get the idea).

Now you are not wrong that audio is a real time task, and that there are advantages to running part of it kernel side (especially if low latency is desired, since the main way to mitigate issues from scheduling uncertainties is to use large buffers, which is the opposite of low latency).

On the other hand, I'm not sure an API like you propose will work as needed. For example, There really are cases where sources A, B, C and D need to be output to devices W, X, Y, and Z, but with different mixes for each, some of which might need delays added, effects (like reverb, compression, application of frequency equalization curves, etc) applied, and I have not even mentioned yet that device W is not a physical device, but actually the audio feed for a video stream to be encoded and transmitted live.

Try designing something that can handle all of that kernel side. Some of it you will have no chance of running in kernel mode obviously. That typically implies that everything before it in the audio pipeline ought to get done in user mode. Otherwise the kernel mode to user mode transition has most of the scheduling concerns that a full user-space audio pipeline implementation has. For things like per output device effects that would imply basically the whole pipeline be in user mode.

The whole thing is a very thorny issue with no perfect solutions, just a whole load of different potential tradeoffs. Moving more into kernel mode may be a sensible tradeoff for some scenarios, yet for others that kernel side implementation may be unusable, just contributing more complexity to the endless array of possible audio APIs.

> AFAIK, Windows runs part of the Audio subsystem outside the kernel, but these processes get special treatment by the scheduler to meet deadlines.

Assigning deadline-based scheduling priorities to the pipewire daemon wouldn't do the same job?

Isn't the deadline realtime scheduler optional? How many distros do actually ship it in their default kernels? I honestly didn't manage to keep track of this.

The deadline scheduler is upstream, see "man 7 sched" for a description: https://man7.org/linux/man-pages/man7/sched.7.html

What is not upstream (yet) is the PREEMPT_RT patch which makes all kernel threads fully preemptible.

Crossing the streams a bit, I'm wondering if there's enough grunt in EBPF to do mixing and resampling.

> Running hard realtime stuff in a user space that is designed to not provide useful guarantees related to hard deadlines is brave, but ultimately somewhat foolish.

So every VST/Virtual instrument in a DAW or for live performance should be running in the kernel? Because that's definitely a fresh take.

I only read this article, so I'm still fuzzy on the exact technical details, but couldn't a system like pipewire eventually be adopted into the kernel after it has proven itself adequate? Or is that not a thing the kernel does?

Probably not. Kernel handles the hardware. User-space deals with things like routing, mixing, resampling, fx, etc. Having that functionality outside of the kernel offers a lot more flexibility. Despite people chafing at the user-space audio API churn, it does allow advancements that would be much more difficult to do if implemented in the kernel.

Does it also work with WINE and manage MIDI devices, or is it audio only? From the project page on Gitlab it seems it doesn't; apologies for hijacking the thread if that's the case, but I'm out of ideas.

I'm currently looking for a last resort before reinstalling everything since probably after an apt upgrade all native Linux software kept working perfectly with all my MIDI devices while all WINE applications simply stopped detecting them, no matter the software or WINE version used. No error messages, they suddenly just disappeared from every WINE application but kept working under native Linux. Audio still works fine in WINE software, they just can't be used with MIDI devices because according to them I have none. WINE and applications reinstalls didn't work.

I'm currently trying pipewire on openSUSE Tumbleweed. I'm very impressed with it so far.

(After realizing it was broken, because I didn't have the pipewire-alsa package installed => No audio devices) The pulse drop-in worked flawlessly out of the box. I had some issues with the jack drop-in libraries tho. (metallic voice, basically not usable) To fix this, I had to change the sample rate in /etc/pipewire/pipewire.cfg from the default 48000 to 44100.

Have you been using some GUI to be able to control volume, etc? I've been holding out on replacing PA with PW on Tumbleweed until [1] is resolved so I can continue using pavucontrol (pavucontrol's deps will be satisfied by pipewire-pulseaudio and not only pulseaudio), which is currently blocked at [2] being accepted. Probably another week or so.

[1]: https://bugzilla.opensuse.org/show_bug.cgi?id=1182730

[2]: https://build.opensuse.org/request/show/875208

I'm just thinking of all the disparate use cases for Linux audio, all the disparate types of inputs/outputs, complex device types involved, etc.

But then I think about pro-audio:

* gotta go fast

* devices don't suddenly appear and disappear after boot

* hey Paul Davis-- isn't the current consensus that people just wanna run a single pro-audio software environment and run anything else they need as plugins within that environment? (As opposed to running a bazillion different applications and gluing them together with Jack?)

So for pro-audio, rather than dev'ing more generic solutions to rule all the generic solutions (and hoping pro-audio still fits one of the generic-inside-generic nestings), wouldn't time be better spent creating a dead simple round-trip audio latency test GUI (and/or API), picking a reference distro, testing various alsa configurations to measure which one is most reliable at the lowest latency, and publishing the results?

Perhaps start with most popular high-end devices, then work your way down from there...

Or has someone done this already?
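For what it's worth, the measuring step of such a test is small. Below is a minimal sketch of the idea: play a known click, record it through a loopback, and find the lag where the recording best matches the click. Everything here is an assumption for illustration: the 48000 Hz rate, the click shape, and the function name. The "recording" is simulated in memory rather than captured from a real device.

```python
def estimate_delay_samples(played, recorded):
    """Return the lag (in samples) at which `recorded` best matches `played`."""
    best_lag, best_score = 0, float("-inf")
    max_lag = len(recorded) - len(played)
    for lag in range(max_lag + 1):
        # Plain cross-correlation score at this lag.
        score = sum(p * recorded[lag + i] for i, p in enumerate(played))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


SAMPLE_RATE = 48000  # assumed; a real tool would ask the device
click = [1.0, -1.0, 0.5, -0.5]  # hypothetical test signal

# Simulated loopback: the click comes back 480 samples later, attenuated.
simulated_delay = 480
recorded = [0.0] * simulated_delay + [0.3 * s for s in click] + [0.0] * 100

lag = estimate_delay_samples(click, recorded)
latency_ms = lag * 1000 / SAMPLE_RATE
print(f"round trip: {lag} samples = {latency_ms:.1f} ms")
```

A real test would replace the simulated list with samples captured over a loopback cable, and would need a longer, noisier-looking test signal to be robust.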

Can't pro audio also mean plugging in your laptop at a nightclub and performing? Why is pro audio limited to things as unchanging as a permanent recording studio? Someone making tunes on their laptop in their bedroom can also require 'pro audio'.

But there's pro audio in a studio, and then there's people like me who occasionally record stuff on our normal desktop systems and find it annoying to remember / look up how to switch audio stacks.

This is an exciting development. As someone who has supported desktop Linux audio for community radio users, I've found it very frustrating at times how things don't work. I remember going to a presentation on Linux audio at Ohio LinuxFest a decade ago, and recently I decided to dive in and see what the best way is to put together a user-friendly and foolproof audio setup (easier said than done). I found that JACK is still too complicated for novices to set up and PulseAudio can just be inconsistent. So PipeWire seems like it has a lot of potential, and I'm excited that people are working on this. It'll perhaps make Linux audio better able to compete with Core Audio and whatever audio subsystem Windows uses. I especially appreciate that the flexibility and modularity allow both professional and consumer applications. The future is bright.

The real link: https://pipewire.org/

In this case, I disagree. LWN is always well worth reading.

pipewire already works surprisingly well. Even bluetooth source and sink works.

Only problems I had so far are:

* sometimes bluetooth devices would connect but not output audio, have to restart pipewire.

* sometimes pipewire gets confused and doesn't assign audio outputs properly. (shows up in pavucontrol as "Unknown output")

What does real world latency look like with Pipewire?

Is it comparable to jackd when used with something like Ardour?

Just tried recording from my laptop's mic - the reported latency from Ardour was 5,3ms (with 256 samples/buffer). It crashed when I went down to 128 samples/buffer.
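That figure lines up with simple buffer arithmetic: one buffer of N frames takes N / sample_rate seconds to play. A quick sketch, assuming a 48000 Hz sample rate (the rate isn't stated above, so that part is a guess):

```python
def buffer_latency_ms(frames: int, sample_rate_hz: int) -> float:
    """Duration of one buffer of `frames` samples, in milliseconds."""
    return frames / sample_rate_hz * 1000.0


ASSUMED_RATE_HZ = 48000  # assumption; not stated in the comment above

for frames in (256, 128):
    ms = buffer_latency_ms(frames, ASSUMED_RATE_HZ)
    print(f"{frames} frames @ {ASSUMED_RATE_HZ} Hz = {ms:.2f} ms")
```

At that rate, 256 frames comes to about 5.33 ms, and 128 frames would halve it to about 2.67 ms. This covers only one buffer period, not the full round trip through the hardware.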

It has a jack and pulseaudio api wrapper. I use it because it means I don't need to muck around with configuring pulse and jack to work nicely together.

Does anyone know if pipewire has its own audio protocol for applications, as well as taking the place of JACK and Pulse? Or will future applications still just decide whether to talk to "JACK" or "Pulseaudio"? (Both actually being pipewire)

It does. The support for Pulse and JACK APIs is to ease adoption.

I wanted to give this a spin, but it’s seemingly not packaged in a meaningful way on Ubuntu yet. That is, there is no pipewire-pulse, pipewire-jack, etc.

Oh well. Maybe next version?

I tried to get it running a few weeks ago on Ubuntu, but gave up. It's pretty simple on Arch or Fedora currently, but Ubuntu seems to be lacking the necessary packages. Hopefully soon.

I just tried building it from source, and in the end I got everything right... On paper.

I got pulseaudio wrapped by pipewire-pulse, and applications launched and acted as if they produced audio (as opposed to being blocked), but I still couldn't get any sound.

Granted I have a complicated setup with a laptop with 2 HDMI outputs, 2 USB-soundcards, built-in headphone adapter and a built-in speaker.

Obviously that setup is going to take some configuration to get right, but with PulseAudio I could set it up pretty quickly with pavucontrol.

No such luck with PipeWire ... yet. I guess it will get there sooner or later :)

Can somebody please explain what this means for a user who installed `pulseaudio` once long ago and never had to bother about audio at all?

I'm curious as well.

I've seen lots of folks talking about pipewire, but I'm a simple audio user - I want software mixing and audio out via a headphone jack and that's all.

I'm pretty sure for most folks we'll just wait until our distro decides to move over, it'll happen in the background, and we'll not notice or care.

I actually read the whole thing as I have been wondering what this new word is, PipeWire, for a while now. I actually understood like 30% of this and think I'll get even more out of it in the future having read this.

Does anyone know if PipeWire can do audio streaming like PulseAudio can? I had a rather nice setup using a raspberry pi and an old stereo system a while back that I'd like to replicate.

TCP sockets appear to still be supported by pipewire-pulse.

I hope this fixes my issues with bluetooth on Linux. When I'm on battery the audio breaks all the time. I've tried all sorts of obscure config tweaks with Pulseaudio.

This looks like you're using very aggressive power settings in the kernel (powertop, tlp?), and doesn't seem related to PulseAudio at all.

I hope KDE will implement direct Pipewire support for general audio controls, to avoid going through the PulseAudio plugin.

Everybody had trouble with PulseAudio, even people who liked it in principle.

LP wasn't joking about breaking sound: things did break, many, many times for many, many people, for years. And, almost always the only information readily available about what went wrong was just sound no longer coming out, or going in. And, almost always the reliable fix was to delete PA.

But it really was often a consequence of something broken outside of PA. That doesn't mean there was always nothing the PA developers could do, and often they did. The only way it all ended up working as well as it does today--pretty well--is that those things finally got done, and bulldozed through the distro release pipelines. The result was that we gradually stopped needing to delete PA.

Gstreamer crashed all the damn time, for a very long time, too. I never saw PA crash much.

The thing is, all that most of us wanted, almost all the time, was for exactly one program to operate on sound at any time, with exactly one input device and one output device. UI warbling and meeping was never a high-value process. Mixing was most of the time an unnecessary complication and source of latency. The only complicated thing most of us ever wanted was to change routing to and from a headset when it was plugged or unplugged. ALSA was often wholly good enough at that.

To this day, I have UI warbling and meeping turned off, not because it is still broken or might crash gstreamer, but because it is a net-negative feature. I am happiest that it is mostly easy to turn off. (I wish I could make my phone not scritch every damn time it sees a new wifi hub.)

Pipewire benefits from things fixed to make PA work, so I have expectations that the transition will be quicker. But Pipewire is (like PA and Systemd) coded in a language that makes correct code much harder to write than buggy, insecure code; and Pipewire relies on not always necessarily especially mature kernel facilities. Those are both risk factors. I would be happier if Pipewire were coded in modern C++ (Rust is--let's be honest, at least with ourselves!--not portable enough yet), for reliability and security. I would be happier if it used only mature kernel features in its core operations, and dodgy new stuff only where needed for correspondingly dodgy Bluetooth configurations that nobody, seriously, expects ever to work anyway.

What would go a long way to smoothing the transition would be a way to see, graphically, where it has stopped working. The graph in the article, annotated in real time with flow rates, sample rates, bit depths, buffer depths, and attenuation figures, would give us a hint about what is failing, with a finer resolution than "damn Pipewire". If we had such a thing for PA, it might have generated less animosity.

For the vast majority of audio needs Rust is portable enough to cover x86, ARM, MIPS-based platforms. Probably 99% of users that need this kind of audio/video multiplexing.

99% is not portable enough. Of 1B targets, 99% leaves 10 million targets unsupported. If each person has 100 targets (e.g. 30+ in your car), that means it doesn't work on something at least half of us depend on; and many others are only Tier 3. This is why I say "let's be honest, at least with ourselves". Pipewire that depends on Rust would be DOA. But Modern C++ works everywhere, and provides 90% of the safety, plus stuff Rust doesn't, yet, and other stuff Rust never will; and better performance than C.
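The arithmetic behind those two numbers, as a rough sketch. It assumes each target is supported independently with the same 99% chance, which is a simplification, and the function names are made up for illustration:

```python
def unsupported_targets(total_targets: int, coverage_percent: int) -> int:
    """How many targets fall outside the covered percentage."""
    return total_targets * (100 - coverage_percent) // 100


def chance_any_unsupported(coverage: float, targets_per_person: int) -> float:
    # Assumes independence between targets, which real devices won't satisfy exactly.
    return 1.0 - coverage ** targets_per_person


print(unsupported_targets(1_000_000_000, 99))      # targets left out of 1B
print(f"{chance_any_unsupported(0.99, 100):.0%}")  # person with 100 targets
```

Under that assumption, 1% of 1B is 10 million targets, and someone with 100 targets has roughly a 63% chance of owning at least one unsupported one.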

I have never had problems with audio on Linux. What problems does this solve?

Better sandboxing, basically.

A system was needed for video, turns out it was a good fit for audio.

Audio and video aren't that different, TBH (audio just has more alpha/blending rules, and lower tolerance on missed frames; video has higher bandwidth requirements). Wouldn't surprise me if both pipelines eventually completely converge. Both "need" compositors anyways.

Consumer audio already works reasonably well, but this apparently brings massive improvements for bluetooth, especially the HFP profile, which is used with a headset's built-in mic.

The main benefit imo is to pro audio so you don't need to configure separate tools and manually swap between pulse and jack every time you want pro audio.

It also manages permissions to record audio and the screen for wayland users.

Same thing that ALSA, esd, Pulseaudio and Phonon solved: the previous incarnation itched.

This is giving me xkcd "Standards" vibes.


I hope I'm wrong. There is a lot of potential to do better in that realm.

Given that it supports ALSA, PulseAudio, and JACK, I don't think it's like that at all. Assuming it works, it consists of both all the other standards and a new one, keeping existing applications working while adding its own new advantages.

Yeah but: the key differentiator to me is supplying drop-in replacements / adapters for all the other standards from early on in the process. This is why it isn't #927 I say...

I believe Mars has the largest percentage by planet of linux machines with working sound.

Is this audio/video bus a result of the space program?

It is driven mainly by automotive uses. Modern instrument clusters in all cars are running Linux, and need to handle sound and video streams of terrifying variety, including dashcams, back-up cams, sirius radio, phone bluetooth, and more to come, directed to various display devices including actual screens, the instrument cluster, speakers, and phone calls.
