"Whereas with LV2, you can actually say: 'Here is the spec for this extension. And it's written down and this is how it works'. So I think this is a great thing and I don't think it'll go away."
Isn't this merely a matter of documentation? See for example the REAPER VST 2.x extensions (which you've mentioned): https://www.reaper.fm/sdk/vst/vst_ext.php
With VST 3.x (which has a COM-like API) you can even add entire custom interfaces that the plugin or the host, respectively, can query.
LV2 uses URIs to identify extensions, and there is an expectation that the URI is based on a URL controlled by the extension's proponents. Namespacing, put simply.
VST3. Cough. Choke. Do you know that the SDK Steinberg distributes for VST3 is larger than the entire Ardour codebase?
I tend to disagree. What else is the purpose of the effVendorSpecific/effCanDo and audioMasterVendorSpecific/audioMasterCanDo opcodes?
> two organizations could "define" an extension that uses the same integer value for the audiomaster callback
Yes, this is indeed a problem.
> LV2 uses URI's to identify extensions
And VST3 uses COM-like interfaces with GUIDs. I think it's just two different ways to achieve the same thing.
> Do you know that the SDK Steinberg distributes for VST3 is larger than the entire Ardour codebase?
:-) 99% of the VST3 "SDK" is just unnecessary cruft, the actual plugin interface is pretty small (although not as small as VST2). I've written a cross-platform VST3 host, and "pluginterfaces" is really all you need (https://github.com/steinbergmedia/vst3_pluginterfaces). I don't think the actual VST3 plugin API is significantly larger than LV2.
I have mixed feelings about VST3. I see some of the advantages, but some design decisions are just awful. In fact, some things, like multi-channel support, are even worse than with VST2 (see https://github.com/steinbergmedia/vst3sdk/issues/28).
But anyway, Robin Gareus has already implemented the bulk of VST3 support inside Ardour, so we should see that emerge sometime (7.0 or before).
Me neither :-) A C API should still be the weapon of choice.
> Robin Gareus has already implemented the bulk of VST3
Cool! I remember I had a short e-mail conversation with Robin about the VST3 SDK last year after he kindly helped me to get his lv2vst plugin to work in my host. Looks like he started to work on the VST3 implementation shortly after that :-)
It's for Pure Data and Supercollider.
GitHub mirror: https://github.com/spacechild1/vstplugin
And thanks to you and @prokoudine for this great series of interviews.
I was very surprised to read:
> But we do now generally tell new users "You don't have to use JACK. And in fact, if you don't use JACK, your initial experience is going to be a lot easier". That's particularly true for MIDI devices. Most people using JACK2 have to go through some extra loops to actually get hardware to show up. Whereas if they use the ALSA backend on Linux, it just works.
> So JACK will be there, we will suggest and make it more and more obvious that JACK is not the obvious thing for you to use.
I recently helped a friend set up his Linux laptop to record audio (USB interface, mics to record acoustic instruments). I installed Ubuntu 20.04 and used the great Ubuntu Studio tools to set up JACK. It's still a pain, as you mention, to save/restore session states for my friend, and to tune the settings for latency vs. xruns.
My friend doesn't need to route audio from one program to another, so I guess he could just use ALSA directly, but then how can he monitor/optimize the latency?
You say that for end users, ALSA just works, and you tend to encourage it for newbies. What are the tradeoffs? What kind of latency penalty am I taking by using ALSA vs JACK? Can it scale to as many channels?
What's your preferred JACK frontend? The UbuntuStudio UI has some nice bridging set up out of the box, so you can e.g. play Youtube while Ardour or Reaper is active. QJackCtl certainly works (and offers a UI around the patchbay you describe) but it doesn't have that feature at least.
Does Pipewire aim to solve the problem of persistent device names? I had a bit of an adventure figuring out how to set up ALSA so that it didn't constantly give new numbers to my audio interface, USB midi IO, control surface, etc (which was painful, because it meant having to reconfigure the DAWs after each reboot.) Once I found the right howto, it wasn't too bad, but it'd be nice if that was taken care of.
Do you have any background knowledge on Bitwig's weird MIDI setup? They support JACK audio, but not JACK midi, so you have to do this weird juggling act to bridge midi to ALSA, which was too much friction for me to bother with (and so I don't use Bitwig.)
Finally, is there much difference between audio interfaces in terms of latency as long as they are USB class compliant? If there is, are there any good references that you know of? Or is it a case of experimentation?
I use QJackCtl. I use an ALSA loopback device to bridge between PulseAudio and JACK. Documented at http://ardour.org/jack-n-pulse.html
I don't know the answer to the question about Pipewire. However, on device names: https://jackaudio.org/faq/device_naming.html
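In short (per that FAQ), ALSA can address cards by name rather than by index, and names are stable across reboots while indices are not. A configuration sketch; the card name "Scarlett" is an assumption, check what your system reports:

```shell
# List cards: the name in brackets is stable across reboots.
cat /proc/asound/cards

# Use the name, not the index, when starting JACK:
jackd -d alsa -d hw:Scarlett -r 48000 -p 256
```

The same `hw:NAME` form works anywhere ALSA device strings are accepted, e.g. in a DAW's device selector or an `.asoundrc`.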
We have had users complaining about Ardour in the context of Bitwig's weird MIDI, indeed. I've never spoken to anyone involved in Bitwig development. I could make guesses about why they did what they did, but they would just be guesses. It is certainly less than ideal.
USB audio on Linux sadly incurs an extra latency penalty due to a kernel-side buffer whose size varies every time the device is opened/started. This means that you will not see constant latency numbers for any USB audio interface on Linux (there are some attempts being made to fix this). However, other than that, they are all functionally equivalent, certainly from a latency perspective. I would give MOTU's recent devices a shout out purely because they've done all device configuration (the "device panel") via a web browser, and thus removed the one barrier that exists for a number of devices on Linux - the audio/MIDI side works, but you cannot configure it since that requires a dedicated Windows/macOS tool. MOTU were the first company I know of to do this, and despite their virulently anti-Linux attitude in the past, it really makes Linux a first class platform for their newer devices (ignoring a few SNAFUs with the firmware, unfortunately).
I should've linked to it directly perhaps :)
What are your thoughts on MPE and Midi 2.0 in Ardour?
I have not looked at MIDI 2.0 yet in any level of detail, amazing as that may seem. I don't think MIDI 2.0 really brings much to the table for most users of MIDI, and what it does bring is largely addressed by MPE. However, given some of the deeper changes, it would make sense that whenever we try to tackle the "MPE model", we also pay attention to what MIDI 2.0 requires at the same time.
Having a MIDI device be able to describe and name its parameters and then be able to ... let me quote Sound on Sound:
"""MIDI‑CI will allow DAWs to discover a lot more about external gear than they can at the moment, and might even allow editing panels to be built automatically."""
I have a lot of gear that will never speak MIDI 2.0, but my assumption is that some clever person will make a proxy (with a big list of existing gear mappings) that would let me select that gear and then expose a MIDI 2.0 interface to the DAW or to a MIDI 2.0 keyboard controller.
This would really be huge for me and just make life a lot better around the studio.
It has gone nowhere.
There are a number of reasons why, but I would wager that the most important is that people simply do not need this kind of complexity. It is the kind of thing that OEMs can use to build cool devices, rather than the kind of thing the overwhelming majority of users ever want to deal with.
Note that the MIDNAM specification also does a significant part of what you describe above, and although it's not true to say that it has gone nowhere, the industry essentially abandoned it rather than push it forward.
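For reference, MIDNAM can name controllers as well as patches. A minimal fragment, hedged: the device, manufacturer, and control names below are invented, and the element names are from memory, so check them against the MIDNAM DTD before relying on this:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<MIDINameDocument>
  <MasterDeviceNames>
    <Manufacturer>ExampleCo</Manufacturer>
    <Model>ExampleSynth</Model>
    <ControlNameList Name="Controls">
      <Control Type="7bit" Number="74" Name="Filter Cutoff"/>
      <Control Type="7bit" Number="71" Name="Resonance"/>
    </ControlNameList>
  </MasterDeviceNames>
</MIDINameDocument>
```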
I apologize for sounding cynical or skeptical about MIDI 2.0. MIDI 1.0 was a technical and social engineering miracle. But having been around the audio tech industry for 20+ years, I'm not optimistic that the MMA's processes are capable of replicating what was so good about MIDI 1.0, or of avoiding the screw-ups they've been involved with since then.
MIDI 2.0 seems like it's aimed at a sweet spot that will solve real problems.
I'm looking at the midnam files that come with Ardour...
It seems like these are focused on just named patches? I see the spec can do more, like parameter names, but it doesn't seem like that's been the focus? Is there a bigger pile of work somewhere?
Thanks, I might poke at this a bit and see what I can get working in Ardour.
But "is Linux finally ready for audio stuff like mac?" is a meaningless question because there are as many different workflows and definitions of "audio stuff" as there are styles of music.
Also, in this context, even the term "Linux" is not well-defined. You probably mean "Linux on a conventional desktop or laptop computer", but there are many high-end digital audio mixing devices you could buy from any major audio tech company that run Linux internally. That's not what you meant, probably, but it's still Linux, which makes the term a bit unclear in this context.
Paul: I’m curious which “industry / workflow” is likely to be the best driver of improvements in Linux audio. Is film production or post-production the best bet?
(My assumption is that as long as musicians “grow up” on macOS and Logic and so on, you’re fighting with the “perennial year of the Linux desktop”. That wouldn’t apply to audio production teams who don’t need to also use their machines for their personal life)
Consumer/desktop audio on Linux will only advance (to the extent that it needs to - it already works quite well) because of people just sitting down and doing the work. Nothing is "driving" this forward.
The problems really arise at the junction between the two worlds: think bedroom/basement music/video production on your one laptop. This is the part that macOS/CoreAudio gets so right - there's no difference in any of this for consumer apps or "pro" apps. On Linux, you have to grapple with the awkward disconnect between these two "workflows". It's not THAT hard, but it's not THAT easy either. And this affects developers just as much as it affects users: which APIs to use?
Well, you probably need an application which has real-time requirements. This has always been the most difficult aspect of audio processing pipelines.
Real-time audio programming within applications tends to be a problem, mostly because too many developers have never read, or do not understand, the basic rules of real-time programming: no blocking, no memory allocation, and no unbounded work in the audio thread.
I haven’t exactly pestered everyone with a notepad and pencil taking a survey so I do not have a citation. I’d be VERY curious to hear a credible citation to the contrary showing that Macs are growing in use or even maintaining steady market share. That’s not what I see out in the world.
OK, the situation in studios might be different (but I would be curious to see actual numbers). In a live show you want reliable low-latency audio (especially if you play VSTis on a keyboard or do any kind of live processing), and macOS is very good at this, better than Windows I would say. Dismissing this as "fashion" is a bit simplistic.
I'm a computer musician and audio programmer. Although I use Windows most of the time, I would be the first one to admit that the audio situation on Windows is a total mess. As Paul said, on macOS you have a single API and it just works (tm). On Windows, we needed an external company (Steinberg) to come up with a usable solution for low latency audio (ASIO).
I tried editing lots of files, installing jack, uninstalling jack, googling, and bashing my head against the table for hours. Nothing helped. Is this impossible or is there a buffer setting I could tweak somewhere?
Ask this question at linuxmusicians.com and you will likely get some more helpful answers.