Under the hood, this is essentially just a Python wrapper around JUCE (https://juce.com), a comprehensive C++ library for building audio applications. We at Spotify needed a Python library that could load VSTs and process audio extremely quickly for machine learning research, but all of the popular solutions we found either shelled out to command line tools like sox/ffmpeg, or had non-thread-safe bindings to C libraries. Pedalboard was built for speed and stability first, but turned out to be useful in a lot of other contexts as well.
I was just about to look for a library to layer 2 tracks (a text-to-speech "voice" track, and a background music track) and add compression to the resulting audio.
A few questions if you don't mind:
- Pedalboard seems more suited to processing one layer at a time, correct? I would be doing the muxing/layering (i.e. automating the gain of each layer) elsewhere?
- Do you have a Python library recommendation for muxing and adding silence in audio files/objects? pydub seems to be ffmpeg-based. Is that a better option than a thin libsndfile wrapper such as SoundFile?
That's correct: Pedalboard just adds effects to audio, but doesn't have any notion of layers (or multiple tracks, etc). It uses the Numpy/Librosa/pysoundfile convention of representing audio as floating-point Numpy arrays.
Mixing two tracks together could be done pretty easily by loading the audio into memory (e.g.: with soundfile.read), adding the signals together (`track_a * 0.5 + track_b * 0.5`), then writing the result back out again.
Adding silence or changing the relative timings of the tracks is a bit more complex, but not by much: the hardest part might be figuring out how long your output file needs to be, then figuring out the offsets to use in each buffer (i.e.: `output[start:end] += gain * track_a[:end - start]`).
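A rough sketch of that approach (assuming soundfile for I/O, both files at the same sample rate with the same channel layout; the filenames are placeholders):

    import numpy as np
    import soundfile as sf

    voice, sr = sf.read("voice.wav")
    music, music_sr = sf.read("music.wav")
    assert sr == music_sr, "resample first if the rates differ"

    offset = 2 * sr  # start the voice track two seconds in
    length = max(len(music), offset + len(voice))

    # Mix into one output buffer, applying a static gain per layer.
    output = np.zeros((length,) + voice.shape[1:], dtype=voice.dtype)
    output[:len(music)] += 0.5 * music
    output[offset:offset + len(voice)] += 0.8 * voice

    sf.write("mixed.wav", output, sr)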
For layers, I could have an array that represents "gain automation" for each layer, and then let numpy do `track_a * gain_a + track_b * (1-gain_a)` for the whole output in one go.
And I'd create silences by inserting 0's (and making sure that I'm inserting them after a zero crossing point to avoid clicks)
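e.g. a quick sketch of the idea, with toy mono signals standing in for real tracks:

    import numpy as np

    sample_rate = 44100
    n = 5 * sample_rate  # five seconds of placeholder mono audio
    track_a = np.random.default_rng(0).uniform(-0.1, 0.1, n)
    track_b = 0.1 * np.sin(2 * np.pi * 440 * np.arange(n) / sample_rate)

    # Per-sample "gain automation": here, a linear crossfade from a to b.
    gain_a = np.linspace(1.0, 0.0, n)
    output = track_a * gain_a + track_b * (1.0 - gain_a)

    # Silence is just a splice of zeros; a short fade on either side of
    # the cut also avoids clicks, without hunting for a zero crossing.
    silence = np.zeros(sample_rate // 2)
    output = np.concatenate([output[:n // 2], silence, output[n // 2:]])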
I'm prone to NIH :-) but I'll also try to see if something like this exists. But at least -- it's clearly do-able/prototype-able!
(Fun fact: the Echo Nest's Remix API was what got me interested in writing code way back in high school. Now, more than a decade later, I'm the tech lead for the team that owns what's left of it. I still can't believe that sometimes.)
Pedalboard would also be usable in situations that are tolerant of high latency and jitter, though, given that all audio gets handed back to Python (which is garbage-collected and has a global interpreter lock) after processing is complete.
There's definitely a lack of cross-platform VST hosts that don't require a DAW.
Also, can Pedalboard support VST GUIs?
Pedalboard doesn't support GUIs at the moment, but there's an issue on GitHub to track that: https://github.com/spotify/pedalboard/issues/8
> Machine Learning (ML): Pedalboard makes the process of data augmentation for audio dramatically faster and produces more realistic results ... Pedalboard has been thoroughly tested in high-performance and high-reliability ML use cases at Spotify, and is used heavily with TensorFlow.
What are the actual use cases internally at Spotify and for the public here?
> Applying a VST3® or Audio Unit plugin no longer requires launching your DAW, importing audio, and exporting it; a couple of lines of code can do it all in one command, or as part of a larger workflow.
I wonder how many content creators are more comfortable with Python than with a DAW or Audacity?
> Artists, musicians, and producers with a bit of Python knowledge can use Pedalboard to produce new creative effects that would be extremely time consuming and difficult to produce in a DAW.
Googling "how to add reverb" yields Audacity as the first option. A free, open source tool available on Linux+Win+Mac. In what world is it easier to do this in Python for Artists, musicians and producers?
As a music producer who's well versed in Python myself (even if I hadn't switched to producing almost entirely out of the box, on modular/hardware synths), I'd much rather just apply basic effects like these in a DAW or Audacity. Accessing and patching a live audio stream there is much easier than figuring out how to do that in Python, where you can only apply effects to .wav files rather than live audio.
As for creators: maybe not a large fraction of music creators are coders, but there's certainly an intersection in that Venn diagram, though I have no idea how large it is. And I imagine this could be used to create other tools that don't require coding.
Clearly, most of the time it makes more sense to apply FX interactively in your DAW of choice, but I find it useful to programmatically modify audio sometimes. For example, I've written quick scripts using sox and other tools to normalize/resample audio, as well as slice loops. I could see how being able to add other FX such as compression or maybe even reverb programmatically could be occasionally useful.
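For instance, here's the kind of quick script I mean: peak-normalize a file to -1 dBFS (assuming soundfile for I/O; the filenames are placeholders):

    import numpy as np
    import soundfile as sf

    audio, sr = sf.read("loop.wav")
    peak = np.max(np.abs(audio))
    if peak > 0:
        # Scale so the loudest sample sits at -1 dBFS.
        audio *= (10 ** (-1 / 20)) / peak
    sf.write("loop-normalized.wav", audio, sr)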
I think you would be surprised to know how large that middle spot in the Venn diagram is.
A way to do such manipulation that is both convenient to use from Python (a major programming language in the field and well tied in to the major frameworks) and performant is extremely welcome.
This opens up possibilities such as version control, collaboration via PRs, the regular coding workflow, etc.
(I am dabbling with music and Elixir + Rust at the moment, and definitely interested by what Pedalboard brings, including programmatic VST hosting etc).
That's also not really true. There are plenty of scriptable VST hosts and libraries. BASS (the library), for instance, has been around for ages, and I've used it to host VSTs in scripted workflows.
This is useful for me, both for ML and for adding effects to sounds. I'd much rather use Python than a DAW. I know enough signal processing to prefer running code I can inspect rather than using some opaque GUI.
This opens up the potential for a simple GUI: a basic user could drag and drop an audio file and flip virtual switches. Or, easier integration into a mobile "podcast creator" app.
Somebody who had been with the company for a long time predicted that the broadcast world was going to end up demanding a box with just 3 buttons:
[ That was worse ]
[ That was better ]
[ Try something else ]
And now, 12 years later ...
REAPER (the DAW) has APIs in Lua and Python.
This would be really useful, because you'd be able to programmatically process audio on tracks using pre-made effects, and let users create and share VST settings presets.
So users could write scripts which apply chains of FX directly to audio clips on tracks, grabbing the audio sources/files from the DAW programmatically. And a community library of useful FX chains could emerge from this.
Oh man -- Imagine a graphical node-based UI where the user can place nodes as FX and route the audio file through a series of FX nodes, with tunable params.
This is entirely doable!!
cat sound.wav | distortion | reverb | aplay
alias fx="sox - -t wav -"
cat foo.wav | fx overdrive | fx reverb | play -
(Note that in practice you'd use sox's play command directly to apply effects, as it's certainly much more efficient than spinning up a ton of processes that read from stdin/stdout.)
$ cat sound.wav | pedalboard --compressor --compressor-threshold-db=-50 --gain --gain-db=30 | aplay
You can't really do it without that, because sound.wav contains both actual audio data and "metadata".
In the real world however, almost nobody who has done this sort of thing actually wants to do it that way. The processing always has a lot of parameters and you are going to want to play with them based on the actual contents of sound.wav. Doing this in realtime (listening while fiddling) is much more efficient than repeatedly processing then listening.
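Still, for what it's worth, the offline version of that hypothetical command is only a few lines with Pedalboard's Python API (a sketch, assuming soundfile for file I/O; the filenames are placeholders):

    import soundfile as sf
    from pedalboard import Pedalboard, Compressor, Gain

    audio, sample_rate = sf.read("sound.wav")

    # Same chain as the imagined CLI above: compress, then boost.
    board = Pedalboard([Compressor(threshold_db=-50), Gain(gain_db=30)])
    effected = board(audio.astype("float32"), sample_rate)

    sf.write("processed.wav", effected, sample_rate)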
I thought about doing it, but don't need it that badly, and you know: so many ideas, so little time!
I was very excited to see this, but with a GPL license I can't use it in my projects.
That makes sense.
I definitely do appreciate it; I couldn't figure out JUCE when I tried to use it.
Well then, they could:
* use Faust or SOUL
* use existing plugins in LV2, VST3, or AU formats
* write a new plugin in LV2, VST3, or AU formats
* use SuperCollider or Pure Data, or any of more than a dozen live-coding languages
* use VCV Rack or Reaktor or any of at least a half-dozen other modular environments to build new processing pathways.
Oh wait ...
So it's not actually for programmers at all; it's for people "with a bit of Python knowledge".
OK, maybe I'm being a bit too sarcastic. I just get riled up by the breathless BS in the marketing copy for this sort of thing.
It's a plugin host, with the ability to add your own python code to the processing pathway. Nothing wrong with that, but there's no need to overstate its novelty or breadth.
[ EDIT: if I hadn't admitted to my own over-snarkiness, would you still have downvoted my attempt to point out other long-available approaches for the apparent use-case? ]