
DDSP: Differentiable Digital Signal Processing - matt_d
https://github.com/magenta/ddsp
======
abaga129
Wow what a coincidence! I got excited because I thought someone had posted my
project which is also called Ddsp! Mine stands for D Digital Signal Processing
since it is written in D.

Just going to shamelessly post it here.
[https://github.com/ctrecordings/ddsp](https://github.com/ctrecordings/ddsp)

It's always great to see more DSP projects though!

~~~
jessejengel
Sorry about that :), definitely an interesting coincidence. Funny enough, we
actually were thinking of naming it dDSP (as in derivative of DSP) but had
already submitted the ICLR paper, so we just stuck with all caps (also because
python is all lowercase).

------
ssalazar
arXiv link for more context:
[https://arxiv.org/abs/2001.04643](https://arxiv.org/abs/2001.04643)

and blog posting with audio examples:
[https://magenta.tensorflow.org/ddsp](https://magenta.tensorflow.org/ddsp)

------
PieSquared
Are you folks planning on extending this to speech? I've always been
disappointed by how speech vocoder networks aren't built with any great
inductive biases for waveform generation (besides very long receptive fields),
and have desperately wanted something like this tuned for speech. It'd be
great if a DSP-based architecture could be shown to outperform WaveNet /
Parallel WaveNet / WaveRNN / WaveFlow / etc, and I'd love to use that in our
own work. (There's been some attempts based on source-filter models like the
"neural source filter (NSF) network", but nothing's caught on as best as I can
tell.)

------
unlinked_dll
I have a few questions for the authors:

\- (w.r.t time varying FIRs) How did your results compare to traditional
NLMS/adaptive approaches? Were you able to achieve similar results with fewer
CPU cycles/lower filter order?

\- (also w.r.t FIRs) Have you looked at your approach as more
general/nonlinear model of adaptive filtering?

\- How do you deal with highly correlated parameters in your models?

\- (w.r.t dereverberation) How does your approach compare in fidelity and
performance to homomorphic filtering approaches for deconvolution?

~~~
jessejengel
Hi I'm Jesse, one of the authors, thanks for the interesting questions!

\- In terms of the FIRs, I think you can think of this as a form of more
general/nonlinear filter modeling. The difference, I think, is that you can
have a filter as one of several components and adapt them all jointly to
achieve some task (which itself can be more flexibly defined: different
losses, adversarial, etc.). The filter itself is still just an LTV-FIR, but
it's being controlled nonlinearly. We have only examined synthesis so far, but
other signal processing problems like denoising are definitely good
directions. The "effects" processors are designed for this.
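A rough numpy sketch of that idea (names and the toy "controller" are hypothetical stand-ins, not DDSP code): a nonlinear controller predicts FIR taps per frame, while the filtering step itself stays linear:

```python
import numpy as np

def ltv_fir(signal, controller, frame_size=64, n_taps=8):
    """Sketch of a time-varying FIR whose taps are predicted per frame.

    `controller` stands in for a neural network: it maps a frame of input
    to a set of FIR coefficients. The filter applied to each frame is
    linear; only the *control* of its coefficients is nonlinear.
    """
    out = np.zeros_like(signal)
    for start in range(0, len(signal) - frame_size + 1, frame_size):
        frame = signal[start:start + frame_size]
        taps = controller(frame, n_taps)                  # nonlinear control
        out[start:start + frame_size] = np.convolve(frame, taps)[:frame_size]
    return out

# Toy controller: a fixed moving average, ignoring the input.
def avg_controller(frame, n_taps):
    return np.ones(n_taps) / n_taps

y = ltv_fir(np.random.randn(256), avg_controller)
```

In the jointly-trained setting, the controller's weights would be optimized end to end through the (differentiable) convolution.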

\- It's true that neural networks often learn correlated parameters, but it's
usually of less significance because they operate in an overparameterized
"interpolative" regime, which a lot of interesting ongoing research is trying
to understand.

\- We didn't do a quantitative comparison, but in general the tradeoffs will
be different. Dereverberation by a modular generative model will only sound as
good as the generative model itself, so artifacts will come from not modeling
the source properly. However, if you learn a good model, the dereverberation
should be essentially perfect (you can losslessly apply different reverb),
although that's a big if.

~~~
unlinked_dll
Thanks for the reply! This work is fascinating and while I'm not a python guy
I'm going to play with your library a bunch.

I do think you should investigate comparisons to adaptive FIRs much more. This
field is critical to the design of low power medical devices like hearing
aids, which need feedback reduction, echo cancellation, and the like with
minimal filter orders.

My question on correlated parameters was a bit more abstract. Often in the
design of classical audio signal processors for creative applications you find
that the user space parameters can be correlated, which map to more design
space parameters that are even more correlated, and down to implementation
level parameters which are even more correlated. For example in a filter
designed by frequency sampling, the adjacent bins of an FFT are highly
correlated in their I/O, and I was curious if you optimized a bit by taking a
DCT or a similar reparameterization like you'd find in calculating MFCCs and
the like. It's really tough to design ML approaches for creative signal
processing that beat traditional methods because of this: humans learn and
adapt to correlations very quickly; machines, not so much when dealing with
oscillation and ripple. Many local extrema in the parameter space and all
that.
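For illustration, here is roughly what that DCT reparameterization looks like (a sketch, not anything from the paper): a few low-order DCT coefficients stand in for many highly correlated per-bin gains, the same smoothing trick used when computing MFCCs:

```python
import numpy as np

def dct_basis(n_bins, n_coeffs):
    # DCT-II basis: smooth, roughly decorrelated functions of frequency.
    k = np.arange(n_coeffs)[:, None]
    n = np.arange(n_bins)[None, :]
    return np.cos(np.pi * k * (n + 0.5) / n_bins)

n_bins, n_coeffs = 64, 8
B = dct_basis(n_bins, n_coeffs)

# 8 DCT coefficients parameterize a smooth log-magnitude response over
# 64 frequency bins, instead of optimizing 64 correlated per-bin gains.
coeffs = np.array([1.0, -0.5, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0])
log_mag = coeffs @ B   # shape (64,)
```

The point is that adjacent FFT bins move together, so optimizing in a smooth low-dimensional basis avoids many of the ripple-induced local extrema.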

~~~
AstralStorm
Adaptive IIRs would be more interesting, as automatically controlling and
designing those filters in a stable way is rather hard, and they're both
differentiable and power efficient. Especially anything that is not a biquad
series; recursive filters also have recursion-related computational noise,
which the ANN should be able to optimize out.

------
dbetteridge
Fantastic project!

For someone looking to learn more about signals and audio processing (DFTs,
FFTs, etc.), can anyone recommend good material? I've completely forgotten my
uni CompEng signals course and want to revisit it.

~~~
anonytrary
I don't have links, but I have some advice that may be contrary to what a CS
major might say. I recommend against a CS-first approach. Focus on theoretical
fundamentals. Take "DFT" and "FFT" out of your lexicon -- those are not
important right now. Forget about the D and the P in DSP. Focus on the S:
signals. Signals are just waves. Understand the mathematical basics of waves
-- bonus points if you study differential equations whose solutions are waves.

Once you're pretty confident with your understanding of wave math, next focus
on linear algebra. Linear algebra sounds like a strange requirement for
understanding signals, but it actually is fundamental. Fourier series should
then just "click" for you -- a Fourier transform is basically a change of
basis in a vector space.
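That change-of-basis view can be made concrete in a few lines of numpy: the DFT is literally multiplication by a fixed matrix whose rows are the frequency basis vectors:

```python
import numpy as np

# The DFT as a change of basis: multiplication by the matrix
# W[k, m] = e^{-2*pi*i*k*m/N}, whose rows are the frequency basis.
N = 8
n = np.arange(N)
W = np.exp(-2j * np.pi * np.outer(n, n) / N)

x = np.random.randn(N)
assert np.allclose(W @ x, np.fft.fft(x))        # same transform as the FFT

# The inverse change of basis recovers x (W's rows are orthogonal,
# with squared norm N, hence the 1/N factor).
assert np.allclose(np.conj(W.T) @ (W @ x) / N, x)
```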

Bonus points if you learn about infinite-dimensional vector spaces, what Dirac
delta functions are and what they mean. For example, why is it that a delta
function has infinite spectral energy? Either you have no idea, or you find
the answer obvious. There is no in between. Good DSP folks can answer
theoretical questions like this without thinking. Once you understand how to
think about waves in the time basis and the frequency basis, you will be well
equipped to understand pretty much everything in DSP.

~~~
p1esk
What’s wave math?

~~~
nicwilson
Fourier transforms and more generally convolutions.

~~~
AstralStorm
So remove DFT but use Fourier still?

No, real wave maths is continuous wavelets and those are not as useful in
processing. (DFT is a kind of wavelet transform, just limited.)

And then you have the advanced tensor wave math, which is used in physics but
rarely in sound processing. I bet evaluating Schrödinger's equation and its
symmetric solutions wasn't what you had in mind.

You'll get much more mileage out of statistics and discrete mathematics, plus
control theory and optimization theory.

~~~
nicwilson
There are many different types of Fourier transforms (i.e. generalisations
and specialisations) useful for signal analysis, not just the (classical full
spectrum) FT. E.g. the fractional FT (useful for signal separation), or the
Laplace transform (used all over the place in control for linear(ised)
systems and analysis).

> Schrödinger

Well momentum is the FT of position so...

Also note the OP's question is "What's wave math?", not "why not use the
(D|F)FT?", which I'm still surprised was suggested: it's a tool, use it when
appropriate, use something else when it's not.

~~~
p1esk
What would be a good practical project to learn about DSP? Preferably in
Python, and preferably relevant to music somehow.

~~~
nicwilson
Well, I did most of my signals analysis with an oscilloscope, some circuitry,
and a function (read: waveform) generator. You could very easily get software
versions of those and substitute some music for the function generator. But
I'd consult the elders of the internet for better ideas.
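A minimal software version of that setup (just a sketch): numpy as the function generator, `np.fft.rfft` as the spectrum analyzer, with a synthesized note standing in for the music:

```python
import numpy as np

# "Function generator": synthesize one second of A4 plus its octave.
sr = 16000                          # sample rate in Hz
t = np.arange(sr) / sr              # one second of time
signal = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 880 * t)

# "Spectrum analyzer": find the loudest frequency bin.
spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(len(signal), 1 / sr)
peak = freqs[np.argmax(spectrum)]
print(peak)                         # 440.0 -- the fundamental dominates
```

Swapping the synthesized note for a loaded audio file turns this into a tiny music-oriented starter project.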

------
zburatorul
Do you think DDSP could be used for feature engineering/discovery? My setting
is a time series that I want to do regression on, but it's not clear a priori
which features of the series have good predictive power. I imagine making
DDSP the first layer of an FNN and using the gradients to help identify the
right filters for extracting important features from my data.

~~~
jessejengel
DDSP modules are helpful in situations where you want to impose some level of
interpretability and modularity. Most also don't have parameters themselves,
but must have them provided by another network or variable. So you could
imagine for instance feeding your data through a NN that then predicts filter
coefficients, then running the same data through a filter with those
coefficients (if you wanted to enforce time-varying linearity for
interpretability, let's say).
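A toy numpy sketch of that two-path pattern (the "network" here is a hypothetical stand-in, not DDSP code): one path predicts filter coefficients from the data, the other runs the same data through the resulting linear filter:

```python
import numpy as np

def predict_coeffs(x, n_taps=4):
    """Hypothetical stand-in for a neural network: maps the input series
    to normalized FIR coefficients."""
    c = np.abs(x[:n_taps]) + 1e-6
    return c / c.sum()

def interpretable_layer(x):
    # Path 1: a nonlinear "network" predicts filter coefficients from x.
    taps = predict_coeffs(x)
    # Path 2: the *same* data passes through that linear filter, so the
    # layer's action on x is linear (hence inspectable) for each input.
    return np.convolve(x, taps, mode="same")

features = interpretable_layer(np.random.randn(128))
```

The filtered output would then feed the downstream regression, and the learned taps themselves are the interpretable "discovered features."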

~~~
dr_kiszonka
I don't know much about DSP, neural nets, and audio, but I am really intrigued
by this project. If you have a second, could you give a simple example of how
this approach could be applied to problems outside of audio?

------
jeeceebees
I don't really understand why I'd use gin. From the example ipynbs it looks
like pretty much the same amount of code, but in a gin file, and then it
spookily fills in parameters for you in Python. Why is this useful?
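For context, a minimal sketch of the pattern gin enables (function and parameter names here are hypothetical, not the DDSP API): you mark a function configurable, and a plain-text config binds its arguments without threading them through every call site.

```python
import gin

@gin.configurable
def build_synth(n_harmonics=100, sample_rate=16000):
    ...

# config.gin (separate text file):
#   build_synth.n_harmonics = 64
#   build_synth.sample_rate = 48000

gin.parse_config_file('config.gin')
build_synth()  # called with the values bound in config.gin
```

The payoff is mostly for experiments: one model definition and many swappable config files, instead of plumbing every hyperparameter through constructors.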

