
Eigentechno – Principal Component Analysis applied to electronic music - umutisik
https://www.math.uci.edu/~isik/posts/Eigentechno.html
======
highd
This is really, really awesome. Some other ideas that would be really cool:

1) Do key detection and pitch-shift all the loops to a common key before
processing. That might make more of the melodies come through the
eigenvectors.

2) Visualize the loop point cloud - maybe with like 10-50 dimensions of PCA
followed by 2 or 3 dimensions of t-SNE.

3) Maybe some form of earlier dimensionality reduction? I.e. you could do a
short-time Fourier transform and then threshold and bin frequencies - then
invert the transform to reduce the sounds to their basic characteristics. That
might make it so that not all of the first 20 or so eigenvectors have slightly
different kick drums. For example, if you have kick drums with fundamentals at
21Hz, 22Hz, 23Hz, 24Hz etc. those will each require two eigenvectors to
represent the sine and cosine phases of the signal - but if you could "project"
every kick drum sound so they were close to linearly related then PCA could
isolate them with fewer eigenvectors.

I would love to play with some sort of live music generation system based on
this - really, really interesting ideas. And goes to show what can be built
with traditional data analysis techniques and a clever idea!

EDIT: Also if you uploaded your preprocessed data I think that would be really
amazing.
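A minimal sketch of idea (2) in Python with scikit-learn - the `loops` matrix
here is random stand-in data, since the real preprocessed loops aren't
available yet:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

# Hypothetical dataset: 500 loops, each a fixed-length vector of samples.
rng = np.random.default_rng(0)
loops = rng.standard_normal((500, 4096))

# Step 1: PCA down to ~30 dimensions to denoise and speed up t-SNE.
pca = PCA(n_components=30)
reduced = pca.fit_transform(loops)

# Step 2: t-SNE down to 2 dimensions for plotting the point cloud.
embedding = TSNE(n_components=2, perplexity=30, init="pca",
                 random_state=0).fit_transform(reduced)

print(embedding.shape)  # (500, 2)
```

With real loops you'd scatter-plot `embedding` and color points by, say, BPM
or genre tag to see whether clusters correspond to anything audible.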

~~~
umutisik
Thank you for these great suggestions. I will make the dataset available
eventually. It's about 15 gigs so I just need some time to do it right. I will
let you know when I have it released.

~~~
aw3c2
If you struggle to find hosting (and even if not), archive.org would be a
great place to put it for longevity.

------
umutisik
Hi, I am the author of the article. I would appreciate any
comments/questions/suggestions.

Update: I switched the audio files to SoundCloud now, so everything should
work again.

Eigencrisis! It looks like the UCI servers are having trouble. I uploaded the
audio files to the following SoundCloud playlist and I am working on embedding
the SoundCloud player in place of the files.

[https://soundcloud.com/umutisik/sets/eigentechno-1](https://soundcloud.com/umutisik/sets/eigentechno-1)

Sorry for the inconvenience.

~~~
Steuard
First and foremost: This is really cool, and thank you for sharing! (It's also
the first explanation of the terms in the matrix equation for SVD that I've
happened across that has really clicked for me: much appreciated.)

Here's my worry, speaking as someone who knows physics but not all that much
audio processing. I feel like this approach will inevitably put a _whole_ lot
of (unhelpful!) emphasis on the detailed phase information of the various
sounds. The near-perfect cancellation of the average track illustrates this:
two loops of the exact same bass drum beat offset by a fraction of a second
will be treated as orthogonal or even opposite by this algorithm, but they're
essentially identical as perceived by the listener.

Conceptually, I imagine what you'd want is some way of encoding the various
loops whose average came out sounding like a real average, rather than as
nearly silent due to phase cancellation. My first instinct is to say "take the
FFT of each loop _first_ , and then run your PCA on that". Maybe that's not
the right answer: like I said, I'm not an audio processing expert. But I
suspect that right now your analysis is spending a huge fraction of its effort
effectively trying to get the first drum beat to happen at precisely the right
fraction of a second, and separately to get the second drum beat to happen at
precisely the right fraction of a second, and separately the third, and the
fourth, and so on. And heaven help you if the different loops' bass drums are
tuned to marginally different notes.

Edit: Only after writing this did it sink in that the top overall comment's
remarks (by "highd") about earlier dimensional reduction were getting at the
same issue. I'll leave this here, just in case the different framing is
useful.
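To make the phase worry concrete, here's a tiny numpy sketch: the same sound
shifted in time looks very different sample-by-sample, but its magnitude
spectrum (which discards phase) is unchanged - which is why running PCA on
|FFT| might behave better:

```python
import numpy as np

# A toy "drum hit" and a copy circularly shifted by 137 samples.
hit = np.sin(2 * np.pi * 5 * np.linspace(0, 1, 1024, endpoint=False))
shifted = np.roll(hit, 137)

# In the raw sample domain the two copies are far apart...
raw_dist = np.linalg.norm(hit - shifted)

# ...but their magnitude spectra are identical: |FFT| discards phase,
# so a circular time-shift leaves it unchanged.
mag_dist = np.linalg.norm(np.abs(np.fft.rfft(hit)) - np.abs(np.fft.rfft(shifted)))

print(raw_dist > 1.0, mag_dist < 1e-6)  # True True
```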

~~~
umutisik
That's my thinking too. PCA completely ignores the phase shift symmetry and
treats a phase-shifted sine wave as a different thing.

One could definitely get something better by treating the audio loops as
translation-invariant. I thought about trying a symmetry-invariant (or
'group-action-invariant') PCA but I could not find a good way of doing it.
More advanced methods, e.g. 1-d convolutional networks, WaveNet etc., do have
this translation-invariance built into them.

Interestingly, the second eigenvector is not just a phase-shifted version of
the first eigenvector but contains slightly higher frequency information as
well, but I agree totally with what you said.
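A quick numpy check of the translation symmetry that convolutional models
respect: circularly shifting the input shifts the convolution output by the
same amount (strictly this is equivariance, which is what makes downstream
invariance possible):

```python
import numpy as np

rng = np.random.default_rng(5)
signal = rng.standard_normal(128)
kernel = rng.standard_normal(9)

def circ_conv(x, k):
    # Circular convolution via the FFT convolution theorem.
    return np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(k, len(x))))

shift = 17
out_then_shift = np.roll(circ_conv(signal, kernel), shift)
shift_then_out = circ_conv(np.roll(signal, shift), kernel)

# Convolving then shifting equals shifting then convolving.
print(np.allclose(out_then_shift, shift_then_out))  # True
```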

~~~
thanatropism
Google "persistent homology" and "Rips complex" for a _completely_ different
approach.

------
sideshowb
Life imitating parody?
[http://archive.museumoftechno.org/exhibition_detail.php?id=6](http://archive.museumoftechno.org/exhibition_detail.php?id=6)

(see also
[http://archive.museumoftechno.org/exhibition_detail.php?id=4](http://archive.museumoftechno.org/exhibition_detail.php?id=4)
)

------
Kenji
_Even though the sampling loses information about the wave; from the sampled
data, your computer’s digital-to-analog converter can perfectly reconstruct
the portion of the sound that contains all the frequencies up to half of the
sampling frequency. This is the Nyquist Theorem. So we are O.K._

Pedantic note: This is only true in a theoretical setting where you have
infinitely precise samples. Since each sample has only finitely many bits,
and the calculations are also lossy, you cannot reconstruct the wave
perfectly. But it's good enough. There are also formats with more bits per
sample to increase quality.
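A quick numpy illustration of that quantization noise floor for 16-bit audio
(the rounding step here is the idealized quantizer, not any particular codec):

```python
import numpy as np

# A 1 kHz sine sampled at 44.1 kHz, quantized to 16 bits (CD quality).
sr, bits = 44100, 16
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 1000 * t)

scale = 2 ** (bits - 1) - 1          # 32767 levels on each side of zero
quantized = np.round(x * scale) / scale

# The per-sample error is bounded by half a quantization step, so
# "perfect" Nyquist reconstruction only holds down to this noise floor.
err = np.max(np.abs(x - quantized))
print(err <= 0.5 / scale)  # True
```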

~~~
umutisik
Thank you. That's a good point which was not clear at all in my sentence.

------
empath75
Beautiful work, and I think he should probably make a VST or something out of
it. Producers are always looking for new ways to break sounds apart and put
them back together.

~~~
anigbrowl
FFT filters and the like are already a thing, I'm not sure what extra
practical benefit you think this would bring. Nevertheless it's a good start
and well-documented.

~~~
bitL
PCA can decompose sound into basis functions better adapted to the data (via
the covariance structure), so if you combine a few of them with different
coefficients, you might be able to generate some cool sounds much faster than
with traditional methods. Later you might feed this to an RNN and perhaps
compose brand-new good-sounding songs automatically, make a new Vocaloid band
in Japan and go on tour, generating new music at each performance ;-)

~~~
anigbrowl
I am actually very interested in its application to musical patterns, i.e.
the actual notes rather than the audio. I think there's already a tool that
uses this to generate rich and musically correct MIDI on the fly, but I'm
having trouble remembering the name/manufacturer now. Future Retro maybe.

------
hammock
I don't get it. I guess he is deconstructing music by PCA instead of frequency
or time. What is a practical application or interpretation of this? (not being
critical, just trying to understand the motivation)

~~~
umutisik
Thank you for your question. At this point, this is just to see how PCA works
on this kind of dataset. Normally, PCA is used to reduce the dimension of the
data in a way that loses as little information as possible. It can be useful
for saving computation on classification tasks, for example.
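For what it's worth, a minimal scikit-learn sketch of that use of PCA (the
feature matrix here is random stand-in data, not the loop dataset):

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical feature matrix: 1000 samples in 500 dimensions.
rng = np.random.default_rng(4)
X = rng.standard_normal((1000, 500))

# Project onto the top 20 principal components; a downstream classifier
# then trains on 20 features instead of 500.
pca = PCA(n_components=20)
X_small = pca.fit_transform(X)

print(X_small.shape)  # (1000, 20)
```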

~~~
hammock
It's a cool exploration for curiosity's sake. Seems weird to use it vs. FFT
though, because while some have said PCA is supposed to be "more optimal", we
have prior knowledge that the music is actually constructed along the FFT
dimensions. So I wouldn't expect PCA to be any better.

~~~
highd
FFT and PCA are totally different. FFT is a fixed linear transform - PCA finds
the linear projection to k-space that retains the most variance from the
original dataset. If you projected to the top k frequencies you'd just get a
couple of tones, while PCA finds linear combinations of the original signal
that are most "descriptive" in a sense.

Also human hearing is much closer to a wavelet transform or STFT than FFT.
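A toy numpy/scikit-learn example of the difference: on data that lives in a
low-dimensional subspace, PCA finds the data-adapted basis, while the fixed
FFT basis needs more coefficients (the signals here are synthetic, not real
loops):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)

# Toy dataset: every signal is the same two "instruments" with random gains,
# so the data really lives in a 2-dimensional subspace of R^256.
n = 256
t = np.arange(n) / n
tone_a = np.sin(2 * np.pi * 3 * t)
tone_b = np.sin(2 * np.pi * 7 * t) + 0.5 * np.sin(2 * np.pi * 11 * t)
data = rng.standard_normal((200, 1)) * tone_a + rng.standard_normal((200, 1)) * tone_b

# PCA adapts to the data: 2 components capture essentially all the variance...
pca = PCA(n_components=2).fit(data)
print(pca.explained_variance_ratio_.sum() > 0.999)  # True

# ...whereas the fixed FFT basis needs 3 frequency bins (3, 7 and 11 cycles),
# because tone_b itself mixes two frequencies.
spectrum = np.abs(np.fft.rfft(data, axis=1))
active_bins = (spectrum.mean(axis=0) > 1.0).sum()
print(active_bins)  # 3
```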

------
gtani
I was starting on the opposite tack, though I haven't got far:
synthesizing/sampling kicks, snares and bass lines from scratch, which is
pretty tricky. For example, a kick is a sine wave swept down; you play with
the ADSR envelope, saturate/compress, then typically layer 3 separate kick
tracks by phase-aligning and more post-processing. There's a lot to getting a
good-sounding kick with presence, attack and body.

The reference on this is Rick Snoman's Dance Music Manual, 3rd ed.

~~~
bitL
IMO you are limiting yourself by staying in the analog mental model - ADSR,
oscillators etc. are what made analog circuits generate sound, but in the
digital world they aren't necessary anymore, and you can view all of it as
just (periodic) function compositions in different spaces and model far
beyond what analog can reach. Some new VSTis have completely abandoned
analog-style design already.

~~~
bitwize
Yes, but how will you achieve that fat synth sound without analog?

~~~
sideshowb
How will you define fat? (I'm guessing that wasn't a serious comment!)

------
platz
Also, as an alternative to PCA, try ICA (independent component analysis).
This will attempt to find non-Gaussian sources in the data and might be worth
seeing if it can carve out more interesting portions of the sound.
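A minimal scikit-learn sketch of the idea, on synthetic non-Gaussian sources
rather than real audio:

```python
import numpy as np
from sklearn.decomposition import FastICA

# Two non-Gaussian sources: a square wave and a sawtooth.
n = 2000
t = np.linspace(0, 8, n)
s1 = np.sign(np.sin(3 * t))        # square wave
s2 = (t % 1.0) - 0.5               # sawtooth
sources = np.c_[s1, s2]

# Mix them linearly, as if two loops shared the same underlying parts.
mixing = np.array([[1.0, 0.5], [0.4, 1.0]])
mixed = sources @ mixing.T

# FastICA tries to recover statistically independent, non-Gaussian sources;
# PCA would only find uncorrelated directions of maximal variance.
ica = FastICA(n_components=2, random_state=0)
recovered = ica.fit_transform(mixed)
print(recovered.shape)  # (2000, 2)
```

(Recovered components come back in arbitrary order and sign, which is the
usual ICA ambiguity.)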

~~~
sjg007
Try a NN

~~~
platz
good luck with that

------
adamnemecek
I'm working full time on a new DAW that should make writing music a lot faster
and easier. Current DAWs don't really understand music theory or the creative
process. Also the note input process and experimentation is extremely time
consuming and the DAW never helps. Current DAW : my thing = Windows Notepad :
IDE. The HN audience is definitely one of my core groups.

Sign up here
[https://docs.google.com/forms/d/1-aQzVbkbGwv2BMQsvuoneOUPgyr...](https://docs.google.com/forms/d/1-aQzVbkbGwv2BMQsvuoneOUPgyrc6HRl-
DjVwHZxKvo)

And I'll ping you when it's released.

~~~
still_grokking
Current DAWs don't really understand music theory, that's right.

But there are such things as Synfire Pro by Cognitone, for example, which on
the other hand is unusable as a DAW, exactly like all the other music
composition software.

I would love to see a usable combination of both software categories
eventually.

One question left: Your software won't be FOSS, or will it?

~~~
adamnemecek
I'm aware of Synfire; I definitely took some inspiration from it, but mine
will have a faster workflow, will be cheaper (Synfire Pro is like a thousand
USD), will look better visually, etc.

It won't be FOSS, no.

------
smortaz
Very nice. Suggestion: Would be great to put the Jupyter notebooks on
[https://notebooks.azure.com](https://notebooks.azure.com) so we can actually
run/edit/play with them. Zot!

------
nom
Down for me. Here is a copy, albeit useless because the audio files don't
work:
[https://webcache.googleusercontent.com/search?q=cache:https%...](https://webcache.googleusercontent.com/search?q=cache:https%3A%2F%2Fwww.math.uci.edu%2F~isik%2Fposts%2FEigentechno.html)

Edit: audio files suddenly work, but I still can't access the website

------
sjg007
I would think this distills into a sine wave... based on the Moog.

