

Algorithmic Music Generation With Recurrent Neural Networks [video] - rndn
https://m.youtube.com/watch?v=0VTI1BBLydE

======
crucialfelix
The funny thing is that the only really good ones are the first few, where
they claim it's just random noise. The later ones just sound like a crappy
radio.

With images the technique works because we like looking at the dense
artifacts: millions of dog heads essentially copied and pasted onto any
appendage that looks like it should be a head. It looks like drugs and
overwhelms in the same way.

If you just take white noise and throw it through a tuned filter bank (which
is, in essence and in final effect, what they are doing here) then you just
get crappy audio.
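
To make that baseline concrete, here is a minimal sketch (not the video's
method) that pushes white noise through a small bank of band-pass filters;
the sample rate and band centres are arbitrary choices:

    import numpy as np
    from scipy.signal import butter, lfilter

    sr = 16000                        # sample rate in Hz (arbitrary)
    noise = np.random.randn(sr * 2)   # two seconds of white noise

    def bandpass(x, lo, hi, sr, order=4):
        # Butterworth band-pass between lo and hi Hz
        b, a = butter(order, [lo / (sr / 2), hi / (sr / 2)], btype="band")
        return lfilter(b, a, x)

    # A crude "tuned" filter bank: sum a few narrow bands around chosen
    # centre frequencies (here, A3 up to A6)
    centres = [220, 440, 880, 1760]
    out = sum(bandpass(noise, c * 0.95, c * 1.05, sr) for c in centres)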

The more standard and successful use of NNs in composition is to use them on
pitch series and compositional forms - feeding one all of Beethoven, say,
and then getting it to generate similar compositions. That's been going on
for decades. You can do it with that kind of data.
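
The simplest illustrative version of that symbolic approach isn't even a
neural net - a first-order Markov chain over pitches captures the idea of
learning from a pitch series and generating something similar (the melody
here is made up):

    import random
    from collections import defaultdict

    # Toy training data: a melody as a sequence of MIDI pitch numbers
    melody = [60, 62, 64, 65, 67, 65, 64, 62, 60, 64, 67, 72]

    # Learn which pitch tends to follow which
    transitions = defaultdict(list)
    for a, b in zip(melody, melody[1:]):
        transitions[a].append(b)

    def generate(start, length):
        out = [start]
        for _ in range(length - 1):
            out.append(random.choice(transitions.get(out[-1], melody)))
        return out

    print(generate(60, 16))  # a new pitch series in the style of the input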

But the thing about pop and electronic music is that the easily machine-
observable elements are not very interesting. Listen to the 4/4 kick and
snare pattern in the video. It's boring as hell. (Other tracks can be just a
kick and snare and they are amazing, and we celebrate them as classics and
play them for 20 years. Machines will never understand why.)

What's great and essential are things like the spatial relationships between
elements in the mix: how the surge of the compressed synth/guitar causes the
beat to tumble outward and stir you up; how, after a series of peaks in a
synth melody, the next pass pulls back, creating a space that pulls at your
heartstrings - a negative space the listener goes into; how a track plays
with listeners' expectations based on the songs, conventions and tropes they
already know and respond to.

~~~
sweezyjeezy
The image stuff works because we have found a way to model a good prior for
it: convolution layers basically enforce positional invariance and locality
constraints on what our model believes the world looks like. Without this
very strict prior, image recognition with neural networks just wouldn't
really work.

We haven't found a way to enforce a good prior for temporal data like sound
yet.
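
A toy illustration of that prior, with plain numpy standing in for a conv
layer: because the same kernel weights slide across every position, a
shifted input produces a correspondingly shifted output - the network
doesn't have to relearn a feature for each location.

    import numpy as np

    x = np.zeros(32)
    x[5:8] = 1.0                    # a small "feature" in a 1-D signal
    kernel = np.array([1.0, -1.0])  # shared weights, applied everywhere

    def conv(signal, k):
        # 'valid' convolution: the same weights slide across all positions
        return np.convolve(signal, k, mode="valid")

    shifted = np.roll(x, 10)        # the same feature, 10 samples later

    # Shifting the input shifts the output (translation equivariance)
    assert np.allclose(np.roll(conv(x, kernel), 10), conv(shifted, kernel))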

~~~
TheOtherHobbes
I realised a while back that a lot of computer music is really computer-
science music - it's made by people who know something about computers but
not much about the art of music, playing with relatively trivial algorithms
to create music-like results.

There's also the academic musical equivalent - music professors using stock
faddy techniques like serialism or (currently) number and group theory.

It's not that this is an impossible problem. It's more that the set of people
who can code machine learning algorithms _and_ understand music theory _and_
are creative enough to invent new algorithmic techniques _and_ to create more-
than-listenable music is incredibly small - double figures, if that.

So progress in non-trivial computer music has been incredibly slow. The DSP
side has been far more successful, because DSP is - in most ways - a much
simpler problem.

~~~
ThomPete
Music is (dis)harmonies over rhythmic patterns. There isn't anything
inherently artistic about humans that computers can't replicate with time,
even the ability to compose an original song. That is, besides the lives of
humans and their appearance and history, which are important but not the
only factors.

The irony is that musicians are actually striving for, but failing to
reach, the level of perfection that computers already have.

And so, to sound more human-like, computers have algorithms that make them
more "sloppy".
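
In sequencer terms, that "sloppiness" is typically a humanize pass - a
minimal sketch, assuming note events as (onset, pitch, velocity) tuples
with made-up jitter amounts:

    import random

    # Hypothetical note events: (onset time in beats, MIDI pitch, velocity)
    notes = [(0.0, 60, 100), (0.5, 64, 100), (1.0, 67, 100), (1.5, 72, 100)]

    def humanize(notes, timing_jitter=0.02, velocity_jitter=8):
        # Nudge each onset and velocity by a small random amount,
        # imitating the looseness of a human performance
        return [(t + random.uniform(-timing_jitter, timing_jitter),
                 pitch,
                 max(1, min(127, vel + random.randint(-velocity_jitter,
                                                      velocity_jitter))))
                for t, pitch, vel in notes]

    print(humanize(notes))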

What composition algorithms lack is not the ability to compose like humans
but a life that will give them angles and a story.

Then again, a lot of music is really formulaic anyway, and computers are
already used for most of it. Nothing will stop some sort of computer star
from being born in a few years. But it's probably never going to connect
with us the same way another human can. Not for now, at least.

~~~
TheOtherHobbes
I think that's a good example of what I'm saying - just because _you_ don't
understand the details doesn't mean professional musicians and composers don't
have much deeper insight into music than you do.

If you think music is [list of numbers] that can be made more "human" with a
bit of timing randomisation, then of course it's all perfectly
straightforward.

In reality there's rather more happening.

>What composition algorithms lack is not the ability to compose like humans
but a life that will give them angles and a story.

No, the music basically sucks as music. The number of people willing to listen
to it voluntarily without being paid to - usually as students or academics -
is vanishingly small.

The story part only becomes relevant after that problem is solved.

And while it's true that music is formulaic, it's also true that computer
music hasn't yet worked out how to copy all the details of the formulas -
never mind produce original and memorable new formulas from scratch.

The best formula copier is probably Cope's EMI, and that sounds exactly like
what it is - a slightly confused cut-and-paste cliche machine, not a human
composer with a point to make.

~~~
ThomPete
I think you have it the wrong way around.

Music becomes meaningful in the listener's mind, and the things that make
it meaningful are both that it's formulaic (structure) and whatever the
performer instills in the listener.

------
mpdehaan2
I'm not 100% up to speed on my AI, but this sounds about like what you'd
get with random variations on a signal, where the neural net is the "which
sounds like X" filter that picks which of two variants survives. That would
be using both some form of genetic algorithm (details TBD?) and the neural
net as the checker. But is it?

If they aren't doing it that way, I'd be interested in hearing how it's
evolving the signal in that given direction - and also how that filter works
(what libraries does it use?).
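
For reference, the scheme being guessed at here would look something like
this - a minimal (1+1) evolutionary sketch where a hypothetical score()
function stands in for the trained "sounds like X" model:

    import numpy as np

    def mutate(signal, scale=0.01):
        # Random variation: add a little noise to the candidate waveform
        return signal + np.random.randn(len(signal)) * scale

    def evolve(seed, score, generations=1000):
        # score() is a stand-in for the "does this sound like X?" filter
        best = seed
        for _ in range(generations):
            candidate = mutate(best)
            if score(candidate) > score(best):
                best = candidate  # keep whichever of the two survives
        return best

    # Toy demo: the "target" is a 440 Hz sine; score is negative distance
    sr = 8000
    target = np.sin(2 * np.pi * 440 * np.arange(sr) / sr)
    result = evolve(np.zeros(sr), lambda s: -np.mean((s - target) ** 2))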

Sounds like it hit some sort of local maximum, so this system won't ever
produce the original song, but something a percentage of the way toward it.

I'm a bit more interested in algorithmic composition, but this could be
interesting for trying to blend genres. For a long time I've wanted to
build a program that could produce an essentially infinite song morphing
between genres, with lots of tunable parameters.

------
msamwald
It would be interesting to know how novel those sequences are (obviously, the
outcome would be far less impressive if what we hear is basically a looped,
noisy sample of a song that already exists).

------
m-i-l
Not much information in the video on how this was achieved, but a quick
search for "gruv algorithmic music generation" returns the following:
https://sites.google.com/site/anayebihomepage/cs224dfinalproject . Extract:

 _We compare the performance of two different types of recurrent neural
networks (RNNs) for the task of algorithmic music generation, with audio
waveforms as input (as opposed to the standard MIDI). In particular, we focus
on RNNs that have a sophisticated gating mechanism, namely, the Long Short-
Term Memory (LSTM) network and the recently introduced Gated Recurrent Unit
(GRU). Our results indicate that the generated outputs of the LSTM network
were significantly more musically plausible than those of the GRU._
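
For anyone curious what that setup looks like in practice, here is a
minimal sketch of the waveform-in, waveform-out LSTM idea the abstract
describes - not the authors' code; the chunk size, layer sizes, and loss
are illustrative assumptions:

    import torch
    import torch.nn as nn

    CHUNK = 2048  # raw samples per timestep (illustrative)

    class WaveformRNN(nn.Module):
        def __init__(self, hidden=1024):
            super().__init__()
            self.lstm = nn.LSTM(input_size=CHUNK, hidden_size=hidden,
                                batch_first=True)
            self.out = nn.Linear(hidden, CHUNK)

        def forward(self, x):
            # x: (batch, timesteps, CHUNK) chunks of waveform
            h, _ = self.lstm(x)
            return self.out(h)  # predicted next chunk at each timestep

    model = WaveformRNN()
    x = torch.randn(1, 16, CHUNK)  # fake batch: 16 consecutive chunks
    pred = model(x)
    # Train to predict each next chunk from the previous ones
    loss = nn.functional.mse_loss(pred[:, :-1], x[:, 1:])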

------
leaveyou
Another promising field is RNNs applied to TED talks:
youtube.com/watch?v=-OodHtJ1saY

------
acd
I think it would sound better if we taught the neural network to play notes
and music theory.

------
anentropic
This is not music. Music is not simply organised sound. Music is a cultural
practice.

~~~
yyyyes
This is not a cultural practice?

------
durbin
Source code link?

