
Generating classical music with LSTM neural networks - narenst
https://blog.floydhub.com/generating-classical-music-with-neural-networks/
======
TaupeRanger
Always appreciate new attempts at music gen. As with every attempt thus far,
even the hand-picked selections sound like random nonsense locked to the
diatonic notes of a particular key, with no real harmony or counterpoint to
speak of. And that's the "good" output - they never let you hear just any old
output; it's always 'listen to this handful I selected, the rest may be total
garbage'.

~~~
raptorraver
What if we trained computers to compose the same way we teach composition
students: Renaissance counterpoint, the fugues of Bach, and the harmonic
structuring of the classical era (sonata form)?

~~~
TaupeRanger
Unfortunately, as most people in computational creativity will tell you,
teaching/learning "rules" is only a tiny sliver of the problem. But even
getting a computer to "understand" those things in a way that would allow it
to apply them to the act of composing is vastly beyond our current
understanding.

------
evrydayhustling
I found the part about notewise vs chordwise encodings very interesting!

Ages ago I was a sequencer geek (Impulse Tracker!) while also noodling around
with guitar, and I noticed something strange: I made music I liked a lot more
when I composed on guitar and transposed onto the sequencer afterwards. After
a lot of experimentation, I realized that the constraints on what my hands
could do on guitar were (of course) having a huge impact on what I _tried_ to
do when composing -- and struggling with the constraint was helping me make
music I liked more.

I like a vision for practical machine learning where we spend less time on
plumbing and more time thinking about the kinds of constraints (e.g. through
input encoding) that enable "creativity" on the part of the machine.

~~~
mcleavey
That's so interesting - you're totally right that setting constraints often
leads to really creative ideas. It reminds me of the "crab canons" by Mozart
and Bach:
https://en.wikipedia.org/wiki/Crab_canon

I also think there's room for other creative encodings for music - possibly
expanding these notewise/chordwise ideas, or possibly going in a totally new
direction. It's fascinating to me how much the generations are affected by the
encoding.
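
Roughly, the difference looks something like this (a toy sketch with made-up
token names, not the exact vocabulary from the post):

    # Illustrative sketch of the two encodings; token names are invented.

    # A tiny score: step 0 plays C4+E4, step 1 is silent, step 2 plays G4.
    score = [{60, 64}, set(), {67}]

    # Chordwise: one token per time step, naming the full set of notes sounding.
    chordwise = ["chord_" + "_".join(map(str, sorted(step))) if step else "rest"
                 for step in score]
    # -> ['chord_60_64', 'rest', 'chord_67']

    # Notewise: a stream of per-note tokens plus an explicit time-step marker,
    # so the model sees each note as its own event.
    notewise = []
    for step in score:
        for pitch in sorted(step):
            notewise.append(f"note_{pitch}")
        notewise.append("step")
    # -> ['note_60', 'note_64', 'step', 'step', 'note_67', 'step']

    print(chordwise)
    print(notewise)

The chordwise vocabulary explodes combinatorially (every chord is its own
token), while the notewise vocabulary stays small but the sequences get longer.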

~~~
evrydayhustling
Another fun direction is to generalize the kinds of constraints we put on our
own instruments! I had a chance to play with that in a graduate class by
implementing an API for midi generation where you set chord fingerings and
strum patterns independently for a guitar of [N] strings.

Of course, I had to "play" the guitar myself by writing song sequences in
those terms... it would be terrific to see what an AI could do with a notation
scheme representing, say, a 20 string guitar or a 30 foot long flute.
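
From memory, the interface was roughly this shape (a hypothetical
reconstruction, not the actual class project code - the names and the default
tuning are just illustrative):

    from dataclasses import dataclass
    from typing import List, Optional, Sequence

    @dataclass
    class Fingering:
        frets: List[Optional[int]]  # one fret per string, None = muted

    @dataclass
    class StrumPattern:
        hits: List[List[int]]  # for each beat, indices of the strings struck

    STANDARD_TUNING = (40, 45, 50, 55, 59, 64)  # E A D G B E as MIDI pitches

    def render(fingering: Fingering, pattern: StrumPattern,
               tuning: Sequence[int] = STANDARD_TUNING) -> List[List[int]]:
        """Turn a fingering plus a strum pattern into MIDI pitches per beat."""
        beats = []
        for strings in pattern.hits:
            beat = []
            for s in strings:
                fret = fingering.frets[s]
                if fret is not None:
                    beat.append(tuning[s] + fret)
            beats.append(beat)
        return beats

    # E major shape, alternating a bass strum and a treble strum.
    e_major = Fingering([0, 2, 2, 1, 0, 0])
    strum = StrumPattern([[0, 1, 2], [3, 4, 5]])
    print(render(e_major, strum))  # [[40, 47, 52], [56, 59, 64]]

Swapping in a different tuning or string count changes what the "instrument"
can reach, which is exactly the kind of constraint I'd love to hand to a model.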

------
citilife
We're actually using a similar technique (quite a bit more complex) to
generate synthetic data for applications:

https://medium.com/capital-one-tech/why-you-dont-necessarily-need-data-for-data-science-48d7bf503074

IMO this is eventually going to replace a lot of tasks. This, for example,
could dynamically generate elevator music (or music in an office). The system
we built can generate synthetic data for testing and for sharing samples of
datasets. Eventually, we'll have entirely synthetically generated videos,
Eventually, we'll have entirely synthetically generated videos,
advertisements, and more.

In 50 years, entire movies may be generated.

~~~
isoprophlex
Is anyone currently affected by a lack of elevator music, be it due to
financial reasons or any other reason, and does your approach solve this?

I hope you'll agree that you gotta find a better, more sympathetic example if
you want to sell your generative algos...

~~~
saaaaaam
Yes. Mood Media is a company that bought Muzak, Inc - the original elevator
music company (and the reason we sometimes talk about disposable music like
this as “muzak”). They are a substantial business, now owned by private equity.
They acquired Muzak for something like $300m a few years back.

Background music is actually quite difficult, commercially. Someone needs to
write and arrange it, and they need to be paid - either royalties each time it
is played (which is why a lot of companies don’t use “known” music for
telephone hold and so on - it’s too expensive), or, if it’s not on a royalty
basis, the writer needs to be bought out, which can also be expensive.

So having algorithmically generated music is actually really interesting
because there is potentially no author to be paid. This is actually an
emerging area of music copyright law. If an algorithm writes music, who owns
the copyright to that music? The computer? Probably not - it’s not a legal
person. The people who wrote the algorithms? Possibly - but did they actually
create the music? Or does no one own it - meaning anyone can use it without
payment? If a label commissions an algorithm to write hits, who owns the music
publishing?

~~~
isoprophlex
Thanks for changing my mind on this, I was looking at it in an overly
simplistic way

~~~
saaaaaam
You are welcome!

------
londons_explore
The issue that 'rests are so common, we need to remove them or the algorithm
would just predict rests all the time' shows the flaw with this approach.

If there is some pattern in your data, and your algorithm, rather than
replicating something similar to the pattern, just outputs the most likely
value at any point in time, then it is never going to work as you hope. Rests
are a symptom of this, and fixing them doesn't fix the underlying issue.

There are a bunch of solutions to this, but adversarial models do a good job
of approximating a probability distribution like this.
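
To make the failure mode concrete (toy numbers, not from any real model): if
70% of time steps are rests, greedy argmax decoding emits a rest at every
single step, while sampling from the predicted distribution at least
reproduces the right proportions.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy next-token distribution: rests dominate, a few notes split the rest.
    tokens = ["rest", "C4", "E4", "G4"]
    probs = np.array([0.70, 0.12, 0.10, 0.08])

    # Greedy decoding: always take the argmax -> an endless stream of rests.
    greedy = [tokens[int(np.argmax(probs))] for _ in range(16)]

    # Sampling from the predicted distribution: rests still dominate, but
    # notes show up in roughly the right proportion.
    sampled = [tokens[rng.choice(len(tokens), p=probs)] for _ in range(16)]

    print(greedy)
    print(sampled)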

~~~
gwern
> There are a bunch of solutions to this, but adversarial models do a good job
> of approximating a probability distribution like this.

The problem is GANs on sequence data still stink compared to max-likelihood:
they train far more slowly and less stably, and still don't generate decent
sequences compared to a char-rnn with a bit of temperature tuning & beam
search. They _should_ be better for precisely the reason you say, but they
aren't.

------
visarga
I am struck by the quality of music neural nets can generate today. Just a few
years ago it was much worse - the notes would make sense for 2-3 seconds and
then just drift off in another direction. And using the Transformer for music
is an intriguing idea.

Edit: apparently someone has already implemented music generation with the
Transformer. Samples: https://storage.googleapis.com/music-transformer/index.html

------
big_t
In the answer to "Wait, what's a rest?", I'm intrigued by the definition of
"...any time step where you don’t play any _new_ notes." (emphasis mine)

Why not have each time step contain all pitches that should sound during that
time step (so starting a new quarter note and continuing a half note would
both appear in the same time step)? Then at the end of generating the music,
perform some post-processing to get the note lengths. Would the approach in
the interview have any significant advantages over this approach? (I suppose
you do lose the ability to rearticulate a pitch with my idea.)
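
Concretely, I'm imagining something like this (a toy sketch of my suggestion,
not the encoding from the post):

    # Each time step lists every pitch that should be sounding. A held note
    # simply reappears in later steps, and a post-processing pass merges the
    # repeats back into one long note.

    piano_roll = [
        {60, 64},  # step 0: C4 and E4 start sounding
        {60, 67},  # step 1: C4 still sounding, E4 ended, G4 starts
    ]

    def to_notes(steps):
        """Merge repeated pitches in consecutive steps into (pitch, start, length)."""
        active = {}   # pitch -> step at which it started
        notes = []
        for t, sounding in enumerate(list(steps) + [set()]):  # sentinel flushes held notes
            for pitch in list(active):
                if pitch not in sounding:
                    start = active.pop(pitch)
                    notes.append((pitch, start, t - start))
            for pitch in sounding:
                active.setdefault(pitch, t)
        return notes

    # The limitation mentioned above: a re-struck C4 at step 1 would be
    # indistinguishable from a held one.
    print(to_notes(piano_roll))  # [(64, 0, 1), (60, 0, 2), (67, 1, 1)]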

------
microtherion
I got 3 out of 4 correct. In the first two questions, the AI seemed easy to
identify because of alien rhythmic patterns, not really because of melodic
content. In the 3rd, the AI was identifiable because the piece, while
pleasant, seemed to lack a plausible development of the idea (but this is
something that could easily be ascribed to a second-rate human composer). In
the one I got wrong, the AI composition was pretty good, and the human one had
exactly the alien rhythmic patterns that to me were a giveaway for an AI
composition. Weird composer or bad performance?

Do you have any examples of jazz compositions by your software? Would be very
interested in hearing that.

~~~
whatrocks
There's a short snippet of a jazz composition from Clara near the top of the
post.

~~~
microtherion
Ah, thanks! Not entirely natural, but could marginally be passed off as
"Thelonious Monk, having drunk one Espresso too many".

------
nilanp
I'm really curious how much effort went into building up the data set and
training the model before you got to "music".

Reading the steps, it feels like 9 months to a year before you got to credible
music.

What kept you going in the belief this would work? I can think of 20 reasons
why this shouldn't work - hence it's "surprising" that it does. It's quite
easily something you could have worked on for 5 years with no results.

Reading your background, it also sounds like your time would be tightly
constrained, so when figuring out where to deploy it you need to have some
conviction you'll have success.

------
jcoffland
Awesome work Christine! I've only ever heard you play classical music in
concert. Any plans to perform bits of your AI-generated music live? Perhaps
with Ensemble SF?

Also, I noticed your data format has a flag for instrument type. Have you
considered generating for voice? Obviously a very different beast, but it seems
the same principles could apply. It would be important to restrict the music
to a model of what a human is capable of, to make it singable. Adding physical
constraints to the piano-generated music might also be interesting. Fingers
are only so long, and there are usually only ten.

------
skissane
Has anyone done work on automated evaluation of the quality of a musical
composition? Possibly by training a neural network, or maybe even just by
designing some heuristic rules which try to capture what elements make music
pleasing to humans?

Then, could you train a neural network (or a genetic algorithm, or whatever)
to compose music that is assigned a high quality score by such a composition
quality evaluator?

~~~
big_t
I actually just recently took a shot at something very similar to this for my
undergrad thesis! [0]

I used genetic algorithms to generate 4 measure melodies, using a long short-
term memory (LSTM) neural network to determine the fitness of melodies. I
trained the LSTM on snippets of music by J.S. Bach. It was able to distinguish
between random noise notes and actual music quite well, and to a somewhat
lesser degree between Bach and other composers.

The melodies it produced were... mixed in quality. I really liked some of them,
but quite often it would get stuck at some local maximum of the fitness and
couldn't mutate its way to something better.
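
The core loop was roughly the shape below (heavily simplified - in the real
project the fitness function was the trained LSTM, not the toy interval
heuristic used here):

    import random

    random.seed(0)
    PITCHES = list(range(60, 73))   # one octave, C4..C5
    LENGTH = 16                     # 4 measures of quarter notes in 4/4

    def random_melody():
        return [random.choice(PITCHES) for _ in range(LENGTH)]

    def fitness(melody):
        # Stand-in for the trained LSTM scorer: just rewards stepwise motion.
        return -sum(abs(a - b) for a, b in zip(melody, melody[1:]))

    def crossover(a, b):
        cut = random.randrange(1, LENGTH)
        return a[:cut] + b[cut:]

    def mutate(melody, rate=0.1):
        return [random.choice(PITCHES) if random.random() < rate else p
                for p in melody]

    population = [random_melody() for _ in range(50)]
    for _ in range(200):
        population.sort(key=fitness, reverse=True)
        parents = population[:10]   # elitism: keep the fittest
        children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                    for _ in range(len(population) - len(parents))]
        population = parents + children

    print(max(population, key=fitness))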

[0] https://github.com/ThomasMatlak/is-software/tree/master/geneticAlgorithms

------
citnaj
>"More recently, there is a shift towards using a Transformer architecture,
and right now I’m experimenting with that as well."

I'm really curious - any early results to share on that? Attention really does
make a big difference on a lot of things (including work I've done, so I know
first hand). It should improve the coherence of the entire piece, in theory at
least, right?

~~~
mcleavey
Transformer is working _really_ well - I'm very excited. I'll probably be
sharing results soon. Yes, the attention makes a huge difference & the pieces
are both more creative and more coherent.

~~~
gwern
Have you considered using 'learning from human preferences' as the loss
function in addition to the Transformers? That was another OpenAI project, and
it seems tailor-made for music generation: what is more 'I know it when I hear
it' than music quality?

------
gleenn
Reminds me of this pretty cool music generation talk at StrangeLoop. I can't
find his own site, but here's the SL page:
https://www.thestrangeloop.com/2018/making-machines-that-make-music.html

------
scottlocklin
LZW does a creditable job as well.

https://arxiv.org/pdf/1107.0051.pdf

------
EADGBE
Better not have any parallel fifths in there or I swear to god my Theory
professor will come out of his grave and berate the AI for it.

------
p1esk
Hi Christine, what are your thoughts on using reinforcement learning for music
generation? Has anyone tried that at OpenAI?

------
robbiemitchell
Need some Chopin to help train the full 88 keys!

~~~
mcleavey
Haha, very true. It was Charlie's question in this interview that made me
realize the 62 key limit was an old fix that I no longer needed, so now I'm
trying out expanding my dataset and also expanding to the full 88 keys!

