
Classical music generation with recurrent neural networks - hexahedria
http://www.hexahedria.com/2015/08/03/composing-music-with-recurrent-neural-networks/
======
pierrec
As a student of classical music I'd like to point out what I believe to be the
main inhumanness in the generated music that makes it stray from "real" classical.

Some have cried violation of counterpoint, but I don't think that is such an
issue here. CP is mostly about how notes are linked together on a small scale,
and this is rather the network's strong point: it seems to concentrate on the
associativity between notes and between chords, and any given moment seems to
be composed with a vision that only extends to a few bars (or even just a few
beats) around that moment.

The main problem is therefore large-scale structure. For one, recurrence of
melodies (at key moments and transpositions) is crucial to creating the
emotional value of classical music. None of this appears here.

And secondly, possibly the greatest shortfall of the present neural network
(I'm ignoring performance, of course) is the harmonic structure. Classical
music, let's say later than medieval and earlier than late-romantic in style,
generally has the harmonic structure of a recursive cadence. Harmonic cadences
are what give emotional power to harmony, but this NN is painfully incapable
of creating any.

That being said, I don't think this problem is inherent to the approach of
creating music with NNs. Right now it sounds like what you'd get with a well-
crafted Markov chain, but NNs _can_ go beyond this, and this article is
exactly the kind of thing that will instigate this evolution.

~~~
bane
I agree, and I kind of feel like there should be a couple of different layers
of generation.

A "planning" layer that lays out the song plan (ABACABBA, etc.)

A composing layer that fills in those sections. And maybe even generates some
slight differences between the same-named sections for variety.

A performance layer that plays it back with a simulation of human performance
metrics (slight jitter to note placement, emotive crescendos, suggestive
variations in note-length, etc.).
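A minimal sketch of those three layers (the section names, pitch mapping, and jitter value here are all hypothetical, just to make the idea concrete):

```python
import random

random.seed(0)  # reproducible for the example

def plan_form(sections="ABC", length=8):
    """'Planning' layer: lay out a song plan such as ABACABBA."""
    return "".join(random.choice(sections) for _ in range(length))

def compose_section(name, bars=4):
    """'Composing' layer: fill a named section with MIDI-style pitches.

    Small random offsets give slight differences between repeats of
    the same-named section, as suggested above.
    """
    base = {"A": 60, "B": 64, "C": 67}[name]
    return [base + random.choice([-1, 0, 1]) for _ in range(bars)]

def perform(notes, jitter=0.02):
    """'Performance' layer: add slight timing jitter to note onsets."""
    return [(note, beat + random.uniform(-jitter, jitter))
            for beat, note in enumerate(notes)]

plan = plan_form()
piece = [note for name in plan for note in compose_section(name)]
performance = perform(piece)
```

Each layer only sees the output of the one above it, so the plan constrains the composition and the composition constrains the performance.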

~~~
visarga
Maybe this kind of thing can also be learned by a secondary NN. It just needs
to be trained with data collected over large scale sections of the example
music.

But this NN doesn't solve the greatest problem in Classical music: that only
3% of people take the time to appreciate it.

~~~
p1mrx
Let's design a neural network to appreciate classical music, then spawn a few
billion instances. Greatest problem: solved.

~~~
octatoan
[http://xkcd.com/1546/](http://xkcd.com/1546/) comes to mind.

------
mathetic
It is hard to look at this post and its results and not be reminded of Lady
Lovelace's quote from nearly 200 years ago.

"[The Analytical Engine] might act upon other things besides number, were
objects found whose mutual fundamental relations could be expressed by those
of the abstract science of operations, and which should be also susceptible of
adaptations to the action of the operating notation and mechanism of the
engine...

Supposing, for instance, that the fundamental relations of pitched sounds in
the science of harmony and of musical composition were susceptible of such
expression and adaptations, the engine might compose elaborate and scientific
pieces of music of any degree of complexity or extent."[0]

Edit: s/300/200/. Thanks to icebraining I stand corrected.

[0]
[https://en.wikipedia.org/wiki/Ada_Lovelace#Conceptual_leap](https://en.wikipedia.org/wiki/Ada_Lovelace#Conceptual_leap)

~~~
mafribe
It's worth pointing out that the idea of having automata make music predates
Lovelace. Music-making automata were a staple of the Renaissance. For example,
the mathematician and astronomer Johannes Kepler, when visiting the
"Kunstkammer" of Rudolf II in 1598, was amazed at an automaton representing a
drummer who could "beat his drum with greater self-assurance than a live one"
[1].

Incidentally, Kepler corresponded with Wilhelm Schickard on the latter's
"arithmeticum organum", the first ever proper mechanical calculator (it could
do addition, subtraction, multiplication and division).

Automating creativity was an idea with much currency in the Renaissance.
Indeed, some of the key advances in mechanical automata, which later evolved
into computers, were driven by the desire to automate creativity [2]. The
"conceptual leap" that some people lazily ascribe to Lovelace wasn't hers!

[1] Jessica Wolfe, "Humanism, machinery, and Renaissance literature".

[2] Douglas Summers Stay, "Machinamenta: The thousand year quest to build a
creative machine". Associated blog:
[http://machinamenta.blogspot.com](http://machinamenta.blogspot.com)

------
paraschopra
This is fantastic! I've been meaning to make an RNN to generate EDM.

For beginners, I'd like to summarize some excellent resources to start:

\- Coursera course by Andrew Ng (he explains everything magically. The best
course I've done online)
[https://class.coursera.org/ml-003/lecture](https://class.coursera.org/ml-003/lecture)

\- Neural Networks and Deep Learning neuralnetworksanddeeplearning.com/ (I
highly recommend trying to write your own backprop and MNIST dataset
classifier. I wrote mine in JS and it gave me a lot of confidence)

\- Oxford ML class (2015)
[https://www.cs.ox.ac.uk/people/nando.defreitas/machinelearni...](https://www.cs.ox.ac.uk/people/nando.defreitas/machinelearning/)
(perhaps the most recent MOOC on ML online. Things are progressing so fast
in deep learning that it's worthwhile to do multiple ML courses to get
different perspectives)

\- I also enjoyed [http://karpathy.github.io/2015/05/21/rnn-
effectiveness/](http://karpathy.github.io/2015/05/21/rnn-effectiveness/) for
inspiration, and the video
[https://www.youtube.com/watch?v=xKt21ucdBY0](https://www.youtube.com/watch?v=xKt21ucdBY0)
is a FANTASTIC summary. HN earlier pointed out that the course videos are out:
[http://cs224d.stanford.edu/syllabus.html](http://cs224d.stanford.edu/syllabus.html)

\- Again for inspiration into big challenges, this talk from Andrew Ng is a
good one
[https://www.youtube.com/watch?v=W15K9PegQt0](https://www.youtube.com/watch?v=W15K9PegQt0)

~~~
david-given
When char-rnn was announced I did some crude hacking and started throwing
MIDI files at it. It worked surprisingly well (although, TBH, still quite
badly), and it very quickly discovered avant-garde prog jazz:

[https://soundcloud.com/david-given/sets/procedural-music-
via...](https://soundcloud.com/david-given/sets/procedural-music-via-char-rnn)

------
state
When people started switching to online streaming services, like Spotify, I
always hoped that something like this would replace them. This isn't quite
there yet, and all we really have is the 'classical station', but wouldn't it
be great if there were just continuously generated genre-specific streams like
this?

In the future we can call it 'radio'.

Edit: Not sure why this seems to be attracting downvotes. Just to clarify: I'm
completely serious. I wish I could just tune in to this neural net for the
next few hours, and it strikes me as a perfect form of what we call radio now.

~~~
pierrec
xoxos, who's created some of the nicest algorithmic music generators out
there, asks us in a post [1] to imagine a future where creators of algorithms
are considered in the same way as today's musicians, and their algorithms are
considered much like today's recorded music. The audience/performers, instead
of playing an audio file, could "play" an algorithm.

I think we're still a few generations away from such a thing becoming
mainstream, but I'd love to be proven wrong!

[1]:
[http://www.kvraudio.com/forum/viewtopic.php?p=5210459#p52104...](http://www.kvraudio.com/forum/viewtopic.php?p=5210459#p5210459)

~~~
sixdimensional
When you said "play an algorithm", it reminded me of this idea I had where I
wanted to play back the execution of a running program and map the assembly or
IL to notes/frequencies/instruments/sound. Literally, "playing the code as if
it were music".

Then I had this strange thought - what if you could monitor the cacophony of
your running systems, and detect a problem, or a certain event, just by the
presence of a particular audio theme or tune. I bet an infinite loop would be
pretty annoying and obvious. Just as long as the server getting overloaded
doesn't sound like getting rick-rolled ("Never Gonna Give You Up" by Rick
Astley).
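A toy version of "playing the code" (entirely hypothetical: it sonifies a function's static Python bytecode rather than streaming assembly or IL from a live process):

```python
import dis

def opcode_melody(func, low=48, high=84):
    """Map each bytecode opcode of `func` to a pitch in [low, high).

    A real monitoring system would stream instructions from a running
    program; this just turns one function's bytecode into notes.
    """
    span = high - low
    return [low + instr.opcode % span
            for instr in dis.get_instructions(func)]

def example(x):
    return x * 2 + 1

melody = opcode_melody(example)
```

An infinite loop would indeed be obvious here: the same short opcode sequence, hence the same riff, repeating forever.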

~~~
unimpressive
People did this with old computers that emitted lots of radio interference
like the PDP-1. It was possible to debug in exactly the way you're describing
by listening to a radio held up to the CPU.

[https://www.youtube.com/watch?v=XtKd-
TlYGuA](https://www.youtube.com/watch?v=XtKd-TlYGuA)

------
kastnerkyle
This is really cool stuff - the network structure reminds me a lot of Graves'
MDRNN[1] and Grid LSTM[2], as well as some work I helped with (ReNet [3])

I wonder if the structure over frequency/time is too "regular" - in general,
for sound, the frequency correlation and the time correlation are on wildly
different scales.

Also, if you are looking to go farther, you might consider adding NADE or an
RBM [4] on top, or latent variables in the hiddens [5][6], to add more
stochasticity.

There was some alternate work by Kratarth Goel extending RNN-RBM to LSTM and
DBN; it might give you some ideas to look at [7]. I know when we messed with
bidirectional LSTM + DBN for MIDI generation it led to the kind of
"jumbled/dissonant" sound you seem to be having - I don't know what to make of
it here. You might consider bi-directionality over the notes, though it makes
the generation way more annoying.

Awesome work! I will definitely be sharing around and checking out your code.

[1] [http://arxiv.org/pdf/0705.2011.pdf](http://arxiv.org/pdf/0705.2011.pdf)

[2] [http://arxiv.org/abs/1507.01526](http://arxiv.org/abs/1507.01526)

[3] [http://arxiv.org/abs/1505.00393](http://arxiv.org/abs/1505.00393)

[4] [http://www-etud.iro.umontreal.ca/~boulanni/ICASSP2013.pdf](http://www-
etud.iro.umontreal.ca/~boulanni/ICASSP2013.pdf)

[5] [http://arxiv.org/abs/1411.7610](http://arxiv.org/abs/1411.7610)

[6] [http://arxiv.org/abs/1506.02216](http://arxiv.org/abs/1506.02216)

[7] [http://arxiv.org/pdf/1412.6093.pdf](http://arxiv.org/pdf/1412.6093.pdf)

------
Xcelerate
The biggest reason I want this kind of thing to succeed is to see a piece of
artificially-generated music pass the "musical Turing test". If you play a
piece of generated music for someone (no matter how good it is), as long as
you tell them that an algorithm produced it, you'll always have _that guy_ who
tells you "it doesn't have the 'soul' that _real_ music has".

So make them prove it. Put them through numerous statistically significant
double-blind tests with human-vs-computer generated music. I think at some
point in the near future no one will be able to legitimately claim that
there's anything "magical" about human-produced art.

~~~
pierrec
I'm pretty sure David Cope has passed that point already with his algorithmic
techniques. However, the kind of formal double-blind that you suggest would be
difficult to put together in a way that everyone agrees upon unanimously. It's
also a problem that most of his algorithmically created music has never been
performed and only exists as poor mock-ups of a performance.

------
aidos
There's so much to love about this post - and I've only just glossed over the
details.

I'm really impressed with the quality of the music that's come out of this.

It feels like there's not that much in the way of dynamics in it (every note
is hit with the same force) - is that right? I suspect that these pieces,
played by a professional who could add more of the human element to the feel,
would sound really good. Obviously, that's sort of against the point, but then
again, Ravel wasn't much of a pianist (apparently!) but he could compose
amazing music – so it's not totally cheating.

~~~
hexahedria
Yeah, I simplified it by not using any dynamics in the generation process. I
could probably add some version of dynamics using the MIDI velocity, actually,
but I haven't done that yet. Also, I generated the mp3 files from MIDI using
GarageBand, which doesn't help with the flat dynamics.
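One simple way such dynamics could be added (a sketch of an assumption, not the author's code): map the network's per-note play probability to a MIDI velocity, so notes the model was more confident about are struck harder. The `lo`/`hi` bounds here are arbitrary choices:

```python
def probability_to_velocity(p, lo=40, hi=110):
    """Map a play probability in [0, 1] to a MIDI velocity.

    lo/hi keep the output inside a comfortable slice of the 1-127
    MIDI velocity range, avoiding inaudible or harsh extremes.
    """
    p = min(max(p, 0.0), 1.0)  # clamp out-of-range probabilities
    return int(round(lo + p * (hi - lo)))
```

For example, `probability_to_velocity(0.5)` lands midway between the bounds, at 75.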

------
rhema
This is really a great application of RNNs. Could you comment on how long it
took to train, once everything was set up?

Also, how long does it take to generate one of the songs, using the AWS
instance you describe in the article?

~~~
hexahedria
I trained it for about 24 hours, although there didn't seem to be that much
improvement after the first 12 or so. Generating a song with the trained
network actually happens almost in real time. I'm tempted to try to make it
continuously generate new music and stream it, but even the small cost for the
instance would start to add up, so I haven't actually tried setting that up
yet.

Another interesting data point: the learned set of weights ends up being about
15MB.

~~~
karpathy
Nicely done, but wait, 15MB?? Clearly the model isn't big enough :) Are you in
the under- or overfitting regime?

As I was listening through the samples it seemed to me that it would start out
quite energetic and then converge on repetitive, slow chords. Any ideas on why
this could be happening? Or perhaps it's not true.

Also, you should label the samples with numbers so that it's possible to refer
to them easily. I liked 4th from bottom quite a bit in the beginning.

~~~
strebler
What's wrong with 15MB? A GoogLeNet model is only ~50MB while being able to
recognize objects in real-world images with good accuracy.

~~~
conceit
Yes, and if you look at the images deep dream generates from that data, it's
not even close. This isn't about recognizing music.

------
sc00ter
Not quite classical - "jazzical" perhaps? There is a freeform nature to the
music that is more Free Jazz than classical, but the underlying classical
style of the input is still readily apparent.

It might be interesting to feed it Jazz instead. Jazz in Jazz out.

~~~
fezz
Oddly enough it has the characteristics of bad jazz - like porno music, but
classical.

------
espadrine
> _It ended up costing about $5 for all of the setup, experimentation, and
> actual training._

This is the most amazing piece of information. I bet there are tons of low-
hanging fruits that, thanks to the openness of academic papers on the subject,
cheap hardware and computational power, can provide phenomenal results in the
field, even for hobbyists.

------
crucialfelix
I'll read this properly in the morning, but my first impression: I think if
you want it to be classical music then it has to obey the rules of
counterpoint, and the pitches should wander and resolve according to those
rules. It's the polyphony and the interaction between the voices that sounds
wrong. I'm not sure that figuring out counterpoint is a suitable job for an
NN.

~~~
TheOtherHobbes
Yes. Sorry, but as a music geek, this is actually pretty terrible - not even
close to bad first year composition student pastiche, and a long way short of
David Cope's EMI, which is probably the current state of the art.

I wish coders who are trying to do something in a creative domain would _learn
the basics_ and not just assume they can throw some simple algorithms at an
artform and get anything close to an acceptable result.

No one is going to take a coder seriously if they can't code fizzbuzz.

Here's a thing to know: all the arts have their own equivalents. If you don't
know what they are, learn them. Then maybe you can start thinking about non-
toy algorithms and data structures that are going to impress an audience that
cares more about quality of output than implementation details.

Most people who start working with domain-specific knowledge find it's _much_
harder than they think.

~~~
habitue
This isn't interesting because it generates the best algorithmic music; it's
interesting because he trained it on data and didn't implicitly encode (very
much) music theory into it. In other words, the fact that it sounds musical at
all is due to the power of the neural network, and not to a carefully human-
curated set of music-theoretical constraints with an RNG selecting the
remaining free variables. It's learning music from reading music, and it's
generating music based on a model constructed from the data itself.

~~~
TheOtherHobbes
But that's not actually true. It's a note sequence mash-up machine, not a
music theory machine.

That's the point I'm making. You'd get similar-sounding results by taking
semi-random snippets of the source data and splicing them together with a tiny
bit of glue logic.

The NN is more or less doing that anyway, but by more roundabout means.

It's a _long_ way from there to being able to say that it has a non-trivial
model of classical theory.

------
ciconia
The title of the post as it appears here on HN is misleading. Actually the
original title does not mention classical music, and the original post as a
whole mentions the word _classical_ exactly once, when referring to the
Classical Piano Midi Page[0], from which he took the material he used to train
the neural network. So in a way the entire discussion here is misguided.

The post is interesting, I guess, for its discussion of neural networks, but I
fear that attempting to train an AI system for 24 hours to produce anything
that purports to imitate an artform developed by countless generations of
human beings is a bit pretentious.

As an aside, I find it a bit alarming that people are seemingly so eager to
generalize the term "classical music", without taking into account that it
refers to a field which is almost infinite in its diversity of forms and
styles.

[0] [http://www.piano-midi.de/](http://www.piano-midi.de/)

~~~
pierrec
I don't think the conversation is misguided. The meaning of a word varies with
context, and this neural network was not trained on, say, Hindustani classical
or twelve-tone music.

The vagueness of "classical music" can be annoying sometimes, but in this case
people approximately agree on a contextual definition. In the top comment, a
clarification is offered: "later than medieval and earlier than late-romantic
in style", which encompasses almost all of the set used to train the neural
network (and consequently, the style that the network is trying to reproduce).
Sure, it's a broad range, but the theory that can be used to study it is
surprisingly detailed and generalizable.

------
nuclearsugar
For anyone needing a collection of MIDI files to experiment with, here is an
excellent dump.

[http://www.reddit.com/r/WeAreTheMusicMakers/comments/3ajwe4/...](http://www.reddit.com/r/WeAreTheMusicMakers/comments/3ajwe4/the_largest_midi_collection_on_the_internet/)

------
mrob
Starting with classical music is far too ambitious. Music can be viewed as
having five major attributes:

\- Melody

\- Harmony

\- Timbre

\- Rhythm

\- Form/structure

In classical music typically all five are important. I think it would be
better to start with techno (the dance music subgenre, not electronic music in
general), where only rhythm and timbre are important. I think this has much
better chances of generating something enjoyable to listen to.

~~~
mehwoot
_where only rhythm and timbre are important._

The problem is, the fewer elements you have, the higher the importance of each
and the higher the quality needed to sound "good". When you look at electronic
music genres that focus only on timbre (often called sound design), people
spend years perfecting their craft. And whilst we have some broad notions of
how to construct pleasing melodies and harmonies without first listening to
them, I doubt anybody can construct new sounds and have them sound good first
go without a human ear to guide the process (which is what we are asking a
program to do). Sound design simply doesn't have the depth of analysis and
understanding that melody and harmony currently have.

------
ThomPete
What composition algorithms lack is not the ability to compose like humans but
a life that will give them perspective, and a story, a context, to compose
within.

Music is (dis)harmonies over rhythmic patterns. There isn't anything
inherently artistic about humans that computers can't replicate with time.
Even the ability to compose an original song isn't beyond the algorithms.

The irony is that performing musicians are actually striving for, but failing
at, reaching the perfection level that computers so naturally have.

And so, to make computers sound more human-like, they have algorithms that
make them more "sloppy".

Then again, a lot of music is really formulaic anyway, and computers are used
for most of it. In a few years there will be nothing to hinder some sort of
computer star from being born. But it's probably never going to connect with
us the same way another human can. Not for now, at least.

------
clessg
Off-topic, but: that's a cool background and site design in general.

~~~
hexahedria
Thanks!

------
raldu
Very impressive. With some effort it could also generate names for each song.
That the sample outputs are all the exact same length highlights the fact that
the songs are "generated".

------
arxpoetica
This is fantastic. Would love to hear training on something a bit more
contemporary, i.e., 20th century or beyond. Fascinating stuff.

------
jongraehl
just awful. david cope has for years been computer-generating in-the-style-of
(learning from scores) music that's actually decent (unless he's hoaxing us
and just writing it himself).
[https://www.youtube.com/watch?v=PczDLl92vlc](https://www.youtube.com/watch?v=PczDLl92vlc)

------
tunesmith
Interesting - it's pretty linear right now, but it could be much more
convincing if it also had some understanding of form, e.g. sonata-allegro form
- themes, recapitulation, etc. I wonder if the training could involve a
process that abstracts the seed data, similar to Schenkerian analysis, and
then extrapolates out from there.

------
bbrazil
If you're interested in this,
[https://www.youtube.com/watch?v=OTHggyZAot0](https://www.youtube.com/watch?v=OTHggyZAot0)
which was given at EuroPython 2012 on "Music Theory - Genetic Algorithms and
Python" is also good.

------
holri
In my mind the value of classical music is the emotional expression of a human
being. Substituting a neural network for the human removes that meaning of
classical music for me. Therefore it does not make sense to me.

------
Tycho
I've often wondered what would happen if someone developed a neural network
system (or similar) that could reliably produce great melodies. It would have
quite a profound impact on the history of music.

------
sengstrom
It would be interesting to hear what some of the sources are in the training
set. My first impression was Goldberg variations meets ragtime.

------
tehchromic
It's absolutely incredible. I wonder if this says something particular about
classical music versus popular music. Often classical music is appreciated for
its compositional complexity, but this reflects on the possibility that
sophisticated mathematical expression in music was idolized by a certain age,
and particularly an age in which massive and cheap computational power was
nonexistent.

What I am saying is that it would be harder for a neural net to produce a pop
song, than a sophisticated classical work, and if true that suggests that pop
music is sophisticated in ways that aren't easily quantifiable by mathematical
expression. I wonder if this is an 'evolutionary' pressure on music creation.

~~~
dfan
I still think it would be hugely easier to automatically produce a plausible
pop song than a piece of music from the Classical era (c. 1770-1820). The
output of this neural net is really neat but (at this point) there is no way
that it would ever be mistaken for an actual classical piece by an educated
listener, and it still seems pretty far from that. (David Cope has produced
some really impressive output, but I still think it would have taken a lot
less effort to produce pop music.)

~~~
tehchromic
I agree, if we are speaking about composition alone - for example, comparing
two pieces played on the same piano. But what I think is interesting about the
contrast is how pop music has evolved into a science of performance, and while
I think a computer synthesizer and algorithm can render a convincing
performance of Bach, the same can't be said of pop music. I know that we don't
have the real Bach here to perform, but that's kind of the point of what I am
getting at, which is the idea of what these musics actually are in cultural
math, if that makes any sense. I say it's easier to replicate the idioms of a
classical piece than of pop, because pop has complexity in performance, which
actually means recordance, if that word exists.

~~~
TheOtherHobbes
I say you're wrong - and I'm saying that as someone who has worked on both.

Bach, Mozart, and Beethoven are so complex that there are _no living human
composers who can produce convincing imitations_ - never mind machines.

There are a few people who can do clever improvisations in-the-style-of, but
that music completely lacks the big structures, broad relationships, and
metaphorical depth of the real thing.

In comparison, electronic pop styles are much more formulaic, and the forms
and changes are much more predictable.

I think we're less than ten years away from completely generative pop, vocals,
sound design, and all. Good computer generated classical music is going to
take quite a while longer.

~~~
tehchromic
Fair enough! I agree with your statements on the classic composers; however,
modern composition in its highest art forms has moved with the rest of art
towards abstraction, accident, pure invention, and contemplation of pure
noise. You might say the sound of reality is so complex that no living human
composer can produce a convincing imitation either! (in the role of devil's
advocate at the crossroads)

You might also say that all composers are doing clever improvisations in the
style of the sound of reality :}

I agree that pop is simpler and more formulaic, but I'd argue that the
practice can result in complexity that belies its seeming simplicity. While
simple, pop music is volatile, so a popular form one year might look entirely
different from a popular form the next. It isn't driven by the same stylistic
conformity as music in the classical periods was.

Pop music, when it well done, speaks in many layers, across cultural
realities. I know that Bach for example, did the same, but I'd also argue,
given the fact that composers are speaking on many levels, that many of those
cultural realities are lost to the modern listener, some of whom are
appreciating classical music in the mode you describe: an art of big
structures, broad relationships, and metaphorical depth about ideas and things
that they have no cultural basis to understand (and I think this idea goes a
way towards explaining the increasing fragmentation and deconstruction of
classical order in the high art of modern musical composition).

Therefore machinery might have an easier time reproducing classical music than
popular music in spite of it's simpler melodic formulas.

~~~
TheOtherHobbes
Pop has actually ossified over the last ten years or so. Dubstep is more than
fifteen years old now, and the Paul Van Dyk album I've just listened to sounds
a lot like the last Paul Van Dyk album a few years ago.

Then you get this:

[https://www.youtube.com/watch?t=34&v=WySgNm8qH-I](https://www.youtube.com/watch?t=34&v=WySgNm8qH-I)

and this:

[https://www.youtube.com/watch?v=FY8SwIvxj8o](https://www.youtube.com/watch?v=FY8SwIvxj8o)

and the now famous this:

[https://www.youtube.com/watch?v=5pidokakU4I](https://www.youtube.com/watch?v=5pidokakU4I)

and you start to wonder how much entropy there is to analyse.

Modern classical has gone the way of modern art. It's now basically a
marketing exercise. The musical experience is secondary.

But then I expect creative AIs to be better at marketing too...

~~~
tehchromic
Ahahah! Thanks for the clips and good points! And in spite of how formulaic
pop music is at the composition level, I still think it would be harder to AI
a convincing pop song than a classical piece, because so much of pop's
information is encoded in performance, which ironically is thanks to the
ability to record instead of notate. You are right that it would be much
harder to fool an advanced listener.

To me, the videos you shared say something about this whole AI problem and the
future of the music market, which is that interactivity and engagement with
creation in musical forms and performances will replace the traditional
performer/audience format. I think the problem of artificiality - which we
might as well call the ease with which music can be entirely reproduced by
algorithm - won't work in the traditional mode of performer/audience. Instead,
all the power to cast musical forms that are effective at entering and
altering the psyche of the individual listener will be sublimated and
abstracted just beyond their conscious intention, so that they can become a
player in the musical work. This is what happens now too, and it is why the AI
performer won't work except as a hoax in our current format - the
participation and identification with the singer are fundamental to the music
experience.

Those country singers are singing about manly things: mostly getting girls
into bed. You may be able to program a computer algorithm to crank out shallow
man tunes like that, but it will have to also be tuned in to the culture of
the times, because I can hear a lot of change in those country tunes compared
to the country tunes of ten years ago. There are layers and layers. But
primordially, there is identification with cultural realities, and these might
well be generated by machinery, and already are; however, they will
fundamentally align with the desires of the human creator, unless someone
creates an artificial consciousness to compete with our human one. So barring
that, musical AI will enhance and entice us towards a deeper and better human
experience, which the machine knows not!

------
alz
I implore you to delete this before Simon Cowell finds it.

------
wittedhaddock
This is very cool. Thanks so much for sharing.

------
avodonosov
Haven't read, but that's amazing!

~~~
avodonosov
But I've listened to the music example, and I have heard of recurrent neural
networks previously.

------
tunnuz
Great resource for indie game developers!

------
ommunist
OK, just 'an idea'. Music is human impression. So one who wants to train an NN
to impress humans with music must first train an NN to model human impression,
with all its limitations of perceived frequencies, reaction, and tolerance to
repeated patterns and so on. Then you may use the result as a limiting
envelope for the 'composing layer'. The good part is that you will probably
not need a human to estimate the "humanness" of the result.

------
mkempe
Thank you for the detailed essay and sharing the code.

Have you considered training with the works of one composer at a time?

~~~
hexahedria
I've definitely considered it, but so far I haven't found many single-composer
datasets that are large enough to train with. I also think it would be pretty
cool to try training it with music from specific musical periods.

~~~
gwern
I've wondered if it is possible to teach such subdivisions by simply including
them as metadata and then using the same metadata as primes. So for example,
you'd train your RNN on a big dataset of Bach, Mozart, etc, where each line of
music is prefixed "BACH |" and then when you went to generate samples, you'd
pass in as initial state "BACH |". Presumably the RNN would gradually learn
that "BACH |" samples sound different from "MOZART |" samples and would adjust
the conditional probabilities appropriately. Similarly if you wanted specific
time periods. (And if the style metadata tends to be forgotten even with the
LSTM, the metadata tag could be reinjected every _n_ steps.)

~~~
gwern
(The nice thing about this metadata hack, if it worked, is that you could
deploy variants of it without having to rewrite or modify your existing RNNs,
necessarily. For example, you could do this easily with 'char-rnn' by simply
using 'paste' or 'sed' to prefix some metadata to each line of the input file,
without any changes to 'char-rnn' itself, since it already reads in files and
has a '-primetext' option in generating samples. I've been meaning to try this
out.)
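A sketch of that preprocessing with sed (the filenames and note encoding below are made up for illustration; `-primetext` is char-rnn's actual sampling flag, as mentioned above):

```shell
# Hypothetical per-composer input files, one encoded line of music each.
printf 'C4 E4 G4\nD4 F4 A4\n' > bach.txt
printf 'G4 B4 D5\n' > mozart.txt

# Prefix every line with its composer tag, then concatenate.
sed 's/^/BACH | /' bach.txt > input.txt
sed 's/^/MOZART | /' mozart.txt >> input.txt

# At sampling time you would then steer generation with something like:
#   th sample.lua model.t7 -primetext "BACH | "
cat input.txt
```

No change to char-rnn itself is needed; the tag is just more characters for the model to condition on.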

------
the_cat_kittles
wow that is really cool. of course no one could confuse the results for a
human, but they are interesting to listen to, so clearly you are on to
something. every so often i see something along these lines, but this is by
far the best result, and most interesting write up. well done

~~~
Trufa
I mean, it might not fool the best of humans, but for someone at an
intermediate level - I'm sure most people couldn't notice anything "wrong"
with the first sample if they weren't told in advance.

I actually found them really enjoyable - I've genuinely been listening to them
- and, as the author says, except for the part where it stays in one chord for
a really long time, it's eerily similar IMHO.

