
Music Language Modeling with Recurrent Neural Networks - yoavz
http://yoavz.com/music_rnn/
======
iammyIP
As usual* the result sounds awfully unstructured and unenjoyable, and could
just as well be achieved by a random walk through a musical scale, since if
you put some basic music theory in program form, you can get these small
harmonic structures pretty easily. (*Projects like this seem to pop up every
couple of months.)
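
For reference, a random walk through a scale can be sketched in a few lines. This is only a minimal illustration: the C major scale, the step size, and the two-octave clamp are assumptions made for the example, not anything taken from the linked project.

```python
import random

# C major as semitone offsets from the tonic (an illustrative choice).
C_MAJOR = [0, 2, 4, 5, 7, 9, 11]

def random_walk_melody(length, tonic=60, scale=C_MAJOR, max_step=2, seed=None):
    """Return `length` MIDI note numbers from a random walk over scale degrees."""
    rng = random.Random(seed)
    degree = 0  # start on the tonic
    notes = []
    for _ in range(length):
        # divmod also handles negative degrees, i.e. notes below the tonic.
        octave, idx = divmod(degree, len(scale))
        notes.append(tonic + 12 * octave + scale[idx])
        # Move up or down by at most `max_step` scale degrees.
        degree += rng.randint(-max_step, max_step)
        # Keep the walk within two octaves either side of the tonic.
        degree = max(-2 * len(scale), min(2 * len(scale), degree))
    return notes

melody = random_walk_melody(16, seed=1)
```

Every note stays in the scale and neighbouring notes are close together, which is roughly the kind of small-scale structure described above, with no long-term form at all.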

Like other people already mentioned, the part that usually gets neglected is
the overarching dramatic structure of a musical piece. Compare a complete
Shakespeare play to a pile of randomly thrown-together half-sentences.

I don't fully understand the fascination with generating music through some
AI-neural-learning buzzword-bingo technique that always gets kickstarted by
brute-force analysing a human-made music corpus.

What in a musical sense is much more interesting is to generate _new_ music
that cannot be composed by a human, and cannot be played by a human. That's
playing to the strength of the machines. Sonification of large datasets,
sonification of function behaviour. Sonification of the binary world, that's
so different to ours. This is much more interesting than the 10th failed
emulation of a simple folk song.
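
As a rough sketch of what sonifying function behaviour could mean in practice, the snippet below maps a numeric series linearly onto MIDI note numbers. The function names, the note range, and the choice of the logistic map as input are all assumptions made for illustration.

```python
def sonify(values, low_note=48, high_note=84):
    """Map numbers linearly onto MIDI note numbers between low_note and high_note."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # Constant data: everything lands in the middle of the range.
        return [(low_note + high_note) // 2] * len(values)
    return [round(low_note + (v - lo) / span * (high_note - low_note))
            for v in values]

def logistic_orbit(r, x0, n):
    """First n iterates of the logistic map x -> r * x * (1 - x)."""
    xs, x = [], x0
    for _ in range(n):
        xs.append(x)
        x = r * x * (1 - x)
    return xs

# r = 3.9 lies in the range usually described as chaotic for this map.
notes = sonify(logistic_orbit(3.9, 0.5, 64))
```

The same `sonify` helper would accept any list of numbers, so a dataset column could be substituted for the logistic map.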

Nevertheless, as a student's piece about programming neural networks, it's
certainly ok, the presentation is nice, but the result is uninspiring, like
building a car tire out of bananas, just because it's possible. Just let the
folk songs belong to the actual folk.

As a side note: what would happen if the result were millions of super nice
catchy folk tunes on a button press? Would it be the end of pop music as we
know it? Maybe I'd retract my opinion.

~~~
arthur_pryor
here's something along those lines that i'd meant to check out but never got
around to: [https://soundcloud.com/ibm/sets/remixes-made-with-tennis-
dat...](https://soundcloud.com/ibm/sets/remixes-made-with-tennis-data)

generative music based on data from the 2014 US open [0], remixes by james
murphy [1] (of lcd soundsystem).

slightly funny to me that ibm has such a full soundcloud page.

[0] [https://soundcloud.com/ibm](https://soundcloud.com/ibm) (non-remix
versions further down the page)

[1] [http://pitchfork.com/reviews/albums/20103-remixes-made-
with-...](http://pitchfork.com/reviews/albums/20103-remixes-made-with-tennis-
data/)

~~~
iammyIP
That is certainly more musically interesting from the perspective of what can
be achieved with computers. However, translating that tennis data into
electronic music involved a lot of artistic freedom, and much more precise
information is needed on how the translation was done. How else, if not by
human intervention, could there be a 4/4 techno beat and an offbeat hi-hat? A
much better interpretation would take the quite irregular time-based counting
structure of tennis (point, advantage, game, set) and interpret that as a
rhythm, without the boring techno bed. The sounds here are also much more
fitting, since they are synthetic and adequately mixed.

Maybe all the cases of 'music done with some neural machine learning,
presented as awful-sounding MIDI piano renderings' exist simply because music
is a universally liked phenomenon that can be represented as pitch over time,
and it's appealing for research students in this specific field of programming
to take this as an anchor for their experiments.

As a general tip for these projects: if you take music as a main plot point,
then learn some basic music DSP and render your experiment with well-behaved
sine waves, or get a DAW and put a nice preset sound on it. Basically,
put some minimal effort into the actual musical presentation. Awful music is
much more unbearable than awful graphics.
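
Rendering with sine waves takes very little code. The sketch below uses only the Python standard library; the sample rate, note length, amplitude, and fade time are assumed values, and the short linear fade is one simple way to avoid clicks at note boundaries.

```python
import math
import struct
import wave

SAMPLE_RATE = 44100  # samples per second

def midi_to_hz(note):
    """Equal-tempered frequency of a MIDI note number (note 69 = A4 = 440 Hz)."""
    return 440.0 * 2 ** ((note - 69) / 12)

def render_note(note, seconds=0.3, amplitude=0.4, fade=0.01):
    """Return float samples in [-1, 1] for one sine tone with a short fade in/out."""
    n = int(SAMPLE_RATE * seconds)
    fade_n = max(1, int(SAMPLE_RATE * fade))
    freq = midi_to_hz(note)
    samples = []
    for i in range(n):
        # Linear fades avoid the clicks that a hard-edged sine produces.
        envelope = min(1.0, i / fade_n, (n - 1 - i) / fade_n)
        samples.append(amplitude * envelope
                       * math.sin(2 * math.pi * freq * i / SAMPLE_RATE))
    return samples

def write_wav(path, notes):
    """Render a list of MIDI notes back to back into a 16-bit mono WAV file."""
    frames = bytearray()
    for note in notes:
        for s in render_note(note):
            frames += struct.pack("<h", int(s * 32767))
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes = 16 bits per sample
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(bytes(frames))
```

Calling `write_wav("melody.wav", [60, 62, 64, 65, 67])` would write a short ascending phrase; the file name is just an example.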

------
albertzeyer
I'm quite sure I have seen other attempts to generate music with RNNs
recently, although I don't remember exactly anymore. You don't cite that many
references to other approaches, only the one from Boulanger-Lewandowski from
2012.

I did a quick search, and I'm probably missing a lot, but I found these:

[http://papers.nips.cc/paper/5655-deep-temporal-sigmoid-
belie...](http://papers.nips.cc/paper/5655-deep-temporal-sigmoid-belief-
networks-for-sequence-modeling)

[http://dl.acm.org/citation.cfm?id=2806383](http://dl.acm.org/citation.cfm?id=2806383)

[http://gitxiv.com/posts/WEoQCj8hxHz6vPxe6/gruv-
algorithmic-m...](http://gitxiv.com/posts/WEoQCj8hxHz6vPxe6/gruv-algorithmic-
music-generation-using-recurrent-neural)
[https://github.com/MattVitelli/GRUV](https://github.com/MattVitelli/GRUV)

[http://www.hexahedria.com/2015/08/03/composing-music-with-
re...](http://www.hexahedria.com/2015/08/03/composing-music-with-recurrent-
neural-networks/)
[https://news.ycombinator.com/item?id=10028878](https://news.ycombinator.com/item?id=10028878)

I haven't really looked into any of these, so I'm not sure about the
differences. But it would be good if you cited some other relevant work and
pointed out the differences.

------
romaniv
In my opinion, anyone who works on music generation should take a look at
Karma ([http://karma-lab.com/](http://karma-lab.com/)) for a baseline of what
can be achieved by simple math and plain old programming. Probably not
particularly interesting from a programmer's perspective (it's closed-source and
to the best of my knowledge doesn't use anything fancy), but the end results
are spectacular and used in real music.

~~~
iammyIP
This was new around the late 90s, I think. Yes, it's a good example of how
80s-style programming with knowledge of music theory can give quite nice
mega-arpeggios.

But this also misses the point, since the strength of our digital machines is
not generating the same old musical variations that we could just as well
compose ourselves if we weren't so lazy, but showing us music and sounds that
we cannot compose and play by hand. The machine music should tell something
about itself, not try to mimic us.

~~~
bubtubgub
A similar approach but with a different goal, that of finding and showing
the sound patterns that make music attractive to humans and other animals,
could be very useful.

~~~
iammyIP
For monetisation, yes, but for nothing else.

~~~
bubtubgub
Why nothing else? It could help us understand more about the human mind.

------
aklein
I would love to see a real-time bebop improvisation generator in the style of
Charlie Parker, Sonny Rollins, Bill Evans, et al. - bebop is definitely a
musical (jazz) language that I bet would be well suited to RNNs.

~~~
dharma1
I'd be interested in jazz harmonisation of an input melody line. Not band-in-
a-box level results but something that actually sounds like it was done by a
competent arranger, like this - [https://www.youtube.com/watch?v=Eaqf-
wRSx7E](https://www.youtube.com/watch?v=Eaqf-wRSx7E)

It feels like it would be an achievable goal, given the right kind of training
material.

Maybe hiring a session pianist for a few days to harmonise a bunch of
key/tempo normalised jazz standards on a midi keyboard, so that the
harmonisations and melody input are separate, labeled data?
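
A minimal sketch of what key/tempo normalisation might look like, assuming each recording comes with a known tonic and tempo (which would have to be supplied by hand or by some other tool). The `normalise` function and its note tuple layout are hypothetical, chosen only for this example.

```python
def normalise(notes, tonic_pc, bpm):
    """Transpose to C and rescale time from seconds to beats.

    notes:    list of (onset_seconds, duration_seconds, midi_pitch)
    tonic_pc: pitch class of the tune's tonic (0 = C, 7 = G, ...)
    bpm:      tempo of the recording in beats per minute
    """
    # Pick the smallest transposition (-6..+5 semitones) that lands on C.
    shift = -tonic_pc if tonic_pc <= 6 else 12 - tonic_pc
    seconds_per_beat = 60.0 / bpm
    return [(onset / seconds_per_beat, dur / seconds_per_beat, pitch + shift)
            for onset, dur, pitch in notes]

# Two notes of a tune in G at 120 bpm become two one-beat notes in C.
in_c = normalise([(0.0, 0.5, 67), (0.5, 0.5, 71)], tonic_pc=7, bpm=120)
```

Melody and harmonisation tracks would go through the same function separately, so they stay aligned in beats and share a key.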

------
cel1ne
You could feed Lilypond [0] files into the network. You might gain more long-
term structure that way.

It looks like this, you can do repeats and everything else:

    
    
      \new Voice \with {
        \consists "Ambitus_engraver"
      } \relative c' {
        \voiceTwo
        es4 f g as
        b1
      }
    
    

[0] [http://lilypond.org/](http://lilypond.org/)

~~~
iammyIP
That is a notation program. Why not just feed in MIDI files directly?

~~~
jimm
The Lilypond input would allow the NN to discover structure (repeats,
sections, voices). A MIDI file doesn't have any concept of repeats or
sections.
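
To illustrate the difference, here is a toy example. The nested `("repeat", count, body)` tuple is a made-up stand-in for a notation-level construct such as Lilypond's `\repeat`; it is neither a Lilypond parser nor a MIDI reader. Unrolling it gives the flat event list that a MIDI-style representation would contain.

```python
def unroll(tokens):
    """Expand ("repeat", count, [body...]) entries into a flat note list."""
    flat = []
    for tok in tokens:
        if isinstance(tok, tuple) and tok[0] == "repeat":
            _, count, body = tok
            # The body may itself contain nested repeats.
            flat.extend(unroll(body) * count)
        else:
            flat.append(tok)
    return flat

# Structured view: the repeat is an explicit symbol the network can see.
structured = [("repeat", 2, ["c4", "d4", "e4", "c4"]), "g2"]

# Flat view: the same music, but the repetition is only implicit.
flat = unroll(structured)
```

In the flat list a model has to rediscover that the first four notes occur twice; in the structured list that fact is a single token.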

~~~
iammyIP
I see.

------
varelse
Both of the pieces at the top of the article sound like off-key broken record
renditions of the main refrain of "Jesu, Joy of Man's Desiring" to me. It's
like the RNN cannot hold enough state to express the structure of a real
musical piece and it just emits riffs here and there of main themes from its
training set.

What would be somewhat impressive is if it spontaneously figured out the note
sequence I hear from observing its re-expression in bits and pieces from
various jigs and folk pieces in its training set, kind of like this:

[https://www.youtube.com/watch?v=XPLp_gInC-o](https://www.youtube.com/watch?v=XPLp_gInC-o)

~~~
pitchka
An LSTM can hold as much state as you want.

It would make more sense to force the leitmotif and generate the rest of the
song instead of generating from a random note.
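
The idea of forcing a motif and generating the rest can be sketched with any sequence model. To keep the example self-contained, the code below swaps in a first-order Markov chain (a bigram table) for the LSTM, so it only conditions on the last seed note; an actual RNN would be fed the whole motif to set up its hidden state first. The function names and the tiny corpus are illustrative assumptions.

```python
import random
from collections import defaultdict

def train_bigram(melodies):
    """Record, for each note, every note that followed it in the corpus."""
    table = defaultdict(list)
    for melody in melodies:
        for a, b in zip(melody, melody[1:]):
            table[a].append(b)
    return table

def generate(table, seed_notes, length, rng=None):
    """Keep `seed_notes` as the opening and continue to `length` notes."""
    rng = rng or random.Random()
    melody = list(seed_notes)
    while len(melody) < length:
        choices = table.get(melody[-1])
        if not choices:
            break  # no continuation was seen in the training data
        melody.append(rng.choice(choices))
    return melody

corpus = [[60, 62, 64, 62, 60], [60, 64, 67, 64, 60]]
table = train_bigram(corpus)
motif = [60, 62, 64]
tune = generate(table, motif, 12, random.Random(0))
```

The output always begins with the given motif, and everything after it is sampled from the model.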

------
studentrob
Cool. Anyone have an opinion on "state of the art" for music generation? I
realize this is entirely subjective. This one sounds pretty interesting! It'd
be awesome to get something like this on a top 10 list and start influencing
man-made music. We can't be so far off from that. The kids these days love
techno and that is easily synthesized relative to music with original lyrics
and voices.

~~~
6stringmerc
Well there are a couple different paths, if I can offer up a bit of
perspective.

There's the "generated" music concept sort of like this, that basically
creates the piece from zero to finished product. As in, there are tones and
sounds and maybe some rhythm in it. Basically it makes a track. There was a
post here recently about a 'brain support' music generation program/service
thing, and I'm pretty sure the sounds they use would fit in the above
description.

The other concept is "element" music generation. This would be a plug-in or
software piece that works for a specific instrument. Apple's GarageBand has
Drummer[1] and I've had good results using it so far. I think there are others
on the market and different examples of a similar concept, like Instant
Haus[2]. These aren't stand-alone music generation pieces, but resources upon
which to build into a whole.

[1]
[https://www.youtube.com/watch?v=-pwlKgS43gM](https://www.youtube.com/watch?v=-pwlKgS43gM)

[2] [https://www.ableton.com/en/packs/instant-
haus/](https://www.ableton.com/en/packs/instant-haus/)

~~~
studentrob
Cool thanks. I find this area really interesting and relatively unexplored to
the degree that profitable ventures are sometimes pursued. Yet it seems to me
there would be a market for it. Not that there needs to be. But could be, and
will be at some point.

------
6stringmerc
Fascinating piece of research and the details in the write-up managed to click
mostly even though I know it's a level far above my head. Well done and glad
to have come across it.

