

Deep learning for assisting the process of music composition - albertzeyer
https://highnoongmt.wordpress.com/2015/08/11/deep-learning-for-assisting-the-process-of-music-composition-part-1/

======
pierrec
Well, this field is really exploding right now! I was curious about the
performance and searched around a bit: in another post, the author gives
a slightly more detailed explanation of how the tunes are automatically turned
into audio:

" _I convert each ABC tune to MIDI, process it in python (with python-midi) to
give a more human-like performance (including some musicians who lack good
timing, and a sometimes over-active bodhran player who loves to have the last
notes :), and then synthesize the parts with timidity, and finally mix it all
together and add effects with sox._ "

[https://highnoongmt.wordpress.com/2015/08/07/the-infinite-ir...](https://highnoongmt.wordpress.com/2015/08/07/the-infinite-irish-trad-session/)
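
The chain in that quote can be sketched roughly as below. This is only a guess at the shape of it, not the author's script: `abc2midi` as the ABC-to-MIDI converter, the file names, and the part names are all assumptions, and the humanizing step done with python-midi is left out. The function only builds the command lines; nothing is executed.

```python
from pathlib import Path


def render_commands(abc_path, parts=("fiddle", "bodhran")):
    """Return the shell commands (as argument lists) for one tune.

    Nothing is executed here; pass each list to subprocess.run to run it.
    """
    stem = Path(abc_path).stem
    # Assumption: abc2midi does the ABC -> MIDI conversion.
    commands = [["abc2midi", str(abc_path), "-o", f"{stem}.mid"]]
    wavs = []
    for part in parts:
        # Assumes a humanized MIDI file per part already exists
        # (the python-midi step from the quote, not shown here).
        midi, wav = f"{stem}_{part}.mid", f"{stem}_{part}.wav"
        # timidity -Ow writes a WAV file instead of playing the MIDI.
        commands.append(["timidity", midi, "-Ow", "-o", wav])
        wavs.append(wav)
    # sox -m mixes its input files; "reverb" stands in for "add effects".
    commands.append(["sox", "-m", *wavs, f"{stem}_mix.wav", "reverb"])
    return commands


for command in render_commands("tune.abc"):
    print(" ".join(command))
```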

The generation of tunes by the RNN is pretty nice and definitely the trending
topic, but I think I'm more impressed by the little performance script that
he's put together. The output is quite pleasant and I'm curious about the code
that generates the bodhran part. Hope this gets open-sourced!

 _(Off-topic to the guy who submitted this: thank you for making OpenLieroX
and turning my university into a chaotic LAN party on many an occasion.)_

------
dang
This was posted twice. We kept this thread as the earlier of the two, but
changed the URL to the more explanatory post. The other URL is
[http://www.eecs.qmul.ac.uk/~sturm/research/RNNIrishTrad/inde...](http://www.eecs.qmul.ac.uk/~sturm/research/RNNIrishTrad/index.html)
and actually plays the music. The other HN thread was
[https://news.ycombinator.com/item?id=10069007](https://news.ycombinator.com/item?id=10069007),
but we moved the comments here.

------
andkon
Fiddle player for the last 18 years: can confirm that this is pretty much what
most traditional music circles sound like (especially their endlessness).

~~~
stcredzero
At a workshop I attended, American piper Kieran O'Hare was once decrying the
typical American "intensification" pattern as applied to seisiúns. It becomes
about people's egos and shoehorning in their sets or showing that they can
keep up with the best. Really, it should be about friends socializing and
sharing music. Tunes-tunes-tunes-tunes is antisocial and a bit mechanical when
it comes down to it.

At another workshop, Kevin Creehan (Junior Creehan's grandson) noted how it
was strange for seisiúns to end up in pubs. Originally, they were in people's
houses. They were intimate gatherings of family and friends, and some of the
heights of artistry in Irish trad music clearly stem from this quieter and
more intimate context.

For the most part Bay Area seisiúns seem to include a healthy dose of
socializing and human interaction. Haven't been down to Mountain View yet. I
hear those guys play fast. O'Flarety's in San Jose is pretty raucous, but even
so, they manage to get in a good amount of socializing.

You techies that show up to The Plough and the Stars sessions -- know that you
are in one of the premier sessions in North America, both for the skill of
musicians and the health of the community that surrounds it. Don't expect to
be entertained, like it's a show for you. It's a gathering for the musicians
to share music with other musicians. If you like Irish trad, listen carefully,
because the very best music comes out when it's musicians sharing with their
friends. (Also listen patiently, because there's still a certain diversity of
musicianship.)

Also, when someone is singing a slow air, then it's good manners to be quiet
and listen! (And very bad manners if you don't!)

~~~
colomon
In Newfoundland, anyway, it wasn't just the tunes that used to be in people's
homes, it was the dances, too: "Mostly they'd have a dance in the kitchen. It
was all wooden floors. Some of the kitchens was a bit small so they'd have to
take out the stove, lift it outdoors in the yard for to make more room for
dancing...and they'd play all night and in the morning they'd bring in the
stove again for to boil the kettle and have a cup of tea." (Vince Collins)

~~~
stcredzero
Was also once the pattern for Irish traditional dancing, with even more
wrinkles on top of that. (Churches building the parish halls, and forwarding
Ceili dancing as a more chaste form of social dancing.) Newfoundland has
benefited in terms of the preservation of its cultural heritage through a
degree of isolation.

------
bane
I really appreciate the effort that went into the performance part of this
work. There was a real effort to try and make it sound like a reasonable
representation of humans playing...a little off beat, out of sync at times.
Instead of just hammering the notes out like I hear with lots of these
systems, it makes it listenable...I've had the endless trad on for 15 minutes
now in the background.

I also like how the basic structure of the musical forms has mostly carried
through the model. That seems to be a good "sniff test" of whether the model
is producing reasonable output: whether the musical structure makes sense as
well as the notes. It makes it feel like there was a little bit of planning.

Great work.

~~~
coldtea
> _I really appreciate the effort that went into the performance part of this
> work. There was a real effort to try and make it sound like a reasonable
> representation of humans playing...a little off beat, out of sync at times._

That said, there are certain ways that humans interact when they play together
(even if they track their parts independently).

Just being a little off beat randomly doesn't capture that, and can sound
just as fake as being perfectly on beat.

There are a few algorithms that model how actual players interact; here's a
relevant study:
[http://scitation.aip.org/content/aip/magazine/physicstoday/a...](http://scitation.aip.org/content/aip/magazine/physicstoday/article/65/7/10.1063/PT.3.1650)

There are several more for lifelike quantization/humanization.
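
To make the contrast concrete, here is a minimal sketch of two ways to generate timing offsets: independent jitter per note, and jitter where each deviation remembers part of the previous one. The second is a plain AR(1) process chosen for illustration; it is not the model from the linked study, and the function names and parameters are made up.

```python
import random


def independent_jitter(n, spread=0.01, rng=None):
    """Offsets (in seconds) drawn independently for every note."""
    rng = rng or random.Random()
    return [rng.gauss(0.0, spread) for _ in range(n)]


def correlated_jitter(n, spread=0.01, memory=0.9, rng=None):
    """AR(1) offsets: each deviation remembers part of the previous one."""
    rng = rng or random.Random()
    offsets, current = [], 0.0
    # Scale the innovation so the long-run spread stays near `spread`.
    step = spread * (1.0 - memory ** 2) ** 0.5
    for _ in range(n):
        current = memory * current + rng.gauss(0.0, step)
        offsets.append(current)
    return offsets


def lag1_autocorrelation(xs):
    """How strongly each offset resembles the one before it."""
    mean = sum(xs) / len(xs)
    num = sum((a - mean) * (b - mean) for a, b in zip(xs, xs[1:]))
    den = sum((x - mean) ** 2 for x in xs)
    return num / den
```

Independent offsets should give a lag-1 autocorrelation near zero, while the AR(1) offsets drift: a player who is late on one note tends to be late on the next.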

------
Yenrabbit
This is great! He is using deep learning as it should be used in regards to
music - not as the sole generator of songs (no technique is quite up to that
yet) but as a source of inspiration for a 'proper' musician, who can take its
output and do cool things with it. As a bagpipe player, I can hear ideas for
several new pieces among the output he posted! I see this in the same line as
IBM's 'Chef Watson' - great if a sufficiently skilled person is there to
supervise :) Good work.

------
raverbashing
It seems this is almost working in a musical sense. (The Irish
songs seem to be simple enough for the NN to understand it back to front, also
a lot of 'similar' samples definitely helps)

The NN seems to be able to assemble the repeat sections with different endings
and to produce a song with two distinct sections.

But they all seem to be in the same key and time signature.

~~~
mazelife
Your point about "Irish songs seem to be simple enough for the NN to
understand it back to front," is a good one. There are a number of aspects of
the compositional parameters here that I think make it much easier to generate
this kind of music.

Firstly, it's modal harmony, not diatonic. In this case, dorian mode in A. If
you've ever improvised in, say, a whole tone scale, you notice that you can
play almost anything and it sounds good. Modal music works in a similar way. A
lot of the things you find in diatonic harmony: the tension between tonic and
dominant, chord progressions in general, key changes, chromatic inflections,
etc....are all absent in this kind of music. Which isn't to say anything bad
about modal music, just that it's much simpler. Because of the added
complexity of diatonic harmony, there are many many more ways the music could
"go wrong" so to speak. Most people are so grounded in diatonic harmony that
they would easily perceive even small mistakes (or statistically speaking,
deviations from the norm) without necessarily being able to explain what rule,
exactly, is being broken.

It's also monophonic music; there's just the one voice and no accompaniment
(other than a completely static rhythmic accompaniment that was added to the
performance).

Finally, even within this much simpler framework, I'd argue this tune gets it
wrong in a big way: it doesn't know how to come to an end. It just sort of
stops, in medias res. In a lot of folk music, you'll find that the way a tune
is ended still tends to hearken back to diatonic harmony: some sort of motion
from dominant (e), maybe even with a raised 7th scale degree to tonic (a)
that's outlined by the melody. That doesn't happen here, which is why the tune
sounds like it just got cut off.

I find these NN experiments in music generation quite interesting
conceptually, but so far the results--as music--have been pretty
disappointing. I suspect that you could actually build a model that would
allow for algorithmic generation of folk tunes that would produce music that
would probably be more satisfying. The number of rules that govern a lot of
kinds of folk music are small enough that you could encode many or at least
most of them in your model. [1] However, at the end of the day you'd still
just have a model that would only generate a fairly limited spectrum of folk
music--say Irish jigs, reels, hornpipes, etc--whereas the dream with NNs,
Markov models and other statistical methods is that you could plug in any
corpus of songs without understanding a thing about their harmony, form,
structure, melodic patterns, etc. and get back music that sounds the same.

[1] and this is really massively simplifying on my part w/r/t the varied
amount of folk music out there, some of it quite complex

~~~
TheOtherHobbes
Folk tunes are more complicated than they sound. Jigs, reels, etc are all
_classes_ of tune, with broadly similar features. But there are further sub-
groupings based on date of composition, composer, and even location.

So if you feed an ML system a generic mix of folk tunes without understanding
how the subgroupings work, you'll get a messy blob of musical data out. It
will sound sort-of interesting in a work-in-progress way, but you will always
have to hand-edit it to get something acceptable. And even then it will
probably be mediocre rather than memorably great. And if it sounds at all
good, you'll likely find you've created a mashup machine, not a true composer.

Really, it's like training an ML on "ballads". You'll get a few features that
are similar, but everything else will be too noisy to be anything other than a
crude attempt.

So I think good musical imitation is probably a lost cause, because the rules
are so complex and contingent, even for "simple" music, that there simply
isn't enough consistency to do the job.

At the same time the differences from the template create recognisable styles,
which have emotional and other associations. So the differences are
significant in their own way - but even noisier as a recognition problem.

------
vonnik
People interested in the history of computer-generated music should look into
David Cope's experiments in musical intelligence:
[http://artsites.ucsc.edu/faculty/cope/experiments.htm](http://artsites.ucsc.edu/faculty/cope/experiments.htm)

------
archagon
This is really interesting, but it also makes the failure points of computer
generated music evident. Namely: it works best with simple, "jammy" music; it
requires a large existing base of music to pull from; and it has no way of
using compositional techniques for rhetorical effect. (Increasing/decreasing
tension, "going somewhere", expressing emotions, etc.) In other words, it
can't create innovative music or music that has something interesting to say.
It can only recombine the past.

I like the idea of using these tools for educational purposes, however. Can we
derive the "rules" for music of different cultures by feeding them through
this kind of algorithm? If so, it would give us a fantastic insight into
different musical traditions around the world, even if it couldn't write that
music for us.

~~~
eru
> In other words, it can't create innovative music or music that has something
> interesting to say. It can only recombine the past.

So far. I hope we'll be seeing progress on that front in the future, too.

------
kephra
The generated snippets have a lot of pentatonic intervals. Of course
pentatonics make music more harmonic, but we are used to more dissonance
adding color. I cannot look at the sheet music, but I guess the generated
music has less dissonance than original Irish fiddle.

------
maxki
it has a lot of similarity with human made trad music, but to my ear it sounds
very different from music. something is definitely missing. Inflatable dolls
look like the real thing, somewhat, on the surface... still a long way to go
for synthetic music to fool a musician's ear, IMHO...

------
chazu
Apologies if this is a stupid question, but what is meant by ABC tune in this
context? It seems to be a method of musical notation?

~~~
boblsturm
Yep! [http://abcnotation.com/](http://abcnotation.com/)
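
For a quick picture of what ABC looks like, here is a small sketch. The tune is made up for illustration (it is not from the generated set); the header letters are standard ABC fields (`X:` reference number, `T:` title, `M:` meter, `L:` default note length, `K:` key, which closes the header). The parser is a hypothetical helper that only separates header from body and ignores most of the notation.

```python
TUNE = """X:1
T:Example tune (made up for illustration)
M:4/4
L:1/8
K:Ador
|:EAAB cBAG|EGGA BAGE:|
"""


def parse_abc(text):
    """Split an ABC tune into header fields and the tune body."""
    headers, body = {}, []
    in_body = False
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Header lines look like "K:Ador"; K: is the last header field.
        if not in_body and len(line) > 1 and line[1] == ":" and line[0].isalpha():
            headers[line[0]] = line[2:].strip()
            if line[0] == "K":
                in_body = True
        else:
            body.append(line)
    return headers, "".join(body)


headers, body = parse_abc(TUNE)
print(headers["M"], headers["K"], body)
```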

------
scottlocklin
You could use LZW or prefix trees for assisting music composition. Rather less
mystification with those tools though.

------
tonetheman
This type of stuff is super cool.

------
pierrec
Submitting this twice simultaneously may be causing more confusion than
anything. I didn't see this one at first and only commented on the other one
(which seems to be magically higher on the front page):

[https://news.ycombinator.com/item?id=10069007](https://news.ycombinator.com/item?id=10069007)

~~~
dang
Thanks. We've now merged the threads.

------
monochromatic
I thought web pages that automatically played sound went out with the
nineties.

------
boblsturm
Thanks everyone! This is Bob L. Sturm.

Pierrec and bane: My scripts are just a hodgepodge of bash and python. Happy
to share (email me). I do not care much for MIDI piano; and since this music
is typically monophonic, why not just use all the typical instruments? I
generate the bodhran part from the MIDI and randomly choose to play a note,
which kind of note to play, and whether to double up a note. I also give to
each player the option to be late or early. Sometimes it gets a bit much, but
is fun nonetheless.
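
A rough sketch of that idea, for anyone curious before the scripts are shared. This is not the actual code: it works on plain lists of note onsets (in beats) rather than python-midi objects, the stroke names and probabilities are invented, and "late or early" is assumed here to mean one constant offset per player.

```python
import random

STROKES = ("down", "up", "rim")  # hypothetical stroke names


def bodhran_part(onsets, play_prob=0.8, double_prob=0.2, step=0.25, rng=None):
    """Derive a drum part from melody note onsets (in beats).

    For each melody onset: maybe play, pick a stroke kind at random,
    and maybe double it with a second hit half a step later.
    """
    rng = rng or random.Random()
    hits = []
    for onset in onsets:
        if rng.random() >= play_prob:
            continue  # the drummer sits this note out
        hits.append((onset, rng.choice(STROKES)))
        if rng.random() < double_prob:
            hits.append((onset + step / 2, rng.choice(STROKES)))
    return hits


def shift_player(onsets, max_offset=0.03, rng=None):
    """Make one player uniformly late or early for the whole take."""
    rng = rng or random.Random()
    offset = rng.uniform(-max_offset, max_offset)
    return [t + offset for t in onsets]
```

Raising `double_prob` gives the over-active player; raising `max_offset` gives the musicians who lack good timing.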

bane: One of the reasons why I got into this music generation work is that it
provides a sanity check of the internal models, just like speech recognition:
look at the transcription to confirm the model is paying attention to relevant
aspects. You may be interested in my "horse" article: "A Simple Method to
Determine if a Music Information Retrieval System is a “Horse”"
([http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6...](http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6847693)).
Another reason is that I like to compose, I like traditional music, and here
we go!

Yenrabbit: Thank you! I agree completely. It is quite hopeful to believe a
single artificial system will produce "music." It is merely shifting
characters around according to a probabilistic model in light of constraints
it has learned (such as four quarter notes to a measure in common time), and it
is up to musicians to "realise" it. Certainly, a lot is missing from the
reduction to ABC. Music is much more than a sequence of arbitrary symbols. :)
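
To illustrate "shifting characters around according to a probabilistic model", here is a deliberately tiny stand-in: a character-level Markov (n-gram) sampler over ABC text. It is not the RNN used for these tunes, only the simplest model of the same kind, and the function names and training string are made up.

```python
import random
from collections import Counter, defaultdict


def train(text, order=3):
    """Count which character follows each `order`-character context."""
    counts = defaultdict(Counter)
    for i in range(len(text) - order):
        counts[text[i:i + order]][text[i + order]] += 1
    return counts


def sample(counts, seed, length=40, rng=None):
    """Extend `seed` one character at a time from the learned counts.

    `seed` must be exactly as long as the order used in train().
    """
    rng = rng or random.Random()
    order = len(seed)
    out = seed
    for _ in range(length):
        followers = counts.get(out[-order:])
        if not followers:
            break  # context never seen in training
        chars, weights = zip(*followers.items())
        out += rng.choices(chars, weights=weights)[0]
    return out


corpus = "|:EAAB cBAG|EGGA BAGE:|" * 3
print(sample(train(corpus, order=3), seed="|:E", length=20))
```

The output only ever contains character sequences the model has seen, which is the sense in which such a system recombines symbols and leaves the music to whoever plays it.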

All: you can browse all the tunes so far generated here:
[http://www.eecs.qmul.ac.uk/~sturm/research/RNNIrishTrad/Sess...](http://www.eecs.qmul.ac.uk/~sturm/research/RNNIrishTrad/Session/)

Many thanks for your comments!

