Markov Composer – Using machine learning and a Markov chain to compose music (zx.rs)
68 points by vsakos on April 25, 2015 | 32 comments



I got really into Markov chains a while back and started analyzing Jazz solos, but using the current harmony, beat, and note durations as part of the state I tracked.

The results were promising, I thought. Though I wouldn't call them great (they tend towards long, fast runs that are statistically possible but unrealistic), they do have a touch of musicality that simpler analyses often lack.

It's all old Lisp code that I need to resurrect, but I found an .mp3 of a blues solo generated from an analysis of Coltrane's Blue Train: https://soundcloud.com/spankalee/markov-blue-train


That's cool! Sounds a bit like Allan Holdsworth! But that's a given since "they tend towards long, fast runs that are statistically possible, but unrealistic" is a description that fits some of his solos quite well :) https://www.youtube.com/watch?v=BNAHOcZ6NxM&feature=youtu.be...


How'd you source the training data from Blue Train? I.e., did you manually transcribe the solos, or, if it was done automatically, what software did you use?


I forget exactly where, but I had just googled for transcriptions of Coltrane, Parker, etc. solos. This was one of the best I found.

I think getting that high-quality data is one of the hardest things about this problem if you're interested in Jazz.


It would be better to feed this tool with hooktheory's API containing common chord progressions...

https://news.ycombinator.com/item?id=9394176


[deleted]


Interesting! Do you have any demos to listen to?


Well, another clear argument that 'composing music' is far more complicated than sequencing notes.


Indeed.

Music is not a list of numbers produced by a statistical distribution.

Structural relationships are global, not local. Markov chains are nowhere near smart enough to model that.


It depends on how global they are exactly; a Markov chain of large enough degree could be good enough. There's nothing stopping anyone from using additional global features when conditioning in the chain, but that would require quite a bit of data to construct a good distribution. Nonetheless, Markov chains can be pretty smart.

Or, if you want even better performance, you can use conditional random fields with exactly the same global features. One advantage is that you don't need as much data, because the distribution being modelled isn't a joint one, and CRFs work well with custom features (features are observed variables, and their distribution is implicitly present in the conditional distribution). The disadvantage is that you can't generate the sequence of notes (chords) as easily, because the model isn't generative (unlike Markov chains), but you can use Gibbs sampling (combined with CRFs) to search over the space of probable sequences.

Or, some nicely trained convnets could get you even closer to the brain of the composer :D
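
To make the "large enough degree" point concrete, here is a minimal Python sketch of a chain whose state is the last `order` notes rather than only the last one. The function names and the toy melody are made up for illustration; nothing here comes from the linked article.

```python
import random
from collections import defaultdict

def train(notes, order):
    """Map each run of `order` consecutive notes to the notes seen right after it."""
    table = defaultdict(list)
    for i in range(len(notes) - order):
        context = tuple(notes[i:i + order])
        table[context].append(notes[i + order])
    return table

def generate(table, seed, length, rng):
    """Extend `seed` to `length` notes, stopping early at an unseen context."""
    out = list(seed)
    order = len(seed)
    while len(out) < length:
        followers = table.get(tuple(out[-order:]))
        if not followers:
            break
        out.append(rng.choice(followers))
    return out

# Hypothetical toy melody, only here to exercise the functions.
melody = ["C", "D", "E", "C", "D", "G", "C", "D", "E", "C"]
table1 = train(melody, 1)  # state = last note
table2 = train(melody, 2)  # state = last two notes
phrase = generate(table2, ["C", "D"], 8, random.Random(0))
```

The number of possible contexts grows exponentially with `order`, which is the data cost mentioned above: a longer memory only helps if the training set actually contains most of those contexts.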


No, Markov chains at least cannot work because they are fundamentally finite state machines with no global state.

Say you want to generate 'verse chorus verse slightly-different-chorus', which is an idea that I've seen in basically every type of music that I've listened to. If you want to generate a slightly different version of the first chorus, you need knowledge of the first chorus, which is not possible with a Markov chain unless the state that represents the start of the second chorus is only possible to reach given that the first chorus was generated; i.e. you need to code in every second chorus possible into your Markov chain, i.e. you need to code in every first chorus possible into your Markov chain, i.e. the human's composed the piece.

The thing with computer generated music is that music is complicated; it's fundamentally not just a set of rules that you can apply and get good music. Yes, counterpoint does have many rules and suggestions that can restrict you, but they don't specify all good music.

In the same way that if you start combining logical axioms and inference rules, you generally just get random useless theorems, combining musical rules in an unstructured way is pretty much guaranteed to get you useless sequences of locally-alright notes.

The correct way of using the rules is (with logic) to start at the conjecture you want to prove and use the computer to prove the theorem correct by working backwards. With counterpoint, it's to compose the music, click the 'check for mistakes' button in Sibelius and check that you haven't made any glaring errors.


Of course one needs knowledge of the first chorus, and that is possible with a Markov chain of sufficiently large degree, or you could add all of the notes before the chorus as features during the transitions.

If you use CRFs you can condition on the whole piece and learn the model like that. Yes, you'll have to use a lot of data but models can be as global as you need them to be.

If you want to use a 'verse chorus verse slightly-different-chorus' way of composing, yes, you can use a first level of a chain to generate the probable musical sequence blocks, and generate each block separately, using at the same time features generated in each part (verse, chorus, slightly-different-chorus etc.) to keep the same feeling.

If you train your model in the way described above, you can then pick a tune in your head, put it in and ask the model to generate the most probable sequence for the whole song. Or, if you're using CRFs with Gibbs sampling, you start from the complete piece and iterate until the probability that it fits is large enough. The same could be done, somewhat easily, with Hidden Markov Models (I just realised that Markov models might not be the thing I was referring to in the post above; I was talking about the statistical variant of a Markov chain).

Convolutional neural networks could do the same thing, probably even better than CRFs and HMMs. Music isn't more complex than language and people have been using these sequence modelling methods to do extraordinary things in natural language processing.
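
A rough Python sketch of the two-level idea from this comment: one chain picks the sequence of sections, a per-section chain fills in notes, and a repeated section is a slight variation of its first occurrence. All tables and names are invented toy data. Note that the `seen` cache is memory kept outside the chains themselves, which is exactly the point under dispute in this subthread.

```python
import random

# Hypothetical toy tables: a chain over section labels and one note chain per section.
SECTION_CHAIN = {
    "start": ["verse"],
    "verse": ["chorus"],
    "chorus": ["verse", "end"],
}
NOTE_CHAINS = {
    "verse": {"C": ["D", "E"], "D": ["E", "C"], "E": ["C", "G"], "G": ["E"]},
    "chorus": {"G": ["A", "E"], "A": ["G", "C"], "E": ["G"], "C": ["A", "G"]},
}
FIRST_NOTE = {"verse": "C", "chorus": "G"}

def generate_form(rng, max_sections=8):
    """Top level: walk the section chain until it reaches 'end' or the size cap."""
    form, state = [], "start"
    while len(form) < max_sections:
        state = rng.choice(SECTION_CHAIN[state])
        if state == "end":
            break
        form.append(state)
    return form

def generate_section(label, length, rng):
    """Bottom level: walk the note chain that belongs to this section type."""
    note = FIRST_NOTE[label]
    out = [note]
    while len(out) < length:
        note = rng.choice(NOTE_CHAINS[label][note])
        out.append(note)
    return out

def vary(section, label, rng):
    """Copy a section and re-draw only its last note (may come out identical)."""
    varied = list(section)
    if len(varied) >= 2:
        varied[-1] = rng.choice(NOTE_CHAINS[label][varied[-2]])
    return varied

def compose(rng, section_length=4):
    seen = {}  # first occurrence of each section type: memory outside the chains
    piece = []
    for label in generate_form(rng):
        if label in seen:
            notes = vary(seen[label], label, rng)
        else:
            notes = generate_section(label, section_length, rng)
            seen[label] = notes
        piece.append((label, notes))
    return piece

piece = compose(random.Random(1))
```

Whether this still counts as "a Markov chain" is debatable: the form and the notes are each Markov, but the reuse of earlier sections is not.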


I honestly don't think (from your comments) that you know enough about music to make pronouncements of the sort that you are making. I'm sure that people are doing amazing stuff in natural language processing, but I'm also sure that you're underestimating the complexity of music.

Producing a program that can output quality music on demand would be largely comparable to producing a program that can output quality novels on demand. I'd be entirely unsurprised if it turned out to be an AI-complete problem; some evidence for this being that most humans with training are found to be incapable of composing quality music (where almost anyone can perform most of the tasks that have been solved by NLP researchers).


Not really, models in NLP go beyond human performance in some tasks (not tasks as trivial as part-of-speech tagging).

I have ten years of formal training in music (piano; never went to college). I assumed we weren't really talking about composing Rachmaninoff-like pieces. You seem to be aiming at genius-level compositions; that is currently unrealistic, and I was certainly not talking about that.

You're also going into the philosophy of quality. What is quality? Are you doubting the ability of a model trained on thousands of classical compositions to reproduce a fully structured classical piece that sounds good and has a few leitmotifs? It's very easy to constrain the model with a leitmotif positioned at several places and ask it to find the most probable sequence (to fill in the blanks). It's very easy to take a composition, decompose it into its constituent parts (chorus, verse, etc.), train a sequence-to-sequence model on these, and then do the same for the higher-level stuff.

I mean, I agree with you that rule based systems wouldn't work. But statistical models could, if used in music with as much fervor as they are used in tasks in NLP, absolutely produce regular compositions that don't sound like you're randomly spitting out the notes.

Or are you aiming at profound genius compositions? Or maybe super-pop songs? Then I agree, that would be an AI-complete problem, equivalent to machine translation and 300 page novel production.
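
For what "pin a motif at a few places and ask for the most probable fill" could look like, here is a small Python sketch using Viterbi-style dynamic programming. It assumes a plain first-order chain (not a CRF or HMM) and made-up transition probabilities, so it shows only the mechanics and says nothing about how musical the result would be.

```python
import math

def most_probable_fill(template, trans, states):
    """Viterbi over a first-order chain; `None` slots are free, other slots are pinned."""
    def allowed(slot):
        return states if slot is None else [slot]

    # best[s] = (log-probability, path) of the best partial sequence ending in state s
    best = {s: (0.0, [s]) for s in allowed(template[0])}
    for slot in template[1:]:
        nxt = {}
        for s in allowed(slot):
            candidates = [
                (score + math.log(trans[prev][s]), path + [s])
                for prev, (score, path) in best.items()
                if trans[prev].get(s, 0.0) > 0.0
            ]
            if candidates:
                nxt[s] = max(candidates, key=lambda c: c[0])
        best = nxt
    if not best:
        return None  # no sequence satisfies the pinned notes
    return max(best.values(), key=lambda c: c[0])[1]

# Hypothetical transition probabilities between three notes.
STATES = ["C", "E", "G"]
TRANS = {
    "C": {"C": 0.1, "E": 0.6, "G": 0.3},
    "E": {"C": 0.2, "E": 0.1, "G": 0.7},
    "G": {"C": 0.7, "E": 0.2, "G": 0.1},
}

# Pin the first and last note, leave two blanks in between.
filled = most_probable_fill(["C", None, None, "C"], TRANS, STATES)
```

With these numbers the fill is C, E, G, C (probability 0.6 × 0.7 × 0.7 = 0.294, higher than any other path). The most probable sequence is also the most predictable one, which may be part of why such output sounds like pastiche.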


> Are you doubting the ability of the model trained on thousands of classical compositions to reproduce a fully structured classical piece that sounds well and has a few leitmotifs?

Yes, I am. It turns out to be immensely difficult to do basic compositional tricks, like writing acceptably good classical counterpoint, or harmonising simple chorales.

Writing a full scale piece is a whole other level of difficulty. Writing a full scale piece that's going to be played over and over is a level or two beyond that.

David Cope's EMI is probably the state of the art:

http://artsites.ucsc.edu/faculty/cope/mp3page.htm

Listen to the Bach and Chopin. If you know anything about music you can hear that they sound like what they are: randomised cut and paste mash-ups of elaboration techniques and motifs that lack the musical narrative logic that the original composers were so good at.

Basically they're competent but mediocre pastiche, glued together out of little bits and pieces, lacking any overall form or drive.

Now - you're supposed to learn this stuff at composition school, and getting a computer to do it to this level is certainly an achievement.

But it's still some way short of being interesting and memorable music.

I don't think pop is any easier. E.g. trance and progressive house sound totally formulaic - until you try to copy them, and realise that getting something good is harder than it sounds.

So no - it's in no way a trivial problem. And a naive Markov approach is in no way a good enough answer.


I feel like a computer can come up with interesting sounds and possibly works similarly to a musician in the initial brainstorming phase of creating music. The difference is that the musician can recognize when one particular phrase is interesting and then run with that idea. Whereas the computer is not able to make the distinction and so it just goes on sounding like so much musical gibberish.


I have heard that some musicians use WolframTones for some inspiration.

http://tones.wolfram.com/generate/


Absolutely

Just transitioning from one note to the other means nothing. Transitioning from a C->E can occur in different situations, and can be "right" or "wrong"

The most pressing issue I see is that it is memoryless (a Markov process); another is the author's apparent lack of musical knowledge.


This is why I track current chord as part of the current state, so that you're more likely to generate notes in key and chord tones. The problem was then chromatic runs and leading to upcoming chord changes, so I tracked the beat, hoping to get diatonic notes on strong beats and push the leading notes to weak beats.

That makes things quite a bit better, but it's still a memoryless process, which means any apparent motifs are only generated if you happen to walk a similar path through the STM.

There are a lot more bits of state that you can track in the analysis to fix things up, but at some point you need to build up a memory of what you've generated and use that to modify the STM on the fly.
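
A minimal Python sketch of the state described in this comment, where the table is keyed on (chord, beat, pitch) instead of pitch alone. The tiny "transcription" is invented, and "STM" is assumed here to mean a state transition table; this is not the original Lisp code.

```python
import random
from collections import defaultdict

# Hypothetical transcription: one (chord, beat_in_bar, pitch) tuple per note.
SOLO = [
    ("C7", 1, "C"), ("C7", 2, "E"), ("C7", 3, "G"), ("C7", 4, "Bb"),
    ("F7", 1, "A"), ("F7", 2, "C"), ("F7", 3, "Eb"), ("F7", 4, "B"),
    ("C7", 1, "C"), ("C7", 2, "D"), ("C7", 3, "E"), ("C7", 4, "Bb"),
    ("F7", 1, "A"),
]

def build_stm(events):
    """State transition table: (chord, beat, pitch) -> pitches seen on the next note."""
    stm = defaultdict(list)
    for (chord, beat, pitch), (_, _, next_pitch) in zip(events, events[1:]):
        stm[(chord, beat, pitch)].append(next_pitch)
    return stm

def solo_over(stm, changes, first_pitch, rng):
    """Walk the table along a fixed list of (chord, beat) slots, one pitch per slot."""
    pitch = first_pitch
    out = [pitch]
    for chord, beat in changes[:-1]:
        options = stm.get((chord, beat, pitch))
        if not options:
            break  # this (chord, beat, pitch) state never occurred in the analysis
        pitch = rng.choice(options)
        out.append(pitch)
    return out

stm = build_stm(SOLO)
changes = [("C7", 1), ("C7", 2), ("C7", 3), ("C7", 4), ("F7", 1)]
line = solo_over(stm, changes, "C", random.Random(0))
```

Nothing in `solo_over` looks at `out` beyond the current pitch, so a motif recurs only when the walk happens to pass through the same states again.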


True. Your "Markov Blue Train" sounds better, even though it's clearly memoryless, as you state.


Algorithmic composition is a decently-sized field. Here's a textbook:

http://www.amazon.com/Algorithmic-Composition-Paradigms-Auto...

And some more examples:

http://algorithmiccomposition.org/


Will it be possible to write a filter on top of this to eliminate invalid/dissonant outputs based on music theory -- e.g. key signature, beat, harmonics etc?

Edit: Try listening to only the first bar of each sample at the bottom of the page. It almost sounds like the beginning of a proper composition. The only problem is that subsequent bars don't match. Fix that and I think there's something great here.
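
On the filter question, a key-signature check is the easy part. A minimal Python sketch, assuming the generator emits MIDI note numbers and the key is a major key given by its tonic pitch class (both are assumptions, as are the function names):

```python
# Pitch classes: C=0, C#=1, ..., B=11. MIDI note 60 is middle C.
MAJOR_SCALE_STEPS = (0, 2, 4, 5, 7, 9, 11)

def scale_pitch_classes(tonic):
    """Pitch classes of the major scale starting on `tonic`."""
    return {(tonic + step) % 12 for step in MAJOR_SCALE_STEPS}

def in_key(midi_note, tonic):
    return midi_note % 12 in scale_pitch_classes(tonic)

def snap_to_key(midi_note, tonic):
    """Move an out-of-key note to the nearest in-key pitch, trying the lower one first."""
    if in_key(midi_note, tonic):
        return midi_note
    for distance in range(1, 12):
        for candidate in (midi_note - distance, midi_note + distance):
            if in_key(candidate, tonic):
                return candidate
    return midi_note

def filter_to_key(notes, tonic):
    return [snap_to_key(n, tonic) for n in notes]

# C major: C#4 -> C4, F#4 -> F4, Bb4 -> A4; in-key notes are untouched.
cleaned = filter_to_key([60, 61, 62, 66, 70], tonic=0)
```

This covers only the key signature. Beat placement and harmony depend on context (the same interval can be right in one spot and wrong in another), so a per-note filter like this can't catch those.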


Does someone know of a source of MIDI files that could be used for training music composition algorithms (without violating copyrights)?




Cool, it reminds me of one of my very old projects: http://robhub.github.io/melogen/@auto.swf (sorry, Flash, I know...). However, it would require tons more work to make something interesting.


It's a shame that DarwinTunes didn't become more popular.

http://darwintunes.org/

I'd like to see something that combines user-guided evolution and Markov chains.


Comment to the maintainer of that page: Don't put dark blue on black background. It's very hard to see and read.


Dark blue on black? Either something has changed over the last 2 hours, or there's something wrong with your display, because what I'm seeing is light grey on dark grey and it's quite readable.


The main content for me is white (light grey) text on a black background. However, the code samples contain blue on black, and so does some content in the left sidebar.


What about rhythm?


Nothing really new here; I've seen multiple blog posts about how to do exactly that thing using exactly the same methods. But everything I've seen so far is pretty primitive, as it constructs a sequence of tones while completely disregarding the rhythmic part, and I'm not sure that treating them as separate things is actually a good idea. At the least, it deserves another blog post exploring whether there should be some relationship between tone and rhythm in order to produce good results.


Can't tell if parent is being sly or cautiously serious. Yes: Note envelope (attack/sustain/decay) and other attributes could and should be included in chained phrases.



