
Making Music: When Simple Probabilities Outperform Deep Learning - prostoalex
https://towardsdatascience.com/making-music-when-simple-probabilities-outperform-deep-learning-75f4ee1b8e69
======
fenomas
Having worked on procedural music for some time, I really can't see how simple
Markov-style approaches like this are likely to have interesting results. You
can certainly create sequences of notes that have statistical similarities to
the notes in a training set, but we don't hear statistical similarities, we
hear phrases and refrains and call-and-responses and whatnot - higher-order
structures that Markov models seem really ill-suited to.

It strikes me as analogous to generating text from an ML model trained on a
bunch of short stories. A short snippet of the results may sound convincing,
but obviously a whole story generated that way would be gibberish. I think the
same sort of thing is happening in the example songs - any given chord change
or three-note phrase may sound musical, but over the course of several
measures it's not distinguishable from random notes.
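
(Not the article's code - just a hypothetical minimal sketch.) A first-order,
note-level Markov generator looks roughly like this; it reproduces local
transition statistics and nothing longer-range:

```python
import random
from collections import defaultdict

def train_markov(notes):
    """Count first-order transitions between consecutive notes."""
    transitions = defaultdict(list)
    for a, b in zip(notes, notes[1:]):
        transitions[a].append(b)
    return transitions

def generate(transitions, start, length):
    """Walk the transition table; each step depends only on the previous note."""
    out = [start]
    for _ in range(length - 1):
        candidates = transitions.get(out[-1])
        if not candidates:
            break  # dead end: no observed successor for this note
        out.append(random.choice(candidates))
    return out

melody = ["C4", "E4", "G4", "E4", "C4", "E4", "G4", "C5"]
table = train_markov(melody)
print(generate(table, "C4", 8))
```

Every generated note is locally plausible, but there is no mechanism for a
phrase stated in bar 1 to return in bar 5 - exactly the higher-order structure
missing here.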

~~~
coldtea
> _You can certainly create sequences of notes that have statistical
> similarities to the notes in a training set, but we don't hear statistical
> similarities, we hear phrases and refrains and call-and-responses and
> whatnot - higher-order structures that Markov models seem really ill-suited
> to._

You can have Markov models at various levels of granularity...

~~~
TheOtherHobbes
You can, but OP didn't.

And you need to understand what you're Markovising - which apparently OP
didn't either.

~~~
AstralStorm
OP built a whole-song self-similarity matrix and regularised on it, which
forces the model to adopt this kind of high-level structure.
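
For reference, the bare idea of a self-similarity matrix over a note sequence
(a minimal sketch using exact note equality, not the article's actual
feature-based version):

```python
def self_similarity(notes):
    """S[i][j] = 1 where note i equals note j; a repeated phrase
    shows up as a diagonal stripe off the main diagonal."""
    n = len(notes)
    return [[1 if notes[i] == notes[j] else 0 for j in range(n)]
            for i in range(n)]

# The opening three-note phrase returns at position 5, so the matrix
# contains an off-diagonal stripe starting at (0, 5).
song = ["C", "E", "G", "F", "A", "C", "E", "G"]
S = self_similarity(song)
for row in S:
    print(row)
```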

~~~
TheOtherHobbes
Neither transition tables nor naive self-similarity searches are an effective
way to model this domain.

------
Hroble
20-30 years ago, groups/artists like Autechre were already making generative
music much more musical than this, sometimes without even using computers. The
metrics in this article seem wrong, but even without any metrics, the clips
sound like something a first-year computer-music student could throw together
over a weekend in Max/MSP. The result is bad from a musical perspective, and
the method is uninteresting from a data-science/machine-learning perspective.

------
vfinn
Two problems come to mind when writing a good music generator:

1) What's the metric? Can we have a metric that's not based on human judgment?
Imho the best idea is a parametrized generator that creates music within
manually set bounds; that way we have a goal to compare against (we need to
vary the music enough that it's new). You set the length, the structure (draw
the ups and downs of your music), maybe even which chords to use, etc.
(however many parameters you find necessary), and then set how much variety
you're looking for against your trained net.

2) Good music is often a structure within a structure within a structure; it
has internal links several bars apart that are out of reach for simple Markov
chains. There are patterns and progressions on every level of music, and a
slight change to a pattern can make a big difference. Every abstraction level
has its own logic (previous chord against the next chord, previous melody
against the next melody, previous ornament against the next ornament), and if
you omit one layer, the whole tower of music won't be in balance.
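
A toy sketch of what such a parametrized, bounded generator could look like
(all names and parameters here are hypothetical illustrations, not an existing
system):

```python
import random

def generate_with_bounds(chords, notes_per_bar, contour, seed=None):
    """Toy parametrized generator: each bar draws notes from its assigned
    chord, and `contour` (+1 / 0 / -1 per bar) nudges the register up or
    down - the 'ups and downs' drawn by the user."""
    rng = random.Random(seed)
    octave = 4
    melody = []
    for chord, direction in zip(chords, contour):
        octave = max(3, min(5, octave + direction))  # keep register in bounds
        for _ in range(notes_per_bar):
            melody.append(f"{rng.choice(chord)}{octave}")
    return melody

bars = generate_with_bounds(
    chords=[["C", "E", "G"], ["F", "A", "C"], ["G", "B", "D"], ["C", "E", "G"]],
    notes_per_bar=4,
    contour=[0, 1, 1, -1],
    seed=42,
)
print(bars)
```

Because the bounds are explicit, "how far the output strays from them" becomes
a measurable target, which is the kind of non-subjective goal the comment asks
for.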

------
rubatuga
So, the guy just put randomly generated notes over a chord progression...?
And then created his own metric stating that it sounds better?

~~~
haebichan
It's a little more complex than that. The metric I've created builds on an
existing literature of self-similarity matrices applied to music, though at
this point I suspect that line of work has died off due to the rise of GANs
for signal extraction. If you have any suggestions of your own on how to
objectively compare generated elements, I would appreciate the opportunity.

------
stevehiehn
I've been attending a lot of EDM festivals lately. I can't help but think
popular music has become much more timbre-based as opposed to harmony-based,
due to complex human subjectivity. It's much harder to model timbre against
harmony than melody against harmony.

~~~
te_chris
It's not due to complex subjectivity. It's due to the medium, i.e. DAWs, whose
UIs discourage complex temporal and harmonic arrangement in favour of
encouraging massive sound design.

~~~
mojuba
That's an interesting point, but I don't think it explains why music as a
whole is going where it's going: from melodies and harmonies towards a more
general art of sound. After all, DAWs are created to accommodate the need, not
the other way around. One reason it may be happening is that harmonies are
pretty much exhausted, and most music in this sense is just a reiteration of
the past. Apart from some innovation in jazz and progressive rock, both of
which are stagnant (i.e. practically dead), nothing else is new in terms of
melodies and harmonies.

~~~
jcelerier
> After all DAWs are created to accommodate the need, not the other way
> around.

DAWs _were_ created for that purpose. But for a lot of artists nowadays, their
whole window into the music-making world is their DAW, and as such the roles
have somewhat reversed, with the DAW serving as inspiration for the work and
providing its limits.

I remember reading a paper a few years ago showing that even the color scheme
(skin) used for your DAW had an effect on the music you produced. Humans are
fairly easily influenced.

~~~
acceit
> I remember reading a paper a few years ago showing that even the color
> scheme (skin) used for your DAW had an effect on the music you produced.

I'd love a link for this.

~~~
jcelerier
I can't find exactly the one I'm thinking of, but here are a few on this
topic:

\- a whole PhD on exactly the subject of this debate:
[https://yorkspace.library.yorku.ca/xmlui/bitstream/handle/10...](https://yorkspace.library.yorku.ca/xmlui/bitstream/handle/10315/34478/Macchiusi_Ian_A_2017_PhD.pdf?sequence=2&isAllowed=y)

\- [http://www.arpjournal.com/asarpwp/experiencing-musical-compo...](http://www.arpjournal.com/asarpwp/experiencing-musical-composition-in-the-daw-the-software-interface-as-mediator-of-the-musical-idea-2/)

\- [https://www.researchgate.net/profile/Josh_Mycroft/publicatio...](https://www.researchgate.net/profile/Josh_Mycroft/publication/308078381_The_Influence_of_Graphical_User_Interface_Design_on_Critical_Listening_Skills/links/57d9492a08ae0c0081efaaff/The-Influence-of-Graphical-User-Interface-Design-on-Critical-Listening-Skills.pdf)

------
ToJans
"all models are wrong, but some are useful" George E.P. Box.

I like the way he presented his explanation, and his approach to finding a
solution.

However, listening to some of the songs, I would say that he needs to find a
better model - or, as Mr. Burns would say: "Smithers, continue the research!"

------
onurcel
It's unfortunate that the author didn't feel much need for humility. He worked
on some pretty cool stuff; it would have been great to see it presented
differently.

~~~
WalterSear
IMHO, the author doesn't understand the subject domain deeply enough to
address it.

~~~
haebichan
Darn, I was hoping my 20 years of musical experience would give me enough
domain knowledge. If there are specific gaps in my understanding of music
theory, please let me know; I'd love to learn.

~~~
WalterSear
I'm sorry to have offended, but I stand by my statement. IMHO, you've
conflated the guidelines of musical grammar with the rules of musical
composition.

------
haebichan
Hey guys! The original author of the article here. I didn't know my article
was generating so much attention here. There are great compliments,
criticisms, and external points; I don't have enough hands to address them all
individually. But if you have specific questions, please send them my way. I
would love to further the discussion in hopes of expanding the frontiers of my
project! Thank you --Haebichan.

~~~
stevehiehn
I really respected & enjoyed your post. I've been experimenting with similar
ideas, but in my case I've used multiple RNNs & CNNs trained for different
aspects of the music: for example, one NN trained just to identify
compatibility between bass lines and rhythms, one NN to generate chord
progressions, one RNN to generate melodies on top of the chords, etc. Check it
out:

[http://treblemaker.ai/](http://treblemaker.ai/)

[https://github.com/shiehn/TrebleMakerDocker](https://github.com/shiehn/TrebleMakerDocker)

------
aub3bhat
Without any human assessment or blind testing, claiming one method outperforms
another using an ex post facto unsupervised metric is not scientific.

------
ArtWomb
If anyone is interested in taking these explorations further, CrowdAI has a
long-running AI music generation challenge. The input to the black box you
design is a single MIDI sample, and the output is a generated MIDI file that
extends and "riffs" on that initial piece. The win condition is based simply
on which piece sounds better to the judges ;)

[https://www.crowdai.org/challenges/ai-generated-music-challe...](https://www.crowdai.org/challenges/ai-generated-music-challenge)

Also check out the recent work from Google's Magenta team. Their MusicVAE
seeks to model not just the instrument but the expressiveness of the musician,
with "style" emerging from the MIDI representation alone.

Latent Loops Beta

[https://teampieshop.github.io/latent-loops/](https://teampieshop.github.io/latent-loops/)

------
ecocentrik
The problem with algorithmic music creation is that compositional models are
always going to be limited to baked-in technical parameters. If you're working
from analysis, your analyzer needs to be able to recognize a compositional
technique before it can return any useful data about it, which means it needs
to recognize a wide range of techniques, even for pop music.

Is this project the result of a high school computer music assignment?

------
adamnemecek
I’ve been exploring some of these ideas as well. I’m almost done with my
product, sign up at [http://ngrid.io](http://ngrid.io) if you want to be
notified when it’s ready.

~~~
haebichan
would love to learn more about how your model works. Would appreciate the
explanation!

------
raverbashing
Yeah, trying to make an RNN learn simple (but "complicated") priors is an
exercise in how not to do deep learning.

Just think how big a model would have to be to understand self-similarity
between two different parts of a song.

~~~
AstralStorm
Unfortunately, a bigger, more complex RNN learns the songs by heart and
doesn't generalise. Beta-VAE units seem essential, and they're not doing well
either.

------
ape4
Link for the lazy [http://popmusicmaker.com/](http://popmusicmaker.com/)

~~~
shawn
Aaand it's dead. Drat.

(500 internal server error when clicking on Generate Pop Music.)

~~~
haebichan
Yeah, the machine is on a t2.large EC2 instance; it just got too many server
requests, I think. Some people haven't been following directions either, like
submitting empty requests for lyric generation, and for some reason that
doesn't reflect the programming, it's crashing my EC2.

~~~
shawn
Oh! I didn't realize I needed to specify anything before clicking "Generate".
That could have been why.

~~~
haebichan
No problem. And have fun! It was super fun to make this project.

