
Neural Translation of Musical Style - pshaw
http://imanmalik.com/cs/2017/06/05/neural-style.html
======
the_cat_kittles
im glad to see someone talking about dynamics rather than pitch for once. its
really what breathes life into music anyway. the results are impressive. one
thing i worry about is that midi patches are often very sloppy about subtle
changes to velocity- sometimes 60-70 sound the same, and then 71 is way
louder, or the tone is much different. that means that even though you have a
nice, dynamic range of velocities, it can get aliased in a way that robs it of
some of that. not sure what kind of midi patches you are using for the sample
generation. if you use a piano simulator, or a really high quality patch
library you are probably ok.
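
A rough sketch of that aliasing worry, assuming a hypothetical patch that only switches samples at a few velocity thresholds and does nothing in between (the thresholds in `LAYER_STARTS` are made up for illustration, not taken from any real patch):

```python
from bisect import bisect_right

# Hypothetical patch: four recorded sample layers, switching at these velocities.
LAYER_STARTS = [41, 71, 101]

def sample_layer(velocity):
    """Index of the sample layer this imagined patch triggers for a velocity."""
    return bisect_right(LAYER_STARTS, velocity)

def distinct_sounds(velocities):
    """How many different layers a set of velocities actually reaches."""
    return len({sample_layer(v) for v in velocities})

# A phrase that swells gently from 60 to 71.
phrase = [60, 62, 64, 66, 68, 70, 71]
layers = [sample_layer(v) for v in phrase]
```

With these made-up thresholds, everything from 60 to 70 lands on the same layer and 71 jumps to the next one, so a carefully shaped swell would collapse into a single step.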

the other thing i would note is that there is not a right or wrong way of
coloring a piece with dynamics. its a huge area for artistic choice. there
doesn't seem to be a great nomenclature for really detailed dynamics in the
western music tradition, so it hasn't been able to be recorded on written
scores, but i think it can be as much (or more!) part of the composition as
the pitches are. a solo snare drum piece would be a good example of this.
whats cool about what you did is that it gets away from monotonous dynamics
that are the default when you write stuff on a daw by hand.

my own personal approach to this problem has been to treat dynamics as
compositional elements independent from pitches or rhythms, and realize this
through code. its a pain in the ass to explore dynamic ideas when you have to
change all the note velocities by hand, like you said, so i decouple the
dynamic curves / sequences / ??? and map them on to different regions of
pitches and rhythms. its nice to add a breathing quality to a line by adding
some kind of cyclical dynamic, then also being able to add a crescendo on top
of that, or being able to decrescendo while still maintaining some kind of
pattern of accents or dynamic phrasing. and then being able to change all that
quickly, instead of by hand.
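
One way this kind of decoupling could look, as a minimal sketch only. The helper names (`cyclic`, `ramp`, `accents`, `apply_dynamics`) and the note representation are assumptions for illustration, not the commenter's actual code:

```python
import math

def cyclic(depth, period):
    """Breathing pattern: velocity offset that rises and falls every `period` notes."""
    return lambda i, n: depth * math.sin(2 * math.pi * i / period)

def ramp(start, end):
    """Crescendo (start < end) or decrescendo (start > end) across the region."""
    return lambda i, n: start + (end - start) * (i / (n - 1) if n > 1 else 0.0)

def accents(pattern, boost):
    """Repeating accent pattern, e.g. [1, 0, 0, 0] stresses every fourth note."""
    return lambda i, n: boost * pattern[i % len(pattern)]

def apply_dynamics(notes, curves):
    """Sum the curves at each note index and clamp to the MIDI velocity range 1..127."""
    n = len(notes)
    out = []
    for i, (pitch, start_beat) in enumerate(notes):
        velocity = round(sum(curve(i, n) for curve in curves))
        out.append((pitch, start_beat, max(1, min(127, velocity))))
    return out

# A one-octave C major scale in eighth notes: (MIDI pitch, start time in beats).
line = [(p, 0.5 * i) for i, p in enumerate([60, 62, 64, 65, 67, 69, 71, 72])]

# Crescendo from 50 to 90, with a breathing cycle and an accent every fourth note on top.
shaped = apply_dynamics(line, [ramp(50, 90), cyclic(8, 4), accents([1, 0, 0, 0], 12)])
```

Because the curves only see a note index and the region length, swapping `ramp(50, 90)` for `ramp(90, 50)` turns the crescendo into a decrescendo while the accent pattern stays put, and no pitch or rhythm has to be touched.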

~~~
fiatjaf
> im glad to see someone talking about dynamics rather than pitch for once.
> its really what breathes life into music anyway.

So half of Bach's music was dead when he wrote it?

~~~
the_cat_kittles
what do you think makes glenn goulds recordings so great?

------
TheOtherHobbes
For comparison, three other human performances of the same Chopin piece:

https://www.youtube.com/watch?v=V7SvQzkZmuM

https://www.youtube.com/watch?v=GOe670xcKhk

https://www.youtube.com/watch?v=fRqynzR_8Ts

Both of the performances in the demo do a mediocre job with the shapes in the
music, including the phrasing and dynamics.

I suspect more people would be able to hear a clear difference if a more
representative human performance was used for the comparison.

As is often the case with ML in music, the bar is higher than it seems to be.

~~~
justifier
as i understand it the human piece was chosen from the available dataset

your comment seems to imply intentional misrepresentations

the thing about recordings of performances of music from these periods and
composers is that the music is public domain but the performer can copyright
the performance

if the human performance midi recordings dataset used in the thesis was
legally able to also include the performances by Valentina Lisitsa, Pollini,
and Horowitz i am unable to see how the net would fail to make use of their
contribution

also for the best results those performers would need to be involved in the
production of those midi files because they carry with them a lot of
subjective meta information

i commend the human performers in the available midi files for their effort
both in their expression of the piece as well as their desire to make music
accessible in a verbose spec'd digital data standard

regardless i feel the really impressive part of the thesis is taking the droned
midi and altering it to sound like the human midi.. which i believe is the
point more so than the lcd wow! effect of an author-defined 'musical turing
test'

i mean, really.. access to: the thesis, a full blog write up, repo containing
code and a jupyter notebook, and the dataset used;

this work was excellent and the write up phenomenal

------
kevincennis
This is really cool, but the difference between A and B was _immediately_
obvious to me.

I couldn't find it – did OP say whether or not respondents were musicians?

~~~
ronack
Yes, I'd say the most obvious tell was the strict quantization. If they can
transfer some of the timing information it might be more convincing.

~~~
macawfish
I thought the most obvious giveaway was that the melody was not articulated at
all by the robot. Aside from that, the dynamics in the robot version were
completely absent. I actually didn't think it was even a very good song until
I listened to the human playing it, then I kinda liked it.

------
olegkikin
Please compress your .wav files to mp3/mp4/ogg/whatever. And please only load
the music when I want to play it. The size of that page is ridiculous.

~~~
imalikshake
Just (losslessly) converted the wavs into mp3s. Hope it loads faster now.

~~~
anamexis
Slightly pedantic, but there is no such thing as lossless mp3 compression.

~~~
yorwba
Also pedantic, but lossless mp3 compression is possible; it just requires the
decompressed signal to be identical to the original.

~~~
anamexis
More just having fun now, but I feel lossless compression implies
deterministic decompression, i.e. there is one and only one signal which
compresses to a given compressed signal.

Even if you had a signal which compressed to itself, it seems to me that there
would likely be other possible signals which would compress to an identical
compressed signal.
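
A small sketch of the distinction being argued here. The toy quantiser below is a stand-in for a lossy codec and is purely an assumption for illustration; it is not how mp3 works. It only shows two different signals collapsing to the same encoded form, next to a `zlib` round trip that returns the input exactly:

```python
import zlib

def lossy_encode(samples, step=16):
    """Toy lossy codec: keep only which `step`-wide bucket each sample falls in."""
    return [s // step for s in samples]

def lossy_decode(codes, step=16):
    """Reconstruct each sample as the bottom of its bucket."""
    return [c * step for c in codes]

signal_a = [0, 37, 125, 255]
signal_b = [3, 35, 120, 250]

# Two different inputs, one encoded form: the encoder is not invertible.
same_code = lossy_encode(signal_a) == lossy_encode(signal_b)

# A lossless codec has to give back exactly the bytes it was handed.
raw = bytes(signal_a)
round_trip_ok = zlib.decompress(zlib.compress(raw)) == raw
```

Even if some particular signal happened to survive the toy codec unchanged, the decoder alone could not tell it apart from its neighbours in the same buckets.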

~~~
yorwba
Good point, even if compressing and then decompressing does not change
anything, you need to _know_ that nothing changed, otherwise you lose
information about the compression error.

------
gabrielgoh
Could you provide a baseline case of a completely random musical style? I know
it will sound terrible, but I may be too much of a musical pleb to even notice
the difference.

~~~
imalikshake
That's a really good idea! I've produced mp3s with randomised velocities for
the Chopin track and also the Yiruma track. I'll be adding them to the blog
post. Thanks!

Chopin:
http://imanmalik.com/assets/audio/random_chpn.mp3

Yiruma:
http://imanmalik.com/assets/audio/random_y.mp3
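
For anyone wanting to try a similar baseline, a minimal sketch of randomising velocities while leaving pitches and timing alone. The note tuples, the uniform 1..127 range, and the `randomise_velocities` name are all assumptions for illustration; the thread doesn't say how the tracks above were actually generated:

```python
import random

def randomise_velocities(notes, low=1, high=127, seed=0):
    """Replace each note's velocity with a uniform random value; keep everything else."""
    rng = random.Random(seed)  # seeded so the baseline is reproducible
    return [
        (pitch, start, duration, rng.randint(low, high))
        for pitch, start, duration, _ in notes
    ]

# (MIDI pitch, start in beats, duration in beats, velocity), all at a flat 64.
score = [(60, 0.0, 1.0, 64), (64, 1.0, 1.0, 64), (67, 2.0, 2.0, 64)]
baseline = randomise_velocities(score)
```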

~~~
imalikshake
I've added the randomised tracks to the blog :)

------
marmaduke
I know the focus is on neural networks and music, but there's an underrated
art of data curation and preprocessing of which this is an interesting
example.

------
fasteo
Really really cool, but anyone having some classical music background - my
grandpa was a professional cellist, I did cello for 8 years - can _easily_
guess who the bot is in the A, B test

------
Matumio
> So the StyleNet model has successfully passed the Turing Test and can
> generate performances that are indistinguishable from that of a human.

Not quite. There is only a ~2% chance that those ~72 people who made a choice
got such a good result (62% correct) by random guessing. Still an impressive
result.
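
The ~2% figure can be checked with a one-sided binomial tail. This is a sketch under the assumption that "62% of ~72" means 45 correct answers out of 72; the exact counts aren't given in the thread:

```python
from math import comb

def binomial_tail(n, k, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

n_listeners = 72  # assumed from "~72 people"
n_correct = 45    # assumed: 62% of 72, rounded

# Chance of 45 or more correct answers if everyone were flipping a coin.
p_value = binomial_tail(n_listeners, n_correct)
```

With those assumed counts the tail probability comes out at around 2%, in line with the estimate above.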

------
yooooooooooo
This is some cool stuff. >:(

