
Neural Style Transfer for Musical Melodies - lerch
https://ashispati.github.io//style-transfer/
======
shams93
The thing is, there is no narrative or storytelling element to image style
transfer. Think of Beethoven's 9th symphony: it develops a narrative, it's not
entirely abstract; the theme of the Song of Brotherhood gets subtly developed
through the course of the composition. This is storytelling. This is also why
it's much easier to use ML to generate credible poetry. Generating a real
novel with ML faces the same challenges as generating music, since even
instrumental music has a narrative and a story it's telling. Storytelling is a
far more difficult problem than style transfer.

~~~
Nydhal
In one word: locality. Images have it narratives do not.

~~~
Florin_Andrei
In another one word: language. Music has some elements of language that images
don't have.

~~~
kahoon
Interestingly, though, the failures of image generation are often of the
missing-narrative kind: e.g. windowless bedrooms, eyes/ears in the wrong
places on animals. It feels like deep learning is good at local coherence but
fails at global coherence.

------
romaniv
_> Why do we need Style Transfer for Music?_

We don't. Most of the applications of ANNs to music that I've seen so far are
solving problems that either don't exist or have been very successfully solved
by much simpler methods with much better results.

If you want to apply AI to something useful in music, how about:

\- Machine learning for reverb generation. I.e. being able to "record" reverb
from real space and apply it to sound.

\- AI/ML for high-quality sample transposition.

\- AI for pure sound synthesis. Imagine something that does PCA on a set of
sounds and then allows you to generate new sounds by changing components with
a bunch of knobs.

\- AI for real-time sample morphing. For example, being able to synthesize a
choir that sings the words you need. Or, how about transforming your own voice
into quality choir vocals? Hey, style transfer!

\- AI for high-quality music transcription (sound to notes or MIDI). This is
being done to some extent, but I don't think it's very good yet.

And so on.
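The PCA-synthesis idea above can be sketched in a handful of lines. A toy
version (all function names hypothetical; it assumes the sounds have already
been cut into fixed-length analysis frames, e.g. magnitude spectra):

```python
import numpy as np

def fit_pca_synth(frames, n_components=4):
    """Fit a toy PCA 'synth' over fixed-length analysis frames
    (e.g. magnitude spectra). Returns the mean frame and the top
    principal components, found via SVD."""
    mean = frames.mean(axis=0)
    _, _, vt = np.linalg.svd(frames - mean, full_matrices=False)
    return mean, vt[:n_components]

def turn_knobs(mean, components, knobs):
    """Generate a new frame by weighting each component - the
    'knobs' of the imagined synthesizer."""
    return mean + np.asarray(knobs) @ components

# Toy demo on random "spectra" standing in for real analysis frames.
rng = np.random.default_rng(0)
frames = rng.normal(size=(64, 32))          # 64 frames, 32 bins each
mean, comps = fit_pca_synth(frames, n_components=3)
new_frame = turn_knobs(mean, comps, [1.0, -0.5, 0.25])
```

A real version would resynthesize audio from the generated frames (e.g. via
an inverse STFT), but the knob-turning part really is this simple.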

I can assure you, every single one of these would be something musicians would
be willing to pay good money for.

~~~
adelrune
Recording reverb from real space is as simple as recording an impulse (firing
a gun, or anything that makes a short, high-intensity noise) in a room and
then convolving that impulse response with the audio you want to add reverb
to.

The only limit is how close to a perfect impulse you can generate.
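A minimal sketch of that convolution step (NumPy only, function name
hypothetical); a real convolution reverb would add partitioned convolution
for low latency, but the core is just multiplication in the frequency domain:

```python
import numpy as np

def convolution_reverb(dry, impulse_response, wet=0.5):
    """Apply a recorded room impulse response to a dry signal.

    Convolution is done via FFT (multiplication in the frequency
    domain), which is equivalent to time-domain convolution but far
    faster for long impulse responses."""
    n = len(dry) + len(impulse_response) - 1
    size = 1 << (n - 1).bit_length()  # next power of two for the FFT
    spectrum = np.fft.rfft(dry, size) * np.fft.rfft(impulse_response, size)
    wet_sig = np.fft.irfft(spectrum, size)[:n]
    # Mix the dry signal back in (output is padded out to the reverb tail).
    out = wet * wet_sig
    out[:len(dry)] += (1.0 - wet) * dry
    return out

# A click through a toy 3-tap "room": the output is just the IR itself.
click = np.zeros(100); click[0] = 1.0
ir = np.array([1.0, 0.5, 0.25])
tail = convolution_reverb(click, ir, wet=1.0)
```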

~~~
romaniv
Well, yes, that's how convolution reverbs work. Except I've never seen
anything that would allow users to record a reverb with conventional hardware
and without any elaborate setups.

Also, there is more to reverb than just impulse response. A more sophisticated
reverb simulator would map out the space and allow you to choose where the
"listener" and the source are located in that environment.

The nearby comment about moving obstructions is on-point as well.

There is ample space for application of machine learning here.

------
enkiv2
We're quite lucky that neural nets are overkill for procedural music
generation.

By that I mean: we have huge masses of music theory, applied across genres and
focused on the differences between genres, in the form of heuristics for both
analysis and composition. Because of a tradition of procedural generation of
music that goes back a couple of centuries, a lot of it is fairly easy to
translate into computer programs. (For instance, end-to-end serialist
composition is easier for computers than it is for people, while canons and
other mechanisms for creating permutations of melodies are equally
straightforward.)
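The serialist transformations mentioned above - transposition, inversion,
retrograde of a twelve-tone row - really are trivially mechanical (the row
below is arbitrary, not taken from any real piece):

```python
def transpose(row, n):
    """Shift every pitch class up by n semitones (mod 12)."""
    return [(p + n) % 12 for p in row]

def invert(row):
    """Mirror each interval around the first pitch class."""
    first = row[0]
    return [(2 * first - p) % 12 for p in row]

def retrograde(row):
    """Play the row backwards."""
    return list(reversed(row))

# An arbitrary twelve-tone row (every pitch class exactly once).
row = [0, 11, 3, 4, 8, 7, 9, 5, 6, 1, 2, 10]
forms = {
    "P5": transpose(row, 5),        # prime form, transposed up a fourth
    "I":  invert(row),              # inversion
    "R":  retrograde(row),          # retrograde
    "RI": retrograde(invert(row)),  # retrograde inversion
}
```

These four families of forms are the entire "search space" of classical
twelve-tone technique, which is why computers find it so comfortable.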

This doesn't translate into a straightforward method for putting in two wav
files and producing a third with transferred style, but it does mean that a
sufficiently motivated person can write something that translates notes
between two known genres with greater ease than they would with images.

(Text is somewhere in the middle. I've worked on a couple attempts at 'style
transfer' for text -- mostly using word2vec.)

~~~
dontreact
I think you’re overestimating the power of music theory to serve as a basis
for generation or modification of music, and also the scope of music that it
explains. Rhythm has been central over the past 100 years, since Western pop
music has a lot of its roots in American blues, yet there is surprisingly
little that music theory has to say about rhythm or groove.

There are a lot of regularities and patterns in harmony and harmonic sequences
which music theory covers, but there is also a combinatorial explosion of
melodies that can be justified by music theory in a particular harmonic
context. The choice of which melodic path to go down is very poorly
constrained by music theory.

~~~
romaniv
Generating rhythms is a problem that has been solved by arpeggiators, step
sequencers, analog modular rigs and more sophisticated tools like KARMA. You
don't need machine learning for it.

~~~
dontreact
This is a greatly oversimplified view of rhythm. There are many rhythms and
grooves that do not lock in with "the grid". These tools will most likely not
produce a natural pattern of velocities that sounds appropriate for the
generated rhythm. Step sequencers are a tool for inputting rhythm, not for
generating it. Arpeggiators typically have a consistent rhythm (hitting on
every step of some subdivision).

~~~
romaniv
Modern step sequencers (for example, Elektron boxes) are way more than just a
grid. They have microtiming, parametrized triggers, parameter sliding,
probabilistic and conditional triggers, and are capable of running multiple
patterns of varying lengths that reset at different rates.
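As a sketch of what "more than a grid" means: a probabilistic trigger is just
a per-step chance check, and microtiming a per-step offset. A toy model (not
any Elektron API):

```python
import random

def run_pattern(pattern, passes=2, seed=None):
    """Toy probabilistic step sequencer.

    pattern: list of steps; each step is None (a rest) or a dict with
    a note, a trigger probability, and a microtiming offset in ms.
    Returns (step_index, note, microtiming) events over `passes` loops.
    """
    rng = random.Random(seed)
    events = []
    for i in range(passes * len(pattern)):
        step = pattern[i % len(pattern)]
        if step is None:
            continue
        if rng.random() < step["prob"]:      # probabilistic trigger
            events.append((i, step["note"], step["micro_ms"]))
    return events

pattern = [
    {"note": 36, "prob": 1.0, "micro_ms": 0},   # kick: always fires
    None,
    {"note": 38, "prob": 0.5, "micro_ms": -7},  # snare: fires half the time
    None,
]
events = run_pattern(pattern, passes=4, seed=1)
```

Running several patterns of different lengths against each other (polymeter)
is just this loop with a different modulus per track.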

------
eivarv
There are some fun examples of this sort of stuff on
[http://dadabots.com/](http://dadabots.com/) , which includes attempts to
synthesize music in the style of The Beatles, Meshuggah and The Dillinger
Escape Plan.

------
thijsvandien
If you like the idea of taking a song and changing its genre, check out
Postmodern Jukebox. They’re pretty brilliant.

~~~
bluetwo
Or Richard Cheese. Or Nouvelle Vague.

All lots of fun.

------
chestervonwinch
I find this surprising, given the simplistic (and probably naive) view that
images are 2D signals while music is 1D.

~~~
kastnerkyle
"Style transfer" also rarely works for object level transfer - it is more
pattern based (high frequency content is often the "style" that is enhanced
and transferred). Really nice transfers in practice sometimes require the
object level content in the images to be similar, c.f. [0][1]. And all of this
is coupled with really heavy human curation (people don't normally show their
bad outputs)!

In music the "style" _is the content_ in some sense. For example, jazz has a
very different "style" than classical at many levels (key and tempo
choice/mode choice/melodic intervals/motifs/amount of repetition of said
motifs/how they vary/harmonization and chord choice/global structure (AABA
form)), and it isn't easy to separate which pieces make it "jazz" and which
don't (i.e. which factors of variation matter).

The equivalent in images would be replacing objects as well as texture, to
form a new image that is reminiscent of the original but also novel at
multiple scales - think of the Simpsons' "Last Supper" as the goal of a style
transfer [2].

It is also hard because, as consumers, we are used to hearing high-quality
versions of these types of "style transfer" for some styles all the time -
and we even have a name for it ... "muzak".

[0] [https://raw.githubusercontent.com/awentzonline/image-analogi...](https://raw.githubusercontent.com/awentzonline/image-analogies/master/examples/images/sugarskull-analogy.jpg)

[1] [https://github.com/chuanli11/CNNMRF](https://github.com/chuanli11/CNNMRF)

[2] [http://s267.photobucket.com/user/wiro_bucket/media/last%20su...](http://s267.photobucket.com/user/wiro_bucket/media/last%20supper/SimpsonsLastSupper.jpg.html)

~~~
taco_emoji
I think "cover song" is a more generic term than "muzak" for musical style
transfer.

~~~
kastnerkyle
It can be, though some covers are "straight up", while others (generally the
memorable ones) are practically a new creation in themselves, with a sliding
scale in between. But for "cover song" meaning something like Hendrix's "All
Along the Watchtower" or Coltrane's "My Favorite Things", I agree.

------
peterburkimsher
"Music is NOT well-understood by machines (yet !!)"

I wrote this blog post, with some data that might help improve that.

[http://peterburk.github.io/chordProgressions/index.html](http://peterburk.github.io/chordProgressions/index.html)

------
TheOtherHobbes
Finally someone gets it. At least a little. :)

