
MuseNet - runesoerensen
https://openai.com/blog/musenet/
======
brilee
My take on the classical parts of it, as a classical pianist.

Overall: stylistic coherency on the scale of ~15 seconds. Better than anything
I've heard so far. Seems to have an attachment to pedal notes.

Mozart: I would say Mozart's distinguishing characteristic as a composer is
that every measure "sounds right". Even without knowing the piece, you can
usually tell when a performer has made a mistake and deviated from the score.
The Mozart samples sound... wrong. There are parallel 5ths everywhere.

Bach: (I heard a Bach sample in the live concert) - It had roughly the right
consistency in the melody, but zero counterpoint, which is Bach's defining
feature. Conditioning maybe not strong enough?

Rachmaninoff: Known for lush musical textures and hauntingly beautiful
melodies. The samples got the texture approximately right, although I would
describe them as murky more than lush. No melody to be heard.

~~~
varelse
Funny, as someone with the useless superpower of knowing an enormous number of
TV Show themes, I could hear all sorts of riffs on such themes ranging from
Magnum PI to Sesame Street.

Overall, IMO it wildly gyrates from the best I have ever heard all the way to
the return of Microsoft's thematically lifeless Songsmith without warning...

[https://en.wikipedia.org/wiki/Microsoft_Songsmith](https://en.wikipedia.org/wiki/Microsoft_Songsmith)

------
modeless
I think this is far, far beyond any algorithmic composition system ever made
before. It displays an impressive understanding of the fundamentals of music
theory.

Most previous attempts at neural net composition restricted the training set
to one style of music or even one composer, which is pretty silly if you
understand how neural nets work. It was obvious to me that if you used a very large
network, chose the right input representation, and most importantly used a
complete dataset of all available music, you would get great results. That's
exactly what OpenAI has done here.

It is still lacking some longer term structure in the music it generates (e.g.
ABA form). But I think simply scaling up further (model and dataset size both)
could fix that without any breakthroughs. This seems to be OpenAI's bread and
butter now: taking existing techniques and scaling them up with a few tweaks.
(To be clear, I don't mean to minimize what they've done at all. "Simply"
scaling up is not so simple in reality.)

What might still need some breakthroughs is applying the same technique to raw
audio instead of MIDI. Perhaps what is needed is a more expressive symbolic
representation than MIDI. I'm imagining an architecture with three parts: a
transcription network to produce a symbolic representation (perhaps embedding
vectors instead of MIDI), something like this MuseNet for the middle, and a
synthesis network to translate back to raw audio. This would be analogous to
gluing together a speech recognizer, text processing network, and a speech
synthesizer. Such a system could generate much more natural sounding music,
even perhaps with lyrics.
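
Here is a minimal sketch of the data flow I have in mind, in PyTorch; every
module choice, name, and size below is a placeholder assumption of mine, not
anything OpenAI has described:

```python
# Hypothetical three-stage pipeline; all names and sizes are assumptions.
import torch
import torch.nn as nn

EMBED_DIM = 256  # assumed size of the learned symbolic representation

class Transcriber(nn.Module):
    """Audio features (e.g. mel frames) -> sequence of symbolic embeddings."""
    def __init__(self, n_audio_features=80, embed_dim=EMBED_DIM):
        super().__init__()
        self.rnn = nn.GRU(n_audio_features, embed_dim, batch_first=True)

    def forward(self, frames):              # (batch, time, n_audio_features)
        embeddings, _ = self.rnn(frames)
        return embeddings                   # (batch, time, embed_dim)

class SymbolicModel(nn.Module):
    """MuseNet-like middle: transforms the symbolic sequence. For actual
    generation it would need a causal mask and a next-step objective."""
    def __init__(self, embed_dim=EMBED_DIM):
        super().__init__()
        layer = nn.TransformerEncoderLayer(embed_dim, nhead=8, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=4)

    def forward(self, embeddings):
        return self.transformer(embeddings)

class Synthesizer(nn.Module):
    """Symbolic embeddings -> raw audio (a fixed samples-per-step linear map)."""
    def __init__(self, embed_dim=EMBED_DIM, samples_per_step=512):
        super().__init__()
        self.proj = nn.Linear(embed_dim, samples_per_step)

    def forward(self, embeddings):
        return self.proj(embeddings).flatten(start_dim=1)  # (batch, samples)

audio_features = torch.randn(1, 100, 80)    # 100 dummy mel-spectrogram frames
waveform = Synthesizer()(SymbolicModel()(Transcriber()(audio_features)))
print(waveform.shape)                       # torch.Size([1, 51200])
```

The point is just the shape of the pipeline: audio features in, a learned
symbolic sequence in the middle, audio out; each stage would need its own
training objective before (or while) being glued together.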

~~~
p1esk
How do you compare them to Aiva [1]?

[1] [https://aiva.ai/creations](https://aiva.ai/creations)

~~~
0815test
Aiva is _way_ better than this honestly, and I mean the pure AI part of it,
that (AIUI) only really generates monotimbral, piano-like music (the
orchestration in the full pieces they release was - by their own admission -
done by humans, last I looked into it). You can actually _tell_ that Aiva
involves real _serendipitous_ ("out-of-sample") creativity of the sort that
AIs (and good human composers!) are best at. (But the openly-available models
I mentioned elsewhere in the thread are still a bit better, TBH. At least
IMHO, and when it comes to rewarding active listening.)

------
jefftk
I don't know the other genres well enough to evaluate them, but the bluegrass
one ([https://soundcloud.com/openai_audio/genre-bluegrass](https://soundcloud.com/openai_audio/genre-bluegrass))
is pretty bizarre:

* Who uses piano as the lead instrument in bluegrass?

* They're only using one note velocity for the whole piece, which misses a huge wealth of variation through rhythmic accent.

* Timing generally feels a bit robotic?

* In the best parts it sounds kind of OK but boring; in the bad parts it sounds like nothing ([https://soundcloud.com/openai_audio/gaga-beatles](https://soundcloud.com/openai_audio/gaga-beatles) is even worse that way)

* A lot of the artificiality seems to be the synth they're using.

* On the other hand it does understand a bit about phrasing and repetition, which many people new to traditional music take a long time to pick up.

~~~
throwaway66666
It's MIDI. It's meant to be notation, not a realistic composition. I suppose
the background "machine gun" piano notes would be a bassy synth combined with
90s electronic music drums. So please do not focus that much on the audio or
performance part of it, but on the composition part.

I do agree they could have trained it on the importance of velocity, though.
(That neural net, and most young music students out there too, heh)

~~~
jefftk
_> It's MIDI. It's meant to be notation, not a realistic composition._

MIDI certainly can represent realistic compositions, when used with good
synths, though I agree it's not what it's known for!

 _> I suppose the background "machine gun" piano notes would be a bassy synth
combined with 90s electronic music drums._

Uh, what sort of bluegrass music do you listen to and where can I find some of
it?

~~~
hatsunearu
I think they meant the gaga-beatles one; there are a lot of "machine gun"
piano sounds.

[https://soundcloud.com/openai_audio/gaga-beatles](https://soundcloud.com/openai_audio/gaga-beatles)

------
rco8786
This seems incredibly applicable to musical scores in movies. I can imagine a
product where the editor/director/someone inputs a handful of variables (mood,
genre, instruments, etc) and timing requests (crescendo beginning at 30s and
ending at 75s, calm period from 90s - 120s, etc) and out comes a musical score
for the movie that matches up with their scene editing.
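
The input to such a product might look something like this; every name and
field below is invented for illustration, there is no such API:

```python
# Hypothetical request object for the scoring tool described above.
from dataclasses import dataclass, field
from typing import List

@dataclass
class TimingRequest:
    kind: str       # e.g. "crescendo" or "calm"
    start_s: float  # seconds into the scene
    end_s: float

@dataclass
class ScoreRequest:
    mood: str
    genre: str
    instruments: List[str] = field(default_factory=list)
    timing: List[TimingRequest] = field(default_factory=list)

# "Crescendo beginning at 30s and ending at 75s, calm period from 90s-120s":
request = ScoreRequest(
    mood="tense",
    genre="orchestral",
    instruments=["strings", "timpani"],
    timing=[TimingRequest("crescendo", 30, 75),
            TimingRequest("calm", 90, 120)],
)
print(request)
```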

~~~
wilg
Maybe more interesting for video games, where you can have a realtime input of
gameplay variables.

~~~
fastball
That could be _very_ interesting.

Every playthrough a different soundtrack. Sounds fun.

~~~
squarefoot
Even more interesting if it allows variations in the same soundtrack according
to different dramatic moments in the same scenarios. Example: a man is walking
down the street vs a man is walking alone in the night down the street vs a
man is walking alone in the night down the street unaware of a bright red dot
on his back. All these scenarios could use the same soundtrack, but require
different dramatic levels. The game would send data to the music algorithms so
that the soundtrack would reflect the right mood.

------
codelitt
While this may not be perfect composition, this is a surreal (and almost sad)
moment in my life to hear passable music created by a computer of its own
volition. I work at a company that does a lot of machine learning, so I
generally understand its limitations and haven't ever been an alarmist. That
being said, I had always thought of it as being applied to automating work.
For some reason I had considered that which we normally attribute to human
creativity to be off-limits. Sure, it's not great now, but
in 10 years will it be able to compose better music than Chopin? In 10 years,
will music created by a computer surprise and delight me more than music
composed by humans?

~~~
meesles
Fear not, my friend, because the execution is only a part of why music and art
as a whole carries meaning for us.

Consider two tracks that are identical (forget copyright for a minute).
Between one that an AI generated and one that a human composed, I would
personally grant the human-composed version more credit and enjoy it more. The
story of how art is created and the stories of the artist are as substantial
to appreciating art as a stroke of a brush or a note on a page. Computers will
never replicate this until the singularity.

~~~
harigov
What makes you think that a computer that can generate music cannot also
generate a background for the "creator" of that music? It could give you
stories that touch our hearts even more than we can imagine.

~~~
sidthekid
Why generate a fake story? Why not communicate the real and moving journey of
how a single note in the training data travelled through hundreds of neurons
and thousands of matrices, and eventually made it past the final activation
function to become a feature in the output tensor.

~~~
unrealhoang
That must be a touching story, I’m sure.

------
rmbryan
MuseNet will play an experimental concert today from 12-3pm PT, livestreamed on
our Twitch channel. No human (including us) has ever heard these pieces
before, so this will be an interesting showcase of MuseNet’s strengths and
weaknesses. Through May 12th, we’re also inviting everyone to try out our
prototype MuseNet-powered music generation tool.

~~~
daeken
I really, really want to see a live band or orchestra play a set of AI-
composed pieces, none of which were filtered or manipulated by humans. Just
generate MIDI, spit out sheet music (hopefully written well enough that it can
be sight-read), and hope for the best. I'd buy a ticket for that, without a
doubt!

~~~
gdb
We'll be playing generated MIDI without human filtering or manipulation today
(though playing them synthetically of course)! If you know of an orchestra
looking for some music, we could do the rest of what you describe :).

------
Areading314
This may be academically interesting, but the music still sounds fake enough
to be unpleasant (i.e. there's no way I'd spend any time listening to this
voluntarily).

~~~
diehunde
Yes, the lack of a human factor is very noticeable if you are a musician. I
believe it's pretty similar to how grandmasters can tell if they are playing a
human or a bot. Something that's hard to explain. Whether it's better or worse
is just subjective, though.

~~~
laughinghan
Grandmasters also use different tactics against bots:
[https://en.wikipedia.org/wiki/Anti-computer_tactics](https://en.wikipedia.org/wiki/Anti-computer_tactics)

The concept of how different human intelligence is from "AI" fascinates me, as
it would seem to say a lot about the nature of intelligence and how far we are
from AGI (pretty darn far).

------
ericye16
Something I'm curious about: if I make some music I really like through this
tool, do I own the copyright to it? Can I turn generated music into an album
and sell it? I'm not sure if the site does caching, but if it does and another
person and I generate the same music, do we both own rights to it?

~~~
gwern
I got this question a lot about my StyleGAN anime faces & GPT-2-small poetry:
[https://www.gwern.net/Faces#faq](https://www.gwern.net/Faces#faq)

The legal consensus, such as it is, seems to be that (if you did not otherwise
agree to a contract/license modifying this in arbitrary ways) you create a new
copyright & own it if you use their music-editing tool to tweak settings until
you get something you like, because you are exercising creative control,
making choices, and engaging in labor. On the other hand, if you merely
generated a random sample, neither you nor anyone else owns a copyright on it.

~~~
occamschainsaw
How do you prove that, though? Maybe in the future somebody creates a random
painting in FuturePaintGAN(TM)+3DCanvasPrinter(TM) that moves millions of
people to tears and sells for hundreds of thousands in some auction house. Is
that their IP?

What if that person is a monkey[1]? Is it "animal-made art"[2]?

[1] [https://en.wikipedia.org/wiki/Pierre_Brassau](https://en.wikipedia.org/wiki/Pierre_Brassau)

[2] [https://en.wikipedia.org/wiki/Animal-made_art](https://en.wikipedia.org/wiki/Animal-made_art)

~~~
gwern
How do you prove anything to a court about who owns a copyright? You present
what evidence you have and marshal what arguments you can, and hope that the
person in the right prevails.

As your own links indicate, animals have no copyrights, any more than a
computer program would, because they are not human, and copyright is
explicitly granted to human creative efforts.

------
ipsum2
Neat output. The project is very similar to the Google Doodle from a few weeks
back, where they generated music based on user-submitted notes.
[https://www.google.com/doodles/celebrating-johann-sebastian-...](https://www.google.com/doodles/celebrating-johann-sebastian-bach)

Since this is OpenAI, is MuseNet open source?

~~~
gdb
Not yet, but we plan to release the code & pre-trained model!

~~~
dharma1
What kind of hardware (and what size data set) would you need to train with
new types of music?

Doable with a single 1080ti and a couple of hundred midi files?

Also, can you do supervised learning with this - say, melody as input and
chords (with good voice leading) as output?

~~~
gwern
> Doable with a single 1080ti and a couple of hundred midi files?

A 1080ti would probably require something like several days or a week. It
depends on how big the model is... Probably not a big deal. However, a few
hundred MIDI files would be pushing it in terms of sample size. If you look at
experiments like my GPT-2-small finetuning to make it generate poetry instead
of random English text (
[https://www.gwern.net/GPT-2](https://www.gwern.net/GPT-2) ), it really works
best if you are into at least the megabyte range of text. Similarly with
StyleGAN, if you want to retrain my anime face StyleGAN on a specific
character ( [https://www.gwern.net/Faces#transfer-learning](https://www.gwern.net/Faces#transfer-learning) ),
you want at least a few hundred faces. Below that, you're going to need
architectures built specifically for transfer learning/few-shot learning,
which are designed to work in the low _n_ regime. (They exist, but StyleGAN
and GPT-2 are not them.)

------
CuriouslyC
I'm very into both music composition/production and ML, so this is neat. That
being said, I think the "computer generated music" path is probably the wrong
approach in the short term. Music is really complex and leans heavily on both
emotions and creativity, which aren't even on the radar for AI. Being able to
dynamically remix and modulate existing music is still really cool though.

I would kill for a VST tool that would take a set of midi tracks, and
synthesize a new track for a specific instrument that "blends" with them. I
would also kill for something that can take a set of "target notes" and break
them up/syncopate/add rests to produce good melodies, or take a base melody
and suggest changes to spice it up.
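
As a toy illustration of that second idea, something like this; the rhythm
rules here are arbitrary placeholders of mine, nothing like what a real
product would need:

```python
# Toy sketch: turn "target notes" into a rhythmically varied melody by
# randomly subdividing beats and inserting rests.
import random

def embellish(target_notes, beats_per_note=2, rest_prob=0.2, split_prob=0.4,
              seed=0):
    """Expand each target pitch into beats of eighths, quarters, or rests."""
    rng = random.Random(seed)
    melody = []  # list of (pitch_or_None, duration_in_beats)
    for pitch in target_notes:
        remaining = beats_per_note
        while remaining > 0:
            dur = 0.5 if rng.random() < split_prob else 1.0
            dur = min(dur, remaining)
            note = None if rng.random() < rest_prob else pitch  # None = rest
            melody.append((note, dur))
            remaining -= dur
    return melody

# C major arpeggio as target notes (MIDI numbers)
for note, dur in embellish([60, 64, 67, 72]):
    print(note, dur)
```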

~~~
leesec
>Music is really complex and leans heavily on both emotions and creativity,
which aren't even on the radar for AI.

I definitely think creativity is on the radar for AI; see AlphaGo. Everything
that we think is based on emotion is ultimately learnable.

~~~
CuriouslyC
I don't think AlphaGo is a good example here. It plays with strange
brilliance, but I wouldn't call that creativity. If you care to elaborate on
why you said this, I'm curious to hear your rationale.

To me, creativity is really about generation of "aesthetic novelty" which is
hard to get from a ML algorithm that is trying to approximate patterns in
training data. Eventually, there will be models trained on a wide variety of
art, music, stories, etc that can recognize aesthetic and structural
isomorphism between media (say, between a grisly picture and death metal),
then we'll lose our competitive advantage. I don't think we're nearly so close
to that as the singularity types would have us believe though.

~~~
leesec
I suppose I was loosely defining creativity as "the ability to generate
novelty or novel insight". People call AlphaGo creative because it
demonstrated a novel way to play that has since influenced others. AI music
will eventually teach us things about music in a similar way.

------
peter_d_sherman
This thing sounds awesome! After hearing it, I am truly amazed at the level to
which AI has evolved! It's good music... Heck, it's not just good music, it's
potentially borderline (with some tweaks by human professionals) great music!

But, despite this potential greatness, if there's a problem, it's that this AI
_only_ produces music...

What this AI composer really needs is an _AI lyricist_ to _write lyrics_ for
the songs it composes!

Sort of like an AI Lerner to its AI Loewe...

An AI Hammerstein to its AI Rodgers...

An AI Gilbert to its AI Sullivan...

An AI Tim Rice to its AI Andrew Lloyd Webber...

An AI Robert Plant to its AI Jimmy Page...

An AI Keith Richards to its AI Mick Jagger...

An AI Paul McCartney to its AI John Lennon...

An AI Bernie Taupin to its AI Elton John...

An AI James Hetfield to its AI Lars Ulrich...

An AI Weird Al Yankovic... to its... AI Weird Al Yankovic... <g>

You know, an AI Assistant for this... AI Assistant... <g>

Well, an AI Assistant to _write lyrics_ that is... An AI "Lyricistant"... <g>

Come on, I know there's someone in the AI world who can do this! But it might
be a bit challenging... the AI would not only have to write poetry, but it
would have to match that poetry to all of the various characteristics of the
music...

Not an easy task, to say the least!

But, for the right AI researcher... an interesting, challenging, worthy one!

(I think I hear 2001's "Daisy Bell" playing in the background...)

By the way... disclaimer: _I am an AI_. That is, _An AI wrote this message on
HN_.

No, I'm kidding about that! But... _how would you know_? (insert ominous
sounding music here) <g>

------
ucha
While I admire the effort, the music sounds quite unpleasant...

Jukedeck has significantly better AI-generated music, but since I have not
found a description of how their model works, it is hard to compare it to
this.

------
gnahckire
When discussing this with my friends, an interesting question came up: Who
owns the music this produces?

Couldn't one generate music, upload it to Spotify, and get paid based on the
number of listens?

------
craze3
Damn, I was just about to code this!

Generative music is definitely on the come-up. If you like this, also check
out [https://generative.fm/](https://generative.fm/) , which is from another
HN member.

~~~
CamelCaseName
I think [https://brain.fm](https://brain.fm) also came out of HN.

I've used it once or twice, but for whatever reason, nothing sounds better to
me than the music I used to listen to when I was a teenager.

~~~
bduerst
>nothing sounds better to me than the music I used to listen to when I was a
teenager.

Tangential, but listening to music from your teen years is a form of therapy
for people who have developed dementia. It's possible the music we imprint in
our teen years holds a special value in our brains.

------
pdxww
Impressive. Would this model benefit from something like "dilated attention"?
Instead of feeding it raw sound samples, we could split the input into 16 sec,
8 sec, 4 sec (and so on) slices, assign each slice a "sound vector" serving as
a short description of that slice, and let the generator take those sound
vectors as input. This should supposedly let it gain global consistency in its
output.

Now an unpopular opinion. I'm not an ML expert, so take my words with
reasonable skepticism. This fancy GPT-2 model diagram can impress the
uninitiated, but we are initiated, right? There is really no science there and
it's still the good old numbers grinder: an input of fixed size is passed
through a big random pile of matrix multiplications and sigmoids and yields a
fixed-size output. We could technically replace this nice-looking GPT-2 model
with a flat stack of matmuls and tanhs with a ton of weights and, given enough
powerful GPUs (which would cost tens of millions), train that model and get
the same result. It just won't make an impression of science. How are these
GPT-2 models designed? By somewhat random experiments with the model
structure. The key here is the GPU datacenter that can quickly evaluate the
model on a huge dataset. The breakthrough would be achieving the same quality
with very few weights.

~~~
p1esk
_Instead of feeding it raw sound samples, we could split the input into 16
sec, 8 sec, 4 sec and so on slices, assign each slice a "sound vector" serving
as a short description of that slice and let the generator take those sound
vectors as input._

I didn’t quite get it. How would you feed in this variable-sized input?

~~~
pdxww
To illustrate this idea further, let's use soundtrack v=negh-3hi1vE on
youtube. Such soundtracks consist of multiple more or less repeating patterns.
The period of each pattern is different: some background pattern that sets the
mood of the music may have a long period of tens of seconds. The primary
pattern that's playing right now has a short period of 0.25 seconds, plays for
a few seconds and then fades off. The idea is to split the soundtrack into 10
second chunks and map each chunk to a vector of a fixed size, say 128. This is
the same thing we do with words. Now we have a sequence of shape (?, 128) that
can theoretically be fed into a music generator, and as long as we can map
such vectors back to 10 second sound chunks, we can generate music. Then we
introduce a similar sequence that splits the soundtrack into 5 second chunks.
Then another sequence for 2.5 second chunks, and so on. Now we have multiple
sequences that we can feed to the generator, as in the sketch below. Currently
we take 1/48000th second slices and map them to vectors, but that's about as
good as trying to generate meaningful text by drawing it pixel by pixel (which
we can surely do, but the model will have 250 billion weights and take 2
million years to train on commodity hardware).
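
Here's a rough sketch of the chunking step in Python; the per-chunk encoder is
a crude placeholder (a pooled log-magnitude spectrum) standing in for the
learned model discussed further down, and the chunk lengths and vector size
are just the numbers from my example:

```python
# Multi-scale chunking sketch; encode_chunk is a placeholder, not a learned model.
import numpy as np

SAMPLE_RATE = 48000
EMBED_DIM = 128

def encode_chunk(chunk):
    """Placeholder 'sound vector': log-magnitude spectrum pooled to EMBED_DIM bins."""
    spectrum = np.abs(np.fft.rfft(chunk))
    bins = np.array_split(spectrum, EMBED_DIM)
    return np.log1p(np.array([b.mean() for b in bins]))

def multiscale_sequences(audio, chunk_seconds=(10.0, 5.0, 2.5)):
    """Split audio at several timescales; one embedding sequence per scale."""
    sequences = {}
    for secs in chunk_seconds:
        n = int(secs * SAMPLE_RATE)
        chunks = [audio[i:i + n] for i in range(0, len(audio) - n + 1, n)]
        sequences[secs] = np.stack([encode_chunk(c) for c in chunks])
    return sequences  # {scale: array of shape (num_chunks, EMBED_DIM)}

audio = np.random.default_rng(0).standard_normal(60 * SAMPLE_RATE)  # 60 s dummy
for secs, seq in multiscale_sequences(audio).items():
    print(f"{secs}s chunks -> sequence of shape {seq.shape}")
```

Each of these sequences could then be fed to the generator at its own
timescale, the coarse ones carrying the long-period mood patterns.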

~~~
p1esk
How would you map these chunks to vectors?

~~~
pdxww
The same way we map words to vectors or entire pictures to vectors. We'd have
another ML model that would take 1 second of sound as input (48000 1-byte
numbers) and produce, say, a vector of 128 float32 numbers that would
"describe" that 1 second of sound.
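
A sketch of what such a model could look like, as a small 1-D conv net in
PyTorch; the architecture is an arbitrary illustration, and the training
objective that would make these vectors actually "describe" the sound is left
open:

```python
# Hypothetical "sound vector" encoder; the architecture is arbitrary.
import torch
import torch.nn as nn

class SoundVectorEncoder(nn.Module):
    """1 second of audio at 48 kHz in, a 128-dim descriptor out."""
    def __init__(self, embed_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=1024, stride=256), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=8, stride=4), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
            nn.Linear(64, embed_dim),
        )

    def forward(self, audio):                # (batch, 48000), values in [-1, 1]
        return self.net(audio.unsqueeze(1))  # (batch, embed_dim)

second_of_sound = torch.rand(4, 48000) * 2 - 1      # batch of 4 dummy seconds
print(SoundVectorEncoder()(second_of_sound).shape)  # torch.Size([4, 128])
```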

~~~
p1esk
What would be an equivalent of a word for music?

~~~
pdxww
1 second of sound. Or a few seconds of sound.

~~~
p1esk
This would rule out common mapping methods such as word2vec because, unlike
words, the vast majority of 1 sec chunks of audio would be unique (or only
repeating within a single recording).

~~~
pdxww
That's fine. The goal is to map "similar" 1 second chunks to similar vectors.
I'm sure this can be done and uniqueness of sound won't be a problem.

~~~
p1esk
Sure, we can probably find a way to map two similar chunks to two similar
vectors. However, with a 1:1 mapping, the resulting vectors will be just as
unique. That's a problem because, if you recall, we want to predict the next
unit of music based on the units the model has seen so far. Training a model
for this task requires showing it sequences of encoded units of music
(vectors), where we must have many examples of how a particular vector follows
a combination of particular vectors. If most of our vectors are unique, we
won't have enough examples to train the model. For example, shown multiple
examples of the phrase "I'm going to [some verb]", the model will eventually
learn that "to" after "I'm going" is quite likely, that a verb is more likely
after "to" than an adjective, etc. This wouldn't have happened if the model
saw 'going' or 'to' only once during training.

~~~
pdxww
Can we diff spectrograms to define the "distance" between two chunks of sound
and use this measure to guide the ML learning process?

Would it help to decompose sound into subpatterns with Fourier transform?

Afaik, there is a similar technique for recognizing faces: a face picture is
mapped to a "face vector". Yet this technique doesn't need the notion of
"sequence of faces" to train the model. Can we use it to get "sound vectors"?

~~~
p1esk
How would you use spectrogram diffs for training?

I'm not sure what the useful "subpatterns" of sound would be. In language
modeling, there are word-based and character-based models. Given enough text,
an RNN can be trained on either, and I'm not sure which approach is better.
For music, the closest equivalent of a word is (probably) a chord, and the
closest equivalent of a character is (probably) a single note, but perhaps it
should be something like a harmonic, I don't know.

Unlike faces, music is a sequence (of sounds). It's closer to video than to an
image. So we need to chop it up and encode each chunk.

Ultimately, I believe that we just need a lot of data. Given enough data, we
can train a model which is large enough to learn everything it needs in an
end-to-end fashion. The primary achievement of the GPT-2 paper is training a
big model on lots of data. In this work, it appears they only used a couple of
available MIDI datasets for training, which is probably not enough. Training
on all available audio recordings (either raw, or converted to symbolic
format) would probably be a game changer.

------
0815test
Not very impressive/interesting: they aren't releasing their code or models,
and it seems that they had to use lots of rather obscure hacks to make the
whole multi-instrument thing work. (Though I suppose it definitely _is_
impressive in other ways, especially hacking their existing GPT architecture
to do something worthwhile with non-text tokens. So, I'm not saying that any
sort of work should be dismissed outright!)

I do still think that the good old [https://github.com/hexahedria/biaxial-rnn-music-composition](https://github.com/hexahedria/biaxial-rnn-music-composition)
(Hexahedria's Biaxial RNN/Tied Parallel Networks, which also had a published
paper achieving SOTA, at least for its time, on a variety of MIDI datasets)
has more interesting/compelling output musically, despite starting from a
comparatively tiny dataset and using rather elegant and easily understandable
convolution- and RNN-based techniques. Too bad the implementation relies on
Theano, which is quite endangered at this point (it doesn't seem to support
up-to-date Python 3?), but I do think it's a compelling starting point if
anyone really wants to work in this domain.

------
elihu
I expect the next major copyright legal battle will be around the question of
whether content generated by a machine learning system that was trained on
copyrighted data is a derivative work or not.

~~~
bufferoverflow
Proving that a certain piece was generated from a certain dataset will be a
huge obstacle.

Heck, proving a piece was generated would be hard.

------
nate
As an amateur video producer constantly on the lookout for even some basic
music (as in not Grammy-winning, or even "artist" quality) to add to my work,
music I don't have to pay a fortune to license but that also isn't incredibly
unoriginal, this path has me excited.

------
GistNoesis
Nice.

We played a little with transformers inside a browser using tensorflow.js a
few months ago, for real-time music transcription.

For those interested: Website:
[https://gistnoesis.github.io/](https://gistnoesis.github.io/) Project:
[https://github.com/GistNoesis/Wisteria/](https://github.com/GistNoesis/Wisteria/)

My project is currently on hold, but it will definitely receive updates in the
future. It's a just-for-fun project with no hope of monetization, given that
the space is already crowded with Google's Magenta, and now OpenAI is joining
the dance with MuseNet.

------
davesque
ML models that generate music continue to improve. However, to me, the result
still just seems like a novelty and nothing more. Perhaps it will take much
more iteration and improvement in compute power before I hear a truly
inspiring piece of music generated by an AI.

On the other hand, I've been much more impacted by visual "art" already being
generated by AIs. Perhaps the musical medium is ironically harder to crack
since the format is simpler, rather like how it took longer for an AI to
defeat an expert Go player than a chess player.

~~~
p1esk
Have you seen a “truly inspiring” piece of visual art generated by AI?

------
ArtWomb
Wondering if synthesizing a single _instrument_ rather than an entire MIDI
file might be better, as each note could be evaluated via finite differences.
The transformer would take as input a single note from, say, violin or guitar,
and "translate" the pitch and envelope to the new sounds. The physical
simulation of music, rather than notational logic. The problem of long-range
order in the composition still remains unsolved, unless the model accounts for
the importance of transitions in music theory. In Neural Machine Translation,
characters, words, even grammatical rules themselves can pre-define such an
implicit structure.

Enjoyable to the trained mind, to be sure. Especially as a moment in history.
But the tunes themselves lack _soul_. And it was jarring for me, as the
background music I had turned off to catch the tail end of MuseNet's
performance was at a particularly feverish level of human expressiveness ;)

Bad Brains - Live at the CBGB's 1982 (Full Concert)

[https://www.youtube.com/watch?v=2pUlNfdnsAM](https://www.youtube.com/watch?v=2pUlNfdnsAM)

------
nickelcitymario
This was literally a plot point in the last episode of American Gods.

So the turn-around time from sci-fi to reality is... less than a week now?

------
uptownfunk
What speaks to me here is that this is one of the closest things we have to
_creativity_ in AI, and while AGI is a ways away, this is a key step, even
though it may not seem like the kind of hard tech we can put to use in
industry immediately.

I think it’s a subtle distinction, but imagine a model where we can throw in
thousands of math proofs then give the model some initial assumptions and just
let it run wild. I think getting a neural net to model the creative spark /
the ingenuity is what has been missing.

A part of me doesn’t want to believe that it is possible, but a part of me is
genuinely curious as to the consequences.

An exciting time to be alive, folks.

------
ctoth
Was any of the work from [https://openai.com/blog/sparse-transformer/](https://openai.com/blog/sparse-transformer/)
incorporated in this?

------
mises
I think this has great potential for content creators and their use of
royalty-free music. It might help with a lot of the big copyright issues, and
distinguish each person's work with a distinct score.

------
Ericson2314
As a sometimes composer, the thing that makes me most sad about this stuff is
that it's all computer or no computer. Deep learning with symbolic stuff is
still in its infancy, so I don't expect to really mix this with hand-composed
material anytime soon. As far as I understand it as a layman, the way the
input is compressed for learning is also how the sample is fed in, so there's
no way to "pay more attention" to the piece you're augmenting than to the
pieces in the training data.

------
dillondoyle
Can someone live stream to Twitch, collect likes or comments, and then feed
them back in and train a GAN on our collective response to the musical output?

I would LOVE to see where that goes... Is it going to turn into 4-chord pop?
Or maybe more dominant, more resolve-y? Or maybe my assumptions are wrong and
we would collectively train it toward more complex music?

------
DannyB2
Am I remembering correctly that in Orwell's 1984 (the book), music and poetry
were created by machines?

~~~
hoseja
Yes, all this effort to generate music, with disappointing, lifeless results,
is somewhat disturbing.

------
dom96
Pretty cool. Ludum Dare is coming up and this would actually be a pretty nice
way to generate some music.

------
Gimpei
Feels like this kind of thing would be useful for generating better jazz
backing tracks. The programmatic backing tracks you get these days are good,
but can be really repetitive. I wonder if there's some way to force the model
to follow changes.

------
z3t4
Has anyone attempted to make a code autocomplete or snippet/boilerplate
generator using this?! It would be nice when coding on mobile, where it's hard
to move between code and navigation due to the small screen.

~~~
snazz
Wrong thread?

~~~
z3t4
> MuseNet uses the same general-purpose unsupervised technology as GPT-2, a
> large-scale transformer model trained to predict the next token in a
> sequence, whether audio or text.

------
ohiovr
Can it make ukulele music?

Perfect for youtube ads that must have obligatory music, complete with the
"Pond 5.... Pond 5..." drone in the background because the producer can't
pitch in 5 bucks for it.

------
csomar
How does licensing work here? Do they have the right to "merge" that music
into their network's weights? Is there a specific license for usage in a
neural net?

~~~
ArekDymalski
The legal situation of such music was recently described here:
[https://futurism.com/ai-generated-music-legal-clusterfuck](https://futurism.com/ai-generated-music-legal-clusterfuck)

------
SubiculumCode
Seems to have a hard time being the fifth Beatle. I couldn't even get a cheap
Beatles feel, much less a sublime melody. However, some of the piano pieces
are pleasant.

------
mrfusion
What makes this different from those Markov chain models people used to use to
generate nonsensical papers that sound real?

------
rambojazz
Is this open source?

------
asparagui
Cool stuff, Christine!

------
daveheq
I just read a tweet today by one of the React Rally spokespersons saying that
Hacker News is a toxic forum... yet it's one of the least toxic, most
data-oriented forums I've used in years.

~~~
pvorb
Sorry, but I can't take seriously anyone writing this on Twitter, the most
toxic platform of them all.

------
revskill
I always think that becoming a better programmer requires a better
understanding of art and science.

Of literature, music, physics... everything unrelated to the computer itself.

That's why every IT book needs to draw on more outside knowledge, to teach
readers how to really understand the art and science behind everything we
experience.

------
dpcan
As a web developer, all these "do it yourself" website systems make me sick.
The end results look amateur and messy most of the time, and it's an insult to
the industry to give people with no visual or code-based skills the tools to
create sub-par websites.

I'll bet talented musicians and real composers feel the same about MuseNet.
The music probably makes them cringe to their core.

Then there is the general public...

They don't notice anything wrong with do-it-yourself websites, and to them
this music sounds amazing.

~~~
TaupeRanger
Quite true. I've found this to be the case with almost every "composition AI"
system ever created, with the exception of those that aid human composers,
such as those designed by David Cope.

------
mzs
> RELENTLESS DOPPELGANGER \m/ \m/ \m/ \m/ \m/ \m/ \m/ \m/ \m/ \m/ \m/ \m/ \m/
> \m/

[https://youtu.be/CNNmBtNcccE](https://youtu.be/CNNmBtNcccE)

> Neural network generating technical death metal, via livestream 24/7 to
> infinity. Trained on Archspire with modified SampleRNN. Read more about our
> research into eliminating humans from metal:
> [https://arxiv.org/abs/1811.06633](https://arxiv.org/abs/1811.06633)

> More albums [https://dadabots.bandcamp.com/](https://dadabots.bandcamp.com/)

~~~
throwaway66666
Despite this being a cool feat, it sounds so close to the source material
(down to parts of the songs being a lo-fi 1:1 copy-paste) that I don't see a
real point in it even existing (except for the "omg death metal neural nets
lol" factor).

