Hacker News
Generating Music Using GANs and Deep Learning (arxiv.org)
84 points by Agrodotus on May 4, 2017 | 22 comments

I'm baffled by this paper. They appear to be trying to generate music from broad musical gestures caught on video. They seem surprised that you can't train a network to synthesize good music from a statistical map of sounds and gestures.

This makes as much sense as using a video of the postures of a developer at work as a training set, combined with the code output, and then wondering why this doesn't generate useful apps when fed by a live posture cam.

Am I missing something?

Generative Adversarial Networks (GANs) are a new thing and it is not at all clear where their best applications will be found. So there are a lot of people throwing them at different problems to see how well they perform (or don't).
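For concreteness, the adversarial objective people are "throwing at problems" can be sketched numerically. This is my own illustration of the standard GAN losses (discriminator binary cross-entropy plus the non-saturating generator loss), not code from the paper:

```python
import numpy as np

def d_loss(d_real, d_fake):
    """Discriminator's binary cross-entropy: push D(x) -> 1 on real
    samples and D(G(z)) -> 0 on generated ones."""
    return -np.mean(np.log(d_real) + np.log(1.0 - d_fake))

def g_loss(d_fake):
    """Non-saturating generator loss: push D(G(z)) -> 1."""
    return -np.mean(np.log(d_fake))

# At equilibrium the discriminator is fooled half the time,
# giving the well-known loss of 2 * log(2) ≈ 1.386.
print(d_loss(np.array([0.5]), np.array([0.5])))
```

The two losses pull in opposite directions, which is exactly why training is unstable and why so much of the experimentation mentioned above is trial and error.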

Only kinda crazy.

As a ukulelist, when I sit in with (say) a guitarist, I do a sequence-to-sequence translation of the chords I see my friend playing to chords I can play. If you know your instrument well, you know the sound that goes with certain visual configurations. So you can think of it as sequence-to-sequence translation with a very non-traditional input.
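The chord "translation" described above can be caricatured as a lookup table. This toy sketch is my own illustration (the fingerings assume standard GCEA ukulele tuning); the actual skill is a learned mapping from visual hand shapes, not a dict:

```python
# Hypothetical lookup table: chord names read off the guitarist's hands,
# mapped to ukulele fingerings (GCEA tuning, one fret number per string).
UKE_SHAPES = {
    "C":  "0003",
    "G":  "0232",
    "F":  "2010",
    "Am": "2000",
}

def translate(chords):
    """Map a sequence of observed chords to playable ukulele shapes."""
    return [(c, UKE_SHAPES[c]) for c in chords]

print(translate(["C", "G", "Am", "F"]))
```

A seq2seq model would learn this mapping from data rather than from a hand-written table, and would have to cope with ambiguous or unseen hand positions.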

You are missing the magic word "deep" in the title. I was looking at Nvidia's GPU conference session lineup recently; it was hilarious how many talks had the words "deep learning" in the title.

Why would that be hilarious?

NVIDIA GPUs are the primary platform on which deep learning training and inference is done. Not only that, NVIDIA is making deep learning the core of what they do.

It's getting a lot of press for sure, but most of their profit (by a large margin) still comes from gaming.

Back in the day, it was the same herd effect with 'XXX Topic Model' papers.

Reminds me of:

There was a whole chain of separate departments dealing with proletarian literature, music, drama, and entertainment generally. Here were produced rubbishy newspapers containing almost nothing except sport, crime and astrology, sensational five-cent novelettes, films oozing with sex, and sentimental songs which were composed entirely by mechanical means on a special kind of kaleidoscope known as a versificator.

-- George Orwell, 1984

Any open source code released? I am very interested to have a try!

I doubt it. This is academia. When discussing releasing the code to prove that the researchers did what they said, I was told that I was unreasonable and should just trust people. If that worked, my house would run on cold fusion and my hybrid would fly.

Edit: science requires reproduction. Opening the code literally enables reproduction. When researchers refuse to do that, they refuse to be scientists. I find this especially irksome when public funds are paying for the research.

I'm as annoyed by this as the next person, but to be fair opening the code isn't necessary for reproduction, it just makes it easier.

If their paper is properly written, there should be enough information there to recreate what they've done. If there isn't, and they refuse to provide that information when asked, then you have every right to complain.


There is a really interesting case study in the Collins Parser. Starting around 1999, Michael Collins published some really exciting results on parsing (interpreting the grammatical structure of natural language). However, people could not replicate them--a "clean room" implementation of Collins' models didn't work nearly as well as the paper claimed it should.

Dan Bickel identified a number of apparently trivial implementation decisions that, when taken together, accounted for Collins' models' improved performance. There's a nice tech report describing this process here: http://repository.upenn.edu/cgi/viewcontent.cgi?article=1026...

Kind of amazing that a single guy put in the effort to reverse-engineer such a complex implementation. Brilliant paper, thanks :)

The thing is that anything ever published in a paper represents a claim: the researchers claim that they have achieved something. You may choose to trust the claim or not, or you may try to reproduce the results if you need to for some reason (for instance, because you want to use them in your own work). Until you have convinced yourself that something is true, you can't be expected to believe it just because someone else (anyone else) says it is.

This is true even for theoretical papers with mathematical proofs and the like. I don't think this is a very controversial opinion. I've often seen reports in the (mostly popular) press starting with "such and such team claims to have solved such-and-such longstanding problem in a new paper released..." etc.

In the popular press we tend to see claims made in papers reported as an absolute fact: "These Danish boffins trained a deep managerial neural pixie network to recognise the sex of starfish" etc. The point that the paper reports the team's own results as the team understands them and that other teams may have a different interpretation of the same results, is often lost in the translation.

The best outlets often include a few opinions from researchers not involved in the work; I tend to trust those more.

One thing I've noticed more with science journalists (which I greatly approve of) is that they seem as a whole to have taken to heart the "get a quote from a researcher who wasn't involved in the study" mantra, so that it's rare to see an article without a "says x who was not involved in the research" line.

Not true for the ML scene, code is very frequently released. cf. http://www.gitxiv.com/

Thankfully this is starting to change; the UK research councils already require open access to papers as a requirement for funding, and they're currently looking at open access to data (which would include code).

It's currently a bit fuzzy what open data requirements would be, so before enforcing the requirement they're trying to pin it down with regard to things like subject/patient/etc. confidentiality, industrial agreements, what to do about massive datasets (e.g. TB+), etc.

There is hope!

> science requires reproducing. Opening the code is literally enabling reproduction. When researchers refuse to do that, they refuse to be scientists.

Yet refusing to open the code is not refusing to enable reproduction.

It can be somewhat tricky, though, if they want to open-source the code but it simply isn't ready to be released yet. The presumption there being that they will eventually open-source it.

They could release it under the CRAPL [0].

[0] http://matt.might.net/articles/crapl/

Yet another example of generative adversarial networks learning mindbogglingly complicated functions, in this case, a function that takes an image of a person playing an instrument and produces sounds made by that instrument, and another function that does the opposite (i.e., generates an image of a person playing an instrument from a sound sample).

Very cool.
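If the two maps described above are meant to invert each other, a cycle-consistency penalty (popularized by CycleGAN) is one common way to couple them. Whether this paper uses such a term is an assumption on my part; the sketch below just shows the idea, with hypothetical linear maps standing in for the two networks:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for the two learned mappings: image -> sound (F) and
# sound -> image (G). Here they are toy linear maps; in the paper
# they would be deep networks.
A = rng.normal(size=(4, 4))
F = lambda img: A @ img                   # "image to sound"
G = lambda snd: np.linalg.inv(A) @ snd    # "sound to image"

def cycle_loss(img):
    """L1 cycle-consistency: image -> sound -> image should return the input."""
    return np.abs(G(F(img)) - img).mean()

x = rng.normal(size=4)
print(cycle_loss(x))  # ~0 for this toy pair, since G is exactly F's inverse
```

With real networks neither map is an exact inverse, so this loss stays positive and is minimized jointly with the adversarial losses.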

Do they not have links to the audio generated anywhere?
