
Open Source Neural Network Synthesizer - burningion
https://nsynthsuper.withgoogle.com/
======
anigbrowl
I'd be more excited if this weren't so tame. The Nord Modular had genetic
algorithm patch mutation well over a decade ago, details starting on page 99.

[http://www.nordkeyboards.com/sites/default/files/files/downl...](http://www.nordkeyboards.com/sites/default/files/files/downloads/manuals/nord-modular-G2/Nord%20Modular%20G2%20English%20User%20Manual%20v1.4%20Edition%201.4x.pdf)

The Hartmann Neuron took a similar approach with neural networks in 2003:
[https://www.soundonsound.com/reviews/hartmann-neuron](https://www.soundonsound.com/reviews/hartmann-neuron)

I mean, well done and everything, it's a good project, but _Synthesize Brand
New Sounds In Ways Never Before Possible!!!_ is a pitch that synth users hear
year after year (pun intended). It turns out that musicians don't like black-box
patching all that much, but prefer morphing things in parameter space because,
being musicians, they want to interact with their instruments, whether that's
timbrally, melodically, or harmonically.

Electronic musicians in particular don't need More Sounds or even More
Oscillators and More Filters and More FX - sure, those are interesting, but
honestly people are already spoiled for choice. What people like most is an
instrument whose timbral range may be limited but which has a strong center -
secondary characteristics remain largely consistent as primary variables are
manipulated, so oscillators don't thin out at higher or lower ranges, filter Q
negative feedback gain isn't damped so aggressively that it changes gain
structure and so forth. The nicest thing an electronic musician can say about
an instrument is not 'it can make so many sounds' but 'you can't get a bad
sound out of it.'

~~~
dharma1
I agree with most things you say, but "you can't get a bad sound out of it" is
pretty subjective, especially when it comes to synthesis. So how would one
define the training set of "good sounds" for supervised training?

One thing I thought would be pretty cool - I've got a friend who did a phd on
physical modeling of sound in Edinburgh -
[http://www.ness.music.ed.ac.uk/](http://www.ness.music.ed.ac.uk/). Often
those physical models have hundreds of parameters and it's quite difficult to
tune them to make meaningful/good sounds. Perhaps a neural network would be
useful for tuning the parameters - you could use a real musician playing a
real instrument with sensors, and use the generated sensor data and audio for
your training set. And then the neural network could learn to match the
physical model parameters to that.
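For what it's worth, the setup described here is plain supervised regression: features
extracted from a real performance (audio descriptors, sensor readings) in, physical-model
parameters out. A minimal sketch with scikit-learn, using random stand-in data - every
name and dimension below is made up for illustration:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Hypothetical training set: each row of X is a feature vector from one recorded
# note (spectral descriptors, sensor data); each row of Y is the physical-model
# parameter vector that should reproduce it. Real data would come from the
# instrumented musician; here it's synthetic.
n_examples, n_features, n_params = 200, 32, 100
X = rng.normal(size=(n_examples, n_features))
true_map = rng.normal(size=(n_features, n_params))
Y = np.tanh(X @ true_map)  # stand-in "ground truth" parameter settings

net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=500, random_state=0)
net.fit(X, Y)

# For a new recording, predict a parameter setting for the physical model.
predicted_params = net.predict(X[:1])  # shape: (1, n_params)
```

The hard part in practice would be collecting aligned sensor/audio/parameter data, not
the regression itself.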

~~~
anigbrowl
It's absolutely subjective; e.g., as a simple example, I love the 303 and you
may hate it but adore some other model. But I generally understand a comment
like that to mean both that the person likes the timbrality of a particular
instrument, and that one won't 'lose' the sound of a patch while tweaking.

I'm not sure how to articulate this. I have synths that I can program in my
head because I know the architecture really well; when I imagine a sound I can
walk over and dial it in from the front panel and get more or less what I
expected, plus I can then tweak the results with abandon and for musical
satisfaction. Then I have others (sometimes from the same manufacturer) which
emit all manner of nice sounds but are far harder to program and easily veer
off into sonic mush - technically impressive but not really fun to play.

~~~
dharma1
Yeah I know what you mean. The "sweet spot" on some synths. I just don't know
how to pose that as a supervised learning problem, or what it would do (limit
the range of parameters?). I'm not sure the number of parameters in a normal
subtractive synth+filter+mod architecture is large enough for very interesting
results that you couldn't achieve otherwise, or how you would generate a
training set that produces a meaningful mapping.

I also think WaveNet's sample-by-sample generation and interpolation between
latent features don't sound that exciting, as cool as they technically are.
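(For readers unfamiliar with the term: "interpolation between latent features" just
means blending the encoder's embeddings of two notes before decoding. A toy sketch -
the embedding size here is arbitrary, and the decoder step is omitted:)

```python
import numpy as np

def lerp(z_a, z_b, alpha):
    """Linear interpolation in latent space: alpha=0 gives z_a, alpha=1 gives z_b."""
    return (1 - alpha) * z_a + alpha * z_b

rng = np.random.default_rng(0)
# Stand-ins for two encoder embeddings (e.g. of a flute note and a snare hit).
z_flute = rng.normal(size=128)
z_snare = rng.normal(size=128)

# The halfway point; a decoder (e.g. WaveNet) would resynthesize this as audio.
z_mix = lerp(z_flute, z_snare, 0.5)
```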

We'll find some place to use machine learning in music/audio eventually :) I
think perhaps more natural-sounding pitch shifting could be one area (since
you could learn the structure of the sound of different instruments at various
pitches), reverb removal, denoising, polyphonic audio to MIDI - things like
that, where you have obvious training data.

------
iammyIP
Neat! Like a Kaoss pad DIY sample 2d crossfader running on rpi3.

This is two parts: a high-end computer that analyses (with ML and neural
magic!) some source waves and outputs blended samples that you can put on a 2D
grid, and, for those generated waves, a simple sample player (made with
openFrameworks) running on an rpi3 that mixes the waves depending on your xy
position.
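The player-side mixing can be thought of as bilinear crossfading over the grid. A
simplified sketch with just four corner waves (the real device blends many
pre-generated samples; the names and sine-wave stand-ins below are invented):

```python
import numpy as np

def xy_mix(samples, x, y):
    """Bilinearly blend four equal-length waveforms placed at the corners of a
    unit square, given a touch position (x, y) in [0, 1] x [0, 1]."""
    weights = {
        "bottom_left":  (1 - x) * (1 - y),
        "bottom_right": x * (1 - y),
        "top_left":     (1 - x) * y,
        "top_right":    x * y,
    }
    return sum(weights[k] * samples[k] for k in weights)

# Usage: four one-second stand-in "waves" at 16 kHz (NSynth's sample rate).
sr = 16000
t = np.arange(sr) / sr
corners = {
    "bottom_left":  np.sin(2 * np.pi * 220 * t),
    "bottom_right": np.sin(2 * np.pi * 330 * t),
    "top_left":     np.sin(2 * np.pi * 440 * t),
    "top_right":    np.sin(2 * np.pi * 660 * t),
}
out = xy_mix(corners, 0.25, 0.75)
```

The four weights always sum to 1, so moving the touch point never changes the
overall gain, only the balance between the source waves.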

However, it doesn't sound interesting or good in what they show; they probably
need a better demo without any Roland classics. Their bass/piano mix sounds
mushy and essentially represents the most boring average synth sound I could
imagine. The most interesting thing is the flute/snare crossover that is
buried in the overladen promotional fluff video.

Would be nice to hear a demo that really puts out the 'new' neural sounds.

edit: the essential 15 seconds of the video here:
[https://www.youtube.com/watch?time_continue=100&v=iTXU9Z0NYo...](https://www.youtube.com/watch?time_continue=100&v=iTXU9Z0NYoU)

~~~
TheOtherHobbes
They missed the boat on being first with NN synthesis by about fifteen years.

[http://www.vintagesynth.com/misc/neuron.php](http://www.vintagesynth.com/misc/neuron.php)

(IMO it's depressing that Google don't appear to know this.)

Musically the sound is a lot less interesting than the engineering is. In fact
it's a perfect demonstration of why you can't just throw NNs at a problem and
expect to get something useful out.

Musical sounds - even synthesised musical sounds - tend to cluster around
certain perceptual parameter sets. If you don't know what those parameter sets
are - and they're not just frequency distributions, or envelope shapes, or
waveform sequences - your model will tend to generate sounds that are
perceived as musically trivial and/or uninteresting.

By a strange coincidence, this was the problem with the Hartmann Neuron. There
was some very clever technology inside the box, but the sounds had none of the
_quality_ that made it a must-have for musical production. It shipped a few
hundred units and then disappeared.

That quality is a very elusive thing. Some synth companies, like Roland, have
been very good at capturing it. But if you ask their designers what they're
aiming for, it's unlikely they'll be able to tell you. Even more strangely,
that quality sometimes appeared in products apparently by accident, when they
were abused to make sounds that were an accidental twist on their original
design.

...Which would be a convincing argument for cultural preference if it weren't
for the fact that many of the classic products that were abused in this way
were made by Roland.

All we know is that musicians respond to that quality when they hear it.
Unfortunately for engineers, sounds that have that quality can have very
little in common with each other. So there's unlikely to be a statistical
process that can engineer "good" sounds with a high hit rate.

------
chrisallick
"You will also need the following Open NSynth Super-specific items" super
specific indeed.

If someone is interested in machine learning and music, I'd send them to
[http://wekinator.org](http://wekinator.org), which is actually a research
project rather than a one-off marketing campaign, and can be set up, run, and
played with in a matter of minutes.

~~~
rtkwe
If you want to actually play with it, Magenta (another Google group, I think),
who provided the actual musical sauce, has released a plugin for Ableton and
for MSG, plus there's the browser experiment.

[https://experiments.withgoogle.com/ai/sound-maker](https://experiments.withgoogle.com/ai/sound-maker)

[https://magenta.tensorflow.org/nsynth-instrument](https://magenta.tensorflow.org/nsynth-instrument)

------
tibbon
I wish I were a little more excited; the results honestly sound rather like
what a Yamaha TG-33 outputs, or any other wavetable synth where you can
crossfade between two sounds.

I _love_ the idea of using neural networks to find new sounds and
possibilities, but for some reason the NSynth project just doesn't hit it for
me. Would love to be convinced otherwise.

------
bluetwo
Neat. Really needs someone to go ahead and mass-produce this. I assume Google
realized the market is too small for them to worry about, but if someone could
build them in bulk I'm sure they would find an audience of people willing to
pay a decent price.

My guess is there are not a lot of people who could both a) build this in a
short amount of time and b) find practical uses for it.

~~~
edna314
> I assume Google realized the market is too small for them to worry about,
> but if someone could build them in bulk I'm sure they would find an audience
> of people willing to pay a decent price.

Definitely underestimating the market if that is the case. My gear acquisition
syndrome is already triggered.

------
bravura
I seem to recall that NSynth and WaveNet only operate at 16 kHz mono, or
perhaps it was even lower. Are we now able to generate full 44.1 kHz sound?

~~~
dharma1
Still 16 kHz - [https://github.com/googlecreativelab/open-nsynth-super/tree/...](https://github.com/googlecreativelab/open-nsynth-super/tree/master/audio)

------
mmjaa
Aphex Twin did it better:

[https://fo.am/midimutant/](https://fo.am/midimutant/)

Not that I mind having this sort of technology promoted by the likes of Google
(casts glance at two 19" racks full of synthesisers), but I think I'd prefer
to go with whatever Mr. D. James comes up with over the corporate bread-maker
path ..

~~~
anjellow
In what way is Aphex Twin's project better?

~~~
noelwelsh
He cares because we do, while Google is a faceless uncaring corporation.

~~~
anjellow
That's a very subjective way of looking at it.

~~~
noelwelsh
It was a joke, referencing an Aphex Twin album:
[https://en.wikipedia.org/wiki/...I_Care_Because_You_Do](https://en.wikipedia.org/wiki/...I_Care_Because_You_Do)

------
hashkb
Distortion was "discovered" by misuse of technology. I'm sure it would never
have been invented / discovered on purpose. Digital technology can't be abused
the same way analog technology can.

~~~
optimuspaul
> Digital technology can't be abused the same way analog technology can.

I assume that is sarcasm

------
posterboy
Too lazy to log into GitHub: where is the nsynth-generate that is referenced
in the repo in audio/readme.md? It's not in the repo in any case, and there's
no link either ... but a hyperlink to tmux is given. Mixed-up priorities!

