
Generative Models - nicolapcweek94
https://openai.com/blog/generative-models/
======
hasenj
This is so cool and I can't help but feel like I'm missing something important
that's taking place and has huge potential.

As a busy programmer who gets exhausted at night from the mental effort
required at my day job, I have a feeling like I will never be able to catch up
at this rate.

Are there any introductory materials to this field? Something I can read
slowly during the weekends, that gives an overview of the fundamental concepts
(primarily) and basic techniques (secondarily) without overwhelming the reader
in the more advanced/complicated techniques (at least during the beginning).

I'd really appreciate any recommendations.

~~~
T-A
For reinforcement learning, one of OpenAI's focus areas, the book by Sutton &
Barto is still the standard reference:
[https://webdocs.cs.ualberta.ca/~sutton/book/the-book.html](https://webdocs.cs.ualberta.ca/~sutton/book/the-book.html)

Improved algorithms have been devised since it was written, see

[http://karpathy.github.io/2016/05/31/rl/](http://karpathy.github.io/2016/05/31/rl/)

and, in particular,

[https://arxiv.org/abs/1502.05477](https://arxiv.org/abs/1502.05477)

~~~
georgehm
I personally found Andrew Ng's videos on reinforcement learning from
cs229@stanford, plus the inverted pole balancing programming assignment, to be
a great intro to the topic.

------
andreyk
Brief summary: a nice intro about what generative models are and the current
popular approaches/papers, followed by descriptions of recent work by OpenAI
in the space. Quick links to papers mentioned:

Improving GANs
[https://arxiv.org/abs/1606.03498](https://arxiv.org/abs/1606.03498)

Improving VAEs
[http://arxiv.org/abs/1606.04934](http://arxiv.org/abs/1606.04934)

InfoGAN [https://arxiv.org/abs/1606.03657](https://arxiv.org/abs/1606.03657)

Curiosity-driven Exploration in Deep Reinforcement Learning via Bayesian
Neural Networks
[http://arxiv.org/abs/1605.09674](http://arxiv.org/abs/1605.09674)

Generative Adversarial Imitation Learning
[http://arxiv.org/abs/1606.03476](http://arxiv.org/abs/1606.03476)

The last one seems especially exciting; I expect imitation learning would be a
great approach for many robotics tasks.

------
brandonb
Very cool. As you're thinking about unsupervised or semi-supervised deep
learning, consider medical data sets as a potential domain.

ImageNet has 1,034,908 labeled images. In a hospital setting, you'd be lucky
to get 1000 participants.

That means those datasets really show off the power of unsupervised, semi-
supervised, or one-shot learning algorithms. And if you set up the problem
well, each increment of ROC translates into a life saved.

Happy to point you in the right direction when the time comes—my email is in
my HN profile.

~~~
aub3bhat
Most top hospitals in the US have high-quality data on millions of patients,
but the legal and bureaucratic challenges to sharing those datasets are
insurmountable. However, if you are affiliated with a university hospital,
it's not difficult to get 690,000 CT scans or time-series data with 400+
signals from 450,000 operations.

Even outcomes data, procedures performed, and diagnoses across multiple visits
can easily be obtained for millions of patients at a national scale. My
research involves applying deep learning to these datasets.

~~~
tansey
Isn't the labeling really tricky, though?

In my limited experience, EHRs aren't usually set up to handle structured
labeling of something like an image. There are lots of different fields for
unstructured text entry. Then the only label left is the billing code, which
ends up being a poor choice of label, since the hospital often bills for what
it can get reimbursed for, not what you actually had.

~~~
PeterisP
You don't need labels for the image if you can get them from other patient
information, in particular the diagnosis.

E.g. you know from image metadata that it's a chest x-ray of patient #1234 on
2012/03/04. Then you automatically check the patient's EHR near that date - do
they have lung cancer Y/N; do they have broken ribs Y/N; do they have TB Y/N,
etc. - and make your image labels based on that. How diagnoses are codified,
though, differs significantly between medical systems; I have no idea how it
works in US EHRs.
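A minimal sketch of this weak-labeling idea in Python (the EHR structure, the
ICD-style code prefixes, and the ±30-day matching window are all made-up
illustrations, not any real system's schema):

```python
# Derive weak binary image labels from diagnoses recorded near the scan date.
# All record structures and codes below are invented for illustration.
from datetime import date, timedelta

# Toy EHR: patient id -> list of (diagnosis_code, diagnosis_date) entries
ehr = {
    1234: [("C34", date(2012, 3, 10)),   # lung cancer (ICD-10-style C34)
           ("S22", date(2011, 7, 1))],   # rib fracture (ICD-10-style S22)
}

# Conditions we want as Y/N labels, keyed by a code prefix
conditions = {"lung_cancer": "C34", "broken_ribs": "S22", "tb": "A15"}

def label_image(patient_id, scan_date, window_days=30):
    """Return {condition: bool} for diagnoses within +/- window of the scan."""
    entries = ehr.get(patient_id, [])
    window = timedelta(days=window_days)
    return {
        name: any(code.startswith(prefix) and abs(d - scan_date) <= window
                  for code, d in entries)
        for name, prefix in conditions.items()
    }

labels = label_image(1234, date(2012, 3, 4))
```

Here the chest x-ray from 2012/03/04 picks up the lung cancer diagnosis six
days later, but not the rib fracture from the previous year.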

------
johnwatson11218
Have these techniques been used to generate realistic-looking test data for
testing software? I have had ideas along these lines, but people think I'm
talking about fuzz testing when I try to describe it.

I'm imagining something where you take a corporate DB and distill it into a
model. That model can then be shared with third parties and used to generate
unlimited amounts of test data that looks like real data without revealing any
actual user info.
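A toy sketch of the idea in Python: fit a simple model to a table, then share
only the model and sample synthetic rows from it. This version models each
column independently; a real system would need to capture cross-column
dependencies and add privacy guarantees (e.g. differential privacy). All field
names and values here are invented:

```python
# Fit per-column distributions to a table, then sample synthetic rows.
import random
import statistics

real_rows = [
    {"age": 34, "country": "US"}, {"age": 51, "country": "DE"},
    {"age": 29, "country": "US"}, {"age": 44, "country": "FR"},
]

def fit(rows):
    """Summarize each column: Gaussian for numeric, frequencies for categorical."""
    ages = [r["age"] for r in rows]
    countries = [r["country"] for r in rows]
    return {
        "age_mu": statistics.mean(ages),
        "age_sigma": statistics.stdev(ages),
        "country_weights": {c: countries.count(c) / len(countries)
                            for c in set(countries)},
    }

def sample(model, n, seed=0):
    """Generate n synthetic rows from the fitted summary only."""
    rng = random.Random(seed)
    names = list(model["country_weights"])
    weights = [model["country_weights"][c] for c in names]
    return [{"age": max(0, round(rng.gauss(model["age_mu"], model["age_sigma"]))),
             "country": rng.choices(names, weights)[0]}
            for _ in range(n)]

model = fit(real_rows)   # only this summary leaves the building
fake = sample(model, 3)  # unlimited synthetic rows, no real user info
```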

~~~
hacker42
That depends on the nature of the data, I think. If the data has a lot of
sequential, sparse, hierarchical statistical dependencies (like source code,
text, or data streams), it might be better modeled by an LSTM. If you have
high-dimensional dependencies (like images, where each pixel tends to depend
spatially on many other pixels), then an autoencoder or some undirected model
might be the right choice.

------
viach
Looks like fake accounts on Facebook will have real unique userpics soon

~~~
bytefactory
And games will have more variety in NPC faces!

------
ElHacker
I really like that they used TensorFlow and published their code on GitHub. It
will help a lot of people like me who are new to the field and want to learn
more about generative models. Amazing work by the OpenAI team!

~~~
aerovistae
In theory everything OpenAI does will be available on GitHub or in some
comparable form: that's the point of the organization. That's why it's called
_Open_ AI. So that we can all share the benefits, instead of just Google
having it for themselves. Because we all know that's who's hoarding the AI
progress.

------
bradscarleton
It looks like they are using both TensorFlow and Theano. Is there a reason to
use both?

~~~
TimSal
The VAE code and the semi-supervised part of the GAN code build on code that
was developed about half a year ago, when TensorFlow was less mature and
lacked speed and functionality compared to Theano. It has since caught up, and
most new projects at OpenAI are now done using TensorFlow, which is what we
used for the newer additions.

~~~
Eridrus
Could you mention a bit about why you're using TensorFlow?

I'm glad you are since I'm using it myself, but I haven't used any other
frameworks so I'm wondering if I should expect more people to head in this
direction, or spend time learning others.

~~~
TimSal
There are currently many excellent frameworks to choose from: TensorFlow,
Theano, Torch, MXNet are all great. The comparative advantage of TensorFlow is
mostly its support in the community (e.g. most stars on GitHub, most new
projects being developed, etc).

------
j2kun
The actual outputs look grotesque. Disembodied dog torsos with seven eyeballs
and such. It's cool, but to me this is clearly showing the local nature of
convolutional nets; it's a limitation that one has to overcome if one is to
truly generate lifelike images from scratch.

~~~
visarga
Those weren't the best images. Current best results don't have disembodied dog
torsos. I remember a paper that was about generating plausible bedroom images.
Not only did they look real, but they could interpolate between two bedrooms
generating a transformation sequence.

~~~
vintermann
Yup, that was the original DCGAN paper:

[https://arxiv.org/abs/1511.06434](https://arxiv.org/abs/1511.06434)

------
dkarapetyan
The generated images look like the stuff nightmares are made of. Which is to
say, they're extremely aesthetically unpleasant. So what exactly have these
networks learned?

~~~
robotresearcher
They've learned an approximation of what stuff looks like projected into 2D.

My guess is that your brain is creeped out by an uncanny-valley-like effect.
The images are plausible in their structure so part of your visual system is
happy, but the causality is not there, so your brain is thrashing around
looking for meaning that is missing.

------
Rexxar
Can we see the generated images somewhere at a higher resolution?

~~~
shpx
No, that's how they come out of the model.

Using larger images makes the code run much slower while giving only slightly
better results, so people usually use tiny images. All their outputs are
32x32.

------
zump
Why do I constantly feel like I'm missing out with all this stuff?

------
pestaa
What beautifully presented research.

------
gradstudent
Interesting topic, tedious article. Paraphrasing:

Q: What's a generative model?

A: Well, we have these neural nets and...

Ugh. I understand the excitement for one's own research but if the point is to
make these results accessible to a wider audience then it's important not to
get lost in the details, at least not right away. IMO, there's very little
here in the way of high-level intuition. If I did not already have a PhD, and
some exposure to ML (not my area), I would probably find this article entirely
indecipherable. Again, paraphrasing:

Q: OK, so I understand you want to create pictures that resemble real photos.
And you really like this DCGAN method, right?

A: Yes! See, it takes 100 random numbers and...

Come on guys. You can do better.

~~~
resu_nimda
FWIW, I found this comment pretty indecipherable. I have no idea how your
examples illustrate your point.

Maybe you can do better as well? Which is to say, effectively communicating
something technical to a diverse audience is difficult, let's not be
unnecessarily derisive.

~~~
gradstudent
>Which is to say, effectively communicating something technical to a diverse
audience is difficult, let's not be unnecessarily derisive.

There's nothing especially derisive in my assessment. I don't think the
content is bad, just boring. I also think it's too technical for a non-
specialist audience.

> Maybe you can do better as well?

My first criticism is that generative models are not something specific to
neural nets but that's not obvious from the article.

My second criticism is that their explanations are overly mechanical. In the
case of DCGAN the article begins by talking about parameters and magic
numbers; i.e. they explain how the thing works rather than what it does, at an
intuitive level.

Clear enough?

~~~
platz
Listen to gradstudent, he knows what he's talking about. This article presents
generative models as a grab bag of NN algorithms that came out this quarter.

~~~
visarga
Most papers are migrating from plain classification to variational methods.
This is the new trend. That means, instead of just predicting labels, they
know also predict a probability for each label, a degree of confidence. And
they work both ways, they can generate, as well as classify. And they are
unsupervised, which means they can benefit from tons of data laying around.
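The "works both ways" point can be illustrated with a classic small generative
classifier, a 1-D Gaussian naive Bayes; this is my own toy sketch with invented
parameters, not something from the article:

```python
# A generative model used in both directions: classify via Bayes rule,
# generate by sampling from a class-conditional distribution.
import math
import random

# Fitted per-class parameters for a single feature, plus class priors
# (all numbers are made up for illustration)
classes = {
    "cat": {"mu": 2.0, "sigma": 0.5, "prior": 0.5},
    "dog": {"mu": 4.0, "sigma": 0.5, "prior": 0.5},
}

def likelihood(x, mu, sigma):
    """Gaussian density p(x | class)."""
    return (math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))
            / (sigma * math.sqrt(2 * math.pi)))

def classify(x):
    """Bayes rule: posterior over classes, i.e. a confidence per label."""
    scores = {c: p["prior"] * likelihood(x, p["mu"], p["sigma"])
              for c, p in classes.items()}
    z = sum(scores.values())
    return {c: s / z for c, s in scores.items()}

def generate(label, seed=0):
    """Run the model 'backwards': sample a feature value for a given label."""
    p = classes[label]
    return random.Random(seed).gauss(p["mu"], p["sigma"])

posterior = classify(2.1)  # heavily favours "cat"
sample = generate("dog")   # a plausible "dog" feature value
```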

~~~
platz
Your definition of "variational methods" is actually a definition of
generative models:
[https://en.wikipedia.org/wiki/Generative_model](https://en.wikipedia.org/wiki/Generative_model).
Variational methods are a much more specific concept, and your answer speaks
more to the general, high-level concept of generative models.

Notice that on the Wikipedia page for generative models, there is a _lot_ more
than variational methods.

