
Learning to write programs that generate images - dsr12
https://deepmind.com/blog/learning-to-generate-images/
======
wpasc
This is really cool tech and impressive work by DeepMind.

But man does this scare me. I remember a quote: “You furnish the pictures and
I’ll furnish the war.” – William Randolph Hearst, January 25, 1898. This was
in the lead-up to the Spanish-American War.

Can you imagine tech like this, or tech like "deepfakes", being used today?
Text-only fake news has done, and is still doing, damage in elections around
the world. Now imagine it armed with pictures?!?!

In a dueling NN architecture, many say the discriminator will be able to
detect the fake images. But I wonder: is there a threshold where a produced
image is just too damn close to a real picture for even an equally good
discriminating NN to differentiate? In the end, both real and fake images are
just pixel values... what would we do then?

Cool tech, scary possibilities.

~~~
MrLeap
This has already happened, albeit manually. See this clumsy representation:
[https://en.wikipedia.org/wiki/Adnan_Hajj_photographs_controv...](https://en.wikipedia.org/wiki/Adnan_Hajj_photographs_controversy)

It will be interesting if it ever reaches the point where it can be automated
and scaled. I predict a modernized repeat of the War of the Worlds tipping
point, followed by ratcheted skepticism toward all kinds of temporal simulacra.

Then 3D imaging will enter widespread consumer usage and prove very difficult
for neural networks to reproduce convincingly, until it isn't. Trust will be
restored in some kind of media until it's broken. Rinse and repeat.

~~~
opportune
Here's another scary GAN proof of concept [0]. In this case, researchers
transferred someone's facial expressions and mouth movements in real time onto
the faces of public figures. Combined with Google's recent text-to-speech work,
which seems able to produce human voices with believable cadence and inflection
[1], you could make some very convincing fake footage.

[0][https://www.youtube.com/watch?v=ohmajJTcpNk](https://www.youtube.com/watch?v=ohmajJTcpNk)

[1][https://research.googleblog.com/2018/03/expressive-speech-synthesis-with.html](https://research.googleblog.com/2018/03/expressive-speech-synthesis-with.html)

------
lainga
It seems to me like many of the computer-generated MNIST digits involved
retracing the same contours multiple times.

Is it possible to (a) filter out these duplicate strokes, (b) convert them to
heavier-weight single strokes, or (c) change the training regime to not
produce duplicate strokes?

I can see that being useful for e.g. a real robot with a limited amount of ink
or lead (or time to draw each character).

~~~
nthngnss
The reason is that I chose a particular brush from the set of available
brushes ('dry brush'). Since MNIST digits are quite sharp and opaque, the
agent tries to achieve this by retracing the contours. I guess the remedy is
to pick a more appropriate brush style or let the agent choose one.

~~~
lainga
Neat! Thanks and welcome.

------
DanielBMarkham
This is still a GAN, right? It's running an adversarial system at a level
higher than pixel-poking.

~~~
gwern
Depends on whether you see the glass half-full or half-empty. Is it a DRL
actor-critic where the reward & critic happen to be half of a GAN, or is it a
GAN where the generator happens to receive a RL-style loss instead of the
normal discriminator loss? Actor-critic and GANs have always been hard to tell
apart:
[https://arxiv.org/pdf/1610.01945.pdf](https://arxiv.org/pdf/1610.01945.pdf)
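The glass-half-full/half-empty point can be made concrete with a toy sketch: strip away the details and both updates have the same shape, a generator/actor nudging its output in the direction that raises a learned score (discriminator "realness" in the GAN view, critic value in the RL view). Everything below is an illustrative toy, not either paper's actual setup:

```python
# Toy scalar version of the shared update shape. The "critic" here doubles
# as a GAN discriminator score and an RL value estimate; the generator/actor
# just ascends its gradient. No autograd: a central finite difference stands
# in for backprop through the critic.

def critic(x):
    """Learned score: realness (GAN view) or value (RL view). Peaks at x = 3."""
    return -(x - 3.0) ** 2

def grad_critic(x, eps=1e-5):
    """Finite-difference gradient of the critic's score."""
    return (critic(x + eps) - critic(x - eps)) / (2 * eps)

def ascend(x, lr=0.1, steps=200):
    """Generator/actor update: move the output to increase the critic's score."""
    for _ in range(steps):
        x += lr * grad_critic(x)
    return x

print(round(ascend(0.0), 3))  # converges to 3.0, the critic's optimum
```

The differences the paper catalogs (whether the critic sees state-action pairs, whether the score is a probability, how the two are trained against each other) sit on top of this shared skeleton.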

------
eli_gottlieb
Yeah, ok, so you basically reinvented inverse-graphics analysis-as-synthesis,
stuck DEEP NEURAL and the GOOGLE DEEPMIND(TM) brand on it, and now you're
acting like it's the bee's knees.

I'm starting to understand how Juergen Schmidhuber feels.

~~~
nthngnss
You'd be surprised to see that both "inverse graphics" and
"analysis-by-synthesis" are mentioned in the paper.

~~~
eli_gottlieb
Yes, I saw that they are, but at that point I'd have to ask where the novelty
_is_, besides transforming them from probabilistic problems into plain
neural-network problems.

~~~
nthngnss
I would call "inverse graphics" a task. One can solve that task following
different strategies. We demonstrate one way that uses RL and GANs and gives
reasonably good results. For Omniglot, for example, there are works by Lake et
al. that employ a probabilistic perspective, but the amount of hand-engineering
involved makes their approach hard to apply to other tasks.

------
zombieprocesses
"When trained to paint celebrity faces, the agent is capable of capturing the
main traits of the face, such as shape, tone and hair style, much like a
street artist would when painting a portrait with a limited number of brush
strokes:"

That's interesting. Do we know how artists draw? Is it as "algorithmic" as the
article makes it out to be? I don't draw, so I always assumed it was more
intuitive and personal rather than a step-by-step process.

~~~
Isamu
Methods are very individualized and vary also by medium. Check out the
"Manben" videos:

[http://www.dailymotion.com/video/x65w5fu](http://www.dailymotion.com/video/x65w5fu)

It is eye-opening, even among fellow manga artists, to see how different
their processes sometimes are.

Some may start with a definite sketch, others may go straight to ink with only
the barest suggestion of a layout. Sometimes they struggle with expressions
and may whiteout and re-ink (up to seven times in one of the videos.)

Some artists start inking with the eyes, some may start with an outline of the
face. And so on.

------
Ono-Sendai
Haven't looked into the paper. However, you don't need a NN to simulate
painting; basically, it's a search problem.

See my results here:
[https://forwardscattering.org/post/42](https://forwardscattering.org/post/42)

~~~
nthngnss
While it is true that one can simulate painting by search (in some sense), the
problem is that (naive) search doesn't always work (we have an example in the
paper). Moreover, training an agent has the benefit of fast test-time inference
(i.e., you give it an image, and it paints it almost instantly). Of course, we
haven't achieved the ultimate goal yet, but it's a step in that direction.

~~~
Ono-Sendai
Thanks for the reply. Where in the paper is the example given? Skimmed it but
didn't see it. (Edit: nevermind, see it now)

Fast painting is a benefit I guess. My search/painting program is very
computationally intensive.

Edit: I think I see the point of the paper now. Unguided search is going to be
difficult in high-dimensional search spaces like this. So the NNs become a
hopefully-effective heuristic guiding the search.
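That reading can be illustrated with a tiny toy (everything here is made up: the 8-bit "canvas", and a greedy one-step lookahead standing in for a learned guiding heuristic). Blind search over whole stroke sequences scales exponentially with the number of strokes, while even a crude per-stroke heuristic homes in directly:

```python
# Toy contrast between unguided search and heuristic-guided search over a
# "stroke sequence" (here just 8 binary strokes reproducing a target image).
import random

random.seed(0)

TARGET = [1, 0, 1, 1, 0, 1, 0, 0]  # toy target "image", one bit per stroke

def loss(canvas):
    """Reconstruction error: number of strokes that differ from the target."""
    return sum(a != b for a, b in zip(canvas, TARGET))

def random_search(trials=50):
    """Unguided: blindly sample whole stroke sequences, keep the best loss."""
    return min(loss([random.randint(0, 1) for _ in TARGET])
               for _ in range(trials))

def guided_search():
    """Stand-in for a learned heuristic: choose each stroke by one-step lookahead."""
    canvas = [0] * len(TARGET)
    for i in range(len(canvas)):
        canvas[i] = min((0, 1), key=lambda b: loss(canvas[:i] + [b] + canvas[i + 1:]))
    return loss(canvas)

print("unguided best loss:", random_search())
print("guided loss:", guided_search())  # reaches 0 on this toy problem
```

On 8 bits blind search still stumbles onto good answers; the point is that a real canvas has a vastly larger action space, where only the guided variant remains feasible.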

~~~
nthngnss
Yup, exactly! Btw, didn't realize those were your results (just mindlessly
clicked on the link). Very cool stuff!

~~~
Ono-Sendai
Thanks.

The (semi-)obvious next step is to do object/digit recognition with a
Bayesian probability calculation, with probabilities based on this image
reconstruction process. In other words, we choose e.g. digits based on how
likely they are to have been drawn to give the target image.

I have experimented a little with this idea, but with no successful results so
far (plain old NNs still beat it).

------
ultrasounder
Discovered mypaint.org, FTW!!! Looks awesome and extensible too.

------
chillingeffect
What kind of robot arm is that btw?

------
m3andros
What a sweet site! May I ask what SSG you're using?

