
Google Brain's Magenta: Multi-Style Image Transfer with Code - cinjon
https://magenta.tensorflow.org/2016/11/01/multistyle-pastiche-generator/
======
salik_syed
As an artist I find it very frustrating when people try to apply style
transfer type techniques in an attempt to emulate an artist like Picasso. It
kinda works and generates a bunch of hype, but it isn't even close. The reason
it's frustrating is that I think Deep Learning is _actually_ capable of
doing this stuff, but the people implementing it need to understand how
Picasso actually did his work.

If you look at cubism the whole idea is to capture multiple sides of a
3-Dimensional object at once. A lot of art is not a "style" but rather a
projection from 3D (or 4D) space to 2D space.

If you wanted to paint a "dog" in the style of Picasso your network would need
to understand the geometry of a dog.

Training on a bunch of 2D before-and-after examples is underspecified.

It's important to understand that it is a mapping from 3D -> 2D ... _NOT_
2D->2D
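
To make the 3D->2D point concrete, here's a toy sketch (purely illustrative, not from any real style-transfer system): perspective projection collapses whole rays of 3D points onto single 2D locations, so distinct 3D scenes can yield identical 2D images, which is exactly why 2D->2D training pairs underspecify the problem.

```python
# A minimal sketch of why a 2D image underspecifies 3D structure:
# perspective projection maps many 3D points onto the same 2D pixel.

def project(point3d, focal_length=1.0):
    """Perspective-project a 3D point (x, y, z) onto the image plane."""
    x, y, z = point3d
    return (focal_length * x / z, focal_length * y / z)

# Two different 3D points on the same ray through the camera center...
a = (1.0, 2.0, 4.0)
b = (2.0, 4.0, 8.0)  # a scaled copy, twice as far away

# ...land on the exact same 2D location:
print(project(a))  # (0.25, 0.5)
print(project(b))  # (0.25, 0.5)
```

A network trained only on such 2D outputs has no way to tell `a` and `b` apart, whereas the cubist "style" depends on exactly that distinction.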

Another example is "Nude Descending a Staircase" by Duchamp:
[https://en.wikipedia.org/wiki/Nude_Descending_a_Staircase,_N...](https://en.wikipedia.org/wiki/Nude_Descending_a_Staircase,_No._2)

It is a painting describing motion. To apply style transfer would be
completely stupid because the point of the image is to project 4D->2D ... not
to have wavy black and brown lines.

~~~
pavlov
To translate that kind of conceptual aesthetic logic into an algorithm, the
programmer essentially needs to become the artist: make subjective creative
decisions about the style to achieve, and enshrine those into code. And (as
dmreedy wrote in a sibling comment) that's specifically the kind of "old-
school" AI approach the current DNN-based work is trying to avoid.

I'm not as optimistic as you that the current statistics-driven approaches
could ever reach the kind of deep analytic modeling that would be required for
a style transfer system to be able to look at a Picasso and infer that there's
a 3D->2D mapping at play... And it's a very interesting thought because (to
me) it seems to demonstrate how far we are from actual AI that could make that
kind of inventive conceptual leap.

~~~
salik_syed
What data does an artist consider when he paints? He performs a sort of
optimization procedure very similar to what something like Deep Dream does.
But rather than doing response optimization to make random noise more "dog-
like" or "cat-like" or "human-like" (as Deep Dream does), the optimization is
done to evoke a certain feeling within the artist himself: to create more
extreme feelings than just a photo-realistic rendering would.
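
The "response optimization" loop can be sketched in a few lines (a toy stand-in, not actual Deep Dream code: in the real thing the response is a layer or class activation of a trained CNN, and the input is an image; here `dog_likeness` is a hypothetical one-number score just to show the shape of the loop).

```python
# Toy stand-in for Deep Dream's response optimization: gradient ascent on
# the input itself to maximize a fixed "response" function.

def dog_likeness(x):
    # Hypothetical response score; peaks at x = 3.0
    return -(x - 3.0) ** 2

def grad(f, x, eps=1e-5):
    # Numerical gradient, since this sketch has no autodiff
    return (f(x + eps) - f(x - eps)) / (2 * eps)

def maximize_response(f, x, lr=0.1, steps=200):
    for _ in range(steps):
        x += lr * grad(f, x)  # ascend, not descend: we *amplify* the response
    return x

x0 = 0.0                       # the "random noise" starting point
result = maximize_response(dog_likeness, x0)
print(round(result, 2))        # 3.0: the input drifted toward peak response
```

The artist's version of this would swap `dog_likeness` for "how strongly does this image evoke the feeling I'm after", which is the part no one knows how to write down.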

Feelings and images are correlated with each other through experience.
Certain images are fundamental to human experience and the human brain
through evolution (a mother smiling, scary monsters). Others are learned
(ever been hit by a car? I bet that every time you see that exact model and
color of car you'll feel an emotion).

Here's a thought experiment:

What if we fed the deep learning "painter" tons of 3D animation? Each point
in time would be a full 3D scene, and each point in time would be labelled
with emotions: "scary", "happy", "angry".

I bet the algorithm could generate original art and learn new artistic styles
by maximizing response to certain permutations of feelings.

~~~
visarga
Learning from video has been researched in many papers. The way video data is
structured allows new objects to be identified by comparing consecutive
frames. It builds a "model of the physical world" that can predict the future
a few time steps ahead. This is being used to identify activities and to help
plan robotic movements.
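
A toy illustration of that "compare consecutive frames, predict a step ahead" idea (purely illustrative, not from any particular paper; real systems use learned models, not this):

```python
# Frames are tiny 1-D "images" that wrap around at the edges.
# Comparing two consecutive frames recovers the motion; assuming the
# motion continues gives a one-step prediction of the future.

def estimate_shift(prev, curr):
    """Find the circular shift that best aligns two consecutive frames."""
    n = len(prev)
    def score(s):
        return sum(prev[i] * curr[(i + s) % n] for i in range(n))
    return max(range(n), key=score)

def predict_next(curr, shift):
    """Constant-velocity model: assume the observed shift repeats."""
    n = len(curr)
    return [curr[(i - shift) % n] for i in range(n)]

# An "object" (value 9) sliding right one pixel per frame:
f0 = [9, 0, 0, 0]
f1 = [0, 9, 0, 0]

shift = estimate_shift(f0, f1)    # motion recovered by comparing frames
print(shift)                      # 1
print(predict_next(f1, shift))    # [0, 0, 9, 0]: the predicted next frame
```

The learned versions do the same thing implicitly, with the "shift" replaced by whatever latent dynamics the network discovers.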

------
gabipurcaru
Why is everyone working on style transfer? It doesn't seem like such an
interesting problem in the field, compared to things like speech recognition
for example. Is it just because it's a "cracked" problem and it looks nice?
I'm just genuinely curious here, not trying to bash the amazing work these
people do.

~~~
bertiewhykovich
Because it's a way to avoid confronting the increasingly unavoidable fact that
the AI renaissance DNNs were supposed to usher in is looking less and less
impressive. Unsurprising, given that throwing more computing power at neural
networks doesn't constitute a fundamental leap forward -- but disconcerting to
a community that expected, and promised, far more than is being delivered.

~~~
Florin_Andrei
Hold on a second. We're still in the very, very early stages here. We haven't
even started to connect those networks together to make hierarchies.

You're speaking like someone watching the Wright brothers testing some of
their earliest models, and going "supersonic flight my ass, you guys can't
even fly across this football field".

~~~
therein
> We haven't even started to connect those networks together to make
> hierarchies.

What's stopping us at this point?

~~~
zardo
Nothing, but it might not be the right approach.

We didn't progress from the Wright Flyer by stacking more and more wings on.
(Although that path was explored for a couple of decades.)

------
bcheung
Not sure how related this is but seems like the right crowd to ask...

As a photographer who also programs full time, I've been wondering what it
would take to synthesize skin texture to remove imperfections: small scars,
wrinkles, etc. Currently I just use the healing brush in Photoshop, but I'm
wondering if ML can be used to do it automatically.

Does anyone have any recommendations on what sub-fields or papers I could read
to get a better idea of what would be involved to create a solution like that?

~~~
whataretensors
Sounds like a problem for inpainting.

Here's a paper about synthesizing human faces; it includes inpainting:
[http://www.faculty.idc.ac.il/arik/seminar2009/papers/VisioFa...](http://www.faculty.idc.ac.il/arik/seminar2009/papers/VisioFaces.pdf)

[https://arxiv.org/pdf/1604.07379.pdf](https://arxiv.org/pdf/1604.07379.pdf)
This uses a GAN to inpaint with arbitrary data. It's probably a couple of
iterations away from being easy to implement, as training GANs efficiently and
accurately is still a technical challenge.
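
For intuition, here's the crudest possible non-ML baseline that the GAN approaches are learning to beat (a toy sketch of my own, not from either paper): fill each masked pixel with the mean of its known neighbors, which is roughly what a naive healing brush does.

```python
# Toy inpainting baseline: replace each missing pixel (None) with the
# mean of its known 8-neighborhood. Images are lists of rows of numbers.
# A learned inpainter replaces this mean with a network trained to
# hallucinate plausible texture, which is what you'd want for skin.

def inpaint_mean(img):
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            if img[y][x] is None:
                vals = [img[y + dy][x + dx]
                        for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                        if 0 <= y + dy < h and 0 <= x + dx < w
                        and img[y + dy][x + dx] is not None]
                out[y][x] = sum(vals) / len(vals)
    return out

# A flat patch of skin-toned pixels with one "scar" pixel masked out:
patch = [[10, 10, 10],
         [10, None, 10],
         [10, 10, 10]]
print(inpaint_mean(patch))  # the hole becomes 10.0, matching its surroundings
```

This works fine on flat regions but smears any texture, which is exactly the failure mode the context-encoder GAN in the arXiv paper is designed to fix.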

