
Understanding Aesthetics with Deep Learning - jipy9
https://devblogs.nvidia.com/parallelforall/understanding-aesthetics-deep-learning/
======
saltenhav
Jurgen Schmidhuber has an interesting perspective on the rewarding aspects of
art and music.

To paraphrase, "Artists (and observers of art) get rewarded for making (and
observing) novel patterns: data that is neither arbitrary (like incompressible
random white noise) nor regular in an already known way, but regular in a way
that is new with respect to the observer's current knowledge, yet learnable
(that is, after learning fewer computational resources are needed to encode
the data)".

In other words, enjoyment of art is about learning (easy) patterns.
Schmidhuber likens "fun" to the improvement of an observer's ability to
compress a scene.
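
As a rough illustration of that compression view, here is a minimal sketch. It
is not Schmidhuber's formal definition: it assumes zlib as a stand-in for the
observer's compressor and a zlib preset dictionary as a stand-in for "what the
observer has learned". The names `encoded_size` and `compression_progress` are
made up for this example.

```python
import random
import zlib


def encoded_size(data: bytes, known: bytes = b"") -> int:
    """Bytes zlib needs for `data`, optionally primed with `known` as a preset dictionary."""
    if known:
        comp = zlib.compressobj(level=9, zdict=known)
    else:
        comp = zlib.compressobj(level=9)
    return len(comp.compress(data) + comp.flush())


def compression_progress(data: bytes, known: bytes) -> int:
    """Bytes saved on `data` once the observer has 'learned' `known` (toy measure)."""
    return encoded_size(data) - encoded_size(data, known)


rng = random.Random(0)  # fixed seed so the run is repeatable


def random_bytes(n: int) -> bytes:
    return bytes(rng.getrandbits(8) for _ in range(n))


motif = random_bytes(256)       # a pattern the observer has not met before
regular = b"ab" * 128           # regular in an already known way
noise_seen = random_bytes(256)  # noise: nothing learned carries over
noise_new = random_bytes(256)

progress = {
    "novel but learnable": compression_progress(motif, known=motif),
    "already regular": compression_progress(regular, known=regular),
    "white noise": compression_progress(noise_new, known=noise_seen),
}
for name, saved in progress.items():
    print(f"{name:>20}: {saved:+d} bytes saved after learning")
```

In this toy setup only the novel but learnable pattern should show a large
saving; the already regular data was cheap to begin with, and fresh noise gains
nothing from having seen other noise.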

The source is worth the read:
[http://people.idsia.ch/~juergen/creativity.html](http://people.idsia.ch/~juergen/creativity.html)

------
phreeza
I wonder how much our sense of aesthetics has to do with the perceived
scarcity or effort of creation needed. I remember the first HDR photos looked
absolutely mind-blowing to me, but now, as the process has been automated and
is ubiquitous, it just looks tacky.

~~~
zaaakk
Read Adorno.

~~~
yusee
Have you read him? If so, why not synthesize his arguments so we can all
understand?

~~~
mccoyspace
Here goes: Aesthetic Theory is a posthumous book by German philosopher
Theodor Adorno and is one of the most important works on aesthetics in 20th
century philosophy. In it, he combines elements from Kant, Hegel and Marx.
From Kant's 1790 work, Critique of Judgement, he draws on the idea of the
Sublime and the notion that art has formal autonomy (which Adorno modifies).
From Hegel, he takes a dialectical understanding of the ultimate goal or aim
of art, and from Marx he approaches art as being inseparable from society as a
whole.

He shows how art has a semi-autonomous 'truth-content' (a la Kant), but one
that is always the product and embodiment of unresolved contradictions within
the larger social fabric (a la Hegel, and more so Marx).

Returning to the NVidia article at the root of this thread, this passage pops
out as problematic (certainly for Adorno, but also in general): "In our case,
we use supervised machine learning, with a dataset of photographs pre-
categorized as aesthetically pleasing or not." There is a real sense of
question-begging going on here. And Adorno would say that this approach
forecloses the very fact that the definitional boundaries of these categories
are constantly shifting and lack any real social stability.
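
To make the concern concrete, here is a minimal sketch of how a supervised
classifier inherits whatever boundary its labels draw. It is not the article's
model: the nearest-centroid classifier, the two features (saturation and
contrast), and all the numbers are hypothetical and chosen only for
illustration.

```python
from statistics import fmean


def fit_centroids(features, labels):
    """Return {label: mean feature vector} for the labelled examples."""
    centroids = {}
    for label in set(labels):
        rows = [f for f, l in zip(features, labels) if l == label]
        centroids[label] = [fmean(column) for column in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Return the label whose centroid is closest to x (squared Euclidean distance)."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(x, centroid))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Hypothetical features per photo: (saturation, contrast), both in [0, 1].
photos = [(0.90, 0.80), (0.85, 0.90), (0.30, 0.40), (0.20, 0.35)]

# The same photos labelled at two different moments of taste (made-up labels).
labels_then = ["pleasing", "pleasing", "not", "not"]
labels_now = ["not", "not", "pleasing", "pleasing"]

new_photo = (0.80, 0.85)
verdict_then = predict(fit_centroids(photos, labels_then), new_photo)
verdict_now = predict(fit_centroids(photos, labels_now), new_photo)
print(verdict_then, verdict_now)
```

The photo and the algorithm are identical in both runs; only the curators'
labels differ, and the verdict follows them.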

edit: links:

[http://www.upress.umn.edu/book-division/books/aesthetic-theo...](http://www.upress.umn.edu/book-division/books/aesthetic-theory)

[http://plato.stanford.edu/entries/adorno/#4](http://plato.stanford.edu/entries/adorno/#4)

[http://plato.stanford.edu/entries/kant-aesthetics/](http://plato.stanford.edu/entries/kant-aesthetics/)

~~~
ibuildthings
Author here. Just to clarify, the motive of this work is to ease curation,
given the massive amount of content being created; it is by no means an
attempt at creativity or originality.

The work is not at all contradictory to Adorno, especially in the sense that
it explicitly tries to be as non-reductionist as possible and assumes that the
notion of aesthetics is a dynamic entity.

There is a finite pattern in the dataset; more interestingly, it has its
share of subtleties (compared with, for example, an image classification
problem), and the technological question is whether we can capture these.

But there is another interesting data question. For our work, we curated our
training set with the help of expert curators. But the dataset itself is a
metamorphosing entity; i.e., it is subject to revision (a continuous process
for us at the moment), and, more interestingly, it is a chance for open debate
between our curators. In some sense, technology allows us to codify and
challenge our notion of aesthetics (especially with the evolution of our
training sets) at a given point in time.

~~~
mccoyspace
Thanks for the followup. I enjoyed reading the article and it is a very
interesting project and a great effort. I did read at the end that it was
about how to amplify the efforts of the human curators -- a great problem to
tackle.

------
kafkaesq
A better title would have been "Using ML to rank and classify images according
to aesthetics." Which in itself is quite impressive. But nowhere do we see
indications that these algorithms "understand" the images, in any meaningful
sense.

~~~
ibuildthings
The author here. I used the term "understanding" not as in machines
understanding the images, but more as a scientific attempt at understanding
aesthetics (snippet from the text: "empowering me to develop systems for
understanding images from a computational and scientific perspective").

~~~
kafkaesq
Fair enough -- thanks for clarifying.

------
yusee
Aesthetics is a game of cat and mouse. Artists create some new things. Then
critics and theorists observe the patterns of composition, color, proportion,
etc., that are popular. These rules are canonized in books. Then artists
challenge the rules.

The comparisons to music theory in this thread are apt. Music theory is always
behind music production.

How can you understand aesthetics without understanding creativity?

------
BanzaiTokyo
I believe that there are factors that go beyond visible composition. Just a
thought experiment: I imagine that the brain would evaluate the aesthetics of
two similar images differently depending on whether it is an image of an
object it recognizes or not. When evaluating the image with a recognized
object, other qualities of the object (that are not necessarily visible in the
image) will be taken into account.

~~~
maldusiecle
Yeah, exactly. No good critic of photography thinks that subject matter is
irrelevant, that you can understand pictures as if they were abstract
compositions of light and color. You might as well try to read a poem in an
unknown language. This algorithm might learn to identify certain cliches, but
it'll never learn what makes a picture powerful.

~~~
randcraw
You have to wonder, though. Is it impossible that there's a "music theory" for
images/paintings/art that explains the mechanics of what makes them more
compelling vs less compelling? I suspect there is, at least to some degree.

Obviously images convey much more information than music, so any theory that
doesn't encompass the semantics of the subject will miss most of the signal.
But is there a theory for the presentation and composition of the subject? To
some degree, I'm confident there is.

Some of the methods used to debug the deep learning of images already do a
fair job of showing the locus of focus in the image where the DNN found
maximum information. I can see such a technique discovering many of the
techniques used by artists and photographers to direct the observer's eye or
juxtapose objects that conflict.
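
One simple technique of this kind is occlusion sensitivity: blank out one patch
at a time and record how much the model's score drops. The sketch below is only
an illustration; `toy_score` is a hypothetical stand-in for a trained network,
and the 6x6 "image" is made up.

```python
def occlusion_map(image, score_fn, patch=2, fill=0.0):
    """Heat map of score drops when each patch x patch block is blanked out."""
    height, width = len(image), len(image[0])
    base = score_fn(image)
    heat = [[0.0] * width for _ in range(height)]
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            rows = range(top, min(top + patch, height))
            cols = range(left, min(left + patch, width))
            occluded = [row[:] for row in image]  # copy so the input is untouched
            for r in rows:
                for c in cols:
                    occluded[r][c] = fill
            drop = base - score_fn(occluded)
            for r in rows:
                for c in cols:
                    heat[r][c] = drop
    return heat


def toy_score(image):
    """Hypothetical model: it only 'looks at' the 2x2 block at rows 2-3, columns 2-3."""
    return sum(image[r][c] for r in (2, 3) for c in (2, 3))


image = [[1.0] * 6 for _ in range(6)]  # uniform 6x6 grayscale image
heat = occlusion_map(image, toy_score)
for row in heat:
    print(" ".join(f"{value:.0f}" for value in row))
```

The non-zero cells of the heat map mark the region the scoring function
depends on. With a real network, the same loop would show which parts of a
photo drive its aesthetic score.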

~~~
d12345m
Perhaps it's not quite analogous to music theory, but what you're describing
in the first paragraph would be referred to as the formal elements of art or
simply the elements of art.

Analysis of these elements (form, line, space, color, and texture) is usually
a part of the sort of art criticism you'd find in academic studio art, art
history, or even just the New York Times art section.

The visual design field has a similar, extended set of elements for describing
the formal elements of a design piece.

In both art and design, works are usually considered effective if they use the
formal elements of art/design to support what you refer to as the semantics of
the subject. That's a broad generalization, but you see it in practice a lot,
so it seems like a fair thing to say.

Academic art history is starting to feel the influence of machine learning and
computer vision precisely because computers can be trained to recognize the
formal elements of art and associate their use with movements and historical
periods. There are way more detailed articles than this one, but this will get
you started if you're interested in this sort of thing:

[https://www.technologyreview.com/s/537366/the-machine-vision...](https://www.technologyreview.com/s/537366/the-machine-vision-algorithm-beating-art-historians-at-their-own-game/)

~~~
posterboy
I heard the term Gestalt Laws, from a German loan word, used to describe this.
I don't know the relation of this to your notion.

[http://www.dict.cc/?s=gestalt](http://www.dict.cc/?s=gestalt)

[https://en.wikipedia.org/wiki/Gestalt_psychology](https://en.wikipedia.org/wiki/Gestalt_psychology)

------
controll
I don't think computers would be able to understand aesthetics. It is a really
high-level concept. Plus, I think deep learning is marketing mumbo jumbo and
does not perform much better than a linear SVM.

~~~
visarga
Then why are we using deep convolutional networks for state-of-the-art vision
and speech when we could just plug in an SVM with handcrafted features? From
what I know, error rates in vision dropped from 25% to less than 5% since deep
learning took off. That's no trifle, especially at the higher end of the
accuracy scale. It's very hard to conquer those last few percent.
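
A quick worked calculation shows why the last few points matter so much. It
uses the 25% and 5% figures quoted above, which are the commenter's
recollection and not verified benchmark numbers; the helper names are made up
for this example.

```python
def relative_error_reduction(old_error: float, new_error: float) -> float:
    """Fraction of the previous mistakes that were eliminated."""
    return (old_error - new_error) / old_error


def mistakes_per_thousand(error_rate: float) -> int:
    """Expected number of wrong answers in 1000 images."""
    return round(error_rate * 1000)


overall = relative_error_reduction(0.25, 0.05)      # the quoted drop
early_point = relative_error_reduction(0.25, 0.24)  # one point gained at 25% error
late_point = relative_error_reduction(0.05, 0.04)   # one point gained at 5% error

print(f"25% -> 5%: {overall:.0%} of the mistakes removed "
      f"({mistakes_per_thousand(0.25)} -> {mistakes_per_thousand(0.05)} per 1000 images)")
print(f"25% -> 24%: {early_point:.0%} of the remaining mistakes removed")
print(f"5% -> 4%: {late_point:.0%} of the remaining mistakes removed")
```

The same one-point gain removes a much larger share of the remaining mistakes
at the high-accuracy end, which is one way to read the claim that the last few
percent are the hardest.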

