
Genetic Programming: Evolution of Mona Lisa - bdfh42
http://rogeralsing.com/2008/12/07/genetic-programming-evolution-of-mona-lisa/
======
patio11
Now that is just a cool hack.

Sadly, I don't think there is much commercial value in _extraordinarily_
CPU-intensive lossy image compression, but it sure beats the heck out of 99.5%
of what I saw in undergraduate AI/algorithms-type courses as a demo.

Incidentally, I think from a marketing perspective that is a great test case
to use for it. If you had picked, I don't know, some portrait by an unknown
artist it would have been no less technically impressive but it wouldn't
spread NEARLY as far. (This is one reason why I managed to work the KJV Bible
and the US Constitution into my test corpora for almost every natural language
processing project I've ever done.)

~~~
henning
Most of the papers one sees about serious, potentially lucrative applications
of biologically inspired optimization metaheuristics (algorithms like the one
in the OP) are engineering-oriented.

For example, NASA used evolutionary computation to design a little antenna
thing, and it went into space, and it was human-competitive (as good or better
than anything a human could design in a reasonable amount of time).
<http://ti.arc.nasa.gov/projects/esg/research/antenna.htm>

Most of the applications, in other words, have little to do with the kind of
highly data-driven web applications with rapidly changing requirements that
seem to occupy a lot of the users on this site. :( Still, it's an excellent
candidate for spare-time/weekend hacking.

Finally, one basic thing to observe about most of these heuristics is that
they are embarrassingly parallel, which means they really could make good use
of a 16-core machine.
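
For instance, each candidate's fitness can be scored independently, so
something like Python's multiprocessing can spread that work across all the
cores (a minimal sketch; the toy population and fitness function are made up
for illustration):

    # Sketch of parallel fitness evaluation with Python's multiprocessing.
    # The toy candidates and fitness function are made up for illustration.
    from multiprocessing import Pool
    import random

    TARGET = [1] * 15

    def fitness(candidate):
        # Lower is better: sum of absolute differences from the target.
        return sum(abs(c - t) for c, t in zip(candidate, TARGET))

    if __name__ == '__main__':
        population = [[random.randint(1, 9) for _ in range(15)]
                      for _ in range(1000)]
        pool = Pool()                           # one worker per core by default
        scores = pool.map(fitness, population)  # evaluated in parallel
        pool.close()
        pool.join()
        print(min(scores))

Selection and breeding stay serial, but for anything like the OP's per-pixel
image comparison, the fitness call is where nearly all the time goes.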

~~~
yters
Koza's a pretty big name in that space:

<http://www.genetic-programming.org/>

------
bemmu
Here's something I'd like to see someone try.

Take pictures of a 3D object from, say, 100 different angles, remembering
which angles they were taken from. Now use this Mona Lisa algorithm to put 3D
polygons inside a box of sufficient size, keeping those that most look like
the original object when viewed from all those angles.

Will you get a compressed 3D representation of an object this way?
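
To make it concrete, here's a toy sketch of the idea, with a voxel grid
standing in for the polygons and three axis-aligned silhouettes standing in
for the 100 photographed angles (every detail here is made up):

    # Toy version of the idea: evolve a voxel grid whose silhouettes match
    # "photos" of a hidden object. Axis-aligned projections stand in for
    # renders from 100 camera angles.
    import numpy as np

    rng = np.random.default_rng(0)
    N = 8

    # The hidden "true" object: a small solid ball of voxels.
    idx = np.indices((N, N, N))
    true_obj = ((idx - N // 2) ** 2).sum(axis=0) < (N // 3) ** 2

    # The "photographs": the object's silhouette along each axis.
    targets = [true_obj.any(axis=a) for a in range(3)]

    def fitness(candidate):
        # Silhouette pixels that disagree with the photos, over all views.
        return sum(int(np.sum(candidate.any(axis=a) != t))
                   for a, t in enumerate(targets))

    best = rng.random((N, N, N)) < 0.5       # random initial voxel grid
    best_score = fitness(best)
    for _ in range(200000):
        child = best.copy()
        x, y, z = rng.integers(0, N, size=3)
        child[x, y, z] = not child[x, y, z]  # flip one voxel (mutation)
        score = fitness(child)
        if score <= best_score:              # keep the child if no worse
            best, best_score = child, score

    # Usually ends at (or near) 0 mismatched pixels -- but note that many
    # different grids can match the same silhouettes.
    print(best_score)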

~~~
lkozma
Usually these types of inverse problems are severely ill-posed: there can be
many 3D configurations that produce the same 2D images, and you need to use
some knowledge of the world (or some regularization) to select the most
plausible one.

A typical application is measuring signals at the surface of the brain and
then trying to reconstruct what happens inside. Have people thought of using
genetic algorithms and other stochastic methods? Probably, yes.

------
bd
If we want to be more precise, this type of optimization is called _simulated
annealing_ (population = 1, no crossover, just mutations):

 _"At each step, the SA heuristic considers some neighbour s' of the current
state s, and probabilistically decides between moving the system to state s'
or staying in state s. The probabilities are chosen so that the system
ultimately tends to move to states of lower energy. Typically this step is
repeated until the system reaches a state that is good enough for the
application, or until a given computation budget has been exhausted."_

<http://en.wikipedia.org/wiki/Simulated_annealing>
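
In code, the loop from the quote looks roughly like this on a toy problem --
matching a list of ones (the energy function and the geometric cooling
schedule are arbitrary choices of mine):

    # Minimal simulated annealing on a toy problem: matching a list of ones.
    import math
    import random

    TARGET = [1] * 15

    def energy(state):
        return sum(abs(s - t) for s, t in zip(state, TARGET))

    state = [random.randint(1, 9) for _ in range(15)]
    e = energy(state)
    temperature = 10.0
    while temperature > 0.01 and e > 0:
        neighbour = list(state)
        neighbour[random.randrange(15)] = random.randint(1, 9)
        e_new = energy(neighbour)
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the system cools.
        if e_new <= e or random.random() < math.exp((e - e_new) / temperature):
            state, e = neighbour, e_new
        temperature *= 0.999                 # geometric cooling
    print(state)                             # usually the all-ones list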

~~~
randomwalker
Unfortunately, it's not even that! Simulated annealing needs a cooling
schedule, which makes the algorithm increasingly reluctant to accept bad moves
as it gets closer to the optimum. What this is, is some combination of hill
climbing and random state-space search. Not sure if it has a name.

Want evidence? Look at the iteration counts. With SA, you'd see the gaps
between successive pictures stay roughly constant or even shrink (I can't
prove it, but I've implemented it, and you usually end up choosing a cooling
schedule that roughly matches the human intuition for "closeness").

As it is, the algorithm that's implemented finds it exponentially harder to
get close to the optimum: you start off with a gap of 100 iterations between
improvements, and you end up at 300,000.

Which is exactly why I think this is even more amazing -- it's just about the
dumbest algorithm you can think of, and it still manages to vectorize an
arbitrary raster image. If an actual GA or SA version were built, it could
have some real potential. (See my comment below.)
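
For reference, the "dumb" loop I mean looks roughly like this, on a toy
problem of matching a list of ones (my sketch, not the OP's actual code):

    # Pure hill climbing: mutate the current state, keep the child only if
    # it scores at least as well.
    import random

    TARGET = [1] * 15

    def error(state):
        return sum(abs(s - t) for s, t in zip(state, TARGET))

    state = [random.randint(1, 9) for _ in range(15)]
    steps = 0
    while error(state) > 0:
        steps += 1
        child = list(state)
        child[random.randrange(15)] = random.randint(1, 9)
        if error(child) <= error(state):
            state = child
    print(steps)

Once the list is nearly all ones, only a tiny fraction of random mutations can
still improve the score, which is the same growing-gap effect you see in the
OP's iteration counts.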

~~~
globalrev
I just did this because I got inspired. What does it count as - genetic
programming? It uses naive mutation and crossover (I don't know if they
qualify as real crossover and mutation) and a simple fitness function. It is
fairly pointless, since all it does is find a list of ones, but it does so in
around 400 generations, while random guessing takes forever.

    
    
      from __future__ import division
      import matplotlib.pyplot as plt
      import random as R

      GOAL = [1] * 15

      def crossover(a, b):
          # Uniform crossover: each position comes from either parent
          # with equal probability.
          return [x if R.random() > 0.5 else y for x, y in zip(a, b)]

      def mutate(xs):
          # Overwrite one random position with a random value in 1..9.
          xs[R.randrange(len(xs))] = R.randint(1, 9)
          return xs

      def fitness(xs):
          # Sum of absolute differences from the goal; 0 means solved.
          total = sum(abs(x - y) for x, y in zip(xs, GOAL))
          return total, xs, total == 0

      def nFittest(population, n=2):
          # Return the winning tuple if the goal is reached, else the
          # n best (a zero-fitness tuple always sorts first).
          fittest = sorted(fitness(x) for x in population)[:n]
          if fittest[0][2]:
              return fittest[0]
          return fittest

      def init(fields, n):
          return [[R.randint(1, 9) for _ in xrange(fields)]
                  for _ in xrange(n)]

      def newGen(a, b, n):
          # Breed n children from the two fittest; mutate about 10% of them.
          children = [crossover(a, b) for _ in xrange(n)]
          return [mutate(c) if R.random() > 0.9 else c for c in children]

      def run(printing=True):
          generation = 0
          population = init(15, 10)
          while generation < 1000000:
              generation += 1
              best = nFittest(population, 2)
              if isinstance(best, tuple):  # goal reached
                  break
              population = newGen(best[0][1], best[1][1], 10)
          if printing:
              print "Nbr of generations:", generation
              print best
          return generation, best

      def plot():
          # Running average of generations-to-solve over 100 trials.
          trials = [run(False)[0] for _ in range(100)]
          averages = [sum(trials[:x]) / len(trials[:x])
                      for x in xrange(1, len(trials) + 1)]
          plt.plot(range(len(averages)), averages, 'ro')
          plt.axis([1, 100, 0, 1000])
          plt.show()

      def test():
          # Baseline: random guessing until the all-ones list shows up.
          c = 0
          while 1:
              c += 1
              x = [R.randint(1, 9) for _ in range(15)]
              if sum(x) == 15:  # all ones, since every value is >= 1
                  break
              if c % 10000 == 0:
                  print c
          print x

~~~
randomwalker
Dude, pastebin? :-)

Yeah, I guess this is GA alright. Your crossover is not right, though -- it
should "cross over" at one point, not flip-flop at each location :-)

[Edit] And there's a lot of inbreeding going on! You want to keep 10-20
sequences from each generation, not just 2.

~~~
yters
Crossing over at one point is not as effective as an even crossover, since it
is effectively an even crossover with one static and one dynamic point:
building blocks in the middle are more likely to be disrupted than building
blocks on the ends.

That said, his approach is also a viable GA operator -- it is called uniform
crossover. But really, to qualify as a GA, his solution only needs some kind
of variation, mutation, and selection.
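
For what it's worth, you can see each operator's bias directly by measuring
how often a pair of loci gets separated (inherited from different parents):
one-point crossover keeps nearby genes together, while uniform crossover
splits every pair half the time. A quick simulation (mine, not from the
thread):

    # How often does each crossover separate two loci, i.e. leave them
    # inherited from different parents?
    import random

    L, TRIALS = 15, 100000

    def separated_one_point(i, j):
        point = random.randrange(1, L)       # child = a[:point] + b[point:]
        return (i < point) != (j < point)

    def separated_uniform(i, j):
        picks = [random.random() < 0.5 for _ in range(L)]
        return picks[i] != picks[j]

    for i, j in [(0, 1), (0, 7), (0, 14)]:
        one = sum(separated_one_point(i, j) for _ in range(TRIALS)) / float(TRIALS)
        uni = sum(separated_uniform(i, j) for _ in range(TRIALS)) / float(TRIALS)
        print("loci %2d,%2d  one-point: %.3f  uniform: %.3f" % (i, j, one, uni))

With L = 15 you should see the one-point separation rate climb from about
0.07 for adjacent loci to 1.0 for the two ends, while uniform stays at 0.5
everywhere.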

------
huhtenberg
This is exceptionally cool - actually on par with, and along the same lines
as, the ideas behind fractal compression.

<http://en.wikipedia.org/wiki/Fractal_compression>

------
shaunxcode
A frame-by-frame video of that would be awesome (and, I assume, incredibly
long - 904314 frames, to be exact). Assuming it takes 24 frames for one second
of film, I guess it would be (904314/24)/60 = 627 hours of film, so maybe it
ought to be a little time-lapsed. Seriously though, I could imagine such cool
film/game effects.

~~~
gdee
You forgot a div there. Those are 627 minutes, so more like 10.5 hours. Your
conclusions stand though :)

~~~
shaunxcode
hah good call

