

Java code optimization project - Stasyan
http://www.supercompilers.com/

======
ramchip
Looks like they hit a roadblock somewhere: _The Java Supercompiler Version 1
is scheduled for completion in late 2003._

I noticed however that Ben Goertzel is listed under "People". He's an
important person in the field of Artificial General Intelligence (like Eliezer
who posts here from time to time). I'm not sure if there's a link between this
(apparently dead) research project and his AGI project Novamente.

~~~
gwern
Novamente is written in Java and is in the symbolic/probabilistic vein of AGI
schemes. From very loose and unsourced scuttlebutt I have heard from time to
time that performance has been an issue. To go out on a limb, I would venture
a guess that it has a lot of possible optimizations which are too difficult
for a compiler but would eat up a lot of programmer time and hinder the
extreme flexibility and changeability that Novamente would need, and so a
smarter optimizer would be of considerable interest to Goertzel.

On the other hand, program optimization in general has been called an
AI-complete problem
(<https://secure.wikimedia.org/wikipedia/en/wiki/AI-complete>), so maybe the
reason is the other way around.

~~~
ramchip
Wow, I thought Novamente was written in C++. Your post makes a lot of sense.

~~~
gwern
Oops; I checked and Novamente is actually written in C++:
[http://www.agiri.org/wiki/Novamente_Cognition_Engine#Current...](http://www.agiri.org/wiki/Novamente_Cognition_Engine#Current_Status)

Turns out it was Novamente's _predecessor_ , the Webmind AI Engine, that was
written in Java.

------
Mathnerd314
The example with function m in their "white paper"
(<http://www.supercompilers.com/white_paper.shtml>) is still not fully
optimized. One first inlines the successive values of y:

    
    
       if (x0<=0) y=0;
        else 
        if (x0>=1) y=1;
        else {
            if ((2-3*x0)<=0) y=0;
            else 
            if ((2-3*x0)>=1) y=1;
            else {
                if ((2-3*(2-3*x0))<=0) y=0;
                else 
                if ((2-3*(2-3*x0))>=1) y=1;
                else {
                    ...
                        else {
                            y = (2-3*(2-3*(2-3*(2-3*(2-3*x0)))));
        }   }   }   }   }
    

These can be solved for x0.

    
    
       if (x0<=0) y=0;
        else 
        if (x0>=1) y=1;
        else {
            if ((2/3)<=x0) y=0;
            else 
            if ((1/3)>=x0) y=1;
            else {
                if (x0<=(4/9)) y=0;
                else 
                if (x0>=(5/9)) y=1;
                else {
                    ...
                        else {
                            y = (2-3*(2-3*(2-3*(2-3*(2-3*x0)))));
        }   }   }   }   }
    

Obviously, because floating point arithmetic is not distributive or
associative (or exact), the actual constants will be slightly different from
1/3, 2/3, etc.

This performs an average of .004 subtractions and multiplications compared to
my estimate of .988 for their algorithm.

Since this will probably not give strictly matching results for values close
to 1/3, 2/3, 4/9, 5/9, etc., this is technically "super optimization" rather
than supercompilation.

Are there any optimizers you know of that could get this far?

------
mmastrac
This is exactly what the Google Web Toolkit compiler does, except this
outputs Java bytecode rather than translating it into JS.

The HotSpot compiler does some of this static analysis at runtime (various
forms of devirtualization and type-tightening), but it's far more effective if
you feed it into a big vat and keep running optimizations on it until you
can't eke out anything more.

------
kaffeinecoma
I'm kind of curious how this would be practical, from a maintenance viewpoint.
Would you then adopt the program's output back as your new "source"? Is the
output readable, maintainable?

If you end up keeping your original source, and merely compile the output from
this thing to get a fast executable, how do you debug stack traces when it
crashes?

~~~
loup-vaillant
The new source is not really source at all; you continue to work on the old
source. If your program has bugs, you've just encountered the "debug vs
release" problem.

Anyway, I can't see how this would be more useful than a classic optimizing
compiler, except for languages like javascript, which is interpreted by third
party clients.

~~~
gwern
> Anyway, I can't see how this would be more useful than a classic optimizing
> compiler, except for languages like javascript, which is interpreted by
> third party clients.

Pretty much every language can benefit from optimizations like constant
folding and inlining (even C, yes); those techniques are subsets of partial
evaluation, and partial evaluation is, apparently, a subset of
supercompilation.

And we should expect these general classes of optimization techniques to offer
speedups.
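
As a toy illustration (hypothetical code, not from the supercompiler): given a
general function and one argument known at compile time, a partial evaluator
can specialize the function on that argument, which subsumes both inlining and
constant folding:

```java
public class PartialEvalDemo {
    // General code: neither operand is known at compile time, so the loop
    // and its counter survive into the compiled program.
    static int power(int x, int n) {
        int r = 1;
        for (int i = 0; i < n; i++) {
            r *= x;
        }
        return r;
    }

    // What a partial evaluator could produce for power(x, 3): the loop is
    // unrolled against the known n and the counter disappears entirely.
    static int cube(int x) {
        return x * x * x;
    }

    public static void main(String[] args) {
        // The specialized version agrees with the general one everywhere.
        for (int x = -3; x <= 3; x++) {
            System.out.println(power(x, 3) + " == " + cube(x));
        }
    }
}
```

Plain constant folding alone can't produce the second version; it takes
inlining plus folding plus unrolling together, i.e. partial evaluation.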

We know that classic optimizing compilers (GCC?) miss a lot of opportunities;
this is especially obvious when you compare C and FORTRAN numeric
performance. Presumably all the FORTRAN optimizations could also be done in C
by a sufficiently smart compiler (they're both Turing-complete languages,
after all), but the C compilers can't reliably figure out when to do them.

Even a language like Haskell which practically goes out of its way to let the
compiler optimize however it wants can benefit:
[http://neilmitchell.blogspot.com/2007/12/supercompilation-fo...](http://neilmitchell.blogspot.com/2007/12/supercompilation-for-haskell.html)
(Although I've read the paper and was confused; it looked more
like partial evaluation to me - inlining and rewriting at compile-time until a
fixed-point is reached - than this runtime supercompilation stuff.)

