
How an Optimizing Compiler Works - lihaoyi
http://www.lihaoyi.com/post/HowanOptimizingCompilerWorks.html
======
bjoli
That is one thing I sorely miss in languages that aren't Lisps. The first
optimizer pass is source->source. In Guile I do

    ,opt expression

And it prints the source code after a first basic pass of the expander and
optimizer. It does constant folding, inlining, dead code elimination and
partial evaluation (and some other things). Being able to inspect what Guile
does helps a lot with macro writing, and is just a generally handy tool.

It beats reading assembly (which is also available) any day of the week.
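
For anyone who has not seen a source-to-source pass, here is a minimal Python sketch of the idea using the standard `ast` module (Python 3.9+ for `ast.unparse`). It is not how Guile implements its optimizer; `ConstantFolder` and `optimize_source` are names made up for illustration, and only integer constant folding is handled.

```python
import ast
import operator

# Binary operators this sketch knows how to fold (an arbitrary small subset).
_FOLDABLE = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
}


class ConstantFolder(ast.NodeTransformer):
    """Replace `<int constant> op <int constant>` with the computed constant."""

    def visit_BinOp(self, node):
        # Visit children first so nested expressions collapse bottom-up.
        self.generic_visit(node)
        fold = _FOLDABLE.get(type(node.op))
        if (
            fold is not None
            and isinstance(node.left, ast.Constant)
            and isinstance(node.right, ast.Constant)
            and isinstance(node.left.value, int)
            and isinstance(node.right.value, int)
        ):
            folded = ast.Constant(value=fold(node.left.value, node.right.value))
            # Keep the position of the expression that was folded away.
            return ast.copy_location(folded, node)
        return node


def optimize_source(source: str) -> str:
    """Parse source text, fold constants, and print source text again."""
    tree = ConstantFolder().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)
```

With this sketch, `optimize_source("x = (1 + 2) * 3 + y")` gives back `x = 9 + y`: the input and the output are both ordinary source text, which is what makes the result easy to read.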

~~~
chrisseaton
How does it do source-to-source while maintaining debug-ability? If the result
of your partial evaluation is source code that is run then what happens when I
set a breakpoint on a line that has been partially evaluated away? Is there
some metadata that is produced alongside the source to allow that breakpoint
to be applied to the code that has been optimized away and how does that work?

This is why optimisations aren't usually source-to-source - they need to
include in the output extra information that isn't normally representable in
source code. Another reason is that compiling to a lower-level representation
gives you more power - when I partially evaluate I gain extra information such
as that an add operator will not overflow, and I want to include that valuable
information in my output even if there is no no-overflow add operator in the
source language.

~~~
bjoli
This is not done over regular Lisp lists, but over Scheme syntax objects that
retain the original source info.

Those syntax objects are also the basis of the hygienic macro systems in many
Schemes (at least ones using syntax-case) so that macros also benefit from
that information.

The Lisp source representation is already an AST, so these kinds of
transformations are trivial.
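
As a rough analogy in Python (not Scheme syntax objects, and not Guile's actual mechanism): AST nodes in the standard `ast` module carry line and column information, and a rewrite can hand that position on to the node it produces. The function name `fold_last_assignment` is hypothetical, and the folded value is passed in by the caller rather than computed.

```python
import ast


def fold_last_assignment(source: str, value: int):
    """Swap the right-hand side of the last assignment for a constant.

    Returns the rewritten source and the (line, column) that the new node
    inherited from the expression it replaced.
    """
    tree = ast.parse(source)
    assign = tree.body[-1]
    if not isinstance(assign, ast.Assign):
        raise ValueError("last statement must be an assignment")
    original = assign.value
    # copy_location transfers lineno/col_offset from the old node to the new
    # one, so tooling can still map the constant back to the original text.
    replacement = ast.copy_location(ast.Constant(value=value), original)
    assign.value = replacement
    return ast.unparse(tree), (replacement.lineno, replacement.col_offset)
```

The returned position is the one a debugger or error message could use to point at the original expression, even though that expression no longer exists in the rewritten tree.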

------
tom_mellior
This is good, but I'm surprised at the assertion that "We can see that the
function is pure, and evaluate the function up-front to find ackermann(2, 2)
must be equal to 7". I don't know of real-world compilers that would do
this[1], since even pure functions can take arbitrary amounts of time to
evaluate at compile time. The Ackermann function in particular is famous for
its complexity -- it was _constructed_ for complexity.

In the particular case of ackermann(2, 2) you should be able to do it with a
moderate number of inlinings, but I don't think compilers often bother to
inline more than one level of a recursive function.

[1] Of course you can force compile-time evaluation of code in some languages,
like with Lisp macros or C++ templates. But that's not what's happening here.
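
To make the cost concern concrete, here is a small Python sketch of one way a compiler could bound such an evaluation: give it a call budget and give up when the budget runs out. This is an assumption for illustration, not a description of any real compiler; `ackermann_with_budget`, `try_fold` and the default budget of 200 are all made up.

```python
class BudgetExceeded(Exception):
    """Raised when the evaluation uses more calls than allowed."""


def ackermann_with_budget(m: int, n: int, budget: int):
    """Evaluate ackermann(m, n), giving up after `budget` calls.

    Returns (value, number_of_calls_used).
    """
    calls = 0

    def go(m, n):
        nonlocal calls
        calls += 1
        if calls > budget:
            raise BudgetExceeded()
        if m == 0:
            return n + 1
        if n == 0:
            return go(m - 1, 1)
        return go(m - 1, go(m, n - 1))

    return go(m, n), calls


def try_fold(m: int, n: int, budget: int = 200):
    """Return the folded constant, or None if evaluation is too expensive."""
    try:
        value, _ = ackermann_with_budget(m, n, budget)
        return value
    except (BudgetExceeded, RecursionError):
        return None
```

In this sketch `try_fold(2, 2)` succeeds with 7 after only 27 calls, while `try_fold(4, 2)` hits the budget and returns `None`, so the call would be left in place for run time.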

------
kidintech
Have I not drunk my coffee today or does the 'multiplied' value in the
strawman program never change?

It starts at 0 and keeps being multiplied by integers, which should always
result in 0.

EDIT: My bad, it is explained after the code snippet, but not immediately
after so I didn't catch it at first.

~~~
mrgriffin
From the article:

> multiplied does not: it starts off 0, and every time it is multiplied via
> multiplied = multiplied * count it remains 0.

The article then goes on to use that fact for some optimizations.
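
A tiny Python sketch of that observation, reconstructed only from the quoted sentence (it is not the article's program, and `run_loop` and `run_loop_optimized` are hypothetical names):

```python
def run_loop(n: int):
    """Loop in the shape described by the quote: multiplied starts at 0."""
    count = 0
    multiplied = 0
    while count < n:
        multiplied = multiplied * count  # 0 times anything is still 0
        count += 1
    return count, multiplied


def run_loop_optimized(n: int):
    """What is left once `multiplied` is known to be the constant 0."""
    # The loop only advances count up to n (or not at all when n <= 0).
    return max(n, 0), 0
```

Both functions return the same pair for any integer `n`, which is the kind of equivalence an optimizer relies on when it replaces the variable with a constant.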

~~~
kidintech
Yeah haha, caught it just after I wrote my comment. IMO, I expected the author
to talk about that right after the snippet, as it was the first thing that
stuck out.

------
_bxg1
Slightly off-topic, but does Java have standalone (without a class) functions
now? He said he was using Java but I didn't recognize some of this syntax. I
know it's been changing a lot the past few years (and it's been a few years
since I've used it).

~~~
quelltext
Nope. I think he simply used that notation to keep things simple/concise and
focus on the important pieces.

~~~
_bxg1
That's what I was wondering. Odd decision in a post that works through code
line by line and then translates it straight to bytecode.

------
mrkeen
I was intrigued by the claim that a compiler which optimises Java code would
identify pure functions and treat them differently.

Are there any JVM engineers that could confirm or deny this happening (in the
real world)?

------
MauranKilom
The site goes into an endless reload loop when attempting to connect using
HTTPS...

~~~
Sohcahtoa82
Are you using HTTPS Everywhere or some other similar browser plugin?

When I used HTTPS, the page loaded fine and then reloaded itself over HTTP,
but it didn't loop after that.

OP, you may want to look into why your page reloads using HTTP after loading
just fine with HTTPS.

~~~
yorwba
It's probably due to this script tag:

    <script>if (window.location.protocol == "https:")
      window.location.href = "http:" + window.location.href.substring(window.location.protocol.length);
    </script>
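
To see how that script could produce both behaviours reported above (one reload without a plugin, an endless loop with one), here is a small Python simulation. It assumes, as suggested earlier in the thread but not confirmed, that a plugin upgrades every http:// request to https://. `page_script`, `force_https` and `simulate` are hypothetical names, and example.com stands in for the real site.

```python
from urllib.parse import urlsplit, urlunsplit


def page_script(url: str) -> str:
    """Mirror of the script tag: send https:// visitors to http://."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        return urlunsplit(parts._replace(scheme="http"))
    return url


def force_https(url: str) -> str:
    """Hypothetical browser plugin that upgrades every http:// URL."""
    parts = urlsplit(url)
    if parts.scheme == "http":
        return urlunsplit(parts._replace(scheme="https"))
    return url


def simulate(start: str, upgrade=None, max_hops: int = 6):
    """Return the URLs actually loaded, stopping once navigation settles."""
    visited = []
    url = start
    for _ in range(max_hops):
        if upgrade is not None:
            url = upgrade(url)  # the plugin rewrites the request first
        visited.append(url)
        target = page_script(url)  # then the page's script runs
        if target == url:
            break  # no redirect, so the page stays put
        url = target
    return visited
```

Without the plugin, `simulate("https://example.com/post")` loads the HTTPS page once and settles on the HTTP one. With `upgrade=force_https` it never settles and only stops because of `max_hops`.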

~~~
notamy
But why? Genuinely curious, is there a real reason why this might be done?

~~~
lihaoyi
Honestly, I do not remember at all why I put in that script tag. It had
something to do with GitHub Pages not supporting HTTPS properly in the past,
but the details escape me. I should probably remove it.

