
Proebsting's Law: Compiler Advances Double Computing Power Every 18 Years - soundsop
http://research.microsoft.com/~toddpro/papers/law.htm
======
gizmo
I realized this myself when I wrote my first JIT compiler. The x86 code
generated from bytecode was frighteningly naive. With basic code inlining and
redundant instruction elimination, my scripting language ran at 95% of the
speed of the equivalent C program with -O3 optimizations. And the assembly
code generated by gcc was at least 3 times shorter.

Processors do a damn good job at optimizing code, and cache locality is more
important than you'd think. The CPU blazes through redundant mov instructions
like there's no tomorrow.
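
To make that concrete, here's roughly the kind of pass I mean - a toy
peephole sketch in Python over a made-up (op, dst, src) instruction format,
not the actual JIT:

    # Drop moves that can't change machine state.
    def peephole(instrs):
        out = []
        for op, dst, src in instrs:
            if op == "mov" and dst == src:
                continue  # mov eax, eax is a no-op
            if op == "mov" and out and out[-1] == ("mov", src, dst):
                continue  # mov a, b; mov b, a - second copy changes nothing
            out.append((op, dst, src))
        return out

    print(peephole([("mov", "eax", "ebx"),
                    ("mov", "ebx", "eax"),    # dropped
                    ("mov", "ecx", "ecx")]))  # dropped

Naive bytecode translation emits these constantly, which is a big part of why
even a basic pass like this shrinks the output so much.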

The good news is that building a fully functioning native code compiler is
doable by a single person.

~~~
palish
It is. I happen to be working on a small demonstration of one. I'll post the
code for reference when it's done.

------
jimrandomh
This assumes that compiler optimization progresses exponentially, but it's
actually asymptotic. There is a hypothetical ideal form, where no further
optimization is possible. Each optimization brings the program closer to that
form, and as you get close, you start running into diminishing returns.

Besides, most of the interesting problems are questions of how many programs
you can optimize, not how much you can optimize one given program. Automatic
parallelization and vectorization are worth an order-of-magnitude speed
improvement, but compilers can only apply them in a small percentage of the
cases where they would help.
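
To illustrate the applicability problem, a toy sketch (Python/NumPy standing
in for what an auto-vectorizer sees): the first loop has independent
iterations and vectorizes trivially; the second carries a dependency from one
iteration to the next, so it can't naively be parallelized.

    import numpy as np

    a = np.arange(1_000_000, dtype=np.float64)

    # Independent iterations: collapses to wide vector operations.
    b = a * 2.0 + 1.0

    # Loop-carried dependency: element i needs element i-1's result,
    # so the iterations can't simply run in parallel.
    c = np.empty_like(a)
    c[0] = a[0]
    for i in range(1, len(a)):
        c[i] = 0.5 * c[i - 1] + a[i]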

------
silentbicycle
The other people commenting here are missing half of his point: He's saying,
_Let's stop focusing so much on squeezing out another .2% of CPU efficiency
and instead look at languages that make _programmers_ more efficient._

------
ced
We need to move to algorithmic optimization. I want a language that will
_switch by itself_ between representing some sequence X as a vector or as a
list, choosing whichever performs best.

In any case, that's what I'm working on.

~~~
orib
Interesting. Got any links about the approach you're taking to decide what
would be optimal? It seems like that would be a very difficult problem.

~~~
ced
It's one part of an ambitious OS project. I discard a _lot_ of assumptions
about software and programming languages.

It's actually not hard in principle. There is a very easy algorithm for
deciding whether you should use a list or an array: You try it out.
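
A minimal sketch of what I mean, in Python - pick_representation and the
workload are hypothetical stand-ins, not my actual system:

    import timeit
    from collections import deque

    # "Try it out": time the same workload against both candidate
    # representations and keep whichever wins for this usage pattern.
    def pick_representation(workload):
        candidates = {"vector": list, "linked list": deque}
        timings = {name: timeit.timeit(lambda: workload(ctor()), number=50)
                   for name, ctor in candidates.items()}
        return min(timings, key=timings.get)

    # Front-insert-heavy usage favors the linked structure; appends
    # at the end would favor the vector instead.
    def front_inserts(seq):
        for i in range(2000):
            seq.insert(0, i)

    print(pick_representation(front_inserts))  # -> "linked list"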

If I fail, I'll write about it some day...

------
rw
This is intriguing, but it sidesteps _so much_ detail.

~~~
henning
He's done quite a bit of work on compilers, so presumably he's familiar with
the details that allow him to reasonably make sweeping generalizations like
this.

------
jlouis
The second law of computer science is much more dangerous:

"Any program will degenerate until it is written in a conglomerate of perl and
shell script."

There is a Windows corollary: substitute VB for perl and shell script.

Seriously though, I do think the law is right. It will be harder and harder to
squeeze further optimization out of programs. It is much more interesting to
look at other problems - and I think research has moved there as well.

~~~
neilc
If you interpret "compiler optimization" broadly enough, I don't think the
argument is right: compare the performance of a typical Javascript
implementation from 5 years ago with the performance of the best
implementations today. That difference in performance is often the factor that
allows previously infeasible programs to be written in Javascript.

~~~
SapphireSun
I think what is really going on is a power-law effect. After the initial
optimizations, over about a decade or so, additional optimizations only help
so much. So yes, it is helpful to get the compiler right, but there are
diminishing returns after a certain point.

~~~
nostrademons
Dan Sugalski wrote a post on the Parrot blog a couple years ago that I thought
was pretty thought-provoking. In a nutshell, it was:

"I think we - compiler and VM writers - have definitely been on the wrong
track all these years. Instead of looking for ways to squeeze that last
instruction out of a function, we should be looking at memoization, lazy
evaluation, and other ways to avoid whole chunks of work."

That squares with my experience (as an application programmer, not as a
compiler writer). When I've been able to get huge, 2-10x speedups in my
programs, it's always been through _doing less work_, not doing the work
more quickly.
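
In Python terms, the memoization half of that is one decorator away (a toy
sketch, not what Parrot did):

    from functools import lru_cache

    # Memoization: cache results so whole subtrees of work vanish.
    @lru_cache(maxsize=None)
    def fib(n):
        return n if n < 2 else fib(n - 1) + fib(n - 2)

    print(fib(200))  # instant; the uncached version would never finish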

At my first job, there was an issue with the communication protocol between
the UI and a proprietary database. The engineering department spent a month
redoing the protocol with batch calls that fetched data with one command
instead of individually retrieving each object. They ended up with about a 5%
speedup. Then I looked at the logs and saw that a certain very-common API call
was actually making two requests to the database, and the data from the second
one was not needed in 99% of cases (it was a holdover from when the product
was intended for a totally different market). So I changed the API to lazily
fetch the extra data only when it was accessed, and within an hour we had a 2x
speedup.
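
The fix amounted to something like this sketch (Record and fetch_extra are
hypothetical stand-ins for the proprietary API, not the actual code):

    # Stand-in for the second, rarely-needed database request.
    def fetch_extra(key):
        return {"key": key, "details": "..."}

    class Record:
        def __init__(self, key, core_data):
            self.key = key
            self.core = core_data  # fetched eagerly, as before
            self._extra = None     # the part unused 99% of the time

        @property
        def extra(self):
            # The second request now happens only on first access.
            if self._extra is None:
                self._extra = fetch_extra(self.key)
            return self._extra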

So yeah, _existing_ optimizations have probably reached a point of diminishing
returns. But that doesn't mean that _new_ approaches might not result in big
speedups.

