So if you look into [1] you'll find .hrl sources (actually .erl) and their .S counterpart in the subfolders. From these you will see that functions `a` and `b` look more and more like each other. Note that `b` is the literal version of `a`.
How much speedup?
Potentially infinite, at the price of compile time.
In real world cases though I don't have any data yet. I want to be able to compile a whole project first and that fails for one last reason, described in the README. But I have faith :)
The patterns that can be superoptimized are few for now. [2] describes how some of lists.erl functions work so that they can be supercompiled.
The long-term idea is to have a local(/global if secured) cache of code so that inter-module calls can be optimized if the developer wants.
My real world case/motivation is supercompilation of code such as [3] (goal is to turn `a` into `b` at the assembly level).
This pattern is extremely common in my employer's software [4] (which is probably the biggest open source Erlang project AFAIK).
From what I understand (on phone so no refs right now), "super compilation" is limited to a linear speedup. However, "distillation" can achieve super-linear speedup.
A SuperCompiler is a dynamic optimizer that can transform the code of an object according to its actual usage.
This may be simple static optimizations, removing state that isn't used from an object; or it may be very complex e.g. partial evaluation of functions leading to code transformation.
It's on a spectrum with it. JITs don't actually do much "optimization", because they need to be fast. It's subjective, but I'd say supercompiling is nearer to profile-guided whole-program optimization.
It would be kind of cool to hook a profiling execution tracer into a JIT to find hot paths, resolve that to the nearest "usefully supercompilable" code block (it sounds like supercompilation works best when fed whole programs, I'm not sure how much bigger-picture comprehension [large-scale] JITs have of program scope in practice) and then supercompile that in a CPU-throttled background thread.
Thinking in the context of LLVM et al.
I'm aware that JS engines do a "fast" compile designed to just generate code now within a few ms, and a fractionally slower compile that does do some optimizations and returns executable results within a few hundred ms.
This could be a third optimization stage that takes 1 second, or maybe even 2 or 3, but produces really really good results. If this doesn't already exist - I don't know, so I can't really assume either way - it almost sounds like low hanging fruit, and a super fun project too.
To be honest I think even the most sophisticated use of super-compilation - the partial evaluation described above - is really just a fancy way of saying inlining, constant folding and expression simplification.
Graal and Truffle are cool. It's unfortunate that it's in Java, mostly because I don't trust Oracle to do the right thing with the community and I expect them to change licenses to make business opportunities for Substrate VM licensing.
Even without SubstrateVM, Graal is a big step forward IMO in terms of interop and performance for the JVM. And Graal/Truffle are under the GPLv3, and only need JDK9 JVMCI interfaces, so I think it's mostly in the clear at this point... They can change the license, but they can't take away the code that's already open.
That said: I agree it seems extremely likely at this point that Oracle is going to milk Substrate for enterprise money, and that's a shame because it is a really important component of the system, in my opinion. And not having it will be a big drag for people who are considering Truffle/Graal.
Not like you're five, but if you're interested: the standard introduction is still 'Partial Evaluation and Program generation', which is freely available
http://www.itu.dk/~sestoft/pebook/pebook.html
Here's the simplest code I could come up with for 'dynamic' partial evaluation, showing interpreter->compiler but not the higher projections: http://wry.me/~darius/writings/peval/
Obviously the first Tales of Interest episode, where the Professor uses the What-If machine to find out what would have happened if he had invented the Finglonger.
Just wanted to say that 99% of this is thanks to [1] and his 2013 Masters thesis.
There has been slight discussion to integrate a supercompiling pass (more involved than the ones that are in the compiler already) over at [2].
Hope to motivate enough people to see this happen, especially since we now have even cheaper constant literals [3]! Let's get on this.
[1] https://github.com/weinholt
[2] https://github.com/erlang/otp/pull/1080#issuecomment-3145985...
[3] https://medium.com/@jlouis666/an-erlang-otp-20-0-optimizatio...