What about superoptimization?
For instance, Stoke (http://stoke.stanford.edu/) or Souper (https://github.com/google/souper). They will not find all optimizations of course, and program equivalence is undecidable indeed, but they are a good shot at optimal compilation, I would say.
Just because a problem is undecidable that doesn't mean that you can't in fact solve it for a large set of interesting instances (maybe all instances you care about).
It means the idea isn't necessarily wholly impractical. That's not nothing.
I imagine superoptimisers scale extremely poorly, but that's a different question. I imagine they're best suited to bit-twiddling problems rather than, say, figuring out the best assembly code to implement a parallel Timsort.
I have to admit ignorance here though, I know almost nothing about superoptimisers.