Scanning http://www.hackersdelight.org/aha/aha.pdf, it looks like Aha performs some form of exhaustive search over all programs? An SMT solver is probably a better way of doing this search; there's been some work in this direction.
Part of the complexity here in an SMT solution might be encoding all the bizarre instructions that you could possibly take advantage of in x86-like instruction sets.
Interesting, I'd never heard of the technique of searching all possible instruction sequences to find one with desired characteristics. It's the Infinite Monkey Theorem for code.
I ran into it a couple years ago, after hearing from other research from her. Her papers are really impressive. I would have loved that the author of the article had mentioned her.
On a side note, I wouldn't be surprised if the JIT compilers were doing some use of this, allowing some interpreted language to have better performances than compiled C on some benchmarks
Why are JIT compilers time constrained? If you have a banking system that runs all day every day, why does it matter if it takes 5 minutes to compile the hot spots at the top tier of optimisation?
Imagine you happen to walk up to your ATM when it's Mother Banking System is in the middle hotspot compile followed by a GC pause. What would you do if your ATM took 5 minutes+ to respond to your Withdrawal Request?
You'd probably freak out and worry about getting ATM card back.
I would hope an ATM isn't built to use a stop the world GC along with a stop the world jit compiler. I hope you realize how absurd you sound making an argument against jit with a real time device.
If you have a banking system that runs all day every day, why are you not compiling ahead of time and running super optimization then? JIT is most useful where the application or the target is highly dynamic and you aren't able to compile ahead of time.
Superoptimisers can be used on any sequance of instructions, to find a more optimial sequance. Since they approach the problem in a very different way to a compiler, it can do a better job in some cases. And it can work on hand written assembler too.
They tend to be good at finding obscure solutions.
You miss out on branch prediction. There's a bit of a myth that branches are always bad because a branch misprediction is slower than a correctly predicted branch and therefore no branches are better.
A bit of code that eliminates branches but has dependant variables (A must be computed before B can be computed) will have pipeline stalls while the computation for A is completed. Yes it might be faster than a branch misprediction but it will never be as fast as a correctly predicted branch (which takes effectively 0 cycles).
Notice the one thing missing in this post? Benchmarks!
Among other things, there are branch prediction optimal algorithms that either a) only mispredict the final branch, or b) minimize amortized mispredictions.
First, it does an exhaustive search on all possible instruction combinations.
Second, it does not _prove_ that the resulting instructions do the same as the original code. It shows it only for some random numbers and border cases.
There has been talk of using superoptimizers in larger compilers however it is usually not worth the effort unless you know a problem case in particular.
Especially since these kinds of transformations could slow down your code if the branch predictor is working well for your code set.
Both GCC and Clang can compile to branchless solutions for some conditional statements. The FizzBuzz problem for example both GCC and Clang can turn that into a small lookup table
Halting problem perhaps? Even if you put in safeguards, an exhaustive search could still be unacceptably slow. You can never tell when a pathological case might arise.
Jha, Gulwani and colleagues have a very nice paper on oracle-guided synthesis. They show how you can generate loop-free bitvector programs using SMT: http://www.eecs.berkeley.edu/~sseshia/pubdir/synth-icse10.pd.... John Regehr has also written about this on his blog: http://blog.regehr.org/archives/1146.
Part of the complexity here in an SMT solution might be encoding all the bizarre instructions that you could possibly take advantage of in x86-like instruction sets.