Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
Branchless code sequences (davespace.co.uk)
158 points by mnem on April 23, 2015 | hide | past | favorite | 31 comments


Scanning http://www.hackersdelight.org/aha/aha.pdf, it looks like Aha performs some form of exhaustive search over all programs? An SMT solver is probably a better way of doing this search; there's been some work in this direction.

Jha, Gulwani and colleagues have a very nice paper on oracle-guided synthesis. They show how you can generate loop-free bitvector programs using SMT: http://www.eecs.berkeley.edu/~sseshia/pubdir/synth-icse10.pd.... John Regehr has also written about this on his blog: http://blog.regehr.org/archives/1146.

Part of the complexity here in an SMT solution might be encoding all the bizarre instructions that you could possibly take advantage of in x86-like instruction sets.


Never underestimate the utility of brute force & ignorance.


John Regehr's most recent post is also on an example superoptimisation - in this case one where the generated answer isn't obviously correct.

http://blog.regehr.org/archives/1219



The advantage of brute force generation is that you can eliminate a lot of solutions extremely fast by trying random inputs.


This sort of thing was automated to entire binaries in this paper.

http://theory.stanford.edu/~aiken/publications/papers/asplos...

They managed to beat gcc -O3 - "we show speedups from 1.7 to a factor of 10 on some compute intensive kernels"



Interesting, I'd never heard of the technique of searching all possible instruction sequences to find one with desired characteristics. It's the Infinite Monkey Theorem for code.


Here's the original paper on super optimization by Alexia Massalin : http://courses.cs.washington.edu/courses/cse501/15sp/papers/...

I ran into it a couple years ago, after hearing from other research from her. Her papers are really impressive. I would have loved that the author of the article had mentioned her.

On a side note, I wouldn't be surprised if the JIT compilers were doing some use of this, allowing some interpreted language to have better performances than compiled C on some benchmarks

EDIT : formatting and JIT addition


JIT compilers are time constrained so super optimization isn't a very good match.


Why are JIT compilers time constrained? If you have a banking system that runs all day every day, why does it matter if it takes 5 minutes to compile the hot spots at the top tier of optimisation?


Imagine you happen to walk up to your ATM when it's Mother Banking System is in the middle hotspot compile followed by a GC pause. What would you do if your ATM took 5 minutes+ to respond to your Withdrawal Request?

You'd probably freak out and worry about getting ATM card back.


The program doesn't have to stop to run the JIT - it can run it on a background thread.


I would hope an ATM isn't built to use a stop the world GC along with a stop the world jit compiler. I hope you realize how absurd you sound making an argument against jit with a real time device.


Hotspot compilations are blocking? That would surprise and horrify me.


If you have a banking system that runs all day every day, why are you not compiling ahead of time and running super optimization then? JIT is most useful where the application or the target is highly dynamic and you aren't able to compile ahead of time.


If the workload characteristic changes over time you may want to recompile with different profiling data.


That paper caused a small sensation in the compiler community at the time, as everyone rushed to implement it!


Where is this compiler community?


Is there any other reason to use a tool like this other than eliminating branch misprediction?


Superoptimisers can be used on any sequance of instructions, to find a more optimial sequance. Since they approach the problem in a very different way to a compiler, it can do a better job in some cases. And it can work on hand written assembler too.

They tend to be good at finding obscure solutions.


Anyone know why compilers don't already do this for small branches? Not worth the effort?


You miss out on branch prediction. There's a bit of a myth that branches are always bad because a branch misprediction is slower than a correctly predicted branch and therefore no branches are better.

A bit of code that eliminates branches but has dependant variables (A must be computed before B can be computed) will have pipeline stalls while the computation for A is completed. Yes it might be faster than a branch misprediction but it will never be as fast as a correctly predicted branch (which takes effectively 0 cycles).

Notice the one thing missing in this post? Benchmarks!


Among other things, there are branch prediction optimal algorithms that either a) only mispredict the final branch, or b) minimize amortized mispredictions.


I'd say probably too much effort?

First, it does an exhaustive search on all possible instruction combinations.

Second, it does not _prove_ that the resulting instructions do the same as the original code. It shows it only for some random numbers and border cases.


It's fairly easy to extend this to either an exhaustive search afterwards, or to run into a SMT solver to prove that it's correct / incorrect.

Though that's probably too much machinery to add on to a compiler.


There has been talk of using superoptimizers in larger compilers however it is usually not worth the effort unless you know a problem case in particular.

Especially since these kinds of transformations could slow down your code if the branch predictor is working well for your code set.


Both GCC and Clang can compile to branchless solutions for some conditional statements. The FizzBuzz problem for example both GCC and Clang can turn that into a small lookup table


Halting problem perhaps? Even if you put in safeguards, an exhaustive search could still be unacceptably slow. You can never tell when a pathological case might arise.


what kind of a bump does this produce


Goosebumps.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: