Comments: "Will It Optimize?"
URL: http://ridiculousfish.com/blog/posts/will-it-optimize.html
July 23rd, 2010
See how well you know (or can anticipate) gcc's optimizer. For each question, the left box contains some code, while the right box contains code that purports to do the same thing, but that illustrates a particular optimization. Will gcc apply that optimization? Put another way, will the code on the left be as fast as the code on the right, when compiled with an optimizing gcc?
I used a pretty ancient gcc 4.2.1 for these tests. If newer versions have different behavior, please leave a comment.
Beware: not all proposed optimizations are actually valid!
1. Recursion elimination
Can GCC replace recursive functions with a loop?
int factorial(int x) {
    if (x > 1) return x * factorial(x-1);
    else return 1;
}

int factorial(int x) {
    int result = 1;
    while (x > 1) result *= x--;
    return result;
}
Will GCC hoist out strlen()?
unsigned sum(const unsigned char *s) {
    unsigned result = 0;
    for (size_t i=0; i < strlen(s); i++) {
        result += s[i];
    }
    return result;
}

unsigned sum(const unsigned char *s) {
    unsigned result = 0;
    size_t length = strlen(s);
    for (size_t i=0; i < length; i++) {
        result += s[i];
    }
    return result;
}
Will GCC transform an integer multiplication by 2 to addition?
int double_it(int x) {
    return x * 2;
}

int double_it(int x) {
    return x + x;
}
Will GCC transform a floating point multiplication by 2 to addition?
float double_it(float x) {
    return x * 2.0f;
}

float double_it(float x) {
    return x + x;
}
Will GCC transform an integer division by 2 to a right shift?
int halve_it(int x) {
    return x / 2;
}

int halve_it(int x) {
    return x >> 1;
}
Will GCC apply the same optimizations to if-else chains as it does to switch statements?
void function(int x) {
    if (x == 0) f0();
    else if (x == 1) f1();
    else if (x == 2) f2();
    else if (x == 3) f3();
    else if (x == 4) f4();
    else if (x == 5) f5();
}

void function(int x) {
    switch (x) {
        case 0: f0(); break;
        case 1: f1(); break;
        case 2: f2(); break;
        case 3: f3(); break;
        case 4: f4(); break;
        case 5: f5(); break;
    }
}
It is tempting to think of compiler optimizations as reducing the constant in your program's big-O complexity, and nothing else. They aren't supposed to be able to make your program asymptotically faster, or affect its output.
However, as we saw, they really can reduce the asymptotic complexity in space (question 1) and time (question 2). They can also affect calculated results (discussion of question 4) and maybe even whether your program goes into an infinite loop (see here).
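To make the time-complexity point from question 2 concrete, here is a rough sketch that counts how often the string length is computed. The names counted_strlen, length_calls, sum_unhoisted and sum_hoisted are made up for this illustration and are not from the original quiz. Because the counter is a visible side effect, the compiler cannot hoist the call here, so the first function behaves the way the unoptimized loop would.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical instrumentation: how many times the length was computed. */
size_t length_calls = 0;

/* strlen() plus a counter. The side effect keeps the compiler from hoisting it. */
size_t counted_strlen(const char *s) {
    length_calls++;
    return strlen(s);
}

/* Length recomputed in the loop condition: one full scan per condition check. */
unsigned sum_unhoisted(const char *s) {
    unsigned result = 0;
    for (size_t i = 0; i < counted_strlen(s); i++)
        result += (unsigned char)s[i];
    return result;
}

/* Length computed once, before the loop. */
unsigned sum_hoisted(const char *s) {
    unsigned result = 0;
    size_t length = counted_strlen(s);
    for (size_t i = 0; i < length; i++)
        result += (unsigned char)s[i];
    return result;
}
```

For a string of n characters, sum_unhoisted computes the length n + 1 times, and each computation scans all n characters, so the total work grows quadratically. sum_hoisted computes it once, so the work stays linear.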
On the flip side, several "obvious" optimizations are subtly incorrect and so will not be performed by the compiler, especially when they involve floating point. If your floating point code is demonstrably a bottleneck and you don't need exact precision or care about special FP values, you may be able to realize a speedup by doing some optimizations manually. However, untying the compiler's hands with options like -ffast-math is probably a better idea. Even then, enable them only for the affected files, since these flags change how all the floating point code they are applied to gets compiled.
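As a small illustration of why such floating point rewrites are not safe in general, the sketch below shows two of them changing a result under ordinary IEEE 754 arithmetic (that is, without -ffast-math). The function names are invented for this example.

```c
#include <math.h>

/* Reassociation: mathematically equal, but rounded differently. */
double sum_left(double a, double b, double c)  { return (a + b) + c; }
double sum_right(double a, double b, double c) { return a + (b + c); }

/* "x - x is always 0" does not hold for infinities or NaNs. */
double self_difference(double x) { return x - x; }
```

With a = 1e100, b = -1e100 and c = 1.0, sum_left returns 1.0, while sum_right returns 0.0 because the 1.0 is rounded away when it is added to -1e100. Likewise, self_difference(INFINITY) is NaN rather than 0, so the compiler cannot fold x - x to a constant unless it has been told it may ignore special values.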
And lastly, this isn't meant to be a prescriptive post, but we all know why micro-optimizing is usually a mistake: it wastes your time, it's easy to screw up (see question 5), and it typically produces no measurable speedup.
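For a sense of how easy the screw-up in question 5 is, here is a minimal sketch of the two halving functions side by side. They are renamed halve_divide and halve_shift here only so both can live in one file.

```c
/* Division truncates toward zero (guaranteed since C99). */
int halve_divide(int x) { return x / 2; }

/* Right-shifting a negative int is implementation-defined.
 * GCC documents sign extension, which rounds toward negative infinity. */
int halve_shift(int x) { return x >> 1; }
```

Both functions agree for non-negative inputs: halving 7 gives 3 either way. For -7, halve_divide returns -3, while halve_shift returns -4 on compilers that use an arithmetic shift, so hand-replacing the division with a shift changes the answer for every negative odd input.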
Code smart, and be safe out there!