Philip Herron
On 22 October 2013 00:41, Steven D'Aprano wrote:
Are you suggesting that gcc is not a decent compiler?
No.
If "optimize awayto the null program" is such an obvious thing to do, why doesn't the mostpopular C compiler in the [FOSS] world do it?
It does if you pass the appropriate optimisation setting (as shown in
haypo's comment). I should have been clearer.
gcc compiles programs in two phases: compilation and linking.
Compilation creates the object files x.o and y.o from x.c and y.c.
Linking creates the output binary a.exe from x.o and y.o. The -O3
optimisation setting used in the blog post enables optimisation in the
compilation phase. However, each .c file is compiled independently, so
because the add() function is defined in x.c and called in y.c, the
compiler is unable to inline it. Nor can it remove the call as dead
code: although it knows that the return value isn't used, it doesn't
know whether the call has side effects.
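To make that concrete, here is a sketch (my commands, not what the blog
post ran) of the two phases done separately:

$ gcc -O3 -c x.c   # compile only: x.c -> x.o
$ gcc -O3 -c y.c   # compile only: y.c -> y.o
$ gcc x.o y.o      # link: x.o + y.o -> a.exe

All of the -O3 work happens in the first two steps, where each file is
seen in isolation; by the time the linker combines them the machine
code for add() and for the loop is already fixed.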
You might think it's silly that gcc can't optimise across source files,
and you'd be right: actually it can, if you enable link-time
optimisation with the -flto flag as described by haypo. So if I do
that with the code from the blog post I get (using mingw gcc 4.7.2 on
Windows):
$ cat x.c
double add(double a, double b)
{
    return a + b;
}
$ cat y.c
double add(double a, double b);
int main()
{
    int i = 0;
    double a = 0;
    while (i < 1000000000) {
        a += 1.0;
        add(a, a);
        i++;
    }
}
$ gcc -O3 -flto x.c y.c
$ time ./a.exe
real 0m0.063s
user 0m0.015s
sys 0m0.000s
$ time ./a.exe # warm cache
real 0m0.016s
user 0m0.015s
sys 0m0.015s
So gcc can optimise this all the way to the null program which takes
15ms to run (that's 600 times faster than pypy).
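If you want to check that the loop really has been eliminated rather
than just made fast, one way (my suggestion, not something from the
posts) is to disassemble the binary and look at main, which I'd expect
to be little more than an immediate return:

$ objdump -d a.exe | grep -A10 "main>:"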
Note that even if pypy could optimise it all the way to the null
program it would still be 10 times slower than C's null program:
$ touch null.py
$ time pypy null.py
real 0m0.188s
user 0m0.076s
sys 0m0.046s
$ time pypy null.py # warm cache
real 0m0.157s
user 0m0.060s
sys 0m0.030s
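For comparison, the C "null program" here is literally an empty main,
built and timed the same way (a sketch; I haven't reproduced the
timings, but on the machine above it should land around the 15ms
figure quoted earlier):

$ cat null.c
int main()
{
    return 0;
}
$ gcc -O3 null.c -o null.exe
$ time ./null.exe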
[...] So the pypy version takes twice as long to run this. That's impressive
but it's not "faster than C".
(Actually, if I enable -flto with that example the C version runs 6-7
times faster due to inlining.)
Nobody is saying that PyPy is *generally* capable of making any
arbitrary piece of code run as fast as hand-written C code. You'll
notice that the PyPy posts are described as *carefully crafted*
examples.
They are more than carefully crafted. They are useless and misleading.
It's reasonable to contrive a simple CPU-intensive programming
problem for benchmarking. But the program should do *something*, even
if it is contrived. Both programs here consist *entirely* of dead
code. Yes, it's reasonable for the pypy devs to test things like this
during development. No, it's not reasonable to showcase this as an
example of the potential for pypy to speed up any useful computation.
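For example, a minimal change (my own sketch, not taken from either
blog post) that stops the work being dead code is to accumulate the
results and print the total, so that neither gcc nor pypy is free to
discard the loop:

$ cat y2.c
#include <stdio.h>

double add(double a, double b);

int main()
{
    int i = 0;
    double a = 0;
    double total = 0;
    while (i < 1000000000) {
        a += 1.0;
        total += add(a, a);
        i++;
    }
    printf("total = %f\n", total);
    return 0;
}

A benchmark like that would at least be measuring the additions it
claims to measure.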
I believe that, realistically, PyPy has potential to bring Python into
Java and .Net territories, namely to run typical benchmarks within an
order of magnitude of C speeds on the same benchmarks. C is a very
hard target to beat, because vanilla C code does *so little* compared
to other languages: no garbage collection, no runtime dynamism, very
little polymorphism. So benchmarking simple algorithms plays to C's
strengths, while ignoring C's weaknesses.
As I said, I don't want to criticise PyPy. I've just started using it
and it is impressive. However, both of those blog posts are
misleading. Not only that, but the authors must know exactly why they
are misleading. Because of that I will take any other claims with a
big pinch of salt in future.
Oscar
You sir deserve a medal! I think a lot of people are taking these sorts of benchmarks completely out of context and it's great to see such a well-rounded statement.
I applaud you so much! I've been sort of banging my head against the wall trying to describe what you just did as succinctly as that, and couldn't.