Mc Osten
Tim N. van der Leeuw said: I'm curious though, if on the OP's machine the slowed-down Python version is still faster than the C++ version.
I tested both on my machine (see my other post in the thread).
Mc Osten said: In fact Python here is faster. Suppose it has a really optimized set class...
Fredrik Lundh said: Python's memory allocator is also quite fast, compared to most generic allocators...
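(For readers without the original SpeedTest.py in front of them, here is a minimal sketch of the pattern the thread is benchmarking. The loop count, phrase tuple, and helper names are illustrative, not the OP's exact code: the "fast" variant reuses the same four string literals, while the "slow" variant goes through a %-format on every append, which on the Python 2.4 of this thread allocated a fresh object each time.)

from time import time

ITERATIONS = 100000   # illustrative; the thread runs both 10,000 and 100,000
PHRASES = ("so long...", "What do you know",
           "fool", "chicken crosses road")

def run(make):
    # Append 4*ITERATIONS strings to a list and time it.
    strings = []
    start = time()
    for _ in range(ITERATIONS):
        for p in PHRASES:
            strings.append(make(p))
    elapsed = time() - start
    # Count distinct string *objects* (by identity, not value); this is
    # what the "Number of unique string objects" lines later in the
    # thread report: 4 for the fast variant, 400000 for the slow one.
    print("Number of unique string objects: %d"
          % len(set(id(s) for s in strings)))
    return elapsed

fast = run(lambda p: p)          # reuses the same four literal objects
slow = run(lambda p: '%s' % p)   # built a fresh object per call on 2.4;
                                 # newer CPythons may return p unchanged
print("Fast - Elapsed: %f seconds" % fast)
print("Slow - Elapsed: %f seconds" % slow)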
Mc said: In fact also in the two "slow" versions Python outperforms C++. I didn't notice it at first.
Mc said: Here are some results (I know that the fpoint optimizations are useless... it's my "prebuilt" full-optimization macro):
But your C++ program outputs times in seconds, right? So all compilations except the first two give results in less than a second, right? (Meaning the optimizations of your standard compilation give worse results than -O3?)
BTW, I don't quite understand your gcc optimizations for the first two compiles anyway: two -O options with different values. Doesn't that mean the second -O takes precedence, so the compilation is done at -O2 instead of -O3? Why both -O3 and -O2 on the command line?
Tim N. van der Leeuw said: Oh boy; yes indeed the slow Python is faster than the fast C++ version... Must be something really awful happening in the STL implementation that comes with GCC 3.4!
Mc said: And the Python version does the very same number of iterations as the C++ one? I suppose they are looping on arrays of different sizes, just like my "first version".

Tim said: Hmmm.. You're quite right. The C++ version had an array size of 100,000 (your version); the Python version still had an array size of 10,000 (as in my modified copy of the original version).
After fixing the Python version to use 100,000 items, like the C++ version, the Python timings are:
[...]
Fast - Elapsed: 0.512088 seconds
Slow - Elapsed: 1.139370 seconds
Still twice as fast as the fastest GCC 3.4.5 compiled version!
Incidentally, I also have a version compiled with VC++ 6 now (not yet w/VC++ 7)... Compiled with release flags and maximum optimization for speed, here's the result of VC++ 6:
LeeuwT@nlshl-leeuwt ~/My Documents/Python
$ ./SpeedTest_VC.exe [...]
Fast - Elapsed: 4.481 seconds
Slow - Elapsed: 4.842 seconds
Mc Osten said: In fact Python here is faster. Suppose it has a really optimized set class...

The point is that I was trying to create 400,000 string instances.

Tim> But beware! For Python2.5 I had to change the code slightly,
Tim> because it already realized that the expression
Tim> '%s' % 'something'
Tim> will be a constant expression, and evaluates it once only... so I
Tim> had to replace '%s' with a variable, and I got the timings above
Tim> which show Python2.5 to be slightly faster than Python2.4.
Shouldn't you then get rid of any compiler optimizations your C++ compiler
does? Why penalize 2.5 because it recognizes a useful optimization?
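As an aside, the constant folding Tim describes is easy to see with the dis module. A small sketch (the function names are made up for illustration, and whether a given CPython folds this varies by version; 2.5 did, per Tim's report):

import dis

def folded():
    # On interpreters that fold it, this %-expression on two literals
    # is evaluated once at compile time and stored as a constant.
    return '%s' % 'something'

def not_folded(fmt='%s'):
    # Routing the format string through a variable, as Tim describes,
    # defeats the folding and forces the % to run on every call.
    return fmt % 'something'

dis.dis(folded)       # a bare LOAD_CONST where the folding applies
dis.dis(not_folded)   # still performs the % operation at runtime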
Maric said: The problem here is that the strings in the set are compared by value, which is not optimal, and I guess Python compares them by address ("s*n is s*n" has the same complexity as "s*n == s*n" in CPython, right?).

Wrong.

> timeit -s"s='x'; n=1000" "s*n is n*s"
100000 loops, best of 3: 4.5 usec per loop
> timeit -s"s='x'; n=1000" "s*n == n*s"
1000000 loops, best of 3: 1.9 usec per loop
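To make explicit what those two timeit runs compare (a small sketch mirroring the quoted expressions): both pay to build two 1000-character strings; "is" then compares object identity, "==" compares values, and set membership uses hashing plus "==", never bare "is".

s, n = 'x', 1000
x, y = s * n, n * s   # two equal, but distinct, 1000-character strings

print(x is y)         # False: different objects...
print(x == y)         # True: ...holding equal values

# A set hashes first and falls back to == on collision, so value-equal
# strings collapse to a single entry even as distinct objects:
print(len({x, y}))    # 1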
Maric said: On Tuesday, 22 August 2006, at 12:55, Mc Osten wrote:
Maybe I'm missing something, but the posted C++ code is not equivalent IMO to what Python is doing. I discarded the "slow" version and tried to get the equivalent in C++ of: [...]
Tim N. van der Leeuw said: NB: Your code now tests for address-equality. Does it also still test for string-equality? It looks to me that it does, but it's not quite clear to me.
Tim N. van der Leeuw said: My conclusion from that is that the vector<> or set<> implementations of GCC are far superior to those of VC++ 6, but that memory allocation for GCC 3.4.5 (MinGW version) is far worse than that of MSCRT / VC++ 6. (And Python still smokes them both.)
Tim N. van der Leeuw said: And the results of IronPython (1.0rc2) are just in as well:
[...]
And for Python 2.5:
LeeuwT@nlshl-leeuwt ~/My Documents/Python
$ /cygdrive/c/Python25/python.exe SpeedTest.py
Begin Test
Number of unique string objects: 4
so long...
What do you know
fool
chicken crosses road
Number of unique string objects: 400000
so long...
What do you know
fool
chicken crosses road
Fast - Elapsed: 0.440619 seconds
Slow - Elapsed: 1.095341 seconds
(Next step would be to create a VB version and a Java version of the
same program, oh and perhaps to try a version that would work with
Jython... perhaps somehow w/o the 'set')