Python and STL efficiency

L

Licheng Fang

Hi, I'm learning STL and I wrote some simple code to compare the
efficiency of python and STL.

//C++
#include <iostream>
#include <string>
#include <vector>
#include <set>
#include <algorithm>
using namespace std;

int main(){
vector<string> a;
for (long int i=0; i<10000 ; ++i){
a.push_back("What do you know?");
a.push_back("so long...");
a.push_back("chicken crosses road");
a.push_back("fool");
}
set<string> b(a.begin(), a.end());
unique_copy(b.begin(), b.end(), ostream_iterator<string>(cout, "\n"));
}

#python
def f():
a = []
for i in range(10000):
a.append('What do you know')
a.append('so long...')
a.append('chicken crosses road')
a.append('fool')
b = set(a)
for s in b:
print s

I was using VC++.net and IDLE, respectively. I had expected C++ to be
way faster. However, while the python code gave the result almost
instantly, the C++ code took several seconds to run! Can somebody
explain this to me? Or is there something wrong with my code?
 
M

Marc 'BlackJack' Rintsch

Licheng Fang said:
Hi, I'm learning STL and I wrote some simple code to compare the
efficiency of python and STL.

//C++
#include <iostream>
#include <string>
#include <vector>
#include <set>
#include <algorithm>
using namespace std;

int main(){
vector<string> a;
for (long int i=0; i<10000 ; ++i){
a.push_back("What do you know?");
a.push_back("so long...");
a.push_back("chicken crosses road");
a.push_back("fool");
}
set<string> b(a.begin(), a.end());
unique_copy(b.begin(), b.end(), ostream_iterator<string>(cout, "\n"));
}

Why are you using `unique_copy` here?
#python
def f():
a = []
for i in range(10000):
a.append('What do you know')
a.append('so long...')
a.append('chicken crosses road')
a.append('fool')
b = set(a)
for s in b:
print s

I was using VC++.net and IDLE, respectively. I had expected C++ to be
way faster. However, while the python code gave the result almost
instantly, the C++ code took several seconds to run! Can somebody
explain this to me? Or is there something wrong with my code?

There's a difference in data structures at least. The Python `set` type
is implemented with a hash algorithm, so the equivalent STL type would be
`hash_set`. `set` in Python does not store its contents sorted.

Ciao,
Marc 'BlackJack' Rintsch
 
L

Licheng Fang

Marc said:
Why are you using `unique_copy` here?

Sorry, that's a typo. Actually I used 'copy'.
#python
def f():
a = []
for i in range(10000):
a.append('What do you know')
a.append('so long...')
a.append('chicken crosses road')
a.append('fool')
b = set(a)
for s in b:
print s

I was using VC++.net and IDLE, respectively. I had expected C++ to be
way faster. However, while the python code gave the result almost
instantly, the C++ code took several seconds to run! Can somebody
explain this to me? Or is there something wrong with my code?

There's a difference in data structures at least. The Python `set` type
is implemented with a hash algorithm, so the equivalent STL type would be
`hash_set`. `set` in Python does not store its contents sorted.

Ciao,
Marc 'BlackJack' Rintsch

Thank you for your comments. I tested with hash_set, but I didn't see
much performance improvement. When I increased the loop to 1 million
times, the python code still ran reasonably fast and the C++ code got
stuck there. This totally surprised me, because according this page
http://norvig.com/python-lisp.html, the speed of python is nowhere near
that of C++.
 
T

Tim N. van der Leeuw

Licheng said:
Hi, I'm learning STL and I wrote some simple code to compare the
efficiency of python and STL.


I was using VC++.net and IDLE, respectively. I had expected C++ to be
way faster. However, while the python code gave the result almost
instantly, the C++ code took several seconds to run! Can somebody
explain this to me? Or is there something wrong with my code?

Hi,

I'm no C++ guru so cannot comment on the C++ code itself, however I do
wonder if you tested your C++ code with other STL implementation such
as gcc (gcc is available on windows as well, in various versions).

What could be is that expanding the list in C++ is done in very small
increments, leading to many re-allocations. Is it possible to
pre-allocate the vector<> with sufficient entries?


Also, your Python code as quoted, doesn't actually call your function
f(). If you say that you get results instantly, I assume that you mean
all 4 strings are actually printed to console?

(I'm surprised that the console prints things that fast).

btw, using range() in Python isn't very efficient, I think... Better to
use xrange().


Asked a C++ collegue of mine to comment, and he strongly suspects that
you're actually running it in the .Net runtime (your C++ code contains
some C#-isms, such as omitting the '.h' in the include <> statements).

Luck,

--Tim
 
T

Tim N. van der Leeuw

Marc said:
Licheng Fang said:
Hi, I'm learning STL and I wrote some simple code to compare the
efficiency of python and STL.
[...]

There's a difference in data structures at least. The Python `set` type
is implemented with a hash algorithm, so the equivalent STL type would be
`hash_set`. `set` in Python does not store its contents sorted.

The set should be only 4 items in size, according to my reading of the
code, so set implementation differences shouldn't lead to drastic
performance differences.

Ciao,
Marc 'BlackJack' Rintsch

Cheers,

--Tim
 
M

Marc 'BlackJack' Rintsch

Tim N. van der said:
(your C++ code contains some C#-isms, such as omitting the '.h' in the
include <> statements).

That's no C#-ism, that's C++. The standard C++ header names don't have a
trailing '.h'. ``gcc`` prints deprecation warnings if you write the
names with '.h'.

Ciao,
Marc 'BlackJack' Rintsch
 
T

Tim N. van der Leeuw

Marc said:
That's no C#-ism, that's C++. The standard C++ header names don't have a
trailing '.h'. ``gcc`` prints deprecation warnings if you write the
names with '.h'.

Ciao,
Marc 'BlackJack' Rintsch

We stand corrected.

--Tim
 
F

Fredrik Lundh

Licheng said:
I was using VC++.net and IDLE, respectively. I had expected C++ to be
way faster. However, while the python code gave the result almost
instantly, the C++ code took several seconds to run! Can somebody
explain this to me? Or is there something wrong with my code?

in the Python example, the four strings in your example are shared, so
you're basically copying 40000 pointers to the list.

in the C++ example, you're creating 40000 string objects.

</F>
 
P

Peter Otten

Licheng said:
Hi, I'm learning STL and I wrote some simple code to compare the
efficiency of python and STL.
I was using VC++.net and IDLE, respectively. I had expected C++ to be
way faster. However, while the python code gave the result almost
instantly, the C++ code took several seconds to run! Can somebody
explain this to me? Or is there something wrong with my code?

Just a guess: immutable strings might be Python's advantage. Due to your
"benchmark"'s simplicity you end up with 10000 string instances in C++ and
just four str-s (and a lot of pointers) in Python.

What happens if you replace 'string' with 'const char *' in C++ ?
(Note that this modification is a bit unfair to Python as it would not
detect equal strings in different memory locations)

Peter
 
R

Ray

How did you compile the C++ executable? I assume that it is Release
mode? Then are the optimization switches enabled? Is it compiled as
Native Win32 or Managed application?

I suspect that other than what other posters have suggested about your
code, the difference in speed is due to the way you build your C++
executable...

HTH,
Ray
 
R

Ray

Fredrik said:
in the Python example, the four strings in your example are shared, so
you're basically copying 40000 pointers to the list.

in the C++ example, you're creating 40000 string objects.

</F>

In which case, Licheng, you should try using the /GF switch. This will
tell Microsoft C++ compiler to pool identical string literals together.


:)
 
F

Fredrik Lundh

Ray said:
In which case, Licheng, you should try using the /GF switch. This will
tell Microsoft C++ compiler to pool identical string literals together.

in what way does that change the implementation of C++'s string type ?

</F>
 
T

Tim N. van der Leeuw

Ray said:
In which case, Licheng, you should try using the /GF switch. This will
tell Microsoft C++ compiler to pool identical string literals together.


:)

The code still creates a new string - instance each time it tries to
append a const char* to the vector<string> ...

You should instead create the string-objects ahead of time, outside of
the loop.

Regards,

--Tim
 
J

Jeremy Sanders

Licheng said:
I was using VC++.net and IDLE, respectively. I had expected C++ to be
way faster. However, while the python code gave the result almost
instantly, the C++ code took several seconds to run! Can somebody
explain this to me? Or is there something wrong with my code?

It must be the debugging, the compiler or a poor STL implementation. With
gcc 4 it runs instantly on my computer (using -O2), even with 10x the
number of values.

If the problem is that C++ has to make lots of new strings, as other posters
have suggested, then you could do something like

const string foo = "What do you know?";

for (long int i=0; i<10000 ; ++i){
   a.push_back(foo);
...
}

as many C++ implementations use reference counting for identical strings.

Jeremy
 
C

Christophe

Jeremy Sanders a écrit :
It must be the debugging, the compiler or a poor STL implementation. With
gcc 4 it runs instantly on my computer (using -O2), even with 10x the
number of values.

If the problem is that C++ has to make lots of new strings, as other posters
have suggested, then you could do something like

const string foo = "What do you know?";

for (long int i=0; i<10000 ; ++i){
a.push_back(foo);
...
}

as many C++ implementations use reference counting for identical strings.

Jeremy

As a matter of fact, do not count on that. Use a vector<string*> just in
case.
 
T

Tim N. van der Leeuw

Tim said:
The code still creates a new string - instance each time it tries to
append a const char* to the vector<string> ...

You should instead create the string-objects ahead of time, outside of
the loop.

Regards,

--Tim

Alternatively, slow down the Python implementation by making Python
allocate new strings each time round:

a.append('%s' % 'What do you know')


.... for each of your string-appends. But even then, the python-code is
still near-instant.

Cheers,

--Tim
 
R

Ray

Fredrik said:
in what way does that change the implementation of C++'s string type ?

Ah, yes what was I thinking? The fact that it stores std::string
objects escaped my mind somehow. /GF just pools the string literals.
Thanks for the correction.
 
R

Ray

Tim said:
The code still creates a new string - instance each time it tries to
append a const char* to the vector<string> ...

Yeah, you're right... I've been programming Java too long :)
 
T

Tim N. van der Leeuw

Ray said:
Yeah, you're right... I've been programming Java too long :)

Took me a while to see that too! Have been programming too much Java /
Python as well. Anyways, when changing the Python version so that it
adds 40.000 unique strings to the list (and proving that there are
indeed 40.000 unique ids in the list, by making a set of all id()s in
the list and taking the len() of that set), it still takes at most a
second. I cannot test the speed of the c++ version on my computer, so
nothing scientific here.

I'm curious though, if on the OP's machine the slowed-down Python
version is still faster than the C++ version.


Cheers,

--Tim
 
M

Mc Osten

Jeremy Sanders said:
It must be the debugging, the compiler or a poor STL implementation. With
gcc 4 it runs instantly on my computer (using -O2), even with 10x the
number of values.

$ gcc --version
i686-apple-darwin8-gcc-4.0.1 (GCC) 4.0.1 (Apple Computer, Inc. build
5363)

I adapted original poster's code and made a function that did not create
strings each time. The NoisyString is a class we can use to actually
track copying.

In fact Python here is faster. Suppose it has a really optimized set
class...


Here some results (I know that the fpoint optimizations are useless...
it's is my "prebuilt" full optimization macro :) ):




$ g++ -O3 -pipe -O2 -march=pentium-m -msse3 -fomit-frame-pointer
-mfpmath=sse -o set_impl set_impl.cpp
$ ./set_impl
What do you know?
chicken crosses road
fool
so long...
What do you know?
chicken crosses road
fool
so long...
Elapsed 5.8
Elapsed 1.71

$ g++ -Os -pipe -O2 -march=pentium-m -msse3 -fomit-frame-pointer
-mfpmath=sse -o set_impl set_impl.cpp
$ ./set_impl

What do you know?
chicken crosses road
fool
so long...
What do you know?
chicken crosses road
fool
so long...
Elapsed 5.8
Elapsed 1.71

$ g++ -O3 -o set_impl set_impl.cpp
$ ./set_impl
What do you know?
chicken crosses road
fool
so long...
What do you know?
chicken crosses road
fool
so long...
Elapsed 0.47
Elapsed 0.18

$ g++ -o set_impl set_impl.cpp
$ ./set_impl
What do you know?
chicken crosses road
fool
so long...
What do you know?
chicken crosses road
fool
so long...
Elapsed 0.63
Elapsed 0.33

$ python -O set_impl.py
so long...
What do you know
fool
chicken crosses road
so long...
What do you know
fool
chicken crosses road
Elapsed: 1.370000 seconds
Elapsed: 3.810000 seconds



------------------- PYTHON CODE ---------------------------------
#python

global size
size = 1000000

def f():
a = []
for i in range(size):
a.append('What do you know')
a.append('so long...')
a.append('chicken crosses road')
a.append('fool')
b = set(a)
for s in b:
print s

def slow_f():
a = []
for i in range(size):
a.append('%s' % 'What do you know')
a.append('%s' % 'so long...')
a.append('%s' % 'chicken crosses road')
a.append('%s' % 'fool')
b = set(a)
for s in b:
print s

import time
from time import clock

f_start = clock()
f()
f_end = clock()

slow_f_start = clock()
slow_f()
slow_f_end = clock()

print "Elapsed: %f seconds" % (f_end - f_start)
print "Elapsed: %f seconds" % (slow_f_end - slow_f_start)

------------------------------------------------------------------


----------------- CPP CODE -------------------------------------
#include <iostream>
#include <ostream>
#include <iterator>
#include <string>
#include <vector>
#include <set>
#include <algorithm>
#include <ctime>
using namespace std;


#define SIZE 1000000

class NoisyString : public std::string {
public:
NoisyString(const string& cp)
: string(cp)
{
cout << "**** I got copied!" << endl;
}

NoisyString(const char* s ) : string(s) {

}



};


void f(){
vector<string> a;
for (long int i=0; i<SIZE ; ++i){
a.push_back("What do you know?");
a.push_back("so long...");
a.push_back("chicken crosses road");
a.push_back("fool");
}
set<string> b(a.begin(), a.end());
copy(b.begin(), b.end(), ostream_iterator<string>(cout, "\n"));
}

void fast_f(){
vector<string> a;
string s1 = "What do you know?" ;
string s2 = "so long..." ;
string s3 = "chicken crosses road";
string s4 = "fool" ;
for (long int i=0; i<SIZE ; ++i){
a.push_back(s1);
a.push_back(s2);
a.push_back(s3);
a.push_back(s4);
}
set<string> b(a.begin(), a.end());
copy(b.begin(), b.end(), ostream_iterator<string>(cout, "\n"));
}


int main(){
clock_t f_start,
f_end,
faster_f_start,
faster_f_end,
fast_f_start,
fast_f_end;

f_start = clock();
f();
f_end = clock();

fast_f_start = clock();
fast_f();
fast_f_end = clock();


cout << "Elapsed " << (f_end - f_start) / double(CLOCKS_PER_SEC) <<
endl;
cout << "Elapsed " << (fast_f_end - fast_f_start) /
double(CLOCKS_PER_SEC) << endl;

}

-----------------------------------------------------------------------
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,994
Messages
2,570,223
Members
46,812
Latest member
GracielaWa

Latest Threads

Top