which one is faster?

Stephen.Wu

str.find(targetStr)
str.index(targetStr) with exception
str.count(targetStr)
targetStr in str

which is the fastest way to check whether targetStr is in str?

thanks all
 
alex23

> str.find(targetStr)
> str.index(targetStr) with exception
> str.count(targetStr)
> targetStr in str
>
> which is the fastest way to check whether targetStr is in str?

It's generally a lot quicker to investigate this kind of question
yourself using the interpreter & the timeit module. You'll need skills
like these to be able to profile your code and find the actual
performance bottlenecks; generic advice on the fastest of a set of
functions will only get you so far.

IPython is pretty handy for simple timing tests as it provides
convenience wrappers around timeit:

In [1]: t = 'foo'
In [2]: s = 'djoemdmsllsodmedmsoskemozpleaoleodspsfooosoapxooeplaapakekoda'
In [3]: timeit s.find(t)
1000000 loops, best of 3: 374 ns per loop
In [4]: timeit s.index(t)
1000000 loops, best of 3: 381 ns per loop
In [7]: timeit s.count(t)
1000000 loops, best of 3: 397 ns per loop
In [8]: timeit t in s
1000000 loops, best of 3: 219 ns per loop

From the looks of those results, using 'in' seems to be the fastest.
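
For anyone without IPython, roughly the same comparison can be run with the standard library's timeit module alone. This is a minimal sketch, not a rigorous benchmark: the helper name best_time and the loop counts are assumptions made for illustration, and the absolute numbers will differ by machine and Python version.

```python
import timeit

t = 'foo'
s = 'djoemdmsllsodmedmsoskemozpleaoleodspsfooosoapxooeplaapakekoda'

# The four expressions timed in the IPython session above.
candidates = {
    'find': 's.find(t)',
    'index': 's.index(t)',
    'count': 's.count(t)',
    'in': 't in s',
}

def best_time(stmt, number=100000, repeat=3):
    """Hypothetical helper: best per-call time in seconds for stmt."""
    timer = timeit.Timer(stmt, globals={'s': s, 't': t})
    return min(timer.repeat(repeat=repeat, number=number)) / number

results = {name: best_time(stmt) for name, stmt in candidates.items()}

# Print the candidates from fastest to slowest.
for name, seconds in sorted(results.items(), key=lambda kv: kv[1]):
    print('%-5s %8.1f ns per call' % (name, seconds * 1e9))
```

Taking the minimum of several repeats mirrors the "best of 3" figure that IPython reports.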
 
Bruno Desthuilliers

alex23 wrote:
> It's generally a lot quicker to investigate this kind of question
> yourself using the interpreter & the timeit module. You'll need skills
> like these to be able to profile your code and find the actual
> performance bottlenecks; generic advice on the fastest of a set of
> functions will only get you so far.

Indeed. Another point to take into consideration is the _real_ dataset
on which the functions under test will have to work. And possibly the
available resources at runtime. As an example, an iterator-based
solution is likely to be slower than a more straightforward "load
everything into RAM" one, but will scale better for large datasets and/or
heavy concurrent access to the system.
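
To make that trade-off concrete, here is a rough sketch of the two styles applied to a substring search in a text file. The function names and the sample file are hypothetical, and the streaming version assumes the target never spans a line break.

```python
import os
import tempfile

def contains_load_all(path, target):
    """Read the whole file into memory, then search it once."""
    with open(path, encoding='utf-8') as f:
        return target in f.read()

def contains_streaming(path, target):
    """Search line by line; memory use is bounded by the longest line.

    Assumption: target never spans a line break.
    """
    with open(path, encoding='utf-8') as f:
        return any(target in line for line in f)

# Hypothetical sample data, written to a temporary file for the demo.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False,
                                 encoding='utf-8') as tmp:
    tmp.write('spam\n' * 1000 + 'needle foo here\n' + 'eggs\n' * 1000)
    path = tmp.name

found_all = contains_load_all(path, 'foo')
found_stream = contains_streaming(path, 'foo')
missing_stream = contains_streaming(path, 'bar')
os.remove(path)

print(found_all, found_stream, missing_stream)
```

Which of the two is faster depends on the file size and available memory, so both should be timed on data that resembles the real workload.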
 
Antoine Pitrou

On Thu, 28 Jan 2010 22:39:32 -0800, alex23 wrote:
>> str.find(targetStr)
>> str.index(targetStr) with exception
>> str.count(targetStr)
>> targetStr in str
>>
>> which is the fastest way to check whether targetStr is in str? [...]
>
> From the looks of those results, using 'in' seems to be the fastest.

To answer the question more precisely:

* "in" is the fastest because it doesn't have the method call overhead.
This is only a fixed cost, though, and doesn't depend on the inputs.
* all four alternatives use the same underlying algorithm, *but* count()
has to go to the end of the input string in order to count all
occurrences. The other expressions can stop as soon as the first
occurrence is found, which can of course be a big win if the occurrence is
near the start of the string and the string is very long.

So, to sum it up:
* "in" is faster by a small fixed cost advantage
* "find" and "index" are almost exactly equivalent
* "count" will often be slower because it can't early exit
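
The early-exit point in the summary above can be checked with a rough sketch like the following. The test string is hypothetical: a match at the very start of a long string, which is the case where count() is at the greatest disadvantage. The lambdas add a small, equal call overhead to each measurement.

```python
import timeit

target = 'foo'
# Hypothetical worst case for count(): match at the start of a long string.
haystack = target + 'x' * 10**6

number = 200
t_in = timeit.timeit(lambda: target in haystack, number=number)
t_find = timeit.timeit(lambda: haystack.find(target), number=number)
t_count = timeit.timeit(lambda: haystack.count(target), number=number)

# "in" and find() can stop at the first match; count() must scan to the end.
print('in:    %.6f s' % t_in)
print('find:  %.6f s' % t_find)
print('count: %.6f s' % t_count)
```

With the match moved to the end of the string, all three have to scan the whole input and the gap should mostly disappear.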

Regards

Antoine.
 
Jonathan Gardner

> str.find(targetStr)
> str.index(targetStr) with exception
> str.count(targetStr)
> targetStr in str
>
> which is the fastest way to check whether targetStr is in str?

The fastest way of all is to forget about this and finish the rest of
your program. Developer time is much, much more valuable than
processor time.

When you are all done, and have solved all the other problems in the
world, you can come back and pontificate on how many nanoseconds you
can save by using "find" or "in".
 
