timeit module for comparing the performance of two scripts

Phoe6 · Jul 11, 2006

Hi,
Following are my files.
In the format:

Filename
----
content
----

config1.txt
----
#DataBase Repository file
dbRepository = omsa_disney.xml
----

config2.txt
----
# Configfile for sendmail

[Sendmail]
userprefix = testuser
----

pyConfig.py
----
import re

def pyConfig():
    fhandle = open( 'config1.txt', 'r' )
    arr = fhandle.readlines()
    fhandle.close()

    hash = {}

    for item in arr:
        txt = item.strip()
        if re.search( '^\s*$', txt ):
            continue
        if re.search( '^#.*$', txt ):
            continue
        if not re.search( '^\s*(\w|\W)+\s*=\s*(\w|\W)+\s*$', txt ):
            continue
        hash[txt.split( '=' )[0].strip().lower()] = txt.split( '=' )[1].strip()

    print hash['dbrepository']
----

pyConparse.py
----
from ConfigParser import ConfigParser

def pyConParse():
    configparser = ConfigParser()
    configparser.read('config2.txt')
    print configparser.get('Sendmail','userprefix')

----

Question is:
How do I compare the performance of pyConfig.py vs pyConparse.py using
the timeit module?
I tried it the way the example shows, but when I do
Timer("pyConfig()","from __main__ import pyConfig"), it imports
pyConfig in an infinite loop. (How does the Python doc example on timeit
work? Has anyone tried it?)

I just need to compare pyConfig.py and pyConparse.py (and, as a general
question, x.py and y.py).
How can I do it? Please consider this a newbie question as well.

Thanks,
Senthil

Fredrik Lundh · Jul 11, 2006

Phoe6 said:
How do I compare the performance of pyConfig.py vs pyConparse.py using
timeit module?

$ python -m timeit -s "import pyConfig" "pyConfig.pyConfig()"
$ python -m timeit -s "import pyConparse" "pyConparse.pyConParse()"

note that timeit runs the benchmarked function multiple times, so you may want
to remove the print statements.
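
The same comparison can also be run from inside Python with timeit.Timer. What follows is only a minimal sketch in Python 3 syntax (the thread itself uses Python 2). value_with_regex, value_with_split and compare are hypothetical stand-ins, not the poster's functions; they work on one in-memory line, so no file is read and nothing is printed inside the timed code.

```python
import re
import timeit

# One line borrowed from config1.txt; kept in memory for the sketch.
LINE = "dbRepository = omsa_disney.xml"

def value_with_regex():
    """Hypothetical stand-in: extract the value with a regex."""
    return re.match(r"\w+\s*=\s*(\S.*)$", LINE).group(1)

def value_with_split():
    """Hypothetical stand-in: extract the value with string methods."""
    return LINE.split("=", 1)[1].strip()

def compare(number=10000):
    """Time each callable for `number` runs; return total seconds per name."""
    timings = {}
    for func in (value_with_regex, value_with_split):
        # Timer accepts a callable directly, so no setup string is needed.
        timings[func.__name__] = timeit.Timer(func).timeit(number=number)
    return timings

if __name__ == "__main__":
    for name, seconds in compare().items():
        print("%s: %.4f s" % (name, seconds))
```

Passing the callable itself avoids the "from __main__ import ..." setup string, which only works when the function really lives in the script being run.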

</F>

Phoe6 · Jul 11, 2006

Fredrik said:
$ python -m timeit -s "import pyConfig" "pyConfig.pyConfig()"
$ python -m timeit -s "import pyConparse" "pyConparse.pyConParse()"

note that timeit runs the benchmarked function multiple times, so you may want
to remove the print statements.

Thanks a lot, Fredrik! I did not know that timeit runs the benchmarked
function multiple times. I got scared by the repeated prints, thought
that import pyConfig had put it in an infinite loop, and killed the
program.

I could use the Timer class as well.

Thanks,
Senthil

Fredrik Lundh · Jul 11, 2006

Phoe6 said:
Thanks a lot, Fredrik! I did not know that timeit runs the benchmarked
function multiple times. I got scared by the repeated prints, thought
that import pyConfig had put it in an infinite loop, and killed the
program.

I could use the Timer class as well.

for cases like this, the command form gives a better result with less effort; it picks
a suitable number of iterations based on how fast the code actually runs, instead of
using a fixed number, and it also runs the test multiple times, and picks the smallest
observed time.
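
What the command form does can be approximated from inside Python. This is a rough sketch, assuming Python 3.6 or later for Timer.autorange(); best_time_per_call is a made-up helper name and is not part of the timeit module.

```python
import timeit

def best_time_per_call(func, repeat=5):
    """Return the smallest observed time per call, in seconds."""
    timer = timeit.Timer(func)
    # autorange() raises the loop count until a single run takes at least
    # 0.2 seconds, then returns (loop_count, time_taken).
    number, _ = timer.autorange()
    # Run the whole test several times; each entry is the total for `number` calls.
    totals = timer.repeat(repeat=repeat, number=number)
    # The smallest total is the run least disturbed by other activity.
    return min(totals) / number

if __name__ == "__main__":
    per_call = best_time_per_call(lambda: sorted(range(100)))
    print("best: %.3g s per call" % per_call)
```

Taking the minimum rather than the mean is the usual choice here, because other processes can only make a run slower, never faster.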

</F>

3c273 · Jul 11, 2006

Fredrik Lundh said:
$ python -m timeit -s "import pyConfig" "pyConfig.pyConfig()"
$ python -m timeit -s "import pyConparse" "pyConparse.pyConParse()"

note that timeit runs the benchmarked function multiple times, so you may want
to remove the print statements.

Hello,
Could you tell me what the "-m" switch does or even better, where to find
information on all switches in the documentation? Thanks.
Louis

John Machin · Jul 11, 2006

Hi,

I'm a little astonished that anyone would worry too much (if at all!)
about how long it took to read a config file. Generally, one would
concentrate on correctness, and legibility of source code. There's not
much point IMHO in timing your pyConfig.py in its current form. Please
consider the following interspersed comments.

Also, the functionality of the two modules that you are comparing is
somewhat different ;-)

Cheers,
John

Following are my files.
----

pyConfig.py
----
import re

def pyConfig():
    fhandle = open( 'config1.txt', 'r' )
    arr = fhandle.readlines()
    fhandle.close()

    hash = {}

    for item in arr:

There is no need for readlines(). Iterate over the file object directly
(and move fhandle.close() to after the loop):

for item in fhandle:

txt = item.strip()

str.strip() removes leading and trailing whitespace. Hence if the line
is visually empty, txt will be "" i.e. a zero-length string.

if re.search( '^\s*$', txt ):

For a start, your regex is constrained by the ^ and $ to matching a
whole string, so you should use re.match, not re.search. Read the
section on this topic in the re manual. Note that re.search is *not*
smart enough to give up if the test at the beginning fails. Second
problem: you don't need re at all! Your regex says "whole string is
whitespace", but the strip() will have reduced that to an empty string.
All you need do is:

if not txt: # empty line

continue
if re.search( '^#.*$', txt ):

Similar to the above. All you need is:

if txt.startswith('#'): # line is comment only
or (faster but less explicit):
if txt[0] == '#': # line is comment only
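
To check that the string-method tests really do accept the same lines as the two regexes, here is a small sketch in Python 3 syntax. classify and classify_with_regex are hypothetical helper names introduced only for this comparison.

```python
import re

def classify(raw):
    """Label a raw config line using only string methods."""
    txt = raw.strip()
    if not txt:                      # empty after stripping whitespace
        return "blank"
    if txt.startswith("#"):          # safe even if txt were empty
        return "comment"
    return "other"

def classify_with_regex(raw):
    """Same labels, using the two regexes from the original pyConfig.py."""
    txt = raw.strip()
    if re.search(r"^\s*$", txt):
        return "blank"
    if re.search(r"^#.*$", txt):
        return "comment"
    return "other"

samples = [
    "\n",
    "   \t\n",
    "#DataBase Repository file\n",
    "  # indented comment\n",
    "dbRepository = omsa_disney.xml\n",
]

for raw in samples:
    # Both approaches must agree on every sample line.
    assert classify(raw) == classify_with_regex(raw)
    print(repr(raw), "->", classify(raw))
```

Note that txt[0] == '#' raises IndexError on an empty string, so it is only safe because the empty-line test runs first; startswith() has no such requirement.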

continue
if not re.search( '^\s*(\w|\W)+\s*=\s*(\w|\W)+\s*$', txt ):

(1) search -> match, lose the ^
(2) lose the first and last \s* -- strip() means they are redundant.
(3) What are you trying to achieve with (\w|\W) ??? Where I come from,
"select things that are X or not X" means "select everything". So the
regex matches 'anything optional_whitespace = optional_whitespace
anything'. However 'anything' includes whitespace. You probably intended
something like 'word optional_whitespace = optional_whitespace
at_least_1_non-whitespace':

if not re.match('\w+\s*=\s*\S+.*$', txt):
but once you've found a non-whitespace after the =, it's pointless
continuing, so:
if not re.match('\w+\s*=\s*\S', txt):
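
The difference between the two patterns shows up on a few sample lines. A small sketch in Python 3 syntax follows; the names ORIGINAL, SUGGESTED and accepted, and the sample strings, are made up for illustration.

```python
import re

# The pattern from the original pyConfig.py.
ORIGINAL = re.compile(r"^\s*(\w|\W)+\s*=\s*(\w|\W)+\s*$")
# The shorter pattern suggested above.
SUGGESTED = re.compile(r"\w+\s*=\s*\S")

def accepted(pattern, line):
    """True if the stripped line matches the pattern from its start."""
    return pattern.match(line.strip()) is not None

samples = [
    "dbRepository = omsa_disney.xml",  # ordinary key = value
    "?? = !!",                         # punctuation on both sides
    "two words = value",               # key containing a space
    "key =",                           # nothing after the '='
]

for line in samples:
    print("%-32r original=%-5s suggested=%s" % (
        line, accepted(ORIGINAL, line), accepted(SUGGESTED, line)))
```

Because (\w|\W) matches any character, the original pattern accepts the punctuation-only and two-word keys; the shorter pattern rejects both, which may or may not be what the config format should allow.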

continue

Are you sure such lines can be silently ignored? Might they be errors?

hash[txt.split( '=' )[0].strip().lower()] = txt.split( '=' )[1].strip()

Best to avoid splitting twice. Also it's a little convoluted. Also
beware of multiple '=' in the line.

left, right = txt.split('=', 1)
key = left.strip().lower()
if key in hash:
    pass # duplicate key: does this matter?
value = right.strip()
hash[key] = value

print hash['dbrepository']
----

Oh, yeah, that hash thing, regexes everywhere ... it's left me wondering:
could this possibly be a translation of a script from another language?
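
Putting the comments above together, the function might end up looking roughly like this. It is a sketch in Python 3 syntax, not a drop-in replacement: parse_config, KEY_VALUE and the strict flag are assumptions added for illustration, and the function takes any iterable of lines so that it can be tried without a file.

```python
import io
import re

# word, optional whitespace, '=', optional whitespace, one non-whitespace
KEY_VALUE = re.compile(r"\w+\s*=\s*\S")

def parse_config(lines, strict=False):
    """Parse 'key = value' lines into a dict with lower-cased keys.

    With strict=True (a hypothetical option), malformed lines raise
    ValueError instead of being silently skipped.
    """
    settings = {}
    for item in lines:              # works on a file object too; no readlines()
        txt = item.strip()
        if not txt or txt.startswith("#"):
            continue                # empty line or comment-only line
        if not KEY_VALUE.match(txt):
            if strict:
                raise ValueError("malformed line: %r" % txt)
            continue
        left, right = txt.split("=", 1)   # split once; keeps any later '='
        settings[left.strip().lower()] = right.strip()
    return settings

sample = io.StringIO("#DataBase Repository file\ndbRepository = omsa_disney.xml\n")
print(parse_config(sample)["dbrepository"])   # prints omsa_disney.xml
```

Returning the dict instead of printing inside the function also keeps the timed code free of output when it is benchmarked.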

John Machin · Jul 11, 2006

3c273 said:
Hello,
Could you tell me what the "-m" switch does or even better, where to find
information on all switches in the documentation? Thanks.
Louis

You appear to know what a switch is. I'm therefore surprised that you
appear not to know that the convention is that any program that uses
command-line switches should do something informative when run with a -h
switch.

HTH [NPI],
John

3c273 · Jul 11, 2006

John Machin said:
You appear to know what a switch is. I'm therefore surprised that you
appear not to know that the convention is that any program that uses
command-line switches should do something informative when run with a -h
switch.

Doh! Me thinks Windows at work "python /?" (No good!), Linux at home
"python -h" (Ah ha!). I still think it should be in the docs somewhere.
Thanks.
Louis

Fredrik Lundh · Jul 12, 2006

3c273 said:
Doh! Me thinks Windows at work "python /?" (No good!)

that was supposed to be fixed in 2.5, but it doesn't seem to have made it into
beta 2. hmm.

</F>

Georg Brandl · Jul 12, 2006

3c273 said:
Doh! Me thinks Windows at work "python /?" (No good!), Linux at home
"python -h" (Ah ha!). I still think it should be in the docs somewhere.

python /? now works in 2.5 SVN.

Georg
