How can I speed this function up?


Chris

This is just some dummy code to mimic what's being done in the real
code. The actual code is python which is used as a scripting language in
a third party app. The data structure returned by the app is more or
less like the "data" list in the code below. The test for "ELEMENT" is
necessary ... it just evaluates to true every time in this test code. In
the real app perhaps 90% of tests will also be true.

So my question is how can I speed up what's happening inside the
function write_data()? Only allowed to use vanilla Python (no Psyco or
other libraries outside of a vanilla Python install).

I have a vested interest in showing a colleague that a python app can
yield results in a time comparable to his C-app, which he feels is much
faster. I'd like to know what I can do within the constraints of the
python language to get the best speed possible. Hope someone can help.

def write_data1(out, data):
    for i in data:
        if i[0] is 'ELEMENT':
            out.write("%s %06d " % (i[0], i[1]))
            for j in i[2]:
                out.write("%d " % (j))
            out.write("\n")

import timeit

# basic data mimicking data returned from 3rd party app
data = []
for i in range(500000):
    data.append(("ELEMENT", i, (1,2,3,4,5,6)))

# write data out to file
fname = "test2.txt"
out = open(fname,'w')
start = timeit.time.clock()
write_data1(out, data)
out.close()
print timeit.time.clock()-start
 

Terry Reedy

Chris said:
def write_data1(out, data):
    for i in data:
        if i[0] is 'ELEMENT':

Testing for equality with 'is' is a bit of a cheat since it is
implementation dependent,
but since you have a somewhat unfair constraint ....
out.write("%s %06d " % (i[0], i[1]))

Since i[0] is tested to be 'ELEMENT', this should be the same as
out.write("ELEMENT %06d " % i[1])
which saves constructing a tuple as well as an interpolation.
            for j in i[2]:
                out.write("%d " % (j))
            out.write("\n")

tjr
 

Chris

Chris said:
This is just some dummy code to mimic what's being done in the real
code. The actual code is python which is used as a scripting language in
a third party app. The data structure returned by the app is more or
less like the "data" list in the code below. The test for "ELEMENT" is
necessary ... it just evaluates to true every time in this test code. In
the real app perhaps 90% of tests will also be true.

So my question is how can I speed up what's happening inside the
function write_data()? Only allowed to use vanilla Python (no Psyco or
other libraries outside of a vanilla Python install).

I have a vested interest in showing a colleague that a python app can
yield results in a time comparable to his C-app, which he feels is much
faster. I'd like to know what I can do within the constraints of the
python language to get the best speed possible. Hope someone can help.

def write_data1(out, data):
    for i in data:
        if i[0] is 'ELEMENT':
            out.write("%s %06d " % (i[0], i[1]))
            for j in i[2]:
                out.write("%d " % (j))
            out.write("\n")

import timeit

# basic data mimicking data returned from 3rd party app
data = []
for i in range(500000):
    data.append(("ELEMENT", i, (1,2,3,4,5,6)))

# write data out to file
fname = "test2.txt"
out = open(fname,'w')
start = timeit.time.clock()
write_data1(out, data)
out.close()
print timeit.time.clock()-start

with this function I went from 8.04 s to 6.61 s. Now running up against
my limited knowledge of python. Any chance of getting faster?

def write_data4(out, data):
    for i in data:
        if i[0] is 'ELEMENT':
            strx = "%s %06d " % (i[0], i[1])
            strx = "".join([strx] + ["%d " % j for j in i[2]] + ["\n"])
            out.write(strx)
 

Guest

Hi, Chris.
I made a trivial testing framework for this cute problem and tried a
couple of modifications. I also added the 10% of non-ELEMENT lines you
mentioned. First thing, your updated algorithm didn't really get me much
faster results than the original. I guess that my disk array sort of
hides the multiple write penalty. But I experimented with various
algorithms. Here's the code in its entirety:
http://www.rafb.net/paste/results/ZuW4fK85.html My results (Python 2.4,
32bit Fedora Core) were:

[ksh@lapoire tmp]# python test.py
Preparing data...
[write_data1] Preparing output file...
[write_data1] Writing...
[write_data1] Done in 10.73 seconds.
[write_data4] Preparing output file...
[write_data4] Writing...
[write_data4] Done in 10.46 seconds.
[write_data_flush] Preparing output file...
[write_data_flush] Writing...
[write_data_flush] Done in 9.09 seconds.
[write_data_per_line] Preparing output file...
[write_data_per_line] Writing...
[write_data_per_line] Done in 9.71 seconds.
[write_data_once] Preparing output file...
[write_data_once] Writing...
[write_data_once] Done in 7.82 seconds.

I'm pretty sure that your measurements will vary (from your results
you seem to have a faster CPU but slower disk(s)). But you can just take
what works best for you. I'm also quite confident that you won't be able
to catch up with C, since as you can see Python's data structures are far more
flexible and thus require more processing overhead.
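
In case the paste link goes stale: write_data_once is just the "build one
big string, write it once" idea. A rough sketch of that approach (not
necessarily the exact code behind the link):

def write_data_once(out, data):
    parts = []
    append = parts.append
    for name, index, items in data:
        if name == 'ELEMENT':
            append("ELEMENT %06d " % index)
            append(' '.join(map(str, items)))
            append('\n')
    out.write(''.join(parts))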

Regards,
Łukasz
 

Gabriel Genellina

This is just some dummy code to mimic what's being done in the real
code. The actual code is python which is used as a scripting language in
a third party app. The data structure returned by the app is more or
less like the "data" list in the code below. The test for "ELEMENT" is
necessary ... it just evaluates to true every time in this test code. In
the real app perhaps 90% of tests will also be true.

So my question is how can I speed up what's happening inside the
function write_data()? Only allowed to use vanilla Python (no Psyco or
other libraries outside of a vanilla Python install).

I have a vested interest in showing a colleague that a python app can
yield results in a time comparable to his C-app, which he feels is much
faster. I'd like to know what I can do within the constraints of the
python language to get the best speed possible. Hope someone can help.

If you can assume that all items have 6 numbers, it appears best to
unroll the inner iteration. Below is my best attempt with ideas from
other replies too, including some alternatives. The timing is only
approximate and had a wide dispersion; median of three. But it's
clear that the main gain comes from calling out.write only once:

Notice that you can't, in general, use i[0] is 'ELEMENT' unless you
can guarantee that i[0] is an interned string (and if it comes from
another process, chances are it isn't). Using intern(i[0]) is
'ELEMENT' would work, but slows down your program.
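
A quick illustration of why the identity test is fragile (the join below is
just one way to build an equal-but-distinct string at runtime):

a = "ELEMENT"                   # literal, normally interned by CPython
b = "".join(["ELE", "MENT"])    # built at runtime, normally not interned
print a == b                    # True
print a is b                    # typically False -- different objects
print a is intern(b)            # True, but intern() adds cost on every call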

# initial: 11.66s
def write_data1(out, data):
    write = out.write
    for i in data:
        if i[0] == 'ELEMENT':  # sorry but can't guarantee identity

            # 6.21s
            write("ELEMENT %06d %s\n" % (i[1], "%d %d %d %d %d %d " % i[2]))

            # 6.92s
            # write("ELEMENT %06d %s \n" % (i[1], " ".join(map(str,i[2]))))

            # 8.30s
            # i2 = i[2]
            # write("ELEMENT %06d %d %d %d %d %d %d \n" % (i[1],
            #     i2[0], i2[1], i2[2], i2[3], i2[4], i2[5]))

            # 7.04s __getitem__
            # i2 = i[2].__getitem__
            # write("ELEMENT %06d %d %d %d %d %d %d \n" % (i[1],
            #     i2(0), i2(1), i2(2), i2(3), i2(4), i2(5)))



--
Gabriel Genellina
Softlab SRL

 

Tim Hochberg

Chris said:
This is just some dummy code to mimic what's being done in the real
code. The actual code is python which is used as a scripting language in
a third party app. The data structure returned by the app is more or
less like the "data" list in the code below. The test for "ELEMENT" is
necessary ... it just evaluates to true every time in this test code. In
the real app perhaps 90% of tests will also be true.

So my question is how can I speed up what's happening inside the
function write_data()? Only allowed to use vanilla Python (no Psyco or
other libraries outside of a vanilla Python install).

Try collecting your output into bigger chunks before writing it out. For
example, take a look at:

def write_data2(out, data):
    buffer = []
    append = buffer.append
    extend = buffer.extend
    for i in data:
        if i[0] == 'ELEMENT':
            append("ELEMENT %06d " % i[1])
            extend(map(str, i[2]))
            append('\n')
    out.write(''.join(buffer))


def write_data3(out, data):
    buffer = []
    append = buffer.append
    for i in data:
        if i[0] == 'ELEMENT':
            append(("ELEMENT %06d %s" % (i[1],' '.join(map(str,i[2])))))
    out.write('\n'.join(buffer))


Both of these run almost twice as fast as the original below (although
admittedly I didn't check that they were actually right). Using some of
the other suggestions mentioned in this thread may make things better
still. It's possible that some intermediate chunk size might be better
than collecting everything into one string, I dunno.

cStringIO might be helpful here as a buffer instead of using lists, but
I don't have time to try it right now.
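
For what it's worth, an untested sketch of that idea (just a guess at the
shape of it, not benchmarked):

from cStringIO import StringIO

def write_data_cstringio(out, data):
    buf = StringIO()
    write = buf.write
    for i in data:
        if i[0] == 'ELEMENT':
            write("ELEMENT %06d " % i[1])
            write(' '.join(map(str, i[2])))
            write('\n')
    out.write(buf.getvalue())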

-tim

I have a vested interest in showing a colleague that a python app can
yield results in a time comparable to his C-app, which he feels is much
faster. I'd like to know what I can do within the constraints of the
python language to get the best speed possible. Hope someone can help.

def write_data1(out, data):
    for i in data:
        if i[0] is 'ELEMENT':
            out.write("%s %06d " % (i[0], i[1]))
            for j in i[2]:
                out.write("%d " % (j))
            out.write("\n")

import timeit

# basic data mimicking data returned from 3rd party app
data = []
for i in range(500000):
    data.append(("ELEMENT", i, (1,2,3,4,5,6)))

# write data out to file
fname = "test2.txt"
out = open(fname,'w')
start = timeit.time.clock()
write_data1(out, data)
out.close()
print timeit.time.clock()-start
 

Paddy

Chris said:
I have a vested interest in showing a colleague that a python app can
yield results in a time comparable to his C-app, which he feels is much
faster. I'd like to know what I can do within the constraints of the
python language to get the best speed possible. Hope someone can help.

Fight smart!
How long did the C-app take to write?
How robust are the C and the Python versions w.r.t. unforeseen inputs?
Mimic the software life-cycle:
* How long would it take to make each program work on Windows?, Mac?
* How long would it take to 'fully' test each program?
How easy is it to explain each prog. to an audience that have
programmed, but never in C or Python?
How long would it take to add another feature?

"Best" and "best speed" can have many meanings. Good luck.

- Paddy.
 

John Machin

with this function I went from 8.04 s to 6.61 s.

And your code became less understandable.
Now running up against
my limited knowledge of python. Any chance of getting faster?

You have saved 1.4 *seconds*. What is the normal running time for this
app with 0.5M records? What is 1.4 seconds as a percentage of that?

Please consider that you are barking up the wrong gum tree. Competing
with a C app on speed is not something that experienced Python
programmers would take on lightly.

Talk to your colleague about some of these factors: time to write code,
robustness, clarity, ease of maintenance.

Cheers,
John
 

DarkBlue

Just to show how much a system setup
impacts these results:
Result from SUSE 10.1 64-bit, Python 2.4
with an AMD FX-55 CPU and about 12 active apps
running in the background. 7200rpm SATA drives.

Preparing data...
[write_data1] Preparing output file...
[write_data1] Writing...
[write_data1] Done in 5.43 seconds.
[write_data4] Preparing output file...
[write_data4] Writing...
[write_data4] Done in 4.41 seconds.
[write_data_flush] Preparing output file...
[write_data_flush] Writing...
[write_data_flush] Done in 5.41 seconds.
[write_data_per_line] Preparing output file...
[write_data_per_line] Writing...
[write_data_per_line] Done in 4.4 seconds.
[write_data_once] Preparing output file...
[write_data_once] Writing...
[write_data_once] Done in 4.28 seconds.
 

John Machin

Gabriel said:

We already have a case where the best response to the OP was like
Paddy's response, *not* to answer the question literally.

Then: "loop unrolling"? "assume" with no comments and no assertions?
 

nnorwitz

Chris said:
This is just some dummy code to mimic what's being done in the real
code. The actual code is python which is used as a scripting language in
a third party app. The data structure returned by the app is more or
less like the "data" list in the code below. The test for "ELEMENT" is
necessary ... it just evaluates to true every time in this test code. In
the real app perhaps 90% of tests will also be true.

As others have said, without info about what's happening in C, there's
no way to know what's equivalent or fast enough.
So my question is how can I speed up what's happening inside the
function write_data()? Only allowed to use vanilla Python (no Psyco or
other libraries outside of a vanilla Python install).

Generally, don't create objects, don't perform repeated operations. In
this case, batch up I/O.
def write_data1(out, data):
    for i in data:
        if i[0] is 'ELEMENT':
            out.write("%s %06d " % (i[0], i[1]))
            for j in i[2]:
                out.write("%d " % (j))
            out.write("\n")

def write_data1(out, data, map=map, str=str):
    SPACE_JOIN = ' '.join
    lines = [("ELEMENT %06d " % i1) + SPACE_JOIN(map(str, i2))
             for i0, i1, i2 in data if i0 == 'ELEMENT']
    out.write('\n'.join(lines))

While perhaps a bit obfuscated, it's a bit faster than the original.
Part of what makes this hard to read is the crappy variable names. I
didn't know what to call them. This version assumes that data will
always be a sequence of 3-element items.

The original version took about 11.5 seconds, the version above takes
just over 5 seconds.

YMMV,
n
 

Fredrik Lundh

Generally, don't create objects, don't perform repeated operations. In
this case, batch up I/O.
def write_data1(out, data):
    for i in data:
        if i[0] is 'ELEMENT':
            out.write("%s %06d " % (i[0], i[1]))
            for j in i[2]:
                out.write("%d " % (j))
            out.write("\n")

def write_data1(out, data, map=map, str=str):
    SPACE_JOIN = ' '.join
    lines = [("ELEMENT %06d " % i1) + SPACE_JOIN(map(str, i2))
             for i0, i1, i2 in data if i0 == 'ELEMENT']
    out.write('\n'.join(lines))

While perhaps a bit obfuscated, it's a bit faster than the original.
Part of what makes this hard to read is the crappy variable names. I
didn't know what to call them. This version assumes that data will
always be a sequence of 3-element items.

The original version took about 11.5 seconds, the version above takes
just over 5 seconds.

footnote: your version doesn't print the final "\n". here's a variant
that does, and leaves the batching to the I/O subsystem:

def write_data3(out, data, map=map, str=str):
    SPACE_JOIN = ' '.join
    out.writelines(
        "ELEMENT %06d %s\n" % (i1, SPACE_JOIN(map(str, i2)))
        for i0, i1, i2 in data if i0 == 'ELEMENT'
    )

this runs exactly as fast as your example on my machine, but uses less
memory. and if you, for benchmarking purposes, pass in a "sink" file
object that ignores the data you pass it, it runs in no time at all ;-)
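
such a sink doesn't have to be anything fancy; something along these lines
would do (my own sketch, not from the original post):

class Sink:
    # minimal file-like object that throws everything away
    def write(self, s):
        pass
    def writelines(self, lines):
        for line in lines:   # consume the generator, keep nothing
            pass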

</F>
 

Peter Otten

Chris said:
So my question is how can I speed up what's happening inside the
function write_data()? Only allowed to use vanilla Python (no Psyco or
other libraries outside of a vanilla Python install).
def write_data1(out, data):
    for i in data:
        if i[0] is 'ELEMENT':
            out.write("%s %06d " % (i[0], i[1]))
            for j in i[2]:
                out.write("%d " % (j))
            out.write("\n")

# reference, modified to avoid trailing ' '
def write_data(out, data):
    for i in data:
        if i[0] == 'ELEMENT':
            out.write("%s %06d" % (i[0], i[1]))
            for j in i[2]:
                out.write(" %d" % j)
            out.write("\n")

# Norwitz/Lundh
def writelines_data(out, data, map=map, str=str):
    SPACE_JOIN = ' '.join
    out.writelines(
        "ELEMENT %06d %s\n" % (i1, SPACE_JOIN(map(str, i2)))
        for i0, i1, i2 in data if i0 == 'ELEMENT'
    )

def print_data(out, data):
    for name, index, items in data:
        if name == "ELEMENT":
            print >> out, "ELEMENT %06d" % index,
            for item in items:
                print >> out, item,
            print >> out


import time

data = []
for i in range(500000):
    data.append(("ELEMENT", i, (1,2,3,4,5,6)))

for index, write in enumerate([write_data, writelines_data, print_data]):
    fname = "test%s.txt" % index
    out = open(fname,'w')
    start = time.time()
    write(out, data)
    out.close()
    print write.__name__, time.time()-start

for fname in "test1.txt", "test2.txt":
    assert open(fname).read() == open("test0.txt").read(), fname

Output on my machine:

$ python2.5 writedata.py
write_data 10.3382301331
writelines_data 5.4960360527
print_data 3.50765490532

Moral: don't forget about good old print. It does have an opcode(*) of its
own, after all.

Peter

(*) or two
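
If you want to see those opcodes for yourself, dis makes it a quick check
(Python 2.x; output trimmed):

import dis

def demo(out, x):
    print >> out, x

dis.dis(demo)
# the listing includes PRINT_ITEM_TO and PRINT_NEWLINE_TO,
# the print-specific opcodes the footnote refers to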
 

nnorwitz

Peter said:
# Norwitz/Lundh
def writelines_data(out, data, map=map, str=str):
    SPACE_JOIN = ' '.join
    out.writelines(
        "ELEMENT %06d %s\n" % (i1, SPACE_JOIN(map(str, i2)))
        for i0, i1, i2 in data if i0 == 'ELEMENT'
    )

def print_data(out, data):
    for name, index, items in data:
        if name == "ELEMENT":
            print >> out, "ELEMENT %06d" % index,
            for item in items:
                print >> out, item,
            print >> out

Output on my machine:

$ python2.5 writedata.py
write_data 10.3382301331
writelines_data 5.4960360527
print_data 3.50765490532

Interesting. I timed with python2.4 and get this:

write_data 12.3158090115
writelines_data 5.02135300636
print_data 5.01881980896

A second run yielded:

write_data 11.5980260372
writelines_data 4.8575668335
print_data 4.84622001648

I'm surprised by your numbers a bit because I would expect string ops
to be faster in 2.5 than in 2.4 thanks to /F. I don't remember other
changes that would cause such an improvement for print between 2.4 and
2.5. (2.3 shows print doing a bit better than the times above.)

It could be that the variability is high due to lots of I/O or even
different builds. I'm on Linux.
Moral: don't forget about good old print. It does have an opcode(*) of its
own, after all.

Using print really should be faster, as fewer objects are created.
(*) or two

or 5 :)

$ grep 'case PRINT_' Python/ceval.c
case PRINT_EXPR:
case PRINT_ITEM_TO:
case PRINT_ITEM:
case PRINT_NEWLINE_TO:
case PRINT_NEWLINE:

n
 
