Do you seriously think multiprocessing can help you?
Can you benefit from multiple CPUs?
Did you try to enhance your code with numpy?
Olivier
(installed a backported multiprocessing on his Python 2.5.1, but needed
to install Xcode first)
Multithreading / multiprocessing can help me with my problem. As you
know, database reading is typically I/O bound so it helps to put it in
a separate thread. I might not even notice the GIL if I used SQL
access in the first place. As it is, DBFPY is pretty CPU intensive
since it's a pure Python DBF implementation.
To continue: the second major stage (summary calculations) is
completely CPU bound. Using numpy might or might not help with it.
Those are simple calculations, mostly additions. I try not to put the
entire database in arrays to save memory and so I mostly just add
counters where I can. Some functions simply require arrays, but they
are rarer, so I guess I'm safe with that. You wouldn't believe how
complex some reports can be. Threading + memory saving is a must and
even so, I'll probably have to implement some sort of serialization
later on, so that the stuff can run on more memory constrained
devices.
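To illustrate the "add counters instead of building arrays" approach
described above, here is a minimal sketch. The (group, amount) record
layout and the names summarize and fake_records are assumptions made up
for the example; they are not taken from DBFPY or from the real report
code. The point is only that a single pass keeps nothing in memory
except the counters themselves.

```python
from collections import defaultdict


def summarize(records):
    """Accumulate per-group totals in one pass, keeping only the counters."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    # Hypothetical record layout: (group, amount) tuples.
    for group, amount in records:
        totals[group] += amount
        counts[group] += 1
    return dict(totals), dict(counts)


def fake_records():
    # Stand-in for a database reader: yields one record at a time,
    # so the full table never has to sit in memory.
    yield ("north", 10.0)
    yield ("south", 2.5)
    yield ("north", 5.0)


totals, counts = summarize(fake_records())
```

Because summarize only iterates, any generator that yields records one
at a time could be plugged in without changing the calculation code.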
The third major stage, the rendering engine, is again mostly CPU bound,
but at the same time it's I/O bound as well when outputting the
result.
All three major parts are more or less independent from each other and
can run simultaneously, just with a bit of a delay. I can perform
calculations while waiting for the next record and I can also start
rendering immediately after I have all the data for the first group
available.
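The three overlapping stages described above could be wired together
roughly like this. This is only a sketch under assumptions: it is
written for a current Python 3 (the queue module), the (group, amount)
records are made up, the input is assumed to arrive sorted by group,
and the names reader, summarizer, renderer and run_pipeline are
hypothetical. It shows the shape of the pipeline, not the real code.

```python
import queue
import threading

DONE = object()  # sentinel marking the end of a stream


def reader(records, out_q):
    # Stage 1: hand records on one at a time (stands in for database reads).
    for rec in records:
        out_q.put(rec)
    out_q.put(DONE)


def summarizer(in_q, out_q):
    # Stage 2: sum amounts per group. Assumes records are sorted by group,
    # so a group is complete as soon as the group key changes.
    current, total = None, 0.0
    while True:
        item = in_q.get()
        if item is DONE:
            break
        group, amount = item
        if current is not None and group != current:
            out_q.put((current, total))
            total = 0.0
        current = group
        total += amount
    if current is not None:
        out_q.put((current, total))
    out_q.put(DONE)


def renderer(in_q, lines):
    # Stage 3: format each finished group as soon as it arrives.
    while True:
        item = in_q.get()
        if item is DONE:
            break
        group, total = item
        lines.append("%s: %.2f" % (group, total))


def run_pipeline(records):
    # Bounded queues keep a fast stage from running far ahead of a slow one.
    q1, q2 = queue.Queue(maxsize=100), queue.Queue(maxsize=100)
    lines = []
    threads = [
        threading.Thread(target=reader, args=(records, q1)),
        threading.Thread(target=summarizer, args=(q1, q2)),
        threading.Thread(target=renderer, args=(q2, lines)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return lines


report = run_pipeline([("a", 1.0), ("a", 2.0), ("b", 4.5)])
```

The bounded queues also cap memory use, which fits the memory-saving
goal. On CPython the two CPU-bound stages still take turns holding the
GIL, so this layout overlaps I/O waits but does not by itself spread
the calculations across cores.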
I may use multiprocessing, but I believe it introduces more
communication overhead than threads, so I am reluctant to go there.
Threads were perfect, other stuff wasn't. To make things worse, no
particular extension / fork / branch helps me here. So if I wanted to
just do the stuff in Python, I'd have to move to Jython or IronPython
and hope CPython eventually improves in this area. I do actually need
CPython since the other two aren't supported on all platforms my
company intends to support.
The main issue I currently have with GIL is that execution time is
worse when I use threading. Had it been the same, I wouldn't worry too
much about it. Waiting for a permanent solution would be much easier
then...
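
One way to check the threaded-versus-sequential claim on a given
machine is a small timing harness like the one below. It is a sketch,
not a benchmark of the real application: the pure-Python loop in
count_up is a made-up stand-in for CPU-bound work, the function names
are hypothetical, and it assumes a current Python 3. The timings it
prints depend on the interpreter version and the hardware, so no
particular numbers are implied.

```python
import threading
import time


def count_up(n, results, idx):
    # Pure-Python CPU-bound loop; the thread holds the GIL while it runs.
    total = 0
    for i in range(n):
        total += i
    results[idx] = total


def run_sequential(n, jobs):
    results = [None] * jobs
    start = time.perf_counter()
    for idx in range(jobs):
        count_up(n, results, idx)
    return results, time.perf_counter() - start


def run_threaded(n, jobs):
    results = [None] * jobs
    threads = [
        threading.Thread(target=count_up, args=(n, results, idx))
        for idx in range(jobs)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.perf_counter() - start


seq_results, seq_time = run_sequential(200000, 2)
thr_results, thr_time = run_threaded(200000, 2)
print("sequential: %.4fs  threaded: %.4fs" % (seq_time, thr_time))
```

Both runs do identical work and produce identical results, so any
difference between the two timings comes from thread switching and
contention for the GIL rather than from the calculation itself.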