Hi everyone,
I have written a function that runs functions in separate processes. I
hope you can help me improve it, and I would like to submit it to
the Python Cookbook if its quality is good enough.
I was writing a numerical program (using numpy) that uses huge
amounts of memory, with the memory use growing over time. The program
structure was essentially:

for radius in radii:
    result = do_work(params)
where do_work actually uses a large number of temporary arrays. The
variable params is large as well and is the result of computations
before the loop.
After playing with gc for some time, trying to convince it to
release the memory, I gave up. I would be happy, by the way, if
somebody could point me to a web page or reference that explains how
to call a function and then reclaim all of its memory back in Python.
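
For what it's worth, this is roughly the kind of thing I tried (a
minimal sketch using the names from the loop above; gc.collect() does
force a collection, but memory freed inside the process is not
necessarily returned to the operating system):

import gc

for radius in radii:
    result = do_work(params)
    # force a full collection after each iteration; this breaks any
    # reference cycles, but the process's resident size can stay high anyway
    gc.collect()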
Meanwhile, the best that I could do is fork a process, compute the
results, and return them to the parent process. I implemented this
in the following function, which is kinda working for me now, but I
am sure it can be much improved. There should be a better way to
return the result than a temporary file, for example; a pipe-based
sketch follows after the usage example below. I actually thought of
posting this after noticing that the pypy project had what I thought
was a similar thing in their testing, but they probably dealt with it
differently in the autotest driver [1]; I am not sure.
Here is the function:
def run_in_separate_process(f, *args, **kwds):
    from os import tmpnam, fork, waitpid, remove, _exit
    from pickle import load, dump
    from contextlib import closing
    fname = tmpnam()  # tempfile.mkstemp would be safer, but needs more plumbing
    pid = fork()
    if pid > 0:  # parent
        waitpid(pid, 0)  # should have checked for correct finishing
        with closing(file(fname, 'rb')) as inp:  # binary mode for pickle
            result = load(inp)
        remove(fname)
        return result
    else:  # child
        result = f(*args, **kwds)
        with closing(file(fname, 'wb')) as out:
            dump(result, out)
        _exit(0)  # exit the child without running the parent's cleanup handlers
To be used as:

for radius in radii:
    result = run_in_separate_process(do_work, params)
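
One improvement I can think of, sketched below but untested: return
the result through a pipe instead of a temporary file, and actually
check the exit status that waitpid returns. The name
run_in_separate_process_pipe is just for illustration.

def run_in_separate_process_pipe(f, *args, **kwds):
    # variant: pass the pickled result back through a pipe instead of
    # a temporary file, and check the child's exit status
    import os
    from pickle import loads, dumps
    rfd, wfd = os.pipe()
    pid = os.fork()
    if pid > 0:  # parent
        os.close(wfd)
        with os.fdopen(rfd, 'rb') as inp:
            # read to EOF before waiting, so a large result cannot
            # deadlock on a full pipe buffer
            data = inp.read()
        pid, status = os.waitpid(pid, 0)
        if not (os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0):
            raise RuntimeError('child process did not finish correctly')
        return loads(data)
    else:  # child
        os.close(rfd)
        try:
            result = f(*args, **kwds)
            with os.fdopen(wfd, 'wb') as out:
                out.write(dumps(result))
            os._exit(0)
        except Exception:
            os._exit(1)

Reading the pipe to EOF before calling waitpid is deliberate: if the
parent waited first, a result bigger than the pipe buffer would block
the child forever.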
[1] http://codespeak.net/pipermail/pypy-dev/2006q3/003273.html
Regards,
Muhammad Alkarouri