kj
I have a script that calls the function write_tmpfile, which looks
something like this:

    def write_tmpfile(f, tmpfile):
        # set-up code omitted
        in_f = popen("""grep -v '^\\[eof\\]$' %s |\
                        grep '[^[:space:]]' |\
                        sort -u""" % f)
        out_f = open(tmpfile, 'w')
        try:
            while 1:
                line = in_f.readline()
                if not line: break
                # i omit the code that munges line
                out_f.write(line)
        finally:
            in_f.close()
            out_f.close()

The script calls this function several thousand times. (The average
size of the input file f is 70K lines (0.5MB); the maximum size is
about 35M lines, or 200MB.) This function works perfectly most of
the time, but it deadlocks sporadically. (And it's a deadlock!
The script can be stuck for hours, until I kill it.)
I can't say for sure where the deadlock is happening (and I'd
appreciate suggestions on how to pinpoint this), but I *think* it
is at the in_f.readline() statement. So maybe the problem is with
the pipe. (But FWIW, I've used exactly the same pipe in another
script that processes the same set of files (but does not write a
temporary file when it does this), and that script terminates
without any problem. I.e., the input files are not too large for
the pipe.)
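
Short of a better idea, one way I might pinpoint the hang is a
watchdog that dumps the Python stack whenever a call runs too long.
This is only a sketch. It assumes a Python recent enough to ship the
standard faulthandler module, which my script does not currently use.
The names dump_stack_if_stuck and log_path are ones I made up for
illustration.

```python
import faulthandler
from contextlib import contextmanager


@contextmanager
def dump_stack_if_stuck(seconds, log_path):
    """While the body runs, append every thread's traceback to log_path
    each time `seconds` elapse without the body finishing.

    dump_stack_if_stuck and log_path are hypothetical names.
    """
    with open(log_path, "a") as log:
        # faulthandler writes to the file descriptor from a watchdog
        # thread, so the file must stay open until the dump is cancelled.
        faulthandler.dump_traceback_later(seconds, repeat=True, file=log)
        try:
            yield
        finally:
            faulthandler.cancel_dump_traceback_later()
```

Wrapping each call, e.g. "with dump_stack_if_stuck(600, 'hang.log'):
write_tmpfile(f, tmpfile)", should leave a traceback in the log
showing which statement the script was sitting in. It would not tell
me what grep or sort were doing at that moment.
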
I suppose that I could use some timeout mechanism to unwedge the
script when it deadlocks and then repeat the call to write_tmpfile,
but I'd prefer to avoid the deadlock in the first place.
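
For concreteness, the timeout workaround I have in mind would be
roughly the following. It is a sketch, not what the script does now.
It assumes Python 3.7 or later and the standard subprocess module
instead of popen. The names run_with_timeout, seconds and retries are
made up for illustration.

```python
import subprocess


def run_with_timeout(cmd, seconds, retries=1):
    """Run cmd (an argument list) and return its stdout as text.

    If the command has not finished after `seconds`, it is killed and
    tried again; None is returned when every attempt times out.
    run_with_timeout is a hypothetical helper name.
    """
    for _ in range(retries + 1):
        try:
            result = subprocess.run(
                cmd, stdout=subprocess.PIPE, text=True, timeout=seconds
            )
            return result.stdout
        except subprocess.TimeoutExpired:
            # subprocess.run has already killed the child at this point.
            continue
    return None
```

The pipeline itself could presumably be passed as ["sh", "-c", "..."],
though killing the shell may leave the grep and sort children running.
This also holds the whole output in memory, which is a real drawback
for the 200MB inputs.
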
I'd appreciate suggestions on how to troubleshoot and debug this
thing.
TIA!
Kynn