Reaping dead children

Mike Dowling · Nov 28, 2004

I'm having problems reaping my dead children.

I have written a perl script that starts a monster program that
determines optimal schedules for my company's power plants. It can
happen that the optimisation software takes hours, months, or years.
Obviously, we would like to interrupt the program when that happens, but
my perl script is blocked until the monster finishes, or until it's
interrupted. So I start a second process with "fork" that writes info
from the log file that the monster is continually writing. I don't want
that second process to become a zombie, so I reap dead children using:

use POSIX ":sys_wait_h";

$SIG{CHLD} = \&REAPER;

sub REAPER { # reap my dead children to avoid zombies
    my $stiff;
    while (($stiff = waitpid(-1, WNOHANG)) > 0) { }
    $SIG{CHLD} = \&REAPER;
}

Now my child can rest in peace.

But, and here's the problem, if monster terminates with an error
message, I want to catch that, and analyse why. With any kind of
reaping, including

$SIG{CHLD} = 'IGNORE'

(I am really not interested in why my child dies), the status code
returned by monster is changed. I don't care what happens to dead
children started by "fork", but it is vital to check why a child dies
that is started with "system". What can be done?

Cheers,
Mike Dowling

Brian McCauley · Nov 28, 2004

I seem to be dogged with Usenet posting problems. Let's try again...

Mike said:
I'm having problems reaping my dead children.

I have written a perl script that starts a monster program that
determines optimal schedules for my company's power plants. It can
happen that the optimisation software takes hours, months, or years.
Obviously, we would like to interrupt the program when that happens, but
my perl script is blocked until the monster finishes, or until it's
interrupted. So I start a second process with "fork" that writes info
from the log file that the monster is continually writing. I don't want
that second process to become a zombie,

Why not just let the zombies persist until the monster has finished then
explicitly reap using waitpid($pid,0)? No need for handlers.
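
A minimal sketch of that suggestion, assuming the log watcher is the only forked child. The command in @monster_cmd and the watcher's loop are hypothetical stand-ins, not Mike's real program. With no CHLD handler installed, system() does its own wait and its return value is left intact:

```perl
use strict;
use warnings;

# Hypothetical stand-in: the real script would launch the optimiser here.
my @monster_cmd = ($^X, '-e', 'exit 3');

my $watcher = fork();
die "fork failed: $!" unless defined $watcher;
if ($watcher == 0) {
    # Child: placeholder for the loop that relays the monster's log file.
    sleep 1 while 1;
    exit 0;
}

# Parent: no $SIG{CHLD} handler, so system() waits for the monster itself.
my $status = system(@monster_cmd);

# The monster is done; stop the watcher and reap it explicitly.
# Note that this waitpid overwrites $?, hence the saved $status above.
kill 'TERM', $watcher;
waitpid($watcher, 0);

if ($status == -1) {
    print "monster could not be started: $!\n";
} elsif ($status & 127) {
    printf "monster killed by signal %d\n", $status & 127;
} else {
    printf "monster exited with code %d\n", $status >> 8;
}
```

The watcher stays a zombie only for the short time between its death and the explicit waitpid, which is harmless here.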

Ben Morrow · Nov 30, 2004

Quoth (e-mail address removed):

I'm having problems reaping my dead children.

I have written a perl script that starts a monster program that
determines optimal schedules for my company's power plants. It can
happen that the optimisation software takes hours, months, or years.
Obviously, we would like to interrupt the program when that happens, but
my perl script is blocked until the monster finishes, or until it's
interrupted. So I start a second process with "fork" that writes info
from the log file that the monster is continually writing. I don't want
that second process to become a zombie, so I reap dead children using:

use POSIX ":sys_wait_h";

$SIG{CHLD} = \&REAPER;

sub REAPER { # reap my dead children to avoid zombies
    my $stiff;
    while (($stiff = waitpid(-1, WNOHANG)) > 0) { }
    $SIG{CHLD} = \&REAPER;
}

Now my child can rest in peace.

But, and here's the problem, if monster terminates with an error
message, I want to catch that, and analyse why. With any kind of
reaping, including

$SIG{CHLD} = 'IGNORE'

(I am really not interested in why my child dies), the status code
returned by monster is changed. I don't care what happens to dead
children started by "fork", but it is vital to check why a child dies
that is started with "system". What can be done?

Create the 'monster' child yourself with fork/exec. Then you don't need
the other child: the main program can deal with reading the logfile and
killing it when necessary.
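
As a rough sketch of that idea, assuming a simple wall-clock limit decides when to kill: the main program starts the monster with fork/exec and polls it with a non-blocking waitpid, so no second child and no signal handler are needed. The command in @monster_cmd and the $time_limit value are hypothetical placeholders:

```perl
use strict;
use warnings;
use POSIX ":sys_wait_h";

# Hypothetical stand-ins for the real command and time limit.
my @monster_cmd = ($^X, '-e', 'sleep 60');
my $time_limit  = 2;    # seconds

my $monster = fork();
die "fork failed: $!" unless defined $monster;
if ($monster == 0) {
    exec(@monster_cmd) or die "exec failed: $!";
}

my $deadline = time + $time_limit;
my $status;
while (1) {
    # Non-blocking check: returns the pid once the monster has exited.
    if (waitpid($monster, WNOHANG) == $monster) {
        $status = $?;
        last;
    }
    # A real script would read new lines from the log file here.
    if (time >= $deadline) {
        kill 'TERM', $monster;
        waitpid($monster, 0);    # blocking reap after the kill
        $status = $?;
        last;
    }
    sleep 1;
}

if ($status & 127) {
    printf "monster killed by signal %d\n", $status & 127;
} else {
    printf "monster exited with code %d\n", $status >> 8;
}
```

Because the parent is the only process calling waitpid, $? is read right after the call that set it and nothing else can change it in between.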

To answer your actual question, $? is only valid *inside* the
sighandler, and if you have one the return value of system is invalid.
So, find out the pid of your 'monster' child (you will need to do the
system by hand with fork/exec to do this), and store it in a global; in
the sighandler, check if $stiff == $monster and if it is stick $? into
another global that you can look at back in the main program. You *will*
find you get nasty effects if you have a waitpid($monster) in the main
program and a waitpid(-1) in the sighandler running at the same time;
I'd probably use eval { POSIX::pause } in the main program and then die
in the sighandler if we've got the monster child.

Yes, all this business with globals is pretty nasty; but that's signals
for you, I'm afraid... You may be able to make the sighandler a
closure: something like

sub run_monster {
    my $monster_pid = ...;
    my $monster_status;

    $SIG{CHLD} = sub {
        ...;
        if ($stiff == $monster_pid) {
            $monster_status = $?;
            die;
        }
    };
}

but I haven't tested that.
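
One way the closure idea might be filled in is sketched below. It is a variation, not the exact code suggested above: instead of die plus eval { POSIX::pause }, the handler records every reaped status in a lexical hash and the main code polls that hash with sleep. This avoids the case where the child exits before the pid has been stored, and the case where the signal arrives just before pause is called. The %reaped hash and the command passed to run_monster are assumptions made for illustration:

```perl
use strict;
use warnings;
use POSIX ":sys_wait_h";

sub run_monster {
    my @cmd = @_;
    my %reaped;    # pid => wait status, filled in by the handler below

    # The handler is a closure over %reaped; "local" restores the previous
    # handler when run_monster returns.
    local $SIG{CHLD} = sub {
        local ($!, $?);    # do not clobber the main program's values
        while ((my $stiff = waitpid(-1, WNOHANG)) > 0) {
            $reaped{$stiff} = $?;
        }
    };

    my $monster_pid = fork();
    die "fork failed: $!" unless defined $monster_pid;
    if ($monster_pid == 0) {
        exec(@cmd) or POSIX::_exit(127);    # exec failed in the child
    }

    # sleep is interrupted when SIGCHLD arrives; if the signal was handled
    # earlier, the loop notices at most one second late.
    until (exists $reaped{$monster_pid}) {
        sleep 1;    # a real script could relay the log file here
    }
    return $reaped{$monster_pid};
}

# Hypothetical command standing in for the monster.
my $status = run_monster($^X, '-e', 'exit 7');
if ($status & 127) {
    printf "monster killed by signal %d\n", $status & 127;
} else {
    printf "monster exited with code %d\n", $status >> 8;
}
```

The status of any other child that dies in the meantime also ends up in %reaped, so it is reaped without disturbing the monster's status.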

Ben
