killing my children

Oliver

hi all -
I'm struggling to reap my children.

I have a parent process that is forking to do work in parallel.

I need to keep track and manage the number of child processes.

I am doing this by creating a communication link between child and
parent using "socketpair" - the idea being that the parent can check
on the state of the children by checking the state of the handle to
the child (* comments/crits. on this approach welcome).

all this avoids signals - yipee!! - well, nearly :( - I find I have to
have a signal handler anyway to read my children.

I don't really want to do this - so I have tried: $SIG{CHILD} =
'IGNORE';

... but whatever I do I seem to frequently (but not always) get
children hanging around after the parent dies.

I've tried every variant on the usual reaper code imaginable - this
sort of thing:

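use POSIX ":sys_wait_h";   # provides WNOHANG

sub REAPER {
    my $child;
    print "reaping - \n";
    # reap every child that has already exited, without blocking
    while ((my $waitedpid = waitpid(-1,WNOHANG)) > 0) {
        print "reaped $waitedpid" . ($? ? " with exit $?" : '') . "\n";
    }

    $SIG{CHLD} = \&REAPER; # loathe sysV
}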

I've also tried to make sure that the handles created by socketpair
are closed/undefined - but no joy.

any ideas out there?

cheers,
Oli
 
ctcgag

hi all -
I'm struggling to reap my children.

I have a parent process that is forking to do work in parallel.

I need to keep track and manage the number of child processes.

I am doing this by creating a communication link between child and
parent using "socketpair" - the idea being that the parent can check
on the state of the children by checking the state of the handle to
the child (* comments/crits. on this approach welcome).

all this avoids signals - yipee!! - well, nearly :( - I find I have to
have a signal handler anyway to read my children.

I assume you meant to type "reap", not "read"?
I don't really want to do this - so I have tried: $SIG{CHILD} =
'IGNORE';

... but whatever I do I seem to frequently (but not always) get
children hanging around after the parent dies.

You have this backwards. If you don't reap your children, they will become
zombies, and hang around until the parent exits. Reaping your children
when they happen to die is not related to killing your children when you
die.
any ideas out there?

The two easiest that spring to mind are storing the PID of your children
and then using "kill" to kill them before you die; or better IMHO is to
have the children check on the pipe from the parent, and then exit
gracefully when the parent pipe closes.
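
The second idea can be sketched roughly as below. This is only a minimal illustration, assuming a single child and a socketpair like the one the original post describes; the name spawn_watched_child is hypothetical, and a real worker would interleave the read with its actual work rather than block on it.

```perl
use strict;
use warnings;
use Socket;
use POSIX ":sys_wait_h";

# Hypothetical helper: fork one child that waits on its end of a socketpair
# and exits as soon as the parent's end is closed (or the parent dies).
# Returns ($pid, $parent_side_handle).
sub spawn_watched_child {
    socketpair(my $to_child, my $to_parent, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
        or die "socketpair: $!";

    my $pid = fork();
    die "fork: $!" unless defined $pid;

    if ($pid == 0) {
        close $to_child;             # the child keeps only its own end
        my $line = <$to_parent>;     # undef at EOF, i.e. the parent side is gone
        POSIX::_exit(0);             # leave without running inherited END blocks
    }

    close $to_parent;                # the parent keeps only its own end
    return ($pid, $to_child);
}
```

One thing to watch with several children: a child forked later inherits the parent's handles to the earlier children, so an earlier child will not see EOF until those inherited copies are closed as well.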

But without knowing more about what you are doing, and more specifically,
why the existing parallel modules on CPAN aren't an option for you, it is
hard to give more specific advice.

Xho
 
Ben Morrow

Quoth (e-mail address removed) (Oliver):
I have a parent process that is forking to do work in parallel.

I need to keep track and manage the number of child processes.

I am doing this by creating a communication link between child and
parent using "socketpair" - the idea being that the parent can check
on the state of the children by checking the state of the handle to
the child (* comments/crits. on this approach welcome).

If you need this to communicate with the child, fine. If you just have
it to see when the child dies, there's no need: you can just keep a list
of child pids and check them with

waitpid $pid, WNOHANG
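
A rough sketch of that bookkeeping, assuming SIGCHLD is left at its default so that waitpid can actually collect the children; start_child and poll_children are hypothetical names used only for illustration:

```perl
use strict;
use warnings;
use POSIX ":sys_wait_h";

my %running;    # pid => 1 for every child that has not been reaped yet

# Fork a child that runs the given code reference and then exits.
sub start_child {
    my ($work) = @_;
    my $pid = fork();
    die "fork: $!" unless defined $pid;
    if ($pid == 0) {
        $work->();
        POSIX::_exit(0);
    }
    $running{$pid} = 1;
    return $pid;
}

# Reap any tracked children that have exited; return how many are left.
sub poll_children {
    for my $pid (keys %running) {
        my $res = waitpid($pid, WNOHANG);
        # 0: still running, $pid: just reaped, -1: no such child
        delete $running{$pid} if $res == $pid || $res == -1;
    }
    return scalar keys %running;
}
```

The parent can call poll_children() whenever it needs to decide whether it may start another worker.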
all this avoids signals - yipee!! - well, nearly :( - I find I have to
have a signal handler anyway to read my children.

Avoiding signals is not necessarily a good thing: they are there to tell
you about asynchronous events, such as a child exiting. If you use the
signal, then there's no need to check whether a child has exited or not:
if and when it does, you'll be told. OTOH, if you need or want to check
anyway, there's no need for the signal so you can ignore it.
I don't really want to do this - so I have tried: $SIG{CHILD} =
'IGNORE';

If you have a system this works on (most modern Unices), and the parent
doesn't need to know when the children die, this is the best solution.
Note that the signal is called CHLD or CLD, not CHILD.
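
A small way to see the effect, assuming a system that auto-reaps children when SIGCHLD is ignored (Linux among them); wait_result_when_ignoring is a hypothetical name. On such systems the blocking waitpid should come back with -1, because the child was cleaned up without the parent's help:

```perl
use strict;
use warnings;
use POSIX ();

# Fork a child that exits at once while SIGCHLD is ignored, then return
# whatever a blocking waitpid on that child reports.
sub wait_result_when_ignoring {
    local $SIG{CHLD} = 'IGNORE';     # note the spelling: CHLD, not CHILD
    my $pid = fork();
    die "fork: $!" unless defined $pid;
    POSIX::_exit(0) if $pid == 0;    # child: exit immediately
    return waitpid($pid, 0);
}
```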
... but whatever I do I seem to frequently (but not always) get
children hanging around after the parent dies.

Ah... you have misunderstood something crucial :). SIGCHLD is only sent
when a child *dies*: i.e., when it calls exit or is sent a fatal signal.
If your children have died and become zombies (because you didn't reap
them) and the parent exits, they will be inherited by init which will
clean them up for you. If your children are outliving their parent, then
they never exited: you need to make sure they do when they've finished
their work.

If you want to kill all your children when the parent dies, you will
need to do this explicitly: something like

my @kids;
my $parent = $$;

END {
    # only the original parent should kill the children
    if ($parent == $$) {
        kill TERM => $_ for @kids;
    }
}

where you populate @kids with all your child pids. You can then set up a
SIGTERM handler in the child if you want to catch this event and do some
cleanup (and then exit).
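
The child side of that could look something like this. It is only a sketch: spawn_term_aware_child and the exit status 3 are made up for illustration, and the pipe exists only so the parent does not send the signal before the handler is installed.

```perl
use strict;
use warnings;
use POSIX ();

# Fork a child that idles until it receives SIGTERM, then exits with status 3.
sub spawn_term_aware_child {
    pipe(my $ready_r, my $ready_w) or die "pipe: $!";

    my $pid = fork();
    die "fork: $!" unless defined $pid;

    if ($pid == 0) {
        close $ready_r;
        $SIG{TERM} = sub {
            # any per-child cleanup would go here
            POSIX::_exit(3);
        };
        syswrite($ready_w, "x");     # tell the parent the handler is in place
        close $ready_w;
        sleep 1 while 1;             # stand-in for the real work
    }

    close $ready_w;
    sysread($ready_r, my $byte, 1);  # block until the child is ready
    close $ready_r;
    return $pid;
}
```

The parent would push the returned pid onto @kids so that the END block can signal it.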
I've tried every variant on the usual reaper code imaginable - this
sort of thing:

sub REAPER {
    my $child;
    print "reaping - \n";
    while ((my $waitedpid = waitpid(-1,WNOHANG)) > 0) {
        print "reaped $waitedpid" . ($? ? " with exit $?" : '') . "\n";
    }

    $SIG{CHLD} = \&REAPER; # loathe sysV
}

There's no need to loathe SysV unless you're using it: on any modern
system you almost certainly have reliable signals. You can check with

perl -V:d_sigaction

If this shows 'define' then perl is using reliable signals. If it
doesn't, then you need to read your signal(2) to find out if you have
BSDish or SysVish signal handling. (Of course, if you want
ultra-portability to ancient systems then you can't assume you have
reliable signals; but that's always the price of portability.)

You should also consider using one of the modules on CPAN for this task.

Ben
 
Oliver

ok - thanks very much - I still have problems getting my children to
exit - all works ok if I use POSIX::_exit :(

problem - children not going away when they've finished their work.

solution: fork as per normal - then keep track like this:

sub is_alive {
    my $pid = shift;
    my $status = waitpid $pid, WNOHANG;

    return ($status == -1) ? 0 : 1;
}

and use $SIG{CHLD} = 'IGNORE'
it works!... nearly...

it only works if the child exits like this:

POSIX::_exit(0); # force exit now

...this is the crux of my problem.
if I use normal perl 'exit' then the children hang around - and my
parent process has also stopped exiting too.

I guess what's happening is that some DESTROY or END method is hanging
- and _exit skips these.

Each child is doing this and only this:

my $smtp = Net::SMTP->new('mail', Timeout => 60) || die $!;
$smtp->mail($from) or die "net_smtp mail from failed";
$smtp->to($to) or die "net_smtp rcpt to failed";
$smtp->data($data) or die "net_smtp data failed";
$smtp->quit() or die "net_smtp quit failed";
undef($smtp);

anyone got any idea why Net::SMTP is causing this, or whether
POSIX::_exit is likely to cause me problems ??
 
ctcgag

it only works if the child exits like this:

POSIX::_exit(0); # force exit now

..this is the crux of my problem.
if I use normal perl 'exit' then the children hang around - and my
parent process has also stopped exiting too.

I guess what's happening is that some DESTROY or END method is hanging
- and _exit skips these.

If the child inherits large data structures from the parent, it can
take a *long* time to clean these up upon normal exit.

Are the children burning CPU, or just sitting around idle?

Xho
 
Oliver

If the child inherits large data structures from the parent, it can
take a *long* time to clean these up upon normal exit.

ahh - I feel enlightened - there shouldn't be large datastructures,
but maybe filehandles...or something
Are the children burning CPU, or just sitting around idle?

just sitting around idle...
 
Ben Morrow

Quoth (e-mail address removed) (Oliver):
ok - thanks very much - I still have problems getting my children to
exit - all works ok if I use POSIX::_exit :(

problem - children not going away when they've finished their work.

solution: fork as per normal - then keep track like this:

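sub is_alive {
    my $pid = shift;
    my $status = waitpid $pid, WNOHANG;

    return ($status == -1) ? 0 : 1;

This is wrong: waitpid with WNOHANG returns 0 while the child is still
running, the child's pid once it has been reaped, and -1 if there is no
such child. Only 0 means the child is alive:

    return $status == 0;
}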

and use $SIG{CHLD} = 'IGNORE'
it works!... nearly...

it only works if the child exits like this:

POSIX::_exit(0); # force exit now

..this is the crux of my problem.
if I use normal perl 'exit' then the children hang around - and my
parent process has also stopped exiting too.

I guess what's happening is that some DESTROY or END method is hanging
- and _exit skips these.

Note that children will *also* call any END blocks they've inherited
from the parent: if one of these is not expecting to be run in the
child, it may be hanging because some needed resource is no longer
available.

You can avoid this by recording the pid in the parent and checking in
the END block that that is still your pid.
or whether
POSIX::_exit is likely to cause me problems ??

If your children contain any objects (created or inherited) that need
cleanup, then that cleanup won't occur. You will need to look at the
code to see whether that will be a problem (I would suggest that if the
children are inheriting complex objects they don't need, you refactor
the code so that these are not created until after you fork...).
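
To make the difference concrete, here is a small sketch with a made-up class (Noisy) whose destructor writes to a pipe; child_output is likewise a hypothetical helper. With a normal exit the child's object is destroyed and the message arrives; with POSIX::_exit the destructor is skipped and the parent reads nothing.

```perl
use strict;
use warnings;
use POSIX ();

package Noisy;

# Remember a raw file descriptor to write to when the object is destroyed.
sub new {
    my ($class, $fd) = @_;
    return bless { fd => $fd }, $class;
}

sub DESTROY {
    my $self = shift;
    my $msg  = "destroyed\n";
    POSIX::write($self->{fd}, $msg, length $msg);
}

package main;

# Fork a child that creates a Noisy object and then leaves either via
# POSIX::_exit (true argument) or via perl's normal exit (false argument).
# Returns whatever the child wrote to the pipe.
sub child_output {
    my ($use_posix_exit) = @_;
    pipe(my $r, my $w) or die "pipe: $!";

    my $pid = fork();
    die "fork: $!" unless defined $pid;

    if ($pid == 0) {
        close $r;
        my $obj = Noisy->new(fileno $w);
        if ($use_posix_exit) {
            POSIX::_exit(0);         # skips DESTROY and END blocks
        }
        else {
            exit(0);                 # runs DESTROY and END blocks
        }
    }

    close $w;
    my $out = do { local $/; <$r> }; # read until the child closes the pipe
    waitpid($pid, 0);
    return defined $out ? $out : '';
}
```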

Ben
 
Nick Howes

(e-mail address removed) (Oliver) wrote:

You have this backwards. If you don't reap your children, they will become
zombies, and hang around until the parent exits. Reaping your children
when they happen to die is not related to killing your children when you
die.

I'm not trolling or anything, it just makes me laugh reading stuff like
this... imagine somebody who didn't know what Perl was getting the
wrong end of the stick from this thread :)
 
