A problem with fork() and managing processes

Michele Dondi

I hope that the subject is not too imprecise and that this is not too
OT (I am aware that part of the issue is OS-specific rather than
Perlish)...

I was experimenting with fork(), signals, and other process management
functions in perl. I am reproducing a minimal example here:


#!/usr/bin/perl -l

use strict;
use warnings;

sub busy ();
sub rest();

sleep 5;
busy;

sub busy () { # Let the fun begin!! ;-)
    my $forked=0;
    my @family;
    for (1..7) {
        defined (my $pid=fork) or
          warn "Couldn't fork: $!\n";
        $forked *= 2;
        if ($pid) {
            push @family, $pid;
        } else {
            @family=();
            $forked++;
        }
    }

    local $SIG{TERM} = sub {
        kill 15, @family;
        waitpid $_, 0 for @family;
        die "[$forked:$$] Stopping now: ",
          "signaled (@family)\n";
    };

    while (1) {
        next if $forked;
        kill 15, @family;
        goto &rest;
    }
}

sub rest () {
    sleep 10 while 1;
}

__END__


It should (i) fork() up to 127 copies of itself, (ii) kill them and
then go to rest.
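
The figure of 127 follows from the loop itself: every pass doubles the number of running copies, so seven passes give 2**7 = 128 processes, of which 127 are forked copies. A rough sketch that models the loop without actually forking (the forked/kids record layout is an assumption made for illustration only):

```perl
use strict;
use warnings;

# Each record models one process: its $forked value and the number of
# direct children it has collected in @family.
my @procs = ({ forked => 0, kids => 0 });

for my $pass (1 .. 7) {
    my @next;
    for my $p (@procs) {
        # parent branch: $forked *= 2; push @family, $pid
        push @next, { forked => $p->{forked} * 2, kids => $p->{kids} + 1 };
        # child branch: $forked *= 2; @family = (); $forked++
        push @next, { forked => $p->{forked} * 2 + 1, kids => 0 };
    }
    @procs = @next;
}

# Only the original process still has $forked == 0 after the loop.
my ($original) = grep { $_->{forked} == 0 } @procs;

printf "%d processes, %d of them forked copies\n", scalar @procs, @procs - 1;
printf "the original process has %d direct children\n", $original->{kids};
```

In this model only the original process ends up with $forked == 0, and it has exactly seven direct children of its own.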

Now, *generally* it works, *but* (i) it still leaves around exactly
seven <defunct> children; (ii) what's worse, *in some cases*
(apparently at random), it leaves a small but variable number of
active children too.


Michele
 
Anno Siegel

Michele Dondi said:

It should (i) fork() up to 127 copies of itself, (ii) kill them and
then go to rest.

Now, *generally* it works, *but* (i) it still leaves around exactly
seven <defunct> children; (ii) what's worse, *in some cases*
(apparently at random), it leaves a small but variable number of
active children too.

You have no SIGCHLD handler anywhere, so the seven zombies are simply
the seven original children of your main process. They *must* be
zombies as long as the parent runs. The second-generation processes
don't become zombies because their parents are dead.

I don't know about the active processes, but that may clear itself
up when you fix the missing handler.
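
A minimal sketch of the kind of SIGCHLD handler meant here, assuming a POSIX system. The three short-lived children, the %reaped hash and the exit statuses are assumptions made for illustration and are not part of the original program:

```perl
use strict;
use warnings;
use POSIX ':sys_wait_h';   # provides WNOHANG

my %reaped;   # pid => exit status, filled in by the handler

# Reap every child that has already exited. Several exits can be folded
# into one SIGCHLD, so loop until waitpid has nothing left to collect.
$SIG{CHLD} = sub {
    local ($!, $?);   # don't disturb the interrupted code's error state
    while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
        $reaped{$pid} = $? >> 8;
    }
};

my @family;
for my $n (1 .. 3) {
    my $pid = fork;
    die "Couldn't fork: $!" unless defined $pid;
    exit $n unless $pid;    # child: exit at once with a known status
    push @family, $pid;
}

# sleep returns early when a signal arrives, so poll until all are reaped.
sleep 1 while keys(%reaped) < @family;

print "reaped $_ with status $reaped{$_}\n" for @family;
```

Because the handler calls waitpid itself, no <defunct> entries should linger while the parent keeps running. The non-blocking loop matters: a handler that waits for only one child per signal can miss exits that arrive close together.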

Anno
 
Michele Dondi

Any idea?

Note: if you think that it would be appropriate, I can make a followup
restating the problem and crossposting it to an OS specific newsgroup.
Any recommendation?


Michele
 
Michele Dondi

Anno Siegel said:
    local $SIG{TERM} = sub {
        kill 15, @family;
        waitpid $_, 0 for @family;
[snip]
        next if $forked;
        kill 15, @family;
        goto &rest;

You have no SIGCHLD handler anywhere, so the seven zombies are simply
the seven original children of your main process. They *must* be
zombies as long as the parent runs. The second-generation processes
don't become zombies because their parents are dead.

First of all, let me thank you so much for answering. As must be
clear, I'm trying my hand at this kind of thing but I don't have
much specific knowledge about it. So all in all this experience is
proving to be very instructive and rewarding...

Anno Siegel said:
I don't know about the active processes, but that may clear itself
up when you fix the missing handler.

I must admit that it took me some time to understand what you meant,
or rather to understand why it is actually so. In fact I did read some
relevant docs (both from signal(7) and 'perldoc perlipc').

I had naively assumed that signaling on the one hand and handling
signals on the other one would have been all that was needed. I added
the

waitpid $_, 0 for @family;

line "only" because I thought it would be better to die only
after all of one's children are gone.

Following your advice I was about to add

local $SIG{CHLD} = sub { waitpid $_, 0 for @family };

too, when I guessed that maybe I could simply do for the firestarting
code what I was already doing for the propagation one instead. So I
*tried* adding a

waitpid $_, 0 for @family;

line soon after the second

kill 15, @family;

in the code quoted above and... et voilà, it does indeed seem to work
as expected. Do you think I should install a SIGCHLD handler
anyway?

Also, as for the random persistence of non-zombie instances,
I *think* that it may be related to the process removal phase
beginning too early, possibly "interfering" with the "diffusion" phase,
which may not have finished yet for some of the forked processes. To
be fair, I cannot see clearly what was going on, but even before
making the modifications described above I tried adding

sleep 5;

soon after

next if $forked;

and that solved the problem in a repeated series of tests (under
conditions identical to the previous ones): i.e. I "only" got
the seven little zombies. And now they're gone too!

As a qualitative observation, it *seems* to me that if I leave out the
C<sleep 5;> statement in the version that does correct child
management, these random misbehaviours are also rarer (though
certainly not absent!), but I wouldn't swear to this.

Now I don't know if 5 seconds is appropriate or if there are
alternative strategies at all (I mean: that simple!) but as I said it
seems to work well in all tests I've done so far.

For reference the complete minimal example program now looks like
this:


#!/usr/bin/perl -l

use strict;
use warnings;

sub busy ();
sub rest();
sub killnwait;

sleep 5;
busy;

sub busy () { # Let the fun begin!! ;-)
    my $forked=0;
    my @family;
    for (1..7) {
        defined (my $pid=fork) or
          warn "Couldn't fork: $!\n";
        $forked *= 2;
        if ($pid) {
            push @family, $pid;
        } else {
            @family=();
            $forked++;
        }
    }

    local $SIG{TERM} = sub {
        killnwait @family;
        die "[$forked:$$] Stopping now: ",
          "signaled (@family)\n"
    };

    while (1) {
        next if $forked;
        sleep 5;
        killnwait @family;
        goto &rest;
    }
}

sub rest () {
    sleep 10 while 1;
}

sub killnwait {
    kill 15, @_;
    waitpid $_, 0 for @_;
}

__END__


Michele
 
xhoster

Michele Dondi said:
    local $SIG{TERM} = sub {
        kill 15, @family;
        waitpid $_, 0 for @family;

Here you wait for your family.

        die "[$forked:$$] Stopping now: ",
          "signaled (@family)\n";
    };

    while (1) {
        next if $forked;
        kill 15, @family;

Here you don't.

        goto &rest;
    }
}

sub rest () {
    sleep 10 while 1;
}

__END__

It should (i) fork() up to 127 copies of itself, (ii) kill them and
then go to rest.

Now, *generally* it works, *but* (i) it still leaves around exactly
seven <defunct> children; (ii) what's worse, *in some cases*
(apparently at random), it leaves a small but variable number of
active children too.

Xho
 
xhoster

Michele Dondi said:
Also, as for the random persistence of non-zombie instances,
I *think* that it may be related to the process removal phase
beginning too early, possibly "interfering" with the "diffusion" phase,
which may not have finished yet for some of the forked processes:

You don't install the SIGTERM handler until after the for loop. The
parent finishes all seven trips through the loop and starts killing its
family. But some members of the family may not have quite finished their
trips through the loop by the time the parent sends them a TERM.
Since they haven't yet installed the handler, they don't execute the
handler code. So *their* children never get killed. I think that just
moving the handler assignment to before the for loop should fix the
problem.
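
A small sketch of that ordering, under assumptions: the handler here only exits with an arbitrary status (42) so the parent can tell that it ran, and three children stand in for the real family. Because the handler is installed before fork, each child inherits it and can be signaled immediately:

```perl
use strict;
use warnings;

# Installed before any fork, so every child starts life with it in place.
$SIG{TERM} = sub { exit 42 };

my @family;
for (1 .. 3) {
    my $pid = fork;
    die "Couldn't fork: $!" unless defined $pid;
    if ($pid) {
        push @family, $pid;
        next;
    }
    sleep 10 while 1;       # child: idle until signaled
}

kill 'TERM', @family;       # sent at once, with no settling delay

my %status;
for my $pid (@family) {
    waitpid $pid, 0;
    $status{$pid} = $? >> 8;   # exit code set by the inherited handler
}

print "$_ exited with $status{$_}\n" for @family;
```

If the handler were assigned only after the loop, a child signaled early would be terminated by the default TERM action instead, and the low bits of $? would carry the signal number rather than an exit code.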

Xho
 
Michele Dondi

xhoster said:
You don't install the SIGTERM handler until after the for loop. The
parent finishes all seven trips through the loop and starts killing its
family. But some members of the family may not have quite finished their
trips through the loop by the time the parent sends them a TERM.
Since they haven't yet installed the handler, they don't execute the
handler code. So *their* children never get killed. I think that just
moving the handler assignment to before the for loop should fix the
problem.

Thank you very much for the suggestion. Moving it before the loop now
seems the Right Thing(TM) anyway. But it didn't totally remove the
problem, even though it certainly became less frequent. So I simply left
the sleep() statement in place, and won't bother any more, especially
since the *real* code will have to do something similar anyway...
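
One possible explanation for the remaining stragglers, not confirmed anywhere in this thread: a TERM that arrives after fork returns but before the new pid has been pushed onto @family makes the handler run with an incomplete list. A hedged sketch that holds SIGTERM back across that window with POSIX::sigprocmask follows. The helper name fork_tracked, the exit status 42 and the three-pass loop are assumptions made for illustration:

```perl
use strict;
use warnings;
use POSIX qw(:signal_h);   # sigprocmask, SIG_BLOCK, SIG_SETMASK, SIGTERM

my @family;
my $term_set = POSIX::SigSet->new(SIGTERM);

# Installed before forking: pass the signal on, wait, then exit.
$SIG{TERM} = sub {
    kill 'TERM', @family;
    waitpid $_, 0 for @family;
    exit 42;
};

# Hypothetical helper: fork with SIGTERM blocked until @family is consistent.
sub fork_tracked {
    my $old = POSIX::SigSet->new;
    sigprocmask(SIG_BLOCK, $term_set, $old)
        or die "Couldn't block SIGTERM: $!";
    my $pid = fork;
    if    (!defined $pid) { warn "Couldn't fork: $!\n" }
    elsif ($pid)          { push @family, $pid }   # parent records the child
    else                  { @family = () }         # child starts a fresh list
    sigprocmask(SIG_SETMASK, $old)
        or die "Couldn't restore signal mask: $!";
    return $pid;
}

my $forked = 0;
for (1 .. 3) {
    my $pid = fork_tracked();
    $forked *= 2;
    $forked++ if defined $pid && !$pid;
}

if ($forked) { sleep 10 while 1 }   # every copy idles until signaled

kill 'TERM', @family;               # original process: no sleep beforehand
my @status;
for my $pid (@family) {
    waitpid $pid, 0;
    push @status, $? >> 8;
}
print "direct children exited with: @status\n";
```

A signal that arrives while blocked stays pending and is delivered once the old mask is restored, so the handler should always see a list that matches the children actually created. Whether this removes the need for the sleep in the real program would have to be tested.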


Michele
 
Michele Dondi

xhoster said:
        kill 15, @family;
        waitpid $_, 0 for @family;

Here you wait for your family.

        die "[$forked:$$] Stopping now: ",
          "signaled (@family)\n";
    };

    while (1) {
        next if $forked;
        kill 15, @family;

Here you don't.

As you can see from my other post, I eventually realized/learned this
myself (with the help of Anno, of course!). In practice, due to my
abysmal ignorance of the subject, I didn't even know it was
necessary. I did it in the first snippet because I thought it would
be sensible to wait for one's family *before committing
suicide*. Hopefully now I'm slightly less ignorant!


Michele
 
