Dealing With Zombies in a Daemon


Hal Vaughan

I have a Perl program that runs as a daemon in a fixed loop, doing certain
things at certain seconds of the minute: at 5 seconds after the start of
the minute it runs one set of routines, 20 seconds later another set, and
30 seconds after that another set, sleeping between each set of routines.
At the end of the minute, it reaps any zombies that have been created. One
thing it does is launch programs. To do this, I fork() and run the program
with backticks around the program name. For some reason, after I've
launched a program, if I call the reaper() routine (below) before the
launched program is done, the daemon will freeze in reaper() until the
launched program finishes.

So here are a few questions:

1) Do the unreaped zombies actually take up memory space, or just a number
in the list of PIDs?
2) I understand I can reap by setting $SIG{CHLD} to IGNORE, is that by just
using an assignment ($SIG{CHLD} = IGNORE), and how well does it work? The
name makes me think it just ignores zombies without doing anything.
3) Am I doing something wrong in the routines below, or is there a better
way to do this?
4) Why is the program hanging and waiting for the program that is still
running?

Thanks!

Hal


Subroutines:

At the start, I have:

use POSIX ":sys_wait_h";
our $zombies = 0;
$SIG{CHLD} = sub{ $zombies++ };

Then I have a routine I call at the end of each loop the daemon makes:

sub reaper {
    my (%Kid_Status)
    if ($zombies) {
        print "Zombie count: $zombie\n";
        while (($zombie = waitpid(-1, WNOHANG)) != -1) {
            $Kid_Status{$zombie} = $?;
        }
    }
    return;
}
 

xhoster

Hal Vaughan said:
For some reason, after I've launched a program, if I call
the reaper() routine (below) before the launched program is done, the
daemon will freeze in reaper() until the launched program finishes.

So here are a few questions:

1) Do the unreaped zombies actually take up memory space, or just a
number in the list of PIDs?

Yes, they take up space. Even just numbers in a list are still taking up
space. And not just any space, but space in a very important, finite
kernel memory structure.

2) I understand I can reap by setting $SIG{CHLD} to IGNORE, is that by
just using an assignment ($SIG{CHLD} = IGNORE), and how well does it
work? The name makes me think it just ignores zombies without doing
anything.

You need quotes around IGNORE: $SIG{CHLD} = 'IGNORE'.

In my experience, it works very well. I don't know exactly what you think
"just ignoring zombies without doing anything" means, so I don't know if
you are right. Obviously, setting $SIG{CHLD} to 'IGNORE' does something
different from leaving it unset.
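
For reference, here is a minimal sketch of the quoted form. It assumes a Linux-like system where ignoring SIGCHLD makes the kernel clean up children by itself. The fork and sleep scaffolding is only there for illustration and is not taken from your daemon.

```perl
use strict;
use warnings;

# Note the quotes: 'IGNORE' is a string, not a bareword.
$SIG{CHLD} = 'IGNORE';

my $pid = fork();
die "fork failed: $!" unless defined $pid;
if ($pid == 0) {
    exit 0;             # child: finish right away
}

sleep 1;                # parent: give the child time to exit

# With SIGCHLD ignored, the child's status is discarded when it exits,
# so there should be nothing left for the parent to wait for.
my $result = waitpid($pid, 0);
print "waitpid returned $result\n";
```

On such systems waitpid typically returns -1 here, because the child has already been cleaned up. The trade-off is that you can no longer collect exit statuses, so this suits fire-and-forget launches.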

3) Am I doing something wrong in the routines below, or is
there a better way to do this?

Yes, you are doing something wrong.

4) Why is the program hanging and waiting for the program that is still
running?

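If waitpid returns a zero meaning that children exist but are not ready for
reaping, which it is documented to do on some systems, then your code
enters a tight loop: the loop only stops on -1, so it keeps spinning until
the launched program exits and is reaped. Rather, loop only while a child
was actually reaped:
while (($zombie = waitpid(-1, WNOHANG)) > 0) {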
sub reaper {
    my (%Kid_Status)

What is the point of making a hash that never does anything useful?
Also, the missing semicolon makes that a syntax error.

    if ($zombies) {
        print "Zombie count: $zombie\n";
        while (($zombie = waitpid(-1, WNOHANG)) != -1) {
            $Kid_Status{$zombie} = $?;
        }
    }
    return;
}
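
Putting those points together, a corrected reaper might look like the sketch below. It assumes you want to keep the statuses, so %Kid_Status becomes a package variable (my choice, not something from your code). The two forked children and the timing variables are a hypothetical demo.

```perl
use strict;
use warnings;
use POSIX ":sys_wait_h";

our $zombies = 0;
our %Kid_Status;                    # raw wait status per reaped PID
$SIG{CHLD} = sub { $zombies++ };

sub reaper {
    return unless $zombies;
    $zombies = 0;
    # waitpid returns a PID (> 0) for each reaped child, 0 when children
    # exist but none has exited yet, and -1 when there are no children.
    while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
        $Kid_Status{$pid} = $?;
    }
    return;
}

# Demo: one short-lived child and one that is still running.
my $quick = fork();
die "fork failed: $!" unless defined $quick;
if ($quick == 0) { exit 3 }

my $slow = fork();
die "fork failed: $!" unless defined $slow;
if ($slow == 0) { sleep 5; exit 0 }

sleep 1;                            # SIGCHLD from $quick may cut this short

my $t0 = time;
reaper();                           # must not wait for $slow
my $elapsed     = time - $t0;
my $slow_reaped = exists $Kid_Status{$slow};

printf "quick child exit code: %d\n", $Kid_Status{$quick} >> 8;

# Tidy up the long-running demo child.
kill 'TERM', $slow;
waitpid($slow, 0);
```

The loop condition is the only behavioral change that matters for the hang: a return value of 0 now ends the loop instead of keeping it spinning.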

Xho
 

Dan Wilga

Hal Vaughan said:
1) Do the unreaped zombies actually take up memory space, or just a number
in the list of PIDs?

Both. The worst part is that most operating systems have a limit on the
number of processes that is lower than the range of possible PIDs. When
you hit this limit, weird things start to happen, because all of the
various daemons can no longer fork.
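
If you want to see the process-table slot being held, here is a small sketch. It is my own illustration and assumes a Unix-like system. It relies on kill with signal 0, which sends nothing and only reports whether the PID still exists.

```perl
use strict;
use warnings;
use POSIX ":sys_wait_h";

# Fork a child that exits immediately; the parent deliberately delays reaping.
my $pid = fork();
die "fork failed: $!" unless defined $pid;
if ($pid == 0) {
    POSIX::_exit(0);                # child: exit at once
}

sleep 1;                            # the child has exited by now and is a zombie

my $exists_before = kill(0, $pid);      # the unreaped child still holds its PID
my $reaped        = waitpid($pid, 0);   # collect the status, freeing the slot
my $exists_after  = kill(0, $pid);      # the PID is gone now

print "before reaping: ", ($exists_before ? "present" : "gone"), "\n";
print "after reaping:  ", ($exists_after  ? "present" : "gone"), "\n";
```

Until waitpid runs, the dead child still counts against the process limit even though it is no longer executing anything.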
2) I understand I can reap by setting $SIG{CHLD} to IGNORE, is that by just
using an assignment ($SIG{CHLD} = IGNORE), and how well does it work? The
name makes me think it just ignores zombies without doing anything.

That's exactly what it does. Technically, it ignores children that
terminate: the system discards their exit status right away, so they
never have time to become zombies.

3) Am I doing something wrong in the routines below, or is there a better
way to do this?

Instead of cleaning up at set intervals, why not just do it in the
$SIG{CHLD} handler?

$SIG{CHLD} = \&reaper;
....
sub reaper {
    1 while( ($waitedpid = waitpid( -1, WNOHANG )) > 0 );
    $SIG{CHLD} = \&reaper; # apparently needed on some OSes
}

4) Why is the program hanging and waiting for the program that is still
running?

That, I'm not so sure about. It could be because your OS is one that
requires the signal handler to be reset in %SIG every time the signal is
caught. My example shows this, as does `perldoc perlipc`.
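
A fuller sketch of the handler approach, for illustration only. The %status hash, the three throwaway children and the polling loop are hypothetical. The local ($!, $?) line follows the pattern shown in perlipc so that the handler does not disturb the main program's error and status variables.

```perl
use strict;
use warnings;
use POSIX ":sys_wait_h";

my %status;                         # hypothetical bookkeeping: PID => raw wait status

sub reaper {
    local ($!, $?);                 # restore these when the handler returns
    while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
        $status{$pid} = $?;
    }
    $SIG{CHLD} = \&reaper;          # re-install, for OSes that reset the handler
}
$SIG{CHLD} = \&reaper;

# Launch three short-lived children with exit codes 0, 1 and 2.
my @kids;
for my $code (0 .. 2) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) { exit $code }
    push @kids, $pid;
}

# The main loop never calls waitpid itself; it only waits (at most about
# ten seconds) for the handler to have recorded every child.
my $tries = 0;
sleep 1 while keys(%status) < @kids && $tries++ < 10;

for my $pid (@kids) {
    next unless defined $status{$pid};
    printf "child %d exited with code %d\n", $pid, $status{$pid} >> 8;
}
```

The inner while loop matters because several children can exit before the handler gets to run, and one signal may stand for more than one dead child.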
 

Hal Vaughan

Dan said:
Both. The worst part is that most operating systems have a limit on the
number of processes that is lower than the range of possible PIDs. When
you hit this limit, weird things start to happen, because all of the
various daemons can no longer fork.


That's exactly what it does. Technically, it ignores children that
terminate. They never have time to become zombies.


Instead of cleaning up at set intervals, why not just do it in the
$SIG{CHLD} handler?

$SIG{CHLD} = \&reaper;
...
sub reaper {
1 while( ($waitedpid = waitpid( -1, WNOHANG )) > 0 );
$SIG{CHLD} = \&reaper; # apparently needed on some OSes
}


That, I'm not so sure about. It could be because your OS is one that
requires the signal handler to be reset in %SIG every time the signal is
caught. My example shows this, as does `perldoc perlipc`.

I will check on that. I forgot that I should have mentioned my OS (Linux).

Thanks!

Hal
 
