Kernel warnings about wait from Perl daemon

Simon Andrews

I'm running a small daemon written in Perl which starts off a load of
jobs and keeps track of when they've finished. It's been running for
ages and works well, but following an OS upgrade it's now spitting out
kernel warnings.

The script watches an Input directory for new files. If it finds any it
adds them to a queue and moves them to an output directory. It then
starts a separate processing routine. It keeps track of how many jobs
it is running and holds a queue for the rest.

Because I don't know what order the jobs will finish in, I set
$SIG{'CHLD'} = 'IGNORE' in the script to avoid picking up zombies and
dispense with wait. However, I am now getting kernel warnings saying:

Jan 15 15:19:20 bilin3 kernel: application bug: cgi_blast_demon(2418)
has SIGCHLD set to SIG_IGN but calls wait().

I've cut the script down as much as I can whilst still preserving its
basic structure. Can anyone see what would be calling wait here to
cause this error? Could I be getting this if the exec'd script calls
wait instead (it does, as it uses a system command)?

Any help is appreciated.

Simon.

#!/usr/bin/perl -w
use strict;

$SIG{CHLD} = 'IGNORE';

my %pids;
my @queue;

########################################
####### MAIN LOOP ######################
########################################

# We're a daemon so we should loop forever!
while (1) {

    my @inputlist = `ls -1tr /data/temp/www/blast/Input`;

    chomp @inputlist;

    foreach my $id (@inputlist) {
        push(@queue, $id);
    }

    # Now check our existing processes are running
    foreach my $pid (keys(%pids)) {
        delete $pids{$pid} unless (kill(0, $pid));
    }

    while (scalar @queue) {

        # We should start a new job off and add its pid
        # to the %pids hash

        my $new_job = shift @queue;

        rename("/data/Input/$new_job", "/data/Output/$new_job")
            || die "Can't move $new_job to Output dir: $!";

        my $new_pid = fork;
        die "Can't fork for $new_job: $!" unless defined $new_pid;

        if ($new_pid) {
            # We're the parent, so we add the pid to the
            # list of running PIDs
            $pids{$new_pid} = $new_job;
        }
        else {
            # We're the child
            # Move to the correct directory
            chdir("/data/Output/$new_job")
                || die "Can't move to /data/Output/$new_job: $!";
            exec("/usr/local/bin/actually_do_job.pl", $new_job)
                || die "Can't exec actually_do_job.pl: $!";
        }
    } # End launch new jobs

    sleep 3;
}
 
Brian McCauley

Simon Andrews said:
However I am now getting kernel warnings saying:

Jan 15 15:19:20 bilin3 kernel: application bug: cgi_blast_demon(2418)
has SIGCHLD set to SIG_IGN but calls wait().

I've cut the script down as much as I can whilst still preserving its
basic structure. Can anyone see what would be calling wait here to
cause this error?
$SIG{CHLD} = 'IGNORE';
my @inputlist = `ls -1tr /data/temp/www/blast/Input`;

Backticks set $? which means they must call wait().
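
A minimal sketch of that point, assuming a Unix-like system. The child commands run through $^X are stand-ins invented for illustration, not part of the original daemon. The first call shows backticks collecting an exit status into $?. The second repeats it with SIGCHLD ignored, where the value left in $? depends on the platform, so the sketch only prints it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Start from the default disposition so the first result is predictable.
$SIG{CHLD} = 'DEFAULT';

# Backticks fork a child, read its output, then wait for it.
# The collected exit status ends up in $?.
my $out    = `$^X -e "print 42; exit 3"`;
my $status = $? >> 8;
print "default: output=$out status=$status\n";

# With SIGCHLD ignored the kernel may reap the child by itself, so the
# wait behind the backticks can find nothing left to collect.
my $raw_ignored;
{
    local $SIG{CHLD} = 'IGNORE';
    my $discarded = `$^X -e "exit 3"`;
    $raw_ignored = $?;
}
print "ignored: raw \$? = $raw_ignored\n";
```

Any construct that reports a child's status (backticks, system(), close on a piped filehandle) has to wait for that child in the same way.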

--
\\ ( )
. _\\__[oo
.__/ \\ /\@
. l___\\
# ll l\\
###LL LL\\
 
Ilya Zakharevich

[A complimentary Cc of this posting was sent to Brian McCauley], who wrote:
Backticks set $? which means they must call wait().

<pedantic>

True on legacy systems only, which *must* use fork() to start a child.

</pedantic>

[Of course, the system in question *is* legacy...]

Hope this does not hurt,
Ilya
 
Ben Morrow

Simon Andrews said:
Because I don't know what order the jobs will finish in I set
$SIG{'CHLD'} = 'IGNORE' in the script to avoid picking up zombies and
dispense with wait. However I am now getting kernel warnings saying:

Jan 15 15:19:20 bilin3 kernel: application bug: cgi_blast_demon(2418)
has SIGCHLD set to SIG_IGN but calls wait().

Set a proper SIGCHLD, then; something like

use POSIX qw/:sys_wait_h/;

$SIG{CHLD} = sub { 1 while 0 < waitpid -1, WNOHANG };
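
A slightly fuller sketch of the same idea, under some assumptions: the job names and exit codes are made up, and each child simply exits instead of exec'ing a real job script. The handler only records which PIDs have finished. The main loop then reconciles that record with %pids, which avoids a race where a child exits before the parent has stored its PID.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(:sys_wait_h);

my %pids;      # pid => job name, for jobs we have started
my %done;      # pid => exit status, filled in by the handler
my %result;    # job name => exit status, filled in by the main loop

$SIG{CHLD} = sub {
    local ($!, $?);    # don't clobber the main program's values
    while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
        $done{$pid} = $? >> 8;
    }
};

for my $n (0 .. 2) {
    my $pid = fork;
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        exit $n;       # child: stands in for exec'ing the real job
    }
    $pids{$pid} = "job_$n";
}

while (%pids) {
    for my $pid (keys %done) {
        next unless exists $pids{$pid};
        my $job = delete $pids{$pid};
        $result{$job} = delete $done{$pid};
        print "$job (pid $pid) finished with status $result{$job}\n";
    }
    sleep 1 if %pids;  # a real daemon would scan for new work here
}
```

The order of the printed lines depends on which child the kernel reports first.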

Ben
 
pkent

Ilya Zakharevich said:
<pedantic>
True on legacy systems only, which *must* use fork() to start a child.
</pedantic>

Slightly off-topic, but could you expand on this 'not using a fork to
start a child', or point me to a URL. Unless you're talking about
starting processes on Windows or MacOS which are a whole different way
of doing things.

P
 
Ilya Zakharevich

[A complimentary Cc of this posting was sent to pkent], who wrote:
Slightly off-topic, but could you expand on this 'not using a fork to
start a child', or point me to a URL. Unless you're talking about
starting processes on Windows or MacOS which are a whole different way
of doing things.

[This was blind-Cc-ed; please do not do this again. If you Cc a
newsgroup posting, explicitly mark it as such. Thanks.]

No OS I know is *consistently* designed using the baggage of knowledge
available, say, 20 years ago (probably the same is true if you replace
20 by 35; sigh). But in the domain of child startup most systems of
vintage later than 1970 use the same paradigm; actually, the same
paradigm underlies Perl's system() too. Unix is the notable
exception - the fact that its design was mostly a prank makes itself
known...

Hope this helps,
Ilya
 
Simon Andrews

Brian said:
Backticks set $? which means they must call wait().
Yes! That's it. I should have known that my moment of laziness of
using a shell to get a list of files would come back and bite me :)

I replaced the above line with:

my @rawlist = </data/temp/www/blast/Input/*>;

chomp @rawlist;

@rawlist = sort {(stat($a))[9] <=> (stat($b))[9]} @rawlist;

# Now remove the path from the list to
# leave just the job ids
my @inputlist;

foreach (@rawlist){
s/^.*\///;
push @inputlist,$_;
}

...and all the warnings went away.

Cheers.

Simon.
 
Michele Dondi

I replaced the above line with:

my @rawlist = </data/temp/www/blast/Input/*>;

chomp @rawlist;

For heaven's sake, why?!?

You may well have filenames containing \n as the last char, but
*then*...
@rawlist = sort {(stat($a))[9] <=> (stat($b))[9]} @rawlist;

...stat() would either fail (if the "chomp()ed file" doesn't exist -
most probably!) or stat the wrong file.

# Now remove the path from the list to
# leave just the job ids

IMNSHO (wrt this particular topic) you should avoid redundant comments
to self-explanatory code.
my @inputlist;

foreach (@rawlist){
s/^.*\///;
push @inputlist,$_;
}

You could save yourself the annoyance of quoting "/" by e.g.

s|.*/||

*Personally*, I'd rather do

my @inputlist=map { (my $tmp=$_) =~ s|.*/||; $tmp } @rawlist;

that's because the semantic significance of map() is straightforward
to me, but that is a personal thing, I guess.

Also, you may want to use File::Basename and then the above line would
become:

my @inputlist=map basename($_), @rawlist;
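
Putting the pieces from this subthread together, here is a hedged sketch of listing a directory oldest-first without shelling out to ls. The temporary directory and the three job names are invented so the example is self-contained, and it assumes the directory path contains no whitespace (glob splits its pattern on spaces). Each file is stat()ed once rather than on every comparison.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename qw(basename);
use File::Temp qw(tempdir);

# Build a throwaway input directory with known modification times.
my $dir = tempdir(CLEANUP => 1);
my $now = time;
my %age = (job_old => 300, job_mid => 200, job_new => 100);
for my $name (keys %age) {
    open my $fh, '>', "$dir/$name" or die "Can't create $dir/$name: $!";
    close $fh;
    my $mtime = $now - $age{$name};
    utime $mtime, $mtime, "$dir/$name" or die "Can't set mtime: $!";
}

# Pair each path with its mtime, sort on the mtime, keep the basename.
my @inputlist =
    map  { basename($_->[0]) }
    sort { $a->[1] <=> $b->[1] }
    map  { [ $_, (stat $_)[9] ] }
    glob "$dir/*";

print "@inputlist\n";    # job_old job_mid job_new
```

No chomp is needed because glob returns the names as they are, and no child process is started, so nothing has to be waited for.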


HTH,
Michele
 
pkent

Ilya Zakharevich said:
No OS I know is *consistently* designed using the baggage of knowledge
available, say, 20 years ago (probably the same is true if you replace
20 by 35; sigh). But in the domain of child startup most systems of
vintage later than 1970 use the same paradigm; actually, the same
paradigm underlies Perl's system() too.

Well, I'm interested in finding out how some OSes start processes
_without_ fork() or some similar paradigm, and (now I think of it)
whether the parent/child relationship exists in all OSes. As you point
out, the system()/fork()/exec() functions are in perl even on systems
like MacOS which might have some totally different approach to spawning
processes (or they might not, I don't know for sure).

P
 
John W. Kennedy

pkent said:
Well, I'm interested in finding out how some OSes start process
_without_ fork() or some similar paradigm,

Outside of Unix, a new "process" or "task" is normally invoked in an
opsys function that combines new-process with call semantics. (Which,
of course, most Unix forks end up doing, anyway.) Load-callable-binary
semantics are often included in the same function. Most people with
experience on other opsys's tend to regard the Unix fork when they meet
it as somewhere between crufty and obscene.

And, indeed, coming from a non-Unix background (IBM mainframe), myself,
about the only good thing I see about fork is that it's much easier to
emulate non-Unix process-creating semantics using fork than to emulate
fork from the other operating systems.

--
John W. Kennedy
"But now is a new thing which is very old--
that the rich make themselves richer and not poorer,
which is the true Gospel, for the poor's sake."
-- Charles Williams. "Judgement at Chelmsford"
 
Anno Siegel

John W. Kennedy said:
Outside of Unix, a new "process" or "task" is normally invoked in an
opsys function that combines new-process with call semantics. (Which,
of course, most Unix forks end up doing, anyway.) Load-callable-binary
semantics are often included in the same function. Most people with
experience on other opsys's tend to regard the Unix fork when they meet
it as somewhere between crufty and obscene.

My reaction was different. Process creation (and annihilation) used to
be something the system has ways of doing, never mind how it does it.

Unix shows how that operation can be reduced to two more elementary
operations: process cloning (fork), and process mutation (exec).
Only the creation of the initial process is the OS's secret, everything
else happens in user space.

Somehow it adds to the satisfaction that fork() and exec(), looked at
in isolation, don't appear to do anything particularly useful.

fork()? So it creates another process, just like the first one. At
best they'll do the same thing twice, at worst they'll run over each
other and make a mess.

exec()? Why, that's Basic's "CHAIN" at OS level, it won't ever return
to the caller. Okay, may occasionally come in handy.

Together they turn out to make up a perfectly general system of process
management.
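
As a minimal sketch of that combination on a Unix-like system: the parent clones itself with fork, the child mutates into another program with exec, and the parent collects the result with waitpid. The program being exec'ed here is just another perl that exits with status 7, chosen purely for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $pid = fork;                       # clone the current process
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child: replace this process image with a different program.
    # exec only returns if it fails.
    exec $^X, '-e', 'exit 7' or die "exec failed: $!";
}

# Parent: wait for that specific child and pick up its exit status.
waitpid $pid, 0;
my $status = $? >> 8;
print "child $pid exited with status $status\n";
```

Perl's system() is roughly these three steps rolled into one call.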

Anno
 
Ilya Zakharevich

[A complimentary Cc of this posting was sent to Anno Siegel], who wrote:
My reaction was different. Process creation (and annihilation) used to
be something the system has ways of doing, never mind how it does it.

Unix shows how that operation can be reduced to two more elementary
operations: process cloning (fork), and process mutation (exec).

The major drawback of this approach is that process cloning is a much
more complicated operation than new process creation.

What TomC said (when he was still active here): the only reason to
have the fork()/exec() pair was laziness:

with the (initial?) implementation of Unix process architecture the
part of the "process state" information which should be inherited
from parent to child was not kept in one tight C structure. Thus it
was easier to clone the *whole* process space than to chase all
these pieces of the state and copy them into a new "process header".

Do not know how true TomC's narrative was (or how true my recollection is ;-).

Hope this helps,
Ilya
 
Ben Morrow

Ilya Zakharevich said:
[A complimentary Cc of this posting was sent to Anno Siegel], who wrote:
My reaction was different. Process creation (and annihilation) used to
be something the system has ways of doing, never mind how it does it.

Unix shows how that operation can be reduced to two more elementary
operations: process cloning (fork), and process mutation (exec).

The major drawback of this approach is that process cloning is a much
more complicated operation than new process creation.

This was true, but once you've got a proper virtual memory system
implementing CoW is not really hard. The advantages, both in terms of
being able to multi-process where with other models you would have to
multi-thread and in being able to make arbitrary modifications to a
new process's working environment before calling exec(), are huge.
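
A small sketch of what "arbitrary modifications before exec()" can look like in Perl, assuming a Unix-like system. The temporary directory, the JOB_ID variable and the job.log file name are all hypothetical. Between fork and exec the child changes directory, sets an environment variable and redirects its STDOUT, and none of that touches the parent.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);

my $pid = fork;
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child only: adjust the working environment, then exec.
    chdir $dir                     or die "Can't chdir to $dir: $!";
    $ENV{JOB_ID} = 'job42';
    open STDOUT, '>', 'job.log'    or die "Can't redirect STDOUT: $!";
    exec $^X, '-e', 'print "running $ENV{JOB_ID}\n"'
        or die "exec failed: $!";
}

waitpid $pid, 0;

# The parent's cwd, %ENV and STDOUT are unchanged; read the child's log.
open my $fh, '<', "$dir/job.log" or die "Can't read log: $!";
my $log = <$fh>;
close $fh;
print "child wrote: $log";
```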
What TomC said (when he was still active here): the only reason to
have the fork()/exec() pair was laziness:

with the (initial?) implementation of Unix process architecture the
part of the "process state" information which should be inherited
from parent to child was not kept in one tight C structure. Thus it
was easier to clone the *whole* process space than to chase all
these pieces of the state and copy them into a new "process header".

Do not know how true TomC's narrative was (or how true my recollection is ;-).

This would not surprise me at all... :)

Ben
 
Ilya Zakharevich

[A complimentary Cc of this posting was sent to Ben Morrow], who wrote:
This was true, but once you've got a proper virtual memory system
implementing CoW is not really hard.

Given that CoW has little use outside of this context, how is the
separation relevant? You also forget that CoW is just the tip of the
iceberg; a lot of things should be taken care of, like semaphores, opened
files, etc. (And after being taken care of, all this hard work should be
gone to ancestors on the next instruction, which is exec() :=[).

The advantages, both in terms of
being able to multi-process where with other models you would have to
multi-thread

This is an advantage only if you think of multi-process as an easier
model than multi-thread. [This is indeed so for toy programs;
however, with an increase of complexity of IPC, the distinction is
quickly erased.]
and in being able to make arbitrary modifications to a
new process's working environment before calling exec(), are huge.

Trust me, in *this* area the advantages are almost non-existent. I
have experience with both models...

Hope this helps,
Ilya
 
