parallel child processes not working properly

vkinger

Hi,

I have a problem where I am creating multiple parallel child processes
to connect to multiple servers. Each child process has its own
filehandle: it reads its start time from a text file at the time of the
fork and then writes to that file each time it runs. Each child process
runs a forever loop, so it never exits unless it is killed or the server
reboots, in which case it tries to reconnect. My problem is that it
works a couple of times and logs data from the different servers into
syslog, but on the 2nd or 3rd attempt (after being stopped by kill -9
<process id>) it only works with one server (it can be the 1st, 2nd or
3rd, as I am testing with only three servers at present). I am not sure
what is wrong; maybe I can get some help here from the experts.
Here is what I am doing.

sub fork_pids
{
    my @pids;
    my $i = 0;
    my $length = scalar(@lines);
    my $pm = new Parallel::ForkManager($length);
    print " Connecting to " , $length , " servers\n ";
    for $i ( 1 .. $length )
    {
        print "spawning child processes $i\n";
        $pm->start and next; #do the fork
        print "Inside the child process\n";
        &GetEventData($i-1);
        $pm->finish;
    }
    $pm->wait_all_children;
}

Here I am checking the length to see how many servers are configured in
the config.txt file. GetEventData is my function that takes an index
identifying the server and calls the forever-loop function. The same
index is used to create another text file inside this function to
record the timestamp associated with the data. Each server essentially
does the same thing and logs into syslog. It works once or twice and
then works with only one server.
 
xhoster

> Hi,
>
> I have a problem where I am creating multiple parallel child processes
> to connect to multiple servers. Each child process has a different
> filehandle to read starttime from a text file at the time of fork and
> then write into this text file each time it runs. Each child process
> has a forever loop so that it never exits unless killed or reboot
> occurred at server and then it tries to reconnect.

How does the no longer existent child process try to reconnect?

> My problem is it
> works for couple of times and logs data from different servers into
> syslog but on 2nd or 3rd attempt(after being stopped by kill -9
> <process id>)

What is killed by kill -9, the parent or the child or something else? And
why -9?

> it only works with the one server( it can be 1st or 2nd
> or 3rd as I am testing with only three servers at present)

What do you mean by "works" in this case?

> I am not
> sure what is wrong, may be I can use some help here from the experts.
> Here is what I am doing.
>
> sub fork_pids
> {
> my @pids;

This variable doesn't seem to be used.

> my $i = 0;

Premature declaration, no?

> my $length = scalar(@lines);
> my $pm = new Parallel::ForkManager($length);
> print " Connecting to " , $length , " servers\n ";
> for $i ( 1 .. $length )
> {
> print "spawning child processes $i\n";

How many of these get printed?

> $pm->start and next; #do the fork
> print "Inside the child process\n";

How many of these get printed?

> &GetEventData($i-1);
> $pm->finish;
> }
> $pm->wait_all_children;
> }

Xho
 
J. Gleixner

> Hi,
>
> I have a problem where I am creating multiple parallel child processes
> to connect to multiple servers. [...] My problem is it
> works for couple of times and logs data from different servers into
> syslog but on 2nd or 3rd attempt(after being stopped by kill -9
> <process id>) it only works with the one server( it can be 1st or 2nd
> or 3rd as I am testing with only three servers at present) I am not
> sure what is wrong, may be I can use some help here from the experts.
> Here is what I am doing.
>
> sub fork_pids
> {
> my @pids;

@pids isn't used.

> my $i = 0;

No need for that.

> my $length = scalar(@lines);

No need for 'scalar'.

> my $pm = new Parallel::ForkManager($length);
> print " Connecting to " , $length , " servers\n ";
> for $i ( 1 .. $length )

Make that: for my $i ( 1 .. $length )

> {
> print "spawning child processes $i\n";
> $pm->start and next; #do the fork
> print "Inside the child process\n";
> &GetEventData($i-1);

Probably want to call GetEventData( $i - 1 ) there (without the &).

> $pm->finish;
> }
> $pm->wait_all_children;
> }

Nothing wrong with that, so it must be in another part of your code.
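
For what it is worth, the suggestions above could be combined roughly as follows. This is only a sketch under assumptions: it uses plain fork and waitpid rather than Parallel::ForkManager so that it runs with core Perl alone, and fork_workers and fake_worker are hypothetical names (fake_worker stands in for GetEventData and returns instead of looping forever). It also records each child's exit code, which may help show which child stops early.

```perl
use strict;
use warnings;

# Hypothetical stand-in for GetEventData: the real one loops forever,
# this one just returns an exit code so the sketch terminates.
sub fake_worker {
    my ($index) = @_;
    return $index == 2 ? 1 : 0;    # pretend the third server is unreachable
}

# Fork one child per server and return a hash ref of { index => exit code }.
sub fork_workers {
    my ($count, $worker) = @_;
    my %index_for_pid;
    for my $i (1 .. $count) {      # lexical loop variable, no earlier "my $i"
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {           # child: run the worker, exit with its code
            exit $worker->($i - 1);
        }
        $index_for_pid{$pid} = $i - 1;    # parent remembers which child is which
    }
    my %exit_code_for;
    while ((my $pid = waitpid(-1, 0)) > 0) {    # reap every child
        next unless exists $index_for_pid{$pid};
        $exit_code_for{ $index_for_pid{$pid} } = $? >> 8;
    }
    return \%exit_code_for;
}

my $result = fork_workers(3, \&fake_worker);
printf "child %d exited with code %d\n", $_, $result->{$_}
    for sort keys %$result;
```

With Parallel::ForkManager itself, the run_on_finish callback should give the parent the same kind of per-child exit information.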
 
vkinger

> How does the no longer existent child process try to reconnect?

> What is killed by kill -9, the parent or the child or something else? And
> why -9?

From Cygwin I run ps -ef and then kill -9 <process id>; -9 is the flag
that forces the process to die. The result is the same even if I press
Ctrl-C.

> What do you mean by "works" in this case?

It works once or twice, meaning I can see the events coming from
both servers, but most of the time it shows events coming from only
one server and the other server just shuts up.

> This variable doesn't seem to be used.

> premature declaration, no?

> How many of these get printed?

I am working with one server and opening two connections to it with two
child processes. The third server configured in the text file has a
wrong address, to test the case where the server address is wrong: the
child should get an open-failed condition and exit. Here is what I get:

Connecting to 3 servers
spawning child processes 1
spawning child processes 2
Inside the child process
<Here it calls the GetEventData(0) function and stops just before
entering the forever loop>
Inside the child process
spawning child processes 3
<Here it calls the GetEventData(1) function and stops just before
entering the forever loop>
Inside the child process
<Here it calls the GetEventData(2) function and stops just before
entering the forever loop; it also tells me here that it cannot
connect to the third server as the IP address is wrong (for testing I
gave it 1.2.3.4)>

> How many of these get printed?

Here is the layout of my program:

sub GetEventData
{
    my $i = shift; # index into the config.txt file, passed from the fork below
    $s = Net::SDEE->new(
        returnXML => 1,
        debug => 1,
        # xmlCallback is a huge function I created to log events into syslog
        # and create a text file recording the timestamp of each event from
        # the server, so that next time the server starts from that time
        callback => \&xmlCallback,
        debug_callback => \&local_debug
    );
    $s->Server($items[0]);
    $s->Port($items[1]);
    $s->Username($items[2]);
    $s->Password($items[3]);
    my $subs = Net::SDEE::Subscription->new();
    $subs->maxNbrOfEvents(30);
    $subs->startTime($lasttimearray[$i]);
    $s->open($subs);
    for(;;)
    {
        print "Getting events $$ and line $i\n";
        $s->get($varsubid); # varsubid is a global variable
    }
}
sub fork_pids
{
    my $length = @lines;
    my $pm = new Parallel::ForkManager($length);
    print " Connecting to " , $length , " servers\n ";
    for $i ( 1 .. $length )
    {
        print "spawning child processes $i\n";
        $pm->start and next; #do the fork
        print "Inside the child process\n";
        GetEventData($i-1);
        $pm->finish;
    }
    $pm->wait_all_children;
}
 
xhoster

> From cygwin I do ps -ef and then use kill -9 <process id>, -9 is flag
> to force it to die, the result is same even if I press ctrl-c.

So once the program has been running, you kill it with Ctrl-C, and
then it starts acting up the next time you run it. What does it take to
"clear" this problem so that you can run it again? A machine reboot? Or
if you just leave it for a few minutes, will it start working? Also,
aren't the children still running? At this point, the parent wasn't doing
anything but waiting on the children anyway, so the death of the parent
should have no effect--the children should still be running the same as
before. But that is from my experience on Linux, since you are using
cygwin then that very well may not apply.
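
If lingering children turn out to be the issue, one option is to have the parent forward the signal to them before it exits. A rough sketch only, assuming a Unix-like signal model (Cygwin may behave differently); start_children and stop_children are hypothetical names, and the idle loop stands in for the forever loop. Note that kill -9 (SIGKILL) cannot be caught, so this only helps with Ctrl-C or a plain kill.

```perl
use strict;
use warnings;

my @children;    # pids of the children the parent has started

# Send a catchable signal to every child, reap them, and return how many.
sub stop_children {
    my ($signal) = @_;
    kill $signal, @children if @children;
    waitpid($_, 0) for @children;
    my $stopped = scalar @children;
    @children = ();
    return $stopped;
}

# Start $count children that idle until they are signalled.
sub start_children {
    my ($count) = @_;
    for my $i (1 .. $count) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            # child: drop the handlers inherited from the parent
            $SIG{INT} = $SIG{TERM} = 'DEFAULT';
            sleep 1 while 1;    # stand-in for the forever loop
        }
        push @children, $pid;
    }
    return scalar @children;
}

# Ctrl-C (INT) and a plain kill (TERM) reach this handler in the parent.
$SIG{INT} = $SIG{TERM} = sub { stop_children('TERM'); exit 1 };

start_children(2);
print "started ", scalar(@children), " children\n";
print "stopped ", stop_children('TERM'), " children\n";
```

After a kill -9 of the parent, checking ps -ef for leftover children before the next run would show whether old processes are still holding connections.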

> it works once or twice, that means I can see the events coming from
> the both servers but most of the time it show events coming from only
> one server and the other server just shuts up:

What do you see in the way of internal debugging information?

> I am working one server and opening two connections with two child
> processes and third server configured in the text file has a wrong
> address, to handle a condition where the server address is wrong, get
> a open failed condition and exit that child process. here is what I
> get:

(To clean it up, I snipped some editorial comments from the below)

> Connecting to 3 servers
> spawning child processes 1
> spawning child processes 2
> Inside the child process
> Inside the child process
> spawning child processes 3
> Inside the child process

Do you see this both when your code "works" and when your code "doesn't
work"? If so, then the problem probably lies in the part of your code that
uses Net::SDEE, rather than here. If not, then what does it print when the
code doesn't work?

Xho
 
vkinger

I got it to work; it seems my code was fine. The problem turned out to
be in converting the .pl file into an .exe file with the perl2exe
utility. The .pl did exactly what I was looking for, whereas the .exe
choked! I do need the exe, so I am looking into fixing it.
 
