Perl thread question

Eric

Hello,

I was given a task that requires me to launch multiple executions of
the same script using different args as input. Instead of waiting for
the first to complete before starting the second one, I need to run
them in their own thread (i.e. parallel).

Here is what I've done so far. The script I am executing is called
'log_on_off.exp', which is an Expect script that logs on, then off of
a console, which is provided as input to the command:

vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
use Config;
use threads;

my @cmdline = ("log_on_off.exp mach003-con",
               "log_on_off.exp mach022-con",
               "log_on_off.exp mach030-con");

foreach (@cmdline) {
    my $thr = threads->new(\&runExpectScript, $_);
}

sub runExpectScript {
    my $cmd = $_;
    my $runScript = `$cmd`;
}
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

After running this script, I can see that all three executions are
running as separate processes, which is what I would expect:

$ ps
PID TTY TIME CMD
2150 pts/1 00:00:23 bash
30151 pts/1 00:00:00 log_on_off.exp
30153 pts/1 00:00:00 log_on_off.exp
30155 pts/1 00:00:00 log_on_off.exp
30268 pts/1 00:00:00 ps
$

It turns out that I need the result of the subroutine to determine if
the login and logout were successful. So I tried using the join function
as follows:

vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
use Config;
use threads;

my @cmdline = ("log_on_off.exp mach003-con",
               "log_on_off.exp mach022-con",
               "log_on_off.exp mach030-con");

foreach (@cmdline) {
    my $thr = threads->new(\&runExpectScript, $_);
    my $retResponse = $thr->join;
}

sub runExpectScript {
    my $cmd = $_;
    my $runScript = `$cmd`;

    if ($runScript =~ m/successful/) {
        print "SUCCESS\n";
    } else {
        print "FAILURE\n";
    }
}
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

When I run this Perl script in the background, I never see more than
one log_on_off.exp process running at any time. According to the
document I am using as a learning tool
(http://perldoc.perl.org/perlthrtut.html), the join function waits for
the thread to exit
before continuing. I interpret this as saying that the second command
will not be started until the first returns, which defeats the purpose
of my using Perl threads to begin with, since these commands are being
executed serially (which is no different than running them in a loop
without even using Perl threads). Is my interpretation correct?

Is there any way I can execute these commands and parse the input in
their own thread instead of waiting for the previous command to
finish? Of course, if this can be done, I need to be able to determine
what thread returned what value.

Thanks in advance to all that respond.

Eric
 
J. Gleixner

Eric said:
Hello,

I was given a task that requires me to launch multiple executions of
the same script using different args as input. Instead of waiting for
the first to complete before starting the second one, I need to run
them in their own thread (i.e. parallel).

Maybe take a look at Parallel::ForkManager.
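
Parallel::ForkManager is a CPAN module built around fork, so the idea behind it can be sketched with core Perl alone. The following is not the module's API, only a minimal illustration of the process-based alternative to threads. It assumes the echo commands as stand-ins for log_on_off.exp, and the names %cmd_for_pid and %status_for are made up for this example.

```perl
use strict;
use warnings;

# Stand-in commands (assumption): echo replaces log_on_off.exp so the
# sketch runs without the Expect script.
my @cmdline = (
    "echo mach003-con login successful",
    "echo mach022-con login failed",
);

# Start one child process per command and remember which PID runs what.
my %cmd_for_pid;
foreach my $cmd (@cmdline) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ( $pid == 0 ) {
        # Child: run the command and report the verdict via the exit status.
        my $output = `$cmd`;
        exit( $output =~ /successful/ ? 0 : 1 );
    }
    $cmd_for_pid{$pid} = $cmd;    # parent moves on to the next command
}

# Reap the children in whatever order they finish.
my %status_for;
while ( ( my $pid = wait() ) != -1 ) {
    next unless exists $cmd_for_pid{$pid};
    $status_for{ $cmd_for_pid{$pid} } = $? >> 8;
}

foreach my $cmd (@cmdline) {
    print "$cmd: ", ( $status_for{$cmd} == 0 ? "SUCCESS" : "FAILURE" ), "\n";
}
```

The PID-to-command hash is what ties each result back to its task. An exit status only carries a small integer, so anything richer would need a pipe or a file.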
 
Eric

Eric said:
I was given a task that requires me to launch multiple executions of
the same script using different args as input. Instead of waiting for
the first to complete before starting the second one, I need to run
them in their own thread (i.e. parallel).
[...]

It turns out that I need the result of the subroutine to determine if
the login and logout were successful. So I tried using the join function
as follows:

[join-in-loop code snipped]

When I run this Perl script in the background, I never see more than
one log_on_off.exp process running at any time. According to the
document I am using as a learning tool
(http://perldoc.perl.org/perlthrtut.html), the join function waits for
the thread to exit
before continuing. I interpret this as saying that the second command
will not be started until the first returns, which defeats the purpose
of my using Perl threads to begin with, since these commands are being
executed serially (which is no different than running them in a loop
without even using Perl threads). Is my interpretation correct?

Yes. You're mixing up two different things here: the execution
of the threads and the collection of their results. In your example
you are in fact waiting for the result of one thread before starting
the next one, which makes the use of threads pointless.

If you want the threads to run in parallel, you have to start all of
them first and then collect all of them, for example:

my @thrdlist;

foreach ( 0 .. $#cmdline )
{
    $thrdlist[$_] = threads->new( \&runExpectScript, $cmdline[$_] );
}

my @responses;

foreach ( 0 .. $#cmdline )
{
    $responses[$_] = $thrdlist[$_]->join();
}

In that example the threads are all fired as quickly as Perl permits,
then the result collection loop runs until all of them are finished
and their results can be fetched.
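
For the collection loop to be useful, the thread's subroutine has to return something. Here is a self-contained sketch of the same start-all, then join-all pattern, assuming the worker returns its verdict instead of printing it. The echo commands are stand-ins for log_on_off.exp, and run_command is a name made up for this example.

```perl
use strict;
use warnings;
use threads;

# Stand-in commands (assumption): echo replaces log_on_off.exp so the
# sketch runs without the Expect script.
my @cmdline = (
    "echo mach003-con login successful",
    "echo mach022-con login failed",
    "echo mach030-con login successful",
);

# Worker: run one command and return a verdict to whoever joins the thread.
sub run_command {
    my $cmd    = shift;
    my $output = `$cmd`;
    return $output =~ /successful/ ? "SUCCESS" : "FAILURE";
}

# Phase 1: start every thread without waiting for any of them.
my @thrdlist;
foreach my $i ( 0 .. $#cmdline ) {
    $thrdlist[$i] = threads->create( \&run_command, $cmdline[$i] );
}

# Phase 2: collect the results; the index ties each result to its command.
my @responses;
foreach my $i ( 0 .. $#cmdline ) {
    $responses[$i] = $thrdlist[$i]->join();
    print "$cmdline[$i]: $responses[$i]\n";
}
```

Because each thread object is assigned to an array element, the thread runs in scalar context and join() hands back the single value the worker returned.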

Note that it hardly matters in what order the threads finish: if one
of the earlier threads takes longer than the later ones, the finished
threads simply wait until they are join()ed. So the only speed loss is
the execution time of the remaining loop iterations (fetching the
return value and destroying the thread).

Eric said:
Is there any way I can execute these commands and parse the input in
their own thread instead of waiting for the previous command to
finish? Of course, if this can be done, I need to be able to determine
what thread returned what value.

Threads can cause a lot of headaches, as I know from experience,
but usually things turn out to be a lot simpler than one would
have thought :)

By the way, I noticed that in your script you have

    sub runExpectScript {
        my $cmd = $_;

You surely want to use

    my $cmd = shift;

or

    my $cmd = $_[0];

here, since it's more of an accident that $_ holds the correct value at
this point; the correct place to look for subroutine args is @_.
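
A small sketch of the difference, without threads. The subroutine names from_args and from_topic and the "stale value" string are made up for the illustration.

```perl
use strict;
use warnings;

# Reads its argument from @_, the argument list of the call.
sub from_args {
    my $cmd = shift;
    return $cmd;
}

# Reads the global $_ and ignores the argument it was given.
sub from_topic {
    my $cmd = $_;
    return $cmd;
}

# Simulate leftover content in $_ from some earlier piece of code.
$_ = "stale value";

my @seen;
foreach my $command ( "first", "second" ) {
    # A named loop variable leaves $_ untouched, so from_topic() sees
    # whatever $_ held before the loop started.
    push @seen, [ from_args($command), from_topic($command) ];
    print "args: $seen[-1][0]  topic: $seen[-1][1]\n";
}
```

With a bare foreach (@list) loop, $_ happens to alias the current element, which is why the original subroutine appeared to work.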

-Chris

Thanks for your response, Chris. After I submitted my initial post, I
had a moment of enlightenment and tried the following:

vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
#!/usr/bin/perl

use strict;
use warnings;

use Config;
use threads;

$Config{useithreads} or die "Recompile Perl with threads to run this program.";

# Put all of the command line args in an array.
my @cmdline = ("log_on_off.exp mach003_con",
               "log_on_off.exp mach022_con",
               "log_on_off.exp mach030_con");

# Execute each command in the array in its own thread.
foreach (@cmdline) {
    my $thr = threads->new(\&runExpectScript, $_);
    sleep 1;
    print "Value of \$thr is: $$thr\n";
    my $ps = system("ps"); ##
}

# Join each thread in the list, waiting for it to finish.
for my $t (threads->list) {
    print "Value of \$t is: $$t\n"; ##
    $t->join;
}

sub runExpectScript {
    my $cmd = shift;
    print "The command line is: $cmd\n";
    my $runScript = `$cmd`;
    #print "The contents of \$runScript are: $runScript\n";
    if ($runScript =~ m/successful/) {
        print "SUCCESS: $_\n";
    } else {
        print "FAILURE: $_\n";
    }
}
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This seems to work. Here is the output I get:

vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
$ m.pl
The command line is: log_on_off.exp pdp003 pdp003-ilo
Value of $thr is: 140820824
PID TTY TIME CMD
2150 pts/1 00:00:24 bash
4194 pts/1 00:00:00 m.pl
4196 pts/1 00:00:00 log_on_off.exp
4235 pts/1 00:00:00 ps
The command line is: log_on_off.exp pdp022 pdp022-ilo
Value of $thr is: 141287696
PID TTY TIME CMD
2150 pts/1 00:00:24 bash
4194 pts/1 00:00:00 m.pl
4196 pts/1 00:00:00 log_on_off.exp
4237 pts/1 00:00:00 log_on_off.exp
4276 pts/1 00:00:00 ps
The command line is: log_on_off.exp osdc-pdp030 osdc-pdp030-ilo
Value of $thr is: 141228056
PID TTY TIME CMD
2150 pts/1 00:00:24 bash
4194 pts/1 00:00:00 m.pl
4196 pts/1 00:00:00 log_on_off.exp
4237 pts/1 00:00:00 log_on_off.exp
4278 pts/1 00:00:00 log_on_off.exp
4316 pts/1 00:00:00 ps
Value of $t is: 140820824
SUCCESS: log_on_off.exp pdp003 pdp003-ilo
Value of $t is: 141287696
SUCCESS: log_on_off.exp osdc-pdp030 osdc-pdp030-ilo
SUCCESS: log_on_off.exp pdp022 pdp022-ilo
Value of $t is: 141228056
[ecarlson@ecarlson-dev1 remboot]$
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

As you can see, it appears that the processes are all started and
running as separate processes, as I would expect. The results are not
given in any particular order (which is ok), but appear to be
associated with the correct process based on command line. (I tried
some failure conditions and found that they match up.) The main
difference between my approach and yours is that you went to the extra
effort to identify each process by a number, which is probably a more
reliable way than what I did. Do you feel that my approach will work
despite the fact that I didn't do this?

Thanks.

Eric
 
Eric

Eric wrote:

[fire-first-collect-later code snipped]




As you can see, it appears that the processes are all started and
running as separate processes, as I would expect. The results are not
given in any particular order (which is ok), but appear to be
associated with the correct process based on command line. (I tried
some failure conditions and found that they match up.) The main
difference between my approach and yours is that you went to the extra
effort to identify each process by a number, which is probably a more
reliable way than what I did. Do you feel that my approach will work
despite the fact that I didn't do this?

If you only need a few lines of text on the screen, I don't
see why not. You could also return a more complex data
structure that holds both the task identification and the result,
and avoid the counter that way (as always, TMTOWTDI[1]), if you
want to work on the results some more later on in your script.
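
One possible shape for such a data structure is a hash reference returned from each thread. This is a sketch only: the echo commands stand in for log_on_off.exp, and run_command, %result_for and the hash keys are names made up for the illustration.

```perl
use strict;
use warnings;
use threads;

# Stand-in commands (assumption): echo replaces log_on_off.exp so the
# sketch runs without the Expect script.
my @cmdline = (
    "echo mach003-con login successful",
    "echo mach022-con login failed",
);

# Worker: return a hash reference carrying both the task and its outcome.
sub run_command {
    my $cmd    = shift;
    my $output = `$cmd`;
    return {
        command => $cmd,
        ok      => ( $output =~ /successful/ ? 1 : 0 ),
        output  => $output,
    };
}

# Start all threads first; assigning to a scalar makes each thread
# return its value in scalar context.
my @threads;
foreach my $cmd (@cmdline) {
    my $thr = threads->create( \&run_command, $cmd );
    push @threads, $thr;
}

# Join afterwards and index the results by command for later use.
my %result_for;
foreach my $thr (@threads) {
    my $result = $thr->join();
    $result_for{ $result->{command} } = $result;
}

foreach my $cmd (@cmdline) {
    printf "%-40s %s\n", $cmd, $result_for{$cmd}{ok} ? "SUCCESS" : "FAILURE";
}
```

Since every result names its own command, the join order no longer matters and no counter is needed.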

But I see in your code that you're still using $_ in the subroutine
and laying traps for yourself (and I admit to not being completely
clean there in my own example). Especially with threads, where
things may happen in random order, it's always wise to stay away
from $_ as far as possible. Shift the arguments into variables in
the sub, and use a named iterator in the loops, like

foreach my $count ( 0 .. $#cmdline ) {
    do_something_with( $count );
}

or

foreach my $nextcommand ( @commandline ) {
    do_something_with( $nextcommand );
}

Otherwise, tracing back random errors may become a hell of a job once
the projects get a bit more complex.

-Chris

1) There's More Than One Way To Do It, the Perl(5?) philosophy.

Good point on the $_, Chris. I usually try to take the Perl shortcut
to doing things. But that may not always be the best approach, and
threads may be one case where it is not the wisest thing to do.

Eric
 
