parallel processing

Stuart Kendrick

Hi,

I'm not quite sure what 'feature' i'm looking for ... any input
appreciated.

I want to parallelize a particular task.

#!/usr/bin/perl -w
use strict;
my @target;
our @result;

for (my $i = 0; $i < @target; $i++) {
    $result[$i] = &do_some_work($target[$i]);
}
&report_results;
....

&do_some_work requires a minute or so to complete. @target contains
several hundred elements. Therefore, total execution time runs in the
hundreds of minutes.

Also, @target is not ordered ... e.g. there are no dependencies within
@target ... if &do_some_work finishes processing $target[159] before
it starts (or finishes) $target[17], no problems.

I figure that if i could find a way to spawn lots of copies of
&do_some_work ... that i could reduce total execution time. Assuming
that my machine has sufficient resources, I might even get total
execution time down to a minute or so. This would be a major win for
me -- I would like this app to complete within ten minutes at the
outside.

What Perl 'feature' should I explore to do this? Am I walking into
'threads' here?

--sk

Stuart Kendrick
FHCRC
 
Eric Schwartz

> I'm not quite sure what 'feature' i'm looking for ... any input
> appreciated.
>
> I want to parallelize a particular task.
>
> #!/usr/bin/perl -w
> use strict;

You should probably prefer 'use warnings;' to the -w flag these days.
I still use -w, but it's mostly finger macros I haven't retrained yet.

> my @target;
> our @result;
>
> for (my $i = 0; $i < @target; $i++) {
> $result[$i] = &do_some_work($target[$i]);
> }
> &report_results;
> ...

Ack, don't *do* that. Specifically, don't call subs with &. See
perlfaq7, "What's the difference between calling a function as &foo
and foo()?"

You can probably get away with just fork()ing inside do_some_work()
(note lack of '&'). 'perldoc -f fork' should give you the skinny.
See also perlipc for a slightly broader view.
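
A minimal sketch of the fork() route, for reference. It forks around
each call rather than inside do_some_work(), and it assumes results
come back to the parent over a pipe, since a forked child cannot write
into the parent's @result. The contents of @target and the body of
do_some_work() are placeholders, not the original poster's code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Placeholder targets and worker; the real ones come from the original script.
my @target = qw(alpha beta gamma);
sub do_some_work { my ($t) = @_; return uc $t }

my (@result, %reader);

for my $i (0 .. $#target) {
    pipe(my $r, my $w) or die "pipe failed: $!";
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {                 # child: do the work, report, exit
        close $r;
        print {$w} do_some_work($target[$i]);
        close $w;
        exit 0;
    }

    close $w;                        # parent: keep only the read end
    $reader{$i} = $r;
}

# Collect one result per child, then reap the children.
for my $i (sort { $a <=> $b } keys %reader) {
    my $fh = $reader{$i};
    local $/;                        # slurp everything the child wrote
    $result[$i] = <$fh>;
    close $fh;
}
1 while wait() != -1;

print "$target[$_] => $result[$_]\n" for 0 .. $#target;
```

This starts every child at once, which may be too many for several
hundred targets; capping the number of simultaneous children is left
out of the sketch.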

-=Eric
 
Ben Morrow

> I want to parallelize a particular task.
>
> #!/usr/bin/perl -w
> use strict;
> my @target;
> our @result;

Why 'our'?

> for (my $i = 0; $i < @target; $i++) {

for my $i (0..$#target) {

or, better,

push @result, do_some_work($_) for @target;

> $result[$i] = &do_some_work($target[$i]);

Don't call subs with &.

> }
> &report_results;
> ...
>
> &do_some_work requires a minute or so to complete. @target contains
> several hundred elements. Therefore, total execution time runs in the
> hundreds of minutes.
>
> Also, @target is not ordered ... e.g. there are no dependencies within
> @target ... if &do_some_work finishes processing $target[159] before
> it starts (or finishes) $target[17], no problems.
>
> I figure that if i could find a way to spawn lots of copies of
> &do_some_work ... that i could reduce total execution time.

This will only help if either your machine has more than one processor
or do_some_work spends time doing nothing: say, waiting for results
from the network. If the task is pure computation, multi-threading on
a single-processor machine will increase the time taken to
complete, due to threading overheads.

> Assuming that my machine has sufficient resources, I might even get
> total execution time down to a minute or so. This would be a major
> win for me -- I would like this app to complete within ten minutes
> at the outside.
> What Perl 'feature' should I explore to do this?
> Am I walking into 'threads' here?

Yup. Probably 'async'. Make sure you are using a post-5.8.0 perl, and
read perldoc perlthrtut. If your tasks really are independent, about
the only tricky bit should be making sure all the threads have
finished before reporting the results.
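
A rough sketch of the 'async' approach, assuming a perl built with
thread support. The targets and the body of do_some_work() are
placeholders. Calling join() on every thread is what takes care of the
"all threads finished before reporting" part.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use threads;

# Placeholder targets and worker, standing in for the real ones.
my @target = qw(alpha beta gamma);
sub do_some_work { my ($t) = @_; return length $t }

# Start one thread per target; async returns a thread object right away.
my @thread = map { my $t = $_; async { do_some_work($t) } } @target;

# join() blocks until that thread is done and hands back its return value,
# so @result is complete (and in @target order) once this line finishes.
my @result = map { $_->join } @thread;

print "$target[$_] => $result[$_]\n" for 0 .. $#target;
```

Results come back through join() here, so nothing needs to be shared
between threads.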

Ben
 
Simon Taylor

Hello Stuart,

> I'm not quite sure what 'feature' i'm looking for ... any input
> appreciated.
>
> I want to parallelize a particular task.

Have a look at the documentation for the Parallel::ForkManager
module; we've used it to great effect for certain tasks.

Hope this helps,

Simon Taylor
 
Gunnar Hjalmarsson

Stuart said:
> I want to parallelize a particular task.

Forking multiple child processes is very easily done with the help of
the CPAN module Parallel::ForkManager.
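
A minimal sketch of how that might look, assuming Parallel::ForkManager
is installed from CPAN and is recent enough to pass data back through
finish(). The targets, the worker body, and the limit of 20 children
are placeholders chosen for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Parallel::ForkManager;

# Placeholder targets and worker, standing in for the real ones.
my @target = qw(alpha beta gamma);
sub do_some_work { my ($t) = @_; return uc $t }

my @result;
my $pm = Parallel::ForkManager->new(20);   # at most 20 children at a time (arbitrary)

# Runs in the parent each time a child exits; the sixth argument is the
# reference that child handed to finish().
$pm->run_on_finish(sub {
    my ($pid, $exit_code, $ident, $signal, $core_dump, $data) = @_;
    $result[$ident] = $$data if defined $data;
});

for my $i (0 .. $#target) {
    $pm->start($i) and next;               # parent: move on to the next target
    my $out = do_some_work($target[$i]);   # child: do the work
    $pm->finish(0, \$out);                 # child: exit, sending $out to the parent
}
$pm->wait_all_children;

print "$target[$_] => $result[$_]\n" for 0 .. $#target;
```

The index passed to start() comes back as $ident, which is how each
result lands in the right slot of @result.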
 
Gregory Toomey

It was a dark and stormy night, and Stuart Kendrick managed to scribble:
> Hi,
>
> I'm not quite sure what 'feature' i'm looking for ... any input
> appreciated.
>
> I want to parallelize a particular task.

Depending on the task, it may not run any faster unless you have more than 1 CPU.

gtoomey
 
Stuart Kendrick

thanx for all the input. turns out that the parallel processes needed
read/write access to data structures within the main process ... so i
used threads and threads::shared. thanx also for the stylistic
pointers ... i'm pulling out & and -w from my scripts now.
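
for anyone landing here later, a rough sketch of the threads plus
threads::shared pattern, one thread per item writing into a shared
hash. this is not the linked script: the switch names, %status, and
disable_switch() are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use threads;
use threads::shared;

my @switch = qw(sw-a sw-b sw-c);       # hypothetical switch names
my %status :shared;                    # one hash, visible to every thread

sub disable_switch {                   # stand-in for the real per-switch work
    my ($name) = @_;
    lock(%status);                     # held until the end of this sub
    $status{$name} = 'disabled';
    return;
}

# One thread per switch, then wait for all of them before reporting.
my @thread = map { threads->create(\&disable_switch, $_) } @switch;
$_->join for @thread;

print "$_: $status{$_}\n" for sort keys %status;
```

without the :shared attribute each thread would get its own private
copy of %status and the main thread would see none of the updates.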

i'm pleased with the result ...
http://www.skendric.com/device/Cisco/shutdown-network ... a script
which disables the access layer of our network in about a minute,
thanx to the use of threading ... one thread per ethernet switch. i
hope we'll never use it ... but in the event of a catastrophic worm
infection, i'm going to be real grateful that i have this tool
available to me.

--sk

Stuart Kendrick
FHCRC
 
