DJ Stunks
Hey all,
I have a question about performing two long-running selects from a
database in parallel. I've only written a few scripts which do things
in parallel so I'm not an expert by any means.
I have two straightforward SELECT statements, but both take about two
minutes to complete. I'd like to run them in parallel, but I obviously
need access to all the rows - what's the best way to do so?
I was thinking something along these lines (pseudocode below), but I was
hoping there would be some way to give the parent access to the
statement handle itself, so it could pull the rows once the queries were
complete, rather than pulling all the rows in the child, serializing
them, and passing them to the parent.
Any ideas, or maybe modules which could be handy? (I looked at both
Acme::Spork and Parallel::ForkManager, but neither is appropriate.)
TIA,
-jp
#!/usr/bin/perl <pseudocode>
use strict;
use warnings;
my @kids = (
    { query => 'select * from big_table' },
    { query => 'select * from another_big_table' },
);
for my $kid (@kids) {
    # '-|' forks and ties the child's STDOUT to $fh in the parent
    # ('|-' would point the pipe the other way, parent writing to child)
    my $pid = open( my $fh, '-|' );
    die "Can't fork: $!\n" if not defined $pid;
    if ( $pid == 0 ) {    # I'm one of the children
        # connect to the db
        # prepare query
        # execute query
        # wait for results
        # while ( my @row = $sth->fetchrow_array ) {
        #     print join( $;, @row ), "\n";
        # }
        exit;    # (exit or waitpid? do I know the parent read everything?)
    }
    # only the parent gets here, so the child never forks again
    @{ $kid }{ 'pid', 'handle' } = ( $pid, $fh );
}
# I'm the parent
# while ( my $line = readline $kids[0]{handle} ) {
#     chomp $line;
#     my @row = split $;, $line;
#     do whatever with @row
# }
# while ( my $line = readline $kids[1]{handle} ) {
#     etc.
# }
__END__
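A runnable sketch of the fork-and-pipe idea above, with the database calls replaced by a stub so it runs without DBI. The names `fake_query` and `run_parallel` are hypothetical, invented for illustration; a real version would do the connect/prepare/execute/fetch steps where `fake_query` is called. It also assumes no column value contains a newline or the separator character.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Field separator for the pipe; "\x1C" is the default value of $;
my $SEP = "\x1C";

# Hypothetical stand-in for connect/prepare/execute/fetch.
# Returns a list of array refs, one per row.
sub fake_query {
    my ($query) = @_;
    return map { [ $_, "row $_ of <$query>" ] } 1 .. 3;
}

# Fork one child per query. Each child prints its rows to STDOUT,
# which the parent reads through the pipe set up by open(..., '-|').
sub run_parallel {
    my @kids = map { +{ query => $_, rows => [] } } @_;

    for my $kid (@kids) {
        my $pid = open( my $fh, '-|' );
        die "Can't fork: $!\n" if not defined $pid;

        if ( $pid == 0 ) {    # child: STDOUT is the pipe to the parent
            for my $row ( fake_query( $kid->{query} ) ) {
                print join( $SEP, @$row ), "\n";
            }
            exit 0;           # flushes STDOUT and closes the pipe
        }

        # parent: remember the child and keep looping
        @{$kid}{ 'pid', 'handle' } = ( $pid, $fh );
    }

    # All children are running by now; drain them one after another.
    for my $kid (@kids) {
        my $fh = $kid->{handle};
        while ( my $line = <$fh> ) {
            chomp $line;
            push @{ $kid->{rows} }, [ split /\Q$SEP\E/, $line, -1 ];
        }
        # close on a piped handle waits for the child to exit
        close $fh or warn "child $kid->{pid} exited with status $?\n";
    }
    return @kids;
}

for my $kid ( run_parallel( 'select * from big_table',
                            'select * from another_big_table' ) ) {
    printf "%s: %d rows\n", $kid->{query}, scalar @{ $kid->{rows} };
}
```

In this sketch the parent sees end-of-file only after the child has written everything and exited, and `close` on the piped handle then reaps the child, so no separate `waitpid` is needed. The parent drains the children sequentially; a child whose output fills the pipe simply blocks in `print` until the parent gets to it.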