Michele Dondi
Just copying this here from: <http://perlmonks.org/?node_id=638552>
(Please check the thread for the other replies. BTW: I'm *not* the
author of the post...)
I have two very long (>64k) strings of equal lengths - $s1 and $s2.
They are strings of bytes, meaning that any value from chr(0) to
chr(255) is legal. $s2, however, will not have any chr(0). $s1 may or
may not have any. What I need to do is look at each byte in $s1 and if
it is chr(0), replace it with the corresponding byte in $s2. So,
something like the following code:
    sub foo {
        my ($s1, $s2) = @_;
        my @s1 = split //, $s1;
        my @s2 = split //, $s2;
        foreach my $idx ( 0 .. $#s1 ) {
            if ( $s1[$idx] eq chr(0) ) {
                $s1[$idx] = $s2[$idx];
            }
        }
        return join '', @s1;
    }
foo() could return the resulting string or it could modify $s1 in
place. If foo() returns $s1, I'm going to be doing $s1 = foo( $s1, $s2
); in all cases.
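To make the in-place option concrete, here is a minimal sketch on a four-byte example. The name foo_inplace and the choice to pass a reference to the first string are assumptions made for illustration; the post leaves the interface open.

```perl
use strict;
use warnings;

# Hypothetical in-place variant: takes a reference to the first string
# and overwrites each chr(0) with the byte at the same offset in $s2.
sub foo_inplace {
    my ($s1_ref, $s2) = @_;
    for my $idx ( 0 .. length($$s1_ref) - 1 ) {
        if ( substr($$s1_ref, $idx, 1) eq "\0" ) {
            substr($$s1_ref, $idx, 1) = substr($s2, $idx, 1);
        }
    }
    return;
}

my $s1 = "a\0c\0";
my $s2 = "wxyz";
foo_inplace( \$s1, $s2 );
print unpack( 'H*', $s1 ), "\n";    # prints 6178637a, i.e. "axcz"
```

With this shape the caller no longer needs the $s1 = foo( $s1, $s2 ) assignment, and the long string is not copied on the way in or out.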
Here's what I've got so far, including Benchmark harness. Whoever
comes up with the fastest version earns a meter of beer from me
whenever we see each other.
    #!/usr/bin/perl
    use 5.6.0;
    use strict;
    use warnings FATAL => 'all';
    use Benchmark qw( cmpthese );
    my $s1 = join '', (do_rand(1) x 100_000);
    my $s2 = join '', (do_rand(0) x 100_000);
    cmpthese( -2, {
        'split1'  => sub { my $s3 = split1( $s1, $s2 ) },
        'substr1' => sub { my $s3 = substr1( $s1, $s2 ) },
    });
    sub split1 {
        my ($s1, $s2) = @_;
        my @s1 = split //, $s1;
        my @s2 = split //, $s2;
        foreach my $idx ( 0 .. $#s1 ) {
            if ( $s1[$idx] eq chr(0) ) {
                $s1[$idx] = $s2[$idx];
            }
        }
        return join '', @s1;
    }
    sub substr1 {
        my ($s1, $s2) = @_;
        # The last valid offset is length - 1.
        for my $idx ( 0 .. length($s1) - 1 ) {
            if ( substr($s1,$idx,1) eq chr(0) ) {
                substr($s1, $idx, 1) = substr($s2, $idx, 1);
            }
        }
        return $s1;
    }
    # This makes sure that $s1 has chr(0)'s in it and $s2 does not.
    sub do_rand {
        my $n = (shift) ? int(rand(255)) : int(rand(254)) + 1;
        return chr( $n );
    }
    __END__
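A side note on how the test strings above are built: the x operator repeats the single value returned by one call, so do_rand(1) x 100_000 yields 100,000 copies of the same byte rather than 100,000 random bytes. The sketch below shows the difference with a deterministic stand-in; next_byte is a hypothetical helper used only so the output is predictable.

```perl
use strict;
use warnings;

my $calls = 0;

# Hypothetical stand-in for do_rand(): returns a different byte on each call.
sub next_byte { return chr( 65 + $calls++ ) }

# Scalar repetition: next_byte() runs once and its result is repeated.
my $repeated      = join '', ( next_byte() x 5 );
my $calls_after_x = $calls;

# One call per byte, so every position gets its own value.
my $generated = join '', map { next_byte() } 1 .. 5;

print "$repeated (calls: $calls_after_x)\n";    # prints AAAAA (calls: 1)
print "$generated\n";                           # prints BCDEF
```

The later version of the script builds its strings one byte at a time inside do_rand, which avoids this.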
Update: It looks like there is a 2-way tie between avar and moritz. I
went ahead and wrote an in-place version of moritz's code. Thanks to
SuicideJunkie for fixing my stupidity in the test data. The script now
looks like:
    #!/usr/bin/perl
    use 5.6.0;
    use strict;
    use warnings FATAL => 'all';
    #use Test::More no_plan => 1;
    use Benchmark qw( cmpthese );
    my $s1 = do_rand(0, 100_000);
    my $s2 = do_rand(1, 100_000);
    # split1() modifies its first argument in place, so run it on a copy;
    # otherwise $s1 would have no chr(0)s left when the benchmark starts.
    my $expected = $s1;
    split1( \$expected, \$s2 );
    cmpthese( -3, {
        'avar2' => sub {
            my $s3 = $s1; avar2( \$s3, \$s2 );
            # is( $s3, $expected, "avar2" );
        },
        'moritz' => sub {
            my $s3 = $s1; moritz( \$s3, \$s2 );
            # is( $s3, $expected, "moritz" );
        },
    });
    sub split1 {
        my ($s1, $s2) = @_;
        my @s1 = split //, $$s1;
        my @s2 = split //, $$s2;
        foreach my $idx ( 0 .. $#s1 ) {
            if ( $s1[$idx] eq chr(0) ) {
                $s1[$idx] = $s2[$idx];
            }
        }
        $$s1 = join '', @s1;
    }
    sub avar2 {
        my ($s1, $s2) = @_;
        use bytes;
        $$s1 =~ s/\0/substr $$s2, pos($$s1), 1/eg;
    }
    sub moritz {
        my ($s1, $s2) = @_;
        my $pos = 0;
        # index returns -1 when nothing is found; 0 is a valid position.
        while ( ( $pos = index $$s1, "\000", $pos ) >= 0 ) {
            substr( $$s1, $pos, 1 ) = substr( $$s2, $pos, 1 );
        }
    }
    sub do_rand {
        my ($min, $len) = @_;
        my $n = "";
        for (1 .. $len) {
            $n .= chr( rand(255-$min)+$min )
        }
        return $n;
    }
    __END__
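The commented-out is() calls suggest checking each candidate against split1 before timing it. A minimal sketch of such a check on a few short edge cases (chr(0) at the first byte, at the last byte, nowhere, everywhere, and empty strings) might look like this. fill_reference and fill_index are hypothetical stand-ins that return the result instead of working through references, and the index loop assumes, as the problem states, that the second string contains no chr(0).

```perl
use strict;
use warnings;

# Byte-by-byte reference, same logic as split1.
sub fill_reference {
    my ($s1, $s2) = @_;
    my @s1 = split //, $s1;
    my @s2 = split //, $s2;
    for my $idx ( 0 .. $#s1 ) {
        $s1[$idx] = $s2[$idx] if $s1[$idx] eq "\0";
    }
    return join '', @s1;
}

# index-based candidate; index returns -1 when no chr(0) is left,
# and 0 when the very first byte is a chr(0).
sub fill_index {
    my ($s1, $s2) = @_;
    my $pos = 0;
    while ( ( $pos = index $s1, "\0", $pos ) >= 0 ) {
        substr( $s1, $pos, 1 ) = substr( $s2, $pos, 1 );
    }
    return $s1;
}

my @cases = (
    [ "\0bc",   "xyz" ],    # chr(0) at offset 0
    [ "ab\0",   "xyz" ],    # chr(0) at the last offset
    [ "abc",    "xyz" ],    # nothing to replace
    [ "\0\0\0", "xyz" ],    # everything replaced
    [ "",       ""    ],    # empty input
);

my $failures = 0;
for my $case (@cases) {
    my ($s1, $s2) = @$case;
    $failures++ if fill_index( $s1, $s2 ) ne fill_reference( $s1, $s2 );
}
print $failures ? "$failures case(s) differ\n" : "all cases agree\n";
```

The offset-0 case is the one most likely to expose a loop condition that treats a return value of 0 from index as "not found".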
I'm going to keep it open until 24 hours have passed from the initial
posting of this node. If no-one gets any faster, both moritz and avar
have a meter of beer from me.
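For comparison, here is one more approach that is not among the entries benchmarked above: Perl's string bitwise operators can do the whole replacement without an explicit loop. This is a minimal sketch, assuming both arguments are byte strings of equal length that have never been used as numbers. fill_bitwise is a hypothetical name, and no claim is made here about where it would rank in the benchmark.

```perl
use strict;
use warnings;

sub fill_bitwise {
    my ($s1, $s2) = @_;
    # Build a mask from $s1: chr(0) becomes "\xFF", every other byte "\0".
    ( my $mask = $s1 ) =~ tr/\0\x01-\xff/\xff\0/;
    # Keep the bytes of $s2 only where the mask is set, then OR them
    # into $s1, whose bytes at those offsets are all zero.
    return $s1 | ( $s2 & $mask );
}

my $out = fill_bitwise( "a\0c\0", "wxyz" );
print unpack( 'H*', $out ), "\n";    # prints 6178637a, i.e. "axcz"
```

To time it, it could be added to the cmpthese hash next to the other entries, after confirming on the test strings that it matches the output of split1.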
Michele