submatch scoping in while

Julian Bradfield

Consider the following:

@x = ( 'aaa','bbb');
while ( $x[$i] !~ /^(.)b/ && $i <= $#x ) { $i++; }
print "\$1 is *$1*, i is $i\n";

The loop terminates at $i == 1 when 'bbb' matches ^(.)b
The enclosing block for the match construct is the whole file.
Therefore $1 should be 'b'.

But it isn't (in Perl 5.8.5).

What am I missing?

Compare

@x = ( 'aaa','bbb');
if ( $x[$i] !~ /^(.)a/ && $i <= $#x ) { $i++; }
print "\$1 is *$1*, i is $i\n";

which behaves as expected.
 
Dr.Ruud

Julian Bradfield wrote:
> Consider the following:
>
> @x = ( 'aaa','bbb');
> while ( $x[$i] !~ /^(.)b/ && $i <= $#x ) { $i++; }
> print "\$1 is *$1*, i is $i\n";
>
> The loop terminates at $i == 1 when 'bbb' matches ^(.)b
> The enclosing block for the match construct is the whole file.
> Therefore $1 should be 'b'.

(Perl 5.8.6)
It seems that $1 is localized inside a while(...){...} or for(...){...}.

#!/usr/bin/perl
use warnings ;
use strict ;

'X' =~ /(.)/ ; # sets $1 to 'X'

my @x = ('aaa', 'bbb') ;

for (@x) {
    /^(.)a/ and print "1:$1:\n" and last ;   # prints 1:a:
}
print "2:$1:\n";   # prints 2:X: (the 'a' captured inside the loop is gone)
 
Gunnar Hjalmarsson

Julian said:
> Consider the following:
>
> @x = ( 'aaa','bbb');
> while ( $x[$i] !~ /^(.)b/ && $i <= $#x ) { $i++; }
> print "\$1 is *$1*, i is $i\n";
>
> The loop terminates at $i == 1 when 'bbb' matches ^(.)b
> The enclosing block for the match construct is the whole file.
> Therefore $1 should be 'b'.

No. Even if 'bbb' matches ^(.)b at the last loop iteration, you don't
test that. You test whether they *do not* match, so the test fails, and
$1 is not set.

> What am I missing?

Combining capturing parentheses and the !~ operator is not a good idea.
In addition to that, as Mumia pointed out, the dollar-digit variables
(when set) seem to be scoped to the while block.
 
Brian McCauley

Julian said:
> Consider the following:
>
> @x = ( 'aaa','bbb');
> while ( $x[$i] !~ /^(.)b/ && $i <= $#x ) { $i++; }
> print "\$1 is *$1*, i is $i\n";
>
> Compare
>
> @x = ( 'aaa','bbb');
> if ( $x[$i] !~ /^(.)a/ && $i <= $#x ) { $i++; }
> print "\$1 is *$1*, i is $i\n";
>
> which behaves as expected.

Actually, for what it's worth, I find while's behaviour expected and
if's not.

With while(), lexical variables, dynamic (package) variables and the
match capture variables behave consistently. With if(), the lexical
variable is inconsistent with the other two. But of course lexical
variables are an order of magnitude more prevalent in Perl programming,
so my intuitive expectation is based on their behaviour.

use strict;
use warnings;

'unchanged' =~ /(.*)/; # Assign $1
our $pkg = 'unchanged';
my $lex = 'unchanged';

while ( 'x'=~/(.*)/ and my $lex='y' and local $pkg='z' and 0 ) { die }
print "$1 $lex $pkg\n"; # unchanged unchanged unchanged

if ( 'x'=~/(.*)/ and my $lex='y' and local $pkg='z' and 0 ) { die }
print "$1 $lex $pkg\n"; # x unchanged z

__END__
 
Julian Bradfield

Gunnar Hjalmarsson said:
>> @x = ( 'aaa','bbb');
>> while ( $x[$i] !~ /^(.)b/ && $i <= $#x ) { $i++; }
>> print "\$1 is *$1*, i is $i\n";
>> ....
> No. Even if 'bbb' matches ^(.)b at the last loop iteration, you don't
> test that. You test whether they *do not* match, so the test fails, and
> $1 is not set.

Wrong. As demonstrated by the if example later in my post, match variables
are set by a !~ . (Otherwise $a !~ /foo/ would not be equivalent to
! ($a =~ /foo/) !)

> In addition to that, as Mumia pointed out, the dollar-digit variables
> (when set) seem to be scoped to the while block.

This seems to be the case, but it's not what the manual says.
So there's a bug either in Perl or in the manual.
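
The claim that !~ still sets the match variables can be checked in isolation. The following is only a minimal sketch: the helper name negated_match_capture is made up for illustration, and the capture is copied into a lexical before the sub returns so that nothing depends on how $1 is restored on scope exit.

```perl
use strict;
use warnings;

# Hypothetical helper: run a negated match and report what it left in $1.
sub negated_match_capture {
    my ($string) = @_;

    # !~ performs the same match as =~ and only negates the result,
    # so a successful match still fills in $1.
    my $no_match = $string !~ /^(.)b/;
    return undef if $no_match;    # nothing was captured by this match

    my $capture = $1;             # copy while the match is still current
    return $capture;
}

my $hit  = negated_match_capture('bbb');
my $miss = negated_match_capture('aaa');
print "bbb: ", ( defined $hit  ? "*$hit*"  : "no capture" ), "\n";
print "aaa: ", ( defined $miss ? "*$miss*" : "no capture" ), "\n";
```

For 'bbb' the negated test is false, yet the capture is available; for 'aaa' the match itself fails, so there is nothing to read.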
 
Charles DeRykus

Mumia said:
>> [...]
>> $i = 0;
>> @x = ('aaa','bbb');
>>
>> $i++ while ( $x[$i] !~/^(.)b/ && $i<@x ) ;
>>
>> print "\$1 is *$1*, i is $i\n";
>>
>> Regards
>>
>> Mirco
>
> That's pretty good. It's just an inverted version of the OP's code; I
> wish I'd tried it. This idea crossed my mind for about ¼ of a second,
> but I assumed that the same blocking problem would be there.

The inverted version is a statement modifier which doesn't create a
local block scope for $1.


$ perl -le 'while ("aa" =~ /^(.)a/) {last;};print $1' # undef
$ perl -le 'print $1 and exit while ("aa" =~ /^(.)a/)' # a


I didn't find this explained in perlsyn although maybe it's hiding
elsewhere.
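
The difference between the two loop forms can also be seen in a small script. This is a sketch only, with made-up helper names; each helper first performs a match that sets $1 to 'outer' so that the result does not depend on whatever the caller matched earlier.

```perl
use strict;
use warnings;

# Hypothetical helper: match in the condition of a block-form while,
# then report what $1 holds after the loop.
sub capture_after_block_while {
    'outer' =~ /(.*)/;                    # baseline: $1 is 'outer'
    while ( 'aa' =~ /^(.)a/ ) { last }    # $1 is 'a' only inside the loop
    my $after = $1;                       # copy before leaving the sub
    return $after;
}

# Hypothetical helper: the same match, written as a statement modifier.
sub capture_after_modifier_while {
    'outer' =~ /(.*)/;                    # baseline: $1 is 'outer'
    my $done = 0;
    $done = 1 while !$done && 'aa' =~ /^(.)a/;   # runs the body once
    my $after = $1;
    return $after;
}

print "block form:    ", capture_after_block_while(),    "\n";
print "modifier form: ", capture_after_modifier_while(), "\n";
```

The block form should report 'outer' (the capture made in the condition is discarded when the loop exits), while the modifier form should report 'a'.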
 
Gunnar Hjalmarsson

Julian said:
>>> @x = ( 'aaa','bbb');
>>> while ( $x[$i] !~ /^(.)b/ && $i <= $#x ) { $i++; }
>>> print "\$1 is *$1*, i is $i\n";
>>
>> No. Even if 'bbb' matches ^(.)b at the last loop iteration, you don't
>> test that. You test whether they *do not* match, so the test fails, and
>> $1 is not set.
>
> Wrong. As demonstrated by the if example later in my post, match variables
> are set by a !~ . (Otherwise $a !~ /foo/ would not be equivalent to
> ! ($a =~ /foo/) !)

Hmm.. Yes, apparently I was wrong. Don't remember how I reached that
faulty conclusion. Sorry for the confusion. :(
 
anno4000

Julian Bradfield said:
> Consider the following:
>
> @x = ( 'aaa','bbb');
> while ( $x[$i] !~ /^(.)b/ && $i <= $#x ) { $i++; }
> print "\$1 is *$1*, i is $i\n";
>
> The loop terminates at $i == 1 when 'bbb' matches ^(.)b
> The enclosing block for the match construct is the whole file.
> Therefore $1 should be 'b'.
>
> But it isn't (in Perl 5.8.5).
>
> What am I missing?
>
> Compare
>
> @x = ( 'aaa','bbb');
> if ( $x[$i] !~ /^(.)a/ && $i <= $#x ) { $i++; }
> print "\$1 is *$1*, i is $i\n";
>
> which behaves as expected.

My advice is to avoid the match variables whenever possible. It is safer
and saner to match in list context and catch the results in normal Perl
variables with no surprises, and meaningful names to boot.

To do so, first rewrite the loop control to use =~ instead of !~

while ( ! ( $x[$i] =~ /^(.)b/) && $i <= $#x ) { $i++ }

That doesn't change the behavior. Now catch the match:

while ( ! ( ( $capt) = $x[$i] =~ /^(.)b/) && $i <= $#x ) { $i++ }
print "\$capt is *$capt*, i is $i\n";

That gives you the expected capture of "b" without fuss.

BTW, your loop control is slightly off. If no match occurs, you'll
increase the index beyond the array and try that element.

Check the index first.

while ( $i <= $#x && ! ( ( $capt) = $x[$i] =~ /^(.)b/)) { $i++ }

Now the access is protected by the condition. That's the beauty of
short-circuiting booleans.
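
Wrapped up as a sub, the guarded loop might look like the sketch below. The name first_b_capture and the choice to return an empty list when nothing matches are assumptions made for illustration, not part of the code above.

```perl
use strict;
use warnings;

# Hypothetical helper: return (index, capture) for the first element
# matching /^(.)b/, or an empty list when no element matches.
sub first_b_capture {
    my @x = @_;
    my ( $i, $capt ) = (0);

    # The bounds check comes first, so $x[$i] is never read past the end.
    # The list assignment in scalar context yields the number of values
    # the match returned: 0 on failure, 1 on success.
    while ( $i <= $#x && !( ($capt) = $x[$i] =~ /^(.)b/ ) ) { $i++ }

    return $i <= $#x ? ( $i, $capt ) : ();
}

my ( $i, $capt ) = first_b_capture( 'aaa', 'bbb' );
print "\$capt is *$capt*, i is $i\n";
```

With a list that contains no match, the loop stops as soon as the index runs off the end and the sub returns nothing, rather than matching against an undefined element.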

Anno
 
Ben Morrow

Quoth anno4000:
> while ( $i <= $#x && ! ( ( $capt) = $x[$i] =~ /^(.)b/)) { $i++ }

or

# "or" rather than "||": || binds tighter than =, so "||" would not compile here
until ( $i > $#x or ($capt) = $x[$i] =~ /^(.)b/ ) { $i++ }

or (cleaner IMHO)

use List::MoreUtils qw/firstidx/;

my $capt;
my $i = firstidx { ($capt) = /^(.)b/ } @x;

Ben
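
If installing List::MoreUtils is not an option, a similar effect can be had with first from the core List::Util module by searching over the indices instead of the elements. This is a sketch under that assumption; the wrapper name first_b_index is made up, and note that first returns undef when nothing matches, whereas firstidx returns -1.

```perl
use strict;
use warnings;
use List::Util qw(first);

# Hypothetical helper: return (index, capture) for the first element
# matching /^(.)b/; both values are undef when no element matches.
sub first_b_index {
    my @x = @_;
    my $capt;

    # The block is called in scalar context, so the list assignment
    # yields the number of captures returned: 0 on failure, 1 on success.
    my $i = first { ($capt) = $x[$_] =~ /^(.)b/ } 0 .. $#x;

    return ( $i, $capt );
}

my ( $i, $capt ) = first_b_index( 'aaa', 'bbb' );
print "\$capt is *$capt*, i is $i\n";
```

Because the search runs over 0 .. $#x, a match at index 0 is still reported correctly: the truth test applies to the block's result, not to the index itself.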
 
