Perl RE bug with keys(%+)

Clint O

Maybe this is a bug, maybe not. I am using the named capture buffers
to reduce bugs as I change grouping of my regular expressions over
time. In a lexical analysis application, I'm using it over a series
of alternations.

my $re = qr/ (?<ALT1>pattern) | (?<ALT2>pattern) | ...

One of the alternations happens to be nested:

my $foo = qr{
    (?<CODEBEGIN>
        \{
        (?<CODE>
            (?:
                (?> [^{}\n]+ )   # Non-parens without backtracking
              |
                (?&CODEBEGIN)    # Recurse to start of pattern
            )*
        )
        \}
    )
}x;

However, when I ask for the keys of %+, I only get back CODEBEGIN yet
the CODE capture is there when I ask for it. My hope was to use the
keys to determine what I matched so I didn't have to do a series of
tests on %+, but apparently I will have to continue doing this since
this method won't work.

This is Perl 5.10.0.

Thanks,

-Clint
 
sln

Clint O wrote:

(?> [^{}\n]+ ) # Non-parens without backtracking

This is not good here: "\n" is excluded from the class, so a newline is
never consumed and most likely the result is a non-match.
This can also be written more effectively as [^{}]++

You are right, it probably is a bug. However, %+ seems to be private
within recursion the way you have it, because according to the docs
CODEBEGIN can't know about CODE and vice versa.

That $+{CODE} can be tested and contain a value outside of CODEBEGIN
is a mystery and worrisome. You can of course maintain your own private
hash to store results.
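
As a rough illustration of that workaround (a sketch only, not code from either poster): the token names WORD and CODEBEGIN, the @token_names list, and the sample input are assumptions made up for the example. Instead of asking for keys(%+), it tests a fixed list of top-level names and copies the values into an ordinary hash straight away.

```perl
use strict;
use warnings;

# Hypothetical token names; only the top-level alternations are listed,
# so nested groups such as CODE are never mistaken for a token type.
my @token_names = qw(WORD CODEBEGIN);

my $lexer = qr{
    (?<WORD> \w+ )
  |
    (?<CODEBEGIN>
        \{
        (?<CODE>
            (?:
                [^{}]++          # run of non-brace characters, no backtracking
              |
                (?&CODEBEGIN)    # recurse into a nested block
            )*
        )
        \}
    )
}x;

my @tokens;
my $input = "func { a { b } c }";   # made-up sample input
while ($input =~ /$lexer/g) {
    # Test each known name instead of relying on keys(%+).
    my ($type) = grep { defined $+{$_} } @token_names;
    # Copy the values out of %+ right away into our own hash.
    my %tok = (type => $type, text => $+{$type});
    $tok{code} = $+{CODE} if $type eq 'CODEBEGIN';
    push @tokens, \%tok;
}

printf "%-10s %s\n", $_->{type}, $_->{text} for @tokens;
```

This still performs one test per name, but the tests live in a single grep over a list, so adding an alternation only means adding a name to @token_names.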

The code below shows this behavior in more detail. Let me know if you find
a satisfactory answer to this.

-sln
---------
use strict;
use warnings;
use Devel::Peek;    # provides Dump(); the module name is case-sensitive
use Data::Dumper;

my %CodeAll = ();
my $container = '';

my $string = " func { subfunc { some {code }; more code } {last block}";

my $foo = qr/
    (?<CODEBEGIN>
        \{
        (?<CODE>
            (?:
                [^{}]++          # Non-braces without backtracking
              |
                (?&CODEBEGIN)    # Recurse to start of pattern
            )*
        )
        (?{ print " * ",Dumper(\%+);
            $container = $+{CODE};
        })
        \}
    )
    (?{ print ">>* ",Dumper(\%+);
        $CodeAll{CODEBEGIN} = $+{CODEBEGIN};
        $CodeAll{CODE} = $+{CODE};
    })
/x;

print "______________________\n\n";

while ($string =~ /$foo/g)
{
    print "\n\n====================\n";
    Dump \%+;
    print "\n( \%+ )\n",Dumper(\%+);
    print "( \%CodeAll )\n",Dumper(\%CodeAll),"\n";
    print "______________________\n\n";
}
__END__
 
Clint O

This is not good here, "\n" is never consumed and most likely
the result is a non-match.
This can also be written more effectively as   [^{}]++

Yes, I ended up simplifying my life and using this before I saw your
post:

my $code = qr{
    (?<CODEBEGIN>
        \{
        (?<CODE>
            (?:
                (?> [^{}]+ )     # Non-curly without backtracking
              |
                (?&CODEBEGIN)    # Recurse to start of pattern
            )*
        )
        \}
    )
}x;

Then I go back and split the token on '\\\n' to weed out the escaped
newlines. My hope was to avoid re-scanning any string, but the RE and
concatenation rules just became unmanageable at some point and I
decided to cut my losses. I'm not familiar with the '++', but I will
look that up as an alternative to using (?> ). So far you are the
only person that has responded to this post, so I'm not hopeful that
I'll get a satisfactory answer from anyone as to what's happening
here.
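
For readers following along, the second pass described above (splitting the matched token on a backslash followed by a newline) might look roughly like this. It is only a sketch: the $token value is invented for the example and the variable names are assumptions.

```perl
use strict;
use warnings;

# Made-up token text containing two escaped newlines (backslash + newline).
my $token = "{ first line \\\n second line \\\n third }";

# In the regex, \\ matches one literal backslash and \n matches the newline.
my @parts = split /\\\n/, $token;

print scalar(@parts), " parts\n";
print "[$_]\n" for @parts;
```

The split drops the backslash-newline pairs themselves, so joining @parts with the empty string gives the token with its line continuations removed.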

Thanks,

-Clint
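
On the '++' question raised above: in Perl 5.10 and later, a possessive quantifier such as [^{}]++ behaves like the atomic group (?> [^{}]+ ), in that neither gives back characters once matched. The following is a minimal sketch with made-up test strings, meant only to show the two spellings agreeing.

```perl
use strict;
use warnings;

my $atomic     = qr/ \{ (?> [^{}]+ ) \} /x;   # atomic group
my $possessive = qr/ \{ [^{}]++ \} /x;        # possessive quantifier

my @results;
for my $text ('{abc}', '{abc', '{}') {
    my $atomic_ok     = $text =~ $atomic     ? 1 : 0;
    my $possessive_ok = $text =~ $possessive ? 1 : 0;
    push @results, [$text, $atomic_ok, $possessive_ok];
    printf "%-6s atomic=%d possessive=%d\n", $text, $atomic_ok, $possessive_ok;
}

# Where "no backtracking" makes a visible difference:
# a+ can give one 'a' back to the trailing 'a', a++ cannot.
my $greedy_matches     = 'aaa' =~ /a+a/  ? 1 : 0;
my $possessive_matches = 'aaa' =~ /a++a/ ? 1 : 0;
print "a+a: $greedy_matches, a++a: $possessive_matches\n";
```

In the brace-matching pattern the possessive form is mostly a shorter way to write the same thing, since the next thing to match is always a brace that the class excludes anyway.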
 
