$string="abcabcabc";
@findall = $string =~ /abcabc/g;
print scalar(@findall), "\n";
The commands above print 1 rather than 2. Because the string contains
two overlapping occurrences of 'abcabc', I'd like to get 2. What is the
correct way to find all overlapping matches of a regex? (I gave
'abcabc' as an example, but it could be any complex regex.) Thanks!
Dear Peng Yu,
Here's one way to do it:
while ($string =~ m/(abcabc)/g)
{
    push @findall, $1;
    pos($string) = $-[0] + 1;
}
If you prefer to implement it in one line of code, you can do this:
push(@findall, $1) and pos($string) = $-[0] + 1
while $string =~ m/(abcabc)/g;
Here's the explanation of what is happening: Normally, m//g and
s///g both look for additional matches starting at the end of the
previous match, so you can't use them directly to find overlapping
patterns. However, inside a while ($string =~ m//g) loop you can
assign to pos($string) to force m//g to resume searching wherever
you want; in your case, one character after the start of the last
match. (You have to resume at least one character after it, because
if you resumed at or before the start of the last match, the loop
would find the same match again and never end.)
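Since you mentioned the pattern could be any regex, here is a rough sketch of the same loop wrapped in a subroutine. The name find_overlapping is just something I made up for illustration, and the guard against a match at the very end of the string is my own addition to keep a pattern that can match the empty string from looping forever.

```perl
use strict;
use warnings;

# Hypothetical helper (not a built-in): returns every match of $regex
# in $string, including matches that overlap earlier ones.
sub find_overlapping {
    my ($string, $regex) = @_;
    my @found;
    while ($string =~ /$regex/g) {
        # $-[0] and $+[0] are the start and end offsets of this match.
        push @found, substr($string, $-[0], $+[0] - $-[0]);
        # Assumed safeguard: stop if the match began at the end of the string.
        last if $-[0] >= length $string;
        # Resume the search one character past the start of this match.
        pos($string) = $-[0] + 1;
    }
    return @found;
}

my @findall = find_overlapping("abcabcabc", qr/abcabc/);
print scalar(@findall), "\n";
```

Because the subroutine works on its own copy of the string, changing pos() inside it should not disturb any match position on the caller's variable.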
As for $-[0], that's the first element of the @- array, which you
can look up with "perldoc -v @-". $-[0] holds the offset of the
start of the last successful match, so ($-[0] + 1) is the earliest
position from which you would want to continue your search for
overlapping matches.
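If it helps to see the offsets themselves, here is a small sketch that records $-[0] for each match and prints it alongside $+[0] (the first element of @+, which holds the offset just past the end of the match). The @starts array is only there for illustration.

```perl
use strict;
use warnings;

my $string = "abcabcabc";
my @starts;    # illustrative: start offset of each match found

while ($string =~ m/abcabc/g) {
    push @starts, $-[0];
    # $-[0] is where the match starts, $+[0] is just past where it ends.
    printf "match spans offsets %d to %d\n", $-[0], $+[0];
    # Continue one character after the start of this match.
    pos($string) = $-[0] + 1;
}
```

For this string the two matches should start at offsets 0 and 3, which is why resuming at $-[0] + 1 picks up the second, overlapping one.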
I hope this helps, Peng Yu.
Cheers,
-- Jean-Luc