I'm trying to match .cpp files that don't start with 'main', but the grep
command below matches all the .cpp files. I know that \w* tries to match
as much as possible. Do you know how to fix the regex to get all
the .cpp files that don't start with 'main'?
#!/usr/bin/env perl
use strict;
use warnings;
my @array=qw(main.cpp main_xx.cpp uuu.cpp vvv.cpp);
my @non_main_cpp=grep /(?<!main)\w*.cpp/, @array;
print join(', ', @non_main_cpp), "\n";
Your form was close, but it won't work this way.
This concept is hard to grasp.
You can do it one of two ways:
With a negative look behind:
@non_main_cpp = grep /^(?:\w(?<!main))*\.cpp$/, @array;
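This is what you tried to do.
Looking behind must be "visualized" as if YOU were the
current character as you traverse the string.
Each \w that is found in the accumulating match
must be immediately tested that main isn't directly behind us.
(As written, (?<!main) rejects 'main' anywhere in the name, not
only at the start; use (?<!^main) to pin it to the beginning.)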
Using ^ (?: \w (?<!main) )* on "main_xx.cpp" we see the
match progression:
main_xx.cpp
^ at the beginning, no ^main behind us
m^ still ok
ma^ ok
mai^ ok
main^ failed, ^main is behind us
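Yours didn't work because the look behind was checked only once,
before the first \w was found, when nothing was behind it yet.
It then went on to match all of \w* without checking the assertion
again. (The unescaped . in .cpp also matches any character; write \.cpp.)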
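
To make the difference concrete, here is a small sketch that runs the
original pattern next to the two look-behind forms. The extra name
domain.cpp is not from the question; it is a hypothetical entry added
only to show what the ^ inside the look-behind changes.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# domain.cpp is a hypothetical extra name, added for illustration only.
my @array = qw(main.cpp main_xx.cpp uuu.cpp vvv.cpp domain.cpp);

# Original pattern: the look-behind is tested at offset 0, where
# nothing is behind us yet, so it succeeds and \w* takes everything.
my @original = grep { /(?<!main)\w*.cpp/ } @array;

# Look-behind re-tested after every \w: rejects "main" anywhere
# in the run of word characters.
my @anywhere = grep { /^(?:\w(?<!main))*\.cpp$/ } @array;

# Same idea with ^ inside the look-behind: rejects "main" only
# when it sits at the very start of the name.
my @at_start = grep { /^(?:\w(?<!^main))*\.cpp$/ } @array;

print "original: @original\n";
print "anywhere: @anywhere\n";
print "at start: @at_start\n";
```

The first list contains every name, the second drops domain.cpp along
with the two main files, and the third drops only the two main files.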
Or, with a negative look ahead:
@non_main_cpp = grep /^(?!main)\w*\.cpp$/, @array;
This is a look ahead. As usual, the thing we're looking
ahead from is to the left of the assertion.
In this case, it's the beginning of the line ^, before \w*.
The check is done once. If it succeeds, \w* will try to match,
and the assertion is never checked again.
main_xx.cpp
^ at the beginning, failed ^main is ahead of us
For your circumstances, this is the preferred method.
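
For reference, this is roughly how the script from the question looks
with the look-ahead form dropped in. Only the grep line changes; the
comments are my reading of each piece of the pattern.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my @array = qw(main.cpp main_xx.cpp uuu.cpp vvv.cpp);

# ^         anchor at the start of the name
# (?!main)  fail right away if "main" comes next
# \w*       otherwise consume the word characters
# \.cpp$    a literal dot, "cpp", then end of string
my @non_main_cpp = grep { /^(?!main)\w*\.cpp$/ } @array;

print join(', ', @non_main_cpp), "\n";   # prints: uuu.cpp, vvv.cpp
```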
----
What helps when you write regular expressions is to "be" the
character as you traverse the string. Make it a personal exercise.
I am a 9, I don't want 'a' or 'b' next to me, I want a space or
digit, I need this done five times from the beginning with
only the end of string in front of "us".
Yeah well, something like that..
-sln