conditional regular expressions

Morfys · Aug 13, 2007

hi,

I was having trouble understanding how regular conditional expressions
work. In particular, in this piece of code:

$str = "ttt, 2";
if(($result) = $str =~ /\w\w\w(?(?=, )(\d))/){
    print "$result\n";
}

I don't understand why there is empty output versus "2".

The desired effect is that $result contain the digit following ", " if
it exists in the string.

I've tried almost every variation of the regular expression but I'm
missing something. Thanks in advance for any help.

Mirco Wahab · Aug 13, 2007

Morfys said:
$str = "ttt, 2";
if(($result) = $str =~ /\w\w\w(?(?=, )(\d))/){
print "$result\n";
}
I don't understand why there is empty output versus "2".
The desired effect is that $result contain the digit following ", " if
it exists in the string.

A regular expression is matched from left to right,
so a pattern further to the right is only tried once
everything to its left has already matched. That
implicit condition leads to the trivial solution 1.:

my $str = 'ttt, 2';
my $result;

# trivial solution w/capturing
($result) = $str=~/\w{3}, (\d)/ and print "1. $result\n";

If you want to do some more fancy stuff, you
could 'ifify' the whole expression with
a "positive lookbehind", like:

# lookbehind (?<= ...), no capturing, /g in list context
($result) = $str=~/(?<=\w{3}, )\d/g and print "2. $result\n";

Regards

M.

xhoster · Aug 13, 2007

Morfys said:
hi,

I was having trouble understanding how regular conditional expressions
work. In particular, in this piece of code:

$str = "ttt, 2";
if(($result) = $str =~ /\w\w\w(?(?=, )(\d))/){
    print "$result\n";
}

I don't understand why there is empty output versus "2".

?= is zero-width, meaning that after the ', ' matches, the \d is left
to match starting at the same place, namely, at the ','. Since ',' is not
a \d, it can't match.
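
A minimal sketch of how the conditional could be repaired, assuming the goal stated in the question (capture the digit only when ", " follows the three word characters). Because the lookahead consumes nothing, the yes-branch itself has to match ", " before (\d) gets a chance. The helper name digit_after_comma is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (matched?, captured digit or undef).
sub digit_after_comma {
    my ($str) = @_;
    # The condition (?=, ) only tests for ", " without consuming it,
    # so the yes-branch ", (\d)" consumes it and then captures the digit.
    if ( $str =~ /\w\w\w(?(?=, ), (\d))/ ) {
        return ( 1, $1 );
    }
    return ( 0, undef );
}

for my $str ( 'ttt, 2', 'ttt' ) {
    my ( $ok, $digit ) = digit_after_comma($str);
    printf "%-8s matched=%d digit=%s\n",
        "'$str'", $ok, defined $digit ? $digit : '(none)';
}
```

For 'ttt' the match still succeeds because the condition is false and there is no no-branch; the capture is simply undefined.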

The desired effect is that $result contain the digit following ", " if
it exists in the string.

Isn't that just what the simple /\w\w\w, (\d)/ would do?

Xho

Morfys · Aug 14, 2007

Thanks a lot! Indeed my problem was that ?= is zero-width.

Isn't that just what the simple /\w\w\w, (\d)/ would do?

In fact, the example I gave was just to simplify things so that my
question could be clear. In my "real" regular expression, I need to
parse strings like
"open(X,X)"
and
"open(X,X,X)"
and so the conditional seems necessary.

Tad McClellan · Aug 14, 2007

Morfys said:
In fact, the example I gave was just to simplify things so that my
question could be clear. In my "real" regular expression, I need to
parse strings like
"open(X,X)"
and
"open(X,X,X)"

$_ = 'open(X,Y,Z)';
my @parts = /(\w+)(?=[,)])/g;

Gunnar Hjalmarsson · Aug 14, 2007

Morfys said:
In fact, the example I gave was just to simplify things so that my
question could be clear.

Then you didn't succeed very well.

In my "real" regular expression, I need to parse strings like
"open(X,X)"
and
"open(X,X,X)"
and so the conditional seems necessary.

my $str = 'open(4,5,6)';
if ( my ($digits) = $str =~ /open\((\d(?:,\d)*)\)/ ) {
    my @result = split /,/, $digits;
    print "@result\n";
}

Mirco Wahab · Aug 14, 2007

Morfys said:
In fact, the example I gave was just to simplify things so that my
question could be clear. In my "real" regular expression, I need to
parse strings like
"open(X,X)"
and
"open(X,X,X)"
and so the conditional seems necessary.

.... condition to branch on what?

I could understand this being intended
to match a string like

"open(X,X,X), 3"
or
"open(X,X), 2"

but your example doesn't provide
an ample motivation for introducing
conditionals.
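
For contrast, here is a sketch of a case where a conditional does pay off: branching on whether an earlier capture group took part in the match. The pattern and the test strings are illustrative only and are not taken from the original poster's data.

```perl
use strict;
use warnings;

# (\()?     optionally capture an opening parenthesis as group 1
# (?(1)\))  require a closing parenthesis only if group 1 matched
my $re = qr/^(\()?\w+(?(1)\))$/;

for my $s ( 'abc', '(abc)', '(abc', 'abc)' ) {
    printf "%-6s %s\n", $s, $s =~ $re ? 'matches' : 'no match';
}
```

Only the balanced forms 'abc' and '(abc)' match; the decision about the closing parenthesis depends on something that happened earlier in the match, which plain concatenation cannot express.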

Regards

M.

Mirco Wahab · Aug 14, 2007

Tad said:
Morfys said:

"open(X,X)"
and
"open(X,X,X)"

$_ = 'open(X,Y,Z)';
my @parts = /(\w+)(?=[,)])/g;

Just out of curiosity, what's the
purpose of these fancy parentheses
around \w+ ? Is there any hidden
mechanism behind using them this way?

Regards

M.

Gunnar Hjalmarsson · Aug 14, 2007

Tad said:
Morfys said:

I need to parse strings like
"open(X,X)"
and
"open(X,X,X)"

$_ = 'open(X,Y,Z)';
my @parts = /(\w+)(?=[,)])/g;

You don't need both capturing parentheses and the "(?= ... )" part, do you?

my @parts = /(\w+)[,)]/g;

or

my @parts = /\w+(?=[,)])/g;
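
A quick check, as a sketch, that the three variants discussed here return the same list for the sample string. The variable names are chosen for illustration.

```perl
use strict;
use warnings;

my $str = 'open(X,Y,Z)';

# Capturing parentheses plus lookahead (Tad's version).
my @both      = $str =~ /(\w+)(?=[,)])/g;
# Capturing parentheses only; the delimiter is consumed but not captured.
my @capture   = $str =~ /(\w+)[,)]/g;
# Lookahead only; /g in list context returns the whole matches.
my @lookahead = $str =~ /\w+(?=[,)])/g;

print "both:      @both\n";
print "capture:   @capture\n";
print "lookahead: @lookahead\n";
```

In all three cases "open" is skipped because it is followed by "(" rather than by "," or ")".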

anno4000 · Aug 14, 2007

Mirco Wahab said:
Tad said:

Morfys said:

"open(X,X)"
and
"open(X,X,X)"

$_ = 'open(X,Y,Z)';
my @parts = /(\w+)(?=[,)])/g;

Just out of curiosity, what's the
purpose of these fancy parentheses
around \w+ ? Is there any hidden
mechanism behind using them this way?

Hmm? What's fancy about them. They're normal capturing parentheses.

Anno

Mirco Wahab · Aug 14, 2007

anno4000 said:
Mirco Wahab said:

Tad said:

"open(X,X)"
and
"open(X,X,X)"
$_ = 'open(X,Y,Z)';
my @parts = /(\w+)(?=[,)])/g;

Just out of curiosity, what's the
purpose of these fancy parentheses
around \w+ ? Is there any hidden
mechanism behind using them this way?

Hmm? What's fancy about them. They're normal capturing parentheses.

I wouldn't consider them normal in the said example,
hence my question: does the Perl regex
engine distinguish internally between

my @foo = /\w+(?= bar)/g;
and
my @foo = /(\w+)(?= bar)/g;

I can't answer this for myself,
that's the reason I asked Tad.

Regards

M.

anno4000 · Aug 14, 2007

Mirco Wahab said:
Mirco Wahab said:

Tad McClellan wrote:
"open(X,X)"
and
"open(X,X,X)"
$_ = 'open(X,Y,Z)';
my @parts = /(\w+)(?=[,)])/g;
Just out of curiosity, what's the
purpose of these fancy parentheses
around \w+ ? Is there any hidden
mechanism behind using them this way?

Hmm? What's fancy about them. They're normal capturing parentheses.

I wouldn't consider them normal in the said example,

I'm still not sure *why* you'd consider them abnormal. It's true that
/g makes them redundant, but they function as they always do.

hence my question: does the Perl regex
engine distinguish internally between

my @foo = /\w+(?= bar)/g;
and
my @foo = /(\w+)(?= bar)/g;

I can't answer this for myself,
that's the reason I asked Tad.

I can't speak for Tad, but the parens make the capture more robust
while you change parts of the regex around.
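
One way to read the robustness argument, as a sketch on a made-up string: with the lookahead both forms return the same list, but once the lookahead is swapped for a consuming pattern, only the version with explicit parentheses keeps returning just the word.

```perl
use strict;
use warnings;

my $str = 'foo bar baz bar';

# With a lookahead, both forms return the same list under /g.
my @plain    = $str =~ /\w+(?= bar)/g;
my @captured = $str =~ /(\w+)(?= bar)/g;

# Now replace the lookahead with a consuming " bar".
my @plain_consuming    = $str =~ /\w+ bar/g;     # whole matches
my @captured_consuming = $str =~ /(\w+) bar/g;   # still only the word

print join( '|', @plain ),              "\n";
print join( '|', @captured ),           "\n";
print join( '|', @plain_consuming ),    "\n";
print join( '|', @captured_consuming ), "\n";
```

Without capturing parentheses, /g in list context returns everything the pattern consumed, so the result changes as soon as the surrounding pattern starts consuming text.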

Anno

Morfys · Aug 14, 2007

I could understand this intended

to match a string like

"open(X,X,X), 3"
or
"open(X,X), 2"

that's what I meant.

Mirco Wahab · Aug 14, 2007

Morfys said:
that's what I meant.

OK, then there are some solutions available,
my first trial would be something like:

....
my @strings = (
    "open(X,X,X), 1",
    "open(X,X,X), 3",
    "open(X,X,X), 31",
    "open(X,X), 2",
    "open(X,X), 23",
    "open(X,X), 32" );

my $n;
no strict 'refs';
my $rg = qr{\(
    (\w?)[,)]?
    (\w?)[,)]?
    (\w?)[,)]?
    (\w?)[,)]?
    ,\s+
    (??{ scalar(grep ${$_} ,1..@+) })
    $
}x;

print "\n$_" for grep /$rg/, @strings;
....

which is still simple and will work for up to four
"entities". Making a more comprehensive version
is left to you ;-)

The trick is to test each capture group (\w?)
for truth with grep and to use the number of
groups that tested true ("2" or "3" here) as the
content of the dynamic regex part (??{ ... }).
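
As an alternative sketch that avoids (??{ ... }) and symbolic references, the count can also be compared in ordinary Perl after a plain match. This swaps in a different technique from the one above; the helper name count_matches is hypothetical, and the sketch assumes the argument list contains no nested parentheses.

```perl
use strict;
use warnings;

my @strings = (
    "open(X,X,X), 1",
    "open(X,X,X), 3",
    "open(X,X,X), 31",
    "open(X,X), 2",
    "open(X,X), 23",
    "open(X,X), 32",
);

# Hypothetical helper: true when the trailing number equals the
# number of comma-separated arguments inside the parentheses.
sub count_matches {
    my ($s) = @_;
    my ( $args, $count ) = $s =~ /^open\(([^)]*)\),\s+(\d+)$/
        or return 0;
    my @args = split /,/, $args;
    return @args == $count ? 1 : 0;
}

print "$_\n" for grep { count_matches($_) } @strings;
```

Unlike the fixed four groups above, this version is not limited to a particular number of arguments.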

Regards

M.

Bill H · Aug 14, 2007

I know this may not be the most elegant way of doing it, but, using
the examples given couldn't you just do this:

my @strings = (
"open(X,X,X), 1",
"open(X,X,X), 3",
"open(X,X,X), 31",
"open(X,X), 2",
"open(X,X), 23",
"open(X,X), 32" );

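my @numpart;
foreach my $temp (@strings) {
    my @dbf = split /,/, $temp;   # split on every comma
    push @numpart, $dbf[-1];      # last field is the trailing number
}

And @numpart would hold all the numeric parts (each still carrying
the leading space that followed the last comma).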

Bill H
