conditional regular expressions

Morfys

hi,

I'm having trouble understanding how conditional regular expressions
work. In particular, in this piece of code:

$str = "ttt, 2";

if (($result) = $str =~ /\w\w\w(?(?=, )(\d))/) {
    print "$result\n";
}

I don't understand why nothing is printed instead of "2".

The desired effect is that $result contain the digit following ", " if
it exists in the string.

I've tried almost every variation of the regular expression but I'm
missing something. Thanks in advance for any help.
 
Mirco Wahab

Morfys said:
$str = "ttt, 2";
if(($result) = $str =~ /\w\w\w(?(?=, )(\d))/){
print "$result\n";
}
I don't understand why there is empty output versus "2".
The desired effect is that $result contain the digit following ", " if
it exists in the string.

The regular expression is evaluated from
left to right, so there is an implicit
condition built in: a subpattern further right
is only tried once everything to its left has
matched. That leads to the trivial solution 1:

my $str = 'ttt, 2';
my $result;

# trivial solution w/capturing
($result) = $str=~/\w{3}, (\d)/ and print "1. $result\n";

If you want something fancier, you
could 'ifify' the whole expression with
a positive lookbehind, like:

# lookbehind (?<= ...), no capturing, /g in list context
($result) = $str=~/(?<=\w{3}, )\d/g and print "2. $result\n";

Regards

M.
 
xhoster

Morfys said:
hi,

I was having trouble understanding how regular conditional expressions
work. In particular, in this piece of code:

$str = "ttt, 2";

if (($result) = $str =~ /\w\w\w(?(?=, )(\d))/) {
    print "$result\n";
}

I don't understand why there is empty output versus "2".

(?=...) is zero-width, meaning that after the lookahead confirms the ', ',
the \d is left to match starting at the same place, namely at the ','.
Since ',' is not a \d, it can't match, so the whole match fails and
nothing is printed.

Morfys said:
The desired effect is that $result contain the digit following ", " if
it exists in the string.

Isn't that just what the simple /\w\w\w, (\d)/ would do?
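
To make the zero-width point concrete, here is a minimal sketch (not from the original posts) that runs the original pattern next to a variant whose yes-branch consumes the ", " itself before capturing the digit. The variable names are made up for illustration.

```perl
use strict;
use warnings;

# Original pattern: the lookahead (?=, ) succeeds but consumes nothing,
# so (\d) is tried at the ',' and the whole match fails.
my @orig = 'ttt, 2' =~ /\w\w\w(?(?=, )(\d))/;

# Variant: the yes-branch matches ', ' itself, then captures the digit.
my ($with_digit)    = 'ttt, 2' =~ /\w\w\w(?(?=, ), (\d))/;
my ($without_digit) = 'ttt'    =~ /\w\w\w(?(?=, ), (\d))/;

print "original returned ", scalar(@orig), " item(s)\n";
print "with digit: $with_digit\n";
print "without digit: ", (defined $without_digit ? $without_digit : '(undef)'), "\n";
```

When the ", " is absent the variant still matches, but the capture is undef, so a caller would need to test it with defined before printing.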

Xho
 
Morfys

Thanks a lot! Indeed my problem was that ?= is zero-width.

xhoster said:
Isn't that just what the simple /\w\w\w, (\d)/ would do?

In fact, the example I gave was just to simplify things so that my
question could be clear. In my "real" regular expression, I need to
parse strings like
"open(X,X)"
and
"open(X,X,X)"
and so the conditional seems necessary.
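
For what it's worth, a conditional for this shape of input could look like the sketch below. It assumes each X is a single word character, which the posts do not confirm, and it uses the (?(N)yes) form, where the condition is "did group N match?". A plain optional group, (?:,(\w))?, would do the same job here.

```perl
use strict;
use warnings;

# Group 3 is the optional second comma; (?(3)(\w)) demands a third
# argument only when that comma was actually matched.
my $re = qr/^open\((\w),(\w)(,)?(?(3)(\w))\)$/;

my @two   = 'open(A,B)'   =~ $re;   # groups 3 and 4 stay undef
my @three = 'open(A,B,C)' =~ $re;   # group 4 holds the third argument
my $bad   = 'open(A,B,)'  =~ $re;   # trailing comma without an argument

print "two args:   $two[0] $two[1]\n";
print "three args: $three[0] $three[1] $three[3]\n";
print "trailing comma matched: ", ($bad ? 'yes' : 'no'), "\n";
```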
 
Tad McClellan

Morfys said:
In fact, the example I gave was just to simplify things so that my
question could be clear. In my "real" regular expression, I need to
parse strings like
"open(X,X)"
and
"open(X,X,X)"


$_ = 'open(X,Y,Z)';
my @parts = /(\w+)(?=[,)])/g;
 
Gunnar Hjalmarsson

Morfys said:
In fact, the example I gave was just to simplify things so that my
question could be clear.

Then you didn't succeed very well. :)

Morfys said:
In my "real" regular expression, I need to parse strings like
"open(X,X)"
and
"open(X,X,X)"
and so the conditional seems necessary.

my $str = 'open(4,5,6)';
if ( my ($digits) = $str =~ /open\((\d(?:,\d)*)\)/ ) {
    my @result = split /,/, $digits;
    print "@result\n";
}
 
Mirco Wahab

Morfys said:
In fact, the example I gave was just to simplify things so that my
question could be clear. In my "real" regular expression, I need to
parse strings like
"open(X,X)"
and
"open(X,X,X)"
and so the conditional seems necessary.

... a conditional that branches on what?

I could understand it if the intent were
to match a string like

"open(X,X,X), 3"
or
"open(X,X), 2"

but your example doesn't provide
ample motivation for introducing
conditionals.

Regards

M.
 
Mirco Wahab

Tad said:
Morfys said:
"open(X,X)"
and
"open(X,X,X)"
$_ = 'open(X,Y,Z)';
my @parts = /(\w+)(?=[,)])/g;

Just out of curiosity, what's the
purpose of these fancy parentheses
around \w+ ? Is there some hidden
mechanism behind using them this way?

Regards

M.
 
Gunnar Hjalmarsson

Tad said:
Morfys said:
I need to parse strings like
"open(X,X)"
and
"open(X,X,X)"

$_ = 'open(X,Y,Z)';
my @parts = /(\w+)(?=[,)])/g;

You don't need both capturing parentheses and the "(?= ... )" part, do you?

my @parts = /(\w+)[,)]/g;

or

my @parts = /\w+(?=[,)])/g;
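
A quick way to convince yourself that the three forms are interchangeable for this input is to run them side by side. This is only a sketch; the array names are made up for illustration.

```perl
use strict;
use warnings;

my $s = 'open(X,Y,Z)';

my @capture_and_lookahead = $s =~ /(\w+)(?=[,)])/g;  # Tad's original
my @capture_only          = $s =~ /(\w+)[,)]/g;      # delimiter consumed, not returned
my @lookahead_only        = $s =~ /\w+(?=[,)])/g;    # no groups: whole matches returned

print "@capture_and_lookahead\n";
print "@capture_only\n";
print "@lookahead_only\n";
```

In list context with /g, a pattern with capture groups returns the captures, and a pattern without any returns the whole matches, which is why all three end up with the same list.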
 
anno4000

Mirco Wahab said:
Tad said:
Morfys said:
"open(X,X)"
and
"open(X,X,X)"
$_ = 'open(X,Y,Z)';
my @parts = /(\w+)(?=[,)])/g;

Just out of curiosity, what's the
purpose of these fancy parentheses
around \w+ ? Is there some hidden
mechanism behind using them this way?

Hmm? What's fancy about them. They're normal capturing parentheses.

Anno
 
Mirco Wahab

Mirco Wahab said:
Tad said:
"open(X,X)"
and
"open(X,X,X)"
$_ = 'open(X,Y,Z)';
my @parts = /(\w+)(?=[,)])/g;
Just out of curiosity, what's the
purpose of these fancy parentheses
around \w+ ? Is there some hidden
mechanism behind using them this way?

Hmm? What's fancy about them. They're normal capturing parentheses.

I wouldn't consider them normal in said example,
hence my question: does the Perl regex
engine distinguish internally between

my @foo = /\w+(?= bar)/g;
and
my @foo = /(\w+)(?= bar)/g;

I can't answer this for myself,
that's the reason I asked Tad.

Regards

M.
 
anno4000

Mirco Wahab said:
Mirco Wahab said:
Tad McClellan wrote:
"open(X,X)"
and
"open(X,X,X)"
$_ = 'open(X,Y,Z)';
my @parts = /(\w+)(?=[,)])/g;
Just out of curiosity, what's the
purpose of these fancy parentheses
around \w+ ? Is there some hidden
mechanism behind using them this way?

Hmm? What's fancy about them. They're normal capturing parentheses.

I wouldn't consider normal in the said example,

I'm still not sure *why* you'd consider them abnormal. It's true that
/g makes them redundant, but they function as they always do.

Mirco Wahab said:
therefore my question: "does the perl regex
engine distinguish internally" between

my @foo = /\w+(?= bar)/g;
and
my @foo = /(\w+)(?= bar)/g;

I can't answer this for myself,
that's the reason I asked Tad.

I can't speak for Tad, but the parens make the capture more robust
while you change parts of the regex around.
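
One possible reading of "more robust", sketched here as an assumption rather than as Anno's own example: if the lookahead is later swapped for a consuming character class, the version without capturing parentheses starts returning the delimiters too, while the version with them keeps returning just the words.

```perl
use strict;
use warnings;

my $s = 'open(X,Y,Z)';

# The lookahead has been replaced by a consuming character class.
my @no_parens   = $s =~ /\w+[,)]/g;     # whole match, delimiter included
my @with_parens = $s =~ /(\w+)[,)]/g;   # only the captured word

print "@no_parens\n";
print "@with_parens\n";
```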

Anno
 
Mirco Wahab

Morfys said:
that's what I meant. ;)

OK, then there are some solutions available.
My first attempt would be something like:

....
my @strings = (
    "open(X,X,X), 1",
    "open(X,X,X), 3",
    "open(X,X,X), 31",
    "open(X,X), 2",
    "open(X,X), 23",
    "open(X,X), 32" );

my $n;
no strict 'refs';
my $rg = qr{\(
            (\w?)[,)]?
            (\w?)[,)]?
            (\w?)[,)]?
            (\w?)[,)]?
            ,\s+
            (??{ scalar(grep ${$_} ,1..@+) })
            $
           }x;

print "\n$_" for grep /$rg/, @strings
....

which is still simple and will work for up to four
"entities". A more comprehensive version
is left to you ;-)

The trick is to test each capture group (\w?)
for truth (in grep) and use the number
of groups that tested true as the content
(here "2" or "3") of the dynamic regex
part (??{ ... }), which then has to match the
number at the end of the string.
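
If the dynamic regex part feels too clever, the same check can be done in two plain steps: capture the argument list and the trailing number, then compare the counts in ordinary Perl. This is a different technique from the (??{ ... }) one above, offered only as a sketch; count_matches is a hypothetical helper name, and the sample strings are the ones from this post.

```perl
use strict;
use warnings;

my @strings = (
    'open(X,X,X), 1',
    'open(X,X,X), 3',
    'open(X,X,X), 31',
    'open(X,X), 2',
    'open(X,X), 23',
    'open(X,X), 32',
);

# Hypothetical helper: true when the trailing number equals the
# number of comma-separated items inside the parentheses.
sub count_matches {
    my ($line) = @_;
    my ($args, $n) = $line =~ /\((\w(?:,\w)*)\),\s+(\d+)$/
        or return 0;
    my @items = split /,/, $args;
    return @items == $n;
}

my @ok = grep { count_matches($_) } @strings;
print "$_\n" for @ok;
```

Unlike the four-group pattern, this version is not limited to four items, at the cost of doing the comparison outside the regex.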


Regards

M.
 
Bill H

I know this may not be the most elegant way of doing it, but, using
the examples given couldn't you just do this:

my @strings = (
"open(X,X,X), 1",
"open(X,X,X), 3",
"open(X,X,X), 31",
"open(X,X), 2",
"open(X,X), 23",
"open(X,X), 32" );

my @numpart;
foreach my $temp (@strings) {
    my @dbf = split /,\s*/, $temp;   # split on commas, dropping any space after them
    push @numpart, $dbf[-1];         # the last field is the number
}

And @numpart would hold all the numeric parts.

Bill H
 
