Iain Barnett
Hi,
I've some more regex questions. I wrote a pattern to check for valid regexes and inspect the parts (we all have our reasons for the things we do). It wasn't working, so I went down to simpler and simpler patterns, but I'm a bit surprised at the way Ruby 1.9 is handling the regexes. I tested the same pattern in Perl and it came out with the answers I'd expect.
Is this down to me using Perl regexes for so long, or is there something I'm missing about Ruby's implementation? It appears ^ at the beginning of a string doesn't bind as strongly as I'd expect.
I believe this test should fail, as <delim> should be bound to the beginning of the string by the ^, and the match result is a little bit crazy. Shouldn't the main capture be "d\\d" if it's following the logical route it's chosen?
$ ruby -e '
md = /^(?<mors>m)?(?<delim>.)(?<pat>.+?)\g<delim>/.match( %q!/\d\d\\d! )
puts md.inspect
'
#<MatchData "/\\d" mors:nil delim:"d" pat:"\\">
Here I add on a trailing slash to the string, and (I believe) it should bring me back what's between the / / :
$ ruby -e '
md = /^(?<mors>m)?(?<delim>.)(?<pat>.+?)\g<delim>/.match( %q!/\d\d\\d/! )
puts md.inspect
'
#<MatchData "/\\d" mors:nil delim:"d" pat:"\\">
Here's the first string in Perl 5.12:
$ perl -e '
if ( q(/\d\d\\d) =~ /^(?<mors>m)?(?<delim>.)(?<pat>.+?)\g{delim}/ ) {
  while ( my ($key, $value) = each(%+) ) {
    print "$key => $value\n";
  }
}
'
<nothing here, what I'd expect>
And here it is with the "valid" string:
$ perl -e '
if ( q(/\d\d\\d/) =~ /^(?<mors>m)?(?<delim>.)(?<pat>.+?)\g{delim}/ ) {
  while ( my ($key, $value) = each(%+) ) {
    print "$key => $value\n";
  }
}
'
pat => \d\d\d
delim => /
These are the answers I'd expect.
Even this seems unexpected to me: if I remove <mors>, then surely ^ should bind <delim> to the beginning?
$ ruby -e '
md = /^(?<delim>.)(?<pat>.+?)\g<delim>/.match( %q!/\d\d\\d/! )
puts md.inspect
'
#<MatchData "/\\d" delim:"d" pat:"\\">
These work as I'd expect once I add the end-of-line anchor $:
$ ruby -e '
md = /^(?<delim>.)(?<pat>.+?)\g<delim>$/.match( %q!/\d\d\\d/! )
puts md.inspect
'
#<MatchData "/\\d\\d\\d/" delim:"/" pat:"\\d\\d\\d">
$ ruby -e '
md = /^(?<mors>m)?(?<delim>.)(?<pat>.+?)\g<delim>$/.match( %q!/\d\d\\d/! )
puts md.inspect
'
#<MatchData "/\\d\\d\\d/" mors:nil delim:"/" pat:"\\d\\d\\d">
And finally, if I remove the caret but leave the $, I get the answer I'd expect (or that I'm looking for):
$ ruby -e '
md = /(?<mors>m)?(?<delim>.)(?<pat>.+?)\g<delim>$/.match( %q!/\d\d\\d/! )
puts md.inspect
'
#<MatchData "/\\d\\d\\d/" mors:nil delim:"/" pat:"\\d\\d\\d">
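One possible explanation, given as a hedged sketch and not as a confirmed diagnosis: in Ruby's regex engine, \g<name> is a subexpression call that re-runs the named group's pattern (here `.`), whereas Perl's \g{name} is a backreference. Ruby's named backreference is \k<name>. The constant and variable names below (CALL, BACKREF, unterminated, terminated) are illustrative only.

```ruby
# Assumption being illustrated: \g<delim> re-runs the subpattern of the group
# named delim (so it matches any character and re-captures it), while
# \k<delim> must match the same text that delim captured earlier.
CALL    = /^(?<mors>m)?(?<delim>.)(?<pat>.+?)\g<delim>/
BACKREF = /^(?<mors>m)?(?<delim>.)(?<pat>.+?)\k<delim>/

unterminated = %q!/\d\d\\d!   # no closing delimiter
terminated   = %q!/\d\d\\d/!  # closing delimiter present

p CALL.match(unterminated)     # matches; delim ends up as "d"
p BACKREF.match(unterminated)  # nil, since no second "/" exists
p BACKREF.match(terminated)    # delim is "/", pat is the text between the slashes
```

If that reading is right, the ^ anchor is behaving normally in the examples above; the trailing $ only appears to fix things because it forces the final "any character" to be the last one in the string.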
Regards,
Iain