Possible bug in regexp engine?

Greg Hurrell

Given the following regular expression:

/([^*#]+|#(?!#|\*)|\*(?!#))+/

I wanted to make it more readable by inserting some comments, so I
tried adding the "x" option and it no longer compiled:

/([^*#]+|#(?!#|\*)|\*(?!#))+/x

If you try it in irb you'll see a message similar to this:

SyntaxError: compile error
(irb):31: unmatched :) /([^*#]+|#(?!#|\*)|\*(?!#))+/

To get this to compile I had to add additional backslashes to escape
the '#' character in the negative lookahead subexpressions:

/([^*#]+|#(?!\#|\*)|\*(?!\#))+/x
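
For anyone who wants to reproduce this outside of irb, here is a minimal sketch. It builds the patterns with Regexp.new so that a failure surfaces as a rescuable RegexpError rather than a SyntaxError. The helper name try_extended is hypothetical, and note that this sketch escapes every '#' outside the character class (including the one right after the first '|'), which is what current Ruby versions need as far as I can tell.

```ruby
# try_extended is a hypothetical helper, not part of Ruby.
# It compiles the source in extended (/x) mode and returns either
# the Regexp or the RegexpError that was raised.
def try_extended(source)
  Regexp.new(source, Regexp::EXTENDED)
rescue RegexpError => e
  e
end

# Unescaped '#': under /x the rest of the line becomes a comment,
# so the closing parentheses are never seen.
unescaped = try_extended('([^*#]+|#(?!#|\*)|\*(?!#))+')

# Every '#' outside the character class escaped.
escaped = try_extended('([^*#]+|\#(?!\#|\*)|\*(?!\#))+')

puts unescaped.class
puts escaped.class
```

The '#' inside [^*#] is left alone because the character class is not affected by the "x" option.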

The '#' character normally matches itself in a regular expression.
With the "x" option I expect it to have a special meaning (indicating
a comment), but only in one special position (immediately after the
opening parenthesis and question mark):

(?# comment )

Is this a bug in the regular expression engine, undocumented behavior,
or am I missing something?

No big deal, the thing is compiling, but I'd like to understand this a
bit better.

Cheers,
Greg
 
Robert Klemme

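A "#" all by itself introduces a line comment as in normal code, so
with /x everything after it up to the end of the line is treated as a
comment. In your pattern that includes the closing parentheses, hence
the "unmatched" error. For example: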

15:43:52 [~]: ruby -e 'r=/foo#bar
baz/x; p r; p r =~ "foobaz"'
/foo#bar
baz/x
0

The regexp matches "foobaz". /x enables line comments, whereas (?#...) always works, with or without /x:

15:46:08 [~]: ruby -e 'r=/foo(?#bar)baz/; p r; p r =~ "foobaz"'
/foo(?#bar)baz/
0

Kind regards

robert
 
Greg Hurrell

A "#" all by itself introduces a line comment as in normal code. So
everything after it is treated as a comment

Thanks for the info, Robert.

Cheers,
Greg
 
