Regexp backtracking issues in 1.8

  • Thread starter Vasile Gaburici
  • Start date
V

Vasile Gaburici

Basically Ruby 1.8 fails to backtrack correctly a number of regular
expressions. The simplest one is probably:

irb(main):036:0> /a(ba|.)*?a/.match('axba')
=> nil

There is more to it, and since I had to write it down for my students,
I'm just going to link to the gory details:
http://www.cs.umd.edu/~gaburici/rubyre.html
 
R

Robert Dober

Basically Ruby 1.8 fails to backtrack correctly a number of regular
expressions. The simplest one is probably:

irb(main):036:0> /a(ba|.)*?a/.match('axba')
=> nil

There is more to it, and since I had to write it down for my students,
I'm just going to link to the gory details:
http://www.cs.umd.edu/~gaburici/rubyre.html
I just followed the link to find this jewel:

Ruby 1.8 has a rather buggy regexp engine. Some constructs using
alternation followed by a Kleene star operator will not behave as
expected, or as they do in Python or Perl.

very funny I think

Cheers
Robert
 
W

Wolfgang Nádasi-Donner

Vasile said:
Basically Ruby 1.8 fails to backtrack correctly a number of regular
expressions. The simplest one is probably:

irb(main):036:0> /a(ba|.)*?a/.match('axba')
=> nil

There is more to it, and since I had to write it down for my students,
I'm just going to link to the gory details:
http://www.cs.umd.edu/~gaburici/rubyre.html

It is an Error in Ruby 1.8. In Ruby 1.9 (with Oniguruma) it works correct
("ruby-445" is a special naming of myself for some experiments):

C:\Dokumente und Einstellungen\wolfgang\Desktop>type regexerr.rb
p "axba".match(/a(ba|.)*?a/)
C:\Dokumente und Einstellungen\wolfgang\Desktop>ruby -v
ruby 1.8.5 (2006-08-25) [i386-mswin32]

C:\Dokumente und Einstellungen\wolfgang\Desktop>ruby regexerr.rb
nil

C:\Dokumente und Einstellungen\wolfgang\Desktop>ruby-445 regexerr.rb
#<MatchData:0x2ac7ab8>

C:\Dokumente und Einstellungen\wolfgang\Desktop>ruby-445 -v
ruby 1.9.0 (2006-10-28) [i386-mingw32]

Wolfgang Nádasi-Donner
 
N

Nikolas Coukouma

Vasile said:
Basically Ruby 1.8 fails to backtrack correctly a number of regular
expressions. The simplest one is probably:

irb(main):036:0> /a(ba|.)*?a/.match('axba')
=> nil

There is more to it, and since I had to write it down for my students,
I'm just going to link to the gory details:
http://www.cs.umd.edu/~gaburici/rubyre.html

I've posted a lot of very short test cases with varying quantifiers. The
results are interesting:
http://nikolasco.livejournal.com/374793.html

Regards,
-Nikolas
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
474,236
Messages
2,571,185
Members
47,820
Latest member
HortenseKo

Latest Threads

Top