A bug in Ruby regexp lib?

  • Thread starter Artūras Šlajus

Artūras Šlajus

ruby 1.8.7 (2008-08-11 patchlevel 72) [i486-linux]

x11@www:~$ irb
irb(main):001:0> s = "www.myspace.com/djmamania www.myspace.com/djmantini"
=> "www.myspace.com/djmamania www.myspace.com/djmantini"
irb(main):002:0> s1 = s.gsub(%r{(\s|^)(www\..*?)(\s|$)}m, '\1<a href="http://\2">\2</a>\3')
=> "<a href=\"http://www.myspace.com/djmamania\">www.myspace.com/djmamania</a> www.myspace.com/djmantini"
irb(main):003:0> s1.gsub(%r{(\s|^)(www\..*?)(\s|$)}m, '\1<a href="http://\2">\2</a>\3')
=> "<a href=\"http://www.myspace.com/djmamania\">www.myspace.com/djmamania</a> <a href=\"http://www.myspace.com/djmantini\">www.myspace.com/djmantini</a>"

Why do I have to call gsub twice for this to work? The same regexp works
fine in Firefox JS :)
 

Tim Greer

Artūras Šlajus said:
ruby 1.8.7 (2008-08-11 patchlevel 72) [i486-linux]

x11@www:~$ irb
irb(main):001:0> s = "www.myspace.com/djmamania www.myspace.com/djmantini"
=> "www.myspace.com/djmamania www.myspace.com/djmantini"
irb(main):002:0> s1 = s.gsub(%r{(\s|^)(www\..*?)(\s|$)}m, '\1<a href="http://\2">\2</a>\3')
=> "<a href=\"http://www.myspace.com/djmamania\">www.myspace.com/djmamania</a> www.myspace.com/djmantini"
irb(main):003:0> s1.gsub(%r{(\s|^)(www\..*?)(\s|$)}m, '\1<a href="http://\2">\2</a>\3')
=> "<a href=\"http://www.myspace.com/djmamania\">www.myspace.com/djmamania</a> <a href=\"http://www.myspace.com/djmantini\">www.myspace.com/djmantini</a>"

Why I have to call gsub two times for this to work? Same regexp works
fine in Firefox JS :)

Did you mean:

s1 = s.gsub(%r{(^|\s)?(www\..*?)(\s|$)}m, '\1<a href="http://\2">\2</a>\3')

irb(main):035:0> s1 = s.gsub(%r{(^|\s)?(www\..*?)(\s|$)}m, '\1<a href="http://\2">\2</a>\3')
=> "<a href=\"http://www.myspace.com/djmamania\">www.myspace.com/djmamania</a> <a href=\"http://www.myspace.com/djmantini\">www.myspace.com/djmantini</a>"

Note that \1 is (^|\s): either the start of the string (^) or a white
space between the two URLs (\s). But you also have \3, which is either
the end of the string ($) or white space between the URLs (or
following) (\s). Since there's only one white space between the two
URLs, the first match takes it as \3 and nothing is left for the second
match's \1, which throws it off.

To account for both \1 and \3, above I've made it optional, (^|\s)?,
because this will allow you to use \3 without it breaking. There
are other ways to do this, but just working with what you were using,
that's a change you could make to get the desired results on the first
call... unless I misunderstood what you were trying to do?
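
One of the "other ways" mentioned above can be sketched with a lookahead: the trailing delimiter is only checked, not consumed, so the single space between the URLs is still available as \1 for the next match. This is a sketch rather than the poster's method; the constant name LINK_RE is made up for illustration, and \3 is dropped from the replacement because the lookahead captures nothing.

```ruby
# Sketch: same pattern as in the thread, but the trailing (\s|$) group is
# turned into a lookahead (?=\s|$), which asserts the delimiter without
# consuming it. LINK_RE is a hypothetical name, not from the original post.
LINK_RE = %r{(\s|^)(www\..*?)(?=\s|$)}m

s = "www.myspace.com/djmamania www.myspace.com/djmantini"

# \1 re-inserts the leading whitespace (or nothing at the start of a line);
# the trailing whitespace was never part of the match, so it stays in place.
linked = s.gsub(LINK_RE, '\1<a href="http://\2">\2</a>')
puts linked
```

Unlike the optional (^|\s)? group, this keeps the requirement that a URL must be preceded by whitespace or a line start.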
 

Tim Greer

Geez, pardon the typos I've made above. Apparently I'm having trouble
working my keyboard (some of those "is" should be "it")
 

Artūras Šlajus

Tim said:
Note that \1 is (^|\s): either the start of the string (^) or a white
space between the two URLs (\s). But you also have \3, which is either
the end of the string ($) or white space between the URLs (or
following) (\s). Since there's only one white space between the two
URLs, the first match takes it as \3 and nothing is left for the second
match's \1, which throws it off.

To account for both \1 and \3, above I've made it optional, (^|\s)?,
because this will allow you to use \3 without it breaking. There
are other ways to do this, but just working with what you were using,
that's a change you could make to get the desired results on the first
call... unless I misunderstood what you were trying to do?

Ah, thank you. It seems that Ruby continues parsing the string after the
last \s it matched. But shouldn't \3 insert it right back? :)
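
On the \3 question: the replacement text goes into the result string only, while gsub keeps scanning the original string from the end of the previous match. The sketch below records where the one match starts and ends to make that visible. The variable names are illustrative, and the block form of gsub is used only to inspect the match.

```ruby
s  = "www.myspace.com/djmamania www.myspace.com/djmantini"
re = %r{(\s|^)(www\..*?)(\s|$)}m

# Collect [start, end) offsets of every match in the ORIGINAL string.
spans = []
s.gsub(re) do
  m = Regexp.last_match
  spans << [m.begin(0), m.end(0)]
  m[0] # return the matched text unchanged; only the offsets matter here
end

p spans

# Scanning resumes at the end of the first match, i.e. after the space.
# What remains starts with "www", with no \s or line start in front of it.
rest = s[spans[0][1]..-1]
p rest
```

Only one span is recorded, and it includes the space, so the second URL never gets a chance to satisfy (\s|^).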

Anyways, I have another problem then ;]
it "should link http links" do
  "http://www.myspace.com/djmamania".htmlize.should == \
    '<p><a href="http://www.myspace.com/djmamania">www.myspace.com/djmamania</a></p>'
end

2)
'String#htmlize should link http links' FAILED
expected: "<p><a href=\"http://www.myspace.com/djmamania\">www.myspace.com/djmamania</a></p>",
got: "<p>http://<a href=\"http://www.myspace.com/djmamania\">www.myspace.com/djmamania</a></p>"
(using ==)

What do you suggest?
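
One possible direction, as a sketch only: the implementation of String#htmlize is not shown in the thread, so the helper linkify and the constant URL_RE below are hypothetical. The idea is to let the pattern swallow an optional leading "http://" outside the captured group, so the scheme is not left in front of the generated link. It also assumes \S+ is acceptable in place of .*? and that only plain http:// needs handling.

```ruby
# Hypothetical helper; not the thread's actual htmlize code.
# (?:http://)? consumes an optional scheme without capturing it, so \2 is
# always the bare "www..." part. The trailing delimiter is a lookahead, so
# adjacent URLs separated by one space are both matched in a single pass.
URL_RE = %r{(\s|^)(?:http://)?(www\.\S+)(?=\s|$)}

def linkify(text)
  text.gsub(URL_RE, '\1<a href="http://\2">\2</a>')
end

puts "<p>#{linkify('http://www.myspace.com/djmamania')}</p>"
puts linkify("www.myspace.com/djmamania www.myspace.com/djmantini")
```

Because the leading (\s|^) is still required, a "www." in the middle of a word is not linked, which is the side effect the optional (^|\s)? group introduces.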
 
