Bug in sub?

Robert Feldt

Hi,

I assume it's been a long day and I'm missing something but line 3 below
sure looks like a bug to me:

$ irb
irb(main):001:0> "a".sub("a", "+")
=> "+"
irb(main):002:0> "a".sub("a", "\\*")
=> "\\*"
irb(main):003:0> "a".sub("a", "\\+")
=> ""
irb(main):004:0> exit

robert_feldt@it002473 /tmp/ruby
$ ruby -v
ruby 1.9.0 (2004-04-04) [i386-cygwin]

The same behavior is seen in 1.8.1 i386-cygwin and 1.8.1 i386-mswin32
and for gsub so it's been in there for some time.

What am I missing?

Regards,

Robert Feldt
 
ts

R> irb(main):003:0> "a".sub("a", "\\+")
R> => ""

svg% ruby -e 'p "abcdef".sub(/(.)./, "\\+")'
"acdef"
svg%


svg% ruby -e 'p "abcdef".sub(/(.).(..)/, "\\+")'
"cdef"
svg%



Guy Decoux
 
Robert Feldt

ts said:
R> irb(main):003:0> "a".sub("a", "\\+")
R> => ""

svg% ruby -e 'p "abcdef".sub(/(.)./, "\\+")'
"acdef"
svg%


svg% ruby -e 'p "abcdef".sub(/(.).(..)/, "\\+")'
"cdef"
svg%
Yeah, that's right, the "last matched group"; apparently I don't use that
one much...

Thanks Guy!

/Robert
 
Martin Hart

Yeah, that's right the "last matched group"; I don't use that one much
apparently...

Sorry, call me thick but I don't understand that :)

Did Guy confirm that it was a bug, or show that it is not a bug?

Thanks
Martin
 
Ara.T.Howard

Sorry, call me thick but I don't understand that :)

Did Guy confirm that it was a bug, or show that it is not a bug?

__not_a_bug__ # that's python for not a bug


svg% ruby -e 'p "abcdef".sub(/(.)./, "\\+")'
"acdef"
svg%


/(.)./ and '\+'

- match a char, followed by another char. remember the first char
- replace the entire match with the remembered first char

abcdef -> acdef
--
- -

svg% ruby -e 'p "abcdef".sub(/(.).(..)/, "\\+")'
"cdef"
svg%


/(.).(..)/ and '\+'

- match a char, followed by another char, followed by two more chars.
remember the last two chars (and the first)
- replace the entire match with the remembered last two chars (which is
the 'last matched group')

abcdef -> cdef
----
-- --
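
The two cases above can be reproduced side by side. This is only a small sketch: the helper name `replace_match` is made up for illustration, and the replacement strings are single-quoted so that only one level of escaping is involved.

```ruby
# Hypothetical helper (not part of Ruby): replace the first match of
# `pattern`, allowing back-references such as \1 or \+ in `replacement`.
def replace_match(string, pattern, replacement)
  string.sub(pattern, replacement)
end

# '\+' expands to the last group that took part in the match.
puts replace_match("abcdef", /(.)./, '\+')      # one group, so same as '\1'
puts replace_match("abcdef", /(.).(..)/, '\+')  # two groups, so same as '\2'
puts replace_match("abcdef", /(.).(..)/, '\1')  # first group, for contrast

# With no groups at all there is nothing to insert, so the match vanishes.
puts replace_match("a", "a", '\+').inspect
```

The last call is the one from the original report: the pattern "a" has no groups, so `\+` expands to nothing and the result is the empty string.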



-a
--
===============================================================================
| EMAIL :: Ara [dot] T [dot] Howard [at] noaa [dot] gov
| PHONE :: 303.497.6469
| ADDRESS :: E/GC2 325 Broadway, Boulder, CO 80305-3328
| URL :: http://www.ngdc.noaa.gov/stp/
| TRY :: for l in ruby perl;do $l -e "print \"\x3a\x2d\x29\x0a\"";done
===============================================================================
 
Robert Feldt

Ara.T.Howard said:
__not_a_bug__ # that's python for not a bug


svg% ruby -e 'p "abcdef".sub(/(.)./, "\\+")'
"acdef"
svg%


/(.)./ and '\+'

- match a char, followed by another char. remember the first char
- replace the entire match with the remembered first char

abcdef -> acdef
--
- -

svg% ruby -e 'p "abcdef".sub(/(.).(..)/, "\\+")'
"cdef"
svg%


/(.).(..)/ and '\+'

- match a char, followed by another char. followed by two more chars.
remember the last two chars (and the first)
- replace the entire match with the remembered last two chars (which is
the 'last matched group')

abcdef -> cdef
Yes, and further: if you want to do substitutions and aren't sure the
replacement string will be free of backslash sequences, you should use
the block form, like so:

$ ruby -e 'p "a".sub("a") {"\\+"}'
"\\+"

I added a note to my code review checklist to "always" use the block
form... ;)
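
For a checklist item like this, the block form can be wrapped once and reused. A minimal sketch, assuming a helper called `literal_sub` (a made-up name, not a Ruby method): the block's return value is inserted as-is, so backslash sequences in it are not interpreted.

```ruby
# Hypothetical helper: replace the first match of `pattern` with
# `replacement` taken literally, by going through the block form.
def literal_sub(string, pattern, replacement)
  string.sub(pattern) { replacement }
end

# Argument form: "\\+" (a backslash and a plus) is read as a back-reference.
p "a".sub("a", "\\+")

# Block form via the helper: the same two characters come through untouched.
p literal_sub("a", "a", "\\+")
```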

Sorry for wasting bandwidth on this; I should have read the docs more
closely.

/Robert
 
Mark Hubbart

I added a note to my code review checklist to "always" use the block
form... ;)

only one problem with that:

mark@imac% cat test.rb
require 'profile'
def gsub_block_test(string, pat, str)
  10000.times{ string.gsub(pat){str} }
end
def gsub_arg_test(string, pat, str)
  10000.times{ string.gsub(pat,str) }
end

gsub_arg_test("testing this out", /[aeiou]/, ".")
gsub_block_test("testing this out", /[aeiou]/, ".")


mark@imac% ruby test.rb
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 54.97    13.65     13.65    20000     0.68     0.68  String#gsub
 44.95    24.81     11.16        2  5580.00 12405.00  Integer#times
  0.48    24.93      0.12        1   120.00   120.00  Profiler__.start_profile
  0.00    24.93      0.00        2     0.00     0.00  Module#method_added
  0.00    24.93      0.00        1     0.00 16550.00  Object#gsub_block_test
  0.00    24.93      0.00        1     0.00  8260.00  Object#gsub_arg_test
  0.00    24.93      0.00        1     0.00 24830.00  #toplevel

the block version of this same test takes twice as long :)
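
The same comparison can be made without the profiler, which adds its own overhead to every call. A rough sketch, assuming two made-up helpers (`time_it` and `compare_gsub_forms`) and using the monotonic clock from core Ruby; the numbers will vary by machine and Ruby version, so none are quoted here.

```ruby
# Hypothetical helper: run the given block `iterations` times and
# return the elapsed wall-clock time in seconds.
def time_it(iterations)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  iterations.times { yield }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

# Hypothetical helper: time the argument form and the block form of gsub
# on the same input and return both timings.
def compare_gsub_forms(string, pattern, replacement, iterations = 10_000)
  {
    arg:   time_it(iterations) { string.gsub(pattern, replacement) },
    block: time_it(iterations) { string.gsub(pattern) { replacement } }
  }
end

timings = compare_gsub_forms("testing this out", /[aeiou]/, ".")
timings.each { |form, secs| puts format("%-6s %.4f s", form, secs) }
```

Both forms return the same string for a plain replacement like ".", so the only thing being compared is the cost of calling the block once per match.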

--Mark
 
Benedikt Huber

ts wrote:
...
Yeah, that's right the "last matched group"; I don't use that one much
apparently...
Perhaps I'm missing the point, but what semantics do consecutive
backslashes have in a replacement string?
It feels like a bug, but probably isn't:
irb(main):008:0> puts "abcd".sub('abcd',"\\").length
1
irb(main):014:0> puts "abcd".sub('abcd',"\\"*2).length
1
irb(main):009:0> puts "abcd".sub('abcd',"\\"*10).length
5

Thx a lot,
benedikt
 
Mark Hubbart

Perhaps I'm missing the point, but what semantics do consecutive
backslashes have in a replacement string?
It feels like a bug, but probably isn't:
irb(main):008:0> puts "abcd".sub('abcd',"\\").length
1
irb(main):014:0> puts "abcd".sub('abcd',"\\"*2).length
1
irb(main):009:0> puts "abcd".sub('abcd',"\\"*10).length
5

Thx a lot,
benedikt

normal strings are escaped thusly:

puts "\\","\\\\","\\\\\\"
\
\\
\\\
=> nil

replacement strings passed to (g)sub have an extra level of escaping, so
that you can include a literal "\1" in your substitution. So, they are
escaped thusly:

puts "".sub(//,"\\\\"), "".sub(//,"\\\\\\\\"),
"".sub(//,"\\\\\\\\\\\\")
\
\\
\\\
=> nil

The somewhat confusing part is that a backslash in a (g)sub replacement
string that doesn't start a substitution expression stays a literal
backslash. So, since there is nothing for the last backslash to escape:

puts "".sub(//,"\\"), "".sub(//,"\\\\\\"), "".sub(//,"\\\\\\\\\\")
\
\\
\\\
=> nil

... another reason why, as someone pointed out in a thread earlier this
month, it's handy to only use the block form, and avoid the argument
form of (g)sub like the plague :)
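
When the argument form has to be used anyway, the replacement can be escaped first. A minimal sketch, assuming two made-up helpers: `backslashes_after_sub` just counts what survives, and `escape_replacement` doubles every backslash so that (g)sub turns the string back into what was passed in.

```ruby
# Hypothetical helper: how many backslashes are left after using
# n backslashes as a replacement string.
def backslashes_after_sub(n)
  "".sub(//, "\\" * n).length
end

# Hypothetical helper: double every backslash. The block form is used
# here so the doubling itself is not subject to the escaping rules.
def escape_replacement(replacement)
  replacement.gsub("\\") { "\\\\" }
end

# Pairs collapse to one backslash; a trailing odd one is kept as-is.
(1..6).each { |n| puts "#{n} -> #{backslashes_after_sub(n)}" }

tricky = ("\\" * 3) + "1"   # three backslashes followed by "1"
puts("x".sub("x", escape_replacement(tricky)) == tricky)
```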

cheers,
--Mark
 
