unintuitive language feature (exclamation functions)

Nick Brown

I was surprised to discover that the code

astring.sub!(/hi/, 'bye')

behaves subtly differently from

astring = astring.sub(/hi/, 'bye')

Intuitively, to me, these should be identical. Perhaps the documentation
should make mention of this difference? A note about this unexpected
behavior would have saved me a lot of frustration, and would likely do
the same for many others new to Ruby.

To be honest, I'm still trying to find out exactly why these do
different things. The difference does not manifest itself with trivial
cases in irb; rather it shows up when I'm getting a string from cgi,
modifying it, then inserting it into a database. When using sub!, the
database ends up containing the pre-sub'd value of astring, even though
astring appears to contain the modified version when printed with a
debug statement immediately preceding my database insert.

I'm willing to accept the criticism that my intuition is perverse in
some way, but when I started writing in Ruby I was really hoping it
would be a language one could use without having to understand how the C
underneath it all worked (defeating part of the purpose of "high level"
languages).

So what do you think? Would warnings in the documentation on exclamation
functions be useful or pointless?
 
F. Senault

On 20 August 2008 at 21:45, Nick Brown wrote:
When using sub!, the
database ends up containing the pre-sub'd value of astring, even though
astring appears to contain the modified version when printed with a
debug statement immediately preceding my database insert.

Please provide some code to demonstrate this. I'm willing to bet
there's another, subtler, step that's misleading you.

Fred
 
Gregory Brown

I was surprised to discover that the code

astring.sub!(/hi/, 'bye')

behaves subtly differently from

astring = astring.sub(/hi/, 'bye')

Intuitively, to me, these should be identical. Perhaps the documentation
should make mention of this difference? A note about this unexpected
behavior would have saved me a lot of frustration, and would likely do
the same for many others new to Ruby.

If the two were identical, why would we have both sub and sub! methods?
The extra punctuation would be useless if it existed 'just for fun'.

To be honest, I'm still trying to find out exactly why these do
different things. The difference does not manifest itself with trivial
cases in irb; rather it shows up when I'm getting a string from cgi,
modifying it, then inserting it into a database. When using sub!, the
database ends up containing the pre-sub'd value of astring, even though
astring appears to contain the modified version when printed with a
debug statement immediately preceding my database insert.

The documentation for String#sub! is:

"Performs the substitutions of String#sub in place, returning str, or
nil if no substitutions were performed."

I'm willing to accept the criticism that my intuition is perverse in
some way, but when I started writing in Ruby I was really hoping it
would be a language one could use without having to understand how the C
underneath it all worked (defeating part of the purpose of "high level"
languages).

This has nothing to do with C. It has to do with interface design,
and is meant to make things more intuitive, not less.
Admittedly there is nothing inherently intuitive about some_method!,
except that it might make you feel like you should pay more attention,
like... Caution!

Once learned, this convention can be very helpful.

So what do you think? Would warnings in the documentation on exclamation
functions be useful or pointless?

Since exclamation points are only a convention and Ruby itself does not
enforce any behavior for them, all ! methods should come with their own
documentation.
A ! does not necessarily mean 'modify the receiver in place', so
further explanation is usually needed. Just remember that when you
see foo and foo!, the latter is the one that the developer of the
library you are using has flagged as requiring more attention, or as
being more specialized.
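
To make that concrete, here is a minimal sketch of a foo/foo! pair where the bang has nothing to do with in-place modification. The Inventory class and its take/take! methods are made up for this example and do not come from any library; the only point is that the ! version is the one that needs more care, because it raises instead of returning nil.

```ruby
# Hypothetical class, invented purely for illustration.
class Inventory
  def initialize(stock)
    @stock = stock
  end

  # Plain version: signals failure by returning nil.
  def take(item)
    return nil unless @stock.fetch(item, 0) > 0
    @stock[item] -= 1
    item
  end

  # Bang version: same operation, but raises when nothing is left,
  # so the caller has to be prepared to rescue.
  def take!(item)
    take(item) or raise ArgumentError, "out of stock: #{item}"
  end
end

inventory = Inventory.new("apple" => 1)
first  = inventory.take("apple")   # succeeds
second = inventory.take("apple")   # nothing left, returns nil

raised = begin
  inventory.take!("apple")         # nothing left, raises
  false
rescue ArgumentError
  true
end
```

Both methods change the receiver here, so the ! is not telling you about mutation at all; only the documentation of the pair can tell you what the difference is.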

If you're still not convinced, I recommend checking out a post by
David Black on this topic, as it clearly explains the value of the
convention:
http://dablog.rubypal.com/2007/8/15/bang-methods-or-danger-will-rubyist

-greg
 
Stefano Crocco

I was surprised to discover that the code

astring.sub!(/hi/, 'bye')

behaves subtly differently from

astring = astring.sub(/hi/, 'bye')

Intuitively, to me, these should be identical. Perhaps the documentation
should make mention of this difference? A note about this unexpected
behavior would have saved me a lot of frustration, and would likely do
the same for many others new to Ruby.

To be honest, I'm still trying to find out exactly why these do
different things. The difference does not manifest itself with trivial
cases in irb; rather it shows up when I'm getting a string from cgi,
modifying it, then inserting it into a database. When using sub!, the
database ends up containing the pre-sub'd value of astring, even though
astring appears to contain the modified version when printed with a
debug statement immediately preceding my database insert.

I'm willing to accept the criticism that my intuition is perverse in
some way, but when I started writing in Ruby I was really hoping it
would be a language one could use without having to understand how the C
underneath it all worked (defeating part of the purpose of "high level"
languages).

So what do you think? Would warnings in the documentation on exclamation
functions be useful or pointless?

Unless I misunderstood you, you're asking why two different methods
(String#sub and String#sub!) work differently. The answer is simple: because
they're different. It's like asking why String#upcase and String#downcase work
differently.

The documentation does mention this difference:

ri String#sub gives:

------------------------------------------------------------- String#sub
str.sub(pattern, replacement) => new_str
str.sub(pattern) {|match| block } => new_str
------------------------------------------------------------------------
Returns a copy of _str_ with the _first_ occurrence of _pattern_
replaced with either _replacement_ or the value of the block. [...]

while ri String#sub! gives:

------------------------------------------------------------ String#sub!
str.sub!(pattern, replacement) => str or nil
str.sub!(pattern) {|match| block } => str or nil
------------------------------------------------------------------------
Performs the substitutions of +String#sub+ in place, returning
_str_, or +nil+ if no substitutions were performed.

You don't need to know about the C implementation of class String, of
String#sub or of String#sub! to understand how these methods work. The
documentation says that sub returns a copy of the string with the replacement
done, which means a different object, which has nothing to do with the
original. In the case of sub!, instead, the substitution is done in place,
that is, the receiver itself (str) is modified, not a copy of it.

As for the fact that the difference doesn't show in irb, this is not true.
Look at this:

irb(main):001:0> str = "this is a test string"
=> "this is a test string"
irb(main):002:0> str1 = str.sub "h", "H"
=> "tHis is a test string"
irb(main):003:0> str
=> "this is a test string"

The above lines show that str is not changed by sub.

irb(main):004:0> str.sub "k", "K"
=> "this is a test string"
irb(main):005:0> str.sub! "k", "K"
=> nil

This shows the different behavior concerning the return value when there's
nothing to replace: sub returns a copy of the string without modifications,
while sub! returns nil.

irb(main):006:0> str.sub! "a", "A"
=> "this is A test string"
irb(main):007:0> str
=> "this is A test string"
irb(main):008:0>

Here you can see that sub!, unlike sub, changes the original string.

In short, here's the difference between sub and sub!:
* sub creates a new string which has the same contents as the original one
but is independent of it, then replaces the pattern with the replacement text
in the copy. The original is not altered in any way. It always returns the
copy, and you can see whether a replacement has been made by comparing the
original and the copy.
* sub! performs the replacement on the string itself, thus changing it.
Obviously, you can't compare the 'new' and the 'original' string to see
whether a replacement has been made (since there's no 'new string' and the
original has been changed), so you have to look at the return value: if it is
nil, nothing has been changed; if it is the string itself then a replacement
has been made.
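
The two points above can be sketched in a short script. This is only an illustration with made-up variable names; the literal is dup'ed so that it is safely mutable on Ruby versions that freeze or warn about string literals.

```ruby
original = "this is a test string".dup

# sub returns a new string and leaves the receiver alone.
copy = original.sub("h", "H")
untouched = (original == "this is a test string")

# sub! changes the receiver and returns it, or nil when nothing matched.
hit  = original.sub!("a", "A")
miss = original.sub!("k", "K")

# Because of the nil return, chaining on sub! is unsafe:
#   original.sub!("k", "K").upcase   # would raise NoMethodError on nil
# Branch on the return value instead.
message = original.sub!("test", "TEST") ? "replaced" : "no match"
```

So the return value of sub! is best treated as a "did anything change?" flag rather than as the result to keep working with.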

I hope this helps

Stefano
 
Nick Brown

F. Senault said:
Please provide some code to demonstrate this.

#!/usr/bin/env ruby

require 'sqlite3'
db = SQLite3::Database.new('test.sqlite')
db.execute('drop table if exists example') # clean up in case of multiple runs
db.execute('create table example (aval)')

require 'cgi'
cgi = CGI.new('html4')

a = cgi['a']

a.sub!(/hi/, 'bye')
# to see expected behavior, replace the above with: a = a.sub(/hi/, 'bye')

puts "Inserting value a=#{a} into the database.\n"
sql = "insert into example (aval) values (?)"
db.execute(sql, a)

sql = "select aval from example"
val = db.get_first_value(sql)
puts "What was actually inserted into the database: #{val}\n"


########---------- end of code

To run this, type "a=hi"[enter][ctrl-d] to simulate the behavior of a
cgi session. You will get the output:

Inserting value a=bye into the database.
What was actually inserted into the database: hi

Since other responders seem to think I expect sub to behave the same as
sub!, I don't. I expect str.sub! to modify str, and I expect str.sub to
return a modified copy of the str. This is not the same behavior.
 
Stefano Crocco

F. Senault said:
Please provide some code to demonstrate this.

#!/usr/bin/env ruby

require 'sqlite3'
db = SQLite3::Database.new('test.sqlite')
db.execute ('drop table if exists example') # clean up incase of multiple runs
db.execute('create table example (aval)')

require 'cgi'
cgi = CGI.new('html4')

a = cgi['a']

a.sub!(/hi/, 'bye')
# to see expected behavior, replace the above with: a = a.sub(/hi/, 'bye')

puts "Inserting value a=#{a} into the database.\n"
sql = "insert into example (aval) values (?)"
db.execute(sql, a)

sql = "select aval from example"
val = db.get_first_value(sql)
puts "What was actually inserted into the database: #{val}\n"


########---------- end of code

To run this, type "a=hi"[enter][ctrl-d] to simulate the behavior of a
cgi session. You will get the output:

Inserting value a=bye into the database.
What was actually inserted into the database: hi

Since other responders seem to think I expect sub to behave the same as
sub!, I don't. I expect str.sub! to modify str, and I expect str.sub to
return a modified copy of the str. This is not the same behavior.

If you had stated more clearly in your first post what you expected and
what you got instead, we wouldn't have misunderstood your needs. After all,
the only (or, at least, main) difference between sub and sub! is the one I
spoke of in my other answer. However, I can't try your code, as I don't have
the sqlite gem/library. Would you please post what you get using sub and what
you get using sub!?

The line

puts "Inserting value a=#{a} into the database.\n"

displays the correct value (a=bye). If I understand you correctly, the
surprising behavior comes from inserting it in the database. Posting what you
get from the other puts will also enable those who don't have sqlite to help
you.

(By the way, you don't need to put the \n at the end of the string with puts).

Stefano
 
Patrick Li

I would agree with Stefano. It doesn't look like an issue with sub and
sub! to me.
I ran into something similar with my webapp. For me, it was because I
didn't call database.commit() after my update statement.
 
F. Senault

On 20 August 2008 at 22:25, Nick Brown wrote:

Don't ask me why (yet) but... replace

a = cgi['a']

with

a = cgi['a'].dup

and...

22:47 fred@balvenie:~/> ruby test.rb
(offline mode: enter name=value pairs on standard input)
a=hi
Inserting value a=bye into the database.
What was actually inserted into the database: bye

....

It seems that CGI does horrible, horrible things to its strings:

require 'cgi'
cgi = CGI.new('html4')

a = cgi['a'] #.dup
b = cgi['a'].dup

puts "A :"
a.sub!(/hi/, 'bye')
puts a.to_s
puts a.inspect
puts a.class

puts "B :"
b.sub!(/hi/, 'bye')
puts b.to_s
puts b.inspect
puts b.class

Gives:

22:53 fred@balvenie:~> ruby test.rb
(offline mode: enter name=value pairs on standard input)
a=hi
A :
hi
"bye"
String
B :
bye
"bye"
String

Ugh !
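
One possible reason the .dup version behaves: Object#dup does not carry over singleton methods or modules added with extend, while Object#clone does. The sketch below shows that with a made-up module (Decorated is a hypothetical stand-in, not whatever CGI actually mixes in), so it only illustrates the mechanism and does not prove that this is what happens inside CGI.

```ruby
# Hypothetical stand-in for a module mixed into a single string.
module Decorated
  def first
    self
  end
end

value = "hi".dup
value.extend(Decorated)

copied = value.dup     # a plain String again: the extension is not copied
cloned = value.clone   # clone keeps the singleton class, so it stays extended

copied_has_first = copied.respond_to?(:first)
cloned_has_first = cloned.respond_to?(:first)
```

If CGI's strings get their odd behavior from an extend, then calling .dup would hand back an ordinary String with the same contents, which matches the B output above.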

Fred
 
brabuhr

F. Senault said:
Please provide some code to demonstrate this.

a = cgi['a']

Internal to the CGI object, it appears that "a" in the @params hash is
an array of strings not a string:

irb(main):007:0> cgi = CGI.new('html4')
(offline mode: enter name=value pairs on standard input)
a=foohibyebar

a.sub!(/hi/, 'bye')
# to see expected behavior, replace the above with: a = a.sub(/hi/,
'bye')

To run this, type "a=hi"[enter][ctrl-d] to simulate the behavior of a
cgi session. You will get the output:

Inserting value a=bye into the database.
What was actually inserted into the database: hi

#!/usr/bin/env ruby

require 'rubygems'
require 'sqlite3'

db = SQLite3::Database.new('test.sqlite')
db.execute ('drop table if exists example') # clean up incase of multiple runs
db.execute('create table example (aval)')

require 'cgi'
cgi = CGI.new('html4')

a = cgi['a'][0]

a.sub!(/hi/, 'bye')
# to see expected behavior, replace the above with: a = a.sub(/hi/, 'bye')

puts "Inserting value a=#{a} into the database.\n"
sql = "insert into example (aval) values (?)"
db.execute(sql, a)

sql = "select aval from example"
val = db.get_first_value(sql)
puts "What was actually inserted into the database: #{val}\n"

ruby z.rb
z.rb:7: warning: don't put space before argument parentheses
(offline mode: enter name=value pairs on standard input)
a=foohibyebar
z.rb:13:CAUTION! cgi['key'] == cgi.params['key'][0]; if want Array,
use cgi.params['key']
Inserting value a=foobyebyebar into the database.
What was actually inserted into the database: foobyebyebar
 
Nick Brown

I'm starting to wonder if this is actually a bug in Ruby? The
documentation of sub! says it should modify the string in place.

The code I posted does something different. After the a.sub!, executing
"puts #{a}" outputs the modified version of a, but inserting that *exact
same* string object into a database puts an UNMODIFIED version of the
string into the DB. It's as if db.execute looks back in time to before
the sub! when it gets the value of a. Something unexplained is going on
here (unless the database module includes a time machine).
 
brabuhr

F. Senault said:
Please provide some code to demonstrate this.

a = cgi['a']

Internal to the CGI object, it appears that "a" in the @params hash is
an array of strings not a string:

irb(main):007:0> cgi = CGI.new('html4')
(offline mode: enter name=value pairs on standard input)
a=foohibyebar
=> #<CGI:0xb7c998cc @params={"a"=>["foohibyebar"]}, @multipart=false,
@output_cookies=nil, @output_hidden=nil, @cookies={}>

ruby z.rb
z.rb:7: warning: don't put space before argument parentheses
(offline mode: enter name=value pairs on standard input)
a=foohibybar
a=zizzlesticks
z.rb:13:CAUTION! cgi['key'] == cgi.params['key'][0]; if want Array,
use cgi.params['key']
Inserting value a=foobyebybar into the database.
What was actually inserted into the database: foobyebybar
 
Nick Brown

So the problem doesn't seem to be with sub! at all. It's with cgi.

If I get the variable "a" using the cgi code above, then I create string
"b":

a = cgi['a']
b = String.new

a.class
=> String
b.class
=> String

So they should have the same methods since they are the same class.
However, "a" seems to have extra methods.
a.first
=> "hi"
b.first
NoMethodError: undefined method `first' for "":String


It seems the cgi object is returning strings that aren't really strings.
If that's the case, isn't it a bug that cgi['a'].class returns "String"
when it is really something else?
 
brabuhr

"puts #{a}" outputs the modified version of a, but inserting that *exact
same* string object into a database puts an UNMODIFIED version of the
string into the DB. It's as if db.execute looks back in time to before
the sub! when it gets the value of a. Something unexplained is going on
here (unless the database module includes a time machine).

#!/usr/bin/env ruby

require 'rubygems'
require 'sqlite3'

module SQLite3
class Statement
def bind_params( *bind_vars )
index = 1
p self.class, "bind_params()"
p bind_vars
p *bind_vars
bind_vars.flatten.each do |var|
p var
if Hash === var
var.each { |key, val| bind_param key, val }
else
bind_param index, var
index += 1
end
end
end
end
end

db = SQLite3::Database.new('test.sqlite')
db.execute ('drop table if exists example') # clean up incase of multiple runs
db.execute('create table example (aval)')

require 'cgi'
cgi = CGI.new('html4')

a = cgi['a']

a.sub!(/hi/, 'bye')
# to see expected behavior, replace the above with: a = a.sub(/hi/, 'bye')

puts "Inserting value a=#{a} into the database.\n"
sql = "insert into example (aval) values (?)"
db.execute(sql, a)

sql = "select aval from example"
val = db.get_first_value(sql)
puts "What was actually inserted into the database: #{val}\n"


(offline mode: enter name=value pairs on standard input)
a=hi
Inserting value a=bye into the database.
SQLite3::Statement
"bind_params()"
["bye"]
"bye"
"hi"
What was actually inserted into the database: hi
 
brabuhr

p self.class, "bind_params()"
p bind_vars
p *bind_vars
bind_vars.flatten.each do |var|
p var

irb(main)> cgi = CGI.new('html4')
(offline mode: enter name=value pairs on standard input)
a=hi
=> #<CGI:0xb7c367b8 @params={"a"=>["hi"]}, @multipart=false,
@output_cookies=nil, @output_hidden=nil, @cookies={}>

irb(main)> a = cgi['a']
=> "hi"

irb(main)> a.sub!(/hi/, 'bye')
=> "bye"

irb(main)> a
=> "bye"

irb(main)> [a]
=> ["bye"]

irb(main)> [a].flatten
=> ["hi"]
 
brabuhr

irb(main)> cgi = CGI.new('html4')
(offline mode: enter name=value pairs on standard input)
a=hi
=> #<CGI:0xb7c367b8 @params={"a"=>["hi"]}, @multipart=false,
@output_cookies=nil, @output_hidden=nil, @cookies={}>

irb(main)> a = cgi['a']
=> "hi"

irb(main)> a.sub!(/hi/, 'bye')
=> "bye"

irb(main)> a
=> "bye"

irb(main)> [a]
=> ["bye"]

irb(main)> [a].flatten
=> ["hi"]

irb(main):061:0> b = "hi"
=> "hi"
irb(main):062:0> b.sub!(/hi/, 'bye')
=> "bye"
irb(main):063:0> [b]
=> ["bye"]
irb(main):064:0> [b].flatten
=> ["bye"]
 
Stefano Crocco

So the problem doesn't seem to be with sub! at all. It's with cgi.

If I get the variable "a" using the cgi code above, then I create string
"b":

a = cgi['a']
b = String.new

a.class
=> String
b.class
=> String

So they should have the same methods since they are the same class.
However, "a" seems to have extra methods.
a.first
=> "hi"
b.first
NoMethodError: undefined method `first' for "":String


It seems the cgi object is returning strings that aren't really strings.
If that's the case, isn't it a bug that cgi['a'].class returns "String"
when it is really something else?

The strings returned by CGI#[] are extended by the CGI::QueryExtension::Value
module, an intentionally undocumented module defined in cgi.rb which adds
methods like to_ary, first and last and modifies others (like []). I don't
know whether this behavior is documented anywhere else, since I've never used
this library. This fact, however, doesn't explain (at least I don't think so)
the weird behaviors which have been reported in this thread.
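
For anyone who wants to see how an object can report String as its class and still have extra methods, here is a small sketch. QueryValue is a hypothetical module written for this example; the real one lives in cgi.rb and is not reproduced here.

```ruby
# Hypothetical module; not the actual code from cgi.rb.
module QueryValue
  def first
    self
  end
end

plain    = "hi".dup
extended = "hi".dup.extend(QueryValue)

same_class    = (plain.class == extended.class)  # both report String
extra_methods = extended.singleton_methods       # methods added to this one object
is_extended   = extended.is_a?(QueryValue)       # the module shows up via is_a?
```

So .class alone will not reveal the extension; singleton_methods or is_a? on the object itself is the place to look.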

Stefano
 
brabuhr

And just for comparison:

require 'cgi'
cgi = CGI.new('html4')

a = cgi['a']

a.sub!(/hi/, 'bye')

p a
p [a]
p [a].flatten

ruby:
(offline mode: enter name=value pairs on standard input)
a=hi
"bye"
["bye"]
["hi"]

jruby:
(offline mode: enter name=value pairs on standard input)
a=hi
"bye"
["bye"]
["hi"]

oh, and also:
irb(main):001:0> a = "hi"
=> "hi"
irb(main):002:0> a.sub!(/hi/, "bye")
=> "bye"
irb(main):003:0> a
=> "bye"
irb(main):004:0> [a]
=> ["bye"]
irb(main):005:0> [a].flatten
=> ["bye"]
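
A possible explanation for the [a].flatten result, sketched in plain Ruby without CGI. The module below is a hypothetical stand-in (ParamValue and set_params are names chosen for this example), and it assumes that the string handed out is a copy of the stored parameter and that its to_ary returns the stored parameter array rather than the modified string. Array#flatten asks each element for to_ary, so under those assumptions it would pick up the untouched original.

```ruby
# Hypothetical stand-in for the module mixed into the returned string.
# Assumption: to_ary hands back the stored parameter array.
module ParamValue
  def set_params(params)
    @params = params
  end

  def to_ary
    @params || [self]
  end
end

params = ["hi".dup]      # what the CGI-like object keeps internally
a = params[0].dup        # the caller receives an extended copy
a.extend(ParamValue)
a.set_params(params)

a.sub!(/hi/, 'bye')      # changes only the copy

wrapped   = [a]          # still holds the modified copy
flattened = [a].flatten  # flatten calls to_ary on the element
```

If that is what happens, anything that flattens its arguments (like the bind_params code above) would see the original value, while anything that uses the string directly would see the modified one.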
 
David A. Black

Hi --

So the problem doesn't seem to be with sub! at all. It's with cgi.

If I get the variable "a" using the cgi code above, then I create string
"b":

a = cgi['a']
b = String.new

a.class
=> String
b.class
=> String

So they should have the same methods since they are the same class.

Not necessarily. Objects do what they do. Classes are mainly a way to
launch objects into object-space, after which they may or may not
continue to behave the way they did when they were first created.

However, "a" seems to have extra methods.
a.first
=> "hi"
b.first
NoMethodError: undefined method `first' for "":String


It seems the cgi object is returning strings that aren't really strings.
If that's the case, isn't it a bug that cgi['a'].class returns "String"
when it is really something else?

No, as long as the API of the object is correctly documented.


David
 
Gregory Brown

So the problem doesn't seem to be with sub! at all. It's with cgi.

If I get the variable "a" using the cgi code above, then I create string
"b":

a = cgi['a']
b = String.new

a.class
=> String
b.class
=> String

So they should have the same methods since they are the same class.

a = "foo"
=> "foo"
def a.definitely_not
"The same"
end
=> nil
a.definitely_not
=> "The same"
a.class
=> String
b = "bar"
=> "bar"
b.definitely_not
NoMethodError: undefined method `definitely_not' for "bar":String
from (irb):7
from :0
b.class
=> String

-greg
 
Michael Morin

Nick said:
I was surprised to discover that the code

astring.sub!(/hi/, 'bye')

behaves subtly differently from

astring = astring.sub(/hi/, 'bye')

Intuitively, to me, these should be identical. Perhaps the documentation
should make mention of this difference? A note about this unexpected
behavior would have saved me a lot of frustration, and would likely do
the same for many others new to Ruby.

To be honest, I'm still trying to find out exactly why these do
different things. The difference does not manifest itself with trivial
cases in irb; rather it shows up when I'm getting a string from cgi,
modifying it, then inserting it into a database. When using sub!, the
database ends up containing the pre-sub'd value of astring, even though
astring appears to contain the modified version when printed with a
debug statement immediately preceding my database insert.

I'm willing to accept the criticism that my intuition is perverse in
some way, but when I started writing in Ruby I was really hoping it
would be a language one could use without having to understand how the C
underneath it all worked (defeating part of the purpose of "high level"
languages).

So what do you think? Would warnings in the documentation on exclamation
functions be useful or pointless?

This works. Doing it this way returns the actual string, not some
string extended by undocumented modules that's actually an array in
disguise or something like that.

require 'cgi'

cgi = CGI.new('html4')
a = cgi.params['a'][0]

a.sub!(/hi/, 'bye')
puts a

--
Michael Morin
Guide to Ruby
http://ruby.about.com/
Become an About.com Guide: beaguide.about.com
About.com is part of the New York Times Company
 
