why Hash corrupts 'key' object ?

D

Dmitry Perfilyev

Hi, I have next script:
t.rb:
===========================
class TestStr < String
attr_accessor :dupstr
def initialize ( str )
@dupstr = str
super(str)
end
end
ts=TestStr.new("aaa")
puts ts.dupstr
h=Hash.new()
h[ts]=true
puts h.keys.first.class
puts h.keys.first
puts h.keys.first.dupstr
===========================

run it:

$ ruby t.rb
aaa
TestStr
aaa
nil
$

The question is - why there is 'nil' in the last line instead of "aaa" ?
 
M

Mike Gold

Dmitry said:
Hi, I have next script:
t.rb:
===========================
class TestStr < String
attr_accessor :dupstr
def initialize ( str )
@dupstr = str
super(str)
end
end
ts=TestStr.new("aaa")
puts ts.dupstr
h=Hash.new()
h[ts]=true
puts h.keys.first.class
puts h.keys.first
puts h.keys.first.dupstr
===========================

run it:

$ ruby t.rb
aaa
TestStr
aaa
nil
$

The question is - why there is 'nil' in the last line instead of "aaa" ?

It must be the T_STRING test in hash.c, an optimization, I assume.

VALUE
rb_hash_aset(hash, key, val)
VALUE hash, key, val;
{
rb_hash_modify(hash);
if (TYPE(key) != T_STRING || st_lookup(RHASH(hash)->tbl, key, 0)) {
st_insert(RHASH(hash)->tbl, key, val);
}
else {
st_add_direct(RHASH(hash)->tbl, rb_str_new4(key), val);
}
return val;
}

You can still use your code as is, with this adjustment:

require 'delegate'

class TestStr < DelegateClass(String)
# ...
end

Also consider delegation as a primary strategy, not just as a workaround
for this case.
 
D

Dmitry Perfilyev

Mike said:
It must be the T_STRING test in hash.c, an optimization, I assume.

VALUE
rb_hash_aset(hash, key, val)
VALUE hash, key, val;
{
rb_hash_modify(hash);
if (TYPE(key) != T_STRING || st_lookup(RHASH(hash)->tbl, key, 0)) {
st_insert(RHASH(hash)->tbl, key, val);
}
else {
st_add_direct(RHASH(hash)->tbl, rb_str_new4(key), val);
}
return val;
}

You can still use your code as is, with this adjustment:

require 'delegate'

class TestStr < DelegateClass(String)
# ...
end

Also consider delegation as a primary strategy, not just as a workaround
for this case.

Thanks! It works (although I had to implement hash() and eql?() methods
for TestStr)
BTW, why delegation is more primary than standard inheritance ?
 
R

Robert Klemme

2008/12/11 Dmitry Perfilyev said:
Hi, I have next script:
t.rb:
===========================
class TestStr < String
attr_accessor :dupstr
def initialize ( str )
@dupstr = str
super(str)
end
end
ts=TestStr.new("aaa")
puts ts.dupstr
h=Hash.new()
h[ts]=true
puts h.keys.first.class
puts h.keys.first
puts h.keys.first.dupstr
===========================

run it:

$ ruby t.rb
aaa
TestStr
aaa
nil
$

The question is - why there is 'nil' in the last line instead of "aaa" ?

Because Hash copies #dup an unfrozen String.

http://www.ruby-doc.org/core/classes/Hash.html#M002878

Try this:

class TestStr < String
attr_accessor :dupstr
def initialize ( str )
@dupstr = str
super(str)
end
end
ts=TestStr.new("aaa")
puts ts.dupstr, ts.object_id
h=Hash.new()
h[ts]=true
puts h.keys.first.class, h.keys.first, h.keys.first.object_id,
h.keys.first.dupstr
puts "----"
ts.freeze
puts ts.dupstr, ts.object_id
h=Hash.new()
h[ts]=true
puts h.keys.first.class, h.keys.first, h.keys.first.object_id,
h.keys.first.dupstr

It is generally a bad idea to inherit core classes. Delegation is
much better (see Mike's reply).

Cheers

robert
 
B

Brian Adkins

Robert Klemme said:
2008/12/11 Dmitry Perfilyev said:
Hi, I have next script:
t.rb:
===========================
class TestStr < String
attr_accessor :dupstr
def initialize ( str )
@dupstr = str
super(str)
end
end
ts=TestStr.new("aaa")
puts ts.dupstr
h=Hash.new()
h[ts]=true
puts h.keys.first.class
puts h.keys.first
puts h.keys.first.dupstr
===========================

run it:

$ ruby t.rb
aaa
TestStr
aaa
nil
$

The question is - why there is 'nil' in the last line instead of "aaa" ?

Because Hash copies #dup an unfrozen String.

Why would that cause this effect though? The docs say "a String passed
as a key will be duplicated and frozen", but duplicating and freezing
a TestStr doesn't nullify dupstr:

$ cat -n temp.rb
1 class TestStr < String
2 attr_accessor :dupstr
3 def initialize ( str )
4 @dupstr = str
5 super(str)
6 end
7 end
8 ts=TestStr.new("aaa")
9 puts ts.dupstr, ts.object_id
10 puts '---'
11 ts2 = ts.dup
12 ts2.freeze
13 puts ts2.dupstr, ts2.object_id
$ ruby temp.rb
aaa
81910
---
aaa
81880
http://www.ruby-doc.org/core/classes/Hash.html#M002878

Try this:

class TestStr < String
attr_accessor :dupstr
def initialize ( str )
@dupstr = str
super(str)
end
end
ts=TestStr.new("aaa")
puts ts.dupstr, ts.object_id
h=Hash.new()
h[ts]=true
puts h.keys.first.class, h.keys.first, h.keys.first.object_id,
h.keys.first.dupstr
puts "----"
ts.freeze
puts ts.dupstr, ts.object_id
h=Hash.new()
h[ts]=true
puts h.keys.first.class, h.keys.first, h.keys.first.object_id,
h.keys.first.dupstr

It is generally a bad idea to inherit core classes. Delegation is
much better (see Mike's reply).

Cheers

robert
 
M

Mike Gold

Brian said:
Why would that cause this effect though? The docs say "a String passed
as a key will be duplicated and frozen", but duplicating and freezing
a TestStr doesn't nullify dupstr:

The answer lies in the rb_hash_aset() function I quoted. rb_str_new4()
begins like this:

VALUE
rb_str_new4(orig)
VALUE orig;
{
VALUE klass, str;

if (OBJ_FROZEN(orig)) return orig;
/* ... */
}

Like Robert showed, if it's already frozen then the key is untouched (no
dup+freeze).

The docs could be made clearer: An _unfrozen_ String passed as a key
will be duplicated and frozen.
 
M

Mike Gold

Dmitry said:
BTW, why delegation is more primary than standard inheritance ?

Delegation carries a greater degree of decoupling and flexibility.

Sometimes it's useful to replace your car engine while you're cruising
down the highway. That's delegation.

With inheritance, the engine is welded onto the car body. You've
already tossed aside the possibility of hot-swapping. Perhaps that was
an arbitrary decision.

Delegation makes something new out of existing parts. Inheritance
grafts stuff onto existing parts.
 
B

Brian Adkins

Mike Gold said:
The answer lies in the rb_hash_aset() function I quoted. rb_str_new4()
begins like this:

VALUE
rb_str_new4(orig)
VALUE orig;
{
VALUE klass, str;

if (OBJ_FROZEN(orig)) return orig;
/* ... */
}

Like Robert showed, if it's already frozen then the key is untouched (no
dup+freeze).

The docs could be made clearer: An _unfrozen_ String passed as a key
will be duplicated and frozen.

That doesn't answer my question. I showed that dup+freeze did not
adversely affect TestStr (i.e. dupstr was still accessible), so
something else is going on.
 
M

Mike Gold

Brian said:
That doesn't answer my question. I showed that dup+freeze did not
adversely affect TestStr (i.e. dupstr was still accessible), so
something else is going on.

You passed in a frozen String, so Hash#[]= did not touch it. So it was
not adversely affected.

Could you write an assertion which you think should succeed but doesn't?
 
B

Brian Adkins

Mike Gold said:
Brian said:
That doesn't answer my question. I showed that dup+freeze did not
adversely affect TestStr (i.e. dupstr was still accessible), so
something else is going on.

You passed in a frozen String, so Hash#[]= did not touch it. So it was
not adversely affected.

No I didn't. In fact, I didn't even use Hash in the example. Maybe
you're referring to a different post. I'll repost below:

$ cat -n temp.rb
1 class TestStr < String
2 attr_accessor :dupstr
3 def initialize ( str )
4 @dupstr = str
5 super(str)
6 end
7 end
8 ts=TestStr.new("aaa")
9 puts ts.dupstr, ts.object_id
10 puts '---'
11 ts2 = ts.dup
12 ts2.freeze
13 puts ts2.dupstr, ts2.object_id
$ ruby temp.rb
aaa
81910
---
aaa
81880

What the above demonstrates is that dup'ing and freezing a TestStr
does not nilify TestStr#dupstr, so the C code for Hash seems to be
doing more than just dup'ing and freezing. Does that make sense?
 
M

Mike Gold

Brian said:
What the above demonstrates is that dup'ing and freezing a TestStr
does not nilify TestStr#dupstr, so the C code for Hash seems to be
doing more than just dup'ing and freezing. Does that make sense?

Yes, sorry, I misunderstood. I did re-read your post several times but
was still a little puzzled.

You are right that something appears to be amiss. rb_str_new4()
duplicates the String portion of the TestStr object, but does not copy
over the TestStr-specific data. A subclassed String is not duplicated;
the documentation is wrong.

I see the reason for it: the raw string data is being shared between the
object and its "dup". I guess sharing should only be done if it's a
real String and not a subclass thereof. This fixes the problem reported
in this thread.

--- a/ruby-1.8.7-p72/hash.c 2008-06-08 14:25:01.000000000 -0400
+++ b/ruby-1.8.7-p72/hash.c 2008-12-13 13:47:33.000000000 -0500
@@ -986,7 +986,8 @@
VALUE hash, key, val;
{
rb_hash_modify(hash);
- if (TYPE(key) != T_STRING || st_lookup(RHASH(hash)->tbl, key, 0)) {
+ if (rb_obj_class(key) != rb_cString ||
+ st_lookup(RHASH(hash)->tbl, key, 0)) {
st_insert(RHASH(hash)->tbl, key, val);
}
else {
 
B

Brian Adkins

Mike Gold said:
Yes, sorry, I misunderstood. I did re-read your post several times but
was still a little puzzled.

You are right that something appears to be amiss. rb_str_new4()
duplicates the String portion of the TestStr object, but does not copy
over the TestStr-specific data. A subclassed String is not duplicated;
the documentation is wrong.

I see the reason for it: the raw string data is being shared between the
object and its "dup". I guess sharing should only be done if it's a
real String and not a subclass thereof. This fixes the problem reported
in this thread.

--- a/ruby-1.8.7-p72/hash.c 2008-06-08 14:25:01.000000000 -0400
+++ b/ruby-1.8.7-p72/hash.c 2008-12-13 13:47:33.000000000 -0500
@@ -986,7 +986,8 @@
VALUE hash, key, val;
{
rb_hash_modify(hash);
- if (TYPE(key) != T_STRING || st_lookup(RHASH(hash)->tbl, key, 0)) {
+ if (rb_obj_class(key) != rb_cString ||
+ st_lookup(RHASH(hash)->tbl, key, 0)) {
st_insert(RHASH(hash)->tbl, key, val);
}
else {

Wow, complete with a patch! Have you passed this on to the appropriate
channel for possible inclusion ?
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,995
Messages
2,570,236
Members
46,822
Latest member
israfaceZa

Latest Threads

Top