Ruby Hash Keys and Related Questions

Terry Michaels

I'm still a bit new to Ruby, so humor me a bit. But I discovered today
(through trial and error) that not only can Strings, numbers, and
symbols be keys for hashes, but so can any object, or even a class name!
Ruby is the first language I've used in which I would even have thought
to try that, let alone have it actually work:


irb(main):001:0> hsh = {}
=> {}
irb(main):002:0> obj = Object.new
=> #<Object:0x7fa4b83ab1f0>
irb(main):003:0> obj2 = Object.new
=> #<Object:0x7fa4b83a7320>
irb(main):004:0> hsh[obj] = "blah"
=> "blah"
irb(main):005:0> hsh[obj2] = "ble"
=> "ble"
irb(main):006:0> puts hsh[obj2]
ble
=> nil
irb(main):007:0> puts hsh[obj1]
NameError: undefined local variable or method `obj1' for main:Object
from (irb):7
irb(main):008:0> puts hsh[obj]
blah
=> nil
irb(main):009:0> clone = obj
=> #<Object:0x7fa4b83ab1f0>
irb(main):010:0> puts hsh[clone]
blah
=> nil
irb(main):011:0> class Cl
irb(main):012:1> end
=> nil
irb(main):013:0> hsh[Cl] = "blo"
=> "blo"
irb(main):014:0> puts hsh[obj]
blah
=> nil
irb(main):015:0> puts hsh[Cl]
blo
=> nil
irb(main):016:0> class Cl2
irb(main):017:1> end
=> nil
irb(main):018:0> hsh[Cl2] = "blu"
=> "blu"
irb(main):019:0> puts hsh[Cl]
blo
=> nil
irb(main):020:0> puts hsh[Cl2]
blu

Anyway, this raised a few related questions in my mind:

1. If the "key" taken by hash[key]= can be any object, and the key still
works even after it is aliased to another variable, does that mean that
the "key" is just a reference?

2. If I pass in a number, say an Integer, as a key, does Ruby actually
use the Integer? Or does it use a reference to an Integer object?
(Numbers are objects too, right?)

3. If I am allowed to pass in a class as a key, does that mean that
classes are objects too? If not, what exactly is being stored as the
key?

4. When I use irb, and a line returns an object, irb shows me the
object's hexadecimal reference address (or at least, that's what it
looks like). Is there a method one can call on an object to get that
reference when one is not in irb? Just curious.
 
Justin Collins

1. If the "key" taken by hash[key]= can be any object, and the key still
works even after it is aliased to another variable, does that mean that
the "key" is just a reference?

2. If I pass in a number, say an Integer, as a key, does Ruby actually
use the Integer? Or does it use a reference to an Integer object?
(Numbers are objects too, right?)

I am not sure the answers to these questions really "matter." As long as
you put in the same key, the same value will come out. How the key is
hashed to an index is a separate issue, and is handled differently
depending on the key's class. Any object can define "hash" and "eql?"
methods to control how it behaves as a hash key. Try calling "hash" on a
few objects to see.
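To make the "hash" and "eql?" point concrete, here is a minimal sketch. The Point class and its fields are hypothetical, made up purely for illustration. It shows two separate instances acting as the same key because they agree on both methods.

```ruby
# Hypothetical value class: two Points with the same coordinates
# should behave as the same hash key.
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end

  # Hash calls eql? to confirm that a candidate key really matches.
  def eql?(other)
    other.is_a?(Point) && x == other.x && y == other.y
  end
  alias_method :==, :eql?

  # Keys that are eql? must return the same hash value.
  def hash
    [x, y].hash
  end
end

places = { Point.new(1, 2) => "home" }
puts places[Point.new(1, 2)].inspect   # "home" (different instance, same key)
puts places[Point.new(3, 4)].inspect   # nil
```

Without the two overrides, Point would fall back to the default identity-based behaviour and the first lookup would return nil.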

3. If I am allowed to pass in a class as a key, does that mean that
classes are objects too? If not, what exactly is being stored as the
key?

Yes, classes are objects too.

4. When I use irb, and a line returns an object, irb shows me the
object's hexadecimal reference address (or at least, that's what it
looks like). Is there a method one can call on an object to get that
reference when one is not in irb? Just curious.

Not that I know of. You can get the same output as irb by calling the
"inspect" method on an object. Ruby documentation claims that the number
shown is an encoded version of the object id, but the code for
Object#to_s shows:

VALUE
rb_any_to_s(VALUE obj)
{
    const char *cname = rb_obj_classname(obj);
    VALUE str;

    str = rb_sprintf("#<%s:%p>", cname, (void*)obj);
    OBJ_INFECT(str, obj);

    return str;
}

which suggests differently. But I'm not much good at reading C code.
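For what it's worth, the irb-style text is easy to get from a plain script. This is only a sketch: the hex digits differ from run to run and between Ruby versions, so it checks the shape of the string rather than any particular value.

```ruby
obj = Object.new

# inspect is what irb prints after "=>"; for a bare Object it looks
# like "#<Object:0x...>".
puts obj.inspect
puts obj.to_s

# Only the general shape is stable, so test that instead of the digits.
shape = /\A#<Object:0x[0-9a-f]+>\z/
puts obj.inspect.match?(shape)
```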

-Justin
 
Josh Cheek

1. If the "key" taken by hash[key]= can be any object, and the key still
works even after it is aliased to another variable, does that mean that
the "key" is just a reference?

The key can be any object that implements the methods "hash" and "eql?". I'm
not sure what you mean when you say "does that mean the 'key' is just a
reference?" If you are asking what Ruby is passing around, the answer is "a
pointer to the object". That is less interesting in this case. The more
interesting thing is why it behaves that way: objects have a hash method
defined on them which (at least in the Ruby I'm running here) returns their
object_id:

o = Object.new
o.hash # => 2154796
o.object_id # => 2154796

I don't know if you know how hashes are implemented, but internally they map
objects to array indexes. Ruby does this with the hash method, which returns
a number that is turned into an index. In your example with obj and obj2,
you can store different values under each. But they are both just empty
objects, so does it make sense to consider them two different keys or the
same key? Objects like these count as different keys because they have
different object ids. But think about Strings, where each literal creates a
different object:

a1 = 'a'
a2 = 'a'
a1.object_id # => 2153018
a2.object_id # => 2153004

Do you want to have to always keep track of which string you used as the
key? No. So the hash for a string is based on the string value, in this case
"a", rather than the specific instance of "a" that was used to put it into
the hash.

a1.hash # => 14815807
a2.hash # => 14815807
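Putting those pieces together, a short sketch of what this means for lookups. String.new is used here only to guarantee two distinct, unfrozen objects regardless of any frozen-string-literal settings.

```ruby
a1 = String.new("a")
a2 = String.new("a")

puts a1.equal?(a2)        # false: two separate objects
puts a1.eql?(a2)          # true: same contents
puts a1.hash == a2.hash   # true: String#hash depends on the contents

h = { a1 => "found" }
puts h[a2]                # found (a2 was never stored, but it matches a1)
```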


2. If I pass in a number, say an Integer, as a key, does Ruby actually
use the Integer? Or does it use a reference to an Integer object?
(Numbers are objects too, right?)

There have been long discussions about this. Caleb Clousen tells me that
Fixnums are copied every time they are passed as an argument. He has gone
much deeper than I have, so presumably he knows what he is talking about,
but Ruby goes to really great lengths to hide this from you, to the point
that you must construct contrived explanations to handle the contradictions
that such models have.

I think it is best to just consider every variable a pointer to the object,
Fixnum or not.
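Whatever happens underneath, integer keys simply behave by value from the Ruby side. A small sketch; the one surprise is that Hash compares keys with eql?, not ==, so an Integer and a Float are different keys.

```ruby
h = { 1 => "one" }

puts h[1].inspect       # "one"
puts h[3 - 2].inspect   # "one" (any expression that evaluates to 1 works)

puts 1 == 1.0           # true
puts 1.eql?(1.0)        # false
puts h[1.0].inspect     # nil: 1.0 is not the same key as 1
```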

3. If I am allowed to pass in a class as a key, does that mean that
classes are objects too? If not, what exactly is being stored as the
key?

Yes, classes are objects:

class C
end

c = C
c == C # => true
C.class # => Class
C.object_id # => 2156420
C.hash # => 2156420

classes = [C,Array,String]
classes # => [C, Array, String]

Notice that they inherit the default hash method, which here just uses their
object id as the hash value.
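Since classes are ordinary objects, a Hash keyed by class can be handy in practice. A hedged sketch: CLASS_LABELS and describe are hypothetical names, and it assumes Ruby 2.4 or later, where whole numbers report Integer as their class.

```ruby
# Hypothetical lookup table keyed by class objects.
CLASS_LABELS = {
  Integer => "a whole number",
  String  => "some text",
  Symbol  => "a symbol"
}

# Look up a label using the value's class as the key.
def describe(value)
  CLASS_LABELS.fetch(value.class, "something else")
end

puts describe(42)     # a whole number
puts describe("hi")   # some text
puts describe(3.5)    # something else
```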

4. When I use irb, and a line returns an object, irb shows me the
object's hexadecimal reference address (or at least, that's what it
looks like). Is there a method one can call on an object to get that
reference when one is not in irb? Just curious.

object.object_id
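A quick sketch of what object_id does guarantee: it identifies the object, so an alias reports the same id and a different object reports a different one. Whether the number has any relation to a memory address depends on the Ruby implementation and version, so nothing here relies on that.

```ruby
o     = Object.new
same  = o            # another name for the same object
other = Object.new

puts o.object_id == same.object_id    # true
puts o.object_id == other.object_id   # false
puts o.equal?(same)                   # true: equal? tests identity
```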

-----

If you're interested, here is about as simple an implementation of a hash
table as you can get: https://gist.github.com/840135

The purpose is to conceptually understand that hashes internally use arrays,
and see a simple example of how they achieve this. Real hashes are much more
complex (i.e., what happens if two objects hash to the same value? What
happens if two different objects should be considered the same hash key?
What happens when the array gets full? How is the #hash method written?
etc.)
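In the same spirit, here is an independent sketch (not the contents of the gist above) of an array-backed table. ToyHash is a hypothetical teaching class. It handles collisions with per-slot buckets but never grows, which real implementations do.

```ruby
# ToyHash: a hypothetical teaching class, not Ruby's real Hash.
class ToyHash
  def initialize(size = 8)
    # One bucket (an array of [key, value] pairs) per slot.
    @buckets = Array.new(size) { [] }
  end

  def []=(key, value)
    bucket = bucket_for(key)
    pair = bucket.find { |k, _| k.eql?(key) }
    if pair
      pair[1] = value          # existing key: overwrite its value
    else
      bucket << [key, value]   # new key (possibly a collision): append
    end
    value
  end

  def [](key)
    pair = bucket_for(key).find { |k, _| k.eql?(key) }
    pair && pair[1]
  end

  private

  # key.hash picks the slot; the modulo keeps it inside the array.
  def bucket_for(key)
    @buckets[key.hash % @buckets.size]
  end
end

t = ToyHash.new
t["a"] = 1
t[:b]  = 2
t["a"] = 3
puts t["a"]            # 3
puts t[:b]             # 2
puts t["zz"].inspect   # nil
```

Note that both "hash" (to choose the slot) and "eql?" (to confirm the match inside the bucket) are needed, which is why those are the two methods a key has to provide.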
 
Robert Klemme

1. If the "key" taken by hash[key]= can be any object, and the key still
works even after it is aliased to another variable, does that mean that
the "key" is just a reference?

Yes, in Ruby you only ever see references. This means that if you modify
an instance that is used as a key in a Hash, the Hash likely needs
updating, since the hash code of the key usually changes as well
(see Hash#rehash).
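A minimal sketch of that situation, using an Array as the key because Array#hash depends on the contents:

```ruby
key = [1, 2]
h = { key => "value" }
puts h[key].inspect   # "value"

key << 3              # mutating the key changes key.hash
puts h[key].inspect   # nil: the entry is still filed under the old hash value

h.rehash              # re-index every entry by its current hash value
puts h[key].inspect   # "value"
```

This is one reason immutable objects (symbols, numbers, frozen strings) make the least surprising keys.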

2. If I pass in a number, say an Integer, as a key, does Ruby actually
use the Integer? Or does it use a reference to an Integer object?
(Numbers are objects too, right?)

Yes, everything is an object in Ruby. There are some optimizations
internally but as a user of the language you do not see them.
Anything, and I mean _anything_, in Ruby is an object and can be
referenced by any variable.

3. If I am allowed to pass in a class as a key, does that mean that
classes are objects too? If not, what exactly is being stored as the
key?

Yes, classes are objects, too. Classes are instances of class Class:

irb(main):002:0> String.class
=> Class
irb(main):003:0> String.class.ancestors
=> [Class, Module, Object, Kernel]
irb(main):004:0> String.kind_of? Object
=> true

There is a tad of recursion involved but you should just accept it and
not think about it too much. Otherwise serious brain damage could be
the consequence. :)
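For anyone who wants to poke at the recursion anyway, a short sketch using only core reflection methods:

```ruby
puts String.class                  # Class
puts String.instance_of?(Class)    # true
puts Class.class                   # Class (Class is an instance of itself)
puts Class.superclass              # Module
puts Class.is_a?(Object)           # true
puts Object.class                  # Class (and Object is itself a class)
```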

4. When I use irb, and a line returns an object, irb shows me the
object's hexadecimal reference address (or at least, that's what it
looks like). Is there a method one can call on an object to get that
reference when one is not in irb? Just curious.

No, IRB shows the result of obj.inspect. The default implementation
returns something which is related to #object_id which in turn is
derived from the address (I forgot the details):

irb(main):010:0> o=Object.new
=> #<Object:0x7ff72c5c>
irb(main):011:0> o.object_id.to_s 16
=> "3ffb962e"
irb(main):012:0> 0x7ff72c5c / 0x3ffb962e
=> 2

Anyway, since you cannot access memory directly from Ruby it's
worthless to know the memory address. And in other Ruby
implementations (e.g. JRuby) that memory address may even change.
Forget C (until you write your first Ruby extension).

Ah, and one note: Ruby's Hash will apply special treatment to String
keys if they are not frozen. In this case the key is copied so you
can safely modify the key you passed:

irb(main):015:0> h={}
=> {}
irb(main):016:0> k="foo"
=> "foo"
irb(main):017:0> h[k]="x"
=> "x"
irb(main):018:0> h
=> {"foo"=>"x"}
irb(main):019:0> k << "_modified"
=> "foo_modified"
irb(main):020:0> k
=> "foo_modified"
irb(main):021:0> h
=> {"foo"=>"x"}
irb(main):022:0> k.object_id
=> 1073420300
irb(main):023:0> h.each {|k,v| puts k, k.object_id}
foo
1073420320
=> {"foo"=>"x"}

If you know you do not need to modify a String key afterwards you can
gain a few CPU cycles by freezing the String.
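The same behaviour can be checked from a script rather than by comparing object ids by eye. This is a sketch as observed on MRI; String.new is used so the first key is definitely unfrozen.

```ruby
h = {}

k = String.new("foo")   # explicitly unfrozen
h[k] = "x"
stored = h.keys.first
puts stored.equal?(k)   # false: the hash kept its own copy
puts stored.frozen?     # true: the copy is frozen

k << "_modified"
puts h["foo"]           # x (the stored key is unaffected)

frozen_key = "bar".freeze
h[frozen_key] = "y"
puts h.keys.last.equal?(frozen_key)   # true: already frozen, so no copy
```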

Kind regards

robert
 
Terry Michaels

Thanks for the great responses. However, my mind is pretty warped now:
I'm beginning to question the fundamentals of how to put my socks on.

One last question: Could somebody give me a direct URL link to the
online documentation for the Ruby class Class? No matter how I word it
at google, all I ever get are tutorials about how to write classes, not
the API for the class Class.
 
Terry Michaels

Err, never mind, just got it. Sorry.
 
Charles Oliver Nutter

No, IRB shows the result of obj.inspect. The default implementation
returns something which is related to #object_id which in turn is
derived from the address (I forgot the details):

irb(main):010:0> o=Object.new
=> #<Object:0x7ff72c5c>
irb(main):011:0> o.object_id.to_s 16
=> "3ffb962e"
irb(main):012:0> 0x7ff72c5c / 0x3ffb962e
=> 2

Anyway, since you cannot access memory directly from Ruby it's
worthless to know the memory address. And in other Ruby
implementations (e.g. JRuby) that memory address may even change.
Forget C (until you write your first Ruby extension).

In JRuby, object IDs are allocated monotonically only as needed, so
they have nothing to do with a given object's memory location. They
will stay the same once requested, but they're little more than a
number we attach to the object.

Because we don't globally track them, they're not guaranteed to be
unique forever. But they'll be unique for 63 bits worth of object IDs
(64 bits / 2).

- Charlie
 
