hash method how-to?

Y

Yossef Mendelssohn

http://en.wikipedia.org/wiki/Hash_function

in most cases it's the easiest to delegate the hash method to something
else. Example:

class X
=A0 def initialize(name)
=A0 =A0 @name =3D name
=A0 end

=A0 def hash
=A0 =A0 @name.hash
=A0 end
end

h =3D { X.new("foo") =3D> "bar" }
h[X.new("foo")] =3D "baz" # will overwrite the above

And so will `h['foo'] =3D 'baz'`.
 
R

Rick DeNatale

http://en.wikipedia.org/wiki/Hash_function

in most cases it's the easiest to delegate the hash method to something
else. Example:

class X
=A0 def initialize(name)
=A0 =A0 @name =3D name
=A0 end

=A0 def hash
=A0 =A0 @name.hash
=A0 end
end

h =3D { X.new("foo") =3D> "bar" }
h[X.new("foo")] =3D "baz" # will overwrite the above

And so will `h['foo'] =3D 'baz'`.

No neither will:

class X
attr_reader :name
def initialize(name)
@name =3D name
end

def hash
@name.hash
end

def inspect
"X:#{@name.inspect}"
end
end

h =3D { X.new("foo") =3D> :bar}
h[X.new("foo")] =3D :baz
h["foo"] =3D :bat

h # =3D> {"foo"=3D>:bat, X:"foo"=3D>:baz, X:"foo"=3D>:bar}

The reason is that the value of #hash is not the only thing needed to
distinguish hash keys. All objects with the same hash value are mapped
to the same hash 'bucket' hash collisions are resolved by using the
eql? method,

X.new("a").eql?(X.new("a")) # =3D> false

class X
def eql?(other)
other.class =3D=3D X && # One of the rare occurrences when checking
the class is a good thing
name.eql?(other.name)
end
end

h =3D { X.new("foo") =3D> :bar}
h[X.new("foo")] =3D :baz
h["foo"] =3D :bat

h # =3D> {"foo"=3D>:bat, X:"foo"=3D>:baz}

X.new("a").eql?(X.new("a")) # =3D> true

Note that the Hash class depends on a relationship between eql? and hash, t=
hat

a.eql?(b) =3D> a.hash =3D=3D b.hash

where =3D> here is logical implication, that is

a =3D> b is the same as

b ||| !a


--=20
Rick DeNatale

Blog: http://talklikeaduck.denhaven2.com/
Github: http://github.com/rubyredrick
Twitter: @RickDeNatale
WWR: http://www.workingwithrails.com/person/9021-rick-denatale
LinkedIn: http://www.linkedin.com/in/rickdenatale
 
Y

Yossef Mendelssohn

The reason is that the value of #hash is not the only thing needed to
distinguish hash keys. All objects with the same hash value are mapped
to the same hash 'bucket' hash collisions are resolved by using the
eql? method,

Good explanation. My response was a knee-jerk reaction without
checking if it actually worked, and it definitely makes sense for the
language to care about more than simply the results of #hash.

Although I still think what I was trying to point out is something to
consider, that simply delegating #hash to an internal element isn't
all that good because on some level it makes them the same.
 
R

Robert Klemme

Although I still think what I was trying to point out is something to
consider, that simply delegating #hash to an internal element isn't
all that good because on some level it makes them the same.

Ultimately all hash values are derived from member hash values. The key
point - and I believe this is what you want to stress - is _how_ values
are derived. The methodology applied should strive to yield different
hash values for things that are considered non equivalent. For example,
hash value of an Array or list implementation should somehow consider
position of elements in order to not return the same hash for two
collections which only differ in ordering. We can see that Array#hash
obeys this:

irb(main):001:0> [1,2].hash
=> 11
irb(main):002:0> [2,1].hash
=> 1

On the other hand, since Set conceptually is agnostic of position
different ordering of elements should not create different hash codes:

irb(main):001:0> require 'set'
=> true
irb(main):002:0> s1 = [1,2].to_set
=> #<Set: {1, 2}>
irb(main):003:0> s2 = [2,1].to_set
=> #<Set: {2, 1}>
irb(main):004:0> s1.hash
=> 14
irb(main):005:0> s2.hash
=> 14
irb(main):006:0> s1 == s2
=> true

Kind regards

robert
 
R

Rick DeNatale

Although I still think what I was trying to point out is something to
consider, that simply delegating #hash to an internal element isn't
all that good because on some level it makes them the same.

Ultimately all hash values are derived from member hash values. =A0The ke= y
point - and I believe this is what you want to stress - is _how_ values a= re
derived. =A0The methodology applied should strive to yield different hash
values for things that are considered non equivalent. =A0For example, has= h
value of an Array or list implementation should somehow consider position= of
elements in order to not return the same hash for two collections which o= nly
differ in ordering. =A0We can see that Array#hash obeys this:

irb(main):001:0> [1,2].hash
=3D> 11
irb(main):002:0> [2,1].hash
=3D> 1

On the other hand, since Set conceptually is agnostic of position differe= nt
ordering of elements should not create different hash codes:

irb(main):001:0> require 'set'
=3D> true
irb(main):002:0> s1 =3D [1,2].to_set
=3D> #<Set: {1, 2}>
irb(main):003:0> s2 =3D [2,1].to_set
=3D> #<Set: {2, 1}>
irb(main):004:0> s1.hash
=3D> 14
irb(main):005:0> s2.hash
=3D> 14
irb(main):006:0> s1 =3D=3D s2
=3D> true

Note that Set, like hash depends on that implication relationship
between #hash and #eql?

class X
attr_reader :name
def initialize(name)
@name =3D name
end

def hash
@name.hash
end

def inspect
"X:#{@name.inspect}"
end
end

require 'set'

[X.new("foo"), X.new("foo")].to_set # =3D> #<Set: {X:"foo", X:"foo"}>

class X
def eql?(other)
other.class =3D=3D X &&
name.eql?(other.name )
end
end

[X.new("foo"), X.new("foo")].to_set # =3D> #<Set: {X:"foo"}>



--=20
Rick DeNatale

Blog: http://talklikeaduck.denhaven2.com/
Github: http://github.com/rubyredrick
Twitter: @RickDeNatale
WWR: http://www.workingwithrails.com/person/9021-rick-denatale
LinkedIn: http://www.linkedin.com/in/rickdenatale
 
R

Robert Klemme

Although I still think what I was trying to point out is something to
consider, that simply delegating #hash to an internal element isn't
all that good because on some level it makes them the same.

Ultimately all hash values are derived from member hash values. The key
point - and I believe this is what you want to stress - is _how_ values are
derived. The methodology applied should strive to yield different hash
values for things that are considered non equivalent. For example, hash
value of an Array or list implementation should somehow consider position of
elements in order to not return the same hash for two collections which only
differ in ordering. We can see that Array#hash obeys this:

irb(main):001:0> [1,2].hash
=> 11
irb(main):002:0> [2,1].hash
=> 1

On the other hand, since Set conceptually is agnostic of position different
ordering of elements should not create different hash codes:

irb(main):001:0> require 'set'
=> true
irb(main):002:0> s1 = [1,2].to_set
=> #<Set: {1, 2}>
irb(main):003:0> s2 = [2,1].to_set
=> #<Set: {2, 1}>
irb(main):004:0> s1.hash
=> 14
irb(main):005:0> s2.hash
=> 14
irb(main):006:0> s1 == s2
=> true

Note that Set, like hash depends on that implication relationship
between #hash and #eql?

This is a different story: I was talking about how Set calculates its
hash code from members. You are talking about how Set determines member
equivalence and especially what role the implementation of #hash of the
_members_ plays in this.

Kind regards

robert
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,141
Messages
2,570,817
Members
47,367
Latest member
mahdiharooniir

Latest Threads

Top