comparing objects

Mark Abramov · Jun 10, 2010

Benoit said:
I searched a bit and concluded this:

Array methods using comparison
- with #hash and #eql?
&, |, uniq(!), -
- with #==
include?, (r)assoc, count, delete, (r,find_)index
(please tell me if I forgot one)
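
A minimal sketch of the split listed above, assuming a hypothetical LooseTag class (not from the thread) that defines only #==. Array#include? finds the match, while Array#& does not, because the default #eql? compares object identity:

```ruby
# Hypothetical class for illustration: it defines ==, but not eql? or hash.
class LooseTag
  attr_reader :name

  def initialize(name)
    @name = name
  end

  def ==(other)
    other.is_a?(LooseTag) && name == other.name
  end
end

a = LooseTag.new("ruby")
b = LooseTag.new("ruby")

found_with_include = [a].include?(b) # include? compares with ==
intersection       = [a] & [b]       # & compares with eql? (and hash)

p found_with_include
p intersection
```

Here include? reports a match and the intersection comes back empty, even though both calls look at the same two objects.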

I think Array methods should never have to look at #hash and #eql?
methods.
I suppose this is done for performance.

I think this should change, because:
- it violates POLS
- it can cause unexpected behavior because you have to define #hash and
#eql? for objects which should not need that (when you manage objects in
an Array, you do not expect to need to think about Hash's keys).
- it is not consistent with other Array's methods

For me it doesn't work anyway.
Unsure how to paste code here, you could see an example here:
http://pastie.org/999353
I am still like "WTF?"

$ ruby -v
ruby 1.8.7 (2009-06-12 patchlevel 174) [i686-darwin9.8.0]

Mark Abramov · Jun 10, 2010

Mark said:
[tl;dr]

Sorry, guys, didn't notice how I used eql instead of eql?
Btw, without #hash it won't work anyways which I consider *weird* at the
very least.

Marcin Wolski · Jun 10, 2010

Rein said:
Mark said:

[tl;dr]

Sorry, guys, didn't notice how I used eql instead of eql?
Btw, without #hash it won't work anyways which I consider *weird* at the
very least.

#hash makes sense for Hash#[] and etc. #eql? makes more sense for
Array#&. I too find it odd that both are necessary.

If two objects are set to be eql?, their hash methods must also return
the same value. More details in The Ruby Programming Language book.

Thus, when you redefine eql?, the hash methods also should be redefined.
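
As a rough illustration of that contract, here is a sketch with a hypothetical Point class that redefines #eql? and #hash together, both derived from the same fields. The class name and its fields are assumptions made for the example, not code from the thread:

```ruby
# Hypothetical value class: eql? and hash are built from the same fields,
# so objects that are eql? always produce the same hash value.
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end

  def ==(other)
    other.class == self.class && x == other.x && y == other.y
  end
  alias eql? ==

  def hash
    [self.class, x, y].hash
  end
end

p1 = Point.new(1, 2)
p2 = Point.new(1, 2)

common  = [p1] & [p2]          # Array#& now treats p1 and p2 as the same element
lookup  = { p1 => :found }[p2] # Hash#[] finds the entry stored under p1
deduped = [p1, p2].uniq        # uniq keeps only one of them
```

Delegating to Array#hash is only one way to combine the fields; any function that returns equal values for eql? objects satisfies the rule.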

Mark Abramov · Jun 10, 2010

Marcin said:
Rein said:

Mark Abramov wrote:
[tl;dr]

Sorry, guys, didn't notice how I used eql instead of eql?
Btw, without #hash it won't work anyways which I consider *weird* at the
very least.

#hash makes sense for Hash#[] and etc. #eql? makes more sense for
Array#&. I too find it odd that both are necessary.

If two objects are set to be eql?, their hash methods must also return
the same value. More details in The Ruby Programming Language book.

Thus, when you redefine eql?, the hash methods also should be redefined.

http://ruby-doc.org/core-1.8.7/classes/Object.html#M000617
Well, it doesn't say much in core api

Robert Klemme · Jun 10, 2010

Even if
Hmm? Would you care to show an example where overloading those methods
(#eql? and #hash) is needed to ensure proper behavior? I am willing to
learn. But I am not willing to accept this statement as such.
Cheers
R.

You have been presented with one in this very thread. The OP wants
objects of his class to have the correct semantics for Array#& and
Hash#[], etc. The correct answer is to implement #hash and #eql?, just
as implementing <=> provides objects of his class with the correct
semantics for Array#sort.
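
The Array#sort half of that analogy can be sketched as follows, assuming a hypothetical Version class with major and minor fields (names invented for the example). Note that including Comparable derives #== from #<=>, but it does not supply #eql? or #hash:

```ruby
# Hypothetical class: defining <=> is what lets Array#sort order instances.
class Version
  include Comparable
  attr_reader :major, :minor

  def initialize(major, minor)
    @major = major
    @minor = minor
  end

  # Compare field by field, reusing Array#<=> for the lexicographic order.
  def <=>(other)
    [major, minor] <=> [other.major, other.minor]
  end
end

versions = [Version.new(1, 10), Version.new(1, 2), Version.new(0, 9)]
sorted   = versions.sort.map { |v| [v.major, v.minor] }
p sorted
```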

See also
http://blog.rubybestpractices.com/posts/rklemme/018-Complete_Class.html
http://blog.rubybestpractices.com/posts/rklemme/019-Complete_Numeric_Class.html

Cheers

robert

Robert Dober · Jun 10, 2010

On 2010-06-10 06:59:40 -0700, Robert Dober said:
You have been presented with one in this very thread. The OP wants objects
of his class to have the correct semantics for Array#& and Hash#[], etc. The
correct answer is to implement #hash and #eql?, just as implementing <=>
provides objects of his class with the correct semantics for Array#sort.

I guess you really do not know what I was talking about? Or do you
just repeat the same stuff over and over again in order to convince
me?
Overwriting #hash and #eql? breaks Hash! Why the heck should OP's
use case justify this?
And it does not answer my question: where would I want Hash to
behave according to the redefined #eql? and #hash? And BTW I asked
Wilson, did I not?
Cheers
Robert

Robert Dober · Jun 10, 2010

http://blog.rubybestpractices.com/posts/rklemme/018-Complete_Class.html
http://blog.rubybestpractices.com/posts/rklemme/019-Complete_Numeric_Class.html

You define #eql? and #hash for your convenience. So good, so bad. My
question simply was: Show me why *not* redefining #hash and #eql? will
cause problems, because that was Wilson's statement. I am still
waiting.

Cheers
R.

Mark Abramov · Jun 10, 2010

Robert said:
overwriting #hash and #eql? breaks Hash!

That's not true, I think.

Robert Dober · Jun 11, 2010

That's not true, I think.

Judge for yourself

require "forwardable"

def count klass
  ObjectSpace.each_object( klass ).to_a.size
end
class N
  extend Forwardable
  attr_reader :n
  def_delegators :n, :hash
  def eql? otha
    n == otha.n
  end
  private
  def initialize n
    @n = n
  end
end # class N

h = { N.new( 42 ) => true }
h[ N.new( 42 ) ] = 42
p h
GC.start
p count(N)

Cheers
R.

Robert Klemme · Jun 11, 2010

2010/6/11 Shot (Piotr Szotkowski) said:
Rein Henrichs:

#hash makes sense for Hash#[] and etc. #eql? makes more
sense for Array#&. I too find it odd that both are necessary.

Both are necessary because #eql? says whether two objects are surely
the same, while #hash says whether they're surely different - which,
perhaps counterintuitively, is not the same problem.

The difference is that in many, many cases it's much faster to check
whether two objects are surely different (via a fast #hash function)
than whether they're surely the same (#eql? can be quite slow).

This is not necessarily true. Any reasonable implementation of #eql?
will bail out as soon as it sees a difference. On the contrary, you
always need to look at the complete state of an instance to calculate
#hash. I can easily construct an example where #eql? beats #hash:

14:40:54 Temp$ ruby19 eql-test.rb
same
0.110000 0.000000 0.110000 ( 0.098000)
0.093000 0.000000 0.093000 ( 0.099000)
0.157000 0.000000 0.157000 ( 0.151000)
different early
0.093000 0.000000 0.093000 ( 0.101000)
0.094000 0.000000 0.094000 ( 0.096000)
0.000000 0.000000 0.000000 ( 0.000000)
different late
0.109000 0.000000 0.109000 ( 0.105000)
0.094000 0.000000 0.094000 ( 0.098000)
0.156000 0.000000 0.156000 ( 0.149000)
14:40:56 Temp$ cat eql-test.rb
require 'benchmark'
a1 = Array.new 1_000_000
a2 = Array.new 1_000_000
puts "same"
puts Benchmark.measure { a1.hash }
puts Benchmark.measure { a2.hash }
puts Benchmark.measure { a1.eql? a2 }
a1[0] = 1
a2[0] = 2
puts "different early"
puts Benchmark.measure { a1.hash }
puts Benchmark.measure { a2.hash }
puts Benchmark.measure { a1.eql? a2 }
a2[0] = a1[0]
a2[999_999] = 1
puts "different late"
puts Benchmark.measure { a1.hash }
puts Benchmark.measure { a2.hash }
puts Benchmark.measure { a1.eql? a2 }
14:40:58 Temp$

Notice also how #eql? with equal arrays is not much slower than #hash.

The main difference between #eql? and #hash is that #hash can return the
same value for objects that are not #eql? (but if two objects are #eql?
then #hash must return the same value).
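
A small sketch of that asymmetry, using a hypothetical Collider class whose #hash is deliberately constant. This is an assumption for illustration, not a recommended implementation: colliding hash values are legal, and #eql? still keeps the keys apart, at the cost of lookup speed.

```ruby
# Hypothetical class: every instance returns the same hash value.
# That is allowed, because only the reverse direction is required
# (objects that are eql? must share a hash value).
class Collider
  attr_reader :value

  def initialize(value)
    @value = value
  end

  def eql?(other)
    other.is_a?(Collider) && value == other.value
  end

  def hash
    0 # degenerate on purpose: forces a collision for every key
  end
end

table = { Collider.new(1) => :one, Collider.new(2) => :two }

entries = table.size            # both keys survive despite equal hash values
hit     = table[Collider.new(2)] # eql? picks the right entry among collisions
miss    = table[Collider.new(3)] # same hash value, but no eql? key present
```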

An untested, and definitely not optimal
(but hopefully simple) example follows.

Imagine that you want to implement a new immutable string class, one
which caches the string length (for performance reasons). Imagine also
that the vast majority of such strings you use are of different lengths,
and that you want to use them as Hash keys.

class ImmutableString

  def initialize string
    @string = string.dup.freeze
    @length = string.length
  end

end

Given the above assumptions, it might make sense for #hash to
return the @length, while #eql? makes the 'proper' comparison:

class ImmutableString

  def hash
    @length

Bad hash implementation. Why don't you use String#hash?

  end

  alias eql? ==

end

This way in the vast majority of cases, when your ImmutableStrings will
be considered for Hash keys, the check whether a given key exists will
be very quick; only when two objects #hash to the same value (i.e.,
when they're not surely different) the #eql? is called to tell whether
they're surely the same.

If the set of attributes to be used for the specific comparison needed
in this thread is not the same as the set that we identify as keyish
for class User in general one cannot use User#eql? and User#hash for
quick set intersection. That's why I suggested to use a Struct for
key fields (which has proper #hash and #eql? built in).
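
A sketch of that Struct suggestion, under stated assumptions: the User and UserKey definitions, their member names, and the common_keys helper are all hypothetical. Struct supplies #eql? and #hash based on its members, so the key objects can be intersected with Array#& without touching User itself.

```ruby
# Hypothetical record type; only email is treated as the key for this comparison.
User    = Struct.new(:id, :email, :last_login)
# Hypothetical key type: Struct defines eql? and hash from its members.
UserKey = Struct.new(:email)

# Returns the keys present in both lists (helper name is an assumption).
def common_keys(users_a, users_b)
  keys_a = users_a.map { |u| UserKey.new(u.email) }
  keys_b = users_b.map { |u| UserKey.new(u.email) }
  keys_a & keys_b
end

list_a = [User.new(1, "ann@example.com", 10), User.new(2, "bob@example.com", 20)]
list_b = [User.new(9, "bob@example.com", 99)]

shared = common_keys(list_a, list_b)
p shared.map(&:email)
```

Keeping the key in a separate Struct means a different comparison can use a different key type, while User keeps whatever equality it already has.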

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

Daniel Berger · Jun 11, 2010

How can I compare two objects and get true if some of their attributes
are equal?

include Comparable ?

Regards,

Dan

Robert Klemme · Jun 11, 2010

You define #eql? and #hash for your convenience. So good, so bad. My
question simply was: Show me why *not* redefining #hash and #eql? will
cause problems, because that was Wilson's statement. I am still
waiting.

The advice to implement #eql? and #hash really only makes sense if
equivalence can reasonably be defined for a class and if instances of
that class should be used as Hash keys or in a Set. If equivalence
cannot be defined other than via identity (which is the default), then
it is perfectly reasonable not to override either method and to go with
the default implementation.
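
For reference, a short sketch of that default behaviour, using a hypothetical empty Session class: without any overrides, #eql? and #hash are identity-based, so only the very same object matches as a Hash key.

```ruby
# Hypothetical class with no eql?/hash overrides: identity semantics apply.
class Session; end

a = Session.new
b = Session.new

same_object  = a.eql?(a) # the same instance is eql? to itself
other_object = a.eql?(b) # a distinct instance is not

table = { a => :first }
found_by_same  = table[a] # lookup with the original key object
found_by_other = table[b] # lookup with a different instance
```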

Kind regards

robert

Robert Dober · Jun 11, 2010

The advice to implement #eql? and #hash really only makes sense if
equivalence can reasonably be defined for a class and if instances of that
class should be used as Hash keys or in Set. If not at least equivalence
can be defined other than via identity (which is the default) then it is
perfectly reasonable to not override both methods and go with the default
implementation.

But that was *exactly* my point.

OP wanted to use Array#&, and Array#&, for a reason not too clear to
me, uses Object#eql? instead of Object#==. I did discourage the
overloading of Object#eql? and Object#hash for *that purpose*.

If you want to change Hash then it is the right thing to do.
Now I might strongly disagree about if one should do that, but that is
rather OT and I would never have made such strong statements about
that issue.
However the technique you suggest is not to be put into non expert
hands as I tried to show with the memory leaking code above.

Cheers
Robert

Kind regards

robert

--
The best way to predict the future is to invent it.
-- Alan Kay

Caleb Clausen · Jun 11, 2010

OP wanted to use Array#&, and Array#&, for a reason not too clear to
me, uses Object#eql? instead of Object#==. I did discourage the
overloading of Object#eql? and Object#hash for *that purpose*.

Array#& uses eql? instead of == because internally, it works something
like this:

class Array
  def &(other)
    h1={}
    other.each{|x| h1[x]=true}
    select{|x| h1[x] }
  end
end

In other words, it creates a (hash) index to get a speedup. (From
O(M*N) to O(M+N).)
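
The two complexities can be compared side by side in a sketch that leaves Array itself alone. The method names below are hypothetical, and this is not how any particular Ruby implementation writes Array#&; it only contrasts pairwise #eql? checks with a Hash index built once.

```ruby
# O(M*N): check every element of a against every element of b with eql?.
def naive_intersection(a, b)
  a.select { |x| b.any? { |y| x.eql?(y) } }.uniq
end

# O(M+N): build a Hash index of b once, then probe it for each element of a.
# The Hash probes rely on #hash and #eql?, which is why both matter here.
def indexed_intersection(a, b)
  index = {}
  b.each { |y| index[y] = true }
  a.select { |x| index[x] }.uniq
end

left  = [1, 2, 2, 3]
right = [2, 3, 4]

p naive_intersection(left, right)
p indexed_intersection(left, right)
p left & right
```

All three calls return the same elements for these inputs; the difference only shows up in running time as the arrays grow.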

Robert Dober · Jun 11, 2010

I see, thanx

Robert Klemme · Jun 12, 2010

But that was *exactly* my point.

I don't think we disagree, nor do I argue with you. I just posted blog
links as illustration to Rein's point about how to implement those methods.

Kind regards

robert

Robert Dober · Jun 12, 2010

I don't think we disagree, nor do I argue with you. I just posted blog
links as illustration to Rein's point about how to implement those methods.

Forgive my confusion then.
Cheers
Robert


Robert Klemme · Jun 12, 2010

Forgive my confusion then.

No problem. I think I fueled it by not including a comment in the
original posting. Sorry for that.

Kind regards

robert
