Annoying behaviour of the != operator

  • Thread starter Jordan Rastrick

Mahesh

I understand that what makes perfect sense to me might not make perfect
sense to you but it seems a sane default. When you compare two objects,
what is that comparison based on? In the explicit is better than
implicit world, Python can only assume that you *really* do want to
compare objects unless you tell it otherwise. The only way it knows how
to compare two objects is to compare object identities.

I am against making exceptions for corner cases and I do think making
__ne__ implicitly assume not __eq__ is a corner case.

Maybe you think that it takes this "explicit is better than implicit"
philosophy too far and acts dumb, but I think it is acting consistently.
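
For example, with no comparison methods defined at all, identity is all
Python has to go on (a small illustrative sketch, Python 2 syntax; the
class name is made up):

class Thing(object):
    pass

a = Thing()
b = Thing()
print a == b    # False -- no __eq__ defined, so identities are compared
print a == a    # True
print a != b    # True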

Cheers,
Mahesh
 

John Roth

Jordan Rastrick said:
Well, I'll admit I haven't ever used the Numeric module, but since
PEP207 was submitted and accepted, with Numeric as apparently one of
its main motivations, I'm going to assume that the pros and cons for
having == and ilk return things other than True or False have already
been discussed at length and that argument settled. (I suppose there's a
reason why Numeric arrays weren't just given the same behaviour as
builtin lists, and then simple non-special named methods to do the
'rich' comparisons.)

They were - read the PEP again. That's the behavior they wanted
to get away from.
But again, it seems like a pretty rare and marginal use case, compared
to simply wanting to see if some object a is equal to (in a
non-object-identity sense) object b.

The current situation seems to be essentially: use __cmp__ for normal
cases, and use the rich comparisons, __eq__, __gt__, __ne__, and the rest,
only in the rare cases. Also, if you define one of them, make sure you
define all of them.

There's no room for the case of objects where the == and != operators
should return a simple True or False, and are always each other's
complement, but <, >= and the rest give an error. I haven't written
enough Python to know for sure, but based on my experience in other
languages I'd guess this case is vastly more common than all others put
together.

I'd be prepared to bet that anyone defining just __eq__ on a class, but
none of __cmp__, __ne__, __gt__ etc, wants a != b to return the
negation of a.__eq__(b). It can't be any worse than the current case of
having == work as the __eq__ method describes but != work by
object identity.

To quote Calvin Coolidge: You lose.

The primary open source package I work on, PyFit, always wants to
do an equality comparison, and never needs to do a not-equal. It also has
no use for ordering comparisons. I implement not-equals as a matter of symmetry
in case someone else wants them, but I usually have no need of them.
Strict XP says I shouldn't do them without a customer request.
So far, I stand by my suggested change.

I think most of your justification is simple chicken squawking, but write
the PEP anyway. I'd suggest tightening it to say that if __eq__ is
defined, and if neither __ne__ nor __cmp__ is defined, then use
__eq__ and return the negation if and only if the result of __eq__
is a boolean. Otherwise raise the current exception.
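
A rough sketch of that fallback rule, written out by hand just to
illustrate the proposal (the helper name is made up):

def ne_from_eq(a, b):
    # Derive != from __eq__ only when __eq__ yields a genuine boolean;
    # otherwise signal an error, per the suggestion above.
    result = a.__eq__(b)
    if isinstance(result, bool):
        return not result
    raise TypeError("__ne__ not defined and __eq__ did not return a bool")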

I wouldn't suggest the reverse, though. Defining __ne__ and not
defining __eq__ is simply perverse.

John Roth
 

Dan Bishop

Mahesh said:
I understand that what makes perfect sense to me might not make perfect
sense to you but it seems a sane default. When you compare two objects,
what is that comparison based on? In the explicit is better than
implicit world, Python can only assume that you *really* do want to
compare objects unless you tell it otherwise. The only way it knows how
to compare two objects is to compare object identities.

This isn't the issue here. I agree that object identity comparison is
a good default equality test. The issue is whether this default should
be thought of as

# your approach (and the current implementation)
def __eq__(self, other):
    return self is other

def __ne__(self, other):
    return self is not other

or

# my approach
def __eq__(self, other):
    return self is other

def __ne__(self, other):
    return not (self == other)

My approach simplifies the implementation (i.e., requires one fewer
method to be overridden) of classes for which (x != y) == (not (x == y)).
This is a very common case.
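
For instance, with the current rules a class that overrides only __eq__
gets bitten like this (a minimal Python 2 sketch; Point is just an
example class):

class Point(object):
    def __init__(self, x, y):
        self.x, self.y = x, y
    def __eq__(self, other):
        return self.x == other.x and self.y == other.y

p, q = Point(1, 2), Point(1, 2)
print p == q   # True  -- uses __eq__
print p != q   # also True -- falls back to identity, not to (not __eq__)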

Your approach simplifies the implementation of classes for which
equality tests are based on data but inequality tests are based on
identity (or vice-versa). I can't think of a single situation in which
this is useful.
 

Peter Hansen

Christopher said:
Perhaps the language should offer
the sensible default of (!=) == (not ==) if one of them but not the
other is overridden, but still allow overriding of both.

I believe that's exactly what Jordan is promoting and, having been
bitten in exactly the same way, I would support the idea. On the other
hand, I was bitten only _once_ and I suspect Jordan will never be bitten
by it again either. It's pretty hard to forget this wart once you
discover it, but I think the real reason to want to have it excised is
that a large number of people will have to learn this the hard way,
documentation (thankfully) not being shoved down one's throat as one
starts intrepidly down the road of overriding __eq__ for the first time.
This would technically break backwards compatibility, because it changes
default behavior, but I can't think of any good reason (from a Python
newbie perspective) for the current counterintuitive behavior to be the
default. Possibly punt this to Python 3.0?

I'd support an effort to fix it in 2.5 actually. I suspect nobody will
pipe up with code that would actually be broken by it, though some code
(as John Roth points out) doesn't *need* to have the automatic __ne__
even if it wouldn't break because of it.

-Peter
 

Peter Hansen

Robert said:
The problem arises that, in the presence of rich comparisons, (a == b)
is not always a boolean value, while (a is b) is always a boolean value.

But that still doesn't mean that in a case where a == b (via __eq__)
returns a non-boolean, __ne__ would not be defined as well. In other
words, there's _nothing_ preventing this "fix" from being made to
provide saner behaviour in the most common case (which happens to pose
the greatest risk of inadvertent mistakes for those who aren't aware of
the requirement to define both) while still allowing the cases that need
unusual behaviour to get it by (as they already surely do!) defining
both __ne__ and __eq__.
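
For instance, a toy elementwise class (illustrative only, not Numeric
itself) already has to spell out both methods to get its unusual
behaviour:

class ElementWise(object):
    def __init__(self, data):
        self.data = list(data)
    def __eq__(self, other):
        return [a == b for a, b in zip(self.data, other.data)]
    def __ne__(self, other):
        return [a != b for a, b in zip(self.data, other.data)]

print ElementWise([1, 2, 3]) == ElementWise([1, 5, 3])   # [True, False, True]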

-Peter
 

Steven D'Aprano

No, why should Python assume that if you use != without supplying a
__ne__ that this is what you want? Without direction it will compare
the two objects which is the default behavior.

Why should Python assume that != means "not is" instead of "not equal"?

That seems like an especially perverse choice given that the operator is
actually called "not equal".
So, s != t is True because the ids of the two objects are different.
The same applies to, for example s > t and s < t. Do you want Python to
be smart and deduce that you want to compare one variable within the
object if you don't create __gt__ and __lt__? I do not want Python to
do that.

That is an incorrect analogy. The original poster doesn't want Python to
guess which attribute to do comparisons by. He wants "!=" to be
defined as "not equal" if not explicitly overridden with a __ne__ method.

If there are no comparison methods defined, then and only then does it
make sense for == and != to implicitly test object identity.

I'm all for the ability to override the default behaviour. But surely
sensible and intuitive defaults are important?
 

Greg Ewing

Jordan said:
But I explicitly provided a method to test equality.

Actually, no, you didn't. You provided a method to define
the meaning of the operator spelled '==' when applied to your
object. That's the level of abstraction at which Python's
__xxx__ methods work. They don't make any semantic assumptions.

It's arguable that there should perhaps be some default
assumptions made, but the Python developers seem to have
done the Simplest Thing That Could Possibly Work, which
isn't entirely unreasonable.
 

Greg Ewing

Jordan said:
Where are the 'number of situations' where __ne__ cannot be derived
from __eq__? Is it just the floating point one? I must admit, I've
missed any others.

The floating point one is just an example, it's not meant
to be the entire justification.

Some others:

* Numeric arrays, where comparisons return an array of
booleans resulting from applying the comparison to each
element.

* Computer algebra systems and such like, which return a
parse tree as a result of evaluating an expression.
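
A toy sketch of the second case (everything here is made up, not any
real algebra package):

class Sym(object):
    def __init__(self, name):
        self.name = name
    def __eq__(self, other):
        # '==' builds an expression node rather than answering True/False
        return ('Eq', self, other)

x, y = Sym('x'), Sym('y')
print x == y   # ('Eq', <Sym ...>, <Sym ...>) -- an expression, not a truth value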
 

Greg Ewing

Rocco said:
The main problem is that Python is trying to stick at least three
different concepts onto the same set of operators: equivalence (are
these two objects the same?), ordering (in a sorted list, which comes
first?), and mathematical "size".

A possible compromise would be to add a new special method,
such as __equal__, for use by == and != when there is no
__eq__ or __ne__. Then there would be three clearly separated
levels of comparison: (1) __cmp__ for ordering, (2) __equal__
for equivalence, (3) __eq__ etc. for unrestricted semantics.
> This gives the wacky world where
> "[(1,2), (3,4)].sort()" works, whereas "[1+2j, 3+4j].sort()" doesn't.

To solve that, I would suggest a fourth category of "arbitrary
ordering", but that's probably Py3k material.
 

Antoon Pardon

On 2005-06-08, someone said:
No, why should Python assume that if you use != without supplying a
__ne__ that this is what you want? Without direction it will compare
the two objects which is the default behavior.

So, s != t is True because the ids of the two objects are different.
The same applies to, for example s > t and s < t. Do you want Python to
be smart and deduce that you want to compare one variable within the
object if you don't create __gt__ and __lt__? I do not want Python to
do that.

Python is already smart. It deduces what you want with the += operator
even if you haven't defined an __iadd__ method. If Python can be smart
about that, I don't see why it can't be smart about != as well.
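
For example, with only __add__ defined, += still does the obvious thing
(a small sketch; Counter is made up):

class Counter(object):
    def __init__(self, n):
        self.n = n
    def __add__(self, other):
        return Counter(self.n + other)

c = Counter(1)
c += 2          # no __iadd__ defined; Python falls back to __add__
print c.n       # 3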
 

Dan Sommers

A possible compromise would be to add a new special method,
such as __equal__, for use by == and != when there is no
__eq__ or __ne__. Then there would be three clearly separated
levels of comparison: (1) __cmp__ for ordering, (2) __equal__
for equivalence, (3) __eq__ etc. for unrestricted semantics.
This gives the wacky world where
"[(1,2), (3,4)].sort()" works, whereas "[1+2j, 3+4j].sort()" doesn't.

Python inherits that wackiness directly from the (often wacky) world of
Mathematics.

IMO, the true wackiness is that

[ AssertionError, (vars, unicode), __name__, apply ].sort( )

"works," too. Python refusing to sort my list of complex numbers is a
Good Thing.
To solve that, I would suggest a fourth category of "arbitrary
ordering", but that's probably Py3k material.

Four separate classes of __comparison__ methods in a language that
doesn't (and can't and shouldn't) preclude or warn about rules regarding
which methods "conflict" with which other methods? I do not claim to be
an expert, but that doesn't seem very Pythonic to me.

AIUI, __cmp__ exists for backwards compatibility, and __eq__ and friends
are flexible enough to cover any possible comparison scheme.

Why make the rules, the documentation, and the implementation even more
"interesting" than they already are?

Regards,
Dan
 

David M. Cooke

Greg Ewing said:
Rocco said:
This gives the wacky world where
"[(1,2), (3,4)].sort()" works, whereas "[1+2j, 3+4j].sort()" doesn't.

To solve that, I would suggest a fourth category of "arbitrary
ordering", but that's probably Py3k material.

We've got that: use hash().
[1+2j, 3+4j].sort(key=hash)

Using the key= arg in sort means you can do other stuff easily of course:

by real part:
import operator
[1+2j, 3+4j].sort(key=operator.attrgetter('real'))

by size:
[1+2j, 3+4j].sort(key=abs)

and since .sort() is stable, for those numbers where the key is the
same, the order will stay the same.
 

greg

David said:
To solve that, I would suggest a fourth category of "arbitrary
ordering", but that's probably Py3k material.

We've got that: use hash().
[1+2j, 3+4j].sort(key=hash)

What about objects that are not hashable?

The purpose of arbitrary ordering would be to provide
an ordering for all objects, whatever they might be.

Greg
 

Robert Kern

greg said:
David said:
To solve that, I would suggest a fourth category of "arbitrary
ordering", but that's probably Py3k material.

We've got that: use hash().
[1+2j, 3+4j].sort(key=hash)

What about objects that are not hashable?

The purpose of arbitrary ordering would be to provide
an ordering for all objects, whatever they might be.

How about id(), then?

And so the circle is completed...

--
Robert Kern

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter
 

Rocco Moretti

Dan said:
Rocco said:
The main problem is that Python is trying to stick at least three
different concepts onto the same set of operators: equivalence (are
these two objects the same?), ordering (in a sorted list, which comes
first?), and mathematical "size".
This gives the wacky world where
"[(1,2), (3,4)].sort()" works, whereas "[1+2j, 3+4j].sort()" doesn't.


Python inherits that wackiness directly from the (often wacky) world of
Mathematics.

IMO, the true wackiness is that

[ AssertionError, (vars, unicode), __name__, apply ].sort( )

"works," too. Python refusing to sort my list of complex numbers is a
Good Thing.

The "wackyness" I refered to wasn't that a list of complex numbers isn't
sortable, but the inconsistent behaviour of list sorting. As you
mentioned, an arbitraty collection of objects in a list is sortable, but
as soon as you throw a complex number in there, you get an exception.

One way to handle that is to refuse to sort anything that doesn't have a
"natural" order. But as I understand it, Guido decided that being able
to sort arbitrary lists is a feature, not a bug. But you can't sort ones
with complex numbers in them, because you also want '1+3j < 3+1j' to raise
an error.
Four separate classes of __comparison__ methods in a language that
doesn't (and can't and shouldn't) preclude or warn about rules regarding
which methods "conflict" with which other methods? I do not claim to be
an expert, but that doesn't seem very Pythonic to me.

What "conflict"? Where are you getting the doesn't/can't/shouldn't
prescription from?

Which method you use depends on what you want to achieve:

(Hypothetical Scheme)
Object Identity? - use 'is'
Mathematical Ordering? - use '__eq__' & friends
Object Equivalence? - use '__equiv__'
Arbitrary Ordering? (e.g. for list sorting) - use '__order__'

The only caveat is to define sensible defaults for the cases where one
function is not defined. But that shouldn't be too hard.

__equiv__ -> __eq__ -> is
__order__ -> __lt__/__cmp__
AIUI, __cmp__ exists for backwards compatibility, and __eq__ and friends
are flexible enough to cover any possible comparison scheme.

Except if you want the situation where "[1+2j, 3+4j].sort()" works, and
'1+3j < 3+1j' fails.


I think the issue is that you're thinking along the lines of mathematical
numbers, where the four different comparisons collapse to one. Object
identity? There is only one 'two' - heck, in pure mathematics, there
isn't even a 'float two'/'int two' difference. Equivalence *is*
mathematical equality, and the "arbitrary ordering" is easily defined as
"true" ordering. It's only when you break away from mathematics that you
see the divergence in behavior.
 

Steven D'Aprano

The main problem is that Python is trying to stick at least three
different concepts onto the same set of operators: equivalence (are
these two objects the same?), ordering (in a sorted list, which comes
first?), and mathematical "size". [snip]
This gives the wacky world where
"[(1,2), (3,4)].sort()" works, whereas "[1+2j, 3+4j].sort()" doesn't.

Python inherits that wackiness directly from the (often wacky) world of
Mathematics.

IMO, the true wackiness is that

[ AssertionError, (vars, unicode), __name__, apply ].sort( )

"works," too. Python refusing to sort my list of complex numbers is a
Good Thing.

Only if you understand sorting as being related to the mathematical
sense of size, rather than the sense of ordering. The two are not the
same!

If you were to ask, "which is bigger, 1+2j or 3+4j?" then you
are asking a question about mathematical size. There is no unique answer
(although taking the absolute value must surely come close) and the
expression 1+2j > 3+4j is undefined.

But if you ask "which should come first in a list, 1+2j or 3+4j?" then you
are asking about a completely different thing. The usual way of sorting
arbitrary chunks of data within a list is by dictionary order, and in
dictionary order 1+2j comes before 3+4j because 1 comes before 3.
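
For illustration, that dictionary order can already be had by sorting on
the string form:

print sorted([3+4j, 1+2j], key=str)   # prints [(1+2j), (3+4j)]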

This suggests that perhaps sort needs a keyword argument "style", one of
"dictionary", "numeric" or "datetime", which would modify how sorting
would compare keys.

Perhaps in Python 3.0.
 

Dan Bishop

Steven D'Aprano wrote:
....
If you were to ask, "which is bigger, 1+2j or 3+4j?" then you
are asking a question about mathematical size. There is no unique answer
(although taking the absolute value must surely come close) and the
expression 1+2j > 3+4j is undefined.

But if you ask "which should come first in a list, 1+2j or 3+4j?" then you
are asking about a completely different thing. The usual way of sorting
arbitrary chunks of data within a list is by dictionary order, and in
dictionary order 1+2j comes before 3+4j because 1 comes before 3.

This suggests that perhaps sort needs a keyword argument "style", one of
"dictionary", "numeric" or "datetime", which would modify how sorting
would compare keys.

Perhaps in Python 3.0.

What's wrong with the Python 2.4 approach of
>>> clist = [7+8j, 3+4j, 1+2j, 5+6j]
>>> clist.sort(key=lambda z: (z.real, z.imag))
>>> clist
[(1+2j), (3+4j), (5+6j), (7+8j)]

?
 

Terry Reedy

Rocco Moretti said:
The "wackyness" I refered to wasn't that a list of complex numbers isn't
sortable, but the inconsistent behaviour of list sorting. As you
mentioned, an arbitraty collection of objects in a list is sortable, but
as soon as you throw a complex number in there, you get an exception.

This 'wackiness' is an artifact resulting from Python being 'improved'
after its original design. When Guido added complex numbers as a builtin
type, he had to decide whether to make them sortable or not. There were
reasons to go either way. ... and the discussion has continued ever since
;-)

Terry J. Reedy
 

Rocco Moretti

George said:
He has changed his mind since then
(http://mail.python.org/pipermail/python-dev/2004-June/045111.html) but
it was already too late.

The indicated message sidesteps the crux of the issue. It confirms that
arbitrary *comparisons* between objects are considered a wart, but it
says nothing about arbitrary *ordering* of objects.

None > True --> Wart
[None, True].sort() --> ????

The point that I've been trying to get across is that the two issues are
conceptually separate. (That's not to say that Guido might now consider
the latter a wart, too.)
 
