Nobody is saying there's a bug in the implementation of "==". I'm just
saying "==" cannot be taken as a universal superset of "is". Therefore a
program cannot blindly use "==" to test for identity.
Um, yes? Nor can you use ">" to test for identity, or "+", or "%", or any
operator other than "is". Why do you think it is a problem that ==
doesn't test for identity? It's not supposed to test for identity.
Why do you want to test for identity? I think I've asked five times now,
why you care whether a state value has one instance or a thousand
instances, and you haven't even attempted an answer.
That's why "==" is a bit fishy. It immediately raises the question: what
does it mean for a == b, especially since the exact implementations of a
and b are intended to be opaque.
It means that a equals b. For ints, it means that they have the same
numeric value. The same applies for floats. For strings, it means that
they contain the same code points in the same order. And so on. For all
built-in types, equality is well-defined. For custom types you create
yourself, the onus is on you to ensure that equality is meaningful and
well-defined.
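
To make that concrete, here is a minimal sketch. The State class and its
name attribute are made up for illustration; they are not from any real
library. Equality is whatever __eq__ says it is, and it is independent of
how many instances exist.

```python
# Built-in types: equality compares values, not objects.
assert 1 == 1.0
assert "abc" == "a" + "bc"
assert [1, 2] == [1, 2]


class State:
    """Hypothetical state value, used only for illustration."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        # The author of the class decides what "equal" means.
        if not isinstance(other, State):
            return NotImplemented
        return self.name == other.name

    def __hash__(self):
        # Keep hashing consistent with __eq__.
        return hash(self.name)


idle_a = State("idle")
idle_b = State("idle")

same_value = idle_a == idle_b    # equal by value
same_object = idle_a is idle_b   # two separate instances
```

Here same_value is True and same_object is False, and code that compares
with == never needs to know how many State("idle") instances there are.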
Example:
The os module defines the constants os.SEEK_SET, os.SEEK_CUR and
os.SEEK_END that can be used as arguments for os.lseek(). Must those
constants be used, or can a regular integer be used instead? The
documentation clearly states that integers can be used:
SEEK_SET or 0 to set the position relative to the beginning of the
file; SEEK_CUR or 1 to set it relative to the current position;
SEEK_END or 2 to set it relative to the end of the file.
However, on the same reference page, os.posix_fadvise() is defined. We
read:
advice is one of POSIX_FADV_NORMAL, POSIX_FADV_SEQUENTIAL,
POSIX_FADV_RANDOM, POSIX_FADV_NOREUSE, POSIX_FADV_WILLNEED or
POSIX_FADV_DONTNEED
and:
os.POSIX_FADV_NORMAL
os.POSIX_FADV_SEQUENTIAL
os.POSIX_FADV_RANDOM
os.POSIX_FADV_NOREUSE
os.POSIX_FADV_WILLNEED
os.POSIX_FADV_DONTNEED
Flags that can be used in advice in posix_fadvise()
Now, what kinds of object are those constants? We are not supposed to
know or care.
Incorrect. We are supposed to know and care.
os.posix is exactly the sort of library I mentioned earlier when I said
sometimes you're constrained by compatibility with some other system. In
this case, the os module is explicitly designed to be compatible with the
POSIX interface, which is defined to use certain integer values as flags.
This is not an implementation choice which implementers can change at
will; it is part of the interface.
The specific *values* may be allowed to vary from platform to
platform, and since this is C even the definition of "int" may be
platform specific, but not the fact that they are ints. Hence the value
of POSIX_FADV_RANDOM could, theoretically, be different under Linux and
FreeBSD (say). It probably isn't, but it could be. If you hard-code the
magic number 1 in your code, you're risking the (tiny) chance of it
failing on some obscure POSIX system. But that doesn't imply that we must
test for object identity. There could be a million different instances,
all with the value POSIX_FADV_RANDOM.
Python does not guarantee that there is only a single 1 instance. If you
want to test whether a value is os.POSIX_FADV_RANDOM, the right way is to
compare that value for equality with os.POSIX_FADV_RANDOM, not identity.
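
A small sketch of the difference, assuming nothing beyond built-in ints.
The value 8531201 is arbitrary. Whether the two objects are the same is up
to the implementation, so the code records the result without relying on
it.

```python
# Build the same numeric value twice, at run time.
a = int("8531201")
b = int("8531201")

values_equal = a == b   # guaranteed by the language: same numeric value
same_object = a is b    # implementation-dependent; do not rely on it
```

Only values_equal is something a program can depend on.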
We could peek into the implementation, but it would be a
grave mistake to trust the implementation choices in the application.
So in my application code I might set:
fadv_flag = os.POSIX_FADV_RANDOM
A much better choice than hard-coding the magic value 1. But that choice
has absolutely nothing to do with whether 1 is a singleton or not.
in some other part of my code I might want to see how "flag" was set.
Should I use "==" or "is" to test it?
Equals, of course. There is absolutely no question about that. To even
*think* that you should test it with "is" means that you have completely
misunderstood what you are doing here. Why are you relying on an
implementation detail that CPython happens to cache and reuse small
integers like 1? What happens if you run your code under an
implementation of Python that doesn't cache small ints? Or if your
platform happens to set POSIX_FADV_RANDOM to a non-cached value like
8531201?
Python does not promise that POSIX_FADV_RANDOM will be a singleton value.
Using "is" is unsafe.
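
As a rough sketch of the safe test: the helper name wants_random_access is
hypothetical, and the fallback value 1 is an assumption that exists only so
the example runs on platforms where os.POSIX_FADV_RANDOM is not defined.

```python
import os

# Assumption: POSIX_FADV_RANDOM is missing on some platforms (e.g. Windows),
# so fall back to a placeholder value purely to keep the sketch runnable.
ADVICE_RANDOM = getattr(os, "POSIX_FADV_RANDOM", 1)


def wants_random_access(advice):
    """Hypothetical helper: True if advice has the value POSIX_FADV_RANDOM."""
    return advice == ADVICE_RANDOM


# Round-trip through a string to get an int that need not be the same object.
stored_advice = int(str(ADVICE_RANDOM))
```

wants_random_access(stored_advice) holds whether or not stored_advice
happens to be the very same object as the constant, which is the point of
using equality here.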
If I take the API documentation on its face value, I *must* use "==" for
os.SEEK*:
Correct.
if seek_flag == os.SEEK_END:
...
and I *must* use "is" for os.POSIX_FADV_*:
Incorrect.
if fadv_flag is os.POSIX_FADV_RANDOM:
...
Since, for all I know, os.POSIX_FADV_RANDOM might return a random value
for __eq__().
For all *you* know, perhaps, but since os.posix_fadvise is a thin wrapper
around the POSIX C function posix_fadvise, and that is documented as
expecting an int for the advice parameter, that cannot be the case.
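
If you want to convince yourself, a quick check along these lines will do.
It only inspects whichever interpreter and platform you run it on, so it
illustrates the point rather than proving it. On a platform without the
POSIX_FADV_* constants the list is simply empty.

```python
import os

# Collect whatever POSIX_FADV_* constants this platform's os module exposes.
fadv_names = [name for name in dir(os) if name.startswith("POSIX_FADV_")]

# Each one should be a plain int, matching the C interface.
all_ints = all(isinstance(getattr(os, name), int) for name in fadv_names)

# The SEEK_* constants are ints as well.
seek_are_ints = all(
    isinstance(value, int)
    for value in (os.SEEK_SET, os.SEEK_CUR, os.SEEK_END)
)
```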
Unfortunately Python has not had the money put into it to make it an ISO
standard like Java, and so there are certain areas where the behaviour is
known by common practice but not officially documented. (A bit like
British common law.) That the os module is a thin wrapper around
OS-specific services may not be explicitly stated, but it is nevertheless
true.