Steven D'Aprano
Steven D'Aprano said: Is the implementation smart enough to know that x == y is always
False if x and y are using different internal representations?
[...] There may be circumstances where two strings have different
internal representations even though their content is the same.
If there is a deterministic algorithm which maps string content to
representation type, then I don't see how it's possible for two strings
with different representation types to have the same content. Could you
give me an example of when this might happen?
There are deterministic algorithms which can produce the same result
with two different internal formats. Here's an example from Python 2:
py> sum([1, 2**30, -2**30, 2**30, -2**30])
1
py> sum([1, 2**30, 2**30, -2**30, -2**30])
1L
The internal representation (int versus long) differs even though the sum
is the same.
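Python 3 merged int and long, so the session above cannot be reproduced
there as written. A loosely analogous sketch, assuming Python 3 and using
int versus float in place of int versus long, shows the same point: two
values can compare equal while being stored in different representations.

```python
# Assumption: Python 3, where int and long are unified. int/float stands in
# for the int/long pair from the Python 2 session.
a = 1      # stored as an int
b = 1.0    # stored as a float

values_equal = (a == b)
same_type = (type(a) is type(b))

print(values_equal, same_type)
```

An equality test therefore cannot conclude "not equal" merely from seeing
two different representations, at least not for numbers.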
A second example: the order of keys in a dict is deterministic but
unpredictable, as it depends on the history of insertions and deletions
into the dict. So two dicts could be equal, and yet have radically
different internal layout.
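A minimal sketch of the dict case, assuming Python 3.7 or later, where
iteration follows insertion order (in Python 2 the order was arbitrary).
The key names are made up for illustration. The two dicts are built with
different histories, compare equal, and yet iterate differently:

```python
# Assumption: Python 3.7+, where dicts preserve insertion order.
d1 = {}
d1["a"] = 1
d1["b"] = 2

d2 = {}
d2["b"] = 2
d2["junk"] = 0   # hypothetical extra key, inserted and then deleted
d2["a"] = 1
del d2["junk"]

same_content = (d1 == d2)            # equality ignores insertion history
same_order = (list(d1) == list(d2))  # iteration order reflects history

print(same_content, same_order)
```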
One final example: list resizing. Here are two lists which are equal but
have different sizes in memory:
py> import sys
py> a = [0]
py> b = range(10000)
py> del b[1:]
py> a == b
True
py> sys.getsizeof(a)
36
py> sys.getsizeof(b)
48
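The same experiment can be sketched for Python 3, where range() no longer
returns a list and has to be wrapped in list(). The byte counts reported by
sys.getsizeof are implementation and platform details, so no particular
numbers are assumed here; the only thing relied on is that the two lists
compare equal.

```python
import sys

a = [0]
b = list(range(10000))  # range() is lazy in Python 3, so build a real list
del b[1:]               # shrink b down to a single element

equal = (a == b)
size_a = sys.getsizeof(a)
size_b = sys.getsizeof(b)

# The sizes may differ because b can keep spare capacity after shrinking.
print(equal, size_a, size_b)
```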
Is PEP 393 another example of this? I have no idea. Somebody who is more
familiar with the details of the implementation would be able to answer
whether or not that is the case. I'm just suggesting that it is possible.