While it's easy to explain the behavior, I think the decision to dis-
allow mutable items as keys is a bit arbitrary. There is no need for
dict to recompute hash
What???
Of course it does. How else can it look up the key? Because it (somehow)
just recognizes that it has seen the key before? How? By magic?
(first of all, a user doesn't even need to know
if underneath 'hashing' is used -- the service is just a mapping between
one item to another item).
The user doesn't need to know the mechanism, but the dict does. Dicts are
implemented as hash tables. I suppose they could be implemented as
something else (what? linear lists? some sort of tree?) but the same
considerations must be made: the dict must be able to find keys it has
seen before. How is the dict supposed to recognise the key if the key has
changed?
Since we know hashing is used, all that is needed is, a well-defined way
to construct a hash out of a mutable. "Given a sequence, how to get a
hash" is the problem.
Nonsense. That's not the problem. The problem is how to get exactly the
same hash when the sequence has changed.
In other words, if you have two mutable objects M1 and M2, then you
expect:
hash(M1) == hash(M2) if and only if M1 and M2 are equal
hash(M1) != hash(M2) if M1 and M2 are unequal
but also:
if M1 mutates to become equal to M2, hash(M1) must remain the same while
still being different from hash(M2).
That means that hash() now is a non-deterministic function. hash([1,2,3])
will vary according to how the list [1,2,3] was constructed!
Obviously something has to give, because not all of these things are
mutually compatible.
If later the given sequence is different, that's
not the dict's problem.
Data structures don't have problems. Programmers do. And language
designers with sense build languages that minimize the programmers
problems, not maximize them.
So if the list changes, it will result in a different hash and we will
get a hash-miss. I doubt this is in anyway less intuitive than dis-
allowing mutable items as keys.
The choices for the language designer are:
(1) Invent some sort of magical non-deterministic hash function which
always does the Right Thing.
(2) Allow the hash of mutable objects to change, which means you can use
mutable objects as keys in dicts but if you change them, you can no
longer find them in the dict. They'll still be there, using up memory,
but you can't get to them.
(3) Simply disallow mutable objects as keys.
Alternative 1 is impossible, as we've seen, because the requirements for
the Right Thing are not mutually compatible.
Alternative (2) leads to hard-to-find, hard-to-diagnose bugs where you
store objects in a dict only for them to mysteriously disappear later.
Worse, it could lead to bugs like the following hypothetical:
M = [1, 2, 3]
D = {M: 'parrot'} # pretend this works
D {[1, 2, 3]: 'parrot'}
M.append(4)
D {[1, 2, 3, 4]: 'parrot'}
D[[1, 2, 3, 4]]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyError: [1, 2, 3, 4]
Try explaining that one to programmers: they can SEE the key in the dict
when they print it, but they can't get it or delete it because the hash
has changed.
Alternative 3 is easy to deal with: simply don't use mutable objects as
keys. That's what Python does. Sure, the programmer sometimes needs to
work around the lack (convert the list into a tuple, or a string, or
pickle it, whatever...) which on rare occasions is hard to do, but at
least there are no mysterious, hard to track down bugs.