S
Steven D'Aprano
O(n) for all other entries in the dict which suffer a hash collision
with the searched entry.
True, a sensible choice of hash function will reduce n to 1 in common
cases, but it becomes an important consideration for larger datasets.
I'm glad you're wrong for CPython's dictionaries. The only time the
lookup would degenerate to O[n] would be if the hash table had only one
slot. CPython sensibly increases the hash table size when it becomes
too small for efficiency.
Where have you seen dictionaries so poorly implemented?
In vanilla CPython up to version (I think) 3.3, where it's possible to
DoS the hash generator. Hash collisions are always possible, just
ridiculously unlikely unless deliberately exploited.
Not so unlikely actually.
py> hash(3)
3
py> hash(2**64 + 2)
3
py> hash(-1)
-2
py> hash(-2)
-2
By its very nature, a hash function cannot fail to have collisions. The
problem is that in general you have an essentially unlimited number of
objects being mapped to a large but still finite number of hash values.