M
micans
CBFalconer said:I never saw any other part of this thread, but your comments only
make sense for a highly restricted set of hash table
implementations, i.e. those that handle overflow by linked lists
(or such) and where a (possibly long) string comparison can be
replaced by an integer comparison. Even then, string comparisons
normally find a difference in the first few characters, even if the
potential strings are long.
On the other hand re-organizing tables, that control the load
factor and thus the length of any chains involved, do not have this
problem. My hashlib keeps the load under about 88%, and the actual
load is more likely to be about 66%, resulting in very short chains
and expected O(1) performance.
My other response could have used more careful wording. It uses
(perhaps ill-chosen) 'rehashing' to refer to something that could be
similar to your 're-organizing'. In the quoted text above I write 'The
hashes adapt while growing; they double in size when the load factor is
exceeded'. This is a sloppy description of the way 'rehashing' works,
namely by doubling the number of buckets and recomputing the linked
lists. So I am curious about your 'On the other hand'. What are you
exactly contrasting it with?
Stijn