Reducing memory overhead for dictionaries by removing precomputed hash

kirat.singh · Apr 18, 2006

Forgive me if this has already been discussed, but it seems to me that
one could shrink each dictionary entry to 2/3 of its current size (a
saving of about a third) by removing the precomputed hash in each bucket.
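
As a rough check of the per-entry arithmetic, here is a minimal sketch. It assumes each bucket holds three machine-word fields (hash, key pointer, value pointer); that layout and the helper name entry_bytes are illustrative assumptions, not taken from the interpreter source.

```python
import struct

# Size of one pointer-sized field on this platform (assumed field width).
WORD = struct.calcsize("P")

def entry_bytes(with_hash=True):
    """Bytes per bucket under the assumed layout: [hash,] key, value."""
    fields = 3 if with_hash else 2
    return fields * WORD

# Fraction of each bucket that dropping the hash field would save.
saved_fraction = 1 - entry_bytes(with_hash=False) / entry_bytes(with_hash=True)
print(entry_bytes(True), entry_bytes(False), round(saved_fraction, 3))
```

Under that assumption a bucket shrinks from three words to two, so the entry ends up at 2/3 of its former size.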

Since dictionaries only allow hashable (in practice, immutable) objects
as keys, one could move the precomputed hash into the keys.

* Strings are probably the most popular keys for dictionaries and they
already cache the hash (hmm that almost rhymes).
* Numbers are trivial to hash.
* For Tuples one could add a member to cache the hash.
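
To make the idea of a key that carries its own hash concrete, here is a small sketch. CachedHashKey is a hypothetical wrapper written for illustration, not an existing type; it computes the hash of its contents once and returns the stored value afterwards.

```python
class CachedHashKey:
    """Hypothetical immutable key that caches its own hash."""

    __slots__ = ("_items", "_hash")

    def __init__(self, *items):
        self._items = tuple(items)
        self._hash = None  # not computed until first requested

    def __hash__(self):
        # Compute once, then reuse the cached value on every later lookup.
        if self._hash is None:
            self._hash = hash(self._items)
        return self._hash

    def __eq__(self, other):
        return isinstance(other, CachedHashKey) and self._items == other._items


d = {CachedHashKey(1, 2): "a"}
print(d[CachedHashKey(1, 2)])
```

The cost moves from the dictionary to the key: one extra word per key object instead of one per bucket, which only pays off if keys are shared or the table has many empty buckets.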

So why store the hash? I imagine it makes rebuilding the dictionary a
bit quicker, since I guess you can avoid some comparisons because you
know there are no duplicates. I haven't looked at the code to see if it
does that, though.
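
A toy sketch of such a rebuild, under loud assumptions: the table is a plain list of (hash, key, value) tuples, sizes are powers of two, and collisions use linear probing for simplicity (not the interpreter's actual probe sequence). CountingKey, insert_fresh and resize are hypothetical names. The point is only that a resize with stored hashes calls neither the key's hash nor its equality test.

```python
class CountingKey:
    """Key that counts how often its hash is actually computed."""
    calls = 0

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        CountingKey.calls += 1
        return hash(self.name)

    def __eq__(self, other):
        return isinstance(other, CountingKey) and self.name == other.name


def insert_fresh(slots, h, key, value):
    """Place an entry known to be absent: no key comparisons needed."""
    mask = len(slots) - 1          # table size must be a power of two
    i = h & mask
    while slots[i] is not None:
        i = (i + 1) & mask         # linear probing, a simplification
    slots[i] = (h, key, value)


def resize(slots, new_size):
    """Rebuild into a larger table, reusing each entry's stored hash."""
    new_slots = [None] * new_size
    for entry in slots:
        if entry is not None:
            h, key, value = entry
            insert_fresh(new_slots, h, key, value)
    return new_slots


slots = [None] * 8
for k in [CountingKey(n) for n in ("a", "b", "c")]:
    insert_fresh(slots, hash(k), k, None)

calls_before = CountingKey.calls
slots = resize(slots, 16)
calls_after = CountingKey.calls
print(calls_before, calls_after)
```

If the hash lived only in the key, the same rebuild would need one cached-hash fetch per entry, which is cheap for strings but a full recomputation for any key type that does not cache.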

Collision chains are possibly a bit quicker to traverse too, since a
stored hash can be compared before the keys themselves. I think Python
uses a mask rather than a mod by a prime, so hashes with the same low
bits will collide and collisions might be more common. That is possibly
fixable by using a congruential hash, but that's all black magic to me.
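
A small illustration of the mask-versus-prime point, using made-up hash values that differ only above their low three bits. The helper slots_used is hypothetical, and this shows only the first slot chosen; CPython's real probe sequence also mixes in higher bits of the hash on later probes.

```python
def slots_used(hashes, index):
    """Count the distinct initial slots a set of hashes maps to."""
    return len({index(h) for h in hashes})


# Eight hash values whose low three bits are all zero: 0, 8, ..., 56.
hashes = [i * 8 for i in range(8)]

by_mask = slots_used(hashes, lambda h: h & 7)   # table of 8, low-bit mask
by_prime = slots_used(hashes, lambda h: h % 7)  # table of 7, mod by a prime

print(by_mask, by_prime)
```

With the mask every value lands in the same initial slot, while the prime modulus spreads them over all seven slots.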

-Kirat
