Hashtable or array of structs?

A

Alfonso Morra

I am implementing an application which requires the storage of a large
number of items in a cache.

I have 3-tuple key reprented by a struct, as well as an n-tuple (n is
fixed) dataset, also represented by a struct. At run time, I will no
exactly the number of items in the hashtable (I will have populated it
myself by loading the data from a database). During the course of the
applications lifetime, I will retrieve data in the hashtable and update it.

My question about the suitability of using a hashtable (as opposed to a
simple array of structs) is this:

The table will be 100% full - and I know that hashtables begin to suffer
a performance hit once they get to about 70% filled. (or maybe I should
create the hashtable to be able to hold a larger number of items than I
know I will need? - The number is fixed and does not vary after initial
population).

If my key was a single item, then it would be relatively trivial to
implement the cache as an array of structs. However, The are three items
that uniquely identify a record (this 3-tuple actually form the
composite primary key loaded from the db schema).

I would much prefer to implement this as a hashtable, as I can easily
use the composite lookup key. I have also chosen the keys to be
integers, to further speed the computation of the has key. I would
appreciate any feedback on this choice.

Thanks
 
C

CBFalconer

Alfonso said:
.... snip ...

I would much prefer to implement this as a hashtable, as I can
easily use the composite lookup key. I have also chosen the keys
to be integers, to further speed the computation of the has key.
I would appreciate any feedback on this choice.

Take a look at hashlib and the example programs with it. It will
allow entering one item in multiple databases and easy
experimentation with hashfunctions. Another usage example is
id2id-20. Both can be found at:

<http://cbfalconer.home.att.net/download/>
 
M

Malcolm

Alfonso Morra said:
My question about the suitability of using a hashtable (as opposed to a
simple array of structs) is this:

The table will be 100% full - and I know that hashtables begin to suffer a
performance hit once they get to about 70% filled. (or maybe I should
create the hashtable to be able to hold a larger number of items than I
know I will need? - The number is fixed and does not vary after initial
population).
You need free slots in a hash table, or the algorithm begind to break down.
So you will need memory for about 2 * the number of entries.
If my key was a single item, then it would be relatively trivial to
implement the cache as an array of structs. However, The are three items
that uniquely identify a record (this 3-tuple actually form the composite
primary key loaded from the db schema).
If I understand this correctly you will need three arrays / hashtables for
each entry. If the keys are mutable then array indexing is a poor choice -
you will need to constantly rebuild the arrays.
Make the hashtables store pointers to reduce memory overhead and avoid
synchronisation problems.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,995
Messages
2,570,236
Members
46,825
Latest member
VernonQuy6

Latest Threads

Top