David McDivitt
From: David McDivitt
Subject: Re: Extreamly large Hashtable
Date: Sat, 7 May 2005 07:58:03 +0100

Chris Uppal said:
> If a sufficiently good hash function is available, and space is not a problem,
> then a big hash table can /always/ beat a big binary lookup. Hashtables can
> be made O(1) (with more-or-less any desired constant factor, trading space
> for speed), whereas binary lookup is O(log n).
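To make the trade-off concrete, here is a minimal sketch (names and sizes are illustrative, not from the thread): the same keys stored both in a pre-sized HashMap, which buys expected O(1) lookups with extra space, and in a sorted array searched in O(log n).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Same data, two lookup structures: a pre-sized hash table (expected O(1))
// versus binary search over a sorted array (O(log n) comparisons).
public class LookupDemo {
    public static void main(String[] args) {
        int n = 1 << 20;
        int[] sortedKeys = new int[n];
        Map<Integer, Integer> table = new HashMap<>(2 * n); // extra capacity trades space for speed
        for (int i = 0; i < n; i++) {
            sortedKeys[i] = i * 2;   // sorted keys for the binary search
            table.put(i * 2, i);     // same keys in the hash table
        }
        int probe = 123456;
        int viaHash = table.get(probe);                         // one hash, no comparisons of neighbors
        int viaBinary = Arrays.binarySearch(sortedKeys, probe); // ~20 comparisons for n = 2^20
        System.out.println(viaHash + " " + viaBinary);
    }
}
```

Both lookups return the same index; the difference is only in how many probes each one costs.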
A binary search is the fastest and most easily implemented scheme when it
can be used, because binary search is at the core of most other lookup
schemes, including databases traversing an index. By laying out a sorted
flat file and doing a low-level binary search directly on that file, much
overhead is eliminated. If records are large, a key and offset value can be
obtained from the flat file, and the offset used to fetch the actual data
from a second file. For variable-length records, obtain the offset and
length from the first file and use them to fetch from the second. Binary
searches done this way are greatly helped by the underlying operating
system, which caches disk pages and removes that load from the application
entirely.
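A minimal sketch of that flat-file approach, assuming a made-up index layout of fixed-length 16-byte records (an 8-byte key followed by an 8-byte offset into a separate data file), sorted by key:

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

// Low-level binary search directly on a sorted flat file of fixed-length
// records. Record layout (illustrative): long key, then long offset.
public class FlatFileSearch {
    static final int RECORD = 16; // bytes per record: 8-byte key + 8-byte offset

    // Returns the stored offset for `key`, or -1 if the key is absent.
    static long lookup(RandomAccessFile f, long key) throws IOException {
        long lo = 0, hi = f.length() / RECORD - 1;
        while (lo <= hi) {
            long mid = (lo + hi) >>> 1;
            f.seek(mid * RECORD);          // the OS page cache absorbs repeated seeks
            long k = f.readLong();
            if (k == key) return f.readLong(); // offset is stored right after the key
            if (k < key) lo = mid + 1; else hi = mid - 1;
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        Path idx = Files.createTempFile("index", ".dat");
        try (RandomAccessFile f = new RandomAccessFile(idx.toFile(), "rw")) {
            for (long i = 0; i < 1000; i++) { // sorted keys 0, 10, 20, ...
                f.writeLong(i * 10);          // key
                f.writeLong(i * 100);         // pretend offset into the data file
            }
            System.out.println(lookup(f, 420)); // found: prints its offset
            System.out.println(lookup(f, 7));   // absent: prints -1
        }
        Files.delete(idx);
    }
}
```

The returned offset would then be used with a second `seek` into the data file; for variable-length records the index would also store a length alongside the offset.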
The original question asked for alternatives to a ridiculously large hash
table. A small or reasonably sized hash table would certainly beat a binary
search on disk.