STL map kinda functionality in C

comp_novice · Jul 10, 2006

Hi,

I would like to store a key-value pair and then retrieve the value
based on a given key value.

I am quite conversent in C++ but this needs be implemented in C.

Would appriciate any idea how to achieve this in C?

Thanks in advance
DKRS.

Richard Heathfield · Jul 10, 2006

comp_novice said:

Hi,

I would like to store a key-value pair and then retrieve the value
based on a given key value.

I am quite conversent in C++ but this needs be implemented in C.

Would appriciate any idea how to achieve this in C?

Use a binary search tree. It is trivial to write a bst that can handle any
data type. If you wish to separate the key from the rest of the value,
that's up to you, but C doesn't really mind either way.

For information on binary search trees in C, check out http://adtinfo.org
which documents Ben Pfaff's GNU libavl library. IIRC Ben normally builds
his trees out of ints rather than the void *s you'll be needing, but the
techniques are basically the same.

Ben Pfaff · Jul 10, 2006

Richard Heathfield said:
For information on binary search trees in C, check out http://adtinfo.org
which documents Ben Pfaff's GNU libavl library. IIRC Ben normally builds
his trees out of ints rather than the void *s you'll be needing, but the
techniques are basically the same.

I think that the C Unleashed book implementation might have used
ints, but GNU libavl uses void *s.

Marc Boyer · Jul 10, 2006

Le 10-07-2006 said:
I would like to store a key-value pair and then retrieve the value
based on a given key value.

I am quite conversent in C++ but this needs be implemented in C.

Would appriciate any idea how to achieve this in C?

You could have a look at BPL, it offers a STL-like map
http://www.enseeiht.fr/~boyer/Tools.html

Any feedback welcome,
Marc Boyer

Richard Heathfield · Jul 10, 2006

Ben Pfaff said:

I think that the C Unleashed book implementation might have used
ints,
Indeed.

but GNU libavl uses void *s.

Oh, that's good - I wish I'd known; I could have saved myself some work.

websnarf · Jul 10, 2006

Richard said:
comp_novice said:

I would like to store a key-value pair and then retrieve the value
based on a given key value.

I am quite conversent in C++ but this needs be implemented in C.

Would appriciate any idea how to achieve this in C?

Click to expand...

Use a binary search tree. [...]

Huh? Doesn't this assume you have a lexical ordering on keys? Why
don't we go for a hash table instead? Or at least ask if the keys are
lexical or not.

Richard Heathfield · Jul 10, 2006

(e-mail address removed) said:

Richard said:
Richard said:

comp_novice said:

I would like to store a key-value pair and then retrieve the value
based on a given key value.

I am quite conversent in C++ but this needs be implemented in C.

Would appriciate any idea how to achieve this in C?

Click to expand...

Use a binary search tree. [...]

Click to expand...

Huh? Doesn't this assume you have a lexical ordering on keys?

<shrug> Any key can be expressed as a bit pattern. Bit patterns can be
compared in a consistent manner. As long as a key is unique, in the worst
case you can compare them using memcmp.

Why
don't we go for a hash table instead?

I like trees.

websnarf · Jul 10, 2006

Richard said:
(e-mail address removed) said:

Richard said:

comp_novice said:
I would like to store a key-value pair and then retrieve the value
based on a given key value.

I am quite conversent in C++ but this needs be implemented in C.

Would appriciate any idea how to achieve this in C?

Use a binary search tree. [...]

Click to expand...

Huh? Doesn't this assume you have a lexical ordering on keys?

Click to expand...

<shrug> Any key can be expressed as a bit pattern. Bit patterns can be
compared in a consistent manner. As long as a key is unique, in the worst
case you can compare them using memcmp.

Except if the key is a pointer to the *real* contents of the key. I.e.
"Hello" and strdup("Hello") might compare to be equal even though their
pointers do not compare as equal. (Of course string are lexical, but
obviously this idea extends to more abstract keys.)

Richard Heathfield · Jul 10, 2006

(e-mail address removed) said:

Except if the key is a pointer to the *real* contents of the key.

<shrug> The key that is a pointer to the real key is not itself the real key
(unless it points to itself, which isn't terribly useful).

Comparison functions deal with this in a trivially simple manner.

Dann Corbit · Jul 10, 2006

comp_novice said:
Hi,

I would like to store a key-value pair and then retrieve the value
based on a given key value.

I am quite conversent in C++ but this needs be implemented in C.

Would appriciate any idea how to achieve this in C?

Hash table.
http://en.wikipedia.org/wiki/Hash_table

Sourceforge will have lots of implementations.

Dann Corbit · Jul 10, 2006

Richard Heathfield said:
(e-mail address removed) said:

Richard said:

comp_novice said:
I would like to store a key-value pair and then retrieve the value
based on a given key value.

I am quite conversent in C++ but this needs be implemented in C.

Would appriciate any idea how to achieve this in C?

Use a binary search tree. [...]

Click to expand...

Huh? Doesn't this assume you have a lexical ordering on keys?

Click to expand...

<shrug> Any key can be expressed as a bit pattern. Bit patterns can be
compared in a consistent manner. As long as a key is unique, in the worst
case you can compare them using memcmp.

Why
don't we go for a hash table instead?

Click to expand...

I like trees.

A hash table is much faster than a tree for all operations.

The only reason not to use a hash table is if you need to do range searches.
If all searches are equality searches, then the right answer is a hash
table.

Ben Pfaff · Jul 10, 2006

Dann Corbit said:
A hash table is much faster than a tree for all operations.

The only reason not to use a hash table is if you need to do range searches.
If all searches are equality searches, then the right answer is a hash
table.

I wish someone had pointed this out to me before I spent so much
time on carefully implementing binary trees. I probably would
never have done so if I'd realized that, actually, for all the
problems I was interested in solving at the time, hash tables
were a much better solution. (However, I was only 14 or so at
the time.)

These days, I'm better at thinking before I code.

Dann Corbit · Jul 10, 2006

Ben Pfaff said:
I wish someone had pointed this out to me before I spent so much
time on carefully implementing binary trees. I probably would
never have done so if I'd realized that, actually, for all the
problems I was interested in solving at the time, hash tables
were a much better solution. (However, I was only 14 or so at
the time.)

That would have been a terrible tragedy. Your AVL tree project is the
finest discussion on the topic that I have ever seen. And sometimes, we
want to search for ranges of things instead of exact matches. In such a
case, a tree is the right answer and a hash table is the wrong answer.

These days, I'm better at thinking before I code.
--
char a[]="\n .CJacehknorstu";int putchar(int);int main(void){unsigned long
b[]
={0x67dffdff,0x9aa9aa6a,0xa77ffda9,0x7da6aa6a,0xa67f6aaa,0xaa9aa9f6,0x11f6},*p
=b,i=24;for(;p+=!*p;*p/=4)switch(0[p]&3)case 0:{return
0;for(p--;i--;i--)case+
2:{i++;if(i)break;else default:continue;if(0)case
1utchar(a[i&15]);break;}}}

Eric Sosman · Jul 10, 2006

Dann Corbit wrote On 07/10/06 15:12,:

A hash table is much faster than a tree for all operations.

Usually, yes -- but as Pitti-Sing says, "Bless you,
it all depends!"

The only reason not to use a hash table is if you need to do range searches.
If all searches are equality searches, then the right answer is a hash
table.

I'd extend this just a tiny bit: If range *operations*
are important, ordered data structures (trees, skip lists,
what-have-you) might be the answer. Even if all the searches
are for exact equality, it is sometimes important to be able
to access the "neighbors" of a located item, or to move
"forward" and/or "backward" from it.

But if all that's needed is a plain vanilla key-and-value
mapping, I'd agree with Dann: the hash table is the weapon of
choice.

Morris Dovey · Jul 10, 2006

Eric Sosman (in 1152562315.492306@news1nwk) said:

| Dann Corbit wrote On 07/10/06 15:12,:
||
|| A hash table is much faster than a tree for all operations.
|
| Usually, yes -- but as Pitti-Sing says, "Bless you,
| it all depends!"

I'm inclined to agree with Eric on this one - that it all depends on
the context. For relatively small key sizes (e.g. stock symbols), I've
been able to achieve my best lookup speeds using trie indexes.

websnarf · Jul 10, 2006

Dann said:
Richard Heathfield said:

(e-mail address removed) said:

Richard Heathfield wrote:
comp_novice said:
I would like to store a key-value pair and then retrieve the value
based on a given key value.

I am quite conversent in C++ but this needs be implemented in C.

Would appriciate any idea how to achieve this in C?

Use a binary search tree. [...]

Huh? Doesn't this assume you have a lexical ordering on keys?

Click to expand...

<shrug> Any key can be expressed as a bit pattern. Bit patterns can be
compared in a consistent manner. As long as a key is unique, in the worst
case you can compare them using memcmp.

Why
don't we go for a hash table instead?

Click to expand...

I like trees.

Click to expand...

A hash table is much faster than a tree for all operations.

This is not true. A hash table is faster for *most* operations, under
the most typical real world scenarios. A hash table assumes you have a
lot of memory, and basically leverages the large addressing
capabilities of general hardware in lieu of log(n) binary choices to
find individual entries.

But with a hash table you still have to compute a hash function
whatever it is you are looking for, before you search for, insert or
delete. In ideal cases of a binary tree with large string keys, it may
be faster just to burn the cost of O(log(tree-size)) comparisons to
find out that your entry is not in the tree, rather than compute the
hash of it in O(key-length) time.

Furthermore, if your hash table has been improperly sized, then you
will pay unusually high reprobe penalities, or may thrash your memory
when you don't need to. A hash table usually never has 100% density
efficiency -- on average its half filled. If you make an
auto-resizable hash table, and assume that the number of entries is
generally monotonically increasing, then it means that periodically,
operations will cost O(tree-size) which can be extremely expensive, and
will not fly in real-time systems.

That all being said -- *usually* hash tables are faster. It assumes
you can compute the hash function very quickly, and that your table is
not encumbered by extra overhead for each operation. James Dow Allen
has written a hash table that satisfies these contraints (though I
personally dislike his programming style.)

The only reason not to use a hash table is if you need to do range searches.

Well, another case, not described above, is if you are implementing
something like a file system.

If all searches are equality searches, then the right answer is a hash
table.

A hash table is more general in the case where you *have* to use
equality searches. Otherwise you just have to understand your case to
know which is the best strategy.

Dann Corbit · Jul 10, 2006

Dann said:
Dann said:

Richard Heathfield said:

(e-mail address removed) said:
Richard Heathfield wrote:
comp_novice said:
I would like to store a key-value pair and then retrieve the value
based on a given key value.

I am quite conversent in C++ but this needs be implemented in C.

Would appriciate any idea how to achieve this in C?

Use a binary search tree. [...]

Huh? Doesn't this assume you have a lexical ordering on keys?

<shrug> Any key can be expressed as a bit pattern. Bit patterns can be
compared in a consistent manner. As long as a key is unique, in the
worst
case you can compare them using memcmp.

Why
don't we go for a hash table instead?

I like trees.

Click to expand...

A hash table is much faster than a tree for all operations.

Click to expand...

This is not true. A hash table is faster for *most* operations, under
the most typical real world scenarios. A hash table assumes you have a
lot of memory, and basically leverages the large addressing
capabilities of general hardware in lieu of log(n) binary choices to
find individual entries.

A tree structure requires more memory than a hash table.

But with a hash table you still have to compute a hash function
whatever it is you are looking for, before you search for, insert or
delete. In ideal cases of a binary tree with large string keys, it may
be faster just to burn the cost of O(log(tree-size)) comparisons to
find out that your entry is not in the tree, rather than compute the
hash of it in O(key-length) time.

I guess that the hash computation will be as fast as comparison. Since it
happens N times instead of N*log(N) times, the hash computation will be
faster:
http://www.cse.yorku.ca/~oz/hash.html

Furthermore, if your hash table has been improperly sized, then you
will pay unusually high reprobe penalities, or may thrash your memory
when you don't need to. A hash table usually never has 100% density
efficiency -- on average its half filled. If you make an
auto-resizable hash table, and assume that the number of entries is
generally monotonically increasing, then it means that periodically,
operations will cost O(tree-size) which can be extremely expensive, and
will not fly in real-time systems.

It's pretty unusual to have no idea how large your data set is.

That all being said -- *usually* hash tables are faster. It assumes
you can compute the hash function very quickly, and that your table is
not encumbered by extra overhead for each operation. James Dow Allen
has written a hash table that satisfies these contraints (though I
personally dislike his programming style.)

There are heaps and loads of hash functions with liberal licenses for use.

Well, another case, not described above, is if you are implementing
something like a file system.

Here, you may be onto something. I have seen real-world studies where disk
storage is required where b+trees are faster than hash tables.

A hash table is more general in the case where you *have* to use
equality searches. Otherwise you just have to understand your case to
know which is the best strategy.

If you don't have equality searches, then hash tables are totally useless.

Richard Heathfield · Jul 10, 2006

Dann Corbit said:

A hash table is much faster than a tree for all operations.

Consider writing out the keys and values of all nodes, in key order (i.e.
print a dictionary, phone list, or whatever). Can you really do that faster
with a hash table than with a tree? I'm surprised.

Dann Corbit · Jul 10, 2006

Richard Heathfield said:
Dann Corbit said:

Consider writing out the keys and values of all nodes, in key order (i.e.
print a dictionary, phone list, or whatever). Can you really do that
faster
with a hash table than with a tree? I'm surprised.

No. Hash tables are not useful for ordered lists.
By all operations I meant "insert, update, delete, find." which are all
O(1).
Elsethread, I pointed out that hash tables are not useful for range
comparisons.

Richard Heathfield · Jul 10, 2006

Dann Corbit said:

By all operations I meant "insert, update, delete, find." which are all
O(1).

Surely that depends on the collision rate. If the hash table has a great
many buckets and a superb hash function, then obviously that improves
matters, but collisions will still need to be resolved, and that can't be
done in constant time.

stl::map iterator	2	Jul 1, 2013
Sorting an STL map	1	Oct 19, 2013
Weird Behavior with Rays in C and OpenGL	4	Feb 13, 2024
Keyboard event detection in C#	1	Feb 8, 2023
Is this PEP-able? (syntax for functools.partial-like functionality)	3	Aug 20, 2013
STL map impl. in C	2	Jul 23, 2003
Unraveling Pointers and Arrays in C++: Seeking Expert Advice.	1	Jan 26, 2024
How to try a range of hex values in C# code ?	0	Nov 19, 2022

STL map kinda functionality in C

comp_novice

Richard Heathfield

Ben Pfaff

Marc Boyer

Richard Heathfield

websnarf

Richard Heathfield

websnarf

Richard Heathfield

Dann Corbit

Dann Corbit

Ben Pfaff

Dann Corbit

Eric Sosman

Morris Dovey

websnarf

Dann Corbit

Richard Heathfield

Dann Corbit

Richard Heathfield

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads