Container library progress

jacob navia · Nov 9, 2009

The bitstring module is finished (modulo tests). It now compiles cleanly on
linux and the few tests already in place give the same results.

This finished I will start implementing the first of the associative containers (hashtable)
for implementing later AVL trees and probably fibonacci heaps...

There are MANY papers bout associative containers, and many algorithms to choose
from.

What bothers me is that many algorithms look much better in paper, but in practice they
do not perform very well because of cache interactions:

The more overhead a container has, the less data that will be stored in the cache.
The less data stored in the cache, the more that the algorithm will "swap", i.e.
will wait for main memory to be ready. Since at 3GHZ the processor can do up to 40-50
clocks just waiting for main memory, all the cleverness of the algorithms is totally
wasted unless you are handling HUGE data sets in the order of several million
members.

The idea would be to implement simple algorithms first, then to implement more sophisticated ones
but maintaining the same interface.

The interface will be similar to the C++ STL to easy the conversions between
this library and C++.

At the same time I have developed a primitive template mechanism in standard
C, and tested it with the "vector" container. It is a very simple program that
expand a suitable written file containing keywords introduced by $@ ... @$
and produces a standard C file for a vector container specialized to containers of
the given type.

All this obviates the need for more complicated language constructs.

jacob

Noob · Nov 9, 2009

jacob said:
The bitstring module is finished (modulo tests). It now compiles cleanly on
linux and the few tests already in place give the same results.

This finished I will start implementing the first of the associative
containers (hashtable)
for implementing later AVL trees and probably fibonacci heaps...

clc is now your personal blogging space?

There are MANY papers bout associative containers, and many algorithms
to choose from.

Should you ever wish to discuss them, comp.programming is there for you.

What bothers me is that many algorithms look much better in paper, but
in practice they
do not perform very well because of cache interactions:

The more overhead a container has, the less data that will be stored in
the cache.
The less data stored in the cache, the more that the algorithm will
"swap", i.e.
will wait for main memory to be ready. Since at 3GHZ the processor can
do up to 40-50
clocks just waiting for main memory, all the cleverness of the
algorithms is totally
wasted unless you are handling HUGE data sets in the order of several
million
members.

No. You're misusing the terminology. Again.

Getting data from RAM is not considered swapping.
Swapping involves copying data between secondary storage and RAM.

http://en.wikipedia.org/wiki/Paging

Sheesh.

Ben Pfaff · Nov 9, 2009

jacob navia said:
What bothers me is that many algorithms look much better in
paper, but in practice they do not perform very well because of
cache interactions:

The more overhead a container has, the less data that will be
stored in the cache.

It seems that you've decided that you want at least one balanced
binary tree implementation. Have you considered a scapegoat
tree? They have no per-node overhead at all, and only a small
overhead per-tree.

More information:
http://en.wikipedia.org/wiki/Scapegoat_tree

My implementation:
http://git.savannah.gnu.org/cgit/pspp.git/tree/src/libpspp/bt.h
http://git.savannah.gnu.org/cgit/pspp.git/tree/src/libpspp/bt.c
http://git.savannah.gnu.org/cgit/pspp.git/tree/tests/libpspp/bt-test.c

jacob navia · Nov 9, 2009

Ben Pfaff a écrit :

It seems that you've decided that you want at least one balanced
binary tree implementation. Have you considered a scapegoat
tree? They have no per-node overhead at all, and only a small
overhead per-tree.

More information:
http://en.wikipedia.org/wiki/Scapegoat_tree

My implementation:
http://git.savannah.gnu.org/cgit/pspp.git/tree/src/libpspp/bt.h
http://git.savannah.gnu.org/cgit/pspp.git/tree/src/libpspp/bt.c
http://git.savannah.gnu.org/cgit/pspp.git/tree/tests/libpspp/bt-test.c

Hi

That looks extremely interesting, and your implementation is quite OK. The only problem I see
is that I do not want to have any copyright problems so that people can do ANYTHING with the code.
In that way it will be widely used. Is there a way to avoid the GNU license?

The problem with GNU source code is that if you use it, you have to put your software
under GNU also. I want that the library is available for commercial uses ALSO, i.e.
absolutely no copyright.

Could that be possible?

Thanks

Ben Pfaff · Nov 9, 2009

jacob navia said:
That looks extremely interesting, and your implementation is quite
OK. The only problem I see is that I do not want to have any copyright
problems so that people can do ANYTHING with the code. In that way it
will be widely used. Is there a way to avoid the GNU license?

The problem with GNU source code is that if you use it, you have to
put your software under GNU also. I want that the library is available
for commercial uses ALSO, i.e. absolutely no copyright.

The copyright in that code is assigned to FSF, so I don't really
have control over it.

It's simple code; I was just pointing to it to show that it was
simple. I guess that you are able to implement whatever tree
structure you like.

jacob navia · Nov 9, 2009

Ben Pfaff a Ã©crit :

The copyright in that code is assigned to FSF, so I don't really
have control over it.

It's simple code; I was just pointing to it to show that it was
simple. I guess that you are able to implement whatever tree
structure you like.

OK. Thanks for your help, I did not know at all about scapegoat trees!

According to the wiki they should be much better than the AVL or
the others...

Specially this part seems interesting:

Unlike most other self-balancing search trees, scapegoat trees are entirely flexible as to their
balancing. They support any Î± such that 0.5 <= Î± < 1. A high Î± value results in fewer balances,
making insertion quicker but lookups and deletions slower, and vice versa for a low Î±. Therefore in
practical applications, an Î± can be chosen depending on how frequently these actions should be
performed.

This means that the user could change the parameter when creating the tree according
to the usage he will do of it, quite nice!

Thanks again.

jacob

jacob navia · Nov 9, 2009

Noob a écrit :

clc is now your personal blogging space?

Should you ever wish to discuss them, comp.programming is there for you.

The thread "C the complete Meta-Nonsense" contains (until now) 81 messages,
all of them completely off topic. You did not complain about it.

The thread "How to convert Infix notation to postfix notation" has degenerated into
a flame war about everything but C. It has 451 messages. You did not complain.

When I report about the progress done in a project to make a public library
of containers in C, a project that I have repeatedly written about in this group

THEN

you complain.

No. You're misusing the terminology. Again.

Getting data from RAM is not considered swapping.
Swapping involves copying data between secondary storage and RAM.

http://en.wikipedia.org/wiki/Paging

You did notice the QUOTES that I put around the "swap" verb?

I was applying one concept to a slightly different environment and
warning the reader with the quotes that the word can't be understood
literally. You choose to make an issue of that.

Why?

Because you have nothing to say, nothing substantive anyway.

Sheesh.

Yes.

Moi · Nov 9, 2009

More information:
http://en.wikipedia.org/wiki/Scapegoat_tree

Thanks for the links, Ben.
It is a pity that the ACM site is still restricted to paying users only.

http://git.savannah.gnu.org/cgit/pspp.git/tree/src/libpspp/bt.c

But luckily I can have a look at your source code, too.

AvK

Ben Pfaff · Nov 9, 2009

Moi said:
It is a pity that the ACM site is still restricted to paying users only.

I agree.

The original scapegoat tree paper is available here (linked from
Wikipedia):
http://cg.scs.carleton.ca/~morin/teaching/5408/refs/gr93.pdf

Moi · Nov 9, 2009

I agree.

The original scapegoat tree paper is available here (linked from
Wikipedia):
http://cg.scs.carleton.ca/~morin/teaching/5408/refs/gr93.pdf

Thanks again.
I'll check it out.

AvK

Seebs · Nov 10, 2009

The thread "C the complete Meta-Nonsense" contains (until now) 81 messages,
all of them completely off topic. You did not complain about it.

They're not *completely* off-topic, though. Discussion of C books is topical,
so's discussion of reviews of C books.

There's certainly a fair amount of discussion of C going on.

-s

Dann Corbit · Nov 10, 2009

Thanks again.
I'll check it out.

Boost has a Scapegoat tree container also:
http://www.boost.org/doc/libs/1_36_
0/doc/html/intrusive/sg_set_multiset.html

Ian Collins · Nov 10, 2009

jacob said:
The idea would be to implement simple algorithms first, then to
implement more sophisticated ones
but maintaining the same interface.

The interface will be similar to the C++ STL to easy the conversions
between
this library and C++.

At the same time I have developed a primitive template mechanism in
standard
C, and tested it with the "vector" container. It is a very simple
program that
expand a suitable written file containing keywords introduced by $@ ... @$
and produces a standard C file for a vector container specialized to
containers of
the given type.

All this obviates the need for more complicated language constructs.

Isn't this a bit of a quixotic exercise? Why not just use C++?

If you have to use a platform without a C++ compiler, give the nice
people at www.comeaucomputing.com a call.

jacob navia · Nov 10, 2009

Ian Collins a écrit :

Isn't this a bit of a quixotic exercise? Why not just use C++?

If you have to use a platform without a C++ compiler, give the nice
people at www.comeaucomputing.com a call.

As always, the messages from Mr Collins promote C++. His only objective
in posting in this group is to preach how great C++ is.

Granted, here anybody can say anything, Mr Collins included. But it would
be better if he stopped bothering to write in a C group if he thinks C++
is so great. There are C++ groups where he can write.

Nick Keighley · Nov 10, 2009

They're not *completely* off-topic, though. Discussion of C books is topical,
so's discussion of reviews of C books.

There's certainly a fair amount of discussion of C going on.

as someone who's participated, I think you're fooling yourself

Nick Keighley · Nov 10, 2009

Ian Collins a écrit :

As always, the messages from Mr Collins promote C++. His only objective
in posting in this group is to preach how great C++ is.

Granted, here anybody can say anything, Mr Collins included. But it would
be better if he stopped bothering to write in a C group if he thinks C++
is so great. There are C++ groups where he can write.

if someone wants container classes I don't see the problem in pointing
them in the direction of C++.

jacob navia · Nov 10, 2009

Nick Keighley a écrit :

if someone wants container classes I don't see the problem in pointing
them in the direction of C++.

Nobody wanted "container classes"... In the code I posted there are no classes
at all.

gwowen · Nov 10, 2009

Nobody wanted "container classes"... In the code I posted there are no classes
at all.

You have structures that encapsulate both data and methods to
manipulate that data. Regardless of whether you wish to call them
classes, they're definitely objects.

jacob navia · Nov 10, 2009

gwowen a écrit :

You have structures that encapsulate both data and methods to
manipulate that data. Regardless of whether you wish to call them
classes, they're definitely objects.

If "objects" means object oriented programming the answer is surely NO.

There is no class hierarchy, no inheritance, just objects in the C sense.
Obviously C has objects too. But no object oriented programming.

Nick Keighley · Nov 10, 2009

Nick Keighley a écrit :

I'd not noticed this

Nobody wanted "container classes"... In the code I posted there are no classes
at all.

fair point. C++ containers aren't classes either.

What I /meant/ to say was if someone wants container libraries I don't
see the problem in pointing
them in the direction of C++. You can write nearly-C in C++ and have
containers.

Container library (continued)	69	Mar 16, 2010
How to properly insert a landing page within same container beneath an image element?	0	Oct 7, 2024
Container Interfaces	17	May 9, 2010
Observing a container	2	Dec 10, 2013
Engineering a list container. Part 1.	71	Dec 7, 2013
Container library, second attempt	0	Oct 4, 2009
Fast Container	8	Nov 2, 2011
Container library (continued)	21	Dec 25, 2009

Container library progress

jacob navia

Noob

Ben Pfaff

jacob navia

Ben Pfaff

jacob navia

jacob navia

Moi

Ben Pfaff

Moi

Seebs

Dann Corbit

Ian Collins

jacob navia

Nick Keighley

Nick Keighley

jacob navia

gwowen

jacob navia

Nick Keighley

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads