Container library progress

J

jacob navia

The bitstring module is finished (modulo tests). It now compiles cleanly on
linux and the few tests already in place give the same results.

This finished I will start implementing the first of the associative containers (hashtable)
for implementing later AVL trees and probably fibonacci heaps...

There are MANY papers bout associative containers, and many algorithms to choose
from.

What bothers me is that many algorithms look much better in paper, but in practice they
do not perform very well because of cache interactions:

The more overhead a container has, the less data that will be stored in the cache.
The less data stored in the cache, the more that the algorithm will "swap", i.e.
will wait for main memory to be ready. Since at 3GHZ the processor can do up to 40-50
clocks just waiting for main memory, all the cleverness of the algorithms is totally
wasted unless you are handling HUGE data sets in the order of several million
members.

The idea would be to implement simple algorithms first, then to implement more sophisticated ones
but maintaining the same interface.

The interface will be similar to the C++ STL to easy the conversions between
this library and C++.

At the same time I have developed a primitive template mechanism in standard
C, and tested it with the "vector" container. It is a very simple program that
expand a suitable written file containing keywords introduced by $@ ... @$
and produces a standard C file for a vector container specialized to containers of
the given type.

All this obviates the need for more complicated language constructs.

jacob
 
N

Noob

jacob said:
The bitstring module is finished (modulo tests). It now compiles cleanly on
linux and the few tests already in place give the same results.

This finished I will start implementing the first of the associative
containers (hashtable)
for implementing later AVL trees and probably fibonacci heaps...

clc is now your personal blogging space?
There are MANY papers bout associative containers, and many algorithms
to choose from.

Should you ever wish to discuss them, comp.programming is there for you.
What bothers me is that many algorithms look much better in paper, but
in practice they
do not perform very well because of cache interactions:

The more overhead a container has, the less data that will be stored in
the cache.
The less data stored in the cache, the more that the algorithm will
"swap", i.e.
will wait for main memory to be ready. Since at 3GHZ the processor can
do up to 40-50
clocks just waiting for main memory, all the cleverness of the
algorithms is totally
wasted unless you are handling HUGE data sets in the order of several
million
members.

No. You're misusing the terminology. Again.

Getting data from RAM is not considered swapping.
Swapping involves copying data between secondary storage and RAM.

http://en.wikipedia.org/wiki/Paging

Sheesh.
 
B

Ben Pfaff

jacob navia said:
What bothers me is that many algorithms look much better in
paper, but in practice they do not perform very well because of
cache interactions:

The more overhead a container has, the less data that will be
stored in the cache.

It seems that you've decided that you want at least one balanced
binary tree implementation. Have you considered a scapegoat
tree? They have no per-node overhead at all, and only a small
overhead per-tree.

More information:
http://en.wikipedia.org/wiki/Scapegoat_tree

My implementation:
http://git.savannah.gnu.org/cgit/pspp.git/tree/src/libpspp/bt.h
http://git.savannah.gnu.org/cgit/pspp.git/tree/src/libpspp/bt.c
http://git.savannah.gnu.org/cgit/pspp.git/tree/tests/libpspp/bt-test.c
 
J

jacob navia

Ben Pfaff a écrit :
It seems that you've decided that you want at least one balanced
binary tree implementation. Have you considered a scapegoat
tree? They have no per-node overhead at all, and only a small
overhead per-tree.

More information:
http://en.wikipedia.org/wiki/Scapegoat_tree

My implementation:
http://git.savannah.gnu.org/cgit/pspp.git/tree/src/libpspp/bt.h
http://git.savannah.gnu.org/cgit/pspp.git/tree/src/libpspp/bt.c
http://git.savannah.gnu.org/cgit/pspp.git/tree/tests/libpspp/bt-test.c

Hi

That looks extremely interesting, and your implementation is quite OK. The only problem I see
is that I do not want to have any copyright problems so that people can do ANYTHING with the code.
In that way it will be widely used. Is there a way to avoid the GNU license?

The problem with GNU source code is that if you use it, you have to put your software
under GNU also. I want that the library is available for commercial uses ALSO, i.e.
absolutely no copyright.

Could that be possible?

Thanks
 
B

Ben Pfaff

jacob navia said:
That looks extremely interesting, and your implementation is quite
OK. The only problem I see is that I do not want to have any copyright
problems so that people can do ANYTHING with the code. In that way it
will be widely used. Is there a way to avoid the GNU license?

The problem with GNU source code is that if you use it, you have to
put your software under GNU also. I want that the library is available
for commercial uses ALSO, i.e. absolutely no copyright.

The copyright in that code is assigned to FSF, so I don't really
have control over it.

It's simple code; I was just pointing to it to show that it was
simple. I guess that you are able to implement whatever tree
structure you like.
 
J

jacob navia

Ben Pfaff a écrit :
The copyright in that code is assigned to FSF, so I don't really
have control over it.

It's simple code; I was just pointing to it to show that it was
simple. I guess that you are able to implement whatever tree
structure you like.

OK. Thanks for your help, I did not know at all about scapegoat trees!

According to the wiki they should be much better than the AVL or
the others...

Specially this part seems interesting:

Unlike most other self-balancing search trees, scapegoat trees are entirely flexible as to their
balancing. They support any α such that 0.5 <= α < 1. A high α value results in fewer balances,
making insertion quicker but lookups and deletions slower, and vice versa for a low α. Therefore in
practical applications, an α can be chosen depending on how frequently these actions should be
performed.

This means that the user could change the parameter when creating the tree according
to the usage he will do of it, quite nice!

Thanks again.

jacob
 
J

jacob navia

Noob a écrit :
clc is now your personal blogging space?


Should you ever wish to discuss them, comp.programming is there for you.

The thread "C the complete Meta-Nonsense" contains (until now) 81 messages,
all of them completely off topic. You did not complain about it.

The thread "How to convert Infix notation to postfix notation" has degenerated into
a flame war about everything but C. It has 451 messages. You did not complain.

When I report about the progress done in a project to make a public library
of containers in C, a project that I have repeatedly written about in this group

THEN

you complain.
No. You're misusing the terminology. Again.

Getting data from RAM is not considered swapping.
Swapping involves copying data between secondary storage and RAM.

http://en.wikipedia.org/wiki/Paging

You did notice the QUOTES that I put around the "swap" verb?

I was applying one concept to a slightly different environment and
warning the reader with the quotes that the word can't be understood
literally. You choose to make an issue of that.

Why?

Because you have nothing to say, nothing substantive anyway.


Yes.
 
S

Seebs

The thread "C the complete Meta-Nonsense" contains (until now) 81 messages,
all of them completely off topic. You did not complain about it.

They're not *completely* off-topic, though. Discussion of C books is topical,
so's discussion of reviews of C books.

There's certainly a fair amount of discussion of C going on.

-s
 
I

Ian Collins

jacob said:
The idea would be to implement simple algorithms first, then to
implement more sophisticated ones
but maintaining the same interface.

The interface will be similar to the C++ STL to easy the conversions
between
this library and C++.

At the same time I have developed a primitive template mechanism in
standard
C, and tested it with the "vector" container. It is a very simple
program that
expand a suitable written file containing keywords introduced by $@ ... @$
and produces a standard C file for a vector container specialized to
containers of
the given type.

All this obviates the need for more complicated language constructs.

Isn't this a bit of a quixotic exercise? Why not just use C++?

If you have to use a platform without a C++ compiler, give the nice
people at www.comeaucomputing.com a call.
 
J

jacob navia

Ian Collins a écrit :
Isn't this a bit of a quixotic exercise? Why not just use C++?

If you have to use a platform without a C++ compiler, give the nice
people at www.comeaucomputing.com a call.

As always, the messages from Mr Collins promote C++. His only objective
in posting in this group is to preach how great C++ is.

Granted, here anybody can say anything, Mr Collins included. But it would
be better if he stopped bothering to write in a C group if he thinks C++
is so great. There are C++ groups where he can write.
 
N

Nick Keighley

They're not *completely* off-topic, though.  Discussion of C books is topical,
so's discussion of reviews of C books.

There's certainly a fair amount of discussion of C going on.

as someone who's participated, I think you're fooling yourself
 
N

Nick Keighley

Ian Collins a écrit :





As always, the messages from Mr Collins promote C++.  His only objective
in posting in this group is to preach how great C++ is.

Granted, here anybody can say anything, Mr Collins included. But it would
be better if he stopped bothering to write in a C group if he thinks C++
is so great. There are C++ groups where he can write.

if someone wants container classes I don't see the problem in pointing
them in the direction of C++.
 
J

jacob navia

Nick Keighley a écrit :
if someone wants container classes I don't see the problem in pointing
them in the direction of C++.

Nobody wanted "container classes"... In the code I posted there are no classes
at all.
 
G

gwowen

Nobody wanted "container classes"... In the code I posted there are no classes
at all.

You have structures that encapsulate both data and methods to
manipulate that data. Regardless of whether you wish to call them
classes, they're definitely objects.
 
J

jacob navia

gwowen a écrit :
You have structures that encapsulate both data and methods to
manipulate that data. Regardless of whether you wish to call them
classes, they're definitely objects.

If "objects" means object oriented programming the answer is surely NO.

There is no class hierarchy, no inheritance, just objects in the C sense.
Obviously C has objects too. But no object oriented programming.
 
N

Nick Keighley

Nick Keighley a écrit :

I'd not noticed this
Nobody wanted "container classes"... In the code I posted there are no classes
at all.

fair point. C++ containers aren't classes either.

What I /meant/ to say was if someone wants container libraries I don't
see the problem in pointing
them in the direction of C++. You can write nearly-C in C++ and have
containers.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,990
Messages
2,570,211
Members
46,796
Latest member
SteveBreed

Latest Threads

Top