Suggestions about transparent data structures

S

stdazi

Hello,

I have a bunch of *.c files implementing various data structures
(lists, stacks, trees..). Till now, each time I had to use one of
these data structures I simply copied the relevant files and made the
required modifications.

I've noticed that lot of time is wasted converting the part handling
the specific objects types. (for example, the linked list
implementation holds structs of type `foo' while the working project
requires `bar'). The first thing that came to my mind was to use
something like this :

struct data_type {
element_type foo;
}

where element_type is a typedef one sets at compile time. Also, the
programmer defines a few functions for comparison/output of
element_type objects and there you go!
The problem I see with this solution is that given that program X uses
two linked lists one holding objects of type `foo' and some other
linked list of type `bar' I'd have to use two typedefs and have the
given code handling linked lists duplicated.

I'm therefore seeking for a better solution! Before I reinvent the
wheel I'd like to hear some suggestions from you!

Thanks.
 
U

user923005

Hello,

I have a bunch of *.c files implementing various data structures
(lists, stacks, trees..). Till now, each time I had to use one of
these data structures I simply copied the relevant files and made the
required modifications.

I've noticed that lot of time is wasted converting the part handling
the specific objects types. (for example, the linked list
implementation holds structs of type `foo' while the working project
requires `bar'). The first thing that came to my mind was to use
something like this :

struct data_type {
element_type foo;

}

where element_type is a typedef one sets at compile time. Also, the
programmer defines a few functions for comparison/output of
element_type objects and there you go!
The problem I see with this solution is that given that program X uses
two linked lists one holding objects of type `foo' and some other
linked list of type `bar' I'd have to use two typedefs and have the
given code handling linked lists duplicated.

I'm therefore seeking for a better solution! Before I reinvent the
wheel I'd like to hear some suggestions from you!

In C, this is typically solved by using a void pointer to the object
type, and you supply necessary callback functions for performing
operations on the objects like compare().
 
S

s0suk3

Hello,

I have a bunch of *.c files implementing various data structures
(lists, stacks, trees..). Till now, each time I had to use one of
these data structures I simply copied the relevant files and made the
required modifications.

I've noticed that lot of time is wasted converting the part handling
the specific objects types. (for example, the linked list
implementation holds structs of type `foo' while the working project
requires `bar'). The first thing that came to my mind was to use
something like this :

struct data_type {
element_type foo;

}

where element_type is a typedef one sets at compile time. Also, the
programmer defines a few functions for comparison/output of
element_type objects and there you go!
The problem I see with this solution is that given that program X uses
two linked lists one holding objects of type `foo' and some other
linked list of type `bar' I'd have to use two typedefs and have the
given code handling linked lists duplicated.

I'm therefore seeking for a better solution! Before I reinvent the
wheel I'd like to hear some suggestions from you!

You'll get a lot of suggestions from people recommending void*.
However, beware that you can use void* in two different ways: to hold
the data itself, or to *point* to the data.

The former solution is inferior, because (a) it can only handle
pointer types and (b) the object that the void* points to may be
deallocated (say if the object has automatic storage duration and the
program leaves the scope in which it's declared), and then the pointer
will become invalid.

To demonstrate the latter solution, here's an array ADT that uses it
(sorry if it's too long):

---------------------------------------
Header file
---------------------------------------
#ifndef ARRAY_H_
#define ARRAY_H_

#include <stdbool.h>
#include <stdlib.h>

struct _Array;
typedef struct _Array* Array;

typedef unsigned array_size_type;
typedef int array_index_type;

// The callback used to compare array elements; it should return a
strcmp()-style integer
typedef int (*ArrayElementsComparatorCallback)(const void*, const
void*);

// The callback used to destroy array elements; a function taking a
value of this type
// can be passed null, in which case no destructor will be called on
the elements
typedef void (*ArrayElementsDestructorCallback)(void*);

bool ArrayInit(Array*, array_size_type aElementSize);
bool ArrayInitWithArray(Array*, const Array aWithArray);
bool ArrayInitWithElements(Array*, array_size_type aElementSize, const
void* aElements, array_size_type aNumElements);

void ArrayDestroy(Array, ArrayElementsDestructorCallback
aDestructorCallback);

// @return the contiguous block of memory containing the elements
void* ArrayElements(Array);
const void* ArrayElements_const(const Array);

// @return the number of elements currently in the array
array_size_type ArraySize(const Array);

// @return the size of this array's elements
array_size_type ArrayElementSize(const Array);

inline bool ArrayIsEmpty(const Array array) {
return ArraySize(array) == 0;
}

// Searches for an item in the array and returns its index
// @param aItem the item to search for
// @param aStart the index to start from
// @param aComparatorCallback the callback used to compare elements
// @return the zero-based index of the element,
or -1 if not found
array_index_type ArrayIndexOf(const Array array, const void* aItem,
array_index_type aStart, ArrayElementsComparatorCallback
aComparatorCallback);

inline bool ArrayContains(const Array array, const void* aItem,
ArrayElementsComparatorCallback aComparatorCallback) {
return ArrayIndexOf(array, aItem, 0, aComparatorCallback) != -1;
}

// Replaces a range of elements with another array
// @param aStart the starting index of the elements to
replace
// @param aCount the number of elements to replace
// @param aItems the items to copy into the range
// @param aNumItems the number of elements in aItems
// @param aDestructorCallback a callback used to destroy the
replaced elements
// @return a pointer to the new elements
void* ArrayReplaceElementsAt(Array, array_index_type aStart,
array_size_type aCount, const void* aItems, array_size_type aNumItems,
ArrayElementsDestructorCallback aDestructorCallback);

// A variation on ArrayReplaceElementsAt
inline void* ArrayReplaceElementAt(Array array, array_index_type
aIndex, const void* aItem, ArrayElementsDestructorCallback
aDestructorCallback) {
return ArrayReplaceElementsAt(array, aIndex, 1, aItem, 1,
aDestructorCallback);
}

// Inserts an array at a specified position
// @param aIndex the index to insert the array at
// @param aItems the array to insert
// @param aNumItems the number of elements in aItems
// @return a pointer to the inserted elements
inline void* ArrayInsertElementsAt(Array array, array_index_type
aIndex, const void* aItems, array_size_type aNumItems) {
return ArrayReplaceElementsAt(array, aIndex, 0, aItems, aNumItems,
0);
}

// A variation on ArrayInsertElementsAt
inline void* ArrayInsertElementAt(Array array, array_index_type
aIndex, const void* aItem) {
return ArrayInsertElementsAt(array, aIndex, aItem, 1);
}

// Appends elements to the end of this array
// @param aItems the sequence of elements to append
// @param aNumItems the number of elements to append
// @return a pointer to the appended elements in the
array
void* ArrayAppendElements(Array, const void* aItems, array_size_type
aNumItems);

// A variation on ArrayAppendElements
inline void* ArrayAppendElement(Array array, const void* aItem) {
return ArrayAppendElements(array, aItem, 1);
}

// Removes a range of elements
// @param aStart the startig index of the elements to remove
// @param aCount the number of elements to remove
void ArrayRemoveElementsAt(Array, array_index_type aStart,
array_size_type aCount, ArrayElementsDestructorCallback
aDestructorCallback);

// A variation on ArrayRemoveElementsAt
inline void ArrayRemoveElementAt(Array array, array_index_type aIndex,
ArrayElementsDestructorCallback aDestructorCallback) {
ArrayRemoveElementsAt(array, aIndex, 1, aDestructorCallback);
}

// Searches for an element and removes it
// @param aItem the item to search for
// @param aComparatorCallback the callback used to compare elements
// @param aDestructorCallback the callback used to destroy the
element
// @return true if the element was found (and
therefore removed)
bool ArrayRemoveElement(Array array, const void* aItem,
ArrayElementsComparatorCallback aComparatorCallback,
ArrayElementsDestructorCallback aDestructorCallback);

// Removes all elements in the array
inline void ArrayClear(Array array, ArrayElementsDestructorCallback
aDestructorCallback) {
ArrayRemoveElementsAt(array, 0, ArraySize(array),
aDestructorCallback);
}

// Direct element access. These do NOT bounds-check
void* ArrayAt(Array, array_index_type aIndex);
const void* ArrayAt_const(const Array, array_index_type aIndex);

inline void ArraySort(Array array, ArrayElementsComparatorCallback
aComparatorCallback) {
qsort(ArrayElements(array), ArraySize(array), ArrayElementSize
(array), aComparatorCallback);
}

#endif /* ARRAY_H_ */

------------------------------------------
Source file
------------------------------------------
#include "Array.h"
#include <stdlib.h>
#include <stdbool.h>
#include <string.h>

struct _Array
{
array_size_type capacity;
array_size_type size;
array_size_type elementSize;
void* contents;
};

#define kArrayIncrementSize 64

bool ArrayInit(Array* array,
array_size_type aElementSize)
{
return ArrayInitWithElements(array, aElementSize, 0, 0);
}

bool ArrayInitWithArray(Array* array,
const Array aWithArray)
{
return ArrayInitWithElements(array, aWithArray->elementSize,
aWithArray->contents, aWithArray->size);
}

bool ArrayInitWithElements(Array* array,
array_size_type aElementSize,
const void* aElements,
array_size_type aNumElements)
{
*array = (Array) malloc(sizeof(struct _Array));
if (!*array)
return false;

(*array)->contents = malloc(aElementSize * (kArrayIncrementSize >=
aNumElements ? kArrayIncrementSize : aNumElements));
if (!(*array)->contents) {
free(*array);
return false;
}

if (aNumElements > 0)
memcpy((*array)->contents, aElements, aElementSize *
aNumElements);

(*array)->capacity = kArrayIncrementSize >= aNumElements ?
kArrayIncrementSize : aNumElements;
(*array)->size = aNumElements;
(*array)->elementSize = aElementSize;
return true;
}

void ArrayDestroy(Array array,
ArrayElementsDestructorCallback aDestructorCallback)
{
ArrayClear(array, aDestructorCallback);
free(ArrayElements(array));
free(array);
}

//
// Helpers
//

// Increases the size of the array if necessary
// @param aCapacity the required minimum capacity
// @return true if the array was already large enough or
if the
// reallocation succeeded; false on OOM error
static bool ArrayEnsureCapacity(Array array,
array_size_type aCapacity)
{
void* newContents;
array_size_type incrementSize;

if (array->capacity >= aCapacity)
return true;

incrementSize = aCapacity - array->capacity;
if (incrementSize < kArrayIncrementSize)
incrementSize = kArrayIncrementSize;

newContents = realloc(array->contents, array->elementSize * (array-
capacity + incrementSize));
if (!newContents)
return false;
array->contents = newContents;
array->capacity += incrementSize;
return true;
}

// Destroys a range of elements by calling a destructor callback on
them
static inline void ArrayDestructRange(Array array,
array_index_type aStart,
array_size_type aCount,
ArrayElementsDestructorCallback
aDestructorCallback)
{
if (aDestructorCallback) {
array_index_type i;
for (i = aStart; i < aCount; ++i)
aDestructorCallback((char*) ArrayElements(array) + i *
ArrayElementSize(array));
}
}

// Shifts a sequence of elements a number of positions forward or
backward
static inline void ArrayShiftRange(Array array,
array_index_type aOldStart,
array_index_type aNewStart,
array_size_type aCount)
{
if (aOldStart != aNewStart && aCount > 0)
memmove((char*) ArrayElements(array) + aNewStart *
ArrayElementSize(array),
(char*) ArrayElements(array) + aOldStart *
ArrayElementSize(array),
aCount * ArrayElementSize(array));
}

static inline void ArrayAssignRange(Array array,
array_index_type aStart,
array_size_type aCount,
const void* aItems)
{
memcpy((char*) ArrayElements(array) + aStart * ArrayElementSize
(array),
aItems, aCount * ArrayElementSize(array));
}

//
// Array functions
//

void* ArrayElements(Array array)
{
return array->contents;
}

const void* ArrayElements_const(const Array array)
{
return array->contents;
}

array_size_type ArraySize(const Array array)
{
return array->size;
}

array_size_type ArrayElementSize(const Array array)
{
return array->elementSize;
}

array_index_type ArrayIndexOf(const Array array,
const void* aItem,
array_index_type aStart,
ArrayElementsComparatorCallback
aComparatorCallback)
{
array_index_type i;
if (aStart < 0 || aStart >= ArraySize(array))
return -1;
for (i = aStart; i < ArraySize(array); ++i)
if (aComparatorCallback((char*) ArrayElements_const(array) + i
* ArrayElementSize(array), aItem) == 0)
return i;
return -1;
}

void* ArrayReplaceElementsAt(Array array,
array_index_type aStart,
array_size_type aCount,
const void* aItems,
array_size_type aNumItems,
ArrayElementsDestructorCallback
aDestructorCallback)
{
if (aStart < 0 || aStart >= ArraySize(array) || aStart + aCount >
ArraySize(array))
return 0;
if (!ArrayEnsureCapacity(array, ArraySize(array) - aCount +
aNumItems))
return 0;
ArrayDestructRange(array, aStart, aCount, aDestructorCallback);
ArrayShiftRange(array, aStart + aCount, aStart + aNumItems,
ArraySize(array) - (aStart + aCount));
ArrayAssignRange(array, aStart, aNumItems, aItems);
array->size = ArraySize(array) - aCount + aNumItems;
return ArrayAt(array, aStart);
}

void* ArrayAppendElements(Array array,
const void* aItems,
array_size_type aNumItems)
{
if (!ArrayEnsureCapacity(array, ArraySize(array) + aNumItems))
return 0;
ArrayAssignRange(array, ArraySize(array), aNumItems, aItems);
array->size += aNumItems;
return ArrayAt(array, ArraySize(array));
}

void ArrayRemoveElementsAt(Array array,
array_index_type aStart,
array_size_type aCount,
ArrayElementsDestructorCallback
aDestructorCallback)
{
if (aStart < 0 || aStart >= ArraySize(array) || aStart + aCount >
ArraySize(array))
return;
ArrayDestructRange(array, aStart, aCount, aDestructorCallback);
ArrayShiftRange(array, aStart + aCount, aStart, ArraySize(array) -
(aStart + aCount));
array->size -= aCount;
}

bool ArrayRemoveElement(Array array,
const void* aItem,
ArrayElementsComparatorCallback
aComparatorCallback,
ArrayElementsDestructorCallback
aDestructorCallback)
{
array_index_type i = ArrayIndexOf(array, aItem, 0,
aComparatorCallback);
if (i == -1)
return false;
ArrayRemoveElementAt(array, i, aDestructorCallback);
return true;
}

void* ArrayAt(Array array,
array_index_type aIndex)
{
return (char*) ArrayElements(array) + aIndex * array->elementSize;
}

const void* ArrayAt_const(const Array array,
array_index_type aIndex)
{
return (char*) ArrayElements_const(array) + aIndex * array-
elementSize;
}
 
R

robertwessel2

Hello,

I have a bunch of *.c files implementing various data structures
(lists, stacks, trees..). Till now, each time I had to use one of
these data structures I simply copied the relevant files and made the
required modifications.

I've noticed that lot of time is wasted converting the part handling
the specific objects types. (for example, the linked list
implementation holds structs of type `foo' while the working project
requires `bar'). The first thing that came to my mind was to use
something like this :

struct data_type {
element_type foo;

}

where element_type is a typedef one sets at compile time. Also, the
programmer defines a few functions for comparison/output of
element_type objects and there you go!
The problem I see with this solution is that given that program X uses
two linked lists one holding objects of type `foo' and some other
linked list of type `bar' I'd have to use two typedefs and have the
given code handling linked lists duplicated.

I'm therefore seeking for a better solution! Before I reinvent the
wheel I'd like to hear some suggestions from you!


Heretical answer: switch to C++, and use inheritance or templates.

In C, you might consider making the container objects use a void
pointer as was suggested, although that brings considerable overhead
if the objects you're storing are small.

Alternatively you can fake some C++-like template specialization in a
couple of different ways with macros.

First, you can put your routines in a header file, and #define a
couple of parameters to generate specialized versions of your code.
So each inclusion looks like:

#define LIST_TYPE struct mystruct
#include <list.h>
#undef LIST_TYPE
#define LIST_TYPE struct myotherstruct
#include <list.h>

list.h then plays various macro games to generate specialized versions
of the code. For example it might generate routines named
"ListInsert_mystruct" in the first case and "ListInsert_myotherstruct"
in the other.

Alternatively, you can put the whole kit and caboodle in a large macro
(the routines, structures, everything), and then just invoke the macro
with the type as a parameter, so you end up with:

#include <list.h>
typedef struct mystruct_tag mystruct
GENERATE_LIST(mystruct)
typedef struct mystructother_tag myotherstruct
GENERATE_LIST(myotherstruct)

Again, the macro would similarly generate type-specific names.
 
D

David Resnick

Hello,

I have a bunch of *.c files implementing various data structures
(lists, stacks, trees..). Till now, each time I had to use one of
these data structures I simply copied the relevant files and made the
required modifications.

I've noticed that lot of time is wasted converting the part handling
the specific objects types. (for example, the linked list
implementation holds structs of type `foo' while the working project
requires `bar'). The first thing that came to my mind was to use
something like this :

struct data_type {
element_type foo;

}

where element_type is a typedef one sets at compile time. Also, the
programmer defines a few functions for comparison/output of
element_type objects and there you go!
The problem I see with this solution is that given that program X uses
two linked lists one holding objects of type `foo' and some other
linked list of type `bar' I'd have to use two typedefs and have the
given code handling linked lists duplicated.

I'm therefore seeking for a better solution! Before I reinvent the
wheel I'd like to hear some suggestions from you!

C++ makes this kind of stuff much easier. But I guess that is not an
option for you?

How about having a common initial structure or struct field(s)
followed by the data? You probably need to do some casting, but I
think this might work for you:

e.g. (unchecked)

struct linkedList
{
struct linkedList *next;
}

struct linkedListWithData
{
struct linkedList *next;
datatype data;
}

struct linkedListWithOtherData
{
struct linkedList *next;
otherdatatype data;
}

For the purpose of list manipulation, you could pass any of these to
the list manipulation functions, casting types with data to type
struct linkedList * which is what those functions would take. For
code that needs to manipulate the data part of course you need to know
the details of the data, but that doesn't sound like a problem? Far
as I can see this will work. I don't remember if this is blessed by
the standard or not though.

-David
 
G

Guest

Heretical answer: switch to C++, and use inheritance or templates.

inheritance? [I know what inheritance is, I was just questioning
if it was a good way to implement containers in C++]
 
S

s0suk3

(e-mail address removed) said:
On Feb 12, 1:28 pm, "(e-mail address removed)" <[email protected]> wrote:

[question about containers]
You'll get a lot of suggestions from people recommending void*.

And rightly so.
However, beware that you can use void* in two different ways: to
hold the data itself, or to *point* to the data.

Huh? Please explain how a void * can hold any data other than a
pointer value!

In other words, the container will maintain a block of memory with
elements of a specified size, and it simply uses the void* to point to
it. That's why you need to let it know the element size at initiation:

int n = 42;
Array a;
ArrayInit(&a, sizeof(int)); // array of int
ArrayAppendElement(a, &n);

It doesn't matter if 'n' goes out of scope, because the append
function stored sizeof(int) bytes starting at &n, rather than just &n.
You lost me. What's wrong with just having the container point at
the data-to-be-contained? The object can't be deallocated without
the programmer deciding to deallocate it, at which point they have
a decision to make about the container's pointer (typically they
will want to remove that node/element/member from the container).

If it has automatic storage, it will be deallocated at scope exit,
regardless of the programmer's wishes. Sometimes you want the object
to be preserved between function calls.
No doubt the folks with C99 compilers will be only too willing to
check it out for you.

#define inline static
#define bool int
#define true 1
#define false 0

Sebastian
 
W

WANG Cong

Hello,

I have a bunch of *.c files implementing various data structures
(lists, stacks, trees..). Till now, each time I had to use one of
these data structures I simply copied the relevant files and made the
required modifications.

I've noticed that lot of time is wasted converting the part handling
the specific objects types. (for example, the linked list
implementation holds structs of type `foo' while the working project
requires `bar'). The first thing that came to my mind was to use
something like this :

hello,

I think you should at least take a look at the link list in linux
kernel source code, see include/linux/list.h.

The core definition is:

struct list_head {
struct list_head *next, *prev;
};

If some struct wants to use it, say struct foo, use it like this:

struct foo {
struct list_head head;
.... // other data
};

So the main problem here is how to get foo when you have foo->head,
for example, you want to do a traverse, what you get to literate is
foo->head, not 'foo', so you have to extract 'foo' with 'foo->head'.
Linux's solution is to use container_of:

#define container_of(ptr, type, member) ({ \
const typeof( ((type *)0)->member ) *__mptr = (ptr); \
(type *)( (char *)__mptr - offsetof(type,member) );})

So you can extract the pointer to struct foo like this:

container_of(ptr, struct foo, head);

where ptr is a pointer to struct list_head of a struct foo.

Note that 'typeof' is an extension of gcc, it's not standard C, but
i think you can remove it. :)
 
R

robertwessel2

Heretical answer: switch to C++, and use inheritance or templates.

inheritance? [I know what inheritance is, I was just questioning
if it was a good way to implement containers in C++]


Templates would probably be better in most cases. Inheriting the
management functions from a base class works, but leaves you with some
annoying type conversions, or a set of (trivial) type conversion
wrapper functions that need to be defined in the derived class, or you
need a (virtual) accessor function to access the contained object.
 
C

CBFalconer

.... snip ...
Heretical answer: switch to C++, and use inheritance or templates.

inheritance? [I know what inheritance is, I was just questioning
if it was a good way to implement containers in C++]

This is comp.lang.c. For that question you want comp.lang.c++.
The question is off-topic here.
 
G

gw7rib

Heretical answer: switch to C++, and use inheritance or templates.

inheritance? [I know what inheritance is, I was just questioning
if it was a good way to implement containers in C++]

It can be done. It's the way I do it. While this does not of course
prove that it's a good way, it does have the major advantage that I
don't have to learn about templates. It also avoids "template bloat",
and ties in better with the C philosophy that there's little hidden
overhead, the code you get out looks like the code you write. (I
presume this was still part of the C philosophy, it clearly isn't part
of the C++ philosphy but is stated by the author to be an aim of BCPL,
one of C's predecessors).
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,997
Messages
2,570,239
Members
46,827
Latest member
DMUK_Beginner

Latest Threads

Top