A string collection abstract data type

J

jacob navia

Abstract:

Continuing the discussion about abstract data types, in this
discussion group, a string collection data type is presented,
patterned after the collection in C# and similar languages (Java).
It stores character strings, and resizes itself to accommodate
new strings when needed.

Interface:
----------

The data structure uses a table of function pointers to provide an
extensible basis for many functions without cluttering the
user's workspace with too many names.

This architecture is not only extensible at the API level, but
it allows "subclassing". If the user is unsatisfied with some
of the functions of the API, he/she can:

o Replace one or more of the function pointers in the table
with a function of his/her own.

o Store somewhere the old function pointer, and replace it with
a function of his own that after doing some work calls the
original function pointer, optionally doing some work after the
original function returns.

This type of extensibility can only be achieved with function pointers.
The basic problem with many of the proposals presented here is that
they present too much names that can conflict with user names.

This hiding of names is useful in another way, since it allows the API
to use completely generic names like "Add" or similar without any
ambiguity.

Since those generic names can be used in *other* abstract data types
that use the same type of structure, the user code can be made truly
general, and it is easy to change from a string collection to a list
without too much trouble.

The objective would be to make a standard way of naming things within
a container, so that user code is shielded from change when passing from
one container to another.

The name of the table of functions member is "lpVtbl" following
an old function naming convention under the windows OS and in
the COM system. It can be changed of course.



The interface is described in the header file <strcollection.h>
---------------------------------------------------------------------
#include <string.h>
// Forward declaration of the string collection type
typedef struct _StringCollection StringCollection;
typedef struct {
// Returns the number of elements stored
int (*GetCount)(StringCollection *SC);

// Is this collection read only?
int (*IsReadOnly)(StringCollection *SC);

// Sets this collection read-only or unsets the read-only flag
int (*SetReadOnly)(StringCollection *SC,int flag);

// Adds one element at the end. Given string is copied
int (*Add)(StringCollection *SC,char *newval);

// Adds a NULL terminated table of strings
int (*AddRange)(StringCollection *SC,char **newvalues);

// Clears all data and frees the memory
int (*Clear)(StringCollection *SC);

//Case sensitive search of a character string in the data
int (*Contains)(StringCollection *SC,char *str);

// Copies all strings into a NULL terminated vector
char **(*CopyTo)(StringCollection *SC);

//Returns the index of the given string or -1 if not found
int (*IndexOf)(StringCollection *SC,char *SearchedString);

// Inserts a string at the position zero.
int (*Insert)(StringCollection *SC,char *);

// Inserts a string at the given position
int (*InsertAt)(StringCollection *SC,int idx,char *newval);

// Returns the string at the given position
char *(*IndexAt)(StringCollection *SC,int idx);

// Removes the given string if found
int (*Remove)(StringCollection *SC,char *);

//Removes the string at the indicated position
int (*RemoveAt)(StringCollection *SC,int idx);

// Frees the memory used by the collection
int (*Finalize)(StringCollection *SC);

// Returns the current capacity of the collection
int (*GetCapacity)(StringCollection *SC);

// Sets the capacity if there are no items in the collection
int (*SetCapacity)(StringCollection *SC,int newCapacity);

// Calls the given function for all strings.
// "Arg" is a user supplied argument (that can be NULL)
// which is passed to the function to call
void (*Apply)(StringCollection *SC,int (*Applyfn)(char *,void *
arg),void *arg);

// Calls the given function for each string and saves
// all results in an integer vector
int *(*Map)(StringCollection *SC,int (*Applyfn)(char *));

// Pushes a string, using the collection as a stack
int (*Push)(StringCollection *SC,char *str);

// Pops the last string off the collection
char * (*Pop)(StringCollection *SC);

// Replaces the character string at the given position
// with a new one
char *(*ReplaceAt)(StringCollection *SC,int idx,char *newval);

} StringCollectionFunctions;

// Definition of the String Collection type
struct _StringCollection {
StringCollectionFunctions *lpVtbl; // The table of functions
size_t count; /* in element size units */
char **contents; /* The contents of the collection */
size_t capacity; /* in element_size units */
unsigned int flags; // Read-only or other flags
};


// This is the only exported function from this module
StringCollection * newStringCollection(int startsize);

------------------------------------------end of stringcollection.h

Implementation
--------------

I haven't had the time to comment this more in depth. I hope that
this will be done in the ensuing discussion.

------------------------------------------stringcollection.c
#include "strcollection.h"
// Forward definitions
static int GetCount( struct _StringCollection *SC);
static int IsReadOnly( struct _StringCollection *SC);
static int SetReadOnly( struct _StringCollection *SC,int newval);
static int Add( struct _StringCollection *SC,char *newval);
static int AddRange( struct _StringCollection *SC,char **newvalues);
static int Clear( struct _StringCollection *SC);
static int Contains( struct _StringCollection *SC,char *str);
static char **CopyTo( struct _StringCollection *SC);
static int IndexOf( struct _StringCollection *SC,char *SearchedString);
static int Insert( struct _StringCollection *SC,char *);
static int InsertAt( struct _StringCollection *SC,int idx,char *newval);
static char *IndexAt( struct _StringCollection *SC,int idx);
static int Remove( struct _StringCollection *SC,char *);
static int RemoveAt( struct _StringCollection *SC,int idx);
static int Finalize( struct _StringCollection *SC);
static int GetCapacity( struct _StringCollection *SC);
static int SetCapacity( struct _StringCollection *SC,int newCapacity);
static void Apply(struct _StringCollection *SC,int(*Applyfn)(char *,void
*),void *arg);
static int *Map(struct _StringCollection *SC,int (*Applyfn)(char *));
static int Push(struct _StringCollection *SC,char *str);
static char *Pop(struct _StringCollection *SC);
static char *ReplaceAt(struct _StringCollection *SC,int idx,char *newval);
static StringCollectionFunctions lpVtableSC = {
GetCount, IsReadOnly, SetReadOnly, Add, AddRange,
Clear, Contains, CopyTo, IndexOf, Insert,
InsertAt, IndexAt, Remove, RemoveAt,
Finalize, GetCapacity, SetCapacity, Apply,
Map, Push, Pop, ReplaceAt,
};

static char *DuplicateString(char *str)
{
char *result;
if (str == NULL)
return NULL;
result = MALLOC(strlen(str)+1);
if (result == NULL)
return NULL;
strcpy(result,str);
return result;
}

#define SC_READONLY 1
#define CHUNKSIZE 20

StringCollection * newStringCollection(int startsize)
{
StringCollection *result = MALLOC(sizeof(StringCollection));
if (result == NULL)
return NULL;
result->count = 0;
if (startsize == 0)
startsize = DEFAULT_START_SIZE;
result->contents = MALLOC(startsize*sizeof(char *));
if (result->contents == NULL) {
FREE(result);
return NULL;
}
memset(result->contents,0,sizeof(char *)*startsize);
result->capacity = startsize;
result->count = 0;
result->flags = 0;
result->lpVtbl = &lpVtableSC;
return result;
}

static int GetCount(struct _StringCollection *SC)
{
return SC->count;
}
static int IsReadOnly(struct _StringCollection *SC)
{
return SC->flags * SC_READONLY ? 1 : 0;
}
static int SetReadOnly(struct _StringCollection *SC,int newval)
{
int oldval = SC->flags * SC_READONLY ? 1 : 0;
if (newval)
SC->flags |= SC_READONLY;
else
SC->flags &= ~SC_READONLY;
return oldval;
}

static int Resize(struct _StringCollection *SC)
{
int newcapacity = SC->capacity + CHUNKSIZE;
char **oldcontents = SC->contents;
SC->contents = MALLOC(newcapacity*sizeof(char *));
if (SC->contents == NULL) {
SC->contents = oldcontents;
return 0;
}
memset(SC->contents,0,sizeof(char *)*newcapacity);
memcpy(SC->contents,oldcontents,SC->count*sizeof(char *));
SC->capacity = newcapacity;
return 1;
}

static int Add(struct _StringCollection *SC,char *newval)
{
if (SC->flags & SC_READONLY)
return -1;
if (SC->count >= SC->capacity) {
if (!Resize(SC))
return 0;
}

if (newval) {
SC->contents[SC->count] = DuplicateString(newval);
if (SC->contents[SC->count] == NULL) {
return 0;
}
}
else
SC->contents[SC->count] = NULL;
SC->count++;
return SC->count;
}

static int AddRange(struct _StringCollection * SC,char **data)
{
int i = 0;
if (SC->flags & SC_READONLY)
return 0;
while (data != NULL) {
int r = Add(SC,data);
if (r <= 0)
return r;
i++;
}
return SC->count;
}

static int Clear(struct _StringCollection * SC)
{
int oldval = SC->count,i;
if (SC->flags & SC_READONLY)
return 0;
for (i=0; i<SC->count;i++) {
FREE(SC->contents);
SC->contents = NULL;
}
SC->count = 0;
return oldval;
}

static int Contains(struct _StringCollection * SC,char *str)
{
int c,i;
if (str == NULL)
return 0;
c = *str;
for (i=0; i<SC->count;i++) {
if (c == SC->contents[0] && !strcmp(SC->contents,str))
return 1;
}
return 0;
}

static char **CopyTo(struct _StringCollection *SC)
{
char **result = MALLOC((1+SC->count)*sizeof(char *));
int i;
if (result == NULL)
return NULL;
for (i=0; i<SC->count;i++) {
result = DuplicateString(SC->contents);
}
result = NULL;
return result;
}

#ifdef __LCC__
/* The lcc-win compiler allows operator overloading, and allows using
this abstract data type with the more natural [ ] operators */
char * __declspec(naked) operator[](StringCollection *SC, int idx)
{
}
#endif
static int IndexOf(struct _StringCollection *SC,char *str)
{
int i;

for (i=0; i<SC->count;i++) {
if (!strcmp(SC->contents,str)) {
return i;
}
}
return -1;
}


static char *IndexAt(struct _StringCollection *SC,int idx)
{
if (idx >=SC->count || idx < 0)
return NULL;
return SC->contents[idx];
}

static int InsertAt(struct _StringCollection *SC,int idx,char *newval)
{
if (SC->flags & SC_READONLY)
return 0;
if (SC->count >= (SC->capacity-1)) {
if (!Resize(SC))
return 0;
}
if (idx < 0 || idx > SC->count)
return -1;
if (idx == 0) {
if (SC->count > 0)
memmove(SC->contents+1,SC->contents,SC->count*sizeof(char *));
SC->contents[0] = newval;
}
else if (idx == SC->count) {
SC->contents[idx] = newval;
}
else if (idx < SC->count) {

memmove(SC->contents+idx+1,SC->contents+idx,(SC->count-idx+1)*sizeof(char
*));
SC->contents[idx] = newval;
}
SC->count++;
return SC->count;
}

static int Insert(struct _StringCollection *SC,char *newval)
{
if (SC->flags & SC_READONLY)
return 0;
return InsertAt(SC,0,newval);
}

static int RemoveAt(struct _StringCollection *SC,int idx)
{
if (idx >= SC->count || idx < 0)
return -1;
if (SC->count == 0)
return -2;
FREE(SC->contents[idx]);
memmove(SC->contents+idx,SC->contents+idx+1,(SC->count-idx)*sizeof(char
*));
SC->contents[SC->count-1]=NULL;
SC->count--;
return SC->count;
}

static int Remove(struct _StringCollection *SC,char *str)
{
int idx = IndexOf(SC,str);
if (idx < 0)
return idx;
return RemoveAt(SC,idx);
}

static int Push(struct _StringCollection *SC,char *str)
{
char *r;
if (SC->flags&SC_READONLY)
return 0;
if (SC->count >= SC->capacity) {
if (!Resize(SC))
return 0;
}
r = DuplicateString(str);
if (r == NULL)
return 0;
SC->contents[SC->count++] = r;
return SC->count;
}

static char * Pop(struct _StringCollection *SC)
{
char *result;
if ((SC->flags&SC_READONLY) || SC->count == 0)
return NULL;
SC->count--;
result = SC->contents[SC->count];
SC->contents[SC->count] = NULL;
return result;
}

static int Finalize(struct _StringCollection *SC)
{
int result = SC->count,i;
for (i=0; i<SC->count;i++) {
FREE(SC->contents);
}
FREE(SC->contents);
FREE(SC);
return result;
}

static int GetCapacity(struct _StringCollection *SC)
{
return SC->capacity;
}

static int SetCapacity(struct _StringCollection *SC,int newCapacity)
{
if (SC->count != 0)
return 0;
FREE(SC->contents);
SC->contents = MALLOC(newCapacity*sizeof(char *));
memset(SC->contents,0,sizeof(char *)*newCapacity);
SC->capacity = newCapacity;
return 1;
}

static void Apply(struct _StringCollection *SC,int (*Applyfn)(char
*,void *),void *arg)
{
int i;
for (i=0; i<SC->count;i++) {
Applyfn(SC->contents,arg);
}
}

static int *Map(struct _StringCollection *SC,int (*Applyfn)(char *))
{
int *result = MALLOC(SC->count*sizeof(int)),i;

if (result == NULL)
return NULL;
for (i=0; i<SC->count;i++) {
result = Applyfn(SC->contents);
}
return result;
}

#ifdef __LCC__
/* The lcc-win compiler allows operator overloading, and allows using
this abstract data type with the more natural [ ]= operators */
char * __declspec(naked) operator[]=(StringCollection *SC,int idx,char
*newval)
{
}

static char *ReplaceAt(StringCollection *SC,int idx,char *newval)
{
if (SC->flags & SC_READONLY)
return NULL;
if (idx < 0 || idx > SC->count)
return NULL;
FREE(SC->contents[idx]);
SC->contents[idx] = newval;
return newval;
}
#ifdef TEST
#include <stdio.h>
static void PrintStringCollection(StringCollection *SC)
{
int i;
printf("Count %d, Capacity %d\n",SC->count,SC->capacity);
for (i=0; i<SC->count;i++) {
printf("%s\n",SC->lpVtbl->IndexAt(SC,i));
}
printf("\n");
}
int main(void)
{
StringCollection *SC = newStringCollection(10);
char *p;
SC->lpVtbl->Add(SC,"Martin");
SC->lpVtbl->Insert(SC,"Jakob");
printf("Count should be 2, is %d\n",SC->lpVtbl->GetCount(SC));
PrintStringCollection(SC);
SC->lpVtbl->InsertAt(SC,1,"Position 1");
SC->lpVtbl->InsertAt(SC,2,"Position 2");
PrintStringCollection(SC);
SC->lpVtbl->Remove(SC,"Jakob");
PrintStringCollection(SC);
SC->lpVtbl->Push(SC,"pushed");
PrintStringCollection(SC);
SC->lpVtbl->Pop(SC);
PrintStringCollection(SC);
p = SC->lpVtbl->IndexAt(SC,1);
printf("Item position 1:%s\n",p);
PrintStringCollection(SC);
}
#endif
 
C

Charlie Gordon

"jacob navia" <[email protected]> a écrit dans le message de
[email protected]...

<original post snipped>

Thank you for your contribution Jacob.
The code was flushed left probably because of tabs, I reformated it and
fixed a number of small and not so small issues. See below:

Problems fixed:

added API remarks
fixed missing #endif
fixed indentation and spacing
used typedef instead of struct tag in implementation
added const on unmodified string parameters
#include <stddef.h> instead of <string.h>
fixed some bogus bit tests: SC->flags * SC_READONLY ?
added definition for DEFAULT_START_SIZE, allowed it to be 0
removed redundant initialization of result->count
used safer MALLOC idiom (à la c.l.c)
fixed off by one error in ReplaceAt
fixed memory leak in Resize
fixed crash on NULL strings in Contains and IndexOf
used same semantics and method in IndexOf and Contains and shared code
fixed two off by one errors in InsertAt
simplified InsertAt
removed test for condition never reached in RemoveAt
fixed off by one bug in RemoveAt
make SetCapacity work for a non empty collection
added test for MALLOC failure in SetCapacity
duplicate string in Insert, InsertAt, ReplaceAt; handle NULL and failure
allow ReplaceAt with an index of SC->count
used int instead of size_t for count and size for consistency with API.
improved PrintStringCollection and test main output
added missing return 0 in main

------------
Here is the corrected version:

/*----------------
StringCollection by Jacob Navia

jacob navia
jacob at jacob point remcomp point fr
logiciels/informatique
http://www.cs.virginia.edu/~lcc-win32

Abstract:

Continuing the discussion about abstract data types, in this
discussion group, a string collection data type is presented,
patterned after the collection in C# and similar languages (Java).
It stores character strings, and resizes itself to accommodate
new strings when needed.

Interface:
----------

The data structure uses a table of function pointers to provide an
extensible basis for many functions without cluttering the
user's workspace with too many names.

This architecture is not only extensible at the API level, but
it allows "subclassing". If the user is unsatisfied with some
of the functions of the API, he/she can:

o Replace one or more of the function pointers in the table
with a function of his/her own.

o Store somewhere the old function pointer, and replace it with
a function of his own that after doing some work calls the
original function pointer, optionally doing some work after the
original function returns.

This type of extensibility can only be achieved with function pointers.
The basic problem with many of the proposals presented here is that
they present too much names that can conflict with user names.

This hiding of names is useful in another way, since it allows the API
to use completely generic names like "Add" or similar without any
ambiguity.

Since those generic names can be used in *other* abstract data types
that use the same type of structure, the user code can be made truly
general, and it is easy to change from a string collection to a list
without too much trouble.

The objective would be to make a standard way of naming things within
a container, so that user code is shielded from change when passing from
one container to another.

The name of the table of functions member is "lpVtbl" following
an old function naming convention under the windows OS and in
the COM system. It can be changed of course.

API remarks:
------------

Strings passed to API functions are not modified.
Strings inserted into the collection are duplicated will MALLOC and
freed with FREE
Collection can contain NULL strings

The interface is described in the header file <strcollection.h>
--------------------------------------------*/

//#include <stddef.h> // for size_t, if needed

// Forward declaration of the string collection type
typedef struct _StringCollection StringCollection;
typedef struct {
// Returns the number of elements stored
int (*GetCount)(StringCollection *SC);

// Is this collection read only?
int (*IsReadOnly)(StringCollection *SC);

// Sets this collection read-only or unsets the read-only flag
int (*SetReadOnly)(StringCollection *SC, int flag);

// Adds one element at the end. Given string is copied
int (*Add)(StringCollection *SC, const char *newval);

// Adds a NULL terminated table of strings
int (*AddRange)(StringCollection *SC, const char * const *newvalues);

// Clears all data and frees the memory
int (*Clear)(StringCollection *SC);

//Case sensitive search of a character string in the data
int (*Contains)(StringCollection *SC, const char *str);

// Copies all strings into a NULL terminated vector
char **(*CopyTo)(StringCollection *SC);

//Returns the index of the given string or -1 if not found
int (*IndexOf)(StringCollection *SC, const char *SearchedString);

// Inserts a string at the position zero.
int (*Insert)(StringCollection *SC, const char *str);

// Inserts a string at the given position
int (*InsertAt)(StringCollection *SC, int idx, const char *newval);

// Returns the string at the given position
char *(*IndexAt)(StringCollection *SC, int idx);

// Removes the given string if found
int (*Remove)(StringCollection *SC, const char *str);

//Removes the string at the indicated position
int (*RemoveAt)(StringCollection *SC, int idx);

// Frees the memory used by the collection
int (*Finalize)(StringCollection *SC);

// Returns the current capacity of the collection
int (*GetCapacity)(StringCollection *SC);

// Sets the capacity if there are no items in the collection
int (*SetCapacity)(StringCollection *SC, int newCapacity);

// Calls the given function for all strings.
// "Arg" is a user supplied argument (that can be NULL)
// which is passed to the function to call
void (*Apply)(StringCollection *SC,
int (*Applyfn)(char *, void *arg), void *arg);

// Calls the given function for each string and saves
// all results in an integer vector
int *(*Map)(StringCollection *SC, int (*Applyfn)(char *));

// Pushes a string, using the collection as a stack
int (*Push)(StringCollection *SC, const char *str);

// Pops the last string off the collection
char * (*Pop)(StringCollection *SC);

// Replaces the character string at the given position
// with a copy of a new one
char *(*ReplaceAt)(StringCollection *SC, int idx,
const char *newval);

} StringCollectionFunctions;

// Definition of the String Collection type
struct _StringCollection {
StringCollectionFunctions *lpVtbl; // The table of functions
int count; /* in element size units */
char **contents; /* The contents of the collection */
int capacity; /* in element_size units */
unsigned int flags; // Read-only or other flags
};

// This is the only exported function from this module
StringCollection * newStringCollection(int startsize);

/*------------------------------------------end of stringcollection.h

Implementation
--------------

I haven't had the time to comment this more in depth. I hope that
this will be done in the ensuing discussion.

------------------------------------------stringcollection.c */

#include <string.h>
#include "strcollection.h"

#ifndef MALLOC
#include <stdlib.h>
#define MALLOC(s) malloc(s)
#define FREE(p) free(p)
#endif

// Forward definitions
static int GetCount(StringCollection *SC);
static int IsReadOnly(StringCollection *SC);
static int SetReadOnly(StringCollection *SC, int newval);
static int Add(StringCollection *SC, const char *newval);
static int AddRange(StringCollection *SC, const char * const *newvalues);
static int Clear(StringCollection *SC);
static int Contains(StringCollection *SC, const char *str);
static char **CopyTo(StringCollection *SC);
static int IndexOf(StringCollection *SC, const char *SearchedString);
static int Insert(StringCollection *SC, const char *str);
static int InsertAt(StringCollection *SC, int idx, const char *newval);
static char *IndexAt(StringCollection *SC, int idx);
static int Remove(StringCollection *SC, const char *str);
static int RemoveAt(StringCollection *SC, int idx);
static int Finalize(StringCollection *SC);
static int GetCapacity(StringCollection *SC);
static int SetCapacity(StringCollection *SC, int newCapacity);
static void Apply(StringCollection *SC,
int(*Applyfn)(char *str, void *arg), void *arg);
static int *Map(StringCollection *SC, int (*Applyfn)(char *str));
static int Push(StringCollection *SC, const char *str);
static char *Pop(StringCollection *SC);
static char *ReplaceAt(StringCollection *SC, int idx, const char *str);
static StringCollectionFunctions lpVtableSC = {
GetCount, IsReadOnly, SetReadOnly, Add, AddRange,
Clear, Contains, CopyTo, IndexOf, Insert,
InsertAt, IndexAt, Remove, RemoveAt,
Finalize, GetCapacity, SetCapacity, Apply,
Map, Push, Pop, ReplaceAt,
};

static char *DuplicateString(const char *str)
{
char *result;

if (str == NULL)
return NULL;
result = MALLOC(strlen(str) + 1);
if (result == NULL)
return NULL;
return strcpy(result, str);
}

#define SC_READONLY 1
#define CHUNKSIZE 20
#define DEFAULT_START_SIZE 0

StringCollection *newStringCollection(int startsize)
{
StringCollection *SC = MALLOC(sizeof(*SC));

if (SC == NULL)
return NULL;

SC->lpVtbl = &lpVtableSC;
SC->count = 0;
SC->contents = NULL;
SC->capacity = 0;
SC->flags = 0;

if (startsize == 0)
startsize = DEFAULT_START_SIZE;
if (startsize) {
SC->capacity = startsize;
SC->contents = MALLOC(startsize * sizeof(*SC->contents));
if (SC->contents == NULL) {
FREE(SC);
return NULL;
}
// useless, for cleanliness only
memset(SC->contents, 0, startsize * sizeof(*SC->contents));
}
return SC;
}

static int GetCount(StringCollection *SC)
{
return SC->count;
}

static int IsReadOnly(StringCollection *SC)
{
return (SC->flags & SC_READONLY) ? 1 : 0;
}

static int SetReadOnly(StringCollection *SC, int newval)
{
int oldval = (SC->flags & SC_READONLY) ? 1 : 0;

if (newval)
SC->flags |= SC_READONLY;
else
SC->flags &= ~SC_READONLY;
return oldval;
}

static int Resize(StringCollection *SC)
{
int newcapacity = SC->capacity + CHUNKSIZE;
char **newcontents;

newcontents = MALLOC(newcapacity * sizeof(*newcontents));
if (newcontents == NULL)
return 0;
memcpy(newcontents, SC->contents, SC->count * sizeof(*newcontents));
// useless, for cleanliness only
memset(newcontents + SC->count * sizeof(*newcontents),
0, (newcapacity - SC->count) * sizeof(*newcontents));
FREE(SC->contents);
SC->contents = newcontents;
SC->capacity = newcapacity;
return 1;
}

static int Add(StringCollection *SC, const char *str)
{
char *newval = NULL;

if (SC->flags & SC_READONLY)
return -1;
if (SC->count >= SC->capacity) {
if (!Resize(SC))
return 0;
}
if (str) {
newval = DuplicateString(str);
if (newval == NULL)
return 0;
}
SC->contents[SC->count] = newval;
return ++SC->count;
}

static int AddRange(StringCollection *SC, const char * const *data)
{
int i;

if (SC->flags & SC_READONLY)
return 0;
for (i = 0; data != NULL; i++) {
int r = Add(SC, data);
if (r <= 0)
return r;
}
return SC->count;
}

static int Clear(StringCollection *SC)
{
int oldval = SC->count, i;

if (SC->flags & SC_READONLY)
return 0;
for (i = 0; i < SC->count; i++) {
FREE(SC->contents);
SC->contents = NULL;
}
SC->count = 0;
return oldval;
}

static int Contains(StringCollection *SC, const char *str)
{
return (IndexOf(SC, str) >= 0);
}

static char **CopyTo(StringCollection *SC)
{
char **result = MALLOC((SC->count + 1) * sizeof(*result));
int i;

if (result == NULL)
return NULL;
for (i = 0; i < SC->count; i++) {
// XXX: MALLOC failure ignored
result = DuplicateString(SC->contents);
}
result = NULL;
return result;
}

#ifdef __LCC__
/* The lcc-win compiler allows operator overloading, and allows using
this abstract data type with the more natural [ ] operators */
char * __declspec(naked) operator[](StringCollection *SC, int idx)
{
}
#endif

static int IndexOf(StringCollection *SC, const char *str)
{
int c, i;

if (str == NULL) {
for (i = 0; i < SC->count; i++) {
if (SC->contents == NULL)
return i;
}
return -1;
} else {
c = *str;
for (i = 0; i < SC->count; i++) {
if (SC->contents == NULL)
continue;
if (c == SC->contents[0] && !strcmp(SC->contents, str))
return i;
}
return -1;
}
}

static char *IndexAt(StringCollection *SC, int idx)
{
if (idx < 0 || idx >= SC->count)
return NULL;
return SC->contents[idx];
}

static int InsertAt(StringCollection *SC, int idx, const char *str)
{
char *newval = NULL;

if (SC->flags & SC_READONLY)
return 0;
if (idx < 0 || idx > SC->count)
return -1;
if (SC->count >= SC->capacity) {
if (!Resize(SC))
return 0;
}
if (str) {
newval = DuplicateString(str);
if (newval == NULL)
return 0;
}
if (idx < SC->count) {
memmove(SC->contents + idx + 1, SC->contents + idx,
(SC->count - idx) * sizeof(*SC->contents));
}
SC->contents[idx] = newval;
return ++SC->count;
}

static int Insert(StringCollection *SC, const char *str)
{
return InsertAt(SC, 0, str);
}

static int RemoveAt(StringCollection *SC, int idx)
{
if (idx < 0 || idx >= SC->count)
return -1;
FREE(SC->contents[idx]);
memmove(SC->contents + idx, SC->contents + idx + 1,
(SC->count - idx - 1) * sizeof(*SC->contents));
SC->contents[--SC->count] = NULL;
return SC->count;
}

static int Remove(StringCollection *SC, const char *str)
{
int idx = IndexOf(SC, str);
if (idx < 0)
return idx;
return RemoveAt(SC, idx);
}

static int Push(StringCollection *SC, const char *str)
{
char *newval = NULL;

if (SC->flags & SC_READONLY)
return 0;
if (SC->count >= SC->capacity) {
if (!Resize(SC))
return 0;
}
if (str) {
newval = DuplicateString(str);
if (newval == NULL)
return 0;
}
SC->contents[SC->count++] = newval;
return SC->count;
}

static char * Pop(StringCollection *SC)
{
char *result;

if ((SC->flags & SC_READONLY) || SC->count == 0)
return NULL;
SC->count--;
result = SC->contents[SC->count];
SC->contents[SC->count] = NULL;
return result;
}

static int Finalize(StringCollection *SC)
{
int result = SC->count, i;

for (i = 0; i < SC->count; i++) {
FREE(SC->contents);
}
FREE(SC->contents);
FREE(SC);
return result;
}

static int GetCapacity(StringCollection *SC)
{
return SC->capacity;
}

static int SetCapacity(StringCollection *SC, int newcapacity)
{
char **newcontents;

if (newcapacity < 0)
return 0;

newcontents = MALLOC(newcapacity * sizeof(*newcontents));
if (newcontents == NULL)
return 0;

while (SC->count > newcapacity) {
FREE(SC->contents[--SC->count]);
}
memcpy(newcontents, SC->contents, SC->count * sizeof(*newcontents));
// useless, for cleanliness only
memset(newcontents + SC->count * sizeof(*newcontents),
0, (newcapacity - SC->count) * sizeof(*newcontents));
FREE(SC->contents);
SC->contents = newcontents;
SC->capacity = newcapacity;
return 1;
}

static void Apply(StringCollection *SC,
int (*Applyfn)(char *str, void *arg), void *arg)
{
int i;

for (i = 0; i < SC->count; i++) {
Applyfn(SC->contents, arg);
}
}

static int *Map(StringCollection *SC, int (*Applyfn)(char *str))
{
int *result = MALLOC(SC->count * sizeof(*result));
int i;

if (result == NULL)
return NULL;
for (i = 0; i < SC->count; i++) {
result = Applyfn(SC->contents);
}
return result;
}

#ifdef __LCC__
/* The lcc-win compiler allows operator overloading, and allows using
this abstract data type with the more natural [ ]= operators */
char * __declspec(naked) operator[]=(StringCollection *SC,
int idx, char *newval)
{
}
#endif

static char *ReplaceAt(StringCollection *SC, int idx, const char *str)
{
char *newval = NULL;

if (SC->flags & SC_READONLY)
return NULL;
if (idx < 0 || idx > SC->count)
return NULL;
if (idx == SC->count) {
if (!Add(SC, str))
return NULL;
} else {
if (str) {
newval = DuplicateString(str);
if (newval == NULL)
return NULL;
}
FREE(SC->contents[idx]);
SC->contents[idx] = newval;
}
return SC->contents[idx];
}

#ifdef TEST

#include <stdio.h>

static void PrintStringCollection(StringCollection *SC)
{
int i;

printf("Count %d, Capacity %d {\n",
SC->count, SC->capacity);
for (i = 0; i < SC->count; i++) {
printf(" \"%s\",\n", SC->lpVtbl->IndexAt(SC, i));
}
printf("}\n");
}

int main(void)
{
StringCollection *SC;
int count;
char *p;

SC = newStringCollection(0);
printf("newStringCollection(%d)\n", 0);
PrintStringCollection(SC);
SC->lpVtbl->Finalize(SC);

SC = newStringCollection(10);
printf("newStringCollection(%d)\n", 10);
PrintStringCollection(SC);
printf("Add \"%s\"\n", "Martin");
SC->lpVtbl->Add(SC, "Martin");
printf("Insert \"%s\"\n", "Jacob");
SC->lpVtbl->Insert(SC, "Jakob");
count = SC->lpVtbl->GetCount(SC);
if (count != 2)
printf("Count should be 2, is %d\n", count);
PrintStringCollection(SC);
printf("Insert at %d \"%s\"\n", 1, "Position 1");
SC->lpVtbl->InsertAt(SC, 1, "Position 1");
printf("Insert at %d \"%s\"\n", 2, "Position 2");
SC->lpVtbl->InsertAt(SC, 2, "Position 2");
PrintStringCollection(SC);
printf("Remove \"%s\"\n", "Jacob");
SC->lpVtbl->Remove(SC, "Jakob");
PrintStringCollection(SC);
printf("Push \"%s\"\n", "pushed");
SC->lpVtbl->Push(SC, "pushed");
PrintStringCollection(SC);
p = SC->lpVtbl->Pop(SC);
FREE(p);
printf("Pop -> \"%s\"\n", p);
PrintStringCollection(SC);
p = SC->lpVtbl->IndexAt(SC, 1);
printf("IndexAt %d -> \"%s\"\n", 1, p);
PrintStringCollection(SC);
SC->lpVtbl->Finalize(SC);

return 0;
}
#endif

/*--------------------------------------------------*/

There are some remaining issues:

Return values should be documented... in fact the whole API should be. A
concise paragraph before each method would suffice.

We should decide if the collection is indexed with int or size_t, or some
appropriate typedef and make API consistent.

Why does the apply function in Apply return int? the value is ignored.
Why does the apply function in Map not take an extra arg like that of Apply
does?

Memory allocation could be handled with methods as well, but that's
debatable.

"Subclassing" should be done by changing the pointer to the virtual table
for specific instances, and make that point to a copy of the original one,
with appropriate function pointer changed, and possibly extra methods
defined. The way you document affects *all* StringCollections in a given
program.

Please allow me to insist on you publishing the source your compiler so you
can get this kind of help everywhere. Under an appropriate licence, open
sourcing it would not prevent you from selling it to businesses for
commercial use.

Good night.
 
J

jacob navia

Charlie said:
"jacob navia" <[email protected]> a écrit dans le message de
[email protected]...

<original post snipped>

Thank you for your contribution Jacob.
The code was flushed left probably because of tabs, I reformated it and
fixed a number of small and not so small issues. See below:

Problems fixed:

added API remarks
fixed missing #endif
fixed indentation and spacing
used typedef instead of struct tag in implementation
added const on unmodified string parameters
#include <stddef.h> instead of <string.h>
fixed some bogus bit tests: SC->flags * SC_READONLY ?
added definition for DEFAULT_START_SIZE, allowed it to be 0
removed redundant initialization of result->count
used safer MALLOC idiom (à la c.l.c)
fixed off by one error in ReplaceAt
fixed memory leak in Resize
fixed crash on NULL strings in Contains and IndexOf
used same semantics and method in IndexOf and Contains and shared code
fixed two off by one errors in InsertAt
simplified InsertAt
removed test for condition never reached in RemoveAt
fixed off by one bug in RemoveAt
make SetCapacity work for a non empty collection
added test for MALLOC failure in SetCapacity
duplicate string in Insert, InsertAt, ReplaceAt; handle NULL and failure
allow ReplaceAt with an index of SC->count
used int instead of size_t for count and size for consistency with API.
improved PrintStringCollection and test main output
added missing return 0 in main

Thanks a lot!

The original version of this code is written for
my compiler system and distributed with the source code
in the lcc-win distribution.

I eliminated the lcc-win specific parts in this code.
When doing this, I may have introduced some of the bugs
above. Specifically the replacing of references with pointers
intoruced the bug replacing the "&" in an AND operation
with a multiplication sign!

When I comment about it then, I will refer to this version.

[snip code]
 
R

Roland Pibinger

Abstract:
Continuing the discussion about abstract data types, in this
discussion group, a string collection data type is presented,
patterned after the collection in C# and similar languages (Java).
It stores character strings, and resizes itself to accommodate
new strings when needed.

A collection for strings certainly can be useful in many cases. I have
two major objection to the current design:

1. The function pointers are unnecessary. I can hardly imagine anyone
wants to override/replace some of the container functions. OTOH, the
function pointers make usage clumsy. One has to write:
SC->lpVtbl->Remove(SC,"Jakob");
instead of just:
Remove(SC,"Jakob");

Actually, I don't see why the vtable is exposed to the user at all.
Even if you insist on function pointers you can implement something
like:

static
int Remove (StringCollection *SC, const char *str) {
return SC->lpVtbl->Remove(SC, str);
}

2. Memory management:
Strings in the container are managed by the container (via
Finalize()). But in some cases the returned strings must be freed by
the caller. A clear and unambiguous strategy would prevent memory
leaks.
 
J

jacob navia

Roland said:
A collection for strings certainly can be useful in many cases. I have
two major objection to the current design:

1. The function pointers are unnecessary. I can hardly imagine anyone
wants to override/replace some of the container functions. OTOH, the
function pointers make usage clumsy. One has to write:
SC->lpVtbl->Remove(SC,"Jakob");
instead of just:
Remove(SC,"Jakob");

This is certainly possible woth some macros.

However, the key idea of the design is precisely to give several
containers the SAME interface, so that if you decide to
replace a string collection with a list, you can still have

SC->lpVtbl->Remove(SC,"Jacob");

without changing a single line from your user code besides a few
declarations.

An added advantage is that it could very well exist some "Remove"
function in a LOT of user code, so that we would be forced to
use SOME kind of identifier prefix like

CLC_LIST_Remove

or similar anyway.

Those macros are so trivial to write that they could be proposed but
not be a forced part of the interface.
Actually, I don't see why the vtable is exposed to the user at all.
Even if you insist on function pointers you can implement something
like:

static
int Remove (StringCollection *SC, const char *str) {
return SC->lpVtbl->Remove(SC, str);
}


The advantages of exposing the vtable is to give the user the freedom to
change to behavior of the containers as he/she wishes WITHOUT changing
a single line in the user code.

For instance if you want to store only unique strings in the table,
and keep it sorted at all times, you just change the table of
functions and add the new functionality WITHOUT any change
to all your code at all!!

Besides, this gives the user the possibility of changing the behavior
of only ONE instance or changing the behavior of ALL instances
without any limitations with a VERY fast pointer assignment operation!

2. Memory management:
Strings in the container are managed by the container (via
Finalize()). But in some cases the returned strings must be freed by
the caller. A clear and unambiguous strategy would prevent memory
leaks.

This is true, but I do not see any easy way out. When I return a string
should I allocate a new copy and free the one in the container?

But maybe there is a better solution. You have a valid point.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,982
Messages
2,570,186
Members
46,740
Latest member
JudsonFrie

Latest Threads

Top