R
Ross A. Finlayson
I'm trying to write some C code, but I want to use C++'s std::vector.
Indeed, if the code is compiled as C++, I want the container to
actually be std::vector, in this case of a collection of value types or
std::vector<int>.
So where I would use an int* and reallocate it from time to time in C,
and randomly access it via [], then I figure to copy the capacity and
reserve methods, because I just need a growable array.
I get to considering reallocation patterns, and get to thinking about
something along the lines of:
allocate_one_more
allocate_n_more
allocate_twice_as_many
allocate_field_again_as_many
alllocate_pagesize_many
Those are basically allocation resizing strategies. For example, in a
program with a one or few elements in an array, with few reallocations,
then when the array is reallocated there might only need be one more
allocated. If again the count of reallocations is low but few or many
elements are added at one time, then allocate n many more for some
variable n.
In general, the notion might be to allocate twice as many, this has
some notion of capacity, which is the space allocated in std::vector,
and is the recommended amount to resize the array on the SGI STL
description of vector.
http://www.sgi.com/tech/stl/Vector.html
There might be more reason to increase the capacity only in increments
of the original size or multiples of the memory page size, eg from
GetSystemInfo or sysinfo, or ioctl or getpagesize orwhatever, or just
plain increments of 4k, 4096, which probably evenly divides into the
machine's memory page size.
I know there are a lot of hot wheels out there when it comes to
reinventing this growable array wheel. By the same token, there are a
lot of junk wheels.
One place to consider is the backlogs of comp.lang.c:
http://groups-beta.google.com/group/comp.lang.c/search?group=comp.lang.c&q=growable+array
I don't want a linked list, this data structure appends only at the end
and has a lot of random access reads, also near the end, where a linked
list takes linear time to access random elements, it's just good for
insertion into the list. Anyways I'm trying to figure out a pragmatic
and sensible way to implement a subset of C++'s vector in C, without
need for iterators, which would just be a pointer to the data.
Anyways, I'm trying to figure out a set of macros to implement a C
language growable array that I can redefine to reflect the same thing
with the std::vector.
typedef struct {
int* data
size_t size;
size_t capacity;
} x_vector;
#define vector_get(vector, offset, val) val = vector##.data[offset]
#define vector_set(vector, offset, val) vector##.data[offset] = val
#define vector_capacity(vector) vector##.capacity
#define vector_size(vector) vector##.size
#define vector_reserve(vector, size) \
{ \
int* temp; \
temp = vector##.data; \
vector##.data = realloc(data, size); \
if(vector##.data ==NULL){ \
vector##.data = temp; \
} else { \
vector##.capacity = size; \
} (void)0
versus
#define vector_get(vector, offset, val) val = vector##[offset]
#define vector_set(vector, offset, val) vector##[offset] = val
#define vector_capacity(vector) vector##.capacity()
#define vector_size(vector) vector##.size()
#define vector_reserve(vector, size) vector##.reserve(size)
Obviously there is no bounds checking on that and the implementation
would need to check or know the capacity before trying to get or set
any of the elements of the growable array. If the realloc fails then
the pointer is restored to what it was instead of nulling and losing
it.
#define vector_init(vector) vector##.size = 0; \
vector##.capacity = 0; \
vector##.data = malloc(0)
#define vector_free(vector) free(vector##.data)
It's obviously a hassle to type
x_vector v;
vector_init(v);
vector_reserve(v, 4096);
if(vector_capacity < 4096){ error();}
vector_set(v, 0, 9);
....
compared to
v[0] = 9;
or
#define x_vector std::vector<int>
#define vector_init(v)
#define vector_free(v) delete v
Anyways, good practices for growable arrays, in this case vectors:
advice?
There are various implementations of realloc with respect to passing it
a NULL pointer. You might expect that realloc would be OK with a NULL
pointer. Sometimes that is not the case. Also, malloc(0) might be
bad, but probably not.
I don't know if I need the ##'s which merge the identifiers in the
macros. Also I think (void)0; is discarded by the compiler.
Thanks,
Ross F.
Indeed, if the code is compiled as C++, I want the container to
actually be std::vector, in this case of a collection of value types or
std::vector<int>.
So where I would use an int* and reallocate it from time to time in C,
and randomly access it via [], then I figure to copy the capacity and
reserve methods, because I just need a growable array.
I get to considering reallocation patterns, and get to thinking about
something along the lines of:
allocate_one_more
allocate_n_more
allocate_twice_as_many
allocate_field_again_as_many
alllocate_pagesize_many
Those are basically allocation resizing strategies. For example, in a
program with a one or few elements in an array, with few reallocations,
then when the array is reallocated there might only need be one more
allocated. If again the count of reallocations is low but few or many
elements are added at one time, then allocate n many more for some
variable n.
In general, the notion might be to allocate twice as many, this has
some notion of capacity, which is the space allocated in std::vector,
and is the recommended amount to resize the array on the SGI STL
description of vector.
http://www.sgi.com/tech/stl/Vector.html
There might be more reason to increase the capacity only in increments
of the original size or multiples of the memory page size, eg from
GetSystemInfo or sysinfo, or ioctl or getpagesize orwhatever, or just
plain increments of 4k, 4096, which probably evenly divides into the
machine's memory page size.
I know there are a lot of hot wheels out there when it comes to
reinventing this growable array wheel. By the same token, there are a
lot of junk wheels.
One place to consider is the backlogs of comp.lang.c:
http://groups-beta.google.com/group/comp.lang.c/search?group=comp.lang.c&q=growable+array
I don't want a linked list, this data structure appends only at the end
and has a lot of random access reads, also near the end, where a linked
list takes linear time to access random elements, it's just good for
insertion into the list. Anyways I'm trying to figure out a pragmatic
and sensible way to implement a subset of C++'s vector in C, without
need for iterators, which would just be a pointer to the data.
Anyways, I'm trying to figure out a set of macros to implement a C
language growable array that I can redefine to reflect the same thing
with the std::vector.
typedef struct {
int* data
size_t size;
size_t capacity;
} x_vector;
#define vector_get(vector, offset, val) val = vector##.data[offset]
#define vector_set(vector, offset, val) vector##.data[offset] = val
#define vector_capacity(vector) vector##.capacity
#define vector_size(vector) vector##.size
#define vector_reserve(vector, size) \
{ \
int* temp; \
temp = vector##.data; \
vector##.data = realloc(data, size); \
if(vector##.data ==NULL){ \
vector##.data = temp; \
} else { \
vector##.capacity = size; \
} (void)0
versus
#define vector_get(vector, offset, val) val = vector##[offset]
#define vector_set(vector, offset, val) vector##[offset] = val
#define vector_capacity(vector) vector##.capacity()
#define vector_size(vector) vector##.size()
#define vector_reserve(vector, size) vector##.reserve(size)
Obviously there is no bounds checking on that and the implementation
would need to check or know the capacity before trying to get or set
any of the elements of the growable array. If the realloc fails then
the pointer is restored to what it was instead of nulling and losing
it.
#define vector_init(vector) vector##.size = 0; \
vector##.capacity = 0; \
vector##.data = malloc(0)
#define vector_free(vector) free(vector##.data)
It's obviously a hassle to type
x_vector v;
vector_init(v);
vector_reserve(v, 4096);
if(vector_capacity < 4096){ error();}
vector_set(v, 0, 9);
....
compared to
v[0] = 9;
or
#define x_vector std::vector<int>
#define vector_init(v)
#define vector_free(v) delete v
Anyways, good practices for growable arrays, in this case vectors:
advice?
There are various implementations of realloc with respect to passing it
a NULL pointer. You might expect that realloc would be OK with a NULL
pointer. Sometimes that is not the case. Also, malloc(0) might be
bad, but probably not.
I don't know if I need the ##'s which merge the identifiers in the
macros. Also I think (void)0; is discarded by the compiler.
Thanks,
Ross F.