Simple C containers, std::vector analog

  • Thread starter Ross A. Finlayson
  • Start date
R

Ross A. Finlayson

I'm trying to write some C code, but I want to use C++'s std::vector.
Indeed, if the code is compiled as C++, I want the container to
actually be std::vector, in this case of a collection of value types or
std::vector<int>.

So where I would use an int* and reallocate it from time to time in C,
and randomly access it via [], then I figure to copy the capacity and
reserve methods, because I just need a growable array.

I get to considering reallocation patterns, and get to thinking about
something along the lines of:

allocate_one_more
allocate_n_more
allocate_twice_as_many
allocate_field_again_as_many
alllocate_pagesize_many

Those are basically allocation resizing strategies. For example, in a
program with a one or few elements in an array, with few reallocations,
then when the array is reallocated there might only need be one more
allocated. If again the count of reallocations is low but few or many
elements are added at one time, then allocate n many more for some
variable n.

In general, the notion might be to allocate twice as many, this has
some notion of capacity, which is the space allocated in std::vector,
and is the recommended amount to resize the array on the SGI STL
description of vector.

http://www.sgi.com/tech/stl/Vector.html

There might be more reason to increase the capacity only in increments
of the original size or multiples of the memory page size, eg from
GetSystemInfo or sysinfo, or ioctl or getpagesize orwhatever, or just
plain increments of 4k, 4096, which probably evenly divides into the
machine's memory page size.

I know there are a lot of hot wheels out there when it comes to
reinventing this growable array wheel. By the same token, there are a
lot of junk wheels.

One place to consider is the backlogs of comp.lang.c:

http://groups-beta.google.com/group/comp.lang.c/search?group=comp.lang.c&q=growable+array

I don't want a linked list, this data structure appends only at the end
and has a lot of random access reads, also near the end, where a linked
list takes linear time to access random elements, it's just good for
insertion into the list. Anyways I'm trying to figure out a pragmatic
and sensible way to implement a subset of C++'s vector in C, without
need for iterators, which would just be a pointer to the data.

Anyways, I'm trying to figure out a set of macros to implement a C
language growable array that I can redefine to reflect the same thing
with the std::vector.

typedef struct {
int* data
size_t size;
size_t capacity;
} x_vector;

#define vector_get(vector, offset, val) val = vector##.data[offset]
#define vector_set(vector, offset, val) vector##.data[offset] = val
#define vector_capacity(vector) vector##.capacity
#define vector_size(vector) vector##.size
#define vector_reserve(vector, size) \
{ \
int* temp; \
temp = vector##.data; \
vector##.data = realloc(data, size); \
if(vector##.data ==NULL){ \
vector##.data = temp; \
} else { \
vector##.capacity = size; \
} (void)0

versus

#define vector_get(vector, offset, val) val = vector##[offset]
#define vector_set(vector, offset, val) vector##[offset] = val
#define vector_capacity(vector) vector##.capacity()
#define vector_size(vector) vector##.size()
#define vector_reserve(vector, size) vector##.reserve(size)

Obviously there is no bounds checking on that and the implementation
would need to check or know the capacity before trying to get or set
any of the elements of the growable array. If the realloc fails then
the pointer is restored to what it was instead of nulling and losing
it.

#define vector_init(vector) vector##.size = 0; \
vector##.capacity = 0; \
vector##.data = malloc(0)
#define vector_free(vector) free(vector##.data)

It's obviously a hassle to type

x_vector v;

vector_init(v);
vector_reserve(v, 4096);

if(vector_capacity < 4096){ error();}

vector_set(v, 0, 9);

....

compared to

v[0] = 9;

or

#define x_vector std::vector<int>
#define vector_init(v)
#define vector_free(v) delete v

Anyways, good practices for growable arrays, in this case vectors:
advice?

There are various implementations of realloc with respect to passing it
a NULL pointer. You might expect that realloc would be OK with a NULL
pointer. Sometimes that is not the case. Also, malloc(0) might be
bad, but probably not.

I don't know if I need the ##'s which merge the identifiers in the
macros. Also I think (void)0; is discarded by the compiler.

Thanks,

Ross F.
 
R

Ross A. Finlayson

Stop intimidating me!

Yeah, there are bugs in that. The macro has a block which declares its
own variable, and there is a missing closing brace on the block. All
those ##'s are not needed.

Today the compiler will readily tell you exactly where the bug is, and
is that close to correcting it, steps toward automatic computing or
rather automatic programming. Thirty years ago, the compiler will
readily give you the object file to execute it on the thirty million
dollar supercomputing mainframe. Today, the computing power of thiry
years ago's thirty million dollar supercomputing mainframe is fit into
an MP3 player that plays MP3 files into headphones and straps to your
arm. Thirty years ago it was 1975 and computers ran in Kilohertz
(kilohertz, kiloHertz, KiloHertz). Today they run in Gigahertz, as
Moore approaches physical silicon limits, leading to reversible
processors that run cold on a trickle of picoamps. Having no truck
with school, the computer still runs a million times faster.

The realloc function doesn't always deallocate the previous block of
memory and copy the contents, er... sometimes there is still room on
the heap contiguous to the originally allocated block. It still
returns NULL on failure. It's at the implementation's discretion how
to do that. I'm concerned that some C implementations don't support
nested blocks.

It's probably safer to use malloc(sizeof(word_t)) in the initializer
than malloc(0), or rather:

#define vector_init(vector, T) vector.data =
s_cast(T)(malloc(sizeof(T))

and some implementations do choke on realloc'ing a null pointer.

I think RAII is a decent concept, a good concept, but like everything
else, good in moderation.

(Some things are good in lots of moderation.)

That C vector analog is kind of a castrated C++ standard library
container, that's because in this case for its use there is no need for
many of the functions except those listed. It can return begin and end
pointers for use with std::algorithm, obviously enough implemented with
std::vector it does. There's no implementation of push_front or
pop_front, but it could pop_front by incrementing the pointer, storing
the original pointer for allocation, and stuff. There no insertion
into the array, only push_back, or vector push_back(vector, value).

Basically the idea here is to design the functions as good ISO C, and
ANSI C, yet have them essentially be template functions for use with
C++, with old or new style I/O and containers.

#ifdef __cplusplus
template f <voidpT> void f(voidpT buffer){
#elif
void f(void* buffer){
#endif

For example, they work on buffers, and these vector type dealies, but
again I am working on simple software where the full complexity of C++
is underappreciated. The #elif thing there is not universally
supported. So, I'm trying to figure out how to work in basically
istream and ostream, but only as they look like a file-mapped memory
buffer, just readT and writeT, * and ++. I'm out of my depth here.


Thank you,

Ross F.
 
R

Ross A. Finlayson

I get to thinking about how to implement a vector. Currently I just
have a bunch of macro functions that represent a subset of C++'s
std::vector implementation.

Basically, I don't need a vector, just a growable array with the
ability to add elements to the back, and a forward iterator from the
front.

As part of its being growable, basically I want to minimize
reallocation. Also, and this is platform specific but a general
concern, I don't want to have fragmented bits in memory, but instead
blocks of memory naturally aligned to the machine and runtime's
allocation method for physical random access memory, RAM.

So I started with a struct:

struct x_vec_t{

size_t size;
size_t capacity
x_vecstate_t vecstate;
x_t* data;
};

I search for "linked list of memory pages" and get to
http://www.gotw.ca/publications/mill14.htm , and basically it seems the
C++ container that is implemented with a linked list of memory pages is
double-ended queue or deque. Mmm... sundaes. Anyways, I think I'm
more interested in an array of memory pages, instead of a linked list,
for a queue. This is where there are generally going to be more
elements in the growable array than the size of a memory page. I want
to avoid reallocation, but I want the pointers to the memory pages in
order so that random access to their pointer values is constant time.

struct x_vec_t{

size_t size;
size_t capacity;
x_vecstate_t vecstate;
memptr_t * pages
x** data;
};

Then, instead of using [] to randomly access a data*, indirection is
used to access the data** for the one data* that the page contains,
using a mask computed from the page size and some shifts.

Then, the user might want the contiguous array out of the growable
array for functions that just work on contiguous arrays, not these
abstract data structures. The data might already be well-organized for
that. If it's not, then a function allocates its own array of the
correct size and then another function copies the elements in order
from the container to the array, from the growable array to the array,
basically forming a snapshot. Alternatively, the growable array can be
marked with a flag that indicates that its x* at data[0] is the
beginning of the contiguous array of size many element of type x, with
a function to reorganize itself by allocating as much memory as
necessary to contain its not-necessarily contiguous memory pages and
copy them over itself, having the contiguous flag set until it reserves
more capacity by allocating another page and adding its offset into the
array of pages, without reallocating all the other data pages.

So then when the structure is initialized, it allocates a memory page
for the page pointer, or sizeof many pages, and a memory page for the
data pointer, or sizeof many or some other amount per the allocation
strategy. On lots of machines these days, the memory page size is
around 4 kilobytes, which means that around 1024 many pages could be
allocated with pointers from the page array before that would have to
be reallocated. On some machines the page size is several megabytes
and instead of getpagesize a constant macro can replace the function,
that as well for platforms where there is no getpagesize function or
other function returning some ideal memory block size.

So, that appears to be replaceable by the definition of a queue instead
of a vector, because there's only addition of elements to the end and
not the beginning. It's kind of like a template, for each type x_t to
be contained there is declared struct x_vec_t, or x_array_t.

Let me see if I can recount these requirements: I want basically an
array, that is growable as painlessly as possible. The characteristics
of the array is that it might have only a few thousand elements or
several millions or billions of elements. Thus it should avoid
reallocation of the data array as possible because the eventual count
of elements of the growable array is unknown at the outset. It should
be readily compatible with functions that expect fixed arrays. It
should have constant time access to elements and amortized constant
time growing.

There you go.

Warm regards,

Ross F.
 
R

Ross A. Finlayson

Hi,

I wonder about this notion of the growable array. Basically, it seems
a sound idea: avoid reallocation except for the page or "block"
pointers, because the runtime could have any of a variety of behaviors
in its memory allocator(s).

So, I make a type, I figure I'll call it x_array, actually fc_array,
proj_array, with a struct.

#ifndef h__fc_array_h
#define h__fc_array_h

/*

The array functions operate on structs specialized for a type.

struct x_array_t{
fc_size_t size;
fc_size_t capacity;
fc_size_t blocksize;
fc_uint_t blockmask;
fc_arraystate_t state;
fc_size_t pagecount;
fc_ptr_t* pages;
x_t** data;
};

The sizeof the struct is probably 32 bytes on a 32-bit architecture,
the pages and data
members are pointer types, and size_t and so on change, so is probably
64 bytes on a 64-bit
system.

The size member contains the number of elements of type x_t.

The capacity member indicates the maximum size before the array would
be grown/resized.

The blocksize is the size of the page and set by fc_array_init.

The blockmask member is computed from the block size and used with its
complement to address
individual elements in fc_array_init.

The state member is an enumerated value of various states of the data
structure.

The pagecount member is the count of currently referenced pages in the
fixed pages array.

The pages member is a fixed array containing in order the pointers to
the data blocks. It is
reallocated as necessary to contain enough elements to point to each
member of the data array.

The data member is a pointer to memory blocks, x_t pointer, each of
which is a fixed size, blocksize.

The basic use of the functions are to allocate an array context struct,
zero it, initialize it,
push elements onto it, and then free it.

When a new page is allocated, its pointer is added in sequence to the
list of pages and blockcount
is incremented. The page is not necessarily initialized.

The pages might come from a page pool, in multiples of sizeof(x_t).

*/

/* This allocates sizeof(x_array_t) and returns a pointer for
non-automatic variables. */
#define fc_array_context_allocate

/* This zeroes each member of an array struct. */
#define fc_array_context_zero

/* The struct should be zeroed on allocation */
#define FC_ARRAYSTATE_UNINITIALIZED 0
/* After the array is initialized it should be OK. */
#define FC_ARRAYSTATE_OK 1
/* If a page index realloc or block allocation fails, this is the
state. */
#define FC_ARRAYSTATE_CANTALLOCATEMEMORY 2

/* This calls the getpagesize function, and can be set instead to a
constant. */
#define FC_ARRAYBLOCKSIZE_DEFAULT fc_array_getpagesize()

#define FC_ARRAYGROWTH_PAGE
#define FC_ARRAYGROWTH_DOUBLE
#define FC_ARRAYGROWTH_DEFAULT FC_ARRAYGROWTH_DOUBLE

/* Initialize an array with a default block size and growth strategy.
*/
#define fc_array_init

/* Initialize an array with a given block size and reallocation/growth
strategy. */
#define fc_array_init_strategy

/* Free the array. */
#define fc_array_free

/* Set a value at an index in the array. */
#define fc_array_set

/* Get a value at an index in the array. */
#define fc_array_get

/* Push an element onto the back of the array. */
#define fc_array_push_back

/* Get the size of the array. */
#define fc_array_size

/* Get the capacity of the array. */
#define fc_array_capacity

/* Reserve a capacity for the array. */
#define fc_array_reserve

/* Trim an array, releasing unused memory pages and setting
capacity to as low as possible. */
#define fc_array_trim

/* Reorganize an array, eg make it contiguous. */
#define fc_array_reorganize

/* Fill a fixed array. */
#define fc_array_fixed

/* Copy an array. */
#define fc_array_copy

/* Move an array. */
#define fc_array_move

/* Concatenate an array to another. */
#define fc_array_concatenate

/* Concatenate a fixed array. */
#define fc_array_concatenate_fixed

/* Interleave the elements of two arrays. */
#define fc_array_interleave

/* Uninterleave the elements into two arrays. */
#define fc_array_uninterleave


#endif /* h__fc_array_h */

Now, I would rather have those be functions than macros, but the
specialization of the array leads to the requirement of not necessarily
the type x_t, but the sizeof(x_t). So, it is not so bad to implement
specializations for sizeof(x_t) = 1, 2, 4, 8, 16, and 32, say, as they
are templatized functions, but the C language itself has no feature for
that. By the time I got to typing "fc_array_interleave" etc. then I
was starting to think about algorithms on the items, eg sort, again
another thing and something that doesn't belong in a minimal
implementation of a growable array.

Now, here's a separate question that is quite unrelated, it has to do
with tables, basically. Say I have a compilation unit containing one
definition(?), a variable initializer for a globally scoped variable,
as so:

int table[x] = {
0, 1, 2, ...
};

I want to know what it the maximum value for x. For example, say I
have a table that is two megabytes in size. I think on some segmented
memory systems that the maximum sizeof is 64 kilobytes. I wonder if
there are reasonable and very real limits on the size of a const array.

Speaking of const, then I wonder about const on that kind of table.
Should I add the const keyword as modifier to the table variable?

const int table[256]={0, 1, 2, ..., 255};

Another thing I wonder about is the use of the memory page sized
blocks. Is the runtime memory allocator truly happy to allocate blocks
of the memory page size, and on commodity PC systems, will it always be
the case that working with page-aligned page-size blocks of memory will
be the best route?

Again, the notion here is to define a simple container, where those
macros that call functions can have those replaced with, say, some
other standard containers invocation, because of a desire for
consistent runtime interface.

For example, say there is:

#define fc_array_context_allocate(A, T) A =
(T##array_t*)malloc(sizeof(T##array_t))
#define fc_array_context_zero(A, T) memzero(A, sizeof(T##array_t))
#define fc_array_init(A, T) fc_array_init_imp(A, sizeof(T));

vs.

#define fc_array_context_allocate(A, T) A = new std::vector<T>
#define fc_array_context_zero(A, T)
#define fc_array_init(A, T)

This is just a simplistic notion about a growable array in C for value
types (primitives, value composites) that is readily compatible with
C++ standard containers and various levels of indirection with a lean
interface, with the high speed access and constant time growth, memory
pool, arena, free list, and perhaps strong type and garbage collection
compatibility, from C, and use of memory blocks conveniently sized for
minimizing memory fragmentation.

Warm regards,

Ross F.
 
R

Ross A. Finlayson

I wonder how the runtime allocates memory. Please excuse this extended
message.

#include <stdlib.h>
#include <stdio.h>

void buf(int n){

void* buffer = NULL;
buffer = malloc(n);
printf("buffer of size %u: \t%p", n, buffer);
if(((unsigned long)buffer) % n == 0){
printf(", address aligned to buffer size");
}
printf("\n");
free(buffer);

}

int main(int ac, char** av){

unsigned long one = 1;
int i = 0;

buf(getpagesize());

for(i=0;i<32;i++){
buf(one << i);
}

return 0;
}


I run that little program and get:

buffer of size 4096: 0x44520
buffer of size 1: 0x44520, address aligned to buffer size
buffer of size 2: 0x44520, address aligned to buffer size
buffer of size 4: 0x44520, address aligned to buffer size
buffer of size 8: 0x44520, address aligned to buffer size
buffer of size 16: 0x44520, address aligned to buffer size
buffer of size 32: 0x44520, address aligned to buffer size
buffer of size 64: 0x44520
buffer of size 128: 0x44520
buffer of size 256: 0x44520
buffer of size 512: 0x44520
buffer of size 1024: 0x44520
buffer of size 2048: 0x44520
buffer of size 4096: 0x44520
buffer of size 8192: 0x45570
buffer of size 16384: 0xa4000, address aligned to buffer size
buffer of size 32768: 0xa8000, address aligned to buffer size
buffer of size 65536: 0xb0000, address aligned to buffer size
buffer of size 131072: 0xc0000, address aligned to buffer size
buffer of size 262144: 0xe0000
buffer of size 524288: 0x120000
buffer of size 1048576: 0x1a0000
buffer of size 2097152: 0x2a0000
buffer of size 4194304: 0x4a0000
buffer of size 8388608: 0x8a0000
buffer of size 16777216: 0x10a0000
buffer of size 33554432: 0x10a0000
buffer of size 67108864: 0x10a0000
buffer of size 134217728: 0x10a0000
buffer of size 268435456: 0x10a0000
buffer of size 536870912: 0x10a0000
buffer of size 1073741824: 0x10a0000
*** malloc[931]: error: Can't allocate region
buffer of size 2147483648: 0x0, address aligned to buffer size


So, it looks like asking malloc for a block the size of the memory page
does not result in a page-aligned block. All the allocated blocks are
on a 32-byte boundary. for 16, 32, 64, and 128 kilobyte blocks it
might seem that they are allocated to be aligned to the block size,
that might be not the case except for 64 kilobytes. After 16 megabytes
the returned address is the same.

I don't free each allocating buffer, letting the process' return from
main free the memory buffers.

buffer of size 4096: 0x44520
buffer of size 1: 0x45570, address aligned to buffer size
buffer of size 2: 0x45580, address aligned to buffer size
buffer of size 4: 0x45590, address aligned to buffer size
buffer of size 8: 0x455a0, address aligned to buffer size
buffer of size 16: 0x455b0, address aligned to buffer size
buffer of size 32: 0x455d0
buffer of size 64: 0x45600, address aligned to buffer size
buffer of size 128: 0x45650
buffer of size 256: 0x456e0
buffer of size 512: 0x457f0
buffer of size 1024: 0x45a00
buffer of size 2048: 0x45e10
buffer of size 4096: 0x46620
buffer of size 8192: 0x47630
buffer of size 16384: 0xa4000, address aligned to buffer size
buffer of size 32768: 0xa8000, address aligned to buffer size
buffer of size 65536: 0xb0000, address aligned to buffer size
buffer of size 131072: 0xc0000, address aligned to buffer size
buffer of size 262144: 0xe0000
buffer of size 524288: 0x120000
buffer of size 1048576: 0x1a0000
buffer of size 2097152: 0x2a0000
buffer of size 4194304: 0x4a0000
buffer of size 8388608: 0x8a0000
buffer of size 16777216: 0x10a0000
buffer of size 33554432: 0x20a0000
buffer of size 67108864: 0x40a0000
buffer of size 134217728: 0x80a0000
buffer of size 268435456: 0x100a0000
buffer of size 536870912: 0x200a0000
*** malloc[960]: error: Can't allocate region
buffer of size 1073741824: 0x0, address aligned to buffer size
*** malloc[960]: error: Can't allocate region
buffer of size 2147483648: 0x0, address aligned to buffer size

It appears that 16 bytes, on this system, is the least alignment (gcd).

In a recent e-mail I was reading about how an allocated block sometimes
has bookkeeping information in the word or two immediately prior to the
contiguous block. I research, and find some interesting web sites
discussing this kind of thing.

http://www.memorymanagement.org/
http://ose.sourceforge.net/browse.php?group=library-manual&entry=memory.htm

There are lots of memory managers. Their behavior is often
predictable. It's predictable that misuse of dynamic memory
allocations will cause problems.

Basically I'm not very concerned, but I do wonder about high
performance access to physical memory, but I don't have any inclination
to consider implementing a virtual memory. I look at sys/mman.h, this
is a BSD system. There appears to be discussion of pages and
"regions". Hmm... on the malloc man page, it describes a
(non-standard) function malloc_good_size. For each of the above block
sizes, the recommended "good" size is 14 bytes more than the block size
unless it is less than 8 in which case it says 14 or 16k or more in
which case it is the block size. The malloc_size function on this
implementation always returns zero, but errno is not set.

#include <stdlib.h>
#include <stdio.h>
#include <errno.h>

void buf(int n);

void info_good_size(int n){

printf("\t%u", malloc_good_size(n));
/* buf(malloc_good_size(n)); */

}
void buf(int n){

size_t msize=0;

void* buffer = NULL;
buffer = malloc(n);
printf("buffer of size %u: \t%p", n, buffer);
msize = malloc_size(buf);
if(errno) perror(NULL);
printf("\t%u", msize);
info_good_size(n);
if(((unsigned long)buffer) % n == 0){
printf(" A ");
}
printf("\n");
free(buffer);

}


int main(int ac, char** av){
....


buffer of size 4096: 0x44520 0 4110
buffer of size 1: 0x44520 0 14 A
buffer of size 2: 0x44520 0 14 A
buffer of size 4: 0x44520 0 14 A
buffer of size 8: 0x44520 0 14 A
buffer of size 16: 0x44520 0 30 A
buffer of size 32: 0x44520 0 46 A
buffer of size 64: 0x44520 0 78
buffer of size 128: 0x44520 0 142
buffer of size 256: 0x44520 0 270
buffer of size 512: 0x44520 0 526
buffer of size 1024: 0x44520 0 1038
buffer of size 2048: 0x44520 0 2062
buffer of size 4096: 0x44520 0 4110
buffer of size 8192: 0x45570 0 8206
buffer of size 16384: 0xa4000 0 16384 A
buffer of size 32768: 0xa8000 0 32768 A
buffer of size 65536: 0xb0000 0 65536 A
buffer of size 131072: 0xc0000 0 131072 A
buffer of size 262144: 0xe0000 0 262144
buffer of size 524288: 0x120000 0 524288
buffer of size 1048576: 0x1a0000 0 1048576
buffer of size 2097152: 0x2a0000 0 2097152
buffer of size 4194304: 0x4a0000 0 4194304
buffer of size 8388608: 0x8a0000 0 8388608
buffer of size 16777216: 0x10a0000 0 16777216
buffer of size 33554432: 0x10a0000 0 33554432
buffer of size 67108864: 0x10a0000 0 67108864
buffer of size 134217728: 0x10a0000 0 134217728
buffer of size 268435456: 0x10a0000 0 268435456
buffer of size 536870912: 0x10a0000 0 536870912
buffer of size 1073741824: 0x10a0000 0 1073741824
*** malloc[1078]: error: Can't allocate region
buffer of size 2147483648: 0x0 0 2147483648 A

If I uncomment the recursive call to buf() in info_good_size():

Segmentation fault

That is confusing.

So, it seems like instead of using getpagesize() as a default memory
block size, I should start at the least power of two greater than or
equal to getpagesize, and shift that left a bit at a time and compare
that to malloc_good_size of that until they are equal and use that.

While that may be so, malloc_good_size is not a portable function, or
rather not in the standard C library. I could just allocate blocks of
those test sizes until one is aligned on the block boundary, and assume
that's the region.

Basically I'm wondering the most humane way to treat malloc, _the_
standard C library dynamic heap memory allocation facility, to get
memory from it. I want some nice growable arrays, and by nice meaning
the most efficient in terms of being polite to the hardware.

Standard C might be mute about these things, but programs written in
standard C are not deaf to their concerns, or if they are, it's from
too much noise.

I'm trying to determine some good practices for coding C libraries in
general, and the dynamic memory allocation is one of the
strengths/weaknesses, features/bugs, .... This is still in mostly
ignorance about reference counting and basically multithreading and a
processor's real single thread of execution, the active node on the
flowgraph of the flow of control. With the threading being basically
non-portable, although to a large extent the thread and lock functions
have translations in each threading environment, global variables are
to be avoided like the plague, except in a framework where they're
planned with regards to a portable thread-safe or thread-friendly
design.

So anyways, towards implementing a growable array, I don't want to
implement a heavy function for each time the array context is
initialized for a default memory block size, and for portability's sake
on compilation that value will vary anyways. It could be heuristically
determined by a program at compile time and emitted to a port.h type
file, or maybe just:

#define DEFAULT_BLOCK_SIZE 16384

Advice is appreciate, madvise() might be.

Warm regards,

Ross F.
 
B

Barry Schwarz

I wonder how the runtime allocates memory. Please excuse this extended
message.

#include <stdlib.h>
#include <stdio.h>

void buf(int n){

void* buffer = NULL;
buffer = malloc(n);

Since a successful allocation is guaranteed to be properly aligned for
any **object** type, n can have no portable affect on the alignment.
printf("buffer of size %u: \t%p", n, buffer);

n is an int. %u requires an unsigned int.
if(((unsigned long)buffer) % n == 0){

If malloc fails and returns null, is this still defined?
printf(", address aligned to buffer size");
}
printf("\n");
free(buffer);

}

int main(int ac, char** av){

unsigned long one = 1;
int i = 0;

buf(getpagesize());

A non-standard function.
for(i=0;i<32;i++){
buf(one << i);

The argument will be converted from unsigned long to int. Does
1ul<<16 fit in an int portably? Does 1ul<<31 do so on your system?
If you changed buf to use unsigned long your undefined behavior
problems would disappear.
}

return 0;
}


I run that little program and get:

buffer of size 4096: 0x44520
buffer of size 1: 0x44520, address aligned to buffer size
buffer of size 2: 0x44520, address aligned to buffer size
buffer of size 4: 0x44520, address aligned to buffer size
buffer of size 8: 0x44520, address aligned to buffer size
buffer of size 16: 0x44520, address aligned to buffer size
buffer of size 32: 0x44520, address aligned to buffer size
buffer of size 64: 0x44520
buffer of size 128: 0x44520
buffer of size 256: 0x44520
buffer of size 512: 0x44520
buffer of size 1024: 0x44520
buffer of size 2048: 0x44520
buffer of size 4096: 0x44520
buffer of size 8192: 0x45570
buffer of size 16384: 0xa4000, address aligned to buffer size
buffer of size 32768: 0xa8000, address aligned to buffer size
buffer of size 65536: 0xb0000, address aligned to buffer size
buffer of size 131072: 0xc0000, address aligned to buffer size
buffer of size 262144: 0xe0000
buffer of size 524288: 0x120000
buffer of size 1048576: 0x1a0000
buffer of size 2097152: 0x2a0000
buffer of size 4194304: 0x4a0000
buffer of size 8388608: 0x8a0000
buffer of size 16777216: 0x10a0000
buffer of size 33554432: 0x10a0000
buffer of size 67108864: 0x10a0000
buffer of size 134217728: 0x10a0000
buffer of size 268435456: 0x10a0000
buffer of size 536870912: 0x10a0000
buffer of size 1073741824: 0x10a0000
*** malloc[931]: error: Can't allocate region
buffer of size 2147483648: 0x0, address aligned to buffer size


So, it looks like asking malloc for a block the size of the memory page
does not result in a page-aligned block. All the allocated blocks are

C doesn't have an object of type page and therefore cannot know about
its alignment requirements.
on a 32-byte boundary. for 16, 32, 64, and 128 kilobyte blocks it

Possibly an artifact of your OS's memory handler.
might seem that they are allocated to be aligned to the block size,
that might be not the case except for 64 kilobytes. After 16 megabytes
the returned address is the same.

I don't free each allocating buffer, letting the process' return from
main free the memory buffers.

buffer of size 4096: 0x44520
buffer of size 1: 0x45570, address aligned to buffer size
buffer of size 2: 0x45580, address aligned to buffer size
buffer of size 4: 0x45590, address aligned to buffer size
buffer of size 8: 0x455a0, address aligned to buffer size
buffer of size 16: 0x455b0, address aligned to buffer size
buffer of size 32: 0x455d0
buffer of size 64: 0x45600, address aligned to buffer size
buffer of size 128: 0x45650
buffer of size 256: 0x456e0
buffer of size 512: 0x457f0
buffer of size 1024: 0x45a00
buffer of size 2048: 0x45e10
buffer of size 4096: 0x46620
buffer of size 8192: 0x47630
buffer of size 16384: 0xa4000, address aligned to buffer size
buffer of size 32768: 0xa8000, address aligned to buffer size
buffer of size 65536: 0xb0000, address aligned to buffer size
buffer of size 131072: 0xc0000, address aligned to buffer size
buffer of size 262144: 0xe0000
buffer of size 524288: 0x120000
buffer of size 1048576: 0x1a0000
buffer of size 2097152: 0x2a0000
buffer of size 4194304: 0x4a0000
buffer of size 8388608: 0x8a0000
buffer of size 16777216: 0x10a0000
buffer of size 33554432: 0x20a0000
buffer of size 67108864: 0x40a0000
buffer of size 134217728: 0x80a0000
buffer of size 268435456: 0x100a0000
buffer of size 536870912: 0x200a0000
*** malloc[960]: error: Can't allocate region
buffer of size 1073741824: 0x0, address aligned to buffer size
*** malloc[960]: error: Can't allocate region
buffer of size 2147483648: 0x0, address aligned to buffer size

It appears that 16 bytes, on this system, is the least alignment (gcd).

I wonder what object that corresponds to. long double or long long or
perhaps a non-standard extension your compiler supports?
In a recent e-mail I was reading about how an allocated block sometimes
has bookkeeping information in the word or two immediately prior to the
contiguous block. I research, and find some interesting web sites
discussing this kind of thing.

A system specific possibility, but one I know mine doesn't use.
http://www.memorymanagement.org/
http://ose.sourceforge.net/browse.php?group=library-manual&entry=memory.htm

There are lots of memory managers. Their behavior is often
predictable. It's predictable that misuse of dynamic memory
allocations will cause problems.

Basically I'm not very concerned, but I do wonder about high
performance access to physical memory, but I don't have any inclination
to consider implementing a virtual memory. I look at sys/mman.h, this
is a BSD system. There appears to be discussion of pages and
"regions". Hmm... on the malloc man page, it describes a
(non-standard) function malloc_good_size. For each of the above block
sizes, the recommended "good" size is 14 bytes more than the block size
unless it is less than 8 in which case it says 14 or 16k or more in
which case it is the block size. The malloc_size function on this
implementation always returns zero, but errno is not set.

#include <stdlib.h>
#include <stdio.h>
#include <errno.h>

void buf(int n);

void info_good_size(int n){

printf("\t%u", malloc_good_size(n));
/* buf(malloc_good_size(n)); */

}
void buf(int n){

size_t msize=0;

void* buffer = NULL;
buffer = malloc(n);
printf("buffer of size %u: \t%p", n, buffer);
msize = malloc_size(buf);
if(errno) perror(NULL);
printf("\t%u", msize);

Why do you assume that size_t is unsigned int?
info_good_size(n);
if(((unsigned long)buffer) % n == 0){
printf(" A ");
}
printf("\n");
free(buffer);

}


int main(int ac, char** av){
...


buffer of size 4096: 0x44520 0 4110
buffer of size 1: 0x44520 0 14 A
buffer of size 2: 0x44520 0 14 A
buffer of size 4: 0x44520 0 14 A
buffer of size 8: 0x44520 0 14 A
buffer of size 16: 0x44520 0 30 A
buffer of size 32: 0x44520 0 46 A
buffer of size 64: 0x44520 0 78
buffer of size 128: 0x44520 0 142
buffer of size 256: 0x44520 0 270
buffer of size 512: 0x44520 0 526
buffer of size 1024: 0x44520 0 1038
buffer of size 2048: 0x44520 0 2062
buffer of size 4096: 0x44520 0 4110
buffer of size 8192: 0x45570 0 8206
buffer of size 16384: 0xa4000 0 16384 A
buffer of size 32768: 0xa8000 0 32768 A
buffer of size 65536: 0xb0000 0 65536 A
buffer of size 131072: 0xc0000 0 131072 A
buffer of size 262144: 0xe0000 0 262144
buffer of size 524288: 0x120000 0 524288
buffer of size 1048576: 0x1a0000 0 1048576
buffer of size 2097152: 0x2a0000 0 2097152
buffer of size 4194304: 0x4a0000 0 4194304
buffer of size 8388608: 0x8a0000 0 8388608
buffer of size 16777216: 0x10a0000 0 16777216
buffer of size 33554432: 0x10a0000 0 33554432
buffer of size 67108864: 0x10a0000 0 67108864
buffer of size 134217728: 0x10a0000 0 134217728
buffer of size 268435456: 0x10a0000 0 268435456
buffer of size 536870912: 0x10a0000 0 536870912
buffer of size 1073741824: 0x10a0000 0 1073741824
*** malloc[1078]: error: Can't allocate region
buffer of size 2147483648: 0x0 0 2147483648 A

If I uncomment the recursive call to buf() in info_good_size():

Segmentation fault

That is confusing.

So you see any output before the seg fault?
main calls buf(4096).
buf calls malloc_size. Who knows what happens?
buf calls info_good_size(4096).
info_good_size calls buf(4110).
buf calls malloc_size. Who knows what happens?
buf calls info_good_size(4110).
info_good_size calls buf(4124).
buf calls malloc_size. Who knows?
buf calls info_good_size(4124).
info_good_size calls buf(4138).
etc

Do you think maybe you are heading for a stack overflow?
So, it seems like instead of using getpagesize() as a default memory
block size, I should start at the least power of two greater than or
equal to getpagesize, and shift that left a bit at a time and compare
that to malloc_good_size of that until they are equal and use that.

Did I miss an earlier explanation of why you need something "more"
aligned than the default from malloc?
While that may be so, malloc_good_size is not a portable function, or
rather not in the standard C library. I could just allocate blocks of
those test sizes until one is aligned on the block boundary, and assume
that's the region.

Basically I'm wondering the most humane way to treat malloc, _the_
standard C library dynamic heap memory allocation facility, to get
memory from it. I want some nice growable arrays, and by nice meaning
the most efficient in terms of being polite to the hardware.

One common approach is to estimate a reasonable amount, including an
appropriate growth factor, for the initial request. When that is
full, use realloc() to expand. Let the library functions and the OS
worry about being hardware friendly. They are more likely to know
about it than you. It will also mean less work if you have to
transfer the code to a different system.
Standard C might be mute about these things, but programs written in
standard C are not deaf to their concerns, or if they are, it's from
too much noise.

Or possibly from recognizing that attempts to game the system this way
sometimes cause more problems than they solve.
I'm trying to determine some good practices for coding C libraries in
general, and the dynamic memory allocation is one of the

For coding libraries in general, I suggest portability is more
important than optimizations of this kind.
strengths/weaknesses, features/bugs, .... This is still in mostly
ignorance about reference counting and basically multithreading and a
processor's real single thread of execution, the active node on the
flowgraph of the flow of control. With the threading being basically
non-portable, although to a large extent the thread and lock functions
have translations in each threading environment, global variables are
to be avoided like the plague, except in a framework where they're
planned with regards to a portable thread-safe or thread-friendly
design.

I must have missed the change in subject.
So anyways, towards implementing a growable array, I don't want to
implement a heavy function for each time the array context is
initialized for a default memory block size, and for portability's sake
on compilation that value will vary anyways. It could be heuristically
determined by a program at compile time and emitted to a port.h type
file, or maybe just:

In terms of the overall execution of your application, how much is
consumed by "defining" the vector? How much by growing it? Is the
effort involved and the loss of portability worth the potential gain?
#define DEFAULT_BLOCK_SIZE 16384

Advice is appreciate, madvise() might be.

Warm regards,

Ross F.



<<Remove the del for email>>
 
R

Ross A. Finlayson

Oh, the stack overflows!

Oh.

Thank you.

I think when you say that the return of malloc is always aligned to any
data type, you mean integer and floating point value and pointer types,
but not struct types, because that is not the case. I fix the program
as you indicate and now it returns an aligned block to 16 k but not 32
or 64k. That's an example of how heuristics would depend on the many
non-standard and variable runtime samples.

About getpagesize, I guess POSIX says sysconf(_SC_PAGE_SIZE).

You're right about not needing to care very much about the memory
allocation beyond the standard malloc interface, in assuming that the
runtime memory manager is so smart an efficient that you can pass
garbage to it all day and it doesn't impact the performance of the
system. While that is so, if the memory manager for real when asked
for a 16k block, on aparticular platform for a particular runtime on
particular third Thursday's has for some reason maximized, that can be
a useful thing to know, when it doesn't get in the way of more
pragmatic things.

Another notion is that I might want to replace malloc with C++'s
new/delete. That's one reason to use these blocks, and a list of
blocks, in not knowing the ultimate size of the growable array, because
there is no realloc function, and besides realloc might be having
problems with memory fragmentation anyways. The realloc method,
whether it's used for the list of array block pointers only or a C++
replacement with new, copy iterator, and delete, realloc on the block
pointers is less of a concern because if the standard block size is
16k, the block pointers aren't reallocated unless there are more than
4k entries, for about a 64 megabyte array.

This has the empty, initialized, growable array using about 33
kilobytes of memory., with 16k default block size.

As well, in dealing with some idealized block size, the only free list
for an application memory manager need be a list of blocks, with high
and low tide and stuff.

Another notion of having the block size for sure being a power of two
is in accessing the elements in the array by an index, and determining
block and offset by the default block size by bitwise operations
instead of addition. Premature optimization is premature.

In the ideal situation, the array size is known in advance and
allocated, dynamically, to a fixed size. Thus, how often the array
would have to be expanded, or grown or resized, to a larger size,
depends on the algorithms in use and the input data. Categorizing the
patterns associated patterns of that kind of memory usage is something
to consider.

Besides memory block size and sizes convenient to the runtime, another
thing to consider, if only academically, are file cache block sizes,
and the virtual memory implementations' block size(s). You know that
the system probably operates on octets of bits instead of single bits,
in this case the memory manager might be much happier dealing with
"16k", or "best-fit block size" many bytes than some other amount.

So, I wonder about selecting the default block size. In terms of
portability and porting, for this kind of generic array, there's lots
of variability.

I had another question, about the "maximum array size" for an
initialized array variable. It looks like gcc 2.95 will write about a
megabyte to an object file, but not two. It is unclear whether there
would be image loader restrictions as well. Basically I have here
about two or three thousand tables of, say, 256 entries each, and, they
refer to each other, and, I was thinking it might be better to put them
into one initialized array instead of, on object linking and pointer
relocation, having the linker do that for thousands of tables. It
might just be more simple to dynamically allocate an array of that
size, and load the table from file, using the specified file APIs
instead of the mysterious object image linker/loader. Then, if I have
a table of 32 bit ints, I just generate two tables and then by the
platform byte order of Big- or Little-Endian, read the table in, but
that causes problems, because I might want to have the code compile
more or less directly as strongly type-safe and stuff, yet still with
the efficiency.

Warm regards,

Ross F.

memtest.c:
#include <stdlib.h>
#include <stdio.h>
#include <errno.h>

void buf(size_t n, int recurse);

void info_good_size(size_t n, int recurse){

printf("\t%u\n", malloc_good_size(n));
if(recurse!=0) { buf(malloc_good_size(n), 0); }

return;
}
void buf(size_t n, int recurse){

size_t msize=0;

void* buffer = NULL;
buffer = malloc(n);
printf("buffer of size %u: \t%p", n, buffer);
msize = malloc_size(buf);
if(errno) perror(NULL);
if(((unsigned long)buffer) % n == 0){
printf(" A ");
}
printf("\t%u", msize);
if(recurse!=0){
info_good_size(n, 1);
} else {
printf("\t%u\n", malloc_good_size(n));
}
free(buffer);

}


int main(int ac, char** av){

size_t one = 1;
int i = 0;

buf(getpagesize(), 1);

for(i=0;i<32;i++){
buf(one << i, 1);
}

return 0;
}


[space:~/Desktop/faxcog_test] space% cc -pedantic -ansi memtest.c
[space:~/Desktop/faxcog_test] space% pagesize
4096
[space:~/Desktop/faxcog_test] space% ./a.out
buffer of size 4096: 0x44520 0 4110
buffer of size 4110: 0x45570 0 4110
buffer of size 1: 0x44520 A 0 14
buffer of size 14: 0x44530 0 14
buffer of size 2: 0x44520 A 0 14
buffer of size 14: 0x44530 0 14
buffer of size 4: 0x44520 A 0 14
buffer of size 14: 0x44530 0 14
buffer of size 8: 0x44520 A 0 14
buffer of size 14: 0x44530 0 14
buffer of size 16: 0x44520 A 0 30
buffer of size 30: 0x44540 0 30
buffer of size 32: 0x44520 A 0 46
buffer of size 46: 0x44550 0 46
buffer of size 64: 0x44520 0 78
buffer of size 78: 0x44570 0 78
buffer of size 128: 0x44520 0 142
buffer of size 142: 0x445b0 0 142
buffer of size 256: 0x44520 0 270
buffer of size 270: 0x44630 0 270
buffer of size 512: 0x44520 0 526
buffer of size 526: 0x44730 0 526
buffer of size 1024: 0x44520 0 1038
buffer of size 1038: 0x44930 0 1038
buffer of size 2048: 0x44520 0 2062
buffer of size 2062: 0x45570 0 2062
buffer of size 4096: 0x44520 0 4110
buffer of size 4110: 0x45570 0 4110
buffer of size 8192: 0x45570 0 8206
buffer of size 8206: 0x47580 0 8206
buffer of size 16384: 0xa4000 A 0 16384
buffer of size 16384: 0xa8000 A 0 16384
buffer of size 32768: 0xa4000 0 32768
buffer of size 32768: 0xac000 0 32768
buffer of size 65536: 0xa4000 0 65536
buffer of size 65536: 0xb4000 0 65536
buffer of size 131072: 0xa4000 0 131072
buffer of size 131072: 0xc4000 0 131072
buffer of size 262144: 0xa4000 0 262144
buffer of size 262144: 0xe4000 0 262144
buffer of size 524288: 0xa4000 0 524288
buffer of size 524288: 0x124000 0 524288
buffer of size 1048576: 0xa4000 0 1048576
buffer of size 1048576: 0x1a4000 0 1048576
buffer of size 2097152: 0xa4000 0 2097152
buffer of size 2097152: 0x2a4000 0 2097152
buffer of size 4194304: 0xa4000 0 4194304
buffer of size 4194304: 0x4a4000 0 4194304
buffer of size 8388608: 0xa4000 0 8388608
buffer of size 8388608: 0x8a4000 0 8388608
buffer of size 16777216: 0x10a4000 0 16777216
buffer of size 16777216: 0x20a4000 0 16777216
buffer of size 33554432: 0x10a4000 0 33554432
buffer of size 33554432: 0x30a4000 0 33554432
buffer of size 67108864: 0x10a4000 0 67108864
buffer of size 67108864: 0x50a4000 0 67108864
buffer of size 134217728: 0x10a4000 0 134217728
buffer of size 134217728: 0x90a4000 0 134217728
buffer of size 268435456: 0x10a4000 0 268435456
buffer of size 268435456: 0x110a4000 0 268435456
buffer of size 536870912: 0x10a4000 0 536870912
buffer of size 536870912: 0x210a4000 0 536870912
buffer of size 1073741824: 0x10a4000 0 1073741824
*** malloc[271]: error: Can't allocate region
buffer of size 1073741824: 0x0 A 0 1073741824
*** malloc[271]: error: Can't allocate region
buffer of size 2147483648: 0x0 A 0 2147483648
*** malloc[271]: error: Can't allocate region
buffer of size 2147483648: 0x0 A 0 2147483648
[space:~/Desktop/faxcog_test] space%
 
R

Ross A. Finlayson

Hi,

Well, I'm sitting here typing on this, and my growable array has
changed into an exercise in implementing a generic function template.

I have it so that there is fc_array.h, and that contains a bunch of
macros.

#define fc_array_struct_declare(T) \
struct fc_array_##T{ \
fc_size_t a_size; \
fc_size_t a_capacity; \
fc_size_t a_type_size; \
fc_size_t a_block_size; \
fc_uint_t a_block_mask; \
fc_uint_t a_base_shift; \
fc_uint_t a_offset_mask; \
fc_array_error_state_t a_state; \
fc_array_organization_t a_org; \
fc_array_growth_t a_growth; \
fc_size_t a_block_count; \
fc_array_block_state_t* a_block_states; \
T** a_blocks; \
};

#define fc_array_struct_t(T) struct fc_array_##T
#define fc_array_func(T,X) X##_##T

/** This allocates sizeof(x_array_t) and returns a pointer for
non-automatic variables. */
#define fc_array_context_allocate(T,Ap) (Ap =
fc_array_func(T,fc_array_context_allocate_imp)() )

....

Then, there are two header files, each containing basically the
specializations for the type T.

fc_array_declare.h:

#include "fc_array.h"

fc_array_struct_declare(SPEC_T)

fc_array_struct_t(SPEC_T)*
fc_array_func(SPEC_T,fc_array_context_allocate_imp)();

....

fc_array_define.h:


#include "fc_typedefs.h"
#include "fc_facility_memory.h"

/* This allocates sizeof(x_array_t) and returns a pointer for
non-automatic variables. */
fc_array_struct_t(SPEC_T)*
fc_array_func(SPEC_T,fc_array_context_allocate_imp)(){

fc_array_struct_t(SPEC_T)* Ap;

return fc_mem_allocate_1( Ap, fc_array_struct_t(SPEC_T) ) ;

}

....

Then, when I want to implement the growable array, for example to
specialize to type int, then:

fc_array_int.h:

#ifndef h__fc_array_int_h
#define h__fc_array_int_h

#ifdef SPEC_T
#undef SPEC_T
#endif

#define SPEC_T int

#include "fc_array_declare.h"

#endif /* h__fc_array_int_h */

fc_array_int.c:

#include "fc_array_int.h"

#include "fc_array_define.h"


Then the idea is to use a growable array specialized for int as so:

#include "fc_array_int.h"

int main(){
fc_array_struct_t(int) growable;
int* intp;
size_t i = 0;

fc_array_zero(int, growable);
fc_array_init(int, growable);

for(i=0;i< (1<<16);i++){
fc_array_push_back(int, growable, 475);
}

fc_array_copy_fixed(int, growable, intp);

}

Then, intp was an int* with size at least 2^16 with each of those
values in the fixed array being 475.

Basically it uses dynamic heap allocation, but that is encapsulated in
single point-of-change files and macro redefinitions can replace the
implementation with C++ library containers, and instead of reallocating
the backing store it works on page sized or other efficient blocks.
Also, the heap allocation functions can be replaced, to use new/delete
or other platform specific heap allocation functions, or to work off of
a free list of blocks. It has the type sprinkled all over the place
because there's no compiler support for templates in C.

Currently, I'm typing on it and was staring at a raft of compiler
errors, I'm concerned about abusing the preprocessor to implement
template function specialization in C.

I don't know if it'll work because I define a symbol or "define" SPEC_T
after I define a macro fc_array_func or fc_array_struct, fc_array_ctx,
:

#define fc_array_func(T, F) F##_##T

#define SPEC_T int

void fc_array_func(SPEC_T, f)();

and it's coming through

f_SPEC_T();

instead of

f_int();

I.e.:

....
fc.a(fc_array_int.o):
00000674 T _fc_array_capacity_imp_SPEC_T
00000798 T _fc_array_concatenate_fixed_imp_SPEC_T
00000760 T _fc_array_concatenate_imp_SPEC_T
00000000 T _fc_array_context_allocate_imp_SPEC_T
00000040 T _fc_array_context_deallocate_imp_SPEC_T
00000078 T _fc_array_context_zero_imp_SPEC_T
00000728 T _fc_array_copy_imp_SPEC_T
000006f0 T _fc_array_fixed_SPEC_T
000002b4 T _fc_array_free_imp_SPEC_T
000004a8 T _fc_array_get_imp_SPEC_T
U _fc_array_grow_imp_SPEC_T
00000104 T _fc_array_init_imp_SPEC_T
....

Those should be "int" instead of "SPEC_T". I think it has to do with
the order of the definition of those macros. I can make duplicate
output macros because they're in the order they're supposed to be
already. I use other similar macro constructions and don't see why
SPEC_T isn't evaluated.

Can you refer me to other examples of implementing template function
specialization in C?

On another note, I'm wondering about good practices for basically what
I'm calling "parser interlock" in throughput and how to implement at
once the single- and multi-pass patch functions, or rather, I guess
parallelization/deparallelization, single-threaded. Never mind.

Anyways if you can help me understand the preprocessor's defined
limits, or implementation in this case, and how to go about merging
those identifiers it would be appreciated.

Thank you.

Ross F.
 
R

Ross A. Finlayson

Hi,

Well, it seems that there is a simple solution.

....
fc.a(fc_array_int.o):
0000068c T _fc_array_capacity_imp_int
00000aec T _fc_array_concatenate_fixed_imp_int
00000ab4 T _fc_array_concatenate_imp_int
00000000 T _fc_array_context_allocate_imp_int
00000040 T _fc_array_context_deallocate_imp_int
00000078 T _fc_array_context_zero_imp_int
....

Where I had macros like:

#define fc_array_func(T,X) X##_##T

I change it as so:

#define fc_array_func_in(T,X) X##_##T
#define fc_array_func(T,X) fc_array_func_in(T,X)

and then, the preprocessor expands the macro argument T before
merging/concatenating the identifiers. It replaces the identifiers
before passing it to the merge macro.

In searching comp.lang.c for "## macro expansion" and stuff I found
answers to my question. There's not yet much on "template function
specialization" on comp.lang.c.

So, the specialization of the template functions is as follows for
declaration fc_array_int.h:

#ifndef h__fc_array_int_h
#define h__fc_array_int_h

#ifdef SPEC_T
#undef SPEC_T
#endif
#define SPEC_T int

#include "fc_array_declare.h"

#endif /* h__fc_array_int_h */

and for definition the contents of fc_array_int.c:

#include "fc_array_int.h"
#include "fc_array_define.h"

And then there are specialized template functions for the growable
array, prototyped and implemented. Then, if I do want to replace that
with a C++ STL container implementation, the specialization would be
longer than ten lines.

I got to looking at the GTK toolkit, that has some interesting things
in it. Sometimes I wonder why there aren't standard containers for C,
or pools and stuff. After all, there's malloc as a heap allocation
function, I guess the C containers are considered too domain-specific.
I searched a little for various public domain containers and stuff.
When I get this specializable C growable array done I'll put it on a
web page.

Speaking of macros, one feature I thought would be convenient in the
past is a strlen macro on a string constant, eg for:

fwrite( __TIME__, 1, __strlen__(__TIME__), stdout);

That might cause problems, but for lots of those it might improve
performance.

A problem with using macros to generate function prototypes is that it
doesn't necessarily play well with other code tools. I use Doxygen all
the time to get another perspective of the code from the project
builder, mine is an old version, and hopefully in the current or a
future version it will preprocess some of the macros in terms of its
identifier listings. I am content that the compiler is quite happy
with it, however.

I like the idea of simple template functions in C, but it's cumbersome
when it doesn't work well with the other code tools. That's a reason
to use a language, or tools, with built-in support for such features,
but by the same token C is arguably the most widely-used computer
programming language.

Thank you,

Ross F.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,995
Messages
2,570,230
Members
46,819
Latest member
masterdaster

Latest Threads

Top