Dynamically-allocated Multi-dimensional Arrays

S

Serpent

The C-FAQ describes some techniques here: http://c-faq.com/aryptr/dynmuldimary.html

I was using something slightly different from the C-FAQ and I was
wondering if it was legal.

Say I want a two-dimensional array, like this:

int x[2][3];

but I want it dynamically-allocated, and I want expressions that refer
to its elements to use the common subscript syntax.

int i = x[0][1];

I also want it to be one contiguous piece of memory, to avoid overhead
and complications.

Is this legal?

int (*x)[3] = (int (*)[3]) malloc(sizeof(int) * 2 * 3);

x[0][0] = 1;
x[0][1] = ...;
x[1][1] = ...;

I don't see anything in the standard that wouldn't allow this. It is
also cleaner than what the C-FAQ mentions, which is:

int (*x)[2][3] = (int (*)[2][3]) malloc(sizeof(int) * 2 * 3);

(*x)[0][0] = 1;
(*x)[0][1] = ...;
(*x)[1][1] = ...;

which I want to avoid.

Thanks for any input. If this is illegal, a pointer into the standard
would be appreciated.

if this is legal, I will look into updating the C-FAQ.

-Frank
 
B

brent.buescher

The C-FAQ describes some techniques here:http://c-faq.com/aryptr/dynmuldimary.html

I was using something slightly different from the C-FAQ and I was
wondering if it was legal.

Say I want a two-dimensional array, like this:

int x[2][3];

but I want it dynamically-allocated, and I want expressions that refer
to its elements to use the common subscript syntax.

int i = x[0][1];

I also want it to be one contiguous piece of memory, to avoid overhead
and complications.

The second way listed in the FAQ is what you want:

int **array2 = malloc(nrows * sizeof(int *));
array2[0] = malloc(nrows * ncolumns * sizeof(int));
for(i = 1; i < nrows; i++)
array2 = array2[0] + i * ncolumns;

It happens to also be more or less the way that BLAS does it, if I
remember correctly.

Brent
 
S

Serpent

I also want it to be one contiguous piece of memory,
to avoid overhead and complications.

The second way listed in the FAQ is what you want:

int **array2 = malloc(nrows * sizeof(int *));
array2[0] = malloc(nrows * ncolumns * sizeof(int));
for(i = 1; i < nrows; i++)
array2 = array2[0] + i * ncolumns;


This looks like two pieces of memory to me. It also looks like
additional overhead.

I'm looking for a solution that uses one piece, like the one I
suggested. Is my suggestion legal C? It compiles and executes with no
out-of-bounds accesses on the platforms and compilers I've tested, and
I can't find any contradicting paragraphs in the standard, but I may
have missed something.

-Frank
 
P

pete

Serpent said:
The C-FAQ describes some techniques here: http://c-faq.com/aryptr/dynmuldimary.html

I was using something slightly different from the C-FAQ and I was
wondering if it was legal.

Say I want a two-dimensional array, like this:

int x[2][3];

but I want it dynamically-allocated, and I want expressions that refer
to its elements to use the common subscript syntax.

int i = x[0][1];

I also want it to be one contiguous piece of memory, to avoid overhead
and complications.

Is this legal?

int (*x)[3] = (int (*)[3]) malloc(sizeof(int) * 2 * 3);

x[0][0] = 1;
x[0][1] = ...;
x[1][1] = ...;

I don't see anything in the standard that wouldn't allow this.

The only thing wrong with it,
is that only one of the dimensions can be a variable.
If that's OK with you, then there's nothing wrong with it.

/* BEGIN new.c */

#include <stdio.h>
#include <stdlib.h>

#define XX 2
#define YY 3

int main(void)
{
int x, y;
int (*array)[YY];

x = XX;
array = malloc(x * sizeof *array);
if (array == NULL) {
puts("array == NULL");
exit(EXIT_FAILURE);
}
while (x-- != 0) {
for (y = 0; y != YY; ++y) {
array[x][y] = x + y;
}
}
for (x = 0; x != XX; ++x) {
for (y = 0; y != YY; ++y) {
printf("%d\n", array[x][y]);
}
}
free(array);
return 0;
}

/* END new.c */
 
B

brent.buescher

The second way listed in the FAQ is what you want:
int **array2 = malloc(nrows * sizeof(int *));
array2[0] = malloc(nrows * ncolumns * sizeof(int));
for(i = 1; i < nrows; i++)
array2 = array2[0] + i * ncolumns;


This looks like two pieces of memory to me. It also looks like
additional overhead.


You have to allocate the pointers for *array2[nrows]. A two-
dimensional array in C looks like a pointer to an array of pointers.

The reason that technique is useful is that you can dynamically
allocate an array of any dimensions at runtime.

You can also, for efficiency, use array2[0] as if it were a one-
dimensional array of size rows*columns.
I'm looking for a solution that uses one piece, like the one I
suggested. Is my suggestion legal C?

Now that I thought through what your code does, you do end up with an
array that's laid out like a statically allocated array. The
disadvantage is, you have to declare the number of columns at compile
time.

int (*x)[3] = (int (*)[3]) malloc((sizeof int) * nrows * 3);

is legal.

int (*x)[ncols] = (int (*)[ncols]) malloc((sizeof int) * nrows *
ncols);

is not.

And now that I've thought it through, I remember there's a long
discussion of these matters in Peter van Der Linden's _Expert C
Programming: Deep C Secrets_.

Brent
 
B

brent.buescher

int (*x)[3] = (int (*)[3]) malloc((sizeof int) * nrows * 3);
is legal.
int (*x)[ncols] = (int (*)[ncols]) malloc((sizeof int) * nrows *
ncols);
is not.

Isn't the second form legal as of C99 ?

I don't know. So you have a pointer to an array of ncols ints. C99
should let you allocate that, right? And OK, it ends up getting
pointed to a contiguous block of ints on the heap that's ncols*nnrows
big. Seems like it should work.

Note that in that case you could just write

int x[ncols][nrows];

couldn't you? Or does that approach gives you an Iliffe vector like
in the example I pulled from the FAQ? It does apparently allocate the
array on the stack, which you might want to avoid.

As an aside, is it safe to trust real-world implementations of C99
variable-size arrays yet?

Brent
 
C

Chris Torek

The C-FAQ describes some techniques here:
http://c-faq.com/aryptr/dynmuldimary.html

I was using something slightly different from the C-FAQ and I was
wondering if it was legal.

Say I want a two-dimensional array, like this:

int x[2][3];

but I want it dynamically-allocated, and I want expressions that refer
to its elements to use the common subscript syntax.

int i = x[0][1];

I also want it to be one contiguous piece of memory, to avoid overhead
and complications.

Is this legal?

int (*x)[3] = (int (*)[3]) malloc(sizeof(int) * 2 * 3);

Yes, but it is better to write, e.g.:

int (*x)[3];
...
x = malloc(n * sizeof *x);

where "n" is the number of rows to use. If the malloc() succeeds,
you can then do:

for (i = 0; i < n; i++)
for (j = 0; j < 3; j++)
... operate on x[j] ...

Note that the number of columns is fixed: it must always be exactly
three. If you want the number of columns to be variable as well,
you have a problem.

In C99 (but not older versions of C), there is a new feature called
a "variable length array" or VLA. This new feature allows you to
change the number of columns, as well as the number of rows. Here,
you might write, e.g.:

void do_something(size_t m, size_t n) {
int arr[m][n];
size_t i, j;

for (i = 0; i < m; i++)
for (j = 0; j < n; j++)
... operate on arr[j] ...

/* "arr" is automatic, so it vanishes when we return */
}

If the array needs to have "allocated" storage duration, i.e., come
from malloc(), rather than having automatic storage duration, you
can still do that:

void do_more(size_t m, size_t n) {
int (*p)[n];
size_t i, j;

p = malloc(m * sizeof *p);
if (p == NULL)
... handle error ...

for (i = 0; i < m; i++)
for (j = 0; j < n; j++)
... operate on p[j] ...
}

but now there is an annoying problem: the type of "p" is "pointer
to VLA (of size n) of int", where n is determined by the call to
do_more(). So you must give up some degree of type-safety in order
to do something useful with the value in p (such as store it in a
return value, or store it in some "global" variable, or whatever)
before returning. It then becomes very easy to store the pointer
in a VLA of different "shape" (e.g., "pointer to VLA (of size 99)
of int" instad of "pointer to VLA (of size 33) of int"). If you
do this, tragedy will ensue. Hence, while VLAs are quite useful,
they are not a panacea.

If you do not have C99 (most people still do not, or at least not a
full version) and/or are unwilling to depend on having at least the
VLA feature, you have just two options.

A) Use a minimum of two malloc() calls: one to create the space
for the array, and one to create the space for a vector of
"row pointers". This technique is shown in the FAQ. It is
tempting to use one malloc() to create both regions, but to
do so invites the Gods of Alignment to strike your program
down when you least expect it. :) (More seriously: some
machines have particular alignment constraints, and getting
both the row-vector *and* the data-area aligned requires
machine dependent trickery. The malloc() function contains
exactly this trickery, and calling it twice makes both areas
properly aligned.)

B) Use one malloc() call to allocate the data area, then do your
own "manual" calculations:

int *do_it_with_C89(size_t m, size_t n) {
int *p;
size_t i, j;

p = malloc(m * n * sizeof *p);
if (p == NULL)
... handle error ...
for (i = 0; i < m; i++)
for (j = 0; j < n; j++)
... operate on p[i * n + j] ...
return p;
}

Method (B) works in C99 as well, of course. While it is type-safe,
it is easy to make mistakes in remembering "m" and/or "n" and/or
doing the subscript calculation. You can minimize the chances of
such problems by using a structure to encapsulate everything:

/* in some header */
struct int_matrix {
size_t m;
size_t n;
int *p;
};
#ifndef NDEBUG
#define INT_MAT_ELEM(mp, i, j) \
(assert((i) < (mp)->m && (j) < (mp)->n), (mp)->p[(i) * (mp)->n + (j)])
#else
#define INT_MAT_ELEM(mp, i, j) ((mp)->p[(i) * (mp)->n + (j)])
#endif

/* in some source file that includes the above header, plus <stdlib.h> */
struct int_matrix *int_mat_create(size_t m, size_t n) {
struct int_matrix *mp;
size_t i, end;

mp = malloc(sizeof *mp);
if (mp == NULL)
return NULL;
mp->p = malloc(m * n * sizeof *mp->p);
if (mp->p == NULL) {
free(mp);
return NULL;
}
mp->m = m;
mp->n = n;

/* init to all-zeros */
for (i = 0, end = m * n; i < end; i++)
mp->p = 0;

return mp;
}

void int_mat_destroy(struct int_matrix *mp) {
free(mp->p);
free(mp);
}

/* in some file that uses the integer matrix */
...
...
struct int_matrix *p;

p = int_mat_create(m, n);
if (p == NULL)
... handle error ...
...
INT_MAT_ELEM(p, i, j) = 42;
... other operations on INT_MAT_ELEM(p, i, j) ...

It is a little peculiar to have a function-like macro that expands
to an lvalue (so that you can assign to INT_MAT_ELEM(p,i,j) as in
the example above), but it should work.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,995
Messages
2,570,230
Members
46,819
Latest member
masterdaster

Latest Threads

Top