c99 multidimensional arrays contiguous?

Luca Forlizzi

Sorry, in my previous post I mistyped a sentence. The original text comes
first, followed by the corrected text.
The problem that I have is not exactly the array size. You informally
say *array pointed into*
but the standard says "pointer operand points to an element of an
array object".
What exactly is an element of an array. In my mind, since X is an
array of arrays, the elements of X are arrays,
so p2 and p3 do not point to element of X (so 6.5.6 p8 does not apply
to them and X).

The problem that I have is not exactly the array size. You informally
say *array pointed into* but the standard says "pointer operand points
to an *element* of an
array object". I am not convinced that the objects pointed to by p2 and
p3 are elements of X.
What exactly is an element of an array? In my mind, since X is an
array of arrays, the elements of X are arrays,
so p2 and p3 do not point to elements of X (so 6.5.6 p8 does not apply
to them and X).
 
Seebs

I hope the following clarification helps, because I am not sure we've
got to the bottom of Luca's question. I'm not replying to you to tell
you stuff, I am hanging this post here because this is the place with
the right context.

Sounds fine to me. I'm not totally sure I have this right, myself.
int X[10][10];
int *p1 = &X[0][0]; /* or = X[0]; since X[0] gets converted */
int *p2 = (void *)&X[0]; /* or = X; since X gets converted */
int *p3 = (void *)&X;
(I've used void * simply to avoid questions about implementation
defined conversions and I've written the addresses in the most
explicit form I can, without any array to pointer conversions. I've
also used X as the array name because 'a' is confusing in English
text).
Okay.

I would summarise the majority view as being that p1[10] is an invalid
access and that p3[10] (indeed p3[99]) is valid. Luca's example is
the same as p2 and I think the majority view is that p2[10] is also
fine.

I would agree.
The array X consists of 11 arrays -- the whole one and 10 sub-arrays.
The central question is what is the array into which the various
p[1-3] pointers point?

Good point. This is like those brain-teasers where you see a triangle
subdivided into smaller triangles, and you're supposed to figure out
how many triangles there are.
p1 points to an element of X[0] so it is natural to deduce the array
over which it ranges is just X[0] and not X as a whole. I don't think
there is any support in the standard for the idea that p1 points to an
element of X, at least not formally.
Right.

p2 is converted from a pointer that clearly points to an element of
X (the first one) so here one can reasonably say that the converted
pointer may range over the whole of X.

Yes. Now, for an illustration of the boundary, consider:
int *p4 = (void *)X[0];

I think that, because "X[0]" decays into "the address of X[0][0]", this
is more like p1 than like p2.

Basically, the "&" jumps you out a level.

int i = X[0][0]; /* not useful as a pointer to anything */
/* = int */
int *q1 = (void *) &X[0][0]; /* a pointer to the contents of X[0] */
/* = pointer to int */
int *q2 = (void *) X[0]; /* a pointer to the contents of X[0] */
/* = array[10] of int, => pointer to int */
int *q3 = (void *) &X[0]; /* a pointer to the contents of X */
/* = pointer to array[10] of int */
int *q4 = (void *) X; /* a pointer to the contents of X */
/* = array[10] of array[10] of int,
=> pointer to array[10] of int */
int *q5 = (void *) &X; /* a pointer to the contents of X */
/* = pointer to array[10] of array[10] of int */

Looking at the types (see the /* = <type> */ comments), you can see
that q1 and q2 have the same type, and q3 and q4 have the same type.
q5's really a different case, but since X itself is not a member
of an array, it's treated as an array[1] of its own type, so everything
still works.
So &X is considered (for this purpose) to be a pointer to the
first element of a one-element array. Thus "the array" referred to in
paragraph 8 is the whole of X.

Yes! I believe that's the key -- &X is a pointer to a one-element array
of arrays of 10 arrays of 10 integers. So it can be used to point to anything
in any of those 10 arrays of 10 integers.
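
A minimal sketch of the point about types, assuming a C99 compiler. The function names are made up for illustration. All five spellings give the same address once converted to void *; what differs is the size of the object each expression designates, which is what the argument above turns on.

```c
#include <assert.h>
#include <stddef.h>

static int X[10][10];

/* Returns 1 if all five spellings convert to the same void * address. */
int same_address(void)
{
    void *a1 = &X[0][0]; /* pointer to int */
    void *a2 = X[0];     /* array[10] of int, converted to pointer to int */
    void *a3 = &X[0];    /* pointer to array[10] of int */
    void *a4 = X;        /* converted to pointer to array[10] of int */
    void *a5 = &X;       /* pointer to array[10] of array[10] of int */
    return a1 == a2 && a2 == a3 && a3 == a4 && a4 == a5;
}

/* Number of ints in the object each kind of pointer points to. */
size_t ints_in_element(void) { return sizeof *&X[0][0] / sizeof(int); }
size_t ints_in_row(void)     { return sizeof *&X[0] / sizeof(int); }
size_t ints_in_whole(void)   { return sizeof *&X / sizeof(int); }
```

The sketch only compares addresses and sizes; it does not settle which accesses through the converted int * are defined.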

-s
 
Seebs

(context)
int X[10][10];
You (and Peter, too) seem to imply that the int objects that are
elements of the elements of X are also elements
of X (i.e. "being element" is a transitive relation). But I can't find
this in the standard.

It's not immediately obvious, but:

Consider the array object X[1] (not the pointer it decays into the moment
you refer to it in an expression, but the thing that is the operand of
sizeof(X[1])).

Clearly, X[1] is an element of X, so a pointer derived from X can be used
to access X[1].

But what can you do with X[1]? It's an array. You can look at its members.

Basically, you can use a pointer derived from X to look at X[1][5] for the
same reason that you can use a pointer derived from X[0][0] to iterate
over the individual bytes in the object; once you have access to an object,
you're allowed to access its components.

Think about, as an example:
struct abc { int a, b, c; };
struct abc Y[10] = { { 0 } };
struct abc *p = &Y[0];
++p;
p->a = 1;

In other words: Yes, being an element-of is transitive. If you can access
an aggregate object, you can access its members.
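
The struct example above can be wrapped into something that compiles and can be checked. This is only a sketch; the function name is hypothetical.

```c
#include <assert.h>

struct abc { int a, b, c; };

/* Sets member a of Y[1] through a pointer derived from Y and returns it. */
int set_second_a(void)
{
    struct abc Y[10] = { { 0 } };
    struct abc *p = &Y[0]; /* points to an element of Y */

    ++p;      /* still within Y: now points to Y[1] */
    p->a = 1; /* access a member of the element p points to */

    assert(p == &Y[1]);
    return Y[1].a;
}
```

Nobody disputes this case: p ranges over the elements of Y, and member access reaches inside whichever element p currently points to.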

-s
 
Seebs

Seebs said:
sizeof(<type> [10]) == sizeof(<type>) * 10

There's no room for any padding. The array of 10 ints must have size
precisely 10*sizeof(int). The array of 100 ints must have size
precisely 10*(10*sizeof(int)).

So a is a region of 100*sizeof(int) bytes, while a[0] is a region of
10*sizeof(int) bytes.

Yes, so you have the sizes stitched up neatly. Now consider fat
pointers.

The bounds stored in the fat pointers have to match the size of the object,
and casting allows you to view them as different types, as long as you
know for sure what the layout is. Which you do, with arrays.
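
The size relation quoted above can be checked directly. This is a sketch with hypothetical function names; the checks are expected to hold on mainstream implementations, and the code does not try to show that the standard's wording requires them.

```c
#include <assert.h>
#include <stddef.h>

static int a[10][10];

/* Each function returns 1 when the corresponding size relation holds. */
int row_size_ok(void)   { return sizeof a[0] == 10 * sizeof(int); }
int whole_size_ok(void) { return sizeof a == 10 * sizeof a[0]; }

/* Element counts recovered by the usual division idiom. */
size_t row_count(void)  { return sizeof a / sizeof a[0]; }
size_t col_count(void)  { return sizeof a[0] / sizeof a[0][0]; }
```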

-s
 
Ben Bacarisse

Luca Forlizzi said:
I hope the following clarification helps, because I am not sure we've
got to the bottom of Luca's question.  I'm not replying to you to tell
you stuff, I am hanging this post here because this is the place with
the right context.

It does help me! But I am still not convinced that p2 and p3 can
legally access any int inside X
(see below).
There are three situations:

  int X[10][10];
  int *p1 = &X[0][0];      /* or = X[0]; since X[0] gets converted */
  int *p2 = (void *)&X[0]; /* or = X; since X gets converted */
  int *p3 = (void *)&X;

<snip>

I would summarise the majority view as being that p1[10] is an invalid
access and that p3[10] (indeed p3[99]) is valid.  Luca's example is
the same as p2 and I think the majority view is that p2[10] is also
fine.

The arguments all revolve around 6.5.6 p8 about adding to a pointer.
That clause defines the result of the addition only when the result is
within the array pointed "into" by the pointer.  Specifically: "if the
pointer operand points to an element of an array object, and the array
is large enough...".
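
As a sketch of the uncontroversial use of 6.5.6 p8 (function names are hypothetical): the pointer may be advanced to one past the last element of the array it points into, but it is only dereferenced while strictly inside that array. Here the array is a single row, X[1].

```c
#include <assert.h>
#include <stddef.h>

/* Sums n ints starting at first, using a one-past-the-end pointer as the stop. */
int sum_range(const int *first, size_t n)
{
    const int *end = first + n; /* may point one past the last element */
    int total = 0;

    for (const int *p = first; p != end; ++p)
        total += *p; /* p is always strictly inside the array here */
    return total;
}

/* Applies sum_range to a single row, so the arithmetic stays inside X[1]. */
int sum_second_row(void)
{
    int X[10][10] = { { 0 } };

    for (int j = 0; j < 10; ++j)
        X[1][j] = j + 1;
    return sum_range(X[1], 10);
}
```

Calling sum_range(X[1], 20) instead would be the p1[10] case from the summary above, which the majority view treats as invalid.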

[I have imported a correction from another post (marked with |) into
this paragraph in an attempt to keep everything in one thread.]
The problem that I have is not exactly the array size. You
informally say *array pointed into* but the standard says "pointer
operand points to an element of an array object".
| I am not convinced that the objects pointed to by p2 and p3 are
| elements of X.
What exactly is an element of an array? In my mind, since X is an
array of arrays, the elements of X are arrays, so p2 and p3 do not
point to elements of X (so 6.5.6 p8 does not apply to them and X).

That's a good point. I don't have any sound chapter and verse to
counter it. What I can do is explain a bit more about the way I read
things.

There is no doubt that &X[0] points to an element of X (the first
sub-array) and so must (void *)&X[0]. (Converting a pointer to void *
does not change what the pointer points to). I would extend that to
the converted int *. In other words, my view would be that p2 is a
pointer to an element of X even though it is not of the natural type
for such a pointer. A similar argument applies to p3 -- it is an
unnatural type for a pointer to the one and only element pointed to by
&X.

I fully accept that there is then a leap from this converted pointer
to allowing it to be a "pointer to an element" (of the now converted
type) for the purposes of the pointer arithmetic described in 6.5.6 p8
but it does not seem an unnatural or excessively contrived leap.
You (and Peter, too) seem to imply that the int objects that are
elements of the elements of X are also elements
of X (i.e. "being element" is a transitive relation). But I can't find
this in the standard.

That's not quite how I see it. I don't think the argument above is
exactly the same as "element of" being transitive. For example, if
the standard included wording that supports the idea that an element of
a sub array was an element of the containing array, then p1 would be
able to range over the whole of X. At least, I think that is how I
would then have to read 6.5.6 p8.
Please note that I would love yours to be the right interpretation of
the standard, I find it more
comfortable and close to real usage of the language.

I am not wedded to any particular reading, and your arguments have
made me think again. I'd be happy if "element of" were to be either
defined to be transitive or if that could be deduced from the current
wording. I have no objection to 6.5.6 p8 having a looser meaning
where any pointer into a sub-array can range over the whole array.
I'd be less happy with your very strict reading, though I think I
could live with even that.

BTW, when I say "my reading" and so forth, all I mean is my reading as
coloured by the various arguments on this matter that have happened
here. There is no original interpretation here, just a regurgitation
of those arguments that I've found to be persuasive in the past.
 
Seebs

That's not quite how I see it. I don't think the argument above is
exactly the same as "element of" being transitive. For example, if
the standard included wording that supports the idea that an element of
a sub array was an element of the containing array, then p1 would be
able to range over the whole of X. At least, I think that is how I
would then have to read 6.5.6 p8.

Ooh, you have a point.

I think the problem is that the *intent* (clearly agreed on by everyone
I've talked to) is that obviously pointers derived from the sub-arrays
can only be used to walk those sub-arrays, while pointers derived from
the big array can be used to walk the whole big array.

It's there to match the intended semantics of bounds checking. Basically,
imagine that you have an array of (array[80] of char) which maps onto the
display for an old greenscreen terminal. Clearly, if you hand someone
a pointer into an 80-character string, you *intend* for it to be a boundary
violation for them to go past the end of that string.
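
A sketch of that terminal example, assuming a 24-row screen. ROWS, COLS, screen and put_line are names invented here. The pointer handed out is derived from one row, and the code never moves it past that row's 80 characters.

```c
#include <assert.h>
#include <string.h>

enum { ROWS = 24, COLS = 80 }; /* hypothetical screen size */

static char screen[ROWS][COLS];

/* Writes text into one row, truncating so the row's 80 chars are never exceeded. */
size_t put_line(int row, const char *text)
{
    assert(row >= 0 && row < ROWS);

    char *line = screen[row]; /* pointer derived from the sub-array */
    size_t n = strlen(text);

    if (n > COLS)
        n = COLS;
    memcpy(line, text, n); /* stays within screen[row] */
    return n;
}
```

A bounds-checking implementation could legitimately trap if put_line wrote to line[80], even though screen[row + 1][0] lives at that address.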

-s
 
Phil Carmody

Seebs said:
This conclusion is clear and logical, but is it really deducible from
the standard?

Yes.

sizeof(<type> [10]) == sizeof(<type>) * 10

I thought that this was loopholed a few months ago. IIRC, All that is
required explicitly is that (the normative part is to be interpreted
such that):
There's no room for any padding.

less than sizeof(<type>) trailing padding would be permitted, as
the rounding down would make the division in the example yield
the correct count.

Of course, this is bizarre; but I suspect that it's available as a
plugin module for the DS9000.

I'm also sure that within the last 6 months at least one member
of the C standardisation committee has indicated that what you
say was the committee's intention (I'm sure Larry was one).

Phil
 
Ersek, Laszlo

Seebs said:
a is an array of 10 arrays of ints.  The integers in each array are
necessarily contiguous, and the arrays are necessarily contiguous, so
it's going to point to a region containing 100 integers.  If you derive
a pointer from a, it is a pointer into that whole object.
This conclusion is clear and logical, but is it really deducible from
the standard?

Yes.

sizeof(<type> [10]) == sizeof(<type>) * 10

I thought that this was loopholed a few months ago. IIRC, All that is
required explicitly is that (the normative part is to be interpreted
such that):
There's no room for any padding.

less than sizeof(<type>) trailing padding would be permitted, as
the rounding down would make the division in the example yield
the correct count.

Of course, this is bizarre; but I suspect that it's available as a
plugin module for the DS9000.

I'm also sure that within the last 6 months at least one member
of the C standardisation committee has indicated that what you
say was the committee's intention (I'm sure Larry was one).


X-ludens comp.lang.c.moderated:25727
From: "Clive D. W. Feather" <[email protected]>
Subject: Re: [comp.lang.c.moderated] Does "sizeof array" equal "nel * sizeof
Date: Wed, 13 Jan 2010 16:02:02 -0600 (CST)
Message-ID: <[email protected]>


X-ludens comp.lang.c.moderated:25755
From: "Clive D. W. Feather" <[email protected]>
Subject: Re: [comp.lang.c.moderated] Does "sizeof array" equal "nel * sizeof
Date: Wed, 20 Jan 2010 15:26:17 -0600 (CST)
Message-ID: <[email protected]>


I think.

lacos
 
Phil Carmody

Seebs said:
sizeof(<type> [10]) == sizeof(<type>) * 10
....
I'm also sure that within the last 6 months at least one member
of the C standardisation committee has indicated that what you
say was the committee's intention (I'm sure Larry was one).


X-ludens comp.lang.c.moderated:25727
From: "Clive D. W. Feather" <[email protected]>
Subject: Re: [comp.lang.c.moderated] Does "sizeof array" equal "nel * sizeof
Date: Wed, 13 Jan 2010 16:02:02 -0600 (CST)
Message-ID: <[email protected]>


X-ludens comp.lang.c.moderated:25755
From: "Clive D. W. Feather" <[email protected]>
Subject: Re: [comp.lang.c.moderated] Does "sizeof array" equal "nel * sizeof
Date: Wed, 20 Jan 2010 15:26:17 -0600 (CST)
Message-ID: <[email protected]>


I think.

Thanks for doing the sniffing, that looks like it.

Phil
 
Luca Forlizzi

Having read yours and Peter's posts I now fully believe that the
intent of the standard is
according to your interpretation. I like this because it is then
strictly conforming to
"flatten" a multidimensional array.

I fully accept that there is then a leap from this converted pointer
[snip]
That's not quite how I see it.  I don't think the argument above is
exactly the same as "element of" being transitive.  For example, if
the standard included wording that supports the idea that an element of
a sub array was an element of the containing array, then p1 would be
able to range over the whole of X.  At least, I think that is how I
would then have to read 6.5.6 p8.

You are right, it cannot be a transitive relation. If that were the
case there would not be a unique initial element of an array.
I think that the basic problem is that the intent is that the
validity of the indexing depends on how the pointer has been
generated, but the wording in 6.5.6 p8 does not say this. It just
refers to a pointer "into" the array; it does not mention how the
pointer has been generated.

I have to say that I would be happier if the next standard could
clarify the intent. English is not my mother language and if I had
just read it without reading also c.l.c. and other newsgroups, I would
have been completely sure of my previous interpretation.

Thanks for your help!
 
Luca Forlizzi

There is no doubt that &X[0] points to an element of X (the first
sub-array) and so must (void *)&X[0].  (Converting a pointer to void *
does not change what the pointer points to).  I would extend that to
the converted int *.  In other words, my view would be that p2 is a

does the standard guarantee that converting (void *)&X[0] to int gives
a pointer to the first of the ints
inside X[0] ? I was only able to find that 6.3.2.3 p7 says that
converting to a character type we have
a pointer to the first byte.
Please note that in the case of the implicit array-to-pointer
conversion this is stated explicitly, and
so is the analogous statement for structs and unions.
The fact that it's not explicit makes me suspect that this is not
guaranteed.
Am I too pedantic? :)
 
Barry Schwarz

There is no doubt that &X[0] points to an element of X (the first
sub-array) and so must (void *)&X[0].  (Converting a pointer to void *
does not change what the pointer points to).  I would extend that to
the converted int *.  In other words, my view would be that p2 is a

does the standard guarantee that converting (void *)&X[0] to int gives

I assume you meant int* here since converting it to an int will cause
it not to point anywhere.
a pointer to the first of the ints
inside X[0] ? I was only able to find that 6.3.2.3 p7 says that

You neglected to tell us what X was since you changed threads but if
it was an N dimensional array of int, then yes.
converting to a character type we have
a pointer to the first byte.
Please note that in the case of the implicit array-to-pointer
conversion this is stated explicitly, and
so is the analogous statement for structs and unions.

What implicit conversion are you referring to?
The fact that it's not explicit makes me suspect that this is not
guaranteed.

What is not explicit?
Am I too pedantic? :)

We won't know till you fill in the holes.
 
Ben Bacarisse

Luca Forlizzi said:
There is no doubt that &X[0] points to an element of X (the first
sub-array) and so must (void *)&X[0].  (Converting a pointer to void *
does not change what the pointer points to).  I would extend that to
the converted int *.  In other words, my view would be that p2 is a

does the standard guarantee that converting (void *)&X[0] to int gives
a pointer to the first of the ints
inside X[0] ? I was only able to find that 6.3.2.3 p7 says that
converting to a character type we have
a pointer to the first byte.

(1) I would not lose a moment's sleep over it. :)

(2) Yes, one can probably deduce this from the standard. I'd start
with 6.2.5 ("Types") p27:

"A pointer to void shall have the same representation and alignment
requirements as a pointer to a character type. [...]"

to which is attached a footnote (all together now: "non-normative"):

"The same representation and alignment requirements are meant to
imply interchangeability as arguments to functions, return values
from functions, and members of unions."
Please note that in the case of the implicit array-to-pointer
conversion this is stated explicitly, and
so is the analogous statement for structs and unions.
The fact that it's not explicit makes me suspect that this is not
guaranteed.
Am I too pedantic? :)

You can't use the above argument for (int *)&X[0], of course, but,
again, I wouldn't lose sleep over it.
 
Tim Rentsch

Luca Forlizzi said:
Having read yours and Peter's posts I now fully believe that the
intent of the standard is
according to your interpretation. I like this because it is then
strictly conforming to
"flatten" a multidimensional array.

[snip]
I have to say that I would be happier if the next standard could
clarify the intent. English is not my mother language and if I had
just read it without reading also c.l.c. and other newsgroups, I would
have been completely sure of my previous interpretation.

Luca,

Most of what I would have said in this thread has been said
already (comments by Ben Bacarisse are an excellent example),
so I have only a little to add, but I would like to express
those additions.

First, the Standard does not do a good job of clarifying
what is intended here. I have looked and looked for
something that would give some sort of indication for
when accesses are allowed and when they aren't, and there
is precious little to find (or at least that I've found).
As far as I know the only remarks explicitly related to
this general question about multi-dimensional arrays are
non-normative -- I think one is in a footnote and the
other is in a non-normative Annex. And there are _no_ remarks
anywhere in the Standard (as far as I know) that explicitly
say anything about the specific case of re-interpreting
a multi-dimensional array as a single dimensional array.

Second, the problem is complicated by the Standard using
the word "object" with two related but still very distinct
meanings, namely, one, just as a region of storage, and
two, as _the_ region of storage defined by an identifier
or by a component of that identifier if the top-level
type is an aggregate or union type. To ask a variation
of one of the points brought up in the thread, given the
declaration

int x[100];

how many arrays are there? Hint: the answer is an awful
lot! (Working assumption -- arrays of a given type all
have the same alignment no matter how many elements the
array has. This assumption is not guaranteed by the
Standard but it is true of all implementations that I know
of.) Ready? The answer is there are, at a guess, tens
of thousands of arrays. Just for starters, all the
one-dimensional sub-arrays, such as

*(int (*)[1])x, *(int (*)[2])x, *(int (*)[3])x, ...
*(int (*)[99])(x+1), *(int (*)[98])(x+2), *(int (*)[97])(x+3), ...
... /* you get the idea */

That's about 5000 right there. Now add in all the two-dimensional
sub-arrays, all the three-dimensional sub-arrays, all the four ...
again you get the idea. Each of these different regions of
storage, and also the different dimensional overlays that go
on top of them, can make a different "array object". For example,
the two arrays '*(int (*)[2][5])x' and '*(int (*)[5][2])x' occupy
the same region of storage, and are both two-dimensional, yet are
very different arrays. Which of these thousands of different arrays
count as far as what indexing operations are allowed? The Standard
says almost nothing that elucidates the question.
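
The "about 5000" figure can be reproduced with a short sketch (the function names are hypothetical). It counts every (start, length) pair that fits inside an n-element array, and separately confirms that the two overlays mentioned above cover the same number of bytes while being different array types.

```c
#include <assert.h>
#include <stddef.h>

/* Counts (start, length) pairs with length >= 1 and start + length <= n. */
size_t count_subarrays(size_t n)
{
    size_t count = 0;

    for (size_t start = 0; start < n; ++start)
        for (size_t len = 1; start + len <= n; ++len)
            ++count;
    return count;
}

/* Two overlays of the same 10 ints: same size, different element types. */
int overlays_same_size(void)
{
    return sizeof(int[2][5]) == sizeof(int[5][2]);
}
```

For n = 100 the count is 100 * 101 / 2 = 5050, which includes the whole array itself.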

Given all of this confusion, what are we to conclude?
Speaking for myself, I have concluded two things. One,
this area is one where the Standard is pretty weak in
describing what requirements are intended. It's regrettable
that that's true, but I accept that it probably won't
change much. Two, I think there is a general understanding
"in the community" that what's allowed is determined by
where a pointer value comes from -- where it comes from
in terms of program identifiers if these can be determined
statically, or where the value comes from dynamically if
there is no static analysis available. I believe this
view has even gotten some level of official blessing (I
recall it being discussed in some DR's or something but
I don't have any references), however, whether it has or
not I've decided for myself to take the "communal wisdom"
viewpoint as the official stance for the C language.
(At least, that is, until some later official statement
changes it. :)

Thanks for your help!

I hope my comments have also helped at least in
explaining why the issue is so murky and why
despite that there is still a pretty strong
consensus about what the Standard "intends" as
requirements for array-reshaping conversions.
 
Luca Forlizzi

(1) I would not lose a moment's sleep over it. :)

agreed :)

(2) Yes, one can probably deduce this from the standard.  I'd start
with 6.2.5 ("Types") p27:

  "A pointer to void shall have the same representation and alignment
  requirements as a pointer to a character type. [...]"

to which is attached a footnote (all together now: "non-normative"):

  "The same representation and alignment requirements are meant to
  imply interchangeability as arguments to functions, return values
  from functions, and members of unions."

So this basically means that converting a pointer p to void * gives a
pointer to the first byte,
just as converting p to unsigned char * does, doesn't it?

thanks for answering my silly questions!
Luca
 
Luca Forlizzi

I assume you meant int* here since converting it to an int will cause
it not to point anywhere.

yes, excuse me
a pointer to the first of the ints
inside X[0] ? I was only able to find that 6.3.2.3 p7 says that

You neglected to tell us what X was since you changed threads but if
it was an N dimensional array of int, then yes.

yes again
What implicit conversion are you referring to?

the conversion described in 6.3.2.1 p3
What is not explicit?

the fact that converting a pointer to an array to the type of an array
element, gives a pointer to the initial element
We won't know till you fill in the holes.

excuse me for being sloppy.
Anyway, Ben's answer has satisfied my curiosity!

Luca
 
Luca Forlizzi

I hope my comments have also helped at least in
explaining why the issue is so murky and why
despite that there is still a pretty strong
consensus about what the Standard "intends" as
requirements for array-reshaping conversions.

Yes, they have.
I find your conclusions reasonable and convincing.
I will adopt the community consensus as my rule, unless an official
statement addresses the issue.

Thanks a lot to all of you for your help
 
Ben Bacarisse

Luca Forlizzi said:
(1) I would not lose a moment's sleep over it. :)

agreed :)
(2) Yes, one can probably deduce this from the standard.  I'd start
with 6.2.5 ("Types") p27:

  "A pointer to void shall have the same representation and alignment
  requirements as a pointer to a character type. [...]"

to which is attached a footnote (all together now: "non-normative"):

  "The same representation and alignment requirements are meant to
  imply interchangeability as arguments to functions, return values
  from functions, and members of unions."

So this basically means that converting a pointer p to void * gives a
pointer to the first byte,
just as converting p to unsigned char * does, doesn't it?

Not explicitly but it is hard to see what else could be intended.

The trouble is that the standard does not say what a pointer points
to, and it does so for good reason:

extern void *p;
short *sp = p;
int *ip = p;

Provided everything is properly aligned, *sp and *ip refer to
different objects in one sense of the word. In another, they point to
the same object that p pointed to from the start. Tim has expanded on
this in another post, though with reference to the multiple objects in
a single array.
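
A sketch of the same situation with real storage behind p, assuming malloc succeeds and relying on malloc returning suitably aligned memory. The function name is hypothetical. It checks only that both converted pointers still compare equal to p; it deliberately does not read or write through them.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns 1 if both converted pointers still compare equal to the original. */
int same_start(void)
{
    void *p = malloc(sizeof(int)); /* malloc storage is suitably aligned */

    if (p == NULL)
        return 0;

    short *sp = p;
    int *ip = p;
    int same = ((void *)sp == p) && ((void *)ip == p);

    /* *sp and *ip would designate objects of different types, and
       typically different sizes, starting at that one address. */
    free(p);
    return same;
}
```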

<snip>
 
Luca Forlizzi

Not explicitly but it is hard to see what else could be intended.
ok


The trouble is that the standard does not say what a pointer points
to, and it does so for good reason:

   extern void *p;
   short *sp = p;
   int *ip = p;

Provided everything is properly aligned, *sp and *ip refer to
different objects in one sense of the word.  In another, they point to
the same object that p pointed to from the start.  Tim has expanded on
this in another post, though with reference to the multiple objects in
a single array.

I see. It is important to keep in mind this ambiguity in the standard,
thanks
for pointing it out.

I have no more questions for now, thanks a lot!

Luca
 
David Thompson

Sorta yes, sorta no.

1. They must indeed be laid out contiguously in memory.
2. If you derive a pointer from one of the sub-arrays, you should not
then try to derive pointers outside that sub-array from it.
And it's not clear if you meant this to be important to your question,
but this is not new in C99 -- the same was true in C89. Before that,
you always had contiguity (back to 'prehistory' in dmr's terms)
and de facto usually had 'all pointer arithmetic works' based on
the B heritage of a single flat address-space.
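
One conservative way to get "flat" access while respecting point 2 is to map the flat index back to a row and column, so that every access indexes within a single sub-array. This is a sketch with invented names (N, flat_get, flat_sum, demo_flat_sum), not something proposed in the thread.

```c
#include <assert.h>
#include <stddef.h>

enum { N = 10 };

/* Reads "flat" index k of X using only in-bounds indexing of each sub-array. */
int flat_get(int (*X)[N], size_t k)
{
    return X[k / N][k % N];
}

/* Sums all N*N elements through the flat index. */
int flat_sum(int (*X)[N])
{
    int total = 0;

    for (size_t k = 0; k < (size_t)N * N; ++k)
        total += flat_get(X, k);
    return total;
}

/* Fills a 10x10 array with ones and sums it through the flat index. */
int demo_flat_sum(void)
{
    int X[N][N];

    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j)
            X[i][j] = 1;
    return flat_sum(X);
}
```

This sidesteps the question of whether a single int * may walk all 100 elements, at the cost of a division and a remainder per access.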
 
