Alignment of foo[1][1][1][1]

Harald van Dĳk · Jul 8, 2011

However, an unfortunate consequence of this "freedom" seems to be that
it is, in general, not safe to re-interpret arrays as having different
dimensions by using array types. One might expect that one could:

typedef int a_5_by_5[5][5];
a_5_by_5 array = {0};
typedef int a_flat_5_by_5[sizeof (a_5_by_5) / sizeof (int)];
a_flat_5_by_5 * flat_array_ptr = (void *) &array;

But that doesn't seem to be the case. :S

Right.

Of course, using:

int * ip = array[0];

and iteration should be fine.

Not even that, that only allows accesses up to array[0][4]. You aren't
allowed to access an array beyond its last member, even if you know
the contents of the bytes that follow. This case is mentioned in J.2:

-- An array subscript is out of range, even if an object is apparently
accessible with the given subscript (as in the lvalue expression
a[1][7] given the declaration int a[4][5]) (6.5.6).
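
For illustration, here is a minimal sketch of the one form of iteration that nobody in this discussion disputes: one subscript per dimension, each kept within its own bound. The names sum_grid, ROWS and COLS are made up for the example and do not come from the thread.

```c
#include <assert.h>
#include <stddef.h>

enum { ROWS = 5, COLS = 5 };   /* hypothetical dimensions matching a_5_by_5 */

/* Sum every element using one in-range subscript per dimension. */
long sum_grid(int (*grid)[COLS], size_t rows)
{
    long total = 0;

    assert(grid != NULL);
    for (size_t r = 0; r < rows; r++) {
        for (size_t c = 0; c < COLS; c++) {
            /* r < rows and c < COLS, so no subscript leaves its array. */
            total += grid[r][c];
        }
    }
    return total;
}
```

Called as sum_grid(array, ROWS), this never forms a pointer past the end of any row except the usual one-past value inside the loop bounds, so it does not depend on how the J.2 wording is read.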

Seebs · Jul 8, 2011

-- An array subscript is out of range, even if an object is apparently
accessible with the given subscript (as in the lvalue expression
a[1][7] given the declaration int a[4][5]) (6.5.6).

The interesting thing is:

Given
int a[4][5];
int *p1 = &a[0][0];
int *p2 = (int *) &a;

so far as I can tell, p1 can access 5 ints, and p2 can access 20. We
know that the layout of a is all contiguous, and we know what its bounds
are.

-s
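
A small sketch of the two layout facts that are not in dispute here: an int[4][5] occupies exactly the storage of 20 ints, and a, a[0] and a[0][0] all begin at the same byte. It deliberately says nothing about how far p1 or p2 may be indexed. The function name is hypothetical.

```c
#include <assert.h>

/* Returns 1 when a, a[0] and a[0][0] start at the same byte. */
int layout_is_contiguous(void)
{
    int a[4][5] = {{0}};

    /* Arrays contain no padding between elements, so sizes multiply out. */
    assert(sizeof a == 4 * sizeof a[0]);
    assert(sizeof a[0] == 5 * sizeof a[0][0]);

    /* Each conversion to unsigned char * yields the lowest addressed byte
       of the object converted; all three objects share that byte. */
    return (unsigned char *)&a == (unsigned char *)&a[0]
        && (unsigned char *)&a[0] == (unsigned char *)&a[0][0];
}
```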

Harald van Dĳk · Jul 8, 2011

-- An array subscript is out of range, even if an object is apparently
accessible with the given subscript (as in the lvalue expression
a[1][7] given the declaration int a[4][5]) (6.5.6).

The interesting thing is:

Given
int a[4][5];
int *p1 = &a[0][0];
int *p2 = (int *) &a;

so far as I can tell, p1 can access 5 ints, and p2 can access 20. We
know that the layout of a is all contiguous, and we know what its bounds
are.

Is the conversion of &a to int * guaranteed to point to a[0][0]?
Besides, the description of the + operator doesn't talk about
contiguous memory, it talks about array subscripts. p2 + 7 is valid if
p2 points to an array of length 20. The only array of int that starts
at (int *) &a -- if the result of the conversion is defined -- is
a[0], which has a length of no more than 5.

Seebs · Jul 9, 2011

Is the conversion of &a to int * guaranteed to point to a[0][0]?

Pretty sure it is.

Besides, the description of the + operator doesn't talk about
contiguous memory, it talks about array subscripts. p2 + 7 is valid if
p2 points to an array of length 20. The only array of int that starts
at (int *) &a -- if the result of the conversion is defined -- is
a[0], which has a length of no more than 5.

Doesn't matter, so far as I can tell. If you malloc sizeof(int)*4*5 items,
and you convert the resulting address to an array[][5], and use a[0], you
are limited to 5 items, because you are currently using an lvalue of type
array[5] of int. What matters is the size of the object you started with,
for bounds checking, and the correctness of the accessing lvalue types,
for trap representations. So far as I can tell, that's basically it.

-s

Harald van Dĳk · Jul 9, 2011

On 2011-07-08, Harald van Dĳk wrote:

[ int a[4][5];
int *p1 = &a[0][0];
int *p2 = (int *) &a; ]

Is the conversion of &a to int * guaranteed to point to a[0][0]?

Pretty sure it is.

Looking deeper into it, I see a lot that isn't, strictly speaking,
guaranteed by the standard, but will work on any sane implementation.
Not even malloc is guaranteed to be useful:

int *a = malloc(sizeof *a);

The result of the conversion from void * to int * is not specified as
pointing to the allocated memory, as far as I can tell. But "everybody
knows" that it does. The same cannot be said for a cast from
int(*)[][] to int *, but it may very well be that it is meant to be
allowed.

Besides, the description of the + operator doesn't talk about
contiguous memory, it talks about array subscripts. p2 + 7 is valid if
p2 points to an array of length 20. The only array of int that starts
at (int *) &a -- if the result of the conversion is defined -- is
a[0], which has a length of no more than 5.

Doesn't matter, so far as I can tell. If you malloc sizeof(int)*4*5 items,

The rules concerning objects' types are different for dynamically
allocated memory, and it is unclear to me how exactly the rules are
meant to interact with arrays. But your example didn't use malloc, and
automatically allocated memory's type is fixed until the end of its
lifetime. If you declare int arr[4][5];, no matter what tricks you
use, you cannot reliably access it as an int[5][4], int[10][2], or
int[20].
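
If the goal is to view the same 20 ints under several shapes, one portable alternative is to declare the storage flat and compute the index by hand, so that every access is an in-range subscript of a single int[20]. This is only a sketch; grid_at and TOTAL are hypothetical names, not something proposed in the thread.

```c
#include <assert.h>
#include <stddef.h>

enum { TOTAL = 20 };   /* assumed element count of the flat storage */

/* Return the address of element (row, col) of a grid with `cols`
   columns laid over one flat array of TOTAL ints. */
int *grid_at(int *flat, size_t cols, size_t row, size_t col)
{
    size_t index = row * cols + col;

    assert(flat != NULL);
    assert(cols > 0 && col < cols);
    assert(index < TOTAL);      /* never subscript outside the int[20] */
    return &flat[index];
}
```

With int storage[TOTAL], the calls grid_at(storage, 5, 1, 2), grid_at(storage, 2, 3, 1) and grid_at(storage, 4, 1, 3) all name storage[7], giving 4-by-5, 10-by-2 and 5-by-4 views without any pointer conversion.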

Shao Miller · Jul 9, 2011

Of course, using:

int * ip = array[0];

and iteration should be fine.

Not even that, that only allows accesses up to array[0][4]. You aren't
allowed to access an array beyond its last member, even if you know
the contents of the bytes that follow. This case is mentioned in J.2:

-- An array subscript is out of range, even if an object is apparently
accessible with the given subscript (as in the lvalue expression
a[1][7] given the declaration int a[4][5]) (6.5.6).

Oops. Right. I ought to have typed:

int * ip = (int *) &array;

How quickly I forget (thread "Bounds Checking as Undefined Behaviour?")
that a contiguous region of addressable memory can have bounds != the
absolute bounds of that addressable memory, because bounds-checking is
argued to be akin to type-checking. Thanks for the reminder.

int array[5][5];
char * ptr = (char *) array[0];
ptr += sizeof array[0];
#if 0 /* Uh oh */
++ptr;
#endif
ptr = (char *) &array;
ptr += sizeof array[0];
++ptr;
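
A related sketch, assuming the usual reading that the bytes of a whole object may be copied with memcpy: flattening by copy sidesteps the int-typed pointer arithmetic altogether. The name flatten_5_by_5 is hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Copy a 5-by-5 array into a flat int[25] through its bytes. */
void flatten_5_by_5(int flat[25], int (*grid)[5])
{
    assert(flat != NULL && grid != NULL);
    /* sizeof(int[5][5]) == 25 * sizeof(int): arrays contain no padding. */
    memcpy(flat, grid, sizeof(int[5][5]));
}
```

After flatten_5_by_5(flat, array), flat[7] holds the value of array[1][2]. The cost is a copy, so this suits cases where a separate flat snapshot is acceptable.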

Tim Rentsch · Jul 9, 2011

Harald van Dijk said:
However, an unfortunate consequence of this "freedom" seems to be that
it is, in general, not safe to re-interpret arrays as having different
dimensions by using array types. One might expect that one could:

typedef int a_5_by_5[5][5];
a_5_by_5 array = {0};
typedef int a_flat_5_by_5[sizeof (a_5_by_5) / sizeof (int)];
a_flat_5_by_5 * flat_array_ptr = (void *) &array;

But that doesn't seem to be the case. :S

Right.

Of course, using:

int * ip = array[0];

and iteration should be fine.

Not even that, that only allows accesses up to array[0][4]. You aren't
allowed to access an array beyond its last member, even if you know
the contents of the bytes that follow. [snip J.2 reference]

Unfortunately the Standard is woefully unclear on this issue,
not whether some UB exists around array indexing but which
cases constitute UB and which do not. Can this conclusion
be demonstrated using only normative text and not informative
text (such as Annex J)? I agree that a case can be made, but
can you make a case that's convincing without relying on
informative text or having to generalize from examples?

Harald van Dĳk · Jul 9, 2011

Harald van Dijk said:

However, an unfortunate consequence of this "freedom" seems to be that
it is, in general, not safe to re-interpret arrays as having different
dimensions by using array types. One might expect that one could:
typedef int a_5_by_5[5][5];
a_5_by_5 array = {0};
typedef int a_flat_5_by_5[sizeof (a_5_by_5) / sizeof (int)];
a_flat_5_by_5 * flat_array_ptr = (void *) &array;
But that doesn't seem to be the case. :S

Right.

Of course, using:
int * ip = array[0];
and iteration should be fine.

Not even that, that only allows accesses up to array[0][4]. You aren't
allowed to access an array beyond its last member, even if you know
the contents of the bytes that follow. [snip J.2 reference]

Unfortunately the Standard is woefully unclear on this issue,
not whether some UB exists around array indexing but which
cases constitute UB and which do not. Can this conclusion
be demonstrated using only normative text and not informative
text (such as Annex J)? I agree that a case can be made, but
can you make a case that's convincing without relying on
informative text or having to generalize from examples?

The behaviour of the + operator is only defined for those cases where
both the pointer operand and the result point into (or just past) the
same array. Since that is not the case for &array[0][0] + 7 (not for
an array of int, anyway), the behaviour is undefined by omission.

I cannot, however, show that (&array[0][0])[5] is invalid. The +
points one past array[0], but the description of the unary * operator
talks about "if [the operand] points to an object", and if you were to
claim that &array[0][0] + 5 points to &array[1][0], I couldn't explain
why that's wrong.
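
To make the uncontested part concrete, here is a sketch in which the one-past-the-end pointer of a row is formed and compared against but never dereferenced, so it stays within what the description of + defines regardless of how the (&array[0][0])[5] question is settled. The name sum_row is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Sum one row of n ints, stopping at the one-past-the-end pointer. */
int sum_row(const int *row, size_t n)
{
    assert(row != NULL);

    const int *end = row + n;   /* defined: points just past the row */
    int total = 0;

    for (const int *p = row; p != end; p++) {
        total += *p;            /* p is always inside the row here */
    }
    return total;
}
```

For int array[5][5], sum_row(array[0], 5) and sum_row(array[1], 5) each stay inside their own int[5].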

Tim Rentsch · Jul 9, 2011

Harald van Dijk said:
-- An array subscript is out of range, even if an object is apparently
accessible with the given subscript (as in the lvalue expression
a[1][7] given the declaration int a[4][5]) (6.5.6).

The interesting thing is:

Given
int a[4][5];
int *p1 = &a[0][0];
int *p2 = (int *) &a;

so far as I can tell, p1 can access 5 ints, and p2 can access 20. We
know that the layout of a is all contiguous, and we know what its bounds
are.

Is the conversion of &a to int * guaranteed to point to a[0][0]?

The short answer is yes. Similar questions about pointer
conversions have been debated in comp.std.c and IIRC there were
a few holdouts for the idea that it is possible (as they read the
Standard) to construct a conforming implementation where such
relationships would not hold. However, outside the milieu of
comp.std.c, I think it is fair to say that the overwhelming
consensus is that the Standard does indeed guarantee the
aforementioned property.

Besides, the description of the + operator doesn't talk about
contiguous memory, it talks about array subscripts. p2 + 7 is valid if
p2 points to an array of length 20. The only array of int that starts
at (int *) &a -- if the result of the conversion is defined -- is
a[0], which has a length of no more than 5.

Again we are saddled with the problem that what the Standard
requires (or is intended to require) simply isn't clear. If
we fall back to "general understanding of the community", based
on that metric I believe the judgments would be that 'p1+7' is
not defined, 'p2+7' is defined (as is 'p2[7]'), and various
intermediate cases aren't always clear.

Tim Rentsch · Jul 9, 2011

Harald van Dijk said:
On 2011-07-08, Harald van Dĳk wrote:

[ int a[4][5];
int *p1 = &a[0][0];
int *p2 = (int *) &a; ]

Is the conversion of &a to int * guaranteed to point to a[0][0]?

Pretty sure it is.

Looking deeper into it, I see a lot that isn't, strictly speaking,
guaranteed by the standard, but will work on any sane implementation.
Not even malloc is guaranteed to be useful:

int *a = malloc(sizeof *a);

The result of the conversion from void * to int * is not specified as
pointing to the allocated memory, as far as I can tell. [snip]

Did you read 7.20.3p1?

Besides, the description of the + operator doesn't talk about
contiguous memory, it talks about array subscripts. p2 + 7 is valid if
p2 points to an array of length 20. The only array of int that starts
at (int *) &a -- if the result of the conversion is defined -- is
a[0], which has a length of no more than 5.

Doesn't matter, so far as I can tell. If you malloc sizeof(int)*4*5 items,

The rules concerning objects' types are different for dynamically
allocated memory, and it is unclear to me how exactly the rules are
meant to interact with arrays.

Objects don't have types. Objects have an /effective type/ for
the purposes of a particular access, but objects do not have a
type. If E is an lvalue such that &E is legal, we may always
construct '(unsigned char (*)[ sizeof (E) ]) &E'. Do you agree?
And that such a pointer may be used to access any part of the
object to which the unconverted expression E refers?

But your example didn't use malloc, and
automatically allocated memory's type is fixed until the end of its
lifetime. If you declare int arr[4][5];, no matter what tricks you
use, you cannot reliably access it as an int[5][4], int[10][2], or
int[20].

Despite how some people read the Standard, the general
understanding (including AFAIK the WG14 committee) is that
you can (assuming no alignment incompatibility issues),
using the technique of casting the address of the entire
array.

Tim Rentsch · Jul 9, 2011

Harald van Dijk said:

However, an unfortunate consequence of this "freedom" seems to be that
it is, in general, not safe to re-interpret arrays as having different
dimensions by using array types. One might expect that one could:

typedef int a_5_by_5[5][5];
a_5_by_5 array = {0};
typedef int a_flat_5_by_5[sizeof (a_5_by_5) / sizeof (int)];
a_flat_5_by_5 * flat_array_ptr = (void *) &array;

But that doesn't seem to be the case. :S

Of course, using:

int * ip = array[0];

and iteration should be fine.

Not even that, that only allows accesses up to array[0][4]. You aren't
allowed to access an array beyond its last member, even if you know
the contents of the bytes that follow. [snip J.2 reference]

Unfortunately the Standard is woefully unclear on this issue,
not whether some UB exists around array indexing but which
cases constitute UB and which do not. Can this conclusion
be demonstrated using only normative text and not informative
text (such as Annex J)? I agree that a case can be made, but
can you make a case that's convincing without relying on
informative text or having to generalize from examples?

The behaviour of the + operator is only defined for those cases where
both the pointer operand and the result point into (or just past) the
same array. [snip elaboration]

That's true but it begs the question because it doesn't
say _which_ array. What "the array" is can be affected
by pointer conversion. Consider this example:

int i20[20];
int (*i45)[5] = (int (*)[5]) &i20; /* assume alignment is okay */
i45[1][7];

I claim that the last line exhibits undefined behavior. Or do
you think the behavior here should be defined?

Harald van Dĳk · Jul 9, 2011

[ int a[4][5];
int *p1 = &a[0][0];
int *p2 = (int *) &a; ]

Is the conversion of &a to int * guaranteed to point to a[0][0]?

Pretty sure it is.

Looking deeper into it, I see a lot that isn't, strictly speaking,
guaranteed by the standard, but will work on any sane implementation.
Not even malloc is guaranteed to be useful:

int *a = malloc(sizeof *a);

The result of the conversion from void * to int * is not specified as
pointing to the allocated memory, as far as I can tell. [snip]

Did you read 7.20.3p1?

Thanks, I missed that. By putting it there, it only applies to
(c|m|re)alloc, as an exception to the general rules for pointer
conversions as defined in 6.3 "Conversions". It does not apply to other
conversions from void * to T *, so unless part of a T * -> void * ->
T* conversion sequence, is the behaviour of such a conversion defined?
Consider a conversion sequence T * -> void * -> char * -> void * -> T
*, which happens with a custom implementation of qsort or similar
functions.
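
For concreteness, here is a sketch of the kind of qsort-like function where that conversion sequence arises: the caller's int * becomes void *, the sorter steps through it as unsigned char *, hands elements to the comparator as const void *, and the comparator converts back to const int *. The names insertion_sort, swap_bytes and compare_ints are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Swap two elements of `size` bytes each, one byte at a time. */
static void swap_bytes(unsigned char *a, unsigned char *b, size_t size)
{
    while (size-- > 0) {
        unsigned char tmp = *a;
        *a++ = *b;
        *b++ = tmp;
    }
}

/* Generic insertion sort with the same parameter shape as qsort. */
void insertion_sort(void *base, size_t count, size_t size,
                    int (*compare)(const void *, const void *))
{
    unsigned char *bytes = base;            /* void * -> unsigned char * */

    assert(compare != NULL);
    for (size_t i = 1; i < count; i++) {
        for (size_t j = i; j > 0; j--) {
            unsigned char *prev = bytes + (j - 1) * size;
            unsigned char *curr = bytes + j * size;
            if (compare(prev, curr) <= 0)   /* unsigned char * -> const void * */
                break;
            swap_bytes(prev, curr, size);
        }
    }
}

/* Comparator for int elements. */
int compare_ints(const void *lhs, const void *rhs)
{
    const int *a = lhs;                     /* const void * -> const int * */
    const int *b = rhs;
    return (*a > *b) - (*a < *b);
}
```

A call such as insertion_sort(values, 5, sizeof values[0], compare_ints) on an int array exercises every step of the chain.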

Disclaimer: I'm not suggesting that relying on that conversion is a
bad idea. I'm saying the standard fails to state that relying on that
conversion is valid.

Besides, the description of the + operator doesn't talk about
contiguous memory, it talks about array subscripts. p2 + 7 is valid if
p2 points to an array of length 20. The only array of int that starts
at (int *) &a -- if the result of the conversion is defined -- is
a[0], which has a length of no more than 5.

Doesn't matter, so far as I can tell. If you malloc sizeof(int)*4*5 items,

The rules concerning objects' types are different for dynamically
allocated memory, and it is unclear to me how exactly the rules are
meant to interact with arrays.

Objects don't have types. Objects have an /effective type/ for
the purposes of a particular access, but objects do not have a
type.

Right, that was badly worded. It is the effective type that is handled
differently for dynamically allocated memory.

If E is an lvalue such that &E is legal, we may always
construct '(unsigned char (*)[ sizeof (E) ]) &E'. Do you agree?

No, but I think it's not relevant to this discussion. I agree that we
may always construct (unsigned char *) &E, and believe that serves
your point just as well.

And that such a pointer may be used to access any part of the
object to which the unconverted expression E refers?

Yes.

Despite how some people read the Standard, the general
understanding (including AFAIK the WG14 committee) is that
you can (assuming no alignment incompatibility issues),
using the technique of casting the address of the entire
array.

There is a special exception for character types, but can you show
support for this for any other type?

Harald van Dĳk · Jul 9, 2011

Consider this example:

int i20[20];
int (*i45)[5] = (int (*)[5]) &i20; /* assume alignment is okay */
i45[1][7];

I claim that the last line exhibits undefined behavior. Or do
you think the behavior here should be defined?

I agree the behaviour is undefined, but for a different reason than
you: my point in this thread was that you cannot access an int[4][5]
as an int[20], but that works both ways: I believe you cannot access
an int[20] as an int[4][5] either.

Tim Rentsch · Sep 4, 2011

Harald van Dijk said:
Consider this example:

int i20[20];
int (*i45)[5] = (int (*)[5]) &i20; /* assume alignment is okay */
i45[1][7];

I claim that the last line exhibits undefined behavior. Or do
you think the behavior here should be defined?

I agree the behaviour is undefined, but for a different reason than
you: my point in this thread was that you cannot access an int[4][5]
as an int[20], but that works both ways: I believe you cannot access
an int[20] as an int[4][5] either.

My apologies for a very delayed response.

I find your statement mildly astonishing. What section(s) of
the Standard do you believe such an access would violate
(assuming as usual no alignment problems on the pointer
conversion)?

Tim Rentsch · Sep 4, 2011

[again my apologies for a much delayed response]

Harald van Dijk said:

On 2011-07-08, Harald van Dĳk wrote:

[ int a[4][5];
int *p1 = &a[0][0];
int *p2 = (int *) &a; ]

Is the conversion of &a to int * guaranteed to point to a[0][0]?

Pretty sure it is.

Looking deeper into it, I see a lot that isn't, strictly speaking,
guaranteed by the standard, but will work on any sane implementation.
Not even malloc is guaranteed to be useful:

int *a = malloc(sizeof *a);

The result of the conversion from void * to int * is not specified as
pointing to the allocated memory, as far as I can tell. [snip]

Did you read 7.20.3p1?

Thanks, I missed that. By putting it there, it only applies to
(c|m|re)alloc, as an exception to the general rules for pointer
conversions as defined in 6.3 "Conversions". It does not apply to other
conversions from void * to T *, so unless part of a T * -> void * ->
T* conversion sequence, is the behaviour of such a conversion defined?

I believe it is, yes.

Consider a conversion sequence T * -> void * -> char * -> void * -> T
*, which happens with a custom implementation of qsort or similar
functions.

That particular conversion sequence actually is guaranteed fairly
directly by 6.3.2.3. But I assume you mean to ask a more subtle
question.

Disclaimer: I'm not suggesting that relying on that conversion is a
bad idea. I'm saying the standard fails to state that relying on that
conversion is valid.

It doesn't say it as directly as some people would like. Different
people make different assumptions about how the Standard should be
interpreted. My best understanding is, reading the Standard as WG14
intends or expects it to be interpreted, the Standard actually does
guarantee that pointer conversions work the way most (informed)
people expect them to work. (More details below...)

Besides, the description of the + operator doesn't talk about
contiguous memory, it talks about array subscripts. p2 + 7 is valid if
p2 points to an array of length 20. The only array of int that starts
at (int *) &a -- if the result of the conversion is defined -- is
a[0], which has a length of no more than 5.

Doesn't matter, so far as I can tell. If you malloc sizeof(int)*4*5 items,

The rules concerning objects' types are different for dynamically
allocated memory, and it is unclear to me how exactly the rules are
meant to interact with arrays.

Objects don't have types. Objects have an /effective type/ for
the purposes of a particular access, but objects do not have a
type.

Right, that was badly worded. It is the effective type that is handled
differently for dynamically allocated memory.

The notion of effective type is irrelevant to these questions
about pointer conversion and array indexing. By the time an
access happens, all the pointer conversion and indexing
arithmetic has been done; as long as the ultimate element
type matches, if the conversions and indexings worked then
the access works -- and effective type doesn't enter into
the stipulations for pointer conversions or array indexing.

If E is an lvalue such that &E is legal, we may always
construct '(unsigned char (*)[ sizeof (E) ]) &E'. Do you agree?

No, but I think it's not relevant to this discussion.

Right, I should have mentioned the condition about alignment,
which was meant to be implied.

I agree that we
may always construct (unsigned char *) &E, and believe that serves
your point just as well.

Probably you're right; I thought putting in the size
explicitly made the point more clearly.

There is a special exception for character types, but can you show
support for this for any other type?

I will ask a question: do you believe the Standard defines
"pointers" or "pointer values" in such a way that valid pointers
(that aren't null) point to some object? I believe it does -
pointers (even pointers of type (void*)) don't point "nowhere", they
point at something (as before assuming they aren't null or don't
have alignment problems). The behavior of various pointer
conversions follows from this implication.

I know some people read some sections of 6.3.2.3 (notably, for
example, the third sentence of 6.3.2.3p7) as limiting how pointer
conversions work. IMO that interpretation is incorrect; these
sentences are meant to add to what is allowed for pointer
conversions, not take away from them. For example, consider this
fragment:

int *p0 = malloc( sizeof *p0 );
long *p1 = (long*) p0;
int *p2 = (int*) p1;
// now p2 == p0 must be true, even if sizeof (long) > sizeof (int)

We know all these conversions are legal just by the first sentence
of 6.3.2.3p7 (as usual assuming no alignment problems (if indeed
any assumption is actually needed, because of calling malloc())).
However, there is some question about what the conversions do,
because the object actually allocated may not be large enough to
accommodate a (long) object. The third sentence of 6.3.2.3p7
means the conversions have to work the way we expect even in this
case, where the object size actually available isn't large enough
for the pointer type in question. When the object size /is/ large
enough for the converted-to pointer type, then the pointer also
has to work for purposes of dereferencing (again assuming no
alignment issues, and no violations of effective type rules).
This conclusion follows from the principle that any valid pointer
value (that isn't null) points to some object.
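
A runnable version of that fragment, as a sketch only: it checks the round trip int * to long * and back, which 6.3.2.3p7 requires to compare equal when the alignment is suitable, and it never dereferences the long *. The function name is hypothetical, and the allocation is freed before returning.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns 1 after checking the round trip, 0 if allocation failed. */
int check_round_trip_through_long(void)
{
    int *p0 = malloc(sizeof *p0);
    if (p0 == NULL)
        return 0;

    long *p1 = (long *)p0;   /* malloc's result is suitably aligned for long */
    int *p2 = (int *)p1;     /* converting back must yield the original */

    assert(p2 == p0);        /* holds even if sizeof (long) > sizeof (int) */

    free(p0);
    return 1;
}
```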

I agree that the Standard doesn't say all this as clearly or as
directly as I would like. But I believe it is the most consistent
reading of what the Standard does say, and also is how WG14
intends and expects the Standard will be interpreted. I don't
have any specific examples to cite in support of this last part;
as often happens with deeply held assumptions, it comes out more
as a general sense from the writing (including the Standard, DR's,
and the Rationale document primarily) than from any specific
statement or set of statements.

Harald van Dĳk · Sep 4, 2011

Harald van Dijk said:

Consider this example:
int i20[20];
int (*i45)[5] = (int (*)[5]) &i20; /* assume alignment is okay */
i45[1][7];
I claim that the last line exhibits undefined behavior. Or do
you think the behavior here should be defined?

I agree the behaviour is undefined, but for a different reason than
you: my point in this thread was that you cannot access an int[4][5]
as an int[20], but that works both ways: I believe you cannot access
an int[20] as an int[4][5] either.

My apologies for a very delayed response.

Not a problem.

I find your statement mildly astonishing. What section(s) of
the Standard do you believe such an access would violate
(assuming as usual no alignment problems on the pointer
conversion)?

I believe it would be a violation of the effective type rules.

Do you consider

struct A { int a; };
struct B { int b; int c; };
int get_a(struct B b) { return ((struct A *) &b)->a; }

to be valid? Going by the literal wording of the effective type rules,
there is nothing to disallow this. The member b.b has type int, and is
accessed by an lvalue expression of type int. However, we know that
this is intended to be invalid, from examples in the standard, from
examples in DRs, and from the behaviour of real-world compilers when
fed the above. (If you disagree, please do let me know why.)

The effective type rules do not distinguish between structure and
array types, so if the above is invalid, the array equivalent must be
equally invalid.
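
By way of contrast, here is a sketch of the one struct pointer conversion the standard does spell out (6.7.2.1): a pointer to a structure object, suitably converted, points to its initial member. It does not settle the struct A versus struct B question; it only shows a conversion whose result is explicitly specified. The name first_member is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct B { int b; int c; };

/* Read the initial member through a converted pointer to the struct. */
int first_member(struct B *object)
{
    assert(object != NULL);

    int *first = (int *)object;   /* points to object->b, the initial member */
    return *first;
}
```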

Harald van Dĳk · Sep 4, 2011

[again my apologies for a much delayed response]

Again, no problem.

That particular conversion sequence actually is guaranteed fairly
directly by 6.3.2.3. But I assume you mean to ask a more subtle
question.

Oh? I can see that 6.3.2.3 guarantees that T * -> void * compares
equal to T * -> void * -> char * -> void *. But to compare equal does
not imply equivalence, so while the former may be safely converted
back to T *, how do you conclude the same for the latter?

It doesn't say it as directly as some people would like. Different
people make different assumptions about how the Standard should be
interpreted. My best understanding is, reading the Standard as WG14
intends or expects it to be interpreted, the Standard actually does
guarantee that pointer conversions work the way most (informed)
people expect them to work. (More details below...)

No argument there; for all practical purposes, we should treat more
pointer conversions as valid than the literal text of the standard
does. The problem is that different people come to different
conclusions about which other conversions should be valid. A recent
example was whether a direct pointer cast from struct { struct
{ int } } * to int * is valid. Good arguments can be made both for and
against.

The notion of effective type is irrelevant to these questions
about pointer conversion and array indexing.

See my reply to your other message.

[...]

I will ask a question: do you believe the Standard defines
"pointers" or "pointer values" in such a way that valid pointers
(that aren't null) point to some object?

Or to some function, presumably.

No, I do not. The standard requires that this assert passes:

int i;
double *p;
p = (double *) &i;
assert((int *) p == &i);

However, that is the _only_ guarantee made about the value of p. Since
dereferencing p is not allowed in strictly conforming programs anyway,
the standard does not need to, and does not, address the question of
whether p points to any object at all.

I believe it does -
pointers (even pointers of type (void*)) don't point "nowhere", they
point at something (as before assuming they aren't null or don't
have alignment problems).

The standard could have been written in such a way without any
inconsistencies, and it would have made sense if it had been, but I do
not think it is so, for the simple reason that I have not seen
anything to support this view.

The behavior of various pointer
conversions follows from this implication.

I know some people read some sections of 6.3.2.3 (notably, for
example, the third sentence of 6.3.2.3p7) as limiting how pointer
conversions work. IMO that interpretation is incorrect; these
sentences are meant to add to what is allowed for pointer
conversions, not take away from them.

The standard does not define the behaviour of pointer conversions
anywhere else (except for a special case with the memory allocation
functions, as you rightly noted). If the behaviour of a pointer
conversion is not defined by 6.3.2.3, and if it is not defined
anywhere else in the standard, it is undefined by omission.

[snip sensible example]
This conclusion follows from the principle that any valid pointer
value (that isn't null) points to some object.

If any valid pointer value is null or points to some object, then your
conclusion is probably correct. If there are valid pointer values that
are neither null nor pointing to any object, then your conclusion is
still probably correct.

I agree that the Standard doesn't say all this as clearly or as
directly as I would like. But I believe it is the most consistent
reading of what the Standard does say, and also is how WG14
intends and expects the Standard will be interpreted. I don't
have any specific examples to cite in support of this last part;
as often happens with deeply held assumptions, it comes out more
as a general sense from the writing (including the Standard, DR's,
and the Rationale document primarily) than from any specific
statement or set of statements.

Right. Ambiguities and omissions are a natural result of writing the
standard in English, so we have to make guesses and assumptions about
the intent.

Keith Thompson · Sep 4, 2011

Tim Rentsch said:
Harald van Dijk said:

Consider this example:

int i20[20];
int (*i45)[5] = (int (*)[5]) &i20; /* assume alignment is okay */
i45[1][7];

I claim that the last line exhibits undefined behavior. Or do
you think the behavior here should be defined?

I agree the behaviour is undefined, but for a different reason than
you: my point in this thread was that you cannot access an int[4][5]
as an int[20], but that works both ways: I believe you cannot access
an int[20] as an int[4][5] either.

My apologies for a very delayed response.

I find your statement mildly astonishing. What section(s) of
the Standard do you believe such an access would violate
(assuming as usual no alignment problems on the pointer
conversion)?

This doesn't *quite* address this specific case, but C99 J.2
(non-normative) says that the following has undefined behavior:

An array subscript is out of range, even if an object is
apparently accessible with the given subscript (as in the lvalue
expression a[1][7] given the declaration int a[4][5])(6.5.6).

My (perhaps not entirely justified) inference from this is that
arrays have their declared types, and only their declared types;
trying to treat them as if they had a different type yields
undefined behavior.

What section(s) of the Standard actually define the behavior of
treating an int[20] as an int[4][5], or vice versa?

Keith Thompson · Sep 4, 2011

Harald van Dĳk said:
... The standard requires that this assert passes:

int i;
double *p;
p = (double *) &i;
assert((int *) p == &i);

However, that is the _only_ guarantee made about the value of p. Since
dereferencing p is not allowed in strictly conforming programs anyway,
the standard does not need to, and does not, address the question of
whether p points to any object at all.

[...]

I don't believe the standard guarantees that at all. It does
guarantee that a value of any pointer-to-object or pointer-to
incomplete type (i.e., any pointer other than a function pointer)
may be converted to void* and back again without loss of information.
There is no such guarantee for converting to double* and back again.

Consider a hypothetical system where CHAR_BIT == 8, int is 32
bits and requires 4-byte alignment, and double is 64 bits and
requires 8-byte alignment. A machine word is 64 bits or 8 bytes.
The hardware supports two kinds of addresses: a word pointer that
specifies a 64-bit word and a (larger) byte pointer that consists
of a word pointer plus a byte offset within the word. An int* is
a byte address. Converting from int* to double* drops the offset;
converting back from double* to int* doesn't restore it.
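
A short sketch of the distinction drawn here: the round trip through void * is the one that is promised, while the round trip through double * depends on alignment. The second function only reports whether double is stricter than int on the implementation at hand; it assumes a C11 compiler for _Alignof, which postdates the C99 text under discussion. Both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Object pointer to void * and back: guaranteed to compare equal. */
int survives_void_round_trip(int *original)
{
    assert(original != NULL);   /* the sketch is about pointers to objects */

    void *erased = original;    /* implicit conversion to void * */
    int *restored = erased;     /* and back to the original type */
    return restored == original;
}

/* 1 if double demands stricter alignment than int here, in which case
   an arbitrary int * need not be correctly aligned as a double *. */
int double_is_stricter_than_int(void)
{
    return _Alignof(double) > _Alignof(int);
}
```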

James Kuyper · Sep 4, 2011

On 09/04/2011 04:29 PM, Harald van Dĳk wrote:
....

No, I do not. The standard requires that this assert passes:

int i;
double *p;
p = (double *) &i;

You're assuming that the alignment requirements of double are no
stricter than those for int.

....

If any valid pointer value is null or points to some object, then your
conclusion is probably correct. If there are valid pointer values that
are neither null nor pointing to any object, then your conclusion is
still probably correct.

I don't see the relevance of that assumption. Even if the conversion of
a pointer to an object type to a pointer to a different object type
necessarily points at an object, that still does not help resolve the
question of which object it points at. In several special cases, the
standard does provide a guarantee, but not in the general case.
