contiguity of arrays

Ivan A. Kosarev

Douglas A. Gwyn said:
It's a matter of the source code, and what it does.
If the source invokes undefined behavior, then the
Standard doesn't specify what the abstract machine
should do.

Douglas, there is an obvious inconsistency in your wording:
In that example the compiler cannot see any array bound.

However, your last wording above says that if the code should cause
undefined behavior, then it's not important whether the compiler can see the
bounds.

Naturally, assuming

void *p = malloc(sizeof(int) + sizeof(double));

which of the lines

(int*) p + 1 or
(double*) p + 1

contradicts C99 6.5.6, i.e. causes undefined behavior on the abstract
machine? If neither does, can you point to wording in the Standard that
makes an exception for this case?
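For concreteness, the two expressions in question can be put into compilable form. This is only a sketch: the helper name `offsets_after_increment` is hypothetical, and the code assumes the reading of 7.20.3p1 under which the allocated space may be treated as an object (or array) of the pointed-to type. It only forms the pointers and measures their byte offsets; nothing is dereferenced, and it does not settle the 6.5.6 question itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: forms both pointers from the question and reports
   how many bytes each lies past the start of the allocated block.
   Returns 0 on success, -1 if malloc fails. Nothing is dereferenced. */
int offsets_after_increment(size_t *int_off, size_t *double_off)
{
    assert(int_off != NULL && double_off != NULL);

    void *p = malloc(sizeof(int) + sizeof(double));
    if (p == NULL)
        return -1;

    int *ip = (int *)p + 1;        /* first expression from the post */
    double *dp = (double *)p + 1;  /* second expression from the post */

    /* Measure the distance in bytes through character pointers. */
    *int_off = (size_t)((char *)ip - (char *)p);
    *double_off = (size_t)((char *)dp - (char *)p);

    free(p);
    return 0;
}
```

Both results stay within the allocated size, since `sizeof(int)` and `sizeof(double)` are each no larger than their sum.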
 
Mark McIntyre

James said:
Because it's explicitly stated that the memory allocated by malloc() is
suitable for storing an object of any type, including in particular an
int[4].


The memory is *suitable* for an object of any type, but does that mean
that it actually *is* an object of all the possible types, at the same time?

This is equivalent to "Is Schroedinger's cat in the box?"

In other words, there's no way to know.
 
Wojtek Lerch

David Hopwood said:
Wojtek said:
I presume that the same reasoning applies to this, too:

struct { int a, b, c, d; } s;
if ( sizeof(s) == 4 * sizeof(int) ) {
int *ptr = &s.a + 3;
}

Yes, it does. [...]
But what about this -- do you think it's OK, too:

struct { int a[2]; char c[ 2 * sizeof(int) ]; } s;
int *ptr = s.a + 3;

No, that's undefined behaviour. It isn't possible to infer that s.a points
to an array of 4 ints in this example.

What you seem to be saying is that for a region of memory to constitute an
array of four ints, it doesn't have to be declared with a type that involves
an array of four ints, or designated by an lvalue with such a type, but
nevertheless it must be declared with some aggregate type ultimately
consisting of four ints and no padding bytes between them. But, I presume,
possibly with some extra stuff after the four ints. Is that more or less
correct? Do you have a way of describing it that doesn't feel arbitrary and
inconsistent?...
 
Wojtek Lerch

Mark McIntyre said:
James said:
Because it's explicitly stated that the memory allocated by malloc() is
suitable for storing an object of any type, including in particular an
int[4].


The memory is *suitable* for an object of any type, but does that mean
that it actually *is* an object of all the possible types, at the same
time?

This is equivalent to "Is Schroedinger's cat in the box?"

In other words, there's no way to know.

If that were indeed the case, there would be no way to know that

int *ptr = malloc( 4 * sizeof(int) );
ptr += 1;

does not invoke undefined behaviour, would there?

Luckily it's not, because the standard explicitly allows treating ptr as a
pointer to an array element:

"The pointer returned if the allocation succeeds is suitably aligned so that
it may be assigned to a pointer to any type of object and then used to
access such an object or an array of such objects in the space allocated"
(7.20.3p1).
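
A minimal sketch of what that paragraph permits, assuming nothing beyond the quoted text: the allocated space is used as an array of four ints, so stepping `ptr` to any of its elements is ordinary array arithmetic. The helper name `sum_allocated_ints` is made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: treats an allocated block as an array of four ints
   and returns the sum of its elements, or -1 if the allocation fails. */
int sum_allocated_ints(void)
{
    int *ptr = malloc(4 * sizeof(int));
    if (ptr == NULL)
        return -1;

    for (int i = 0; i < 4; i++)
        ptr[i] = i + 1;            /* store through int lvalues */

    int *last = ptr + 3;           /* points at the fourth element */
    assert(last == &ptr[3]);

    int sum = ptr[0] + ptr[1] + ptr[2] + *last;

    free(ptr);
    return sum;
}
```

With the values stored above, the function returns 10 when the allocation succeeds.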
 
Mark McIntyre

If that were indeed the case, there would be no way to know that
int *ptr = malloc( 4 * sizeof(int) );
ptr += 1;
does not invoke undefined behaviour, would there?

Your example has nothing to do with your question. And indeed, even if it
did, it wouldn't matter because the standard, as you pointed out yourself,
explicitly defines the meaning of that grammar.

However I think you missed the point of my answer. The only way to know if
Schroedinger's cat is in the box is to take a look. Likewise the only way
to know that a malloc'ed memory block *is* a block of all possible types
is to access it as if it were. Since this is impossible, you can't know.

And by the way, you might like to consider the meaninglessness of memory
being of a type. Memory is just bits. It has no type*.

*ignoring such types as ECC or GaAs or whatever... :)
 
James Kuyper

I won't be participating in this discussion for about a week. I just
wanted people who've responded to my messages to know that I haven't
lost interest, I'm not rudely ignoring them, and I certainly haven't
run out of arguments. I'm just on my honeymoon. I'll try to catch up
when I get back. Bye!
 
David Hopwood

Wojtek said:
Wojtek said:
James Kuyper wrote:

But the C code contains no definition of such an object.

It doesn't need to; it only needs to be possible to infer that such
an object must exist and that arr must point to it.

I presume that the same reasoning applies to this, too:

struct { int a, b, c, d; } s;
if ( sizeof(s) == 4 * sizeof(int) ) {
int *ptr = &s.a + 3;
}

Yes, it does. [...]
But what about this -- do you think it's OK, too:

struct { int a[2]; char c[ 2 * sizeof(int) ]; } s;
int *ptr = s.a + 3;

No, that's undefined behaviour. It isn't possible to infer that s.a points
to an array of 4 ints in this example.

What you seem to be saying is that for a region of memory to constitute an
array of four ints, it doesn't have to be declared with a type that involves
an array of four ints, or designated by an lvalue with such a type, but
nevertheless it must be declared with some aggregate type ultimately
consisting of four ints and no padding bytes between them.

Yes. This view is based on the definition of "array type" as being the type
of any contiguous nonempty sequence of objects of the element type. In order
to perform an access via a given pointer, say of type T, you need to be able
to infer that it currently points to a region that *can* hold an object of
type T. The memory allocation functions are a special case that return
pointers to regions able to hold objects of any type.
But, I presume,
possibly with some extra stuff after the four ints. Is that more or less
correct? Do you have a way of describing it that doesn't feel arbitrary and
inconsistent?...

Not particularly. It would help if the definitions of "object", "array" etc.
in the standard were clearer.
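
The "no padding bytes between them" condition in this exchange can at least be tested without touching the disputed pointer arithmetic. The sketch below uses `offsetof` and `sizeof` only; the names `struct four_ints` and `four_ints_is_contiguous` are hypothetical, and the check says nothing about whether `&s.a + 3` itself is defined.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type mirroring the struct from the example. */
struct four_ints { int a, b, c, d; };

/* Returns 1 if the members are laid out back to back, i.e. each member
   sits at the offset the corresponding int[4] element would have and
   the struct is exactly four ints long; returns 0 otherwise. */
int four_ints_is_contiguous(void)
{
    /* The first member always starts at offset zero (6.7.2.1). */
    assert(offsetof(struct four_ints, a) == 0);

    return offsetof(struct four_ints, b) == 1 * sizeof(int)
        && offsetof(struct four_ints, c) == 2 * sizeof(int)
        && offsetof(struct four_ints, d) == 3 * sizeof(int)
        && sizeof(struct four_ints) == 4 * sizeof(int);
}
```

The result is implementation-specific: an implementation that inserts padding makes the function return 0.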
 
David Hopwood

David said:
Yes. This view is based on the definition of "array type" as being the type
of any contiguous nonempty sequence of objects of the element type. In
order to perform an access via a given pointer, say of type T, you need to be
able to infer that it currently points to a region that *can* hold an object
of type T. The memory allocation functions are a special case that return
pointers to regions able to hold objects of any type.

... any type that is no larger than the allocated size, of course.
 
Keith Thompson

I won't be participating in this discussion for about a week. I just
wanted people who've responded to my messages to know that I haven't
lost interest, I'm not rudely ignoring them, and I certainly haven't
run out of arguments. I'm just on my honeymoon. I'll try to catch up
when I get back. Bye!

<OT>
Congratulations!
</OT>
 
Wojtek Lerch

Sorry, I meant 3 here, not 1...
Your example has nothing to do with your question. And indeed, even if it
did, it wouldn't matter because the standard, as you pointed out yourself,
explicitly defines the meaning of that grammar.

I asked that question before I noticed that there was such an explicit
answer in the Library section of the standard. The connection between my
question and the above example is 6.5.6p8: according to that paragraph,
whether ptr+=3 is valid depends on whether ptr points to an array element.
Whether the standard defines the behaviour depends on whether the allocated
object has an array type. If there's no guarantee that there is an array of
at least three ints there, there's no guarantee that ptr+=3 does anything
sensible, and therefore the behaviour is undefined. There's no
Schroedinger's cat here.
However I think you missed the point of my answer. The only way to know if
Schroedinger's cat is in the box is to take a look. Likewise the only way

Um... The Schroedinger's cat I heard about is known to be in the box, but
it's neither completely alive nor completely dead until you look.
to know that a malloc'ed memory block *is* a block of all possible types
is to access it as if it were. Since this is impossible, you can't know.

No, the way to know what type the allocated block of memory is and what it
isn't is to find the answer in the standard. Accessing the memory wouldn't
help, because if it invokes undefined behaviour, it might be impossible to
notice that it has invoked undefined behaviour.
And by the way, you might like to consider the meaninglessness of memory
being of a type. Memory is just bits. It has no type*.

But that's exactly what the problem is with 6.5.6p8. When you have a
pointer that points to an int somewhere in memory, what does it *mean* for
that int to be an array element?
*ignoring such types as ECC or GaAs or whatever... :)

:)
 
Wojtek Lerch

David Hopwood said:
Yes. This view is based on the definition of "array type" as being the
type
of any contiguous nonempty sequence of objects of the element type. In
order

Not just *any* contiguous sequence. A struct type that consists of four ints
and turns out to have no padding bytes is not an array type. It's a struct
type.
to perform an access via a given pointer, say of type T, you need to be
able
to infer that it currently points to a region that *can* hold an object of
type T. The memory allocation functions are a special case that return
pointers to regions able to hold objects of any type.

That is not a very accurate wording: regions of memory don't hold objects,
they are objects. An allocated object can be given any effective type by
using an lvalue of that type to store a value in the object. But when it
has an effective type, it's one type. When it's a struct, it's not an array
of int.
 
David Hopwood

Wojtek said:
Not just *any* contiguous sequence. A struct type that consists of four ints
and turns out to have no padding bytes is not an array type. It's a struct
type.

Yes, but a region that holds an object declared using such a struct type can
also be accessed as an object of the array type int[4]. All that is required
for this is that the four ints be contiguous.

If, OTOH, we did not have any way to infer that the region can hold four ints,
then we couldn't access it as an object of type int[4].
That is not a very accurate wording: regions of memory don't hold objects,
they are objects.

No. Objects are typed; memory regions aren't.
An allocated object can be given any effective type by
using an lvalue of that type to store a value in the object.

More accurately, an allocated region can be accessed as an object of any given
effective type, by using an lvalue expression that designates an object of
that type to store a value in that object.
 
Dan Pop

James Kuyper said:
Dan Pop said:
arr[0][0], arr[0][1], arr[1][0] and arr[1][1] together match the definition of such an object. Do you claim that after


But the C code contains no definition of such an object.



It doesn't have to. Think about dynamically allocated objects

This isn't one of those.

It doesn't matter, as it satisfies exactly the same set of conditions:
the object is correctly aligned for the type int and large enough to hold
4 int's. There is nothing magic in the definition of malloc that doesn't
apply here.

Dan
 
Dan Pop

In said:
David said:
"int[2][2];" is guaranteed to allocate an array of 4 contiguous ints:
# 6.2.5 #20:

No. "contiguously allocated set of objects" is not the
same as "array".

Then, malloc is useless for dynamically allocating arrays, right?

Dan

P.S. Deliberately ignoring the alignment issue, because it is satisfied
in all cases discussed here.
 
Dan Pop

In said:
In that example the compiler cannot see any array bound.

Where does the standard say anything at all about array bounds being
visible to the compiler? Chapter and verse, please.

The rules are *exactly* the same for

int foo[4];

and

int *bar = malloc(4 * sizeof(int));

Dan
 
Wojtek Lerch

David Hopwood said:
Wojtek said:
Not just *any* contiguous sequence. A struct type that consists of four
ints and turns out to have no padding bytes is not an array type. It's a
struct type.

Yes, but a region that holds an object declared using such a struct type
can
also be accessed as an object of the array type int[4]. All that is
required
for this is that the four ints be contiguous.

If, OTOH, we did not have any way to infer that the region can hold four
ints,
then we couldn't access it as an object of type int[4].

But when we're not accessing it as an object of type int[4], it's not an
object of type int[4]. "When an object is said to have a particular type,
the type is specified by the lvalue used to designate the object"
(6.3.2.1p1).
No. Objects are typed; memory regions aren't.

No. Objects are regions of memory; lvalues that designate objects are
typed. "When an object is said to have a particular type, the type is
specified by the lvalue used to designate the object."
More accurately, an allocated region can be accessed as an object of any
given
effective type, by using an lvalue expression that designates an object of
that type to store a value in that object.

Even more accurately, the effective type of an allocated object can be
*changed* to any given type by using an lvalue of that type to store a value
in the object. But at any given point in time, the object either doesn't
have an effective type, or has one particular effective type. When it's a
struct, it's not an array of four ints.
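
The rule being described here is the effective-type rule of C99 6.5p6. A small sketch, with the hypothetical helper name `retype_allocated_object`: the same allocated block is first written and read as an int, then written and read as a float. Each read uses the type of the most recent store, so no aliasing rule is violated.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: gives one allocated block two effective types in
   turn. Returns 0 on success, -1 if the allocation fails. */
int retype_allocated_object(void)
{
    size_t size = sizeof(int) > sizeof(float) ? sizeof(int) : sizeof(float);
    void *p = malloc(size);
    if (p == NULL)
        return -1;

    int *ip = p;
    *ip = 42;                 /* the store makes the effective type int */
    assert(*ip == 42);        /* read through an int lvalue */

    float *fp = p;
    *fp = 1.5f;               /* a new store changes it to float */
    assert(*fp == 1.5f);      /* read through a float lvalue */

    free(p);
    return 0;
}
```

Reading `*ip` after the float store would be the problematic case; the sketch deliberately avoids it.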
 
Joe Wright

Wojtek said:
Wojtek said:
Wojtek Lerch wrote:


What you seem to be saying is that for a region of memory to constitute
an array of four ints, it doesn't have to be declared with a type that
involves an array of four ints, or designated by an lvalue with such a
type, but nevertheless it must be declared with some aggregate type
ultimately consisting of four ints and no padding bytes between them.

Yes. This view is based on the definition of "array type" as being the
type of any contiguous nonempty sequence of objects of the element type.
In order

Not just *any* contiguous sequence. A struct type that consists of four
ints and turns out to have no padding bytes is not an array type. It's a
struct type.

Yes, but a region that holds an object declared using such a struct type
can
also be accessed as an object of the array type int[4]. All that is
required
for this is that the four ints be contiguous.

If, OTOH, we did not have any way to infer that the region can hold four
ints,
then we couldn't access it as an object of type int[4].


But when we're not accessing it as an object of type int[4], it's not an
object of type int[4]. "When an object is said to have a particular type,
the type is specified by the lvalue used to designate the object"
(6.3.2.1p1).

No. Objects are typed; memory regions aren't.


No. Objects are regions of memory; lvalues that designate objects are
typed. "When an object is said to have a particular type, the type is
specified by the lvalue used to designate the object."

More accurately, an allocated region can be accessed as an object of any
given
effective type, by using an lvalue expression that designates an object of
that type to store a value in that object.


Even more accurately, the effective type of an allocated object can be
*changed* to any given type by using an lvalue of that type to store a value
in the object. But at any given point in time, the object either doesn't
have an effective type, or has one particular effective type. When it's a
struct, it's not an array of four ints.

You are an incredibly hard sell. You are stubborn and intransigent.
Hardheaded and set in your ways. I like you.

But you can still be wrong, as any of us can be.

If I know that I own sufficient memory to hold precisely four
consecutive integers and know the address of the first integer, I'm
home. No matter how I come into ownership of this memory.

The original example, well upthread, was ..

int a[2][2] = {{1,2},{3,4}};

... which declares, defines and initializes a. Do we all agree that
there are four int's there? Are they contiguous in memory at
ascending addresses? Yes, they are.

void *v = a;
int *b = v;

Does anyone doubt that b[3] == 4 ? Are any rules broken? No. The
object at a[1][1] is the object at b[3]. They are the same object of
type int.

typedef struct {
int a;
int b;
int c;
int d;
} stru;

stru *st = v;

Now, a[1][1] and b[3] and st->d are the same int object == 4. At the
same time.

Come on in. The water's fine.
 
Arthur J. O'Dwyer

Wojtek said:
David Hopwood said:
Wojtek Lerch wrote:

Not just *any* contiguous sequence. A struct type that consists of
four ints and turns out to have no padding bytes is not an array type.
It's a struct type.

Yes, but a region that holds an object declared using such a struct type
can also be accessed as an object of the array type int[4]. All that is
required for this is that the four ints be contiguous.

If, OTOH, we did not have any way to infer that the region can hold four
ints, then we couldn't access it as an object of type int[4].

But when we're not accessing it as an object of type int[4], it's not an
object of type int[4]. "When an object is said to have a particular type,
the type is specified by the lvalue used to designate the object"
(6.3.2.1p1).
[big snip]
No. Objects are typed; memory regions aren't.

No. Objects are regions of memory; lvalues that designate objects are
typed. "When an object is said to have a particular type, the type is
specified by the lvalue used to designate the object." [...]
Even more accurately, the effective type of an allocated object can be
*changed* to any given type by using an lvalue of that type to store a
value in the object. But at any given point in time, the object either
doesn't have an effective type, or has one particular effective type.
When it's a struct, it's not an array of four ints.

You are an incredibly hard sell. You are stubborn and intransigent.
Hardheaded and set in your ways. I like you.

But you can still be wrong, as any of us can be.

If I know that I own sufficient memory to hold precisely four consecutive
integers and know the address of the first integer, I'm home. No matter
how I come into ownership of this memory.

Definitely. I haven't been watching this thread closely, but I doubt
anyone will argue with you there. (Assuming that by "I" you really mean
"an anthropomorphization of my compiler," not "the human programmer,"
naturally!)
The original example, well upthread, was ..

int a[2][2] = {{1,2},{3,4}};

.. which declares, defines and initializes a. Do we all agree that there
are four int's there? Are they contiguous in memory at ascending addresses?
Yes, they are.

Yep.
void *v = a;
int *b = v;

Does anyone doubt that b[3] == 4 ? Are any rules broken? No. The object at
a[1][1] is the object at b[3]. They are the same object of type int.

Right.
typedef struct {
int a;
int b;
int c;
int d;
} stru;

stru *st = v;

Now, a[1][1] and b[3] and st->d are the same int object == 4. At the same
time.

Whoops! Structs can have padding bytes, so it's perfectly possible
for any evaluation of 'st->d' to yield undefined behavior. Without the
padding bytes, though, you're right.

I'll bite. The original thread was about the /invalid/ code

int a[2][2] = {1,2,3,4};
int *b = (int*)a;
b[3];

which is /not/ required to do anything useful or reasonable. Are you
(Joe) disputing this? If so, say so, and provide some corroboration
for your opinion. I think the two words "fat pointers" ought to settle
your doubts.

(Apologies if I've misunderstood Joe's subtle arguments...)

-Arthur
 
Arthur J. O'Dwyer

I'll bite. The original thread was about the /invalid/ code

int a[2][2] = {1,2,3,4};
int *b = (int*)a;

Major whoops! I meant:

int *b = (int*) a[0];
b[3];

which is /not/ required to do anything useful or reasonable.

-Arthur
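
For readers who want the fourth int without relying on the disputed `b[3]`, one uncontroversial route is to copy the bytes of the two-dimensional array into a genuine `int[4]` and index that. This is a sketch only; the helper name `fourth_element_via_copy` is hypothetical, and it takes no side on whether the original indexing is defined.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: flattens int[2][2] into int[4] by copying its
   object representation, then indexes the flat array. */
int fourth_element_via_copy(void)
{
    int a[2][2] = {{1, 2}, {3, 4}};
    int flat[4];

    /* Arrays are contiguously allocated with no padding between
       elements (6.2.5p20), so both objects have the same size. */
    assert(sizeof a == sizeof flat);

    memcpy(flat, a, sizeof flat);
    return flat[3];            /* same value as a[1][1] */
}
```

The function returns 4, the value stored in `a[1][1]`.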
 
