About casts (and pointers)

Daniel Vallstrom

Keith said:
I suspect it would be safe under any non-perverse implementation, but
I don't think the standard guarantees it.

How does that follow? As Wolf and Bau point out, a+2 is properly
aligned for int32_t* (since there is no padding etc. and because a
points to a valid "int32_t* address" and therefore "a+32bits" points
to a valid "int32_t* address"). Hence converting to int32_t* and back
must yield the same pointer.
As Chris Croughton points
out, the standard guarantees that you can convert from int16_t* to
int32_t* and back to int16_t* (assuming the intermediate pointer is
properly aligned),

Which it must be in this case.
but performing arithmetic on the intermediate
pointer before converting it back voids the warranty.

There is no arithmetic on the intermediate pointer here.
Note that if a were a pointer to a declared object rather than to a
chunk of memory allocated by malloc(), there could be alignment
problems.

Which is why malloc was used.
I'm sure this breaks on the mythical DS9K; it would be interesting to
see how.

Are you saying that DS9K isn't a conforming implementation?


Daniel Vallstrom
 
Mark Piffer

Eric said:
The conclusion is correct (they are interconvertible),
but I think the reasoning is faulty. It's not that the target
structs have the same alignment requirement -- they needn't --

Please note that I didn't refer to the target objects (I wrote (struct
sockaddr *) meaning it to name the pointer-to type) and I said that
(struct sockaddr **) points to aligned objects, hence referring to
(struct sockaddr *) and not (struct sockaddr). I even think it was you
who corrected my faulty understanding (that I indeed had) towards
alignment of pointers-to-structs a few weeks ago.
but that the representations of the two types of struct pointer
are identical. At "the bare bits level," the conversion from
one type to the other is therefore a no-op, hence no information
is lost and when the pointer is converted back again it still
compares equal to the original and properly points to the same
target.

There are a bunch of special cases about pointers, mostly
(I think) to protect pre-Standard code -- a Standard that broke
a significant fraction of that large amount of code would have
had difficulty gaining acceptance! Off the top of my head:

- Qualified and unqualified versions of pointers to the
same type have the same representation. Thus, `int*'
and `const int*' and `volatile int*' ... look alike.

- `void*' and `char*' and `unsigned char*' and `signed char*'
share the same representation.
Another question which I didn't see (or missed) being sufficiently
answered: is it possible for a pointer to an object (other than the
char's above) to be converted to void* and then NOT point to the
first byte of the object? i.e.

struct { int bar; }foo;
void *vp = &foo;
unsigned char *ucp = (unsigned char*)&foo; // guaranteed to point to 1st byte
vp!=ucp; // possible to evaluate to 1???

regards,
Mark
 
Keith Thompson

Daniel Vallstrom said:
How does that follow? As Wolf and Bau point out, a+2 is properly
aligned for int32_t* (since there is no padding etc. and because a
points to a valid "int32_t* address" and therefore "a+32bits" points
to a valid "int32_t* address"). Hence converting to int32_t* and back
must yield the same pointer.

I think you're right.
Which it must be in this case.


There is no arithmetic on the intermediate pointer here.

Right again. I mis-read the code the first time through.

[...]
Are you saying that DS9K isn't a conforming implementation?

Of course not; the DS9K is conforming by definition. I was saying
that the code invokes undefined behavior -- but I was probably
mistaken.
 
Chris Croughton

I don't think that's possible. If an integer type with a width of 32
bits requires 64 bits of storage, then it has 32 padding bits and
isn't eligible to be called int32_t.

Are they necessarily padding bits? They might be required for the
hardware, and non-accessible for arithmetic types. I remember a machine
(Honeywell?) where the extra bits coded whether the value was a pointer
or an arithmetic value, so although it had 48 bits total only some of
them were usable for arithmetic types. Assuming 32 bits of actual
arithmetic data, would that not qualify as an int32_t? Or is it
guaranteed that CHAR_BIT * sizeof intX_t == X?
It's still likely that int16_t and int32_t have different alignment
requirements, and even possible (but not likely) that int16_t has
stricter alignment requirements than int32_t.

If int32_t were emulated using chars but int16_t was a native type it
could happen. Probably more likely with int64_t though.

Chris C
 
Keith Thompson

Chris Croughton said:
Are they necessarily padding bits?

Yes, I believe so. For a signed integer type, the bits of the object
representation are divided into three groups: value bits, padding bits
(there may be none), and the sign bit. If those extra 32 bits aren't
value or sign bits, they must be padding bits.

[snip]
 
Christian Kandeler

[ ... ]
'a' is guaranteed to be correctly aligned for int32_t, since it's the
result of a malloc.

Why does a malloc() for type int16_t guarantee correct alignment for type
int32_t?


Christian
 
Keith Thompson

Christian Kandeler said:
[ ... ]
'a' is guaranteed to be correctly aligned for int32_t, since it's the
result of a malloc.

Why does a malloc() for type int16_t guarantee correct alignment for type
int32_t?

Because it's not a malloc() for type int16_t. malloc() just gets an
argument specifying the number of bytes to allocate; it has no idea
what's going to be stored in the allocated memory. The pointer returned
by malloc (if non-null) is suitably aligned for any object type.
 
Chris Torek

Another question which I didn't see (or missed) being sufficiently
answered: is it possible for a pointer to an object (other than the
char's above) to be converted to void* and then NOT point to the
first byte of the object? i.e.

struct { int bar; }foo;
void *vp = &foo;
unsigned char *ucp = (unsigned char*)&foo; // guaranteed to point to 1st byte
vp!=ucp; // possible to evaluate to 1???

I think this is not possible.

One pitfall I think *is* possible (but certainly not common; indeed,
I have never encountered this in anything other than Lisp systems'
internals) is to have pointers to other types have low-order bits
set, that (in C) are removed by the process of converting to either
"void *" or "char *". For instance:

void *vp;
int (*a)[4];
...
a = vp; /* might compile to "add 3,vp,a" ~= a <- (int)vp+3 */
...
vp = a; /* might compile to "sub 3,vp,a" ~= vp <- (int)a-3 */

Lisp systems sometimes use this kind of address arithmetic (in
assembly as produced by the Lisp-to-machine-code compiler) to
implement type-tagging, especially if the machine has fast trap
handling.
 
Christian Kandeler

Keith said:
Because it's not a malloc() for type int16_t. malloc() just gets an
argument specifying the number of bytes to allocate; it has no idea
what's going to be stored in the allocated memory.

Ah yes, of course. Silly me. A look at the prototype of malloc() probably
would have helped...


Christian
 
Richard Bos

Christian Kandeler said:
Why does a malloc() for type int16_t guarantee correct alignment for type
int32_t?

There is no such thing as a malloc() "for type <type>". A malloc() is a
malloc() is a malloc(): it returns a void * which is correctly aligned for
all object types.

Richard
 
Christian Bau

Daniel Vallstrom said:
Are you saying that DS9K isn't a conforming implementation?

The DS 9000 is a conforming implementation.

The DS 9001 is so cleverly designed that experts can't agree whether it
is conforming or not!
 
Keith Thompson

Christian Bau said:
The DS 9000 is a conforming implementation.

The DS 9001 is so cleverly designed that experts can't agree whether it
is conforming or not!

And the DS 9002 is deliberately designed so that determining whether
it's a conforming implementation is equivalent to solving the halting
problem.

(Hmm, I wonder if that's true for most real-world implementations.)
 
Richard Bos

Keith Thompson said:
And the DS 9002 is deliberately designed so that determining whether
it's a conforming implementation is equivalent to solving the halting
problem.

I thought the DS 9002 was the one that does absolutely nothing useful,
but does document to perfection _how_ it does nothing useful.

And shouldn't it be called the DS 9001:2000 now?
(Hmm, I wonder if that's true for most real-world implementations.)

Probably.

Richard
 
Tim Rentsch

Eric Sosman said:
Off the top of my head:

- Qualified and unqualified versions of pointers to the
same type have the same representation. Thus, `int*'
and `const int*' and `volatile int*' ... look alike.

- `void*' and `char*' and `unsigned char*' and `signed char*'
share the same representation.

- Pointers to all kinds of structs look alike.

- Pointers to all kinds of unions look alike.

- A pointer to a struct can be converted to and from a
pointer to the struct's first element (if it isn't a
bit field) safely.

- A pointer to a union can be converted to and from a
pointer to any element of the union (except bit fields)
safely.

- There's the above-mentioned rule about structs with
identical initial sequences of elements, although (as
Christian Bau mentioned) there are a few additional
conditions on this one.

- ... and there may be a few others I can't think of at
the moment.

A nice list. Also:

- Pointers to arrays with the same element type but with
differing lengths have the same representation.
 
Richard Bos

[ And so forth. ]
A nice list. Also:

- Pointers to arrays with the same element type but with
differing lengths have the same representation.

Surely not? A pointer-to-int that happens to point at the first element
of an array of 5 ints must have the same representation as one that
points at the first element of an array of 13 ints, yes, but
pointer-to-array[5]-of-int is a completely different type from
pointer-to-array[13]-of-int, and need not have the same representation
(or alignment) at all.

Richard
 
Tim Rentsch

Eric Sosman said:
Yeah, I thought about the "in a union" thing when composing
my post, but decided to ignore it. As far as I can tell, the
compiler could only behave perversely if it could somehow prove
that the two structs could never possibly appear as members
of the same union, even in translation units the compiler has not
yet seen that might or might not be linked into the same final
program. I believe such a proof is beyond the capabilities of
current compilers, and is likely to remain so until I die and no
longer care about it ...

Micro nit: ITYM "... could behave perversely only if ...".

Surely it's always possible that identically declared structs *might*
be put in a union in another translation unit. Together with 6.2.7 p1
this would imply that structs that share a common initial sequence
must have the same representation for the common initial sequence.
Right?
 
Tim Rentsch

[ And so forth. ]
A nice list. Also:

- Pointers to arrays with the same element type but with
differing lengths have the same representation.

Surely not? A pointer-to-int that happens to point at the first element
of an array of 5 ints must have the same representation as one that
points at the first element of an array of 13 ints, yes, but
pointer-to-array[5]-of-int is a completely different type from
pointer-to-array[13]-of-int, and need not have the same representation
(or alignment) at all.

Perhaps surprising but true. 6.2.5 p26, pointers to compatible types
shall have the same representation and alignment requirements. Both
'int[5]' and 'int[13]' are compatible with 'int[]', hence 'int (*)[5]'
and 'int (*)[13]' both have the same representation and alignment
requirements as 'int (*)[]', and hence as each other. So for example,

int a[5];
int (*pa5)[5] = &a;
int (*pa)[] = pa5;
int (*pa13)[13] = pa;

All the initializing assignments take place between compatible pointer
types (pointers to compatible types are compatible types); the value
and the representation hence stay the same (6.3 p2).

The initializing assignment to 'pa13' may evoke undefined behavior, if
the alignment requirements for an 'int[13]' array aren't satisfied by
the value '&a'. But the representation and alignment requirements of
the pointer variables themselves must be the same.


(Minor note: Technically, it's the implicit conversion of 'pa' rather
than the initializing assignment that can produce undefined behavior.)
 
Christian Bau

Tim Rentsch said:
Surely it's always possible that identically declared structs *might*
be put in a union in another translation unit. Together with 6.2.7 p1
this would imply that structs that share a common initial sequence
must have the same representation for the common initial sequence.
Right?

Not really.

The compiler must make things "work" _if_ two structs _are_ in fact
members of the same union and _if_ this is known when the code is
compiled, and in no other case. If you give the compiler a source file
containing main () and some other functions, and the compiler can deduce
that no functions outside that file are actually called, then it is
clearly allowed to use different offsets for both structs.

And having the same layout doesn't guarantee defined behavior, as the
compiler can often assume that pointers to different structs don't point
to the same memory.
 
Tim Rentsch

Christian Bau said:
Not really.

The compiler must make things "work" _if_ two structs _are_ in fact
members of the same union and _if_ this is known when the code is
compiled, and in no other case. If you give the compiler a source file
containing main () and some other functions, and the compiler can deduce
that no functions outside that file are actually called, then it is
clearly allowed to use different offsets for both structs.

I believe that whether or not the compiler can deduce that no function
outside the single-translation-unit program is called is irrelevant.
First, 6.2.7 p1 doesn't say translation units in the same executable;
it just says translation units. Second, there are lots of ways a
structure value might be transmitted between different executables,
eg,

- writing bytes out to a file and another executable reading
them in;

- storing a value in shared memory;

- transmitting bytes over a pipe or a socket;

- just leaving a value in memory somewhere and expecting the
next program to pick it up;

- for a more subtle example - the value of an 'offsetof'
macro call might be stored by one executable and used
by another.

The C standard makes (some) guarantees about program execution, but it
seems silly to think it makes guarantees *only* about program
execution. There also are guarantees about what representations are
used in (some) data types. I think most people reading sections
6.2.5, 6.2.6, 6.2.7 and 6.3 would conclude that guarantees are made
about the representations of compatible types (of structs) in
different translation units, regardless of whether they were ever
bound into a single executable. In response to that, do you have
anything to offer other than just an assertion to the contrary?
Saying "... it is clearly allowed ..." without giving any supporting
statements isn't very convincing.

And having the same layout doesn't guarantee defined behavior, as the
compiler can often assume that pointers to different structs don't point
to the same memory.

Certainly having the same layout doesn't guarantee defined behavior in
all cases. But I think it *does* guarantee defined behavior in *some*
cases that otherwise would be undefined.

I acknowledge the point about the compiler being allowed to make
assumptions for pointers pointing to different types. I saw that
point in an earlier posting of yours, and it's a good one. What
I'm asking about, however, is something different.
 
Peter Nilsson

Christian said:
Well, the compiler is "in practice" forced to use the same layout for
all structs starting with members of the same type (for example, for all
structs starting with a short and an int, offsetof gives the same result
for the second struct member).
Why?

However, the compiler is free to assume that two pointers to different
struct types cannot access the same memory without undefined
behavior.

Again, why?
So if you only have

#include <stdio.h>
#include <stdlib.h>

struct s1 { short x; int y; };
struct s2 { short a; int b; double c; };

void f (struct s1* p1, struct s2* p2) {
p1->y = 0;
p2->b = 1;

These assignments are made through int lvalues. The struct types are
incidental.
// Here the compiler can assume that because the types
// of *p1 and *p2 are different, p1 and p2 cannot point
// to the same memory without undefined behavior.
if (p1->y == 1) printf ("It worked!\n");
if (p1->y == 0) printf ("It didn't work!\n");
}

Consider the following implementation...

short: 2 bytes, 2 byte alignment
int: 4 bytes, 4 byte alignment
double: 16 bytes, 16 byte alignment

...and consider the layouts (where . is padding)...

struct s1: |xx..yyyy|
struct s2: |aa......bbbb....cccccccccccccccc|

Can you cite chapter and verse where an implementation cannot adopt
this choice of layouts?

Can you cite why this cannot be done "in practice"?
int main (void) {
void* p = malloc (sizeof (struct s1) + sizeof (struct s2));
if (p) f (p, p);
return 0;
}

could print either "It worked!" or "It didn't work!". The compiler can
assume that the assignment to p2->b cannot change p1->y (unless there is
undefined behavior).

I can't see a distinction between this and the usual pointer aliasing
problems. If you want the compiler to make the assumption that p1
and p2 point to different locations, then you have to make them
restrict qualified. [That's what restrict is for, after all.]
 
