Order of Variable

Alf P. Steinbach /Usenet

* tni, on 11.09.2010 14:40:
I can't give you a simple example, but what I have seen is similar to the
behavior in this case (in my complex case, the structs started with the same
sequence):

#include <iostream>
#include <stdint.h>

struct A {
uint32_t x;
uint32_t y;
};

struct B {
float x;
uint32_t y;
};

int main() {
A a = A();
B& b = *reinterpret_cast<B*>(&a);
b.x = 42;
std::cout << a.x << " " << b.x << std::endl;
return 0;
}

Output (g++ -O2):
0 42

OK, this is an optimizer problem/bug. The optimizer "knows" that 'a' has not
been modified, by assuming a little too much, and optimizes away 'a.x'. You can
use '-fno-strict-aliasing' to turn off that silliness, or, you can deny the
optimizer the knowledge about the particular instance 'a'.

E.g. in the OP's example the instance was allocated dynamically.

Like ...


<code>
#include <iostream>
typedef unsigned uint32_t;

struct A {
uint32_t x;
uint32_t y;
};

struct B {
uint32_t x;
uint32_t y;
};

int main() {
A& a = *new A();
B& b = *reinterpret_cast< B* >( &a );
b.x = 42;
std::cout << a.x << " " << b.x << std::endl;
}
</code>


.... which due to the new-ing (as in the OP's article) produces no strict
aliasing warning and the result "42 42".

About how the g++ optimizer's "strict aliasing" assumption is invalid: the Holy
Standard has no notion of strict aliasing.


Cheers & hth.,

- Alf
 
Johannes Schaub (litb)

Alf said:
* tni, on 11.09.2010 14:40:

OK, this is an optimizer problem/bug. The optimizer "knows" that 'a' has
not been modified, by assuming a little too much, and optimizes away
'a.x'. You can use '-fno-strict-aliasing' to turn off that silliness, or,
you can deny the optimizer the knowledge about the particular instance
'a'.

I believe it's completely valid for the compiler to optimize to this output.
"a.x" requires "x" to refer to an uint32_t object by 3.10/15, but you have
no guarantee that this is the case anymore after b.x was assigned the value
42. If storage of b.x and a.x overlap, lifetime of a.x ends because you
have reused its memory (just like in the union case, you would have switched
the active member). a.x needs to be a char or unsigned char for this to be
valid.
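
The char / unsigned char exemption is also what makes a byte-wise copy well
defined. A minimal sketch, assuming float and uint32_t have the same size
(checked at compile time); the helper names bits_of and float_from_bits are
hypothetical and do not come from the code under discussion:

```cpp
#include <cstring>   // std::memcpy
#include <stdint.h>  // uint32_t, as in the example under discussion

// C++03-style compile-time check: the array size is -1, and the typedef
// ill-formed, if float and uint32_t differ in size.
typedef char float_and_uint32_same_size[sizeof(float) == sizeof(uint32_t) ? 1 : -1];

// Copy the object representation of a float into a uint32_t.
// std::memcpy copies bytes, so no uint32_t lvalue ever reads a float object.
uint32_t bits_of(float f) {
    uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

// The reverse direction, again byte by byte.
float float_from_bits(uint32_t u) {
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
}
```

Which bit pattern bits_of(42.0f) yields depends on the platform's float
format, so the sketch deliberately does not promise a particular value.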
 
Alf P. Steinbach /Usenet

* Johannes Schaub (litb), on 11.09.2010 16:49:
I believe it's completely valid for the compiler to optimize to this output.

Oh yes, in that particular case, since the two x'es have different type.

But not in the case under discussion where the x'es have the same type.

"a.x" requires "x" to refer to an uint32_t object by 3.10/15, but you have
no guarantee that this is the case anymore after b.x was assigned the value
42. If storage of b.x and a.x overlap, lifetime of a.x ends because you
have reused its memory (just like in the union case, you would have switched
the active member). a.x needs to be a char or unsigned char for this to be
valid.

It's valid to access a.x in that case, and in the case of same type. However, I
don't think the g++ optimizer cares. I believe (I don't have time to check it)
that it will produce the 0 no matter the type of a.x as long as it's numeric.

Which goes to show that the assumption it relies on is invalid.

It would be simpler if you had quoted the example in my reply.


Cheers,

- Alf
 
Johannes Schaub (litb)

tni said:
I can't give you a simple example, but what I have seen is similar to
the behavior in this case (in my complex case, the structs started with
the same sequence):

#include <iostream>
#include <stdint.h>

struct A {
uint32_t x;
uint32_t y;
};

struct B {
float x;
uint32_t y;
};

int main() {
A a = A();
B& b = *reinterpret_cast<B*>(&a);
b.x = 42;
std::cout << a.x << " " << b.x << std::endl;
return 0;
}

Output (g++ -O2):
0 42

g++ 4.5 does act differently, it seems. It doesn't optimize this anymore.
 
James Kanze

Nah to both. :)
I've seen something like that miscompiled by g++ (IIRC
somewhere around version 4.0).

The issue is not simple, and formally g++ isn't conformant in
this respect (but it has nothing to do with §9.2/17---I've not
verified, but I suspect that g++ does handle that correctly).
But if I understand correctly, the C committee thinks that it is
the standard which is broken, not g++---the standard guarantees
too much.

The exact issue was something like:

union U { int a; double b; };

int f(int* a, double *b)
{
    // g++ reorders the following two statements...
    int result = *a;
    *b = 3.14159;
    return result;
}

// ...
U u;
u.a = 42;
int i = f(&u.a, &u.b);

Technically, the above code fulfills the requirements of the
standard; you're reading the last element written in the union,
then writing a different element. IIRC, the opinion of the C
committee was that this *should* only be guaranteed to work when
the actual access is through the union type.
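
For comparison, a variant in which every access goes through the union type
might look like the sketch below. The name f_via_union is hypothetical, and
the sketch only illustrates the shape of such code; it makes no claim about
what a given compiler does with the pointer-based version:

```cpp
union U { int a; double b; };

// Both accesses name the union object itself, so it is locally visible
// that the read of 'a' and the write of 'b' touch overlapping storage.
int f_via_union(U* u)
{
    int result = u->a;   // read the member that was written last
    u->b = 3.14159;      // then switch to the other member
    return result;
}
```

Called as U u; u.a = 42; int i = f_via_union(&u); the read of u.a comes
before the write of u.b in the source, just as in the pointer version.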

This is, of course, more or less irrelevant here, and I suspect
that g++ will recognize layout compatible prefixes, and take
those into account when doing its aliasing analysis. (I suspect
this because the idiom is so prevalent in C. I seem to recall
having seen something similar in the sources of gcc, in fact.)
What's your point?
This:
float f;
int i = *(int*) &f;
is an old C technique as well with lots of code using it.
What makes you think it's not a violation of the aliasing
rules? Given your reference to the first member, you did
notice that the OP wanted to access the second one as well,
right?

That's maybe an issue in C++. The wording of the C standard
guarantees that all elements of a common prefix will work. And
I'm not sure, but I seem to recall that it was guaranteed even
if a union wasn't involved.
(I can see the argument that access to the first member might
be safe. But the language in the standard isn't really
explicit that you are not violating the aliasing rules and
that's always dangerous.)

The first member of a PODS is guaranteed to be at the same
address as the structure itself. That's one guarantee. There
is another one concerning PODS with layout compatible initial
sequences. In C++, this is only guaranteed if the structs are
in a union; in C, I think it is generally guaranteed (but I
don't have access to a C standard here to verify). In practice,
things like:

struct Node
{
int nodeType;
int childCount;
};

struct BinaryOperatorNode
{
int nodeType;
int childCount;
Node* left;
Node* right;
};

and then:

Node* currentNode;
// ...
if (isBinaryNode(currentNode->nodeType)) {
BinaryOperatorNode* binaryNode =
(BinaryOperatorNode*)currentNode;
// ...
}

is a more or less standard technique in C, used (IIRC) in the
gcc compiler itself. I don't think that g++ would break this.
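
A variant that leans only on the first-member guarantee, rather than on a
common initial sequence, is to embed Node as the first member. This is a
sketch under that assumption; the names header, kBinaryNodeType, asNode and
asBinary are hypothetical:

```cpp
struct Node
{
    int nodeType;
    int childCount;
};

struct BinaryOperatorNode
{
    Node  header;   // must remain the first member
    Node* left;
    Node* right;
};

const int kBinaryNodeType = 1;  // hypothetical tag value

// Going "up": simply take the address of the embedded header.
Node* asNode(BinaryOperatorNode* b) { return &b->header; }

// Going "down": a pointer to the first member of a POD-struct, suitably
// converted with reinterpret_cast, points to the struct itself. Returns a
// null pointer when the tag says the node is not a binary operator.
BinaryOperatorNode* asBinary(Node* n)
{
    return n->nodeType == kBinaryNodeType
        ? reinterpret_cast<BinaryOperatorNode*>(n)
        : 0;
}
```

The cast in asBinary is only meaningful when n really is the header of a
BinaryOperatorNode; the tag check stands in for isBinaryNode above.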
 
James Kanze

I believe it's completely valid for the compiler to optimize
to this output.

The optimization is clearly valid; the above code contains
undefined behavior. (Think of it for a moment. In b.x = 42,
b.x is a float, so this is really b.x = 42.0f. And the bit
pattern for 42.0f could be a trapping representation for
int.) In general, according to §3.10/15:

If a program attempts to access the stored value of an
object through an lvalue of other than one of the
following types the behavior is undefined:
-- the dynamic type of the object,
-- a cv-qualified version of the dynamic type of the
object,
-- a type that is the signed or unsigned type
corresponding to the dynamic type of the object,
-- a type that is the signed or unsigned type
corresponding to a cv-qualified version of the
dynamic type of the object,
-- an aggregate or union type that includes one of the
aforementioned types among its members (including,
recursively, a member of a subaggregate or contained
union),
-- a type that is a (possibly cv-qualified) base class
type of the dynamic type of the object,
-- a char or unsigned char type.

This paragraph clearly makes the above program undefined
behavior. This paragraph, however, says nothing about
something like:

struct A { int a; int b; };
struct B { int a; int b; int c; };

void f(A* pA, B* pB)
{
pA->b = 42;
pB->b = 0;
std::cout << pA->b << ',' << pB->b << std::endl;
}

int main()
{
union U { A a; B b; } u;
f( &u.a, &u.b );
return 0;
}

which is (according to the current wording of the standard),
required to work. What isn't really clear is something
like:

struct A { int a; int b; };
struct B { int a; int b; int c; };

void f(A* pA, B* pB)
{
pA->b = 42;
pB->b = 0;
std::cout << pA->b << ',' << pB->b << std::endl;
}

int main()
{
B b;
f( reinterpret_cast<A*>( &b ), &b );
return 0;
}

Given that the first is required to work, however, and that
the union and the function call may be in a different
translation unit from the function definition, it is
difficult to see how an implementation cannot make it work.
(Formally, the result of the reinterpret_cast here is
"unspecified". Practically, however, no information can be
lost, since it must be possible to recast it back to a B*,
and use that. So I can't imagine that being a problem in
practice. Formally, in f pA and pB point to different
types, so the compiler can assume that they don't alias (and
that the write through pB->b could not have modified the
value written through pA->b); practically, again, if the
objects pointed to by pA and pB are in a union, the pointers
are allowed to alias, so unless the compiler can prove that
they aren't in a union, it can't assume no aliasing.)
 
James Kanze

I believe it's completely valid for the compiler to
optimize to this output. "a.x" requires "x" to refer to
an uint32_t object by 3.10/15, but you have no guarantee
that this is the case anymore after b.x was assigned the
value 42. If storage of b.x and a.x overlap, lifetime
of a.x ends because you have reused its memory (just like
in the union case, you would have switched the active
member). a.x needs to be a char or unsigned char for this
to be valid.

It's completely valid for the compiler to do anything it
wants in this case, since the code has undefined behavior.
(I would be interested in seeing what happens with an
unoptimized build on a Unisys Libra system, which has a
tagged architecture.) From a quality of implementation
point of view: the reinterpret_cast is clearly visible,
which should tell any reasonable compiler that all bets
concerning aliasing are off. If you really got the output
above from the program above, I'd consider it a bug. If the
assignment and the output was in a separate function,
however, which had been passed references or pointers, it
would be perfectly normal.
 
Johannes Schaub (litb)

Alf said:
* James Kanze, on 11.09.2010 21:10:

Assuming for the sake of discussion that there is such a thing as aliasing
based reordering rule (I haven't found it), this interpretation where the
/call/ of a function determines the validity of possible code
transformations /within/ the function seems 100% out of spirit with the
standard.

I think it's certainly within the Spirit. See
http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#636
 
Alf P. Steinbach /Usenet

* Johannes Schaub (litb), on 11.09.2010 22:08:

It's a different case, and no, it shows no such thing as call-dependent validity
of code transformations; instead it concerns the opposite.

James' example concerns validity of reordering when accessing objects of the
same type (members of a and b), which according to §3.10/15 can be aliased. With
James' example the source code transformation is purportedly (I don't agree)
valid or invalid depending on how the function is called.

Gabriel's example, on the other hand, concerns accessing float as int, which
according to the same para yields UB. And he shows that the possibility of
aliasing in a call means that reordering is an invalid code transformation,
because although in general accessing float as int is UB, it's valid in a union.
That is, the types that the function knows of are not the full story, and the
reason for the DR is exactly that call-dependent semantics are not in the spirit
of the standard -- it's so outlandish that it's not even considered.

Now consider

#include <stdio.h>

struct A { double d; int x; };

void foo( int* a, A* b )
{
b->x = 1;
*a = 2;
}

int main()
{
A aha;
foo( &aha.x, &aha );
printf( "%d\n", aha.x );
}

Since int* and A* are clearly pointers with different referent types, at least
the simplistic purported aliasing-based reordering rule mentioned so far should
allow reordering of the statements in foo, causing output 1 instead of 2.

I don't believe it -- although I may be forced to if James can hark up some
reference (in which case the standard, IMHO, needs to be fixed).


Cheers, & hth.,

- Alf
 
Johannes Schaub (litb)

Alf said:
* Johannes Schaub (litb), on 11.09.2010 22:08:

It's a different case, and no, it shows no such thing as call-dependent
validity of code transformations; instead it concerns the opposite.

James' example concerns validity of reordering when accessing objects of
the same type (members of a and b), which according to §3.10/15 can be
aliased. With James' example the source code transformation is purportedly
(I don't agree) valid or invalid depending on how the function is called.

Gabriel's example, on the other hand, concerns accessing float as int,
which according to the same para yields UB. And he shows that the
possibility of aliasing in a call means that reordering is an invalid code
transformation, because although in general accessing float as int is UB,
it's valid in a union. That is, the types that the function knows of are
not the full story, and the reason for the DR is exactly that
call-dependent semantics are not in the spirit
of the standard -- it's so outlandish that it's not even considered.

The way I understand it, it does not matter whether the reference is by a
union or not. Aliasing an int as a float is UB no matter by a union or not.
I thought that was the whole point of Gabriel's issue report? I.e if the
statements in foo are reordered, then "t.d" accesses an int object by a
double lvalue, violating aliasing.

Example to show what I mean:

union {
float f;
int i;
} u;

// now the object is a float, by 3.8/1.
u.f = 1.f;

// now it is an int, also by 3.8/1
u.i = 10;

// aliasing violation by 3.10/15 - UB, trying
// to access value of int object by float lvalue.
float f = u.f;

Now consider

#include <stdio.h>

struct A { double d; int x; };

void foo( int* a, A* b )
{
b->x = 1;
*a = 2;
}

int main()
{
A aha;
foo( &aha.x, &aha );
printf( "%d\n", aha.x );
}

Since int* and A* are clearly pointers with different referent types, at
least the simplistic purported aliasing-based reordering rule mentioned so
far should allow reordering of the statements in foo, causing output 1
instead of 2.

I don't believe it -- although I may be forced to if James can hark up
some reference (in which case the standard, IMHO, needs to be fixed).

I don't believe that the Standard allows reordering this either. b->x is int
lvalue and *a is int lvalue too. Both are allowed to alias.
 
Johannes Schaub (litb)

Johannes said:
The way I understand it, it does not matter whether the reference is by a
union or not. Aliasing an int as a float is UB no matter by a union or
not. I thought that was the whole point of Gabriel's issue report? I.e if
the statements in foo are reordered, then "t.d" accesses an int object by
a double lvalue, violating aliasing.

Example to show what I mean:

union {
float f;
int i;
} u;

// now the object is a float, by 3.8/1.
u.f = 1.f;

// now it is an int, also by 3.8/1
u.i = 10;

// aliasing violation by 3.10/15 - UB, trying
// to access value of int object by float lvalue.
float f = u.f;

Ah wait, I think I might have misinterpreted what you said. You meant that
normally an "int*" could not share storage with a "float*" but in a union
they could. I agree.
 
Alf P. Steinbach /Usenet

* Johannes Schaub (litb), on 11.09.2010 22:54:
The way I understand it, it does not matter whether the reference is by a
union or not. Aliasing an int as a float is UB no matter by a union or not.
I thought that was the whole point of Gabriel's issue report? I.e if the
statements in foo are reordered, then "t.d" accesses an int object by a
double lvalue, violating aliasing.

Example to show what I mean:

union {
float f;
int i;
} u;

// now the object is a float, by 3.8/1.
u.f = 1.f;

// now it is an int, also by 3.8/1
u.i = 10;

// aliasing violation by 3.10/15 - UB, trying
// to access value of int object by float lvalue.
float f = u.f;

Yes. But if you consider

u.i = 42;
u.f = 2.71828;

float f = u.f;

Then you have valid code.

And then the compiler can't reorder the two assignment statements.

It can't reorder the assignment statements even if they're placed in a function
where it's not locally known that for a particular call the float and int are in
a union (Gabriel's point).

I don't believe that the Standard allows reordering this either. b->x is int
lvalue and *a is int lvalue too. Both are allowed to alias.

Yes.

And as I see it §3.10/15 means that that's still the case when accessing A::x as
B::x where B is layout-compatible with A (assuming same member names for this
example).

Otherwise there would have to be some rule about exactly which pointer referent
types are considered to be sufficiently different that the pointers can be
assumed to be unaliased -- int != A?, A != B?, what?


Cheers,

- Alf
 
Joshua Maurice

* Johannes Schaub (litb), on 11.09.2010 22:54:






Yes. But if you consider

   u.i = 42;
   u.f = 2.71828;

   float f = u.f;

Then you have valid code.

And then the compiler can't reorder the two assignment statements.

It can't reorder the assignment statements even if they're placed in a function
where it's not locally known that for a particular call the float and int are in
a union (Gabriel's point).

Indeed. This is a well known bug in the C and C++ specs. James talks
about it else-thread, quoted here:

The issue is not simple, and formally g++ isn't conformant in
this respect (but it has nothing to do with §9.2/17---I've not
verified, but I suspect that g++ does handle that correctly).
But if I understand correctly, the C committee thinks that it is
the standard which is broken, not g++---the standard guarantees
too much.

The exact issue was something like:

union U { int a; double b; };

int f(int* a, double *b)
{
    // g++ reorders the following two statements...
    int result = *a;
    *b = 3.14159;
    return result;
}

// ...
U u;
u.a = 42;
int i = f(&u.a, &u.b);

Technically, the above code fulfills the requirements of the
standard; you're reading the last element written in the union,
then writing a different element. IIRC, the opinion of the C
committee was that this *should* only be guaranteed to work when
the actual access is through the union type.

Basically, the problem is that with unions, the strict aliasing rules
no longer "work". The solution that James says that the C committee is
for is not abandoning the strict aliasing rules, but instead greatly
restricting usages of unions. Basically, in current practice, if a
union has 2 different members, and if you let pointers to 2 or more
members escape the current scope, then you're boned. I'm not exactly
sure what formal rule the C committee is pondering, but at the very
least it would have to effectively include that restriction.

Yes.

And as I see it §3.10/15 means that that's still the case when accessing A::x as
B::x where B is layout-compatible with A (assuming same member names for this
example).

Then you would be alone on your interpretation. I don't have a C spec
handy, but for C++03, that is not the case.

This is what I can find on the subject in the standard itself.

C++03 standard, 9.2 Class members / 16:
If a POD-union contains two or more POD-structs that share a common
initial sequence, and if the POD-union object currently contains one of
these POD-structs, it is permitted to inspect the common initial part
of any of them. Two POD-structs share a common initial sequence if
corresponding members have layout-compatible types (and, for
bit-fields, the same widths) for a sequence of one or more initial
members.

The rule above only applies to PODs in a union.

C++03 standard, 9.2 Class members / 17:
A pointer to a POD-struct object, suitably converted using a
reinterpret_cast, points to its initial member (or if that member is a
bit-field, then to the unit in which it resides) and vice versa.
[Note: There might therefore be unnamed padding within a POD-struct
object, but not at its beginning, as necessary to achieve appropriate
alignment. ]

The section above, especially with the (non-binding) note, pretty
clearly states that the C-style hack of inheritance may not work in
C++. There might be unnamed padding which differs between different POD
structs.

Frankly though, this entire thing is a mess. When you compare the
guarantees of the two quotes, /which appear right next to each other
in the standard/, I don't understand how you can reconcile them in a
sane implementation. So, when the POD types are members of a union,
there's no difference in padding bits, but when the same POD types are
not members of a union, there might be extra magical padding bits.
What?

We expect that a compiler has a single rule for handling member
offsets so that accessing a member of an object is efficient, so it
doesn't matter if the POD object is a complete object or a member sub-
object of a union - the expected result is that the compiler will
generate the same assembly to access a member sub-object of the POD
object from a pointer to the POD object in all cases. With this in
mind, I have no clue how you're supposed to reconcile those two
sections above, one of which says there is no difference in padding
between lay-out compatible POD-struct types, and the next section
which says there might be a difference.

However, at face value, most / all of the gcc examples in this thread
have been conforming.

Otherwise there would have to be some rule about exactly which pointer referent
types are considered to be sufficiently different that the pointers can be
assumed to be unaliased  --  int != A?, A != B?, what?

Yes. That is exactly what "3.10 Lvalues and rvalues / 15" does. Hell,
there's a note on it which reads: "The intent of this list is to
specify those circumstances in which an object may or may not be
aliased."
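
To make the list concrete, here is a small sketch of accesses that fall on
either side of it. The function names are hypothetical, and the example only
restates the categories of 3.10/15 as quoted earlier in the thread; it does
not settle the layout-compatible struct question being argued here:

```cpp
// Permitted: unsigned int is the unsigned type corresponding to int.
unsigned read_as_unsigned(int* p)
{
    return *reinterpret_cast<unsigned*>(p);
}

// Permitted: any object may be inspected through unsigned char.
unsigned char first_byte(int* p)
{
    return *reinterpret_cast<unsigned char*>(p);
}

// Not on the list: reading an int object through a float lvalue.
// Left commented out because the behavior would be undefined.
// float read_as_float(int* p) { return *reinterpret_cast<float*>(p); }
```

Because only the first two are on the list, a compiler may assume that a
float* and an int* passed to the same function do not refer to the same
object, while it cannot assume that for an int* and an unsigned*.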
 
Johannes Schaub (litb)

Joshua said:
* Johannes Schaub (litb), on 11.09.2010 22:54:






Yes. But if you consider

u.i = 42;
u.f = 2.71828;

float f = u.f;

Then you have valid code.

And then the compiler can't reorder the two assignment statements.

It can't reorder the assignment statements even if they're placed in a
function where it's not locally known that for a particular call the
float and int are in a union (Gabriel's point).

Indeed. This is a well known bug in the C and C++ specs. James talks
about it else-thread, quoted here:

The issue is not simple, and formally g++ isn't conformant in
this respect (but it has nothing to do with §9.2/17---I've not
verified, but I suspect that g++ does handle that correctly).
But if I understand correctly, the C committee thinks that it is
the standard which is broken, not g++---the standard guarantees
too much.

The exact issue was something like:

union U { int a; double b; };

int f(int* a, double *b)
{
    // g++ reorders the following two statements...
    int result = *a;
    *b = 3.14159;
    return result;
}

// ...
U u;
u.a = 42;
int i = f(&u.a, &u.b);

Technically, the above code fulfills the requirements of the
standard; you're reading the last element written in the union,
then writing a different element. IIRC, the opinion of the C
committee was that this *should* only be guaranteed to work when
the actual access is through the union type.

Basically, the problem is that with unions, the strict aliasing rules
no longer "work". The solution that James says that the C committee is
for is not abandoning the strict aliasing rules, but instead greatly
restricting usages of unions. Basically, in current practice, if a
union has 2 different members, and if you let pointers to 2 or more
members escape the current scope, then you're boned. I'm not exactly
sure what formal rule the C committee is pondering, but at the very
least it would have to effectively include that restriction.

Yes.

And as I see it §3.10/15 means that that's still the case when accessing
A::x as B::x where B is layout-compatible with A (assuming same member
names for this example).

Then you would be alone on your interpretation. I don't have a C spec
handy, but for C++03, that is not the case.

This is what I can find on the subject in the standard itself.

C++03 standard, 9.2 Class members / 16:
If a POD-union contains two or more POD-structs that share a common
initial sequence, and if the POD-union object currently contains one of
these POD-structs, it is permitted to inspect the common initial part
of any of them. Two POD-structs share a common initial sequence if
corresponding members have layout-compatible types (and, for
bit-fields, the same widths) for a sequence of one or more initial
members.

The rule above only applies to PODs in a union.

C++03 standard, 9.2 Class members / 17:
A pointer to a POD-struct object, suitably converted using a
reinterpret_cast, points to its initial member (or if that member is a
bit-field, then to the unit in which it resides) and vice versa.
[Note: There might therefore be unnamed padding within a POD-struct
object, but not at its beginning, as necessary to achieve appropriate
alignment. ]

The section above, especially with the (non-binding) note, pretty
clearly states that the C-style hack of inheritance may not work in
C++. There might be unnamed padding which differs between different POD
structs.

Frankly though, this entire thing is a mess. When you compare the
guarantees of the two quotes, /which appear right next to each other
in the standard/, I don't understand how you can reconcile them in a
sane implementation. So, when the POD types are members of a union,
there's no difference in padding bits, but when the same POD types are
not members of a union, there might be extra magical padding bits.
What?

The result of `offsetof` is necessarily determined by the type and member
inspected, so I don't believe that the offsets can be any different
depending on where an object is located and what it is a member of.

If we summarize the said paragraphs, my opinion is that one could conclude
that one can indeed access "A::x as B::x".
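
The offset part of that argument can at least be checked on a given
implementation. A minimal sketch, reusing the A and B structs from James'
example; the function name common_prefix_offsets_match is hypothetical, and
the check only reports what one compiler does with layout, it says nothing
about aliasing:

```cpp
#include <cstddef>  // offsetof

struct A { int a; int b; };
struct B { int a; int b; int c; };

// True if each member of the common initial sequence sits at the same
// offset in both structs on this implementation.
bool common_prefix_offsets_match()
{
    return offsetof(A, a) == offsetof(B, a)
        && offsetof(A, b) == offsetof(B, b);
}
```

Since offsetof takes only a type and a member name, its result cannot depend
on whether a particular object happens to live inside a union.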
 
Alf P. Steinbach /Usenet

* Joshua Maurice, on 13.09.2010 20:56:
Indeed. This is a well known bug in the C and C++ specs. James talks
about it else-thread, quoted here:

You're calling a compiler bug a bug in the standard?

Well you make me laugh.

Also, note that we were discussing James' statements. It's completely
unnecessary to quote it again except to put it in a misleading new context.


[snip requoting of up-thread message]
Then you would be alone on your interpretation. I don't have a C spec
handy, but for C++03, that is not the case.

You can assert as much as you want that other people agree with you, that none of
them agree with me, and that what the standard says is an "interpretation".

It's similar to your earlier assertion that a compiler bug is really a bug in
the standard.

It's a stupid argument.

This is what I can find on the subject in the standard itself.

C++03 standard, 9.2 Class members / 16:
If a POD-union contains two or more POD-structs that share a common
initial sequence, and if the POD-union object currently contains one of
these POD-structs, it is permitted to inspect the common initial part
of any of them. Two POD-structs share a common initial sequence if
corresponding members have layout-compatible types (and, for
bit-fields, the same widths) for a sequence of one or more initial
members.

The rule above only applies to PODs in a union.

C++03 standard, 9.2 Class members / 17:
A pointer to a POD-struct object, suitably converted using a
reinterpret_cast, points to its initial member (or if that member is a
bit-field, then to the unit in which it resides) and vice versa.
[Note: There might therefore be unnamed padding within a POD-struct
object, but not at its beginning, as necessary to achieve appropriate
alignment. ]

The section above, especially with the (non-binding) note, pretty
clearly states that the C-style hack of inheritance may not work in
C++.

No, it does not. It says nothing of the sort, or even related to that.

There might be unnamed padding which differs between different POD
structs.

Frankly though, this entire thing is a mess. When you compare the
guarantees of the two quotes, /which appear right next to each other
in the standard/, I don't understand how you can reconcile them in a
sane implementation. So, when the POD types are members of a union,
there's no difference in padding bits, but when the same POD types are
not members of a union, there might be extra magical padding bits.
What?

We expect that a compiler has a single rule for handling member
offsets so that accessing a member of an object is efficient, so it
doesn't matter if the POD object is a complete object or a member sub-
object of a union - the expected result is that the compiler will
generate the same assembly to access a member sub-object of the POD
object from a pointer to the POD object in all cases. With this in
mind, I have no clue how you're supposed to reconcile those two
sections above, one of which says there is no difference in padding
between lay-out compatible POD-struct types, and the next section
which says there might be a difference.

It's simple.

Whatever it is you are imagining is being said in the reinterpret_cast rule, is
not said there: it's only in your /imagination/.

However, at face value, most / all of the gcc examples in this thread
have been conforming.


Yes. That is exactly what "3.10 Lvalues and rvalues / 15" does. Hell,
there's a note on it which reads: "The intent of this list is to
specify those circumstances in which an object may or may not be
aliased."

We were discussing 3.10/15.


Cheers & hth.,

- Alf
 
Joshua Maurice

Then you would be alone on your interpretation. I don't have a C spec
handy, but for C++03, that is not the case.

This is what I can find on the subject in the standard itself.

C++03 standard, 9.2 Class members / 16:
If a POD-union contains two or more POD-structs that share a common
initial sequence, and if the POD-union object currently contains one of
these POD-structs, it is permitted to inspect the common initial part
of any of them. Two POD-structs share a common initial sequence if
corresponding members have layout-compatible types (and, for
bit-fields, the same widths) for a sequence of one or more initial
members.

The rule above only applies to PODs in a union.

C++03 standard, 9.2 Class members / 17:
A pointer to a POD-struct object, suitably converted using a
reinterpret_cast, points to its initial member (or if that member is a
bit-field, then to the unit in which it resides) and vice versa.
[Note: There might therefore be unnamed padding within a POD-struct
object, but not at its beginning, as necessary to achieve appropriate
alignment. ]

The section above, especially with the (non-binding) note, pretty
clearly states that the C-style hack of inheritance may not work in
C++. There might be unnamed padding which differs between different POD
structs.

Frankly though, this entire thing is a mess. When you compare the
guarantees of the two quotes, /which appear right next to each other
in the standard/, I don't understand how you can reconcile them in a
sane implementation. So, when the POD types are members of a union,
there's no difference in padding bits, but when the same POD types are
not members of a union, there might be extra magical padding bits.
What?

The result of `offsetof` is necessarily determined by the type and member
inspected, so I don't believe that the offsets can be any different
depending on where an object is located and what it is a member of.

If we summarize the said paragraphs, my opinion is that one could conclude
that one can indeed access "A::x as B::x".

That is reading into the standard, and picking and choosing facts. I
view it as inconsistent. As the intent does not appear rather clear, I
would just stay clear of the whole thing entirely if at all possible.
If not possible, I would check with my compiler vendors if I could, or
at least confirm some assembly output.
 
A

Alf P. Steinbach /Usenet

* Joshua Maurice, on 13.09.2010 21:24:
That is reading into the standard, and picking and choosing facts.

No, that is a misleading statement.

There are no contrary facts.

You're effectively saying one can pick and choose facts in a discussion about
the flatness of the Earth. But one cannot. There are no contrary facts: the
Earth is round, or nearly so, it's not flat, and no facts support flatness.

I view it as inconsistent.

You'd have to point out an inconsistency first, in order to convince anybody.

As the intent does not appear rather clear, I
would just stay clear of the whole thing entirely if at all possible.

There's nothing unclear about the intent of the standard, as far as I can see.

If not possible, I would check with my compiler vendors if I could, or
at least confirm some assembly output.

Yes, that is a good idea.


Cheers & hth.,

- Alf
 
J

Joshua Maurice

* Joshua Maurice, on 13.09.2010 20:56:
Indeed. This is a well known bug in the C and C++ specs. James talks
about it else-thread, quoted here:

You're calling a compiler bug a bug in the standard?

Well you make me laugh.

Also, note that we were discussing James' statements. It's completely
unnecessary to quote it again except to put it in a misleading new context.

[snip requoting of up-thread message]
Then you would be alone on your interpretation. I don't have a C spec
handy, but for C++03, that is not the case.

You can assert how much you want that other people agree with you, that none of
them agree with me, and that what the standard says is an "interpretation".

It's similar to your earlier assertion that a compiler bug is really a bug in
the standard.

It's a stupid argument.

Could you please define compiler bug? I would loosely define it as a
compiler which does not follow its standard.

However, what if the standard is logically inconsistent? If the
standard is logically inconsistent, then I would not call any compiler
conforming nor non-conforming. One cannot abide by inconsistent rules,
and that is exactly the state of affairs with the C and C++ standards
with strict aliasing and unions. The rules are inconsistent, and
that's why I call it a bug in the standard.

Moreover, James has mentioned a possible preferred resolution of the C
committee, a resolution which makes the standard logically consistent,
and which also makes the gcc behavior talked about here conforming to
this new logically consistent standard.
This is what I can find on the subject in the standard itself.
C++03 standard, 9.2 Class members / 16
If a POD-union contains two or more POD-structs that share a common
initial sequence, and if the POD-union object currently contains one of
these POD-structs, it is permitted to inspect the common initial part
of any of them. Two POD-structs share a common initial sequence if
corresponding members have layout-compatible types (and, for
bit-fields, the same widths) for a sequence of one or more initial
members.
<<<<
The rule above only applies to PODs in a union.
C++03 standard, 9.2 Class members / 17
A pointer to a POD-struct object, suitably converted using a
reinterpret_cast, points to its initial
member (or if that member is a bit-field, then to the unit in which it
resides) and vice versa. [Note: There
might therefore be unnamed padding within a POD-struct object, but not
at its beginning, as necessary to
achieve appropriate alignment. ]
<<<<
The section above, especially with the (non-binding) note, pretty
clearly states that the C-style hack of inheritance may not work in
C++.

No, it does not. It says nothing of the sort, or even related to that.

I know it's rather pedantic, but it specifically allows reading and
writing to the common leading part of layout-compatible POD-structs
but only when they're both members of the same union. By specifically
restricting it to only in a union, it is also saying that it does not
work outside of a union. With this in mind, you would need to find
another explicit rule which specifically allows casting from POD-
struct type T1 to layout-compatible POD-struct type T2 and reading and
writing to the common leading part. Can you find such a thing? I
cannot.
We were discussing 3.10/15.

Then I am confused how you could make such a statement - you said "But
then there would have to be rules clarifying exactly how different the
pointer types would have to be to not alias", and exactly that is
specified in 3.10/15, which is why I brought it up (again).
 
J

Joshua Maurice

* Joshua Maurice, on 13.09.2010 21:24:


No, that is a misleading statement.

There are no contrary facts.

You're effectively saying one can pick and choose facts in a discussion about
the flatness of the Earth. But one cannot. There are no contrary facts: the
Earth is round, or nearly so, it's not flat, and no facts support flatness.

Bad analogy. Whether the Earth is round is a discussion of empirical
facts of the natural world, a scientific question, on which consensus
carries no weight, and only evidence matters. Our discussion is of
what a standard says. A standard is more like math (which is not a
science) and less like science. A standard defines the terms and the
universe. Whereas science describes the natural world, a standard like
C++ defines terms and whether or not parts of the natural world meet
those terms. I could define a standard where the word "up" means
"towards the center of the Earth", and I could evaluate whether or not
a plane's controls meet my standard. (Admittedly, such a standard
would be incredibly silly and mostly devoid of value.)
You'd have to point out an inconsistency first, in order to convince anybody.


There's nothing unclear about the intent of the standard, as far as I can see.

Hopefully we agree that the intent of 3.10 / 15 is to allow compilers
to optimize assuming that certain pointers do not alias. The note
there explicitly mentions as much. The rules given there effectively
say which pointers can and cannot alias. These rules are logically
consistent with the rest of the standard in large part. When obeying
the rest of the rules of the standard (excepting unions), you cannot
reach a state where two differently typed pointers (as defined by
3.10 / 15) can alias.

Except for unions. Unions break the rule, and they render the note in
3.10 / 15 "wrong". The explicit stated intent was to allow compilers
to optimize assuming certain pointers do not alias, but with unions,
any type of pointer can alias any other type of pointer. Thus we have
a logical inconsistency in the standard.
 
A

Alf P. Steinbach /Usenet

* Joshua Maurice, on 13.09.2010 21:45:
Bad analogy. Whether the Earth is round is a discussion of empirical
facts of the natural world, a scientific question, on which consensus
carries no weight, and only evidence matters. Our discussion is of
what a standard says. A standard is more like math (which is not a
science) and less like science. A standard defines the terms and the
universe. Whereas science describes the natural world, a standard like
C++ defines terms and whether or not parts of the natural world meet
those terms. I could define a standard where the word "up" means
"towards the center of the Earth", and I could evaluate whether or not
a plane's controls meet my standard. (Admittedly, such a standard
would be incredibly silly and mostly devoid of value.)


Hopefully we agree that the intent of 3.10 / 15 is to allow compilers
to optimize assuming that certain pointers do not alias.

No, we don't.

For C++98 the intent is explicitly stated in footnote 48.

Non-normative, but intents are non-normative: "The intent of this list is to
specify the circumstances in which an object may or may not be aliased".

There is not a word about reordering of statements or about /compiler's/
optimization.

James' failure (so far) to come up with a reference about reordering adds to that.

Furthermore, 3.10/15 explicitly supports the C hack (it says aliasing can occur
through "an aggregate or union that includes one of the aforementioned types"),
it does not exclude it -- so even if the argument about intent was not
directly countered by an explicit note (as it is), it would be pretty moot.

The note there explicitly mentions as much.

No, it does not. I quoted it above for your benefit. Anything you see there
about optimization or reordering is only in your /imagination/; it's not there.

The rules given there effectively
say which pointers can and cannot alias. These rules are logically
consistent with the rest of the standard in large part.

Yes.


When obeying
the rest of the rules of the standard (excepting unions), you cannot
reach a state where two differently typed pointers (as defined by
3.10 / 15) can alias.

That is incorrect.

Several examples of such aliasing have been given up-thread.

Gabriel's DR is another example.

Except for unions. Unions break the rule, and they render the note in
3.10 / 15 "wrong".

No, they don't: 3.10/15 has explicit language about aggregates and unions,
quoted above.

Please try to add some logical inferences.

Sheep are black (they aren't), therefore Nelson Mandela is king of France (he
isn't, and there's currently no such thing as king of France), doesn't carry
much weight, you see.

The explicit stated intent was to allow compilers
to optimize assuming certain pointers do not alias,

No, it was not.

Requoting the intent, note 48, for your benefit (non-normative, but intents are
non-normative): "The intent of this list is to specify the circumstances in
which an object may or may not be aliased".

Where you read "compiler" and "optimization", that's just your /imagination/.

but with unions,
any type of pointer can alias any other type of pointer. Thus we have
a logical inconsistency in the standard.

No, we don't.


Cheers & hth.,

- Alf
 
