A basic (?) problem with addresses (gcc)

BartC · Dec 16, 2010

Seebs said:
You don't. C doesn't support or allow for overlap like this, so far as I
know.

The Fortran example was to demonstrate what I meant. But in fact I use that
construction in another language, and it looks like this:

[20]int i
real a @ &i[7]

(although actual uses are usually more straightforward).

From time to time it might be necessary to rewrite that in C, and I relied
on the OP's trick to do the conversion. Now it seems that that may not work,
not because there's some weird hardware involved, but because some compilers
might exploit a 'loophole' in the standard so that they can get away with a
cheap optimisation, even though the hardware wouldn't have a problem with
it.

That sounds wrong. It also seems like something that C ought to be capable
of: directly specifying an address where you want something stored, even if
relative to the address of another object.

Jens Thoms Toerring · Dec 16, 2010

The Fortran example was to demonstrate what I meant. But in fact I use that
construction in another language, and it looks like this:

[20]int i
real a @ &i[7]

(although actual uses are usually more straightforward).

How does that mysterious language make that work on all architectures,
even on those where there are different alignment requirements for
ints and reals? I guess it will have to do the equivalent of calling
memcpy() to a temporary variable on those architectures for accesses
to 'a', doesn't it? If it gets it right everywhere then I would con-
sider it to be a higher-level language than C, at least in that res-
pect.

From time to time it might be necessary to rewrite that in C, and I relied
on the OP's trick to do the conversion. Now it seems that that may not work,
not because there's some weird hardware involved, but because some compilers
might exploit a 'loophole' in the standard so that they can get away with a
cheap optimisation, even though the hardware wouldn't have a problem with
it.

That sounds wrong. It also seems like something that C ought to be capable
of: directly specifying an address where you want something stored, even if
relative to the address of another object.

I guess you will agree that this kind of stuff is hardly ever
needed in any normal code (at least I can't remember ever ha-
ving had a reason to use the bit representation of a float as
an int and vice versa in 20 years of writing C, not even when
writing device drivers etc., and in the very few cases I nee-
ded to fiddle with the bit representation using memcpy() was
not that much of a bother). Thus I wouldn't consider it to be
"exploiting a loophole" by the compiler writers but merely
common sense on their part to use everything possible by which
they can speed up standard compliant code - that's what is ex-
pected of them (and some even get paid for it;-).
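
For reference, the memcpy() route mentioned above might look roughly like this. It is only a sketch: the helper names float_to_bits and bits_to_float are made up for illustration, and it assumes that float and uint32_t have the same size on the target.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy the object representation of a float into a
   32-bit unsigned integer. Assumes sizeof(float) == sizeof(uint32_t). */
uint32_t float_to_bits(float f)
{
    uint32_t bits;
    assert(sizeof bits == sizeof f);
    memcpy(&bits, &f, sizeof bits);   /* byte copy, no float lvalue read as int */
    return bits;
}

/* Hypothetical helper: the inverse, copying the bytes back into a float. */
float bits_to_float(uint32_t bits)
{
    float f;
    assert(sizeof f == sizeof bits);
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Since the bytes are copied rather than read through a pointer of the other type, the aliasing rules don't come into play, and compilers are typically able to reduce such a fixed-size memcpy() to a plain register move.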

And "directly specifying an object's address" can be done by use
of pointers. What might fail is using the pointers naively - to
allow that, the compilers would have to apply a lot of extra checks
and possibly insert additional code (equivalent to the memcpy()
you don't seem to like to put in yourself plus extra temporary
variables when the architecture requires it). That would make
the compilers slower and more difficult to write without any
real benefits I can see at the moment. I can even imagine (hy-
pothetical) cases where the compiler being "too clever" about
that might interfere with writing near-to-the-metal code, e.g.
accessing memory mapped devices. But if you can show some real-
life examples what all that would be essential for you might be
able to even convince the people on the standard committee to
change the standard in your sense, but "that sounds wrong" seems
a bit weak as an argument...
Regards, Jens

Joshua Maurice · Dec 17, 2010

You don't -- it violates one of the rules. Any attempt to do this
is *necessarily* undefined behavior.

The way you express that is with a union. Apart from that, it's undefined
behavior and you *can't* express it in plain C.

One of the points of using C, rather than assembly, is that the language
spec defines the language in a way that, at least a little, cares about
portability.

You might be able to fake something up by declaring things with "volatile"
somewhere in them, but...

Basically, if you are assuming you know the hardware has no problems with
this, you're not writing C, but a machine-specific variant which the
compiler may not support, and isn't obliged to.

I was wondering if you would give me your expert opinion on a very
related issue, namely the union DR and a related issue. I've had a
bunch of musing and questions up on comp.std.c++ for a while now, and
I've received no replies. It involves several problems inherited from
C, some of which are closely related to the strict aliasing rule and
the union DR.

http://groups.google.com/group/comp.std.c++/browse_thread/thread/69c81befe0cea264#

First post in the thread is:
Newsgroups: comp.std.c++
Followup-To: comp.std.c++
From: "Johannes Schaub (litb)" <[email protected]>
Date: Wed, 15 Sep 2010 17:26:01 CST
Local: Wed, Sep 15 2010 3:26 pm
Subject: Defect Report #1116: Comment about changes

Let me walk through a couple of examples, if I may.

#include <stdlib.h>
#include <stdio.h>
int main()
{
int* a = malloc(sizeof(int));
*a = 1;
printf("%d\n", *a);
free(a);
float* b = malloc(sizeof(float));
*b = 2;
printf("%f\n", *b);
free(b);
}

Presumably the above program is perfectly well defined, and it will
print something like:
1
2.000000
on all conforming implementations (barring QoI issues, like stack
space, out of memory, and so on).

However, suppose the malloc implementation ran such that a points to
the same memory address as b? How is this not a violation of the
strict aliasing rules? When does the lifetime of one object begin and
the lifetime of one object end? Is malloc and free special in this
regard? What if I used my own memory allocator, such as follows?

#include <stdlib.h>
#include <stdio.h>
#include <assert.h>

/* This is a very bad memory allocator.
It tracks only a single block of a fixed size.
It's not threadsafe.
It requires an explicit init and deinit.
It's global.
Let's ignore all that for now.*/
void* my_malloc_ptr;
int my_malloc_ptr_is_allocated_to_user;
void my_malloc_init()
{ assert( ! my_malloc_ptr);
my_malloc_ptr = malloc(1024);
my_malloc_ptr_is_allocated_to_user = 0;
}
void my_malloc_deinit()
{ assert(my_malloc_ptr);
assert( ! my_malloc_ptr_is_allocated_to_user);
free(my_malloc_ptr);
my_malloc_ptr = 0;
}
void* my_malloc()
{ assert(my_malloc_ptr);
if (my_malloc_ptr_is_allocated_to_user)
return 0;
my_malloc_ptr_is_allocated_to_user = 1;
return my_malloc_ptr;
}
void my_free(void* x)
{ assert(my_malloc_ptr);
assert(my_malloc_ptr_is_allocated_to_user);
my_malloc_ptr_is_allocated_to_user = 0;
}

int main()
{
my_malloc_init();
int* a = my_malloc();
*a = 1;
printf("%d\n", *a);
my_free(a);
float* b = my_malloc();
*b = 2;
printf("%f\n", *b);
my_free(b);
my_malloc_deinit();
}

Does the above program have undefined behavior in C? Can a user ever
write his own memory allocator in pure conforming C code? I access the
same piece of memory through sufficiently differently typed pointers,
so is that a violation of the strict aliasing rules or not?

Ok. Let's look at a couple of simpler examples.

#include <stdlib.h>
int main()
{
void* p;
int* x;
float* y;

p = malloc(sizeof(int) + sizeof(float));
x = p;
y = p;
*x = 1;
*y = 2;
return *y;
}

Does the above program have any undefined behavior? What about the
following program?

#include <stdlib.h>
void foo(int* x, float* y)
{
*x = 1;
*y = 2;
}
int main()
{
void* p;
int* x;
float* y;

p = malloc(sizeof(int) + sizeof(float));
x = p;
y = p;
foo(x, y);
return *y;
}

The problem I see is that the strict aliasing rule's intent AFAIK is
to allow a compiler to transform foo to:
void foo(int* x, float* y)
{
*y = 2;
*x = 1;
}
as an "int*" and a "float*" may not alias. However, this would break
the program from the read in main of "*y" which reads a memory
location through a float lvalue which was last written to through an
int lvalue. Hopefully you should see how this is related to the union
DR now, ex:

#include <stdlib.h>
void foo(int* x, float* y)
{
*x = 1;
*y = 2;
}
int main()
{
union { int x; float y; };
foo(&x, &y);
return y;
}

The linked thread in comp.std.c++ itself links to a DR resolution on
the c++ standard committee's website which IMHO does nothing to
address these problems.

In short: We want the compiler to be able to optimize assuming that
sufficiently differently typed pointers do not alias. This allowance
has been phrased as "You may not read an object through an lvalue of a
sufficiently different type." We must also support starting an
object's lifetime by writing to member sub-objects through an lvalue
of that member sub-object, such as:

#include <stdlib.h>
typedef struct T { int x; int y; } T;
int main()
{ T* t;
t = malloc(sizeof(T));
t->x = 1;
return t->x;
}

Taken together, this leads to the union DR (mentioned above), and the
highly related problem that any userland memory allocator appears to
at least run afoul of the union DR, and also possibly have undefined
behavior because a piece of memory may be treated as several
sufficiently different types while in the care of the userland memory
allocator.

Frankly, I don't see how you can get your cake and eat it too. You
could throw out the strict aliasing rule as an optimization allowance,
and then it all works, but that's definitely not preferred. I think
it's unacceptable to say that you can't have userland memory
allocators either. If we could start over, some language-special
syntax which takes a pointer and a type could signal the start and end
of object lifetimes, and with that I think everything would fall into
place - but we can't do that without breaking all existing C code, so
that's out. Something has to give, and I don't see what.

PS: There's a minor issue as well, namely the following program:

#include <stdlib.h>
typedef struct T { int x; int y; } T;
typedef struct U { int x; int y; } U;
int main()
{ void* v;
T* t;
U* u;
int* x;
int* y;

v = malloc(sizeof(T) + sizeof(U));
t = v;
u = v;

if (&t->y != &u->y)
return 1;

x = &t->x;
y = &t->y;

*x = 1;
*y = 2;
/* Ok. Do we have a T object or a U object?
Why? All I see are writes through int lvalues. */

x = &u->x; /* UB? */
return *x; /* UB? */
}

What do you want to make of it? IIRC, the intent was to disallow
reading a T object through a U lvalue, but I don't quite know what
that means when all of the reads and writes are made through sub-
object fundamental type lvalues, never aggregate lvalues in C. (The
situation is a little different for C++ due to virtual et al.) I think
we could solve this by saying that as long as all of the fundamental
writes and reads are consistent with a single complete object type,
then it's defined behavior. This punts a little bit though, as we need
to define when object lifetimes begin and end, which I expounded upon
at length above.

Joshua Maurice · Dec 17, 2010

ISO/IEC 9899:1999 (E)

6.2.4 Storage durations of objects
1 An object has a storage duration that determines its lifetime.
There are three storage durations: static, automatic, and
allocated. Allocated storage is described in 7.20.3.

7.20.3 Memory management functions

The lifetime of an allocated object extends
from the allocation until the deallocation.

So, malloc and free are special, and a user cannot write his own
memory allocator in purely conforming C code?

BartC · Dec 17, 2010

Jens Thoms Toerring said:
integer*4 i(20)
real*8 a
equivalence (a,i(7))

(So the 8 bytes at i(7..8) are shared with the floating point number.)

[20]int i
real a @ &i[7]

How does that mysterious language make that work on all architectures,
even on those where there are different alignment requirements for
ints and reals? I guess it will have to do the equivalent of calling
memcpy() to a temporary variable on those architectures for accesses
to 'a', doesn't it?

No. This is just a way of bypassing the usual mechanisms in the language.

In x86 assembler, you can do, for example:

fld qword [i+28]

to access i[7..8] as a 64-bit floating point value. Why shouldn't a
low-level language (ie. C) do the same without the bureaucracy of wrapping
things up in unions, or having to do a block transfer first?

And if the alignment is wrong, then it won't work. But some programmer trust
is needed. (On x86, any alignment will work, at some cost of efficiency.)
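
For comparison, one way to get the effect of "real a @ &i[7]" in portable C is to go through memcpy() on every access. This is only a sketch: load_double_at and store_double_at are hypothetical names, and the caller is assumed to make sure that sizeof(double) bytes starting at i[index] lie inside the array.

```c
#include <string.h>

/* Hypothetical helper: read the bytes starting at i[index] as a double.
   The caller must guarantee the whole double fits inside the array. */
double load_double_at(const int *i, size_t index)
{
    double a;
    memcpy(&a, &i[index], sizeof a);
    return a;
}

/* Hypothetical helper: overwrite the bytes starting at i[index] with the
   object representation of a. Same bounds requirement as above. */
void store_double_at(int *i, size_t index, double a)
{
    memcpy(&i[index], &a, sizeof a);
}
```

With int i[20], store_double_at(i, 7, 2.5) followed by load_double_at(i, 7) gives back 2.5, and no alignment assumption is needed because nothing is accessed through a double lvalue inside the int array.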

I guess you will agree that this kind of stuff is hardly ever
needed in any normal code (at least I can't remember ever ha-
ving had a reason to use the bit representation of a float as
an int and vice versa in 20 years of writing C, not even when
writing device drivers etc., and in the very few cases I nee-
ded to fiddle with the bit representation using memcpy() was
not that much of a bother).

This stuff *is* needed, otherwise you wouldn't have unions. I'm just
suggesting something less cumbersome and restrictive than unions.

change the standard in your sense, but "that sounds wrong" seems
a bit weak as an argument...

I don't even know *why* the OP's example is bad code. Doesn't C allow you to
cast a pointer to T, to a pointer to U?

This doesn't really give me much confidence to either write this stuff in C
directly, or to use C as a target language for a translator. (BTW how do
those Fortran-2-C translators deal with the equivalence problem?)

Nick Bowler · Dec 17, 2010

suppose the malloc implementation ran such that a points to the same
memory address as b? How is this not a violation of the strict
aliasing rules? When does the lifetime of one object begin and the
lifetime of one object end? Is malloc and free special in this regard?
What if I used my own memory allocator, such as follows?

The key difference is that allocated storage has no declared type. In
this case, the following applies (C99 6.5#6, emphasis mine):

If a value is stored into an object having no declared type
through an lvalue having a type that is not a character type,
then the type of the lvalue becomes the effective type of the
object **for that access** and for subsequent accesses that do
not modify the stored value.

So both your first and second examples are perfectly fine w.r.t strict
aliasing rules, annotations inline:

/* This is a very bad memory allocator. It tracks only a single block
of a fixed size. It's not threadsafe.
It requires an explicit init and deinit. It's global.
Let's ignore all that for now.*/
void* my_malloc_ptr;
int my_malloc_ptr_is_allocated_to_user; void my_malloc_init()
{ assert( ! my_malloc_ptr);
my_malloc_ptr = malloc(1024);

Here, the object pointed to by my_malloc_ptr (assuming that
allocation was successful) has no declared type, so its effective
type is determined by subsequent accesses as per 6.5#6.

my_malloc_ptr_is_allocated_to_user = 0;
} [...]
int main()
{
my_malloc_init();
int* a = my_malloc();
*a = 1;

For this assignment, the effective type of the allocated object is int,
so the access is ok.

printf("%d\n", *a);

The last assignment to the allocated object set its effective type to
int, so this access is also ok.

my_free(a);
float* b = my_malloc();
*b = 2;

Despite the earlier access as int, the effective type of the allocated
object is float for this assignment, so this access is ok.

printf("%f\n", *b);

The last assignment to the allocated object set its effective type to
float, so this access is also ok.

my_free(b);
my_malloc_deinit();
}

You would violate the strict aliasing rules if your program assigned to
the allocated object via b and then subsequently read from the allocated
object via a (or vice versa), but your program does not do this.

Seebs · Dec 17, 2010

to access i[7..8] as a 64-bit floating point value. Why shouldn't a
low-level language (ie. C) do the same without the bureaucracy of wrapping
things up in unions, or having to do a block transfer first?

Because allowing you to do that makes a huge number of important
optimizations impossible. Consider that I've never, in twenty-some years
of using C, had any desire at all to do this thing. I've not yet seen
a single example of a case in which doing this would even be useful in
any way, just examples of how one would declare it in a language which does
it.

On the other hand, we could slow every real-world program down a few percent
in exchange for the ability to do this thing that I've never seen anyone
suggest a reason to do.

This stuff *is* needed, otherwise you wouldn't have unions.

Not obvious at all. One of the things unions get used for is cases where
you want to hold one of several things, but you don't know which in advance.
But you don't necessarily want to be able to store through one and read
through another.

I'm just
suggesting something less cumbersome and restrictive than unions.

It sounds very cumbersome and restrictive from the standpoint of an
implementor trying to decide whether to generate extremely slow memory
reads from an object that in theory shouldn't be getting modified.

I don't even know *why* the OP's example is bad code. Doesn't C allow you to
cast a pointer to T, to a pointer to U?

Sure. But it doesn't allow you to access something through a pointer of
the wrong sort.

This doesn't really give me much confidence to either write this stuff in C
directly, or to use C as a target language for a translator. (BTW how do
those Fortran-2-C translators deal with the equivalence problem?)

No clue. Probably memcpy.

-s

BartC · Dec 17, 2010

Seebs said:
It sounds very cumbersome and restrictive from the standpoint of an
implementor trying to decide whether to generate extremely slow memory
reads from an object that in theory shouldn't be getting modified.

This is the original code:

float x = 4.3;
int y;

y = *(int*)&x; /* copying of 4 bytes to int */
x = *(float*)&y; /* and back to float */

printf("x=%f\n",x); /* 4.3 expected here */
printf("y=%d\n",y);

And these are the wrong results:

x=-167393728255558576456059409056680378368.000000
y=134513634

x was originally 4.3 and so has been modified. We don't know what y was, so
perhaps it hasn't been written to. But why not? An assignment to it appears
in the code.

If it has been written to (initialising y to 0 would have been better), then
how did that rogue value get there? What were they trying to optimise?

Seebs · Dec 17, 2010

This is the original code:

I was talking about your more generic equivalence with arbitrary
overlap but...

If it has been written to (initialising y to 0 would have been better), then
how did that rogue value get there? What were they trying to optimise?

The general case is this:

int foo(int *a, float *b) {
float x = *b;
*a = 3;
x = *b;
return (int)x; /* use x so the function actually returns a value */
}

Can the second read from *b be optimized away? In C, yes. In your
hypothesized language, no.
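
To spell that out, here is a sketch of the two forms. The names reread and cached are invented for illustration, and the second function is only what a compiler is permitted to produce from the first, not what any particular compiler is known to emit.

```c
/* As written in the source: *b is read again after the store through a. */
float reread(int *a, float *b)
{
    float x = *b;
    *a = 3;
    x = *b;        /* second read of *b */
    return x;
}

/* What the compiler may generate instead: since an int lvalue is not
   allowed to modify a float object, the stored value of *b cannot have
   changed, so the first read can be reused. */
float cached(int *a, float *b)
{
    float x = *b;
    *a = 3;
    return x;
}
```

For any call where a and b point to genuinely distinct int and float objects, the two functions return the same value. They can only differ when the caller has already broken the aliasing rules.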

-s

Jens Thoms Toerring · Dec 17, 2010

BartC said:
Jens Thoms Toerring said:

integer*4 i(20)
real*8 a
equivalence (a,i(7))

(So the 8 bytes at i(7..8) are shared with the floating point number.)
[20]int i
real a @ &i[7]

How does that mysterious language make that work on all architectures,
even on those where there are different alignment requirements for
ints and reals? I guess it will have to do the equivalent of calling
memcpy() to a temporary variable on those architectures for accesses
to 'a', doesn't it?

No. This is just a way of bypassing the usual mechanisms in the language.

In x86 assembler, you can do, for example:

fld qword [i+28]

to access i[7..8] as a 64-bit floating point value. Why shouldn't a
low-level language (ie. C) do the same without the bureaucracy of wrapping
things up in unions, or having to do a block transfer first?

And if the alignment is wrong, then it won't work. But some programmer trust
is needed. (On x86, any alignment will work, at some cost of efficiency.)

This stuff *is* needed, otherwise you wouldn't have unions. I'm just
suggesting something less cumbersome and restrictive than unions.

I don't even know *why* the OP's example is bad code. Doesn't C allow you to
cast a pointer to T, to a pointer to U?

This doesn't really give me much confidence to either write this stuff in C
directly, or to use C as a target language for a translator. (BTW how do
those Fortran-2-C translators deal with the equivalence problem?)

Let's distinguish two things here: the OP's program didn't work
as expected with a certain version of gcc with a certain optimi-
zation. I can reproduce the problem with gcc 4.3.2 (on 64-bit x86)
and with '-O2' and higher but not with gcc 4.4.3 under otherwise
identical conditions. Now '-O2' is documented to include the
'-fstrict-aliasing' option. And about this the compiler documen-
tation states:

`-fstrict-aliasing'
Allows the compiler to assume the strictest aliasing rules
applicable to the language being compiled. For C (and C++), this
activates optimizations based on the type of expressions. In
particular, an object of one type is assumed never to reside at
the same address as an object of a different type, unless the
types are almost the same. For example, an `unsigned int' can
alias an `int', but not a `void*' or a `double'. A character type
may alias any other type.

If you compile with that option and also '-Wall' you get warned
about possible problems with breaking strict aliasing rules. And
if you specifically switch off that optimization, using the
'-fno-strict-aliasing' option, then the program behaves again
as expected by the OP, even with gcc 4.3.2 (at least on my
machine). So, by asking for '-fstrict-aliasing' (though only
indirectly via '-O2') the OP made the compiler believe that
certain conditions would be satisfied by his program but which
wasn't the case.

Now, I would think that was a bit unlucky and the fact that the
problem doesn't seem to show up with gcc 4.4.3 anymore might be
taken as an indication that the compiler writers also thought
that they went a tiny bit over the top with that and now found
a way to avoid that kind of situation. But I don't think we
can blame them for shoddy work since the program didn't follow
the stated rules for use of '-fstrict-aliasing'.

The other aspect is the question what kind of casts are required
to "work" according to the standard. And what I found (in 3.3.4
in the C89 standard) is the following:

A pointer to an object or incomplete type may be converted to
a pointer to a different object type or a different incomplete
type. The resulting pointer might not be valid if it is impro-
perly aligned for the type pointed to. It is guaranteed, how-
ever, that a pointer to an object of a given alignment may be
converted to a pointer to an object of the same alignment or a
less strict alignment and back again; the result shall compare
equal to the original pointer.

So a cast from a pointer to an object of type T to a pointer
to an object of a different type U might result in an invalid
pointer under the stated conditions, i.e. if the alignment
requirements of U are more strict than those of T. In that
case only the union-trick will do (see below).
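
As an aside, a check for the alignment condition could be sketched like this. It assumes a C11 compiler (for _Alignof) and an implementation where converting a pointer to uintptr_t yields the numeric address, which is common but implementation-defined; is_aligned_for_double is a made-up name.

```c
#include <stdint.h>

/* Hypothetical helper: returns nonzero if p is suitably aligned to be
   converted to double*. Relies on the pointer-to-integer conversion
   reflecting the address, which the standard does not guarantee. */
int is_aligned_for_double(const void *p)
{
    return (uintptr_t)p % _Alignof(double) == 0;
}
```

Passing this check only means the converted pointer is valid as a pointer. It says nothing about whether accessing the pointed-to object through it is allowed.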

The rationale for the way things are is probably rather
obvious - to allow implementations of C on all kinds of
architectures the standard can't make too harsh demands.
Since there are architectures with different alignment
requirements, demanding more relaxed ones would make writing
a compiler much more difficult without any true benefits
I can see at the moment.

In that sense I would consider the behavior of gcc 4.3.2 with
'-fstrict-aliasing' as not being standard compatible had the
documentation not at the same time warned about this fact -
the fix for the problem is thus not to use '-fstrict-aliasing'
under these conditions when one wants a fully standard com-
pliant compiler.

The other question is if the code might be considered broken.
The code was "well-formed" for a x86 (and perhaps a number of
other architectures) where there can't be alignment troubles.
On the other hand it was never clearly stated that this code
was intended for x86 only (the architecture wasn't ever mentioned),
and for some other architectures it must be considered "broken",
i.e. those where a float has stricter alignment requirements
than an int. And, moreover, since we're
here in clc and not a group for low level X86 programming I
think I can stand by calling it "broken". But then I have
been forced too often to deal with code that had trouble with
alignments since the writers obviously weren't even aware that
those issues exist and seemed to blissfully assume that all
machines have a x86 processor, so I may be a bit oversensitive
over this issue;-)

Now concerning unions. I don't think that unions were meant
specifically for that kind of stuff. You can do such things
with them because you then have at least the guarantee that
the union will be aligned in a way that all members can be
accessed correctly (i.e. the alignment of the union must be
suitable for the member with the strictest alignment require-
ments). But I would think that unions are meant to allow having
the same storage area for different types of data, and you're
expected not to use a member of type A when you have stored
a value via a member of type B. That this might work (for
whatever "work" might mean) as long as the size of A and B
are the same I would consider to be not more than a side
effect, perhaps not even intended as such.

Finally, how a FORTRAN-to-C translator will deal with that I
don't know since I don't even know how FORTRAN handles such
issues - perhaps such EQUIVALENCE stuff will also only work
when there is proper alignment (some infos I found on the net
hinted in that direction, but I didn't find anything really
definitive). If I were to write a FORTRAN-to-C translator
and the specifications were that this kind of stuff works
under any circumstances I don't see much of an alternative
to using the union trick where possible and otherwise using
temporary variables of the correct types and memcpy() to
and from them on each access to the aliased variables. But
I hope it won't come to that;-)
Regards, Jens

Joshua Maurice · Dec 17, 2010

suppose the malloc implementation ran such that a points to the same
memory address as b? How is this not a violation of the strict
aliasing rules? When does the lifetime of one object begin and the
lifetime of one object end? Is malloc and free special in this regard?
What if I used my own memory allocator, such as follows?

The key difference is that allocated storage has no declared type. In
this case, the following applies (C99 6.5#6, emphasis mine):

If a value is stored into an object having no declared type
through an lvalue having a type that is not a character type,
then the type of the lvalue becomes the effective type of the
object **for that access** and for subsequent accesses that do
not modify the stored value.

So both your first and second examples are perfectly fine w.r.t strict
aliasing rules, annotations inline:

/* This is a very bad memory allocator. It tracks only a single block
of a fixed size. It's not threadsafe.
It requires an explicit init and deinit. It's global.
Let's ignore all that for now.*/
void* my_malloc_ptr;
int my_malloc_ptr_is_allocated_to_user; void my_malloc_init()
{ assert( ! my_malloc_ptr);
my_malloc_ptr = malloc(1024);

Here, the object pointed to by my_malloc_ptr (assuming that
allocation was successful) has no declared type, so its effective
type is determined by subsequent accesses as per 6.5#6.

my_malloc_ptr_is_allocated_to_user = 0;
} [...]
int main()
{
my_malloc_init();
int* a = my_malloc();
*a = 1;

For this assignment, the effective type of the allocated object is int,
so the access is ok.

printf("%d\n", *a);

The last assignment to the allocated object set its effective type to
int, so this access is also ok.

my_free(a);
float* b = my_malloc();
*b = 2;

Despite the earlier access as int, the effective type of the allocated
object is float for this assignment, so this access is ok.

printf("%f\n", *b);

The last assignment to the allocated object set its effective type to
float, so this access is also ok.

my_free(b);
my_malloc_deinit();
}

You would violate the strict aliasing rules if your program assigned to
the allocated object via b and then subsequently read from the allocated
object via a (or vice versa), but your program does not do this.

That seems like the sensible interpretation. However, you didn't address
the interesting followup for this interpretation. Let me post it again:

/* Program 1 */
#include <stdlib.h>
int main()
{
void* p;
int* x;
float* y;
p = malloc(sizeof(int) + sizeof(float));
x = p;
y = p;
*x = 1;
*y = 2;
return *y;
}

/* Program 2 */
#include <stdlib.h>
void foo(int* x, float* y)
{
*x = 1;
*y = 2;
}
int main()
{
void* p;
int* x;
float* y;
p = malloc(sizeof(int) + sizeof(float));
x = p;
y = p;
foo(x, y);
return *y;
}

This problem is otherwise known as the union DR. Is the compiler
allowed to change foo to:

void foo(int* x, float* y)
{
*y = 2;
*x = 1;
}

This happens in gcc all the time, for more complex code. gcc
frequently assumes that int* and float* do not alias - unless there is
something in scope that could make them alias. It uses some mostly (?)
undocumented and entirely non-standardized rules to determine when
sufficiently differently typed pointer may alias and may not alias.

As written, both of my examples follow your rule, the rule that
there shall be no read on a piece of storage through type T which
reads the result of a write of a sufficiently different type U.
However, if the compiler is able to assume that sufficiently
differently typed pointers don't alias, then it can take a conforming
program and break it - specifically, in the example above, Program 2
with the reordering of foo.

Joshua Maurice · Dec 17, 2010

Yes.

K&R2 8.7 describes how to make a storage allocator,
but it's for UNIX systems.

C89 was before they realised that "allocated"
was a kind of duration.

ISO/IEC 9899: 1990
6.1.2.4 Storage durations of objects
An object has a storage duration that determines its lifetime.
There are two storage durations: static and automatic.

This is an entirely consistent approach, I agree. You treat malloc and
free as special, despite the fact that common implementations IIRC do
nothing special with malloc and free - they exist purely as userland
libraries with little to no special compiler support. It's consistent,
but wholly unsatisfactory, and possibly (?) inconsistent with
practice at large.

I assume that plenty of people have written their own memory
allocators, and they expect that they should work. At least, I guess.
Am I right? Do most C programmers agree that they can't write a
conforming memory allocator in pure C code on top of malloc? I'm not
very knowledgeable in this area, but it just seems so contrary to the
practice which I've seen.

Nick Bowler · Dec 17, 2010

The other aspect is the question what kind of casts are required to
"work" according to the standard. And what I found (in 3.3.4 in the C89
standard) is the following:

A pointer to an object or incomplete type may be converted to a
pointer to a different object type or a different incomplete type.
The resulting pointer might not be valid if it is improperly aligned
for the type pointed to. It is guaranteed, however, that a pointer
to an object of a given alignment may be converted to a pointer to an
object of the same alignment or a less strict alignment and back
again; the result shall compare equal to the original pointer.

So a cast from a pointer to an object of type T to a pointer to an
object of a different type U might result in an invalid pointer under
the stated conditions, i.e. if the alignment requirements of U are more
strict than those of T. In that case only the union-trick will do (see
below).

Note that the effective type rules (i.e., -fstrict-aliasing) do not
prevent you from converting pointers between two types with the same
alignment. The rules only apply when you dereference such a pointer.
So -fstrict-aliasing doesn't violate the above C89 requirements.
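
The difference between converting a pointer and dereferencing it can be sketched as follows. The function names `roundtrip_preserved` and `first_byte` are made up for this illustration. The first converts an `int *` to `unsigned char *` (a type with the least strict alignment) and back, which the quoted text guarantees compares equal. The second reads through a character type, which the aliasing rules always allow.

```c
/* Convert int* -> unsigned char* -> int* and report whether the result
   compares equal to the original. Nothing is dereferenced here. */
int roundtrip_preserved(int *p)
{
    unsigned char *bytes = (unsigned char *)p;  /* less strict alignment */
    int *back = (int *)bytes;                   /* and back again */
    return back == p;
}

/* Reading an object's bytes through unsigned char is always permitted. */
unsigned char first_byte(const int *p)
{
    return *(const unsigned char *)p;
}
```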

In that sense I would consider the behavior of gcc 4.3.2 with
'-fstrict-aliasing' as not being standard compatible had the
documentation not at the same time warned about this fact - the fix
for the problem is thus not to use '-fstrict-aliasing' under these
conditions when one wants a fully standard compliant compiler.

The OP's posted code has undefined behaviour in C99 because it violates
the "shall" requirement in 6.5#7. This is a requirement not in C89, but
as I don't have a copy handy, I can't say whether or not gcc's
-fstrict-aliasing option renders the compiler non-conforming to C89.

It would be interesting to know if there is a strictly conforming C89
program that both (a) violates no constraints of C99, and (b) has
undefined behaviour in C99 due to the new rules about effective types.

Now concerning unions. I don't think that unions were meant specifically
for that kind of stuff.

Yes and no. C99 adopted new wording which comes just shy of officially
blessing this use: in C89, the results of reading a union member other
than the last stored member were undefined. C99 drops this text and
instead says that assigning to a union member causes the bytes of the
object representation of other members to become unspecified.

Furthermore, unions are explicitly exempt from the aliasing problems
which the OP experienced.

So while no strictly conforming C99 program can use unions in this
fashion, and while the DS9K might ensure that trap representations occur
whenever possible, it's clear that the authors of the C99 standard
intended for this use of unions to be available. It is even explicitly
mentioned in a (non-normative) footnote.
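
A minimal sketch of the union use being described, assuming `float` and `uint32_t` have the same size (checked at compile time below). The names `float_bits` and `float_to_bits` are hypothetical. As noted above, the result depends on the object representation, so it is not something a strictly conforming program can rely on.

```c
#include <stdint.h>

/* Compile-time check of the size assumption: the array size is negative,
   and compilation fails, if the sizes differ. */
typedef char float_is_32_bits[sizeof(float) == sizeof(uint32_t) ? 1 : -1];

union float_bits {
    float f;
    uint32_t u;
};

/* Returns the object representation of value, reinterpreted as uint32_t. */
uint32_t float_to_bits(float value)
{
    union float_bits pun;
    pun.f = value;   /* store through one member */
    return pun.u;    /* read through the other */
}
```

On an implementation that uses IEEE 754 single precision for `float`, `float_to_bits(1.0f)` yields `0x3f800000`.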

BartC · Dec 18, 2010

Seebs said:
The general case is this:

int foo(int *a, float *b) {
    float x = *b;
    *a = 3;
    x = *b;
}

Can the second read from *b be optimized away? In C, yes. In your
hypothesized language, no.

OK, thanks, this makes it clearer (although it's still not clear why the
assignment to y was optimised out; it looks like the compiler didn't even
attempt to perform one assignment, let alone two, for reasons of its own).

Does that also mean the x=*b assignment *cannot* be optimised out here:

int foo(int *a, int *b) {
    int x = *b;
    *a = 3;
    x = *b;
}

Nick Bowler · Dec 18, 2010

That seems like the sensible interpretation. However, you didn't address the
interesting followup for this interpretation. Let me post it again:

I admit that I replied hastily and only skimmed the last half of your
post. Let's look at the first program:

/* Program 1 */
#include <stdlib.h>
int main()
{
    void* p;
    int* x;
    float* y;
    p = malloc(sizeof(int) + sizeof(float));
    x = p;
    y = p;

No access to the allocated storage has occurred yet, so all the above is ok
(modulo allocation failures).

    *x = 1;

Effective type is int for the above assignment.

    *y = 2;

That didn't last long; effective type is float for this assignment.

    return *y;

Access of the allocated object with effective type float via lvalue of type
float is OK.

}

So there are no aliasing problems in the above. Let's look at the other
program:

/* Program 2 */
#include <stdlib.h>
void foo(int* x, float* y)
{
    *x = 1;
    *y = 2;
}
int main()
{
    void* p;
    int* x;
    float* y;
    p = malloc(sizeof(int) + sizeof(float));
    x = p;
    y = p;
    foo(x, y);
    return *y;
}

This is exactly the same as program #1; you've simply moved the accesses
into a function. The wording in the standard does not seem to care
about *where* in the program the various accesses to an object occur,
only *when*.
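
Putting the annotated fragments of Program 1 back together gives the sketch below. The function name `program1`, the allocation-failure check, and the call to `free` are additions for illustration. Under the reading given above, each store changes the effective type of the allocated storage, so the final read through a `float` lvalue matches the most recent store.

```c
#include <stdlib.h>

/* Program 1 wrapped in a function (the name is hypothetical).
   Returns 2, or -1 if the allocation fails. */
int program1(void)
{
    void *p = malloc(sizeof(int) + sizeof(float));
    int *x;
    float *y;
    int result;

    if (p == NULL)
        return -1;
    x = p;
    y = p;
    *x = 1;            /* effective type of the storage becomes int */
    *y = 2;            /* effective type is now float */
    result = (int)*y;  /* read via a float lvalue: matches the effective type */
    free(p);
    return result;
}
```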

This problem is otherwise known as the union DR. Is the compiler allowed
to change foo to:

void foo(int* x, float* y)
{
    *y = 2;
    *x = 1;
}

I believe that the above reordering, as written, is not allowed, for
reasons I have already stated. However, if we change foo ever so
slightly:

int bar(int *x, float *y)
{
    *x = 1;
    *y = 2;

    return *x;
}

Now the implementation _can_ assume that *x and *y do not alias the
same object, because if they did, the second reference to *x would
violate the effective type rules and thus program behaviour is
undefined. So the compiler is well within its rights to replace the
second *x with a constant 1 or even to re-order the stores to *x and *y!
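
For comparison, here is a sketch of a call that stays within the rules: `bar` is given two distinct objects, so replacing the final `*x` with the constant 1 changes nothing observable. The wrapper name `call_bar_distinct` is hypothetical.

```c
/* bar as given above. */
int bar(int *x, float *y)
{
    *x = 1;
    *y = 2;

    return *x;
}

/* Calls bar with two separate objects, so *x and *y cannot alias. */
int call_bar_distinct(void)
{
    int i = 0;
    float f = 0.0f;

    return bar(&i, &f);
}
```

Passing the same malloc'd storage for both parameters is the case in which the final read of `*x` would violate the effective type rules.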

This happens in gcc all the time, for more complex code. gcc frequently
assumes that int* and float* do not alias - unless there is something in
scope that could make them alias. It uses some mostly (?) undocumented
and entirely non-standardized rules to determine when sufficiently
differently typed pointers may alias and may not alias.

Be sure that "more complex code" does not involve further references
like my above example. I agree that GCC's documentation is not very
specific in this regard.

As written, both of my examples follow your rule, the rule that
there shall be no read on a piece of storage through type T which reads
the result of a write of a sufficiently different type U. However, if
the compiler is able to assume that sufficiently differently typed
pointers don't alias, then it can take a conforming program and break it

Right, I think that _only_ taking the types into account (without also
considering the contexts of the references) is insufficient for a
conforming implementation.

Nick Bowler · Dec 18, 2010

OK, thanks, this makes it clearer (although it's still not clear why the
assignment to y was optimised out; it looks like the compiler didn't
even attempt to perform one assignment, let alone two, for reasons of
its own).

Analyzing why programs do what they do in the presence of undefined
behaviour is generally not productive.

Nevertheless, most likely what the compiler has optimized away is not
the assignments themselves, but rather it has optimized away stores of
some CPU registers to memory that would be required for the subsequent
accesses to make sense. Instead of storing the object representation
of an int to memory and then re-interpreting those bytes as a float, the
store has been optimized away and we just end up with garbage bytes
being re-interpreted as a float.

These are exactly the kind of optimizations that the aliasing rules in
C99 are designed to permit: by allowing the compiler to make assumptions
that certain references cannot alias the same object, the compiler can
emit code which keeps more things in CPU registers for longer, avoiding
(generally very expensive in comparison) memory accesses.
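
When the bytes of one type really do need to be reinterpreted as another, copying them with `memcpy` is a commonly used alternative that stays within the effective type rules, because the copy is defined in terms of character-type accesses. Below is a minimal sketch of BartC's overlay case (a real stored inside an int array), assuming `float` is no larger than `int`. The function names are hypothetical.

```c
#include <string.h>

/* Compile-time check of the size assumption: the array size is negative,
   and compilation fails, if float does not fit in an int. */
typedef char float_fits_in_int[sizeof(float) <= sizeof(int) ? 1 : -1];

/* Store the bytes of a float into the int cell that slot points to. */
void store_float_in_int(int *slot, float value)
{
    memcpy(slot, &value, sizeof value);
}

/* Copy those bytes back out into a float object and return it. */
float load_float_from_int(const int *slot)
{
    float value;
    memcpy(&value, slot, sizeof value);
    return value;
}
```

A usage sketch: given `int i[20]`, calling `store_float_in_int(&i[7], 2.5f)` followed by `load_float_from_int(&i[7])` returns the stored value, and neighbouring cells are not touched. Reading `i[7]` as an `int` afterwards yields whatever the float's bytes happen to mean as an int.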

Seebs · Dec 18, 2010

Does that also mean the x=*b assignment *cannot* be optimised out here:

int foo(int *a, int *b) {
    int x = *b;
    *a = 3;
    x = *b;
}

Right!

Which is why "restrict" was useful enough to justify specifying.

-s
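
A sketch of what `restrict` adds for the int/int case. The function name `foo_restrict` and the added `return` are for illustration only. With both parameters restrict-qualified, the caller promises that the two pointers do not refer to the same object, so the second read of `*b` may reuse the value already loaded.

```c
/* Hypothetical variant of foo with restrict-qualified parameters (C99).
   Calling it with a == b would have undefined behaviour. */
int foo_restrict(int *restrict a, int *restrict b)
{
    int x = *b;
    *a = 3;
    x = *b;   /* may reuse the earlier load: *a is promised not to alias *b */
    return x;
}
```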

Joshua Maurice · Dec 18, 2010

I believe that the above reordering, as written, is not allowed, for
reasons I have already stated. However, if we change foo ever so
slightly:

int bar(int *x, float *y)
{
    *x = 1;
    *y = 2;

    return *x;
}

Now the implementation _can_ assume that *x and *y do not alias the
same object, because if they did, the second reference to *x would
violate the effective type rules and thus program behaviour is
undefined. So the compiler is well within its rights to replace the
second *x with a constant 1 or even to re-order the stores to *x and *y!

Interesting. I didn't consider this case at all. I was about to ask
what actual non-trivial optimizations the compiler can perform through
your aliasing rules if it can't reorder my foo. I read your reply
again, and I realized that this example is a perfect example. *x and
*y may refer to the same memory location, but if they do, then your
foo has undefined behavior even before the compiler attempts any
reordering (quote unquote - I'll gloss over register allocation et
al.). Because if they alias, then foo has undefined behavior before
any reordering (quote unquote), the compiler is free to assume that
they don't, and thus it's free to reorder them (quote unquote).

Thank you.

Now, I just have to figure out if this is what most compiler writers,
most of the standard writers, and most competent practitioners
interpret the rules to mean. Your rules are sensible, but I don't know
if that's the actual common interpretation.

For example, I've already had a reply else-thread from pete
which states that:
1- All of the programs we've been discussing have undefined behavior
because we've used a single piece of storage as two or more
sufficiently different object types.
2- malloc and free are special with regards to the language. Once a
piece of memory has been free'd and (re)malloc'ed, you can treat that
piece of memory as a different object type, but only once per
allocation. A corollary is that you cannot write a conforming memory
allocator in pure C on top of malloc, nor mmap, etc.

I'm concerned that there is no consensus because no one has yet
replied to my comp.std.c++ thread, and because the proposed resolution
on the C++ standards committee website directly contradicts your
(Nick Bowler's) interpretation - it's a third interpretation, though
it's closer to pete's than Nick's.

Tim Rentsch · Jan 2, 2011

Seebs said:
It tried to read something through an lvalue of the wrong type. Ultimately,
this violates the strict aliasing rules; the compiler is allowed to ignore
the reference or do anything it wants with it.

You mean effective type rules. The term 'strict aliasing' is a gcc-ism
and independent of the C Standard.

Tim Rentsch · Jan 2, 2011

C89. The rules were changed in C99 to bless what everyone expected and
all known implementations did anyway.

Looks more like a clarification of what was intended than an actual rule
change.
