qarnos said:
I just have a quick question for people more familiar with the C
standards than myself.
If I have a union with an anonymous struct, as follows:
union my_union
{
    unsigned int ccount[2];
    struct
    {
        unsigned int rcount;
        unsigned int lcount;
    };
};
Am I guaranteed that ccount[0] will map to rcount and ccount[1] to
lcount? Or is the compiler allowed to re-order the struct members?
If you use a conforming compiler, your only guarantee is that you'll
get a diagnostic message. If you don't get one, your compiler is not
conforming, or at least you didn't run it in a conforming mode. (gcc
is not conforming by default; try "-ansi -pedantic".)
Standard C (C90 and C99) does not allow anonymous struct members; C11
later added them.
But let's make it non-anonymous (that wasn't your main point anyway):
union my_union {
    unsigned int ccount[2];
    struct {
        unsigned int rcount;
        unsigned int lcount;
    } foo;
};
union my_union obj;
The standard guarantees that members of a struct (other than bit
fields) are laid out in the order in which they're declared, that the
first member of a struct is at offset 0, and that each member of a
union is at offset 0. Thus obj.ccount[0] and obj.foo.rcount are
guaranteed to occupy the same location.
Compilers are allowed to insert arbitrary padding between struct
members and/or after the last member. Normally this is done for
alignment purposes, but the standard doesn't restrict it; a perverse
compiler could insert as much padding as it likes. I don't think it's
possible for padding between rcount and lcount to be necessary for
alignment purposes, so obj.ccount[1] and obj.foo.lcount almost
certainly occupy the same location, but the standard doesn't actually
guarantee it.
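If you'd rather check the layout than trust it, something like the following rough sketch should do. It reuses the named-member union from above; layout_matches is a made-up helper name, not anything standard. It compares the offset of foo.lcount against sizeof(unsigned int), which is where ccount[1] has to be, since array elements are contiguous.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, size_t */

union my_union {
    unsigned int ccount[2];
    struct {
        unsigned int rcount;
        unsigned int lcount;
    } foo;
};

/* Hypothetical helper: returns 1 if foo.lcount starts at the same offset
   as ccount[1], i.e. there is no padding between rcount and lcount. */
int layout_matches(void)
{
    /* Array elements are contiguous, so ccount[1] is at sizeof(unsigned int). */
    size_t ccount1_off = sizeof(unsigned int);
    size_t lcount_off  = offsetof(union my_union, foo.lcount);

    /* Guaranteed by the standard: the first struct member is at offset 0. */
    assert(offsetof(union my_union, foo.rcount) == 0);

    return lcount_off == ccount1_off;
}
```

On any implementation I'd expect to meet, layout_matches() returns 1, but that is an observation about real compilers, not something the standard promises.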
Furthermore, though unions are commonly used to treat a given chunk of
memory as if it were of two different types, the standard doesn't
actually support this usage except in a few cases. Storing a value in
one member of a union and then reading a value from another member is,
in most cases, undefined behavior. It's a common enough usage that
any compiler will probably let you get away with it, but even if
obj.ccount[0] and obj.foo.rcount occupy the same location, an
optimizing compiler could theoretically rearrange the code so that it
doesn't behave that way. For example:
    int n = 42;
    printf("%d\n", n);
    /* The generated code could use a literal 42 rather than
       re-loading the value of n */

    /* declarations as above */
    obj.foo.rcount = 42;
    obj.ccount[0] = 137;
    printf("%u\n", obj.foo.rcount);
    /* The generated code could use a literal 42 rather than
       re-loading the value of obj.foo.rcount. Since the value must
       be 42 unless you've done something that invokes undefined
       behavior, this is a valid optimization. */
*But* there's a lot of code out there that does this kind of thing,
even though the standard doesn't support it, and it's unlikely that a
compiler vendor is going to break such code.
Having said all that, there is a way to do what you want that's fully
supported by the standard:
struct my_struct {
    unsigned int ccount[2];
};
#define rcount ccount[0]
#define lcount ccount[1]
struct my_struct obj;
Now obj.rcount actually *means* obj.ccount[0], and obj.lcount means
obj.ccount[1].
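Here is a small sketch of how that looks in use. The helpers set_counts and total_count are names I made up for illustration. Note that the parameters are called r and l on purpose: the macros rewrite every later occurrence of the identifiers rcount and lcount in the translation unit, not just member accesses.

```c
#include <assert.h>
#include <stddef.h>   /* NULL */

struct my_struct {
    unsigned int ccount[2];
};
#define rcount ccount[0]
#define lcount ccount[1]

/* Hypothetical helper: stores both counters through the macro names. */
void set_counts(struct my_struct *s, unsigned int r, unsigned int l)
{
    assert(s != NULL);
    s->rcount = r;   /* expands to s->ccount[0] = r */
    s->lcount = l;   /* expands to s->ccount[1] = l */
}

/* Hypothetical helper: reads the same two objects through the array. */
unsigned int total_count(const struct my_struct *s)
{
    assert(s != NULL);
    return s->ccount[0] + s->ccount[1];
}
```

Since both spellings name the very same array elements, there is no union, no padding question, and no aliasing question. The cost is the usual one for object-like macros: rcount and lcount can no longer be used as ordinary identifiers after the #define lines.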