(This deserves a separate thread, but since I asked the above
question here, I'll continue here too.)
As I understand it, the expression constitutes an access to the
structure member.
1. Does a member access constitute an access to the *whole* structure?
eg.:
The term 'access' is really only used in the C standard in conjunction
with the volatile qualifier, where the wording is unfortunately vague
enough that it can be construed several different ways.
struct A { int i; int _i; };
struct B { int i; float f; };
struct A a = {0};
struct B *pb = (struct B*)&a;
pb->i; //UB?
(*pb).i; //UB?
The two statements above actually have defined behavior, but not for
the reason you might think. The language guarantees that a pointer to
structure, suitably cast, is also a pointer to its first member. So
given that 'pb' holds any of the following:
- a pointer to any structure type whose first member is an int
- a pointer to an array of ints
- a pointer to a single int
...and the int pointed to has a valid value, the expressions, though
not recommended, will work as designed. The compiler must generate
code equivalent to *(int *)pb, and if there is actually an int there
all is well.
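To make the first-member guarantee concrete, here is a minimal sketch. It reuses struct A from the question; the function name first_member_via_cast is made up for illustration, and the paragraph cited in the comment is C99 6.7.2.1p13.

```c
#include <assert.h>

struct A { int i; int _i; };    /* same declaration as in the question */

/* Hypothetical helper: reads the first member through a pointer obtained
   by casting the struct pointer. C99 6.7.2.1p13 says a pointer to a
   structure, suitably converted, points to its initial member. */
int first_member_via_cast(struct A *pa)
{
    int *pi = (int *)pa;        /* pointer to struct -> pointer to first member */
    assert(pi == &pa->i);       /* same address, per the guarantee above */
    return *pi;                 /* the abstract machine reads one int only */
}
```

Called with the address of a struct A whose 'i' member holds 42, the function should return 42 without touching '_i'.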
Do I access the first int sub-object in `a' only, or do I access
the whole object `a'?
Now that depends on what you mean by access. Let's assume a Pentium
(1/2/3/4) platform with a typical compiler, which means the size of
either of your structures is eight 8-bit bytes. Now let's also assume
that this implementation allocates all structures on an address evenly
divisible by 8, a not uncommon performance feature of such
implementations.
With the assumptions above, if the processor needs to access the
structure from memory it will perform a 64-bit access physically, so
even though your code does not direct the abstract machine to touch
the value of other members in any way, the entire memory space holding
the structure will be physically read.
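The size assumption above is implementation-specific, so it may be worth checking rather than trusting. A rough sketch follows; the function names are invented for illustration, and nothing here can show the width of the physical bus access, only the layout the compiler chose.

```c
#include <assert.h>
#include <stddef.h>

struct A { int i; int _i; };
struct B { int i; float f; };

/* Hypothetical check: returns 1 if both structures occupy 8 bytes, as
   assumed in the discussion above. The result depends on the platform. */
int sizes_match_assumption(void)
{
    return sizeof(struct A) == 8 && sizeof(struct B) == 8;
}

/* The offset of the first member is zero on every conforming implementation. */
size_t offset_of_first_member(void)
{
    assert(offsetof(struct A, i) == offsetof(struct B, i));
    return offsetof(struct A, i);
}
```

On a typical 32-bit x86 compiler sizes_match_assumption() would be expected to return 1, but that is a property of the implementation and not of the language.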
2. I see certain similarity between structs and arrays (in fact,
both are called "aggregates").
Why is it that for array:
&a[5];
doesn't constitute object access (6.5.3.2#3), whereas for struct:
&s.m;
&ps->m;
the expressions do constitute access?
Why is the language designed like this?
There are actually more differences than similarities between structs
and arrays, despite the fact that both are aggregates. Structs are
first class objects, meaning they can be assigned, passed to and
returned from functions by value, and their names are never implicitly
converted to pointers. Arrays are not first class objects and do not
share any of the characteristics above.
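A small sketch of that difference, using a made-up struct pair and made-up function names: the structure is copied when passed and returned, while the array argument arrives as a pointer to the caller's storage.

```c
#include <assert.h>

struct pair { int x; int y; };          /* hypothetical example type */

/* Structures are passed and returned by value: 'p' is a private copy. */
struct pair swap_pair(struct pair p)
{
    struct pair r;
    r.x = p.y;
    r.y = p.x;
    return r;
}

/* An array parameter is adjusted to a pointer, so this writes into the
   caller's array; no copy of the array is made. */
void zero_first(int arr[])
{
    assert(arr != 0);
    arr[0] = 0;
}
```

After `b = swap_pair(a)` the original 'a' is unchanged, whereas after `zero_first(v)` the caller's v[0] is 0.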
As for other differences in this particular case, this is spelled out
by paragraph 3 of 6.5.3.2 of C99:
[begin quotation]
The unary & operator returns the address of its operand. If the
operand has type ‘‘type’’, the result has type ‘‘pointer to type’’. If
the operand is the result of a unary * operator, neither that operator
nor the & operator is evaluated and the result is as if both were
omitted, except that the constraints on the operators still apply and
the result is not an lvalue. Similarly, if the operand is the result
of a [] operator, neither the & operator nor the unary * that is
implied by the [] is evaluated and the result is as if the & operator
were removed and the [] operator were changed to a + operator.
Otherwise, the result is a pointer to the object or function
designated by its operand.
[end quotation]
Note the differences between applying '&' to the result of a '*'
operator and to the result of a '[]' operator. In the former case,
neither '&' nor '*' is evaluated as such, but note "the constraints
on the operators still apply".
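As an illustration of the two special cases in that paragraph, here is a short sketch with invented function names. Under C99, &a[n] is treated as a + n, so forming a one-past-the-end pointer this way reads no element.

```c
#include <assert.h>

/* Hypothetical helper: returns a pointer one past the last element.
   Per C99 6.5.3.2p3, &a[n] is evaluated as a + n; a[n] is never read. */
int *end_of(int *a, int n)
{
    assert(n >= 0);
    return &a[n];
}

/* Per the same paragraph, &*p is as if both operators were omitted. */
int *same_pointer(int *p)
{
    return &*p;
}
```

For an array declared as `int a[6]`, end_of(a, 6) compares equal to a + 6, and same_pointer(a) compares equal to a.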
Now let's back up to paragraph 1 of 6.5.3.2, which lists the
constraints for the unary '&' operator:
[begin quotation]
The operand of the unary & operator shall be either a function
designator, the result of a [] or unary * operator, or an lvalue that
designates an object that is not a bit-field and is not declared with
the register storage-class specifier.
[end quotation]
Notice that the expression under discussion,
(int)&((doomdata*)0)->a;
...is none of these things. Specifically, the operand of the '&'
operator, '((doomdata*)0)->a' is:
- not a function designator
- not the result of a [] operator
- not the result of a unary * operator
- and, because of the null pointer, not an lvalue
Finally consider one last thing, namely that regardless of whether
there is an actual access to an object, the expression explicitly
performs pointer arithmetic on a null pointer, and such use of a null
pointer is undefined in and of itself.
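If the goal of that expression is just the byte offset of a member, the standard offsetof macro from <stddef.h> gives it without any null pointer. A sketch follows; the real doomdata definition is not shown in this thread, so the struct below is a hypothetical stand-in, as is the function name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for doomdata; the actual members are unknown. */
typedef struct {
    char header[4];
    int  a;
} doomdata_example;

/* Byte offset of member 'a', computed by the implementation itself
   rather than by arithmetic on a null pointer. */
size_t offset_of_a(void)
{
    size_t off = offsetof(doomdata_example, a);
    assert(off + sizeof(int) <= sizeof(doomdata_example));
    return off;
}
```

The result has type size_t, so there is also no need for the cast to int that the original expression used.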