Ben Pfaff said:
So if you use a particular member to access a particular
union in memory, then you should always access that union
variable with the same member?
Is that the only intended use?
I believe that is the intention. I couldn't find relevant text
in the Rationale.
If you read the Defect Report (sorry, I don't know the
number offhand) that prompted the famous "type punning"
footnote, I think you'll find that the type punning use
cases were expected and intended all along, e.g., even in
C89/C90. I presume they were important to make defined
because a significant amount of pre-ANSI C code relied
on them working.
In particular: is it a misuse of unions, to use them in order to
enable different memory-accesses (word, byte, etc.) to the same
portion of memory???? I have (mis)used it like this in above posts
quite often.
Is that strictly wrong?
I think there might be an "out" for accessing any type as an
array of unsigned char this way, but I believe that in general
this is strictly wrong. [snip]
How do you reconcile this view with the statement in
clear and plain English about how union member access
works to accomplish type punning? I think anyone who
takes the time to read through the relevant sections
defining the semantics involved will find accessing a
union member is defined irrespective of which member
was last set (although the result may depend on
unspecified values, and the behavior may consequently be
undefined because of trap representations, the access itself
is well-defined).
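To make the point concrete, here is a minimal sketch of the kind of
union access under discussion: store through one member, read through
another. The names (float_bits, float_to_bits) are made up for
illustration, and the bit pattern mentioned in the comment assumes a
32-bit IEEE 754 float, which the Standard does not require.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* Hypothetical union for illustration; not taken from the thread. */
union float_bits {
    float    f;
    uint32_t u;
};

/* Assumption: float and uint32_t have the same size on this platform. */
static_assert(sizeof(float) == sizeof(uint32_t), "assumes a 32-bit float");

/* Store through member f, then read through member u. Under the
   C99/C11 footnote the bytes of f are reinterpreted as a uint32_t. */
uint32_t float_to_bits(float f)
{
    union float_bits pun;
    pun.f = f;
    return pun.u;
}
```

With an IEEE 754 float, float_to_bits(1.0f) gives 0x3F800000. Since
uint32_t has no padding bits there is no trap representation to worry
about in this direction; going the other way (reading a float) is
where the caveat above could matter.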
********************************
Draft ANSI Standard
3.3.2.3 Structure and union members
[...]
Semantics
[... 3rd para]
With one exception, if a member of a union object is accessed after
a value has been stored in a different member of the object, the
behavior is implementation-defined./33/ One special guarantee is made
in order to simplify the use of unions: If a union contains several
structures that share a common initial sequence, and if the union
object currently contains one of these structures, it is permitted to
inspect the common initial part of any of them. Two structures share
a common initial sequence if corresponding members have compatible
types for a sequence of one or more initial members.
********************
This text corresponds to 6.3.2.3, fifth paragraph, in C90. The
footnote /33/ referenced above is footnote 41 in the C90 document.
That footnote says:
The ``byte orders'' for scalar types are invisible to isolated
programs that do not indulge in type punning (for example, by
assigning to one member of a union and inspecting the storage
by accessing another member that is an appropriately sized
array of character type), but must be accounted for when
conforming to externally imposed storage layouts.
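The use case the footnote describes can be sketched as below: assign
to a scalar member and inspect its storage through a character-array
member. The names (word_bytes, is_little_endian, byte_at) are
hypothetical, and the sketch assumes 8-bit bytes so that uint32_t
occupies exactly four of them.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* Assumption: 8-bit bytes, so a uint32_t is exactly 4 bytes. */
static_assert(sizeof(uint32_t) == 4, "assumes 8-bit bytes");

/* Hypothetical union: a word overlaid with an appropriately sized
   array of character type, as in the footnote's example. */
union word_bytes {
    uint32_t      word;
    unsigned char bytes[sizeof(uint32_t)];
};

/* Return byte i (0..3) of w as it is laid out in storage. */
unsigned char byte_at(uint32_t w, unsigned i)
{
    union word_bytes probe;
    probe.word = w;
    return probe.bytes[i];
}

/* Return 1 if the least significant byte is stored first, else 0. */
int is_little_endian(void)
{
    return byte_at(0x01020304u, 0) == 0x04;
}
```

Which byte comes back for a given index is exactly the byte-order
dependence the footnote mentions; the four bytes are always some
arrangement of 0x01 through 0x04, but the order is up to the
implementation.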
I don't think "implementation defined" is the same as well
defined. They could return zero every time if they wanted.
I think the footnote makes it clear that the intention of the
'implementation-defined' wording is to reinterpret the underlying
bytes in the new type. (Note that the quoted paragraph does
not exclude character types from the 'implementation-defined'
category.) This view is also supported by DR 283 (fortunately
easy to find with "defect report" "type punning" as a Google
query):
http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_283.htm
which says in part (and was written by someone on the committee):
It is not perfectly clear that the C99 words have the same
implications as the C89 words.
So even though C89/C90 says 'implementation-defined' and C99
doesn't, the indications are that the meaning intended is the
same in both cases, and also C11, which uses the same wording
as C99. (There are minor changes in the C11 footnote relative
to N1256, but these are incidental.)
Summing up - I think you're right in some technical sense that
C90 allowed returning zero every time. But that isn't what
was intended, nor AFAIAA what any implementor took it to mean.
Unless there is a particular existing implementation that can be
pointed to that exhibits the problem, the point seems moot;
surely no current implementation would do anything other than
follow the revised wording in C99 or C11.