BartC said:
Jens Thoms Toerring said:
integer*4 i(20)
real*8 a
equivalence (a,i(7))
(So the 8 bytes at i(7..8) are shared with the floating point number.)
[20]int i
real a @ &i[7]
How does that mysterious language make that work on all architectures,
even on those where there are different alignment requirements for
ints and reals? I guess it will have to do the equivalent of calling
memcpy() to a temporary variable on those architectures for accesses
to 'a', doesn't it?
No. This is just a way of bypassing the usual mechanisms in the language.
In x86 assembler, you can do, for example:
to access i[7..8] as a 64-bit floating point value. Why shouldn't a
low-level language (ie. C) do the same without the bureaucracy of wrapping
things up in unions, or having to do a block transfer first?
And if the alignment is wrong, then it won't work. But some programmer trust
is needed. (On x86, any alignment will work, at some cost of efficiency.)
This stuff *is* needed, otherwise you wouldn't have unions. I'm just
suggesting something less cumbersome and restrictive than unions.
I don't even know *why* the OP's example is bad code. Doesn't C allow you to
cast a pointer to T, to a pointer to U?
This doesn't really give me much confidence to either write this stuff in C
directly, or to use C as a target language for a translator. (BTW how do
those Fortran-2-C translators deal with the equivalence problem?)
Let's distinguish two things here: the OP's program didn't work
as expected with a certain version of gcc at a certain optimization
level. I can reproduce the problem with gcc 4.3.2 (on 64-bit x86)
with '-O2' and higher, but not with gcc 4.4.3 under otherwise
identical conditions. Now '-O2' is documented to include the
'-fstrict-aliasing' option. And about this the compiler
documentation states:
`-fstrict-aliasing'
Allows the compiler to assume the strictest aliasing rules
applicable to the language being compiled. For C (and C++), this
activates optimizations based on the type of expressions. In
particular, an object of one type is assumed never to reside at
the same address as an object of a different type, unless the
types are almost the same. For example, an `unsigned int' can
alias an `int', but not a `void*' or a `double'. A character type
may alias any other type.
If you compile with that option and also '-Wall' you get warned
about possible problems with breaking strict aliasing rules. And
if you specifically switch off that optimization, using the
'-fno-strict-aliasing' option, then the program again behaves
as expected by the OP, even with gcc 4.3.2 (at least on my
machine). So, by asking for '-fstrict-aliasing' (though only
indirectly via '-O2') the OP made the compiler believe that
his program would satisfy certain conditions, which wasn't
the case.
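
To make the assumption concrete, here is a minimal sketch (not the
OP's code; the function name store_both is made up for illustration).
Under '-fstrict-aliasing' the compiler may assume that an 'int' and a
'double' never share an address, so it is free to return the constant
1 without re-reading *ip after the store through dp:

```c
#include <assert.h>

/* Hypothetical function: stores through both pointers, then re-reads *ip. */
int store_both(int *ip, double *dp)
{
    /* Precondition the optimizer is allowed to rely on: the two
       pointers do not refer to the same storage. */
    assert((void *)ip != (void *)dp);

    *ip = 1;
    *dp = 2.0;
    return *ip;   /* may be compiled as "return 1" under strict aliasing */
}
```

If a caller casts an int pointer to a double pointer and passes both,
that precondition is violated and the result depends on the compiler
version and optimization level, which would fit what was observed.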
Now, I would think that was a bit unlucky and the fact that the
problem doesn't seem to show up with gcc 4.4.3 anymore might be
taken as an indication that the compiler writers also thought
that they went a tiny bit over the top with that and now found
a way to avoid that kind of situation. But I don't think we
can blame them for shoddy work since the program didn't follow
the stated rules for use of '-fstrict-aliasing'.
The other aspect is the question what kind of casts are required
to "work" according to the standard. And what I found (in 3.3.4
in the C89 standard) is the following:
A pointer to an object or incomplete type may be converted to
a pointer to a different object type or a different incomplete
type. The resulting pointer might not be valid if it is
improperly aligned for the type pointed to. It is guaranteed,
however, that a pointer to an object of a given alignment may be
converted to a pointer to an object of the same alignment or a
less strict alignment and back again; the result shall compare
equal to the original pointer.
So a cast from a pointer to an object of type T to a pointer
to an object of a different type U might result in an invalid
pointer under the stated conditions, i.e. if the alignment
requirements of U are more strict than those of T. In that
case only the union-trick will do (see below).
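
As a rough sketch of how one could check this at run time (assuming
a C11 compiler for _Alignof and _Alignas, which C89 does not have, and
assuming that converting a pointer to uintptr_t yields the address,
which is common but implementation-defined; the name
aligned_for_double is made up):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if p is suitably aligned for a double.
   Assumes the pointer-to-uintptr_t conversion yields the address. */
int aligned_for_double(const void *p)
{
    return (uintptr_t)p % _Alignof(double) == 0;
}

/* Only convert to double* when the alignment check passes; otherwise
   return a null pointer so the caller can fall back to copying. */
double *as_double_or_null(void *p)
{
    assert(p != 0);
    return aligned_for_double(p) ? (double *)p : 0;
}
```

Note that this only addresses alignment; accessing an int array
through the resulting double pointer still runs into the aliasing
rules discussed above.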
The rationale for the way things are is probably rather
obvious: to allow implementations of C on all kinds of
architectures the standard can't make too harsh demands.
Since there are architectures with different alignment
requirements, demanding more relaxed ones would make
writing a compiler much more difficult without any true
benefits I can see at the moment.
In that sense I would consider the behavior of gcc 4.3.2 with
'-fstrict-aliasing' as not being standard compatible had the
documentation not at the same time warned about this fact -
the fix for the problem is thus not to use '-fstrict-aliasing'
under these conditions when one wants a fully
standard-compliant compiler.
The other question is if the code might be considered broken.
The code was "well-formed" for an x86 (and perhaps a number of
other architectures) where there can't be alignment troubles.
On the other hand it was never clearly stated that this code
was intended for x86 only (the architecture wasn't ever
mentioned), and for some other architectures it must be
considered "broken", i.e. those where a float has stricter
alignment requirements than an int. And, moreover, since we're
here in clc and not a group for low level x86 programming I
think I can stand by calling it "broken". But then I have
been forced too often to deal with code that had trouble with
alignments since the writers obviously weren't even aware that
those issues exist and seemed to blissfully assume that all
machines have an x86 processor, so I may be a bit oversensitive
over this issue;-)
Now concerning unions. I don't think that unions were meant
specifically for that kind of stuff. You can do such things
with them because you then have at least the guarantee that
the union will be aligned in a way that all members can be
accessed correctly (i.e. the alignment of the union must be
suitable for the member with the strictest alignment
requirements). But I would think that unions are meant to allow
having the same storage area for different types of data, and
you're expected not to use a member of type A when you have
stored a value via a member of type B. That this might work
(for whatever "work" might mean) as long as the size of A and B
are the same I would consider to be not more than a side effect,
perhaps not even intended as such.
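
For illustration, a small sketch of such a union (the names overlay
and round_trip are made up, and it assumes a double occupies exactly
two int32_t, as on x86). Reading a member other than the one last
stored is something gcc documents as supported; how far the standard
itself blesses it is exactly the doubt expressed above:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical overlay: two 32-bit integers share storage with one
   double. The union is aligned for its strictest member. */
union overlay {
    int32_t i[2];
    double  d;
};

/* Copy a double through its two integer halves and back again. */
double round_trip(double value)
{
    union overlay a, b;

    assert(sizeof a.d == sizeof a.i);   /* assumption: 8 == 2 * 4 */

    a.d = value;
    b.i[0] = a.i[0];   /* reads a member other than the one last stored */
    b.i[1] = a.i[1];
    return b.d;
}
```

Because both halves are copied in order, the round trip does not
depend on the byte order of the machine.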
Finally, how a FORTRAN-to-C translator will deal with that I
don't know since I don't even know how FORTRAN handles such
issues - perhaps such EQUIVALENCE stuff will also only work
when there is proper alignment (some information I found on the net
hinted in that direction, but I didn't find anything really
definitive). If I were to write a FORTRAN-to-C translator
and the specifications were that this kind of stuff works
under any circumstances I don't see much of an alternative
to using the union trick where possible and otherwise using
temporary variables of the correct types and memcpy() to
and from them on each access to the aliased variables. But
I hope it won't come to that;-)
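
The memcpy() variant could look roughly like this. It is only a
sketch under the assumptions that the integers are 32 bits wide and
a double is 64 bits; load_overlay, store_overlay and N_INTS are
made-up names, and a real translator may well do something different:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define N_INTS 20   /* size of the integer array, as in i(20) */

/* Read the double that overlays i[idx] and i[idx + 1] by copying
   the bytes into a properly aligned temporary. */
double load_overlay(const int32_t *i, size_t idx)
{
    double tmp;
    assert(idx + sizeof tmp / sizeof *i <= N_INTS);
    memcpy(&tmp, &i[idx], sizeof tmp);
    return tmp;
}

/* Write a double into the storage of i[idx] and i[idx + 1]. */
void store_overlay(int32_t *i, size_t idx, double value)
{
    assert(idx + sizeof value / sizeof *i <= N_INTS);
    memcpy(&i[idx], &value, sizeof value);
}
```

With Fortran's 1-based indexing, i(7) corresponds to idx 6 here.
Since every access goes through memcpy(), neither the alignment of
&i[idx] nor the aliasing rules get in the way.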
Regards, Jens