James Kuyper said:
For practical purposes, an implementation of malloc() could be written
that allocates from a fixed static array of memory. The array would have
a union type, where the union has members of many different types. The
wider the variety of types in the union, the more likely it is to have
universal alignment. In practice, if the union contains short, int,
long, long long, float, double, long double, void*, and void(*)(void),
it's got a pretty good chance of having universal alignment. However,
there's no finite set of types which is guaranteed by the standard to
give such a union universal alignment.
This thread has been interesting. Let's see if we can sort things out
a bit.
First, James Kuyper's statement that there is no finite set of types
which is guaranteed to give universal alignment is correct. That's
implied by array alignment not having to be the same for arrays of
differing lengths (see next), but even without considering arrays the
statement is correct, because of structs. Suppose sizeof(short) == 2
and alignment_of(short) == 1 (I trust everyone follows my meaning
here). Then the types
struct { short s; }
struct { struct { short s; } a; }
struct { struct { struct { short s; } a; } b; }
...
could have alignment of 1 up to, say, four nestings of struct, but for
five or greater nestings could have alignment of 2. I grant that an
implementation would be particularly perverse to do such a thing, but
it seems to be allowed. For a more pedestrian example, the types
struct { short a; }
struct { short a; short b; }
struct { short a; short b; short c; }
...
could have an unbounded number of different alignment requirements.
Right?
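For anyone who wants to see what one particular implementation does with these struct types, here is a minimal sketch. It assumes a C11 compiler and uses _Alignof in the role of the informal alignment_of() above; the struct tags and function names are made up for illustration. Whatever it reports describes only the implementation at hand, not what the standard requires.

```c
#include <stddef.h>

/* Hypothetical tags for the nested-struct family. */
struct nest1 { short s; };
struct nest2 { struct nest1 a; };
struct nest3 { struct nest2 a; };

/* Hypothetical tags for the growing-member-count family. */
struct wide1 { short a; };
struct wide2 { short a; short b; };
struct wide3 { short a; short b; short c; };

size_t align_of_short(void)
{
    return _Alignof(short);
}

/* Alignment of the struct with the given nesting depth, 0 if not covered. */
size_t align_of_nest(int depth)
{
    switch (depth) {
    case 1:  return _Alignof(struct nest1);
    case 2:  return _Alignof(struct nest2);
    case 3:  return _Alignof(struct nest3);
    default: return 0;
    }
}

/* Alignment of the struct with the given number of short members. */
size_t align_of_wide(int members)
{
    switch (members) {
    case 1:  return _Alignof(struct wide1);
    case 2:  return _Alignof(struct wide2);
    case 3:  return _Alignof(struct wide3);
    default: return 0;
    }
}
```

The only relation one can count on is that each struct is at least as strictly aligned as its first member; whether the values grow with nesting depth or member count is up to the implementation.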
Getting back to arrays of differing numbers of elements, it seems
clear that the standard intends that arrays of different lengths can
have different alignment requirements, at least in some circumstances.
Looking at 6.7.2.1 p16, p17 and note 106,
16 As a special case, the last element of a structure with more
than one named member may have an incomplete array type; this
is called a flexible array member. With two exceptions, the
flexible array member is ignored. First, the size of the
structure shall be equal to the offset of the last element of an
otherwise identical structure that replaces the flexible array
member with an array of unspecified length.106) Second, when a
. (or ->) operator has a left operand that is (a pointer to) a
structure with a flexible array member and the right operand
names that member, it behaves as if that member were replaced
with the longest array (with the same element type) that would
not make the structure larger than the object being accessed;
the offset of the array shall remain that of the flexible array
member, even if this would differ from that of the replacement
array. If this array would have no elements, it behaves as if
it had one element but the behavior is undefined if any attempt
is made to access that element or to generate a pointer one past
it.
17 EXAMPLE Assuming that all array members are aligned the same,
after the declarations
struct s { int n; double d[]; };
struct ss { int n; double d[1]; };
the three expressions
sizeof (struct s)
offsetof(struct s, d)
offsetof(struct ss, d)
have the same value. The structure struct s has a flexible
array member d.
------------------------
106) The length is unspecified to allow for the fact that
implementations may give array members different
alignments according to their lengths.
Notice the second to last sentence in paragraph 16, especially the
part after the semicolon. Clearly the standard anticipates having
different alignment requirements for arrays of differing lengths, at
least when they are structure members; it isn't a big stretch to
conclude that they are allowed to have different alignment
requirements outside of structures also.
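The three expressions from the standard's example can be checked on a given implementation with a sketch like the following. The struct declarations are the ones from paragraph 17; the function names are hypothetical. On an implementation that aligns all array members the same, the three functions return the same value, but nothing here shows that they must.

```c
#include <stddef.h>

struct s  { int n; double d[];  };  /* flexible array member       */
struct ss { int n; double d[1]; };  /* fixed-length counterpart    */

/* The three expressions from the EXAMPLE in 6.7.2.1 p17. */
size_t size_of_s(void)    { return sizeof(struct s); }
size_t offset_in_s(void)  { return offsetof(struct s, d); }
size_t offset_in_ss(void) { return offsetof(struct ss, d); }
```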
If an array of unspecified length can have a different alignment
requirement than an array of 1 element, then the alignment
requirement for the element type need not match the alignment
requirement for arrays of that type. For 'short', for example,
it is always true that
alignment_of(short) <= alignment_of(short[])
alignment_of(short) <= alignment_of(short[N])
where 'N' is the (compile-time constant) array dimension. (Another
way of expressing the first line is "the alignment of short is no more
restrictive than the alignment of short[].") The reason for this is
obvious - any array of short can easily be used to make an expression
that yields a short (lvalue), which must be properly aligned. Also,
it is always true that
alignment_of(short[]) <= alignment_of(short[N])
for any N, because a declaration 'extern short whatever[];' matches
array definitions of any length. So, if it can be true that
alignment_of(short[]) < alignment_of(short[1])
as is implied by the phrasing used in p17, then the alignment
requirements for short may differ from the alignment requirements for
an array of short (and similarly other types). This doesn't matter
too much in the context of the original discussion, since when we are
concerned only with more restrictive alignment requirements the
array type T[1] can be used in place of plain T with no loss of
alignment requirements.
As a practical matter, we would expect that alignment requirements
will match up for unspecified length arrays and length 1 arrays. That
is, we expect
alignment_of(T[]) == alignment_of(T[1])
The reason is that we expect code like the following to work:
short short_array_TWO[2];
short (*pointer_to_short_array_TWO)[2];
short (*pointer_to_short_array)[];
short (*pointer_to_short_array_ONE)[1];
pointer_to_short_array_TWO = &short_array_TWO;
pointer_to_short_array = pointer_to_short_array_TWO;
pointer_to_short_array_ONE = pointer_to_short_array;
/* Most would expect the last assignment */
/* to be "safe" as well as legal. */
Or, to say this another way, we expect always to be able to use a
pointer to some array (that is, some actual array object) as a pointer
to an array of length 1. An implementation that didn't do this might
be useful as some sort of debugging or diagnostic aid, but it's highly
unlikely this choice is one an implementation intended for production
would make.
Even though the standard says very little about what the alignments of
various types are allowed to be, we can put some limits on what they
can be. In particular,
sizeof(T) % alignment_of(T) == 0
must hold for any type T (perhaps of sufficiently small size), because
we can make arrays:
T t[2];
If sizeof(T) is 4, the alignment_of(T) must be 1, 2, or 4; it can't
be 3, 5, or 8, because then the alignment of t[1] would be wrong. So,
if sizeof(short) == 2, and if we accept sizeof(T[N]) == (N)*sizeof(T)
then the alignment of
short t2[2];
might be 4, but that alignment can't apply to short[3]; otherwise the
case of
short t_2_3[2][3];
would give incorrect alignment for the short[3] array at t_2_3[1].
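The divisibility argument can be spot-checked at compile time. This is only a sketch, assuming a C11 compiler (static_assert from <assert.h>, and _Alignof standing in for alignment_of); the macro, struct tag, and function name are invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>

struct three_shorts { short a, b, c; };  /* hypothetical example type */

/* Nonzero when sizeof(T) is a whole multiple of the alignment of T. */
#define SIZE_IS_MULTIPLE_OF_ALIGN(T) (sizeof(T) % _Alignof(T) == 0)

static_assert(SIZE_IS_MULTIPLE_OF_ALIGN(short), "short");
static_assert(SIZE_IS_MULTIPLE_OF_ALIGN(long double), "long double");
static_assert(SIZE_IS_MULTIPLE_OF_ALIGN(struct three_shorts), "struct");

/* The assumption used above: an array occupies N * sizeof(element). */
static_assert(sizeof(short[2][3]) == 6 * sizeof(short), "no array padding");

/* Element 1 of an array sits exactly sizeof(short) bytes after element 0,
   which is why the alignment of short must divide sizeof(short). */
int second_element_offset_ok(void)
{
    short t[2];
    return (size_t)((char *)&t[1] - (char *)&t[0]) == sizeof(short);
}
```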
It's hard to draw any absolute conclusions about what must be true
when converting between arrays of differing lengths. Any change of
length in array type means a conversion (normally of a pointer to
array), and the standard doesn't guarantee much about what pointer
conversions do. The relevant section (6.3.2.3 p7) also says that "if
the resulting pointer is not correctly aligned for the pointed-to
type, the behavior is undefined."
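A common, though not strictly portable, way to test a pointer before such a conversion is sketched below. It assumes that uintptr_t exists (it is optional) and that converting a pointer to it yields a linear address, which the standard leaves implementation-defined; it also assumes C11 for _Alignof. The function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Nonzero if p is a multiple of alignment, under the assumption that
   the pointer-to-integer conversion yields a linear address. */
int is_aligned_for(const void *p, size_t alignment)
{
    return alignment != 0 && (uintptr_t)p % alignment == 0;
}

/* The second row of a short[2][3] must at least be aligned for short,
   because its first element is itself a short object. */
int demo_element_alignment(void)
{
    short t_2_3[2][3];
    return is_aligned_for(&t_2_3[1], _Alignof(short));
}
```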
In practical terms it's unlikely that alignment requirements for
differing length arrays will cause a problem in most normal code. The
reason is, most code turns any array uses immediately into pointers,
and going to an element pointer is safe (as long as the array has
alignment suitable for some length array, which is to say, for T[]).
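A minimal sketch of that usual pattern follows; the function names are made up. The pointer-to-array is turned into an element pointer right away, and the length travels separately.

```c
#include <stddef.h>

/* Sum n shorts reached through a pointer to array of unspecified length. */
long sum_shorts(short (*pa)[], size_t n)
{
    const short *p = *pa;   /* the array decays to a pointer to element 0 */
    long total = 0;
    size_t i;

    for (i = 0; i < n; i++)
        total += p[i];
    return total;
}

/* Usage: short (*)[4] is compatible with short (*)[], so no cast is needed. */
long demo_sum(void)
{
    short four[4] = { 1, 2, 3, 4 };
    return sum_shorts(&four, 4);
}
```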
Using pointer-to-array types might seem more likely to cause problems
in the "universal union allocator" scenario. However, that's not as
likely as it seems, because an implementation strict enough to cause
trouble there would also break code that seems perfectly reasonable
(and has nothing to do with malloc or storage allocation). Some examples:
short t_2_3[2][3];
short (*psa)[];
short (*psa2)[2];
psa = &t_2_3[1];
psa2 = psa;
and
short short_2049[2049];
short (*w)[];
short (*x)[][1];
short (*y)[];
short (*z)[2048];
w = & short_2049;
x = (short (*)[][1]) w; /* cast needed */
y = & (*x)[1];
z = y;
There is one cast here, in the assignment of 'x' from 'w'. The cast is
required to convert the value of 'w', a pointer-to-array-of-shorts, to
the type of 'x', a pointer-to-array-of-array[1]-of-shorts; in
effect, each short in the original array is being treated as a
short[1] array. This conversion corresponds to the expectation
alignment_of(T[]) == alignment_of(T[1])
explained earlier. (And that pointer conversion works as we expect,
as mentioned by Wojtek Lerch.) All the other assignments are
conversions between compatible types.
In each example, the last statement may evoke undefined behavior
because the (implicit) conversion may produce a pointer of unsuitable
alignment. I certainly wouldn't deny that the standard allows that to
happen. It seems unlikely that it will though, at least directly;
consider 6.3 p2
Conversion of an operand value to a compatible type causes no
change to the value or to the representation.
So an implementation would pretty much need to go out of its way for
these examples to cause problems directly. As long
as all that's done with these array pointers is turn them into regular
pointers, most likely all will be well.
Where there are likely to be problems is arrays or array types
(perhaps contained in structs) that are unknown to "your" code but
used in implementation files for some of the necessary library
functions or whatever. For example, the 'jmp_buf' type, stated as
being an array type, might require an alignment that must match the
alignment of a cache line. (So don't forget to include a jmp_buf
member in the "universal union" type.)
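To make the "universal union" idea concrete, here is a minimal sketch of an allocator along the lines James Kuyper described, with jmp_buf added to the union. The names (align_all, pool_alloc, POOL_UNITS, and so on) are invented for illustration, there is no free(), and, as discussed above, no finite member list is guaranteed by the standard to give universal alignment.

```c
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical "universal" union: a best effort, not a guarantee. */
union align_all {
    short s; int i; long l; long long ll;
    float f; double d; long double ld;
    void *vp; void (*fp)(void);
    jmp_buf jb;
};

#define POOL_UNITS 64
static union align_all pool[POOL_UNITS];  /* the fixed static array      */
static size_t pool_used;                  /* units handed out so far     */

/* Hand out whole union-sized units so every result is aligned as
   strictly as the union itself; return NULL when the request is zero
   or does not fit in what remains. */
void *pool_alloc(size_t nbytes)
{
    const size_t unit = sizeof(union align_all);
    size_t units;
    void *p;

    if (nbytes == 0 || nbytes > sizeof pool)
        return NULL;
    units = (nbytes + unit - 1) / unit;   /* round up to whole units */
    if (units > POOL_UNITS - pool_used)
        return NULL;
    p = &pool[pool_used];
    pool_used += units;
    return p;
}

/* Accessor so callers can see how much of the pool is consumed. */
size_t pool_units_used(void)
{
    return pool_used;
}
```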
Summing up: technically, almost any pointer conversion that is _not_
guaranteed to be free of alignment problems might _have_ alignment
problems, and so evoke undefined behavior. Arrays of differing lengths
can have different alignment requirements, and converting between
pointers to arrays of differing lengths, even between compatible
types, is still a conversion that can result in an alignment mismatch
and so evoke undefined behavior.
In practical terms, however, it's safe for "regular" types to convert
between pointers to array types of differing lengths. If a conversion
to a pointer-to-array type is necessary, normally prefer use of a
pointer to array of unspecified length ('T(*)[]') type to one with an
explicit length.