Padding involved

Eric Sosman

[... a question about the size and padding of a struct
"defined" by uncompilable code ...]
I have not given a compilable code. I am just
asking the size of the struct given.

How big is this array:

int array<7>;

? In other words, if the code describing your struct won't even
compile, then you have not "given" a struct at all. If there is
no struct, it has no size and no padding -- and no existence.
Understood. How about below:
int main(void) {
struct test {
char a;
int b;
short c;
};

*Much* better!
printf("%d\n", sizeof(struct test));
return 0;
}
I completely understand what will be the size of the struct with and
without mapping but the question is what parameters decide the padding
involved? Such as size of registers or size of address/data bus of the
processor?

There are different answers at different levels.

At the hardware level, alignment and the padding to attain
it are artifacts of the memory subsystem: Not just the busses,
but the address translation circuitry, the various levels of
cache, the inter-processor data-consistency protocols, and so
on. This collection of components may find it easier or cheaper
or faster to access particular types on a restricted set of
addresses: For example, it might be advantageous to position a
`double' object on an eight-byte boundary or an `int' on a
four-byte boundary. (I don't think register widths have much
to do with this -- but I'm no hardware designer, so don't treat
my "I don't think" as Gospel.)

But that's not the whole story. A host seldom consists only
of the hardware; there's an operating system to think about. The
O/S usually specifies an "application binary interface" or ABI
that describes how data should be arranged when invoking system
services or when dissecting their results. For example, even if
the hardware is able to cope with an `int' at an arbitrary address,
the ABI might insist on four-byte alignment (one possible reason
to do so could be to simplify the "Does the caller really have
access to all the bytes implied by this pointer?" test, by not
having to worry about crossing page boundaries). Then again, an
ABI might choose *not* to cater to all the hardware's whims: For
example, a widely-used ABI calls for four-byte alignment of all
stack-allocated data, even `double' objects that would be
significantly faster if allocated on eight-byte boundaries.

But even that isn't the whole story. Eventually, it's the
developers of the compiler itself who decide what policies it will
enforce. One developer might say "Speed is important: We'll put
every object on an address that makes accesses the very fastest
they can possibly be." Another might say "Memory bloat should be
avoided: We'll pack the objects as tightly as we can while still
maintaining reasonable (not optimal) speed." Yet another might
say "This is an embedded machine with only 4KB of RAM, so memory
is an extremely scarce resource and we'll pack everything down to
the absolute minimum." In the end, that is, it's a human choice.
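
Whatever policy a given compiler follows, the outcome can be inspected from inside the program. What follows is only a minimal sketch, assuming a C99-or-later compiler; the helper names test_padding_bytes and print_test_layout are made up for illustration, and the numbers printed are whatever this particular implementation chose, not values the Standard promises.

```c
#include <stddef.h>
#include <stdio.h>

struct test {
    char a;
    int b;
    short c;
};

/* Bytes inside the struct that belong to no member, i.e. padding. */
size_t test_padding_bytes(void) {
    return sizeof(struct test)
         - (sizeof(char) + sizeof(int) + sizeof(short));
}

/* Report where this implementation placed each member. */
void print_test_layout(void) {
    printf("a at offset %zu\n", offsetof(struct test, a));
    printf("b at offset %zu\n", offsetof(struct test, b));
    printf("c at offset %zu\n", offsetof(struct test, c));
    printf("sizeof(struct test) = %zu, of which %zu bytes are padding\n",
           sizeof(struct test), test_padding_bytes());
}
```

Calling print_test_layout() from main on two different compilers (or with different packing options) is a quick way to see the "human choice" in action.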
 
BartC

Keith Thompson said:
[...]
#pragma pack(1)
[...]

The OP may not be aware that #pragma pack is non-standard. It's an
extension implemented by gcc (and probably other C compilers).

I mentioned that in my post.

One more nitpick: sizeof yields a result of type size_t; the "%d"
format requires an argument of type int. Use the "%zu" format, or
convert the sizeof result to int (or to unsigned long and use "%lu").

But you didn't pick on the dozen or so non-uses of "%zu" in the rest of my
post.

However I dislike having to remember and use all these weird and wonderful
format specifiers (a 'zoo' of them almost!). If the right one is that
important, then the compiler should tell me about it (but only lccwin32
seems to do so at default warning levels).

Ideally it should figure it out for itself (a few years ago, I proposed a %?
specifier for that purpose, for use in the 99.9% of cases where the format
string was a constant). Because managing these format strings can be a lot
of work (you change one type from int to long long, then you have to change
hundreds of %d to %lld or %x to %llx).

(FWIW, not using %zu doesn't seem to matter on my machine; when I'm
compiling for 64-bits and a size_t value occupies 8 bytes, while int is 4
bytes, then presumably the parameter stack is also 64-bit aligned so use of
%d seems to have no ill-effects. Tested with 3 x64 compilers.)
 
Ben Bacarisse

BartC said:
But you didn't pick on the dozen or so non-uses of "%zu" in the rest of my
post.

You are commenting on a post to someone else. That post had a single
use of %d. What else was there to point out?
However I dislike having to remember and use all these weird and wonderful
format specifiers (a 'zoo' of them almost!). If the right one is that
important, then the compiler should tell me about it (but only lccwin32
seems to do so at default warning levels).

On my machine gcc does too, but in any case it's wise to choose the
warnings you care about. With gcc, I ask for almost everything and turn
off the couple that I find annoying.

<snip>
 
Joe Pfeiffer

glen herrmannsfeldt said:
And, more generally, the smallest it can be is the sum of
the sizeof of the members.


Wouldn't it have to be 12 in that case? I thought that sizeof
a struct had to be big enough that an array of them would result
in all elements being aligned.

Ah, yes, forgot that. Yep, might well be twelve.
I know that a pointer to a struct has to, with appropriate
casting, equal a pointer to its first member, but I am not sure of
the requirements after that.

Is the compiler allowed to keep the char at the beginning, but
move the short before the int, such that sizeof would be 8?

Is there a reason why sizeof can't be 16 or 32, if that happens
to be faster on a certain processor?

No reason at all. My point exactly.
 
Malcolm McLean

Struct abbcd{

Char c;
Int b;
Short d;

};



What will be the size of abbcd ? If padding involved and without padding?
Suppose that the processor has only 4 byte registers.


Note; size of char is 1,size of int, short is 4 and 2 respectively.
The compiler isn't allowed to alter the order of the members. So b must come
after c in memory and d must come last. It also must place the first member
right at the top of the structure. So struct abbcd x; char *ptr = (char *)&x;
must give you the address of c.

But it can insert other padding elements at will. Register size isn't a good
guide, because often processors allow half word access, even have special half
word registers, but make it less efficient than full-word access.
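
The two ordering rules can be checked directly. This is a sketch only, with the struct spelled in compilable form and a made-up function name (check_abbcd_layout); the asserts cover what the language guarantees, and deliberately say nothing about how much padding sits between the members.

```c
#include <assert.h>
#include <stddef.h>

struct abbcd {
    char c;
    int b;
    short d;
};

void check_abbcd_layout(void) {
    struct abbcd x = {0};
    char *ptr = (char *)&x;

    /* A pointer to the struct, suitably converted, points at c. */
    assert(ptr == &x.c);

    /* Members keep their declared order: c, then b, then d. */
    assert(offsetof(struct abbcd, c) < offsetof(struct abbcd, b));
    assert(offsetof(struct abbcd, b) < offsetof(struct abbcd, d));
}
```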
 
Keith Thompson

BartC said:
Keith Thompson said:
[...]
#pragma pack(1)
[...]

The OP may not be aware that #pragma pack is non-standard. It's an
extension implemented by gcc (and probably other C compilers).

I mentioned that in my post.

Sorry I missed that. But you wrote that it "will vary between
compilers". Not all compilers necessarily have a way to specify
packing of structure members.
But you didn't pick on the dozen or so non-uses of "%zu" in the rest of my
post.

I was replying to someone else. I don't point out every error in every
post.
However I dislike having to remember and use all these weird and wonderful
format specifiers (a 'zoo' of them almost!). If the right one is that
important, then the compiler should tell me about it (but only lccwin32
seems to do so at default warning levels).

gcc warns about mismatches between format strings and arguments,
at least in many cases. But warning about such mismatches in all cases
is not possible. Format strings are interpreted at run time. A format
string is commonly a string literal, but needn't be. You just have to
develop the habit of using the right format yourself if you want to
avoid undefined behavior.
Ideally it should figure it out for itself (a few years ago, I proposed a %?
specifier for that purpose, for use in the 99.9% of cases where the format
string was a constant). Because managing these format strings can be a lot
of work (you change one type from int to long long, then you have to change
hundreds of %d to %lld or %x to %llx).

You can always convert the argument to a known type. For example, if u
is of some unsigned type, but you're not sure which one, you can do:

printf("%llu\n", (unsigned long long)u);

Or if you happen to know that the value of u is fairly small (say,
because it's the size of a structure that you know is smaller than 32
kbytes), you can just convert to int:

printf("%d\n", (int)sizeof whatever);
(FWIW, not using %zu doesn't seem to matter on my machine; when I'm
compiling for 64-bits and a size_t value occupies 8 bytes, while int is 4
bytes, then presumably the parameter stack is also 64-bit aligned so use of
%d seems to have no ill-effects. Tested with 3 x64 compilers.)

I see the same behavior. I wouldn't be surprised to see it fail on a
big-endian system. (Actually I just tried it and it "worked"; I'm not
sure why.)

But by using the correct format, perhaps with a cast, I don't have to
worry about it; I know it will work.
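
For reference, the three options side by side. A small sketch, assuming a C99-or-later library for "%zu"; the function name print_size_three_ways is invented here, and the int conversion is acceptable only because the size is known to be small.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct test { char a; int b; short c; };

void print_size_three_ways(void) {
    char zu[32], lu[32], d[32];

    /* "%zu" matches size_t directly (C99 and later). */
    snprintf(zu, sizeof zu, "%zu", sizeof(struct test));

    /* Convert to a type whose specifier is known. */
    snprintf(lu, sizeof lu, "%lu", (unsigned long)sizeof(struct test));

    /* Converting to int is fine only because the value is small. */
    snprintf(d, sizeof d, "%d", (int)sizeof(struct test));

    /* All three spell the same number. */
    assert(strcmp(zu, lu) == 0 && strcmp(lu, d) == 0);
    printf("%s %s %s\n", zu, lu, d);
}
```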
 
David Thompson

This simply means that an N byte type has an alignment requirement to be on an
offset divisible by N. And when a structure member of size N is being
allocated, the next available offset that is divisible by N is chosen (the
"lowest available offset with appropriate alignment").
No, the reverse. The offset must divide the size, or the size must be
divisible by the offset. E.g. struct { long a; float b; } if both long
and float are 4 bytes (as is common, though not universal and not
required by the Standard) then the struct has size at least 8 but is
unlikely to have alignment more than 4.
The "structures and unions assume the alignment of their most strictly aligned
component" means that if the structure contains an element of size N, then
there is enough padding at the end of the structure so that this element
will be correctly aligned as an array member.
If it contains an element with >alignment< N then the struct or union
has alignment N or a multiple of N, and padding if necessary to make
the size a multiple of the alignment.

Note that alignment can be less than size, most commonly on systems
that can align everything to 1, but I've used a system where int is 4
bytes and aligned to 2. Alignment cannot be more than size.
For instance

struct foo { char a; long long b; char c; }

char a is at offset zero. Then 64 bit wide b goes to an address divisible
by 8, leaving a padding of 7, bringing us to 16 bytes. Now c is
allocated to the 17th byte. 17 cannot be the final size, because
in an array struct foo x[2], x[1].b will end up on a funny address!
Given that b has both size AND alignment 8, yes. Which AIUI is true
for x86-64, but not in all architectures.
The "most strictly aligned component" is b: aligned to 8 byte boundaries. And
so, the structure must be padded so its size is divisible by 8: matching the
alignment requirement of the most strictly aligned component. The size will be
the next available multiple of 8 after 17: 24. Seven bytes of padding, again.

Also, the malloc function is required to return pointers that are at
least as strictly aligned as any basic data type.

In the structure layout, offset 0 is assumed to be suitably aligned
for anything. Regardless of type, the first member is placed at offset
zero without any padding at the start. The allocator has to make that
assumption true.

Not only malloc, but the compiler and linker: how they lay out objects
in static storage and in automatic storage (the stack).

static and auto objects must be sufficiently aligned for their actual
type, but not necessarily for 'anything'. malloc() does have to align
for 'anything' because it doesn't know what the actual type will be.
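
The parts of this that do not depend on the platform can be written down as compile-time checks. A sketch assuming C11 (_Static_assert, _Alignof, max_align_t); the 24 in the walkthrough above is deliberately not asserted, since it holds only where long long has size and alignment 8. The function name malloc_is_max_aligned is made up, and taking an address modulo a number through uintptr_t is an implementation-defined conversion that behaves as expected on common flat-address machines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct foo { char a; long long b; char c; };

_Static_assert(offsetof(struct foo, a) == 0,
               "no padding before the first member");
_Static_assert(offsetof(struct foo, b) % _Alignof(long long) == 0,
               "b sits on a boundary suitable for long long");
_Static_assert(_Alignof(struct foo) % _Alignof(long long) == 0,
               "the struct is at least as strictly aligned as b");
_Static_assert(sizeof(struct foo) % _Alignof(struct foo) == 0,
               "size is a multiple of alignment, so x[1].b is aligned too");

/* Returns 1 if malloc handed back storage aligned for max_align_t,
   the type with the strictest fundamental alignment. */
int malloc_is_max_aligned(void) {
    void *p = malloc(sizeof(struct foo));
    int ok = p != NULL && (uintptr_t)p % _Alignof(max_align_t) == 0;
    free(p);
    return ok;
}
```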
 
glen herrmannsfeldt

(snip)
No, the reverse. The offset must divide the size, or the size must be
divisible by the offset. E.g. struct { long a; float b; } if both long
and float are 4 bytes (as is common, though not universal and not
required by the Standard) then the struct has size at least 8 but is
unlikely to have alignment more than 4.
(snip)

Note that alignment can be less than size, most commonly on systems
that can align everything to 1, but I've used a system where int is 4
bytes and aligned to 2. Alignment cannot be more than size.

As I understand it, for some systems an alignment greater than
size is necessary for optimal use. Specifically, some of the SSE
instructions, as I understand it, will process pairs of doubles
aligned to 16 byte boundaries. I believe other combinations, such
as four floats or four ints.

Also, GPUs might have different alignment requirements than
traditional processors.

-- glen
 
James Kuyper

(snip)


As I understand it, for some systems an alignment greater than
size is necessary for optimal use. Specifically, some of the SSE
instructions, as I understand it, will process pairs of doubles
aligned to 16 byte boundaries. I believe other combinations, such
as four floats or four ints.

In C, every object of a given type must be allocated at a location which
is correctly aligned for its type. In an array, that means that the
first element of the array and the second element must both be correctly
aligned - but those two positions are also required to be separated by
exactly sizeof(type) bytes. That's not possible unless sizeof(type) is
an integer multiple of _Alignof(type).

On the platform you describe, must every double be aligned on a 16 byte
address, so the SSE instructions can always be used? Then that means
that the SSE instructions will never actually operate on two 8-byte
doubles at the same time; at most, they will operate on one 8-byte
double and one 8-byte piece of padding. In that case, sizeof(double)
must include the padding in order to implement arrays of double
correctly, so sizeof(double)==16, not 8.

Alternatively, is it perfectly feasible to have one double object
aligned to a 16 byte address and the next double object aligned 8 bytes
later, allowing both to be processed by the same SSE instruction? If so,
then _Alignof(double) == 8, not 16.
Also, GPUs might have different alignment requirements than
traditional processors.

Having different alignment requirements is not, in itself, a problem for
C, for which those requirements are implementation-defined. It's only
inconsistencies of those requirements with other things such as
sizeof(type) that would be a problem.
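
The consistency requirement can be stated in code. A minimal C11 sketch; the macro SIZE_IS_MULTIPLE_OF_ALIGN and the function check_array_stride are names invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* sizeof(T) must be a whole multiple of _Alignof(T); otherwise the
   second element of a T[2] could not be correctly aligned. */
#define SIZE_IS_MULTIPLE_OF_ALIGN(T) (sizeof(T) % _Alignof(T) == 0)

_Static_assert(SIZE_IS_MULTIPLE_OF_ALIGN(short), "short");
_Static_assert(SIZE_IS_MULTIPLE_OF_ALIGN(int), "int");
_Static_assert(SIZE_IS_MULTIPLE_OF_ALIGN(double), "double");
_Static_assert(SIZE_IS_MULTIPLE_OF_ALIGN(long double), "long double");

/* Consecutive elements are exactly sizeof(double) bytes apart:
   there is no padding between array elements. */
void check_array_stride(void) {
    double d[2] = {0.0, 0.0};
    ptrdiff_t gap = (char *)&d[1] - (char *)&d[0];

    assert(gap == (ptrdiff_t)sizeof(double));
    assert(sizeof d == 2 * sizeof(double));
}
```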
 
Eric Sosman

(snip)


As I understand it, for some systems an alignment greater than
size is necessary for optimal use. Specifically, some of the SSE
instructions, as I understand it, will process pairs of doubles
aligned to 16 byte boundaries. I believe other combinations, such
as four floats or four ints.

Also, GPUs might have different alignment requirements than
traditional processors.

From C's standpoint, alignment can never exceed size: Arrays
would not work if it did.

It can still be true -- is true -- that the host system may
require or benefit from alignments that are unknown to C. For
example, O/S interfaces like Unix' mmap() require alignment on
memory pages. But "memory page" is not a C type, nor even a C
concept, and there's no direct way for C to control memory page
alignment. (Even with C11's _Alignas keyword, there's no way C
can discover the memory page size unaided -- and on systems that
support multiple page sizes simultaneously, the situation gets
even thornier.)
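
To illustrate that the page size comes from the O/S rather than from the language, here is a sketch that assumes a POSIX system (sysconf and posix_memalign are POSIX, not Standard C). The function names query_page_size and alloc_one_page are hypothetical, and the feature-test macro must precede every #include to take effect.

```c
#define _POSIX_C_SOURCE 200112L  /* request POSIX declarations */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* The page size is discovered at run time by asking the O/S. */
long query_page_size(void) {
    return sysconf(_SC_PAGESIZE);
}

/* Allocate one page-aligned page; returns NULL on failure.
   Release the result with free(). */
void *alloc_one_page(void) {
    long page = query_page_size();
    void *p = NULL;

    if (page <= 0 || posix_memalign(&p, (size_t)page, (size_t)page) != 0)
        return NULL;

    /* Implementation-defined conversion; fine on flat-address systems. */
    assert((uintptr_t)p % (uintptr_t)page == 0);
    return p;
}
```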
 
Keith Thompson

James Kuyper said:
On 03/28/2014 02:45 PM, glen herrmannsfeldt wrote: [...]
As I understand it, for some systems an alignment greater than
size is necessary for optimal use. Specifically, some of the SSE
instructions, as I understand it, will process pairs of doubles
aligned to 16 byte boundaries. I believe other combinations, such
as four floats or four ints.

In C, every object of a given type must be allocated at a location which
is correctly aligned for its type. In an array, that means that the
first element of the array and the second element must both be correctly
aligned - but those two positions are also required to be separated by
exactly sizeof(type) bytes. That's not possible unless sizeof(type) is
an integer multiple of _Alignof(type).

In other words, although C permits padding between struct members,
it does not permit padding between array elements.

If there's a type that's "naturally" 12 bytes long but that requires
8-byte alignment, that means that the compiler must treat that
type as having a size of (at least) 16 bytes, with 4 bytes not
contributing to the value. This is necessary because of the way C
defines array indexing.
 
Kaz Kylheku

(snip)


As I understand it, for some systems an alignment greater than
size is necessary for optimal use.

C does not support this. C compilers can support extra alignment for efficient
access in the way local variables are laid out and perhaps struct members.

No such thing will be supported for arrays and pointers.

If a greater alignment than size is required for correctness, then misaligned
access for pointers and arrays must be implemented.

E.g. if a short is two bytes, but must be aligned on a four-byte boundary, then
code generated for array indexing and pointer dereferencing has to somehow
handle the accesses at odd indices. Perhaps by rounding down to an address
divisible by four, loading a four byte word, and then shifting down
the half-word.
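
Written out in C rather than as generated machine code, that load-and-shift might look like the sketch below. It models a hypothetical machine where memory is reachable only as aligned 32-bit words, with two 16-bit values packed per word; which half holds the even index is a convention, and the low half is assumed here. The names load_half and store_half are invented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fetch 16-bit element `index` from word-only memory. */
uint16_t load_half(const uint32_t *words, size_t index) {
    assert(words != NULL);
    uint32_t word = words[index / 2];             /* containing word */
    unsigned shift = (unsigned)(index % 2) * 16u; /* 0 = low, 16 = high */
    return (uint16_t)((word >> shift) & 0xFFFFu);
}

/* Replace 16-bit element `index`, leaving its neighbour untouched. */
void store_half(uint32_t *words, size_t index, uint16_t value) {
    assert(words != NULL);
    unsigned shift = (unsigned)(index % 2) * 16u;
    uint32_t mask = (uint32_t)0xFFFFu << shift;
    words[index / 2] = (words[index / 2] & ~mask)
                     | ((uint32_t)value << shift);
}
```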

The guys who designed C were no strangers to machines that didn't provide
access to certain small types such as characters.

In fact, the B language, predecessor to C, handled strings similarly to C:
characters were packed into arrays of cells, which had to be unpacked and
re-packed by routines.

http://cm.bell-labs.com/who/dmr/chist.html

"[B's] character-handling mechanisms, inherited with few changes from BCPL,
were clumsy: using library procedures to spread packed strings into
individual cells and then repack, or to access and replace individual
characters, began to feel awkward, even silly, on a byte-oriented machine."

So at that point Ritchie went for an addressable character type.
That can basically be seen as the point of departure at which the design
of C shifted toward "every type, down to the character/byte, is accessible at
an address that is no more strictly aligned than a multiple of its size".
 
Stephen Sprunk

As I understand it, for some systems an alignment greater than size
is necessary for optimal use. Specifically, some of the SSE
instructions, as I understand it, will process pairs of doubles
aligned to 16 byte boundaries. I believe other combinations, such as
four floats or four ints.

Such instructions operate not on a single object but on a group of
objects, and it is the _group_ that must have greater alignment;
however, that is invisible at the C level as long as you're using ints,
floats, etc. If the compiler wants to auto-vectorize access to an array
of such objects, it is responsible for generating code to handle any
potential alignment issues at the front--and dealing with remainders at
the end.
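
The shape of such code, spelled out by hand, is roughly the following. This is a scalar sketch only: it uses no vector instructions itself, it assumes 8-byte doubles that are at least 8-byte aligned, and the names is_aligned_16 and sum_doubles are hypothetical. The uintptr_t conversion is implementation-defined but does the obvious thing on common platforms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

int is_aligned_16(const void *p) {
    return (uintptr_t)p % 16 == 0;
}

double sum_doubles(const double *v, size_t n) {
    assert(v != NULL || n == 0);
    double total = 0.0;
    size_t i = 0;

    /* Front: peel one element so the rest starts on a 16-byte boundary. */
    if (n > 0 && !is_aligned_16(v)) {
        total += v[0];
        i = 1;
    }

    /* Middle: each pair now starts 16-byte aligned, which is where
       a vectorizing compiler could use aligned pair instructions. */
    for (; i + 1 < n; i += 2)
        total += v[i] + v[i + 1];

    /* End: at most one element is left over. */
    if (i < n)
        total += v[i];

    return total;
}
```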

One alternative is a compiler extension to create vector types, such as
GCC's vector_size attribute; they are similar to (short) arrays but
always have the correct alignment for vector instructions, unlike normal
arrays, and that carries through to arrays of vectors. For instance:

typedef int v4si __attribute__ ((vector_size (16)));
v4si a = {1,2,3,4}; // always aligned
v4si b[2] = {{1,2,3,4},{5,6,7,8}}; // always aligned
int c[4] = {1,2,3,4}; // maybe unaligned

S
 
glen herrmannsfeldt

(snip, someone wrote)

(then I wrote)
In C, every object of a given type must be allocated at a location which
is correctly aligned for its type. In an array, that means that the
first element of the array and the second element must both be correctly
aligned - but those two positions are also required to be separated by
exactly sizeof(type) bytes. That's not possible unless sizeof(type) is
an integer multiple of _Alignof(type).
On the platform you describe, must every double be aligned on a 16 byte
address, so the SSE instructions can always be used?

Pairs of doubles are aligned on 16 byte boundaries. If you have an
array of even length, you could process them two at a time if
appropriately aligned. You can then, for example, add a pair of
doubles to another pair in one operation.
Then that means
that the SSE instructions will never actually operate on two 8-byte
doubles at the same time; at most, they will operate on one 8-byte
double and one 8-byte piece of padding. In that case, sizeof(double)
must include the padding in order to implement arrays of double
correctly, so sizeof(double)==16, not 8.

In some cases, the compiler might be able to generate appropriate
code, for example adding complex data. In others, one might want
to call, as an example, an FFT routine written in assembler that
could optimally use the SSE instructions on pairs of doubles.

In the struct case, one might have a struct with a pair of doubles
(or one complex double) along with some other types, and want the
pair of doubles appropriately aligned, even in an array of such.

-- glen
 
Stephen Sprunk

Pairs of doubles are aligned on 16 byte boundaries.

Standard C has no type "pair of doubles".

Standard C guarantees that _Alignof(double) <= sizeof(double). On x86,
we know that sizeof(double) == 8, so _Alignof(double) == 16 is not allowed.
If you have an array of even length, you could process them two at a
time if appropriately aligned. You can then, for example, add a pair of
doubles to another pair in one operation.

Standard C only guarantees that your array of doubles will have the same
alignment as one double, i.e. 8 bytes on x86.

If you want a guarantee that your array has 16-byte alignment, then you
must either use/create another type with 16-byte alignment as your array
element or use an extension to tell the compiler you want stricter
alignment for a double (or array of doubles) than Standard C requires.

Note that Standard C doesn't guarantee the existence of _any_ type with
16-byte alignment or the ability to create such, so the former may not
be possible, and the latter is inherently outside the Standard.
In some cases, the compiler might be able to generate appropriate
code, for example adding complex data. In others, one might want to
call, as an example, an FFT routine written in assembler that could
optimally use the SSE instructions on pairs of doubles.

If the subroutine is written in assembler, then obviously Standard C
says nothing about what it can or can't do, nor does Standard C
guarantee that a pointer-to-double you pass to it will be aligned as
expected.

If the subroutine were in Standard C, the compiler must properly handle
the 8-byte aligned case. However, there is nothing stopping it from
_also_ detecting the 16-byte aligned case and then using more efficient
vector instructions.
In the struct case, one might have a struct with a pair of doubles
(or one complex double) along with some other types, and want the
pair of doubles appropriately aligned, even in an array of such.

Assuming your struct only contains doubles, then the alignment will be
the same as for one double, like in the array case above.

S
 
Eric Sosman

Standard C has no type "pair of doubles".

Standard C guarantees that _Alignof(double) <= sizeof(double). On x86,
we know that sizeof(double) == 8, so _Alignof(double) == 16 is not allowed.


Standard C only guarantees that your array of doubles will have the same
alignment as one double, i.e. 8 bytes on x86.

If you want a guarantee that your array has 16-byte alignment, then you
must either use/create another type with 16-byte alignment as your array
element or use an extension to tell the compiler you want stricter
alignment for a double (or array of doubles) than Standard C requires.

Note that Standard C doesn't guarantee the existence of _any_ type with
16-byte alignment or the ability to create such, so the former may not
be possible, and the latter is inherently outside the Standard.


If the subroutine is written in assembler, then obviously Standard C
says nothing about what it can or can't do, nor does Standard C
guarantee that a pointer-to-double you pass to it will be aligned as
expected.

If the subroutine were in Standard C, the compiler must properly handle
the 8-byte aligned case. However, there is nothing stopping it from
_also_ detecting the 16-byte aligned case and then using more efficient
vector instructions.

I'm with you thus far, but ...
Assuming your struct only contains doubles, then the alignment will be
the same as for one double, like in the array case above.

... are you sure of this last bit? It seems to me that
the compiler is within its rights to require stricter alignment
for a struct or union than for any of the individual elements.
(It cannot use looser alignment, of course.) Do you have C&V
to the contrary?
 
James Kuyper

(then I [glen herrmannsfeldt] wrote)
As I understand it, for some systems an alignment greater than
size is necessary for optimal use. Specifically, some of the SSE
instructions, as I understand it, will process pairs of doubles
aligned to 16 byte boundaries. I believe other combinations, such
as four floats or four ints.
In C, every object of a given type must be allocated at a location which
is correctly aligned for its type. In an array, that means that the
first element of the array and the second element must both be correctly
aligned - but those two positions are also required to be separated by
exactly sizeof(type) bytes. That's not possible unless sizeof(type) is
an integer multiple of _Alignof(type).
On the platform you describe, must every double be aligned on a 16 byte
address, so the SSE instructions can always be used?

Pairs of doubles are aligned on 16 byte boundaries.

If two double can be 8 bytes apart, then _Alignof(double)<=8. Since
you'll probably want to have at least one double in any group of two or
more doubles to be aligned on a 16 byte boundary, that suggests that an
implementation for that platform should choose _Alignof(double)==8.
... If you have an
array of even length, you could process them two at a time if
appropriately aligned. You can then, for example, add a pair of
doubles to another pair in one operation.

That is an optimization that a compiler is allowed to take advantage of,
when it can - but if it doesn't prevent the existence of doubles
starting on addresses that are not multiples of 16, then it does NOT
mean that _Alignof(double) == 16.
In some cases, the compiler might be able to generate appropriate
code, for example adding complex data.

That would suggest that there is a strong incentive for _Alignof(double
_Complex) == 16, but that's a different issue.
 
James Kuyper

Standard C has no type "pair of doubles".

Actually, that's precisely what double[2] is; and _Alignof(double[2])
probably would be 16 on such a platform.
Standard C only guarantees that your array of doubles will have the same
alignment as one double, i.e. 8 bytes on x86.

While that's the only guarantee, an implementation is free to impose
stricter alignment requirements on arrays of a type than on the type itself.
If you want a guarantee that your array has 16-byte alignment, then you
must either use/create another type with 16-byte alignment as your array
element or use an extension to tell the compiler you want stricter
alignment for a double (or array of doubles) than Standard C requires.

Or you could use _Alignas(16), which is not an extension, but is new in
C2011.
Note that Standard C doesn't guarantee the existence of _any_ type with
16-byte alignment or the ability to create such, so the former may not
be possible, and the latter is inherently outside the Standard.

That is true: 16 is not required to be a valid fundamental or extended
alignment, and it is a constraint violation to specify _Alignas(n) when
n is neither 0 nor a valid alignment value (6.7.5p3). That's why it's
generally safer to specify _Alignas(type) rather than
_Alignas(constant_expression).
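
Both spellings, as a sketch for a C11 compiler. It assumes 16 is a valid alignment on the implementation at hand; where it is not, the _Alignas(16) line is a constraint violation and the file will not compile. The struct and function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Constant form: relies on 16 being a supported alignment here. */
struct pair16 {
    _Alignas(16) double v[2];
};

/* Type form: can only name an alignment the implementation supports. */
struct pair_max {
    _Alignas(max_align_t) double v[2];
};

_Static_assert(_Alignof(struct pair16) % 16 == 0,
               "the struct inherits the member's stricter alignment");
_Static_assert(_Alignof(struct pair_max) % _Alignof(max_align_t) == 0,
               "likewise for the type form");

/* The address of an actual object honours the request. */
void check_pair16_object(void) {
    static struct pair16 p;
    assert((uintptr_t)(void *)&p % 16 == 0);
}
```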
 
Stephen Sprunk

... are you sure of this last bit? It seems to me that the compiler
is within its rights to require stricter alignment for a struct or
union than for any of the individual elements. (It cannot use looser
alignment, of course.) Do you have C&V to the contrary?

With regard to the complex type that glen mentioned:

N1570 6.2.5p13:
"Each complex type has the same representation and alignment
requirements as an array type containing exactly two elements of the
corresponding real type;"

For structs, we know a struct's alignment has to be a positive multiple
of the largest member's alignment, and we also know it can't be larger
than the size of the struct, so the only possibilities here are 8 or 16.
I don't know why an implementation might choose the latter, but I can't
find anything in N1570 that says it isn't allowed to. So, correction noted.
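
Those constraints, as compile-time checks. A C11 sketch that assumes the implementation supports complex types (hence the __STDC_NO_COMPLEX__ guard); struct two_doubles is an illustrative name. Whether the struct's alignment comes out as 8 or 16 is left open, as above.

```c
#ifndef __STDC_NO_COMPLEX__
/* N1570 6.2.5p13: same representation and alignment requirements as
   a two-element array of the corresponding real type. _Alignof of an
   array type reports the element type's alignment (6.5.3.4p3). */
_Static_assert(sizeof(double _Complex) == sizeof(double[2]),
               "complex double is the size of two doubles");
_Static_assert(_Alignof(double _Complex) == _Alignof(double[2]),
               "and is aligned like an array of two doubles");
#endif

struct two_doubles { double re, im; };

/* The struct's alignment is a multiple of its member's alignment... */
_Static_assert(_Alignof(struct two_doubles) % _Alignof(double) == 0,
               "never looser than a member");
/* ...and divides the struct's size, so arrays of it work. */
_Static_assert(sizeof(struct two_doubles) % _Alignof(struct two_doubles) == 0,
               "size is a multiple of alignment");
```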

S
 
glen herrmannsfeldt

(then I wrote)
Standard C has no type "pair of doubles".
Standard C guarantees that _Alignof(double) <= sizeof(double).
On x86, we know that sizeof(double) == 8, so _Alignof(double) == 16
is not allowed.

Standard C doesn't care about speed at all, but users often do.

Note that x86, back to the 8086, doesn't require alignment, but it
is often faster if properly aligned. When the 80486 was popular,
four byte alignment of double was all that was needed.
(A 32 bit system, with a 32 bit data bus.) C compilers, and more
important most of the time, malloc() would generate four byte
alignment.

For way too long after the pentium became popular, C was still
generating four byte alignment.
Standard C only guarantees that your array of doubles will have
the same alignment as one double, i.e. 8 bytes on x86.

Note as above, x86 doesn't require 8 byte alignment for doubles,
so C might as well not do any padding, and malloc() might just
as well return odd addresses.
If you want a guarantee that your array has 16-byte alignment, then you
must either use/create another type with 16-byte alignment as your array
element or use an extension to tell the compiler you want stricter
alignment for a double (or array of doubles) than Standard C requires.

Or give up C and move onto other languages?
Note that Standard C doesn't guarantee the existence of _any_ type with
16-byte alignment or the ability to create such, so the former may not
be possible, and the latter is inherently outside the Standard.
(snip)

If the subroutine is written in assembler, then obviously Standard C
says nothing about what it can or can't do, nor does Standard C
guarantee that a pointer-to-double you pass to it will be aligned as
expected.
If the subroutine were in Standard C, the compiler must properly handle
the 8-byte aligned case. However, there is nothing stopping it from
_also_ detecting the 16-byte aligned case and then using more efficient
vector instructions.

But if you can't reliably generate them, that doesn't help much.
Assuming your struct only contains doubles, then the alignment will be
the same as for one double, like in the array case above.

And if it doesn't?

-- glen
 
