unaligned pointer access

glen herrmannsfeldt · Sep 12, 2013

(snip)

int64_t fetch64(uint32_t *p) {
    union {
        struct { uint32_t a; uint32_t b; } s;
        int64_t x;
    } u;
    u.s.a = *p++;
    u.s.b = *p;
    return u.x;
}

This should work on all architectures, and give the best code for
reading an 8-byte int from an address that is known to be 4-byte
aligned. As far as I know, the code is fully portable C (hopefully
someone will correct me if I'm wrong - this c.l.c. is good at that!).

Compiling with gcc for the ARM (arm7tdmi) gives:

fetch64:
        ldmia   r0, {r0, r1}
        bx      lr

Like most (or perhaps all) 32-bit processors, ARM is perfectly happy
with 4-byte alignment for 8-byte integers. (It may require 8-byte
alignment for doubles for cpus that support hardware floating point - I
haven't checked those details.) On some ARMs, it is more efficient when
the load is 8-byte aligned because it can use a single 64-bit memory
access - but it will still work fine with 4-byte alignment.

There are 32 bit (address width) processors with a 64 bit data bus,
such that 64 bit aligned load/store are faster for 64 bit data,
than unaligned load/store.

The ones I know of don't require alignment, but it is faster.

-- glen

James Kuyper · Sep 12, 2013

On 09/11/2013 01:45 PM, James Kuyper wrote:
....

No. "The combined effect of all alignment attributes in a declaration
shall not specify an alignment that is less strict than the alignment
that would otherwise be required for the type of the object or member
being declared." (6.7.5p4)

That "shall" occurs in a Constraints section, so creating such an
alignment attribute would be a constraint violation, requiring a
diagnostic. _Alignas() is a new feature, and I hadn't previously noticed
this clause. I had thought that _Alignas() requirements that were less
strict were simply ignored.

This means that unless you're certain whether or not _Alignof(T) is less
than _Alignof(U), (where T and U are type names) you should write:

The next 3 _Alignof()s were supposed to be _Alignas():

Keith Thompson · Sep 12, 2013

James Kuyper said:
I know of no reason why that should be the case - could you explain why you
think it is?

If an object of type int64_t is allocated in a way that lets the
compiler control its alignment (say, if it's a declared object, a
malloc()ed object, or a subobject of such an object), it will be
properly aligned. Just what that means depends on the implementation;
one implementation might require 8-byte alignment, another might not
require any particular alignment.

If a chunk of memory that's not such an object is treated as an object
of type int64_t, then the behavior is undefined unless it's properly
aligned (and on some systems, a 4-byte aligned int64_t is not properly
aligned).
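As a minimal sketch of the point above, the check below tests whether an arbitrary address meets this implementation's alignment requirement for int64_t before it is read as one. The names is_aligned_for_int64 and load_checked are hypothetical, and the sketch assumes that uintptr_t exists and that converting a pointer to it preserves the low address bits, which the standard does not guarantee but common implementations provide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if p satisfies this implementation's
   alignment requirement for int64_t. Assumes uintptr_t exists and that
   the pointer-to-integer conversion preserves the low address bits. */
bool is_aligned_for_int64(const void *p) {
    return (uintptr_t)p % _Alignof(int64_t) == 0;
}

/* Hypothetical helper: read an int64_t only after checking alignment.
   Assumes the memory at p really holds an int64_t object; alignment is
   necessary for a valid access, not sufficient. */
int64_t load_checked(const void *p) {
    assert(is_aligned_for_int64(p));
    return *(const int64_t *)p;
}
```

A declared int64_t always passes the check; an address one byte into a buffer fails it wherever _Alignof(int64_t) is greater than 1.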

Keith Thompson · Sep 12, 2013

David Brown said:
That is correct. In general, gcc's "packed" attribute tells gcc to
disregard standard alignment rules - it should put elements tightly
together without any padding, and it should not assume any alignments
for the struct or its elements. So for a target that does not support
unaligned accesses, it is forced to use byte accesses. (Actually, it
might figure out that it can do better than that in some circumstances,
if it has all the information at hand.)

But the use of the __attribute((packed)) or #pragma pack can lead
to incorrect code. The compiler can detect misaligned accesses that
refer to the member name, but if you take the address of a misaligned
member and later dereference that pointer, your program can crash.

http://stackoverflow.com/q/8568432/827263
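To illustrate the hazard described above, here is a hedged sketch using gcc's packed attribute (an extension, so this is not portable C). The struct record and both function names are made up for the example. Access by member name is left to the compiler; the second function avoids ever holding an int64_t pointer to the misaligned member by copying its bytes out instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* __attribute__((packed)) is a gcc/clang extension, not standard C. */
struct __attribute__((packed)) record {
    uint8_t tag;
    int64_t value;   /* follows tag directly, so it is usually misaligned */
};

/* Access by member name: the compiler knows the member may be
   misaligned and emits suitable code. */
int64_t read_by_name(const struct record *r) {
    assert(r != NULL);
    return r->value;
}

/* Instead of keeping an int64_t * that points at the member, copy
   its bytes out through an unsigned char pointer. */
int64_t read_by_copy(const struct record *r) {
    int64_t v;
    assert(r != NULL);
    memcpy(&v, (const unsigned char *)r + offsetof(struct record, value),
           sizeof v);
    return v;
}
```

The pattern to avoid is storing &r->value in an int64_t * and dereferencing it later, since the compiler can then no longer see that the pointee came from a packed struct.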

Sven Köhler · Sep 12, 2013

On 12.09.2013 03:11, James Kuyper wrote:

The way arrays work implies that sizeof(T) must be an integer multiple
of _Alignof(T), so you don't need to worry about that possibility.
However, the standard says nothing to prohibit unnecessary padding
between members of a struct; it does prohibit padding between elements
of an array. Because the padding is in fact unnecessary, you're pretty
unlikely to run into an implementation where the difference matters. But
using an array is no more complicated than using a struct (in fact, it's
marginally simpler), so why not use the approach that is also safer,
even if only by an infinitesimal amount?
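A sketch of the array-based variant suggested above, assuming the same goal as the earlier fetch64 (reading an int64_t from a 4-byte-aligned address). The name fetch64_array is hypothetical. Because array elements cannot have padding between them, the two words are guaranteed to cover the int64_t exactly; which word ends up as the low half still depends on the target's byte order.

```c
#include <assert.h>
#include <stdint.h>

/* Exact-width types have no padding bits, so this always holds. */
static_assert(sizeof(uint32_t[2]) == sizeof(int64_t),
              "two 32-bit words must exactly cover an int64_t");

/* Hypothetical array-based counterpart of fetch64: p only needs the
   alignment of uint32_t. */
int64_t fetch64_array(const uint32_t *p) {
    union {
        uint32_t w[2];   /* no padding is allowed between elements */
        int64_t x;
    } u;
    u.w[0] = p[0];
    u.w[1] = p[1];
    return u.x;
}
```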

I see, that makes sense! Thanks.

Regards,
Sven

Eric Sosman · Sep 12, 2013

Alignment must always be a factor of the size; it can't be greater. Otherwise arrays with more than one element couldn't work. There is no padding between array elements.

C11 adds a requirement that the alignment be a power of two.
A ten-byte `long double' is okay, but not if it needs five-byte
alignment.
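The two rules can be checked at compile time. The following is only a sketch: the macro names IS_POWER_OF_TWO and CHECK_TYPE and the function array_stride_long_double are invented for the example, and it needs a C11 compiler for _Alignof and static_assert.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IS_POWER_OF_TWO(n) ((n) != 0 && (((n) & ((n) - 1)) == 0))

/* Compile-time check that a type's size is a multiple of its alignment
   and that the alignment is a power of two. */
#define CHECK_TYPE(T) \
    static_assert(sizeof(T) % _Alignof(T) == 0, \
                  "sizeof(" #T ") is not a multiple of its alignment"); \
    static_assert(IS_POWER_OF_TWO(_Alignof(T)), \
                  "alignment of " #T " is not a power of two")

CHECK_TYPE(char);
CHECK_TYPE(int);
CHECK_TYPE(int64_t);
CHECK_TYPE(double);
CHECK_TYPE(long double);

/* Distance in bytes between consecutive array elements; with no padding
   between elements this equals sizeof(long double). */
size_t array_stride_long_double(void) {
    long double a[2];
    return (size_t)((char *)&a[1] - (char *)&a[0]);
}
```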

(I suppose if a struct has a size > SIZE_MAX / 2 where an array of two elements is not possible, the size could be odd while alignment is even).

Hadn't thought of that one. Interesting corner case.

Sven Köhler · Sep 24, 2013

(I suppose if a struct has a size > SIZE_MAX / 2 where an array of
two elements is not possible, the size could be odd while alignment
is even).

To the best of my knowledge, the size of a struct is always padded to be
a multiple of the alignments of the members. I.e. if you have a struct
of an int (size 4) and a char in that order, then the struct will be
padded to size 8 if the int requires alignment 4.
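A small sketch of that example, with the hypothetical names int_then_char and tail_padding. It asserts only what has to hold for arrays of the struct to work, and leaves the exact numbers to the implementation.

```c
#include <assert.h>
#include <stddef.h>

struct int_then_char {
    int i;
    char c;
};

/* Every element of an array of this struct must keep i aligned, so the
   struct's size has to be a multiple of int's alignment. */
static_assert(sizeof(struct int_then_char) % _Alignof(int) == 0,
              "struct size must be a multiple of its int member's alignment");

/* Bytes of padding after the last member. */
size_t tail_padding(void) {
    return sizeof(struct int_then_char)
         - (offsetof(struct int_then_char, c) + sizeof(char));
}
```

On an implementation where int has size 4 and alignment 4, sizeof(struct int_then_char) would be 8 and tail_padding() would return 3.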

Regards,
Sven

Edward A. Falk · Sep 25, 2013

typedef struct __attribute__((packed)) {
int64_t x;
} s1;

[...]

What do you mean by "portable?"

Yeah, "portable" pretty much went out the window by the second paragraph.

Edward A. Falk · Sep 25, 2013

Okay, I *think* I get it, but let me try to restate the
problem in case I'm still lost:

You've got a pointer to a batch of bytes that you'd like
to treat as an int64_t, but you fear the address may not meet
int64_t's alignment requirement. You've tried various gcc
extensions but aren't entirely happy with them, because they
produce ultra-conservative byte-at-a-time code even on systems
where the penalty for unaligned access would be tolerable. You
seek an incantation that will produce "good" code on such systems
yet produce "safe" code on others. Have I got it?

Best bet is to use memcpy() to move the data to a properly-aligned
variable, and hope that the compiler replaces the memcpy() with
something more efficient for the architecture. There might be
non-portable techniques that are more efficient, but unless this
is critical inner-loop stuff, it's almost certainly not worth it.
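A minimal sketch of the memcpy() approach, assuming the bytes at the source address are in the machine's native int64_t representation. The name load_i64 is hypothetical. Whether the call is reduced to a single load is up to the compiler and target; that part is a hope, as the post says, not a guarantee.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read an int64_t from an address with no
   alignment guarantee by copying into a properly aligned local. */
int64_t load_i64(const void *src) {
    int64_t v;
    assert(src != NULL);
    memcpy(&v, src, sizeof v);
    return v;
}
```

It can be called on any byte offset within a buffer, for example load_i64(buf + 1), without the alignment concerns of casting the address to int64_t *.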
