glen herrmannsfeldt
(snip)
There are 32 bit (address width) processors with a 64 bit data bus,
such that 64 bit aligned load/store are faster for 64 bit data,
than unaligned load/store.
The ones I know of don't require alignment, but it is faster.
-- glen
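#include <stdint.h>

/* Read a 64-bit integer from an address known to be 4-byte aligned. */
int64_t fetch64(const uint32_t *p) {
    union {
        struct { uint32_t a; uint32_t b; } s;
        int64_t x;
    } u;
    u.s.a = *p++;   /* word at the lower address */
    u.s.b = *p;     /* word at the higher address */
    return u.x;     /* same byte layout as in memory */
}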
This should work on all architectures, and should give the best code for
reading an 8-byte integer from an address that is known to be 4-byte
aligned. As far as I know, the code is fully portable C (hopefully
someone will correct me if I'm wrong - c.l.c. is good at that!).
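
For comparison, here is a minimal sketch of the same fetch written with
memcpy instead of a union. The name fetch64_memcpy is made up for
illustration and is not from the code above. The assumption is that a
compiler which handles the union version well will usually do the same
for a fixed-size memcpy, but I have not checked the arm7tdmi output for it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical alternative to the union version: copy the 8 bytes
   starting at p into an int64_t, keeping the in-memory byte order. */
int64_t fetch64_memcpy(const uint32_t *p) {
    int64_t x;
    assert(p != NULL);          /* caller must pass a valid pointer */
    memcpy(&x, p, sizeof x);    /* reads p[0] and p[1] as stored */
    return x;
}
```

Like the union version, the result has whatever byte order the machine
uses, since the eight bytes are taken exactly as they sit in memory.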
Compiling with gcc for the ARM (arm7tdmi) gives:
fetch64:
    ldmia r0, {r0, r1}
    bx lr
Like most (or perhaps all) 32-bit processors, ARM is perfectly happy
with 4-byte alignment for 8-byte integers. (It may require 8-byte
alignment for doubles on CPUs that support hardware floating point - I
haven't checked those details.) On some ARMs, it is more efficient when
the load is 8-byte aligned because it can use a single 64-bit memory
access - but it will still work fine with 4-byte alignment.
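
If you want to know which case a given pointer falls into, a rough
sketch of an alignment test follows. The helper name is_aligned is made
up for illustration. It assumes that converting a pointer to uintptr_t
gives the plain address, which is implementation-defined but holds on
the usual flat-memory targets.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if the address p is a multiple of n.
   n must be a power of two (4 and 8 are the cases of interest here). */
bool is_aligned(const void *p, uintptr_t n) {
    assert(n != 0 && (n & (n - 1)) == 0);   /* n is a power of two */
    return ((uintptr_t)p & (n - 1)) == 0;   /* low bits of address are zero */
}
```

With this, is_aligned(p, 8) tells you whether a fetch from p could be
done as a single 64-bit access on hardware that has one, and
is_aligned(p, 4) checks the precondition the fetch relies on.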