How to determine the byte order of a machine.

BartC

Scott Fluhrer said:
That is incorrect. The 68K processor was consistently bigendian, and so
0xDDCCBBAA would be stored as the bytes (DD, CC, BB, AA).

I'd been looking at this diagram (top of page 461):

http://tinyurl.com/7vjemjc

But perhaps I'd misunderstood what they meant by byte 0, 1, 2 and 3. I
assumed byte 0 was least significant, the same way bit 0 is. I could be
wrong.
 
Keith Thompson

Joe Pfeiffer said:
If I wanted to check whether a machine were big-endian or little-endian,
my first choice would be to see whether the htonl macro (which converts
a host-order long to a network-order long, which we know in turn is
defined to be big-endian) changes anything. I expect that macro is
outside the scope of this newsgroup, since I'd be surprised if it were
part of the C standard.

It isn't.
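
For anyone who does have it available, the htonl check described above might look like the sketch below. It assumes a POSIX system, where htonl() is declared in <arpa/inet.h>; the function name host_is_big_endian is made up for illustration. If htonl() leaves the probe value unchanged, host order already matches network (big-endian) order.

```c
#include <arpa/inet.h>  /* htonl(): POSIX, not ISO C (assumption: POSIX host) */
#include <stdint.h>

/* Hypothetical helper: returns 1 if host byte order equals network
   (big-endian) byte order for 32-bit values, 0 otherwise. */
int host_is_big_endian(void)
{
    const uint32_t probe = 0x01020304u;

    /* On a big-endian host htonl() is a no-op, so the value is unchanged. */
    return htonl(probe) == probe;
}
```

Note that a result of 0 only says the host is not big-endian; it does not by itself distinguish little-endian from a mixed byte order.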
 
Eric Sosman

[...]
You could also (on a machine with 8-bit bytes) declare an array of 4
chars, cast the array to a long and assign a value to it, then print the
values of the eight chars.

This particular suggestion has cropped up a couple times already
in this thread, so maybe it's time to point out the error: Since an
array of char has no particular alignment requirement, it might not
be aligned strictly enough for a `long' (or `int' or anything else).

unsigned char c[sizeof(unsigned long)] = { 1 }; // others 0
unsigned long *lp = (unsigned long*)&c[0]; // undefined
printf("%lX\n", *lp); // undefined

If you truly want to engage in this sort of thing, do it the
other way around:

unsigned long l = 1;
unsigned char *cp = (unsigned char*)&l;
for (int i = 0; i < sizeof l; ++i)
    printf("%X ", cp[i]);
printf("\n");

That is, begin with the multi-byte value properly aligned, then
inspect its bytes with a character pointer that needs no alignment.
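
If you really do want to start from the char array, one way to sidestep the alignment problem is to copy the bytes into a properly declared object with memcpy() instead of casting the array's address. This is only a sketch: the names long_from_bytes and first_byte_is_least_significant are invented here, and it assumes unsigned long has no padding bits.

```c
#include <string.h>

/* Hypothetical helper: build an unsigned long from sizeof(unsigned long)
   bytes. memcpy() imposes no alignment requirement on the source array. */
unsigned long long_from_bytes(const unsigned char *bytes)
{
    unsigned long value;

    memcpy(&value, bytes, sizeof value);
    return value;
}

/* Returns 1 if the lowest-addressed byte is the least significant one. */
int first_byte_is_least_significant(void)
{
    unsigned char c[sizeof(unsigned long)] = { 1 }; /* others 0 */

    return long_from_bytes(c) == 1UL;
}
```

The pointer cast is gone entirely, so there is nothing left to be misaligned.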
 
Shao Miller

Eric Sosman said:
[...]
You could also (on a machine with 8-bit bytes) declare an array of 4
chars, cast the array to a long and assign a value to it, then print the
values of the eight chars.

This particular suggestion has cropped up a couple times already
in this thread, so maybe it's time to point out the error: Since an
array of char has no particular alignment requirement, it might not
be aligned strictly enough for a `long' (or `int' or anything else).

unsigned char c[sizeof(unsigned long)] = { 1 }; // others 0
unsigned long *lp = (unsigned long*)&c[0]; // undefined
printf("%lX\n", *lp); // undefined

If you truly want to engage in this sort of thing, do it the
other way around:

unsigned long l = 1;
unsigned char *cp = (unsigned char*)&l;
for (int i = 0; i < sizeof l; ++i)
    printf("%X ", cp[i]);
printf("\n");

That is, begin with the multi-byte value properly aligned, then
inspect its bytes with a character pointer that needs no alignment.


Or use a 'union', perhaps:

int big_endian(void) {
    const union {
        unsigned long val;
        unsigned char bytes[sizeof (unsigned long)];
    } test = {42};
    return test.bytes[0] != 42;
}
 
Ben Bacarisse

BartC said:
I'd been looking at this diagram (top of page 461):

http://tinyurl.com/7vjemjc

But perhaps I'd misunderstood what they meant by byte 0, 1, 2 and 3. I
assumed byte 0 was least significant, the same way bit 0 is. I could
be wrong.

In all such diagrams that I've seen, bytes are numbered by address.
Byte 0 will be the byte with the lowest address. What varies from
machine to machine is the significance of byte 0 vs. that of byte 1.
However you read the numbers, I still can't see how you get BB,AA,DD,CC
from that diagram!
 
Joe Pfeiffer

Eric Sosman said:
[...]
You could also (on a machine with 8-bit bytes) declare an array of 4
chars, cast the array to a long and assign a value to it, then print the
values of the eight chars.

This particular suggestion has cropped up a couple times already
in this thread, so maybe it's time to point out the error: Since an
array of char has no particular alignment requirement, it might not
be aligned strictly enough for a `long' (or `int' or anything else).

unsigned char c[sizeof(unsigned long)] = { 1 }; // others 0
unsigned long *lp = (unsigned long*)&c[0]; // undefined
printf("%lX\n", *lp); // undefined

If you truly want to engage in this sort of thing, do it the
other way around:

unsigned long l = 1;
unsigned char *cp = (unsigned char*)&l;
for (int i = 0; i < sizeof l; ++i)
    printf("%X ", cp[i]);
printf("\n");

That is, begin with the multi-byte value properly aligned, then
inspect its bytes with a character pointer that needs no alignment.


Ah, of course. I don't think I've ever seen an array that wasn't
aligned appropriately when required, but you are correct.
 
BartC

Ben Bacarisse said:
In all such diagrams that I've seen, bytes are numbered by address.
Byte 0 will be the byte with the lowest address. What varies from
machine to machine is the significance of byte 0 vs. that of byte 1.
However you read the numbers, I still can't see how you get BB,AA,DD,CC
from that diagram!

If the byte numbers are simply offsets from the start address, then the
diagram doesn't show whether the high or low half is stored first. So it
could be BB,AA, DD,CC or DD,CC, BB,AA. But it looks like it's the latter,
and it must be some other system which is mixed up.
 
Ben Bacarisse

BartC said:
If the byte numbers are simply offsets from the start address, then
the diagram doesn't show whether the high or low half is stored
first.

No, it doesn't. I think the diagram just illustrates addressing, not
significance within a word. It's the following text that adds this
detail at the very end of the paragraph. The diagram shows the
significance of words within long words (the (H) and (L) in the example)
but not of bytes within words. It would not have hurt to add (H) and
(L) after the bytes as well, but the author chose not to.

BartC said:
So it could be BB,AA, DD,CC or DD,CC, BB,AA. But it looks like
it's the latter, and it must be some other system which is mixed up.

The (H) makes it clear that the first word (byte pair) must be the
high-order one, so you must reject BB,AA (and AA,BB) based on the
diagram alone. You need to read the text to tell that it's DD,CC;BB,AA
rather than CC,DD;AA,BB.
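
To make the candidate layouts concrete, here is a small sketch that lays out a 32-bit value in address order for each combination of word order and byte-within-word order. The function layout_bytes and its parameters are hypothetical, purely for illustration; nothing here comes from the 68K documentation.

```c
/* Hypothetical helper: write the four bytes of a 32-bit value into out[]
   in address order, given the order of the 16-bit words and the order of
   the bytes inside each word. */
void layout_bytes(unsigned long value, int high_word_first,
                  int high_byte_first, unsigned char out[4])
{
    unsigned int words[2];

    words[0] = (unsigned int)((value >> 16) & 0xFFFFu); /* high word */
    words[1] = (unsigned int)(value & 0xFFFFu);         /* low word */

    for (int w = 0; w < 2; ++w) {
        unsigned int word = words[high_word_first ? w : 1 - w];
        unsigned char hi = (unsigned char)(word >> 8);
        unsigned char lo = (unsigned char)(word & 0xFFu);

        out[2 * w]     = high_byte_first ? hi : lo;
        out[2 * w + 1] = high_byte_first ? lo : hi;
    }
}
```

For 0xDDCCBBAA, high word first with high byte first gives DD,CC,BB,AA; high word first with low byte first gives CC,DD,AA,BB; low word first with high byte first gives BB,AA,DD,CC.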
 
Nick Keighley

Your best bet is to write code that works with values, regardless
of how those values are represented. Ideally, you would not even
know how the machine represents eight hundred seventeen; you should
concern yourself with finding its factors or comparing it to nine
hundred twenty-six or whatever. It is occasionally necessary to
pierce the veil, but not very often.

Usually that is when you have to deal with external representation. You
may care about byte order and suchlike when data gets stuffed down comms
links or into files. But well-written programs have only a tiny amount
of code that cares about this.
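
As a rough illustration of how small that code can be, the sketch below reads and writes a 32-bit value in big-endian external form using only shifts and masks, so it behaves the same whatever the host's byte order is. The names put_u32_be and get_u32_be are invented for this example, and big-endian is just an assumed wire format.

```c
#include <stdint.h>

/* Hypothetical helper: store value as four bytes, most significant first. */
void put_u32_be(unsigned char out[4], uint32_t value)
{
    out[0] = (unsigned char)((value >> 24) & 0xFFu);
    out[1] = (unsigned char)((value >> 16) & 0xFFu);
    out[2] = (unsigned char)((value >> 8) & 0xFFu);
    out[3] = (unsigned char)(value & 0xFFu);
}

/* Hypothetical helper: rebuild the value from four big-endian bytes. */
uint32_t get_u32_be(const unsigned char in[4])
{
    return ((uint32_t)in[0] << 24) |
           ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |
            (uint32_t)in[3];
}
```

Because the code works on values rather than on the in-memory representation, it never needs to know whether the host is big-endian or little-endian.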
 
