Endian revisited

Sheldon

Hi,

I am trying to make sense of this endian problem and so far, it is
still Greek to me.
I have some files that store lat and lon data in binary
format. The data was originally 32-bit floats, written with the
most significant byte at the lowest address (big endian). I have an
Intel processor and am working in Mandrake. In Python, I have solved this
with a built-in function, but in C, I am at a loss. What I have grasped
so far is that the data ought to be read into an unsigned char array and
then I have to look at the data bit by bit and then swap them. It is
this part that loses me.

I have seen this example for an int 32:

unsigned char j[4];
uint32_t o;
read(descriptor, j, 4); /* error checking omitted for simplicity */
o = j[0]; o <<= 8;
o |= j[1]; o <<= 8;
o |= j[2]; o <<= 8;
o |= j[3];

Could someone please explain what this person did, or, explain how to
do this byte swap?

Sincerely,
Sheldon
 
Robert Latest

On 23 Oct 2006 05:38:42 -0700, Sheldon said:
unsigned char j[4];
uint32_t o;
read(descriptor, j, 4); /* error checking omitted for simplicity */
o = j[0]; o <<= 8;
o |= j[1]; o <<= 8;
o |= j[2]; o <<= 8;
o |= j[3];

Could someone please explain what this person did, or, explain how to
do this byte swap?

Assuming that you know the meaning of the binary-or and the
bit-shift operator, what is it that you're having trouble with?

In fact the code is written a bit awkwardly; I'd have done it like
this:

o = (j[0] << 24) | (j[1] << 16) | (j[2] << 8) | j[3];

In your case you'd have to do it differently because the bitwise
operators won't make much sense with floats. You'd just swap the
four bytes in your array and cast the result to the appropriate
type -- assuming that the binary representation of floats on the
machine that generated the numbers is the same as on the machine
you read them with.

robert
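
The "swap the four bytes" idea from the post above could be sketched roughly as follows. The function name swapped_bytes_to_float is hypothetical, and the sketch assumes a little-endian host whose float is a 32-bit IEEE 754 single, as on the Intel machine described in the question. It also copies the bytes with memcpy instead of casting, which is a substitution for the cast mentioned in the post, not the poster's exact method.

```c
#include <string.h>

/* Hypothetical helper: turn four big-endian bytes into a float.
   Assumes a little-endian host and a 4-byte IEEE 754 float. */
float swapped_bytes_to_float(const unsigned char in[4])
{
    unsigned char tmp[4];
    float f;

    /* Reverse the byte order: big-endian file order -> little-endian host order. */
    tmp[0] = in[3];
    tmp[1] = in[2];
    tmp[2] = in[1];
    tmp[3] = in[0];

    /* Copy the reordered bytes into the float object. */
    memcpy(&f, tmp, sizeof f);
    return f;
}
```

On a big-endian host the reversal would be wrong, which is one reason the shift-based approach discussed elsewhere in the thread is often preferred.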
 
CBFalconer

Robert said:
Sheldon said:
unsigned char j[4];
uint32_t o;
read(descriptor, j, 4); /* error checking omitted for simplicity */
o = j[0]; o <<= 8;
o |= j[1]; o <<= 8;
o |= j[2]; o <<= 8;
o |= j[3];

Could someone please explain what this person did, or, explain how
to do this byte swap?

Assuming that you know the meaning of the binary-or and the
bit-shift operator, what is it that you're having trouble with?

In fact the code is written a bit awkwardly; I'd have done it like
this:

o = (j[0] << 24) | (j[1] << 16) | (j[2] << 8) | j[3];

Consider the result when CHAR_BIT is 9, for instance.
 
Sheldon

Robert Latest wrote:
On 23 Oct 2006 05:38:42 -0700, Sheldon said:
unsigned char j[4];
uint32_t o;
read(descriptor, j, 4); /* error checking omitted for simplicity */
o = j[0]; o <<= 8;
o |= j[1]; o <<= 8;
o |= j[2]; o <<= 8;
o |= j[3];

Could someone please explain what this person did, or, explain how to
do this byte swap?

Assuming that you know the meaning of the binary-or and the
bit-shift operator, what is it that you're having trouble with?

In fact the code is written a bit awkwardly; I'd have done it like
this:

o = (j[0] << 24) | (j[1] << 16) | (j[2] << 8) | j[3];

In your case you'd have to do it differently because the bitwise
operators won't make much sense with floats. You'd just swap the
four bytes in your array and cast the result to the appropriate
type -- assuming that the binary representation of floats on the
machine that generated the numbers is the same as on the machine
you read them with.

robert

I cannot check whether the binary representation on my PC and on the IBM
the files were written on are the same, but they should be OK, since this
conversion was done on my machine before using IDL and Python.
What I don't understand is that four bytes are read into an array and
then shifted: byte 1 to the left by 24, byte 2 by 16, byte 3 by 8, and
byte 4 stays where it is. What happened? Does this shifting push the bytes
forward so that at the end byte 4 is in the first 4 addresses?
I would be reading in a 2D array, so how would I then do this bytewise
shifting for each position?

/Sheldon
 
Bill Medland

Sheldon said:
Hi,

I am trying to make sense of this endian problem and so far, it is
still Greek to me.
I have some files that store lat and lon data in binary
format. The data was originally 32-bit floats, written with the
most significant byte at the lowest address (big endian). I have an
Intel processor and am working in Mandrake. In Python, I have solved this
with a built-in function, but in C, I am at a loss. What I have grasped
so far is that the data ought to be read into an unsigned char array and
then I have to look at the data bit by bit and then swap them. It is
this part that loses me.

I have seen this example for an int 32:

unsigned char j[4];
uint32_t o;
read(descriptor, j, 4); /* error checking omitted for simplicity */
o = j[0]; o <<= 8;
o |= j[1]; o <<= 8;
o |= j[2]; o <<= 8;
o |= j[3];

Could someone please explain what this person did, or, explain how to
do this byte swap?

Sincerely,
Sheldon

Take the 8 bits of the first byte and place them in the lower 8 bits of the
32-bit unsigned type.
Shift the whole lot left 8 bits so that they now sit in bits 8-15, with bits
0 to 7 now zero.
Now expand the second byte into a temporary uint32_t so the bits are in the
lowest 8 bits and merge this temporary one into the first. So now the
first byte is in bits 8-15 and the second byte is in bits 0-7.
Repeat a couple more times and you get an unsigned 32-bit value with the
first byte in the top 8 bits and the last byte in the bottom 8 bits.

(And don't worry about whether the uint32_t is storing its information in LSB
or MSB format; it doesn't matter, because the shifts operate on the value,
not on its layout in memory.)
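
The steps described above can be traced with concrete numbers. The following is only an illustrative sketch: the function name be_bytes_to_u32 and the sample bytes 0x12 0x34 0x56 0x78 are made up for the example, and 8-bit bytes are assumed.

```c
#include <stdint.h>

/* Hypothetical helper: assemble four big-endian bytes into a 32-bit value.
   The comments trace the value of o for the input bytes 0x12 0x34 0x56 0x78. */
uint32_t be_bytes_to_u32(const unsigned char j[4])
{
    uint32_t o;

    o = j[0];    /* o == 0x00000012 */
    o <<= 8;     /* o == 0x00001200 */
    o |= j[1];   /* o == 0x00001234 */
    o <<= 8;     /* o == 0x00123400 */
    o |= j[2];   /* o == 0x00123456 */
    o <<= 8;     /* o == 0x12345600 */
    o |= j[3];   /* o == 0x12345678 */

    return o;
}
```

Nothing moves between memory addresses here. Each shift multiplies the value by 256, so the result is the same number on any host, whatever byte order the host uses to store it.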
 
Sheldon

Bill Medland wrote:
Sheldon said:
Hi,

I am trying to make sense of this endian problem and so far, it is
still Greek to me.
I have some files that store lat and lon data in binary
format. The data was originally 32-bit floats, written with the
most significant byte at the lowest address (big endian). I have an
Intel processor and am working in Mandrake. In Python, I have solved this
with a built-in function, but in C, I am at a loss. What I have grasped
so far is that the data ought to be read into an unsigned char array and
then I have to look at the data bit by bit and then swap them. It is
this part that loses me.

I have seen this example for an int 32:

unsigned char j[4];
uint32_t o;
read(descriptor, j, 4); /* error checking omitted for simplicity */
o = j[0]; o <<= 8;
o |= j[1]; o <<= 8;
o |= j[2]; o <<= 8;
o |= j[3];

Could someone please explain what this person did, or, explain how to
do this byte swap?

Sincerely,
Sheldon

Take the 8 bits of the first byte and place them in the lower 8 bits of the
32-bit unsigned type.
Shift the whole lot left 8 bits so that they now sit in bits 8-15, with bits
0 to 7 now zero.
Now expand the second byte into a temporary uint32_t so the bits are in the
lowest 8 bits and merge this temporary one into the first. So now the
first byte is in bits 8-15 and the second byte is in bits 0-7.
Repeat a couple more times and you get an unsigned 32-bit value with the
first byte in the top 8 bits and the last byte in the bottom 8 bits.

(And don't worry about whether the uint32_t is storing its information in LSB
or MSB format; it doesn't matter, because the shifts operate on the value,
not on its layout in memory.)

I see. My next question is really silly, but here goes: when I read the
data into a 15x15 array of unsigned char and then take the first
position, array[0][0], how do I then break this up into 4x4 bytes to do
the swapping?

/Sheldon
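
One possible way to approach the 15x15 case is sketched below. A single unsigned char such as array[0][0] holds only one byte, so it cannot be broken up further. Instead, each float in the file occupies four consecutive bytes, and those four bytes are read and combined per grid cell. The sketch assumes the file holds 15x15 big-endian 32-bit IEEE 754 floats in row-major order, which the thread does not confirm, and the names ROWS, COLS and read_be_float_grid are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define ROWS 15   /* assumed grid height */
#define COLS 15   /* assumed grid width  */

/* Hypothetical helper: read ROWS x COLS big-endian 32-bit floats from fp.
   Returns 0 on success, -1 if the file ends early or a read fails. */
int read_be_float_grid(FILE *fp, float grid[ROWS][COLS])
{
    unsigned char b[4];

    for (int r = 0; r < ROWS; r++) {
        for (int c = 0; c < COLS; c++) {
            /* Four bytes in the file make up one float. */
            if (fread(b, 1, 4, fp) != 4)
                return -1;

            /* Combine the bytes, most significant first. */
            uint32_t bits = ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16)
                          | ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];

            /* Copy the bit pattern into the float (assumes a 4-byte float). */
            memcpy(&grid[r][c], &bits, sizeof grid[r][c]);
        }
    }
    return 0;
}
```

The file would be opened in binary mode, for example with fopen(name, "rb"), before calling the function.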
 
Robert Latest

On Mon, 23 Oct 2006 09:38:23 -0400,
CBFalconer said:
Robert said:
Sheldon said:
unsigned char j[4];
uint32_t o;
read(descriptor, j, 4); /* error checking omitted for simplicity */
o = j[0]; o <<= 8;
o |= j[1]; o <<= 8;
o |= j[2]; o <<= 8;
o |= j[3];

Could someone please explain what this person did, or, explain how
to do this byte swap?

Assuming that you know the meaning of the binary-or and the
bit-shift operator, what is it that you're having trouble with?

In fact the code is written a bit awkwardly; I'd have done it like
this:

o = (j[0] << 24) | (j[1] << 16) | (j[2] << 8) | j[3];

Consider the result when CHAR_BIT is 9, for instance.

Given that this is about the endianness of floating point numbers,
portability wasn't a prime concern to begin with.

robert
 
Bill Medland

CBFalconer said:
Robert said:
Sheldon said:
unsigned char j[4];
uint32_t o;
read(descriptor, j, 4); /* error checking omitted for simplicity */
o = j[0]; o <<= 8;
o |= j[1]; o <<= 8;
o |= j[2]; o <<= 8;
o |= j[3];

Could someone please explain what this person did, or, explain how
to do this byte swap?

Assuming that you know the meaning of the binary-or and the
bit-shift operator, what is it that you're having trouble with?

In fact the code is written a bit awkwardly; I'd have done it like
this:

o = (j[0] << 24) | (j[1] << 16) | (j[2] << 8) | j[3];

Consider the result when CHAR_BIT is 9, for instance.

If we're going to be that pedantic, doesn't it all fall apart if an int is
16 bits?
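
Both objections can be addressed in the one-line form. In the quoted expression each j[i] is promoted to int before the shift, so j[0] << 24 overflows a 16-bit int, and even with a 32-bit int it can shift a bit into the sign position. A minimal sketch of one fix is to convert each byte to uint32_t first. The function name be_bytes_to_u32_expr is hypothetical, and the 0xFF masks only matter for the CHAR_BIT > 8 case. Such a platform would probably not provide uint32_t at all, in which case uint_least32_t would be the type to reach for.

```c
#include <stdint.h>

/* Hypothetical helper: single-expression version of the byte assembly. */
uint32_t be_bytes_to_u32_expr(const unsigned char j[4])
{
    /* Casting before shifting makes every shift happen in 32-bit unsigned
       arithmetic, regardless of the width of int. The masks keep only the
       low 8 bits of each array element. */
    return ((uint32_t)(j[0] & 0xFFu) << 24)
         | ((uint32_t)(j[1] & 0xFFu) << 16)
         | ((uint32_t)(j[2] & 0xFFu) << 8)
         |  (uint32_t)(j[3] & 0xFFu);
}
```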
 
Thad Smith

Sheldon said:
I am trying to make sense of this endian problem and so far, it is
still Greek to me.
I have some files that store lat and lon data in binary
format. The data was originally 32-bit floats, written with the
most significant byte at the lowest address (big endian). I have an
Intel processor and am working in Mandrake. In Python, I have solved this
with a built-in function, but in C, I am at a loss. What I have grasped
so far is that the data ought to be read into an unsigned char array and
then I have to look at the data bit by bit and then swap them. It is
this part that loses me.

Actually, you make the adjustment a byte at a time, not a bit at a time.

I have seen this example for an int 32:

unsigned char j[4];
uint32_t o;
read(descriptor, j, 4); /* error checking omitted for simplicity */
o = j[0]; o <<= 8;
o |= j[1]; o <<= 8;
o |= j[2]; o <<= 8;
o |= j[3];

Could someone please explain what this person did, or, explain how to
do this byte swap?

This code computes an unsigned 32-bit value, assuming that the input is
big-endian with 8-bit bytes. It works on a big-, little-, or
mixed-endian processor. j[0] is the most significant byte, so the code
ends up shifting it left by 24 bits.

To create your float, assuming the same basic fp format, you can use the
above code to convert the input to a 32-bit unsigned int, then map it to
a float by storing into a union of float and uint32_t and retrieving the
float value:

float g;
union {
    float f;
    uint32_t ui;
} u;
u.ui = o; /* o holds the assembled 32-bit pattern */
g = u.f;

or by casting a uint32_t pointer to a float pointer and dereferencing it:

g = *(float *) &o;

These methods invoke behavior that is unspecified and undefined by
Standard C, but can work in your application. You should, of course,
verify correct operation. If I were writing such code I would surround
the code with conditional preprocessor directives which ensure that the
compiler and target processor match the ones which had been verified.
Other combinations would result in an #error directive.
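
A third option, not proposed in the post above, is to copy the bit pattern with memcpy, which avoids both the union and the pointer cast. The sketch below also shows one possible shape for the compile-time guard described above. The function name be_bytes_to_float is hypothetical, _Static_assert needs a C11 compiler, and the code still assumes the file and the host use the same 32-bit IEEE 754 format.

```c
#include <limits.h>
#include <stdint.h>
#include <string.h>

/* Refuse to build on platforms this sketch was not written for. */
#if CHAR_BIT != 8
#error "this sketch assumes 8-bit bytes"
#endif

/* sizeof cannot be tested in #if, so use a C11 static assertion instead. */
_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits wide");

/* Hypothetical helper: convert four big-endian bytes to a float. */
float be_bytes_to_float(const unsigned char b[4])
{
    /* Assemble the 32-bit pattern, most significant byte first. */
    uint32_t bits = ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16)
                  | ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
    float f;

    /* Copy the pattern into the float object byte for byte. */
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

As a spot check, the bytes 3F 80 00 00 should come back as 1.0f on an IEEE 754 host.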
 
Nick Keighley

Sheldon said:
Hi,

I am trying to make sense of this endian problem and so far, it is
still Greek to me.
I have some files that store lat and lon data in binary
format. The data was originally 32-bit floats, written with the
most significant byte at the lowest address (big endian).

This is generally a bad idea. If you can avoid it, don't store floating
point numbers in a file in binary. Use a textual representation.
Consider XDR or ASN.1.

If it has to be binary and you're writing on one platform and reading
on another, you had better hope that they are both IEEE 754 (very likely)
so you only have endianness to worry about. Write out some known values
so you can work out the ordering. At the receiving end, read the values
into an array of unsigned char, reorder the octets (bytes), then build
the floating point value. A cast may suffice.
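
The "known values" check could look something like the sketch below, which inspects how the host itself lays out 1.0f in memory. In IEEE 754 single precision, 1.0f has the bit pattern 0x3F800000. The function name float_byte_order and its return codes are hypothetical.

```c
#include <string.h>

/* Hypothetical helper: report how the host stores the float 1.0f.
   Returns 1 for big-endian IEEE 754 (3F 80 00 00),
           0 for little-endian IEEE 754 (00 00 80 3F),
          -1 for anything else. */
int float_byte_order(void)
{
    const float one = 1.0f;
    unsigned char b[sizeof one];

    if (sizeof one != 4)
        return -1;

    /* Look at the float's bytes in memory order. */
    memcpy(b, &one, sizeof one);

    if (b[0] == 0x3F && b[1] == 0x80 && b[2] == 0x00 && b[3] == 0x00)
        return 1;
    if (b[0] == 0x00 && b[1] == 0x00 && b[2] == 0x80 && b[3] == 0x3F)
        return 0;
    return -1;
}
```

Running the same check on the first four bytes of a file known to contain 1.0 would show the file's byte order in the same way.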

<snip>
 
