bytes to unsigned long

moumita

Hi All,
I need to convert 4 bytes to an unsigned long.
Suppose I have an array like unsigned char buff[4]. I need to convert
these 4 bytes into a single unsigned long. Is the following piece of
code right? Or is it the right approach to do that?

unsigned long temp;
temp = (unsigned long) buff[3];
temp |= ((unsigned long) buff[2]) << 8;
temp |= ((unsigned long) buff[1]) << 16;
temp |= ((unsigned long) buff[0]) << 24;

Waiting for your suggestions.
 
Jim Langston

There are a few ways to do it. One way I've done it in the past is to
simply treat an unsigned long as a char array and load the bytes in.
Endianness may be an issue.

unsigned long temp;
for ( int i = 0; i < sizeof( unsigned long ); ++i )
    (reinterpret_cast<char*>(&temp))[i] = buff[i];

The advantage of this is that it works on any size of unsigned long, just
gotta make sure the buffer is long enough. How the buffer was loaded with
the unsigned long also may matter (big vs. little endian).

I've seen your method used, however.
 
James Kanze

Maybe. It's the right approach, anyway. The question is where
the four bytes come from. If they're from an Internet protocol,
it's correct.

You might prefer using uint32_t instead of unsigned long. It's
not present in the current version of the C++ standard, but it
will be part of the next version, and it is already standard C,
so it should be supported by most compilers (provided you
include <stdint.h>, of course). On many modern machines,
unsigned long is 64 bits. (Not that it really matters here.)
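
A minimal sketch of the same shift-and-or approach written against a fixed-width type, assuming a compiler that provides <cstdint> (C++11 or later). The helper name read_be32 is hypothetical and does not appear in the thread.

```cpp
#include <cstdint>

// Hypothetical helper: combine four octets, most significant first
// (the order used by the original poster's code), into one value.
inline std::uint32_t read_be32(const unsigned char* p)
{
    return (static_cast<std::uint32_t>(p[0]) << 24)
         | (static_cast<std::uint32_t>(p[1]) << 16)
         | (static_cast<std::uint32_t>(p[2]) << 8)
         |  static_cast<std::uint32_t>(p[3]);
}
```

Because the value is built with arithmetic rather than by copying memory, the result should not depend on how the host stores integers.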
 
Gianni Mariani

You may need to worry about endianness...

I posted one of these things a while back ... oh here it is.
http://groups.google.com/group/comp.programming/msg/061db1be797a255f

I attached an example of how you can do it. It's kind of the whole hog,
it allows you to simply re-interpret cast and read the value in the
correct byte order.




template <class base_type, bool wire_is_big_endian = true >
class NetworkOrder
{
public:

    base_type m_uav;

    static inline bool EndianCheck()
    {
        unsigned x = 1;
        return wire_is_big_endian == ! ( * ( char * )( & x ) );
    }

    static inline void OrderRead(
        const base_type & i_val,
        base_type & i_destination
    )
    {
        unsigned char * src = ( unsigned char * ) & i_val;
        unsigned char * dst = ( unsigned char * ) & i_destination;

        if (
            ( sizeof( base_type ) == 1 )
            || EndianCheck()
        ) {

            //
            // Alignment is an issue on some architectures so
            // even for non-swapping we read a byte at a time

            if ( sizeof( base_type ) == 1 ) {
                dst[0] = src[0];
            } else if ( sizeof( base_type ) == 2 ) {
                dst[0] = src[0];
                dst[1] = src[1];
            } else if ( sizeof( base_type ) == 4 ) {
                dst[0] = src[0];
                dst[1] = src[1];
                dst[2] = src[2];
                dst[3] = src[3];
            } else {

                for (
                    int i = sizeof( base_type );
                    i > 0;
                    i --
                ) {
                    * ( dst ++ ) = * ( src ++ );
                }
            }

        } else {

            if ( sizeof( base_type ) == 2 ) {
                dst[1] = src[0];
                dst[0] = src[1];
            } else if ( sizeof( base_type ) == 4 ) {
                dst[3] = src[0];
                dst[2] = src[1];
                dst[1] = src[2];
                dst[0] = src[3];
            } else {
                dst += sizeof( base_type ) -1;
                for ( int i = sizeof( base_type ); i > 0; i -- ) {
                    * ( dst -- ) = * ( src ++ );
                }
            }
        }
    }

    static inline void OrderWrite(
        const base_type & i_val,
        base_type & i_destination
    )
    {
        // for the time being this is the same as OrderRead
        OrderRead( i_val, i_destination );
    }

    inline operator base_type () const
    {
        base_type l_value;
        OrderRead( m_uav, l_value );
        return l_value;
    }

    inline base_type operator=( base_type in_val )
    {
        OrderWrite( in_val, m_uav );
        return in_val;
    }

};


#if 1
#include <iostream>

struct wire_data_little_endian
{
    NetworkOrder<unsigned long, false> a;
};

struct wire_data_big_endian
{
    NetworkOrder<unsigned long, true> a;
};

int main()
{
    {
        char buff[5] = { 1, 2, 3, 4, 0 };

        wire_data_little_endian & data = * reinterpret_cast<wire_data_little_endian *>( buff );

        unsigned long x = data.a;

        std::cout << "little value " << std::hex << x << "\n";

        data.a = 0x41424344UL;

        std::cout << "little buff " << buff << "\n";
    }

    {
        char buff[5] = { 1, 2, 3, 4, 0 };

        wire_data_big_endian & data = * reinterpret_cast<wire_data_big_endian *>( buff );

        unsigned long x = data.a;

        std::cout << "big endian value " << std::hex << x << "\n";

        data.a = 0x41424344UL;

        std::cout << "big endian buff " << buff << "\n";
    }
}

#endif
 
Rennie deGraaf

That's one way to do it, assuming that you've figured out your
endianness and that unsigned long is at least 32 bits on your system.
An alternate method is to use a union, as in something like this:

union ulong_u
{
    unsigned long ul;
    unsigned char uc[4];
};

//...

ulong_u u;
std::memcpy(&u.uc, &buf, 4);
unsigned long temp = u.ul;

Of course, you may have to shuffle the bytes that you assign to u.uc to
handle endianness correctly.

Rennie deGraaf


 
moumita

Thank you all for the replies.
 
Old Wolf

Jim Langston said:
There are a few ways to do it. One way I've done it in the past is to
simply treat an unsigned long as a char array and load the bytes in. Endian
may be an issue.

unsigned long temp;
for ( int i = 0; i < sizeof( unsigned long ); ++i )
    (reinterpret_cast<char*>(&temp))[i] = buff[i];

This way, and Rennie deGraaf's way, are non-portable. You might
cause a program crash by creating a bit pattern that is not valid for
an unsigned long, and also you don't have any control over what
integer you get out of the bytes you put in.

The only reliable method is the one used in the OP code.
AFAIC any time you have to say "endian might be an issue",
there's something wrong with your algorithm.
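
To make the contrast concrete, here is a small sketch that places the two techniques side by side. It assumes std::uint32_t from <cstdint> instead of unsigned long, so that exactly four bytes are involved; both function names are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Value defined by arithmetic: the first byte is always the most significant.
inline std::uint32_t from_shifts(const unsigned char* p)
{
    return (static_cast<std::uint32_t>(p[0]) << 24)
         | (static_cast<std::uint32_t>(p[1]) << 16)
         | (static_cast<std::uint32_t>(p[2]) << 8)
         |  static_cast<std::uint32_t>(p[3]);
}

// Value defined by the host's object representation: the same four bytes
// can yield different integers on different machines.
inline std::uint32_t from_memcpy(const unsigned char* p)
{
    std::uint32_t v;
    std::memcpy(&v, p, sizeof v);
    return v;
}
```

For the bytes 01 02 03 04, from_shifts yields 0x01020304 on any conforming implementation, while from_memcpy yields whatever the host byte order dictates.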
 
James Kanze

Jim Langston said:
There are a few ways to do it. One way I've done it in the past is to
simply treat an unsigned long as a char array and load the bytes in. Endian
may be an issue.

As may be any number of other issues.

Jim Langston said:
unsigned long temp;
for ( int i = 0; i < sizeof( unsigned long ); ++i )
    (reinterpret_cast<char*>(&temp))[i] = buff[i];
The advantage of this is that it works on any size of unsigned long, just
gotta make sure the buffer is long enough.

The disadvantage of this is that it supposes that the external
representation corresponds exactly to the internal one. Your
"advantage" is actually a serious disadvantage. If the external
format is four bytes, you want to convert exactly four bytes, no
more no less. You don't want to suddenly start reading eight
bytes just because you upgraded your machine, when only four
bytes were read.
 
James Kanze

Gianni Mariani said:
You may need to worry about endianness...

His code handles endianness transparently. That's why he wrote
it like that.

Gianni Mariani said:
I attached an example of how you can do it. It's kind of the whole hog,
it allows you to simply re-interpret cast and read the value in the
correct byte order.
[xx_endian.cpp]
template <class base_type, bool wire_is_big_endian = true >

Question: we're talking about a four byte entity here. There
are 24 different byte orders possible. I've actually seen at
least three. How do you represent this with a bool?

His original code was much cleaner, easier to understand, and
far more portable.
 
James Kanze

Rennie deGraaf said:
That's one way to do it, assuming that you've figured out your
endianness and that unsigned long is at least 32 bits on your system.

The whole point of his code is that the endianness of the
internal representation doesn't matter. And of course, unsigned
long is required by the language to be at least 32 bits.

If the external representation is the standard Internet four
byte integer, his code is guaranteed to work as long as the
machine it is running on guarantees that any upper bits (above
the 8 low order bits) of a unsigned char are 0, for the data
source in question. (E.g. if he's running on a machine with 9
bit char, the hardware reading the data will still read it in 8
bit blocks, putting one per char, and setting the upper bit to
0, rather that e.g. parity or whatever.) It's 100% guaranteed
for any machine with 8 bit char, which covers a pretty large
percentage of current implementations. The one place he might
run into problems is on some DSP with 32 bit char, which could
read putting all four network bytes into a single char.
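
As a hedged illustration of the point about upper bits, the sketch below masks each element to its low 8 bits before shifting. On a machine with 8-bit char the mask changes nothing; on one with wider char it would discard anything above the octet, assuming the data source puts one octet in each char. The name read_be32_masked is hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper: like the original code, but each array element is
// reduced to its low 8 bits first, in case unsigned char is wider than 8 bits.
inline std::uint32_t read_be32_masked(const unsigned char* p)
{
    return ((static_cast<std::uint32_t>(p[0]) & 0xFFu) << 24)
         | ((static_cast<std::uint32_t>(p[1]) & 0xFFu) << 16)
         | ((static_cast<std::uint32_t>(p[2]) & 0xFFu) << 8)
         |  (static_cast<std::uint32_t>(p[3]) & 0xFFu);
}
```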

Rennie deGraaf said:
An alternate method is to use a union, as in something like this:
union ulong_u
{
    unsigned long ul;
    unsigned char uc[4];
};

And that doesn't work, because there's not the slightest
guarantee concerning the compatibility of the representations.

Rennie deGraaf said:
//...
ulong_u u;
std::memcpy(&u.uc, &buf, 4);
unsigned long temp = u.ul;

That generates the wrong results on all of the machines I use.

Rennie deGraaf said:
Of course, you may have to shuffle the bytes that you assign to u.uc to
handle endianness correctly.

Which still doesn't handle the fact that:

-- how you "shuffle" the bytes depends on the machine, the
compiler, the version of the compiler, and maybe even the
options used when compiling,

-- on most modern machines, unsigned long will be longer than
four bytes,

-- on at least one machine still being sold, unsigned char is 9
bits; if the upper bit is 0, then the value will not
correspond, and

-- on at least one machine in the past, unsigned long had
padding bits, which had to be 0. (Of course, on that
machine, an unsigned long was 6 bytes, so you would have had
problems because of the second point as well.)
 
Gianni Mariani

James Kanze said:
His code handles endianness transparently. That's why he wrote
it like that.

Are you sure it should not be 0,1,2,3 instead of 3,2,1,0? I.e., is
the wire order b/e or l/e? The only choice we need to make in the
NetworkOrder class is whether a true or a false is needed. The
NetworkOrder class may have many other issues (it's not really
copyable, but you never really should copy it; it's strictly UB, but
it works and will need to continue to work (due to ABI issues) for a
very long time).

James Kanze said (about template <class base_type, bool wire_is_big_endian = true >):
Question: we're talking about a four byte entity here. There
are 24 different byte orders possible. I've actually seen at
least three. How do you represent this with a bool?

I have only seen 2 endiannesses that *I* have ever needed to support.
If someone cares about different orders, they're welcome to extend the
class.

James Kanze said:
His original code was much cleaner, easier to understand, and
far more portable.

You know better than to say that to me.

The "Mariani Minimum Complexity Proposition" suggests that any
complexity you can place in a library is better than placed in all
other locations in the code. Why and/or when is "std::string" better
than "char *" ?

i.e.

unsigned long val = wire_buffer.val;

and

wire_buffer.val = val;

is a whole lot easier to write and maintain than:

unsigned long temp;
temp = (unsigned long) buff[3];
temp |= ((unsigned long) buff[2]) << 8;
temp |= ((unsigned long) buff[1]) << 16;
temp |= ((unsigned long) buff[0]) << 24;

... plus the other 6 lines of code for writing it.

Oh, and if you ever need to support one of those other 22 endian
types, all the code is in one place to fix that.
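
The code for the write direction is not shown in the thread. A minimal sketch of what it might look like, assuming the same most-significant-byte-first order and a hypothetical helper named write_be32:

```cpp
#include <cstdint>

// Hypothetical helper: store a 32-bit value as four octets,
// most significant first, independent of the host byte order.
inline void write_be32(std::uint32_t v, unsigned char* p)
{
    p[0] = static_cast<unsigned char>((v >> 24) & 0xFFu);
    p[1] = static_cast<unsigned char>((v >> 16) & 0xFFu);
    p[2] = static_cast<unsigned char>((v >> 8) & 0xFFu);
    p[3] = static_cast<unsigned char>(v & 0xFFu);
}
```

Writing 0x41424344 this way should leave the bytes 41 42 43 44 in the buffer, in that order.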
 
James Kanze

Gianni Mariani said:
Are you sure it should not be 0,1,2,3 instead of 3,2,1,0? I.e., is
the wire order b/e or l/e?

It depends on the protocol. Presumably, his code is specific to
the protocol. His code implements big endian, which is correct
for all of the Internet protocols, for fixed width integers in
BER, and for most other protocols. (FWIW: I don't know of a
little-endian protocol.)
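
Should some format specify the least significant byte first instead, the shift-based approach needs only the indices swapped. A sketch under that assumption, with the hypothetical name read_le32:

```cpp
#include <cstdint>

// Hypothetical helper: combine four octets, least significant first.
// Only the index-to-shift pairing differs from the big-endian version.
inline std::uint32_t read_le32(const unsigned char* p)
{
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}
```

Either way, the choice of order is fixed by the protocol in the source code, not by the machine the code runs on.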

Gianni Mariani said:
The only choice we need to make in the
NetworkOrder class is whether a true or a false is needed. The
NetworkOrder class may have many other issues (it's not really
copyable, but you never really should copy it; it's strictly UB, but
it works and will need to continue to work (due to ABI issues) for a
very long time).

Your code assumed two possible orders, both for the line and for
the internal representation. In practice, there is only one for
the line, except perhaps for some special in house protocols.
On the other hand, I've actually seen 3 different internal
orders (not just 2). His code is transparent to the internal
ordering.

Gianni Mariani said:
I have only seen 2 endiannesses that *I* have ever needed to support.
If someone cares about different orders, they're welcome to extend the
class.

Fine. I've actually seen and used three different internal
orderings. All on very widely used machines---nothing exotic.
(But you've probably never heard of MS-DOS, or PDP-11's. All
the world is Windows.)

Gianni Mariani said:
You know better than to say that to me.

Why? Because you know it all, and won't listen, even to people
who have considerably more experience than you. (The code you
posted is what I would consider amateurish, and would certainly
fail code review anywhere I've worked.)

Gianni Mariani said:
The "Mariani Minimum Complexity Proposition" suggests that any
complexity you can place in a library is better than placed in all
other locations in the code. Why and/or when is "std::string" better
than "char *" ?

So what does that have to do with anything here? You've got a
block of extremely hard to read, hard to modify, overly complex
code which doesn't handle as many real cases as the original.

Gianni Mariani said:
unsigned long val = wire_buffer.val;

wire_buffer.val = val;

is a whole lot easier to write and maintain than:

unsigned long temp;
temp = (unsigned long) buff[3];
temp |= ((unsigned long) buff[2]) << 8;
temp |= ((unsigned long) buff[1]) << 16;
temp |= ((unsigned long) buff[0]) << 24;

Obviously, this is in a library somewhere. The use is (almost)
exactly the same. (Actually, my own code for this is in an
ixdrstream/oxdrstream class, using the iostream idiom. So you
write:

source >> val1 >> val2 ...

where source is an ixdrstream, using a streambuf connected to
the socket.)

We're talking here about the code you put into the library, not
about the interface of the library.

Gianni Mariani said:
... the other 6 lines of code for writing it.
Oh, and if you ever need to support one of those other 22 endian
types, all the code is in one place to fix that.

The trick is, of course, that his code handles the internal
representation transparently, regardless of what it is. Neither
yours nor his (nor mine) handle "exotic" representations,
however. Some of which (e.g. variable length ints in BER) are
fairly widespread.
 
