SG
Hi!
I'm wondering what the preferred portable way of handling binary data
is. For example, I want to read a binary file that contains 32-bit and
16-bit integers in little-endian format. Now, I'm aware that a char
might have more than 8 bits, but I don't care about this case for now.
So, I enclose my routines for converting a char* to some int in
preprocessor directives:
#include <climits>
#if CHAR_BIT == 8
// conversion code here
#endif
As far as I know, the C++ standard doesn't specify whether a char is
signed or unsigned, and converting a value to a signed type that can't
represent it gives an implementation-defined result. Also, signed
integers don't need to be stored in two's complement. Unfortunately,
this seems to make it impossible to portably decode a 16-bit signed
number stored in two's complement and little-endian byte order. I came
up with the following piece of code, which still invokes
implementation-defined behaviour:
#include <stdint.h>

// decode signed 16-bit int (two's complement & little-endian)
inline int_fast16_t get_s16le(const char* p)
{
    // we already know that CHAR_BIT == 8, but "char" might be signed
    // as well as unsigned
    unsigned char low = p[0]; // conversion to unsigned is modulo 256, even for p[0] < 0
    signed char hi = p[1];    // implementation-defined for p[1] >= 128
    return int_fast16_t(low) + int_fast16_t(hi) * 256;
}
Also, this is horribly slow. I'd much rather be able to query certain
implementation properties so I can use much faster code.
My latest incarnation looks like this:
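#include <cassert>
#include <cstring>
#include <stdint.h>

inline uint_fast16_t swap_bytes_16bit(uint_fast16_t x) {
    return ((x & 0xFF00u) >> 8) | ((x & 0x00FFu) << 8);
}

inline uint_fast16_t get_u16le(const char* p) {
    // Copy into an exactly 16-bit object: a wider uint_fast16_t would
    // keep indeterminate bytes after a 2-byte memcpy.
    uint16_t x;
    std::memcpy(&x, p, sizeof x);
    // BYTE_ORDER and LITTLE_ENDIAN are not standard macros; they have to
    // come from a platform header, otherwise both evaluate to 0 here.
#if BYTE_ORDER == LITTLE_ENDIAN
    return x;
#else
    return swap_bytes_16bit(x);
#endif
}

inline int_least16_t get_s16le(const char* p) {
    assert(signed(~0u) == -1); // this is not guaranteed by the standard
    return get_u16le(p);
}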
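For comparison, here is a sketch of a shift-based variant that doesn't need to know the host byte order at all. I'm not claiming it is the preferred idiom. The names get_u16le_shift and get_s16le_shift are made up for this post, and it assumes CHAR_BIT == 8 and that int_fast16_t can hold -32768. The bytes are read through unsigned char, and the sign is recovered by arithmetic rather than by a signed conversion:

```cpp
#include <climits>
#include <stdint.h>

#if CHAR_BIT != 8
#error "this sketch assumes 8-bit bytes"
#endif

// Hypothetical helper names, not taken from any library.
inline uint_fast16_t get_u16le_shift(const char* p) {
    // Reading through unsigned char* yields the raw byte values 0..255,
    // regardless of whether plain char is signed.
    const unsigned char* b = reinterpret_cast<const unsigned char*>(p);
    const uint_fast16_t lo = b[0];
    const uint_fast16_t hi = b[1];
    return static_cast<uint_fast16_t>(lo | (hi << 8));
}

inline int_fast16_t get_s16le_shift(const char* p) {
    // 0..65535 always fits in long (at least 32 bits).
    const long u = static_cast<long>(get_u16le_shift(p));
    // Map 0x8000..0xFFFF to -32768..-1 by subtraction, so no
    // out-of-range unsigned-to-signed conversion is involved.
    return static_cast<int_fast16_t>(u >= 0x8000L ? u - 0x10000L : u);
}
```

I haven't measured whether this is any faster or slower than the memcpy version.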
What's the preferred way to do this in a reasonably portable way?
Cheers!
SG