write binary representation to output

wongjoekmeu

Dear all,
I was wondering what the code would look like in C++ if I want to write
the binary notation of an unsigned char, or of any other basic data
type, like int, unsigned int, float and so forth, to standard output.
Thanks in advance.
RR
 
Juha Nieminen

I was wondering what the code would look like in C++ if I want to write
the binary notation of an unsigned char, or of any other basic data
type, like int, unsigned int, float and so forth, to standard output.

#include <iostream>
#include <climits>

template<typename Type>
void printBinaryRepresentation(const Type& value)
{
    // Resolve if this is a big-endian or a little-endian system:
    int dummy = 1;
    bool littleEndian = (*reinterpret_cast<char*>(&dummy) == 1);

    // The trick is to create a char pointer to the value:
    const unsigned char* bytePtr =
        reinterpret_cast<const unsigned char*>(&value);

    // Loop over the bytes in the value:
    for(unsigned i = 0; i < sizeof(Type); ++i)
    {
        unsigned char byte;
        if(littleEndian) // we have to traverse the value backwards:
            byte = bytePtr[sizeof(Type) - i - 1];
        else // we have to traverse it forwards:
            byte = bytePtr[i];

        // Print the bits in the byte:
        for(int bitIndex = CHAR_BIT-1; bitIndex >= 0; --bitIndex)
        {
            std::cout << ((byte >> bitIndex) & 1);
        }
    }

    std::cout << std::endl;
}
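
Usage would be along these lines (a minimal sketch; the exact output
shown for the multi-byte case assumes a 32-bit unsigned int):

int main()
{
    unsigned char c = 0x1C;
    printBinaryRepresentation(c);   // prints 00011100

    unsigned int u = 42;
    printBinaryRepresentation(u);   // prints 00000000000000000000000000101010

    float f = 9.625f;
    printBinaryRepresentation(f);   // prints the raw bits of the float's
                                    // object representation
    return 0;
}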
 
James Kanze

* (e-mail address removed):

#include <iostream>   // std::cout, std::ostream
#include <ostream>    // operator<<, std::endl
#include <bitset>     // std::bitset
#include <climits>    // CHAR_BIT

static unsigned const bitsPerByte = CHAR_BIT;

template< typename T >
struct BitSize
{
    enum { value = bitsPerByte*sizeof( T ) };
};

Just curious, but do you really need this?
template< typename T >
std::bitset< BitSize<T>::value > bitsetFrom( T const& v )
{
    typedef std::bitset< BitSize<T>::value > BitSet;
    BitSet result;
    unsigned char const* p = reinterpret_cast<unsigned char const*>( & v );

And why this?
// Uses little-endian convention for bit numbering.

Which I think the standard requires. The problem is that you're
also assuming little-endian for the byte order, which is the
exception, not the rule (in terms of number of architectures,
not number of machines).
    for( size_t i = sizeof(T)-1; i != size_t(-1); --i )
    {
        result <<= bitsPerByte;
        result |= BitSet( p[i] );
    }
    return result;
}


Maybe I'm misunderstanding something, but when someone says
something like "binary notation to standard out", I imagine
something like "00011100" (for 0x1C). I'm not really sure what
he's looking for when he mentions float, but for the unsigned
integral types, something like the following should do as a
first approximation:

#include <algorithm>
#include <ostream>
#include <string>

template< typename T >
class Binary
{
public:
    explicit Binary( T value )
        : myValue( value )
    {
    }

    friend std::ostream& operator<<(
        std::ostream& dest,
        Binary< T > const& value )
    {
        T tmp = value.myValue ;
        std::string s ;
        do {
            s += char( '0' + (tmp & 1) ) ;
            tmp >>= 1 ;
        } while ( tmp != 0 ) ;
        std::reverse( s.begin(), s.end() ) ;
        dest << s ;
        return dest ;
    }

private:
    T myValue ;
} ;

template< typename T >
inline Binary< T >
binary( T value )
{
    return Binary< T >( value ) ;
}

Use:

unsigned i = 42 ;
std::cout << binary( i ) << std::endl ;

Most of the formatting flags (e.g. width) are handled correctly,
by the << operator for string.

Handling signed values is a bit more tricky, because you need
the unsigned equivalent for the tmp, or else some special code
to handle the fact that on most machines, there is one negative
value which doesn't have a positive counterpart. (If you're
lucky enough to be working on a 1's complement machine or a
signed magnitude machine, there's no problem.)
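
For what it's worth, a rough sketch of the unsigned-equivalent
approach; the MakeUnsigned trait and the function name are my own
invention (newer type traits libraries provide the same thing as
make_unsigned):

#include <algorithm>
#include <string>

// Hand-written unsigned-equivalent trait; extend with further
// specializations (long long, etc.) as needed.
template< typename T > struct MakeUnsigned ;
template<> struct MakeUnsigned< signed char > { typedef unsigned char type ; } ;
template<> struct MakeUnsigned< short > { typedef unsigned short type ; } ;
template<> struct MakeUnsigned< int > { typedef unsigned int type ; } ;
template<> struct MakeUnsigned< long > { typedef unsigned long type ; } ;

template< typename T >
std::string signedBinary( T value )
{
    typedef typename MakeUnsigned< T >::type U ;
    U tmp = static_cast< U >( value ) ;  // conversion is modulo 2^N:
                                         // -1 becomes all ones
    std::string s ;
    do {
        s += char( '0' + (tmp & 1) ) ;
        tmp >>= 1 ;                      // unsigned shift: no sign propagation
    } while ( tmp != 0 ) ;
    std::reverse( s.begin(), s.end() ) ;
    return s ;
}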
 
James Kanze

#include <iostream>
#include <climits>

template<typename Type>
void printBinaryRepresentation(const Type& value)
{
    // Resolve if this is a big-endian or a little-endian system:
    int dummy = 1;
    bool littleEndian = (*reinterpret_cast<char*>(&dummy) == 1);

And what about middle endian? If a type has more than two
bytes, then more than two orders are possible, and in fact, I've
seen at least three for long (on very common machines, no less).

For the rest, I don't think that this is what he was looking
for, and even if it was, I don't see any reason to access byte
by byte.
 
James Kanze

* James Kanze:
It's a templated compile time constant.

I know that.
std::bitset requires a compile time constant as template parameter.

And you use it as a return value, so you have to declare the
type outside of the function. That's what I'd overlooked.
Silly design, and I'd use something else if the standard library
offered anything more reasonable (constrained to only valid
data) than std::string.
Mostly in order to deal with 'float', as the OP requested.

But he didn't say what he wanted as output for a float.

I'm not sure yet what he really wants. Since he's outputting to
standard out, I can assume text (since you can't output binary
to standard out). But beyond that, I'm (or rather we're) just
guessing.
Otherwise std::bitset can be constructed directly from the value.
I'm not sure about that,

I think it's implicit in a "pure binary representation". Or at
least, that the implementation behave "as if" it were the case:
(integralType & 1) is guaranteed to expose the bit 2^0.
but it would be nice if the standard had requirements that mean
a direct construction of a bitset from e.g. an int produces the
same result as this function. My intention was to not violate
such requirements if they exist.

You know something: I've never used std::bitset. In the rare
cases where I've needed bitsets, I've simply continued using my
pre-standard class. So I don't really know too much about what
std::bitset requires or guarantees.

What I was really wondering about, however, is the
appropriateness (or the necessity) of passing through a bitset
of any kind. Why not just generate the characters '0' and '1'
directly?
Uhm, sorry, there is no such thing as little-endian with some
other byte order.
Little endian means that bit numbering increases in the same
direction as memory addresses, for any size of unit.

Little endian means that the sub-unit numbering increases in the
same direction as the sub-units appear physically. Thus, the
Internet uses little endian bit ordering in bytes, but big
endian byte ordering in higher order elements, like integers.
If you're talking about little endian bit ordering, you're
talking about the order of the bits in a byte.
    for( size_t i = sizeof(T)-1; i != size_t(-1); --i )
    {
        result <<= bitsPerByte;
        result |= BitSet( p[i] );
    }
    return result;
}

Maybe I'm misunderstanding something, but when someone says
something like "binary notation to standard out", I imagine
something like "00011100" (for 0x1C).


Well, that's pretty much the meaning of "notation". And you
can't output binary to standard out. So whatever else he's
asking for, it's a text representation.

The question is, of course, what he wants for float. I can
think of at least three interpretations, and I've not the
slightest idea which one he's looking for.
Well, see below: the above doesn't really require a class or anything.

The class is just a convenient means of getting the format you
want in the ostream. You could just as easily make it a single
function which returned a string.
But I'm not so concerned about that as I am about the fact that
both your and Juha's solutions mix data representation and I/O.
I'd prefer a member function that returns a pure data
representation of the binary (I used a bitset, but a string,
although not ideal in the sense of constraints on the value,
would be acceptable).

OK. I can understand that point of view. I presume then that
std::bitset (like my pre-standard BitVector) has a << operator
for the actual output.

Maybe I'm reading too much into the word "notation", but in my
mind, it means a textual representation; his problem is output
formatting. In which case, introducing an intermediate type
(other than as a decorator for formatting) is unnecessary added
complexity. If, on the other hand, he needs a
representation which he can then further manipulate, std::bitset
is the "official" answer.
[Usage]:
unsigned i = 42 ;
std::cout << binary( i ) << std::endl ;
The code I posted earlier mainly tackles float and double in
addition to integrals.
For the example above you don't need such code, because you can just do
unsigned const i = 42;
std::cout << std::bitset<CHAR_BIT*sizeof(i)>( i ) << std::endl;
Note the reduction in number of lines, to just 2 (no support class). :)

But will it also handle user defined integral types?

On the other hand, you're right. Unless there is an absolute
need to support such types, this is a much better solution than
mine.
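
For reference, the complete program, with the necessary includes
added, would be something like:

#include <bitset>
#include <climits>
#include <iostream>

int main()
{
    unsigned const i = 42;
    std::cout << std::bitset<CHAR_BIT*sizeof(i)>( i ) << std::endl;
    // prints 00000000000000000000000000101010 for a 32 bit unsigned
}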
Well, I'm not sure that that's really a problem with your code.
I think your code would work just fine (for integral types).

It will go into an endless loop for negative values if >>
propagates the sign.

I'll repeat something I wrote in another thread a few moments
ago: I stick to unsigned types when manipulating bits.

The real question remains, however: why does he want this? What
is he trying to do? Does he want float to output something like
"1.0011001B05"? (Somehow I doubt it, but taken literally,
that's really what he asked for.) Or does he want a binary dump
of the underlying memory, which is what your code does (but
then, I would generally prefer it broken up into bytes, with a
space between each byte)?
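
Something along these lines, say; a rough sketch, with an
arbitrary name, which dumps the bytes in memory order:

#include <climits>
#include <cstddef>
#include <iostream>

template< typename T >
void dumpBits( T const& value )
{
    unsigned char const* p = reinterpret_cast< unsigned char const* >( &value ) ;
    for ( std::size_t i = 0 ; i != sizeof( T ) ; ++ i ) {
        if ( i != 0 ) {
            std::cout << ' ' ;          // a space between each byte
        }
        for ( int bit = CHAR_BIT - 1 ; bit >= 0 ; -- bit ) {
            std::cout << ((p[ i ] >> bit) & 1) ;
        }
    }
    std::cout << std::endl ;
}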
 
Juha Nieminen

James said:
And what about middle endian?

Most probably we can ignore obscure architectures.
For the rest, I don't think that this is what he was looking
for, and even if it was, I don't see any reason to access byte
by byte.

How else? The size of the type will be n bytes, so accessing byte by
byte is the most logical choice. How else would you do it?
 
Charles Coldwell

Dear all,
I was wondering what the code would look like in C++ if I want to write
the binary notation of an unsigned char, or of any other basic data
type, like int, unsigned int, float and so forth, to standard output.
Thanks in advance.
RR

How about this:

#include <iostream>
#include <iomanip>

template<typename T> struct binary_traits
{
    typedef T int_t;
};

template<> struct binary_traits<float>
{
    typedef int int_t;
};

template<> struct binary_traits<double>
{
    typedef long long int_t;
};

template<typename T>
void output(T x)
{
    std::cout << std::setw(sizeof(x)*2) << std::setfill('0') << std::hex
              << *reinterpret_cast<typename binary_traits<T>::int_t *>(&x)
              << std::endl;
}

int main(int argc, char *argv[])
{
    output(10);
    output(10L);
    output(10LL);
    output(0.1f);
    output(0.1);

    return 0;
}
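
Note that this assumes sizeof(float) == sizeof(int) and
sizeof(double) == sizeof(long long), which holds on the usual
platforms but isn't guaranteed, and the cast technically breaks
the aliasing rules. A variant using std::memcpy, reusing the
binary_traits above (the name output2 is arbitrary), avoids the
cast:

#include <cstring>

template<typename T>
void output2(T x)
{
    typename binary_traits<T>::int_t bits;
    std::memcpy(&bits, &x, sizeof bits);   // copy the object representation
    std::cout << std::setw(sizeof(x)*2) << std::setfill('0') << std::hex
              << bits << std::endl;
}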
 
James Kanze

Most probably we can ignore obscure architectures.

That was a Microsoft compiler, on an Intel architecture. I
don't know that the word "obscure" is appropriate in such cases.
How else? The size of the type will be n bytes, so accessing
byte by byte is the most logical choice. How else would you do
it?

As itself?
 
James Kanze

How do you access individual bits of a double? (Or a struct
containing several member variables, for example.)

And we're back to the question I raised with Alf: what does it
mean to access the individual bits of a double? And does the
original poster want to access the original bits? Or does he
want a "binary representation", e.g. something like
"1.001101B011" (for 9.625)? Or something else entirely? Taken
literally, the "binary representation" of 9.625 would be either
"1001.101" or "1.001101B011". Or---guessing a lot here---maybe
he really wants a "binary dump" of the underlying bits. In that
case, my "binary dumps" usually display in hexadecimal, and of
course, the actual byte order in memory is one of the things I'm
probably trying to determine. (I use binary dumps of different
values to try to work out the layout, when it isn't documented.)
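
For completeness, a rough sketch of the first of those
interpretations, producing the "1.001101B011" style with
std::frexp; the formatting details are guesses, and it assumes a
positive, non-zero value with a small non-negative exponent:

#include <cmath>
#include <iostream>
#include <string>

// Prints e.g. 9.625 as "1.001101B011": normalized binary mantissa,
// then 'B' and the binary exponent. Assumes value > 0.
void printBinaryScientific( double value )
{
    int exp2 ;
    double mant = std::frexp( value, &exp2 ) ;  // value == mant * 2^exp2,
                                                // mant in [0.5, 1)
    mant *= 2.0 ;                               // normalize to [1, 2)
    -- exp2 ;
    std::string s = "1." ;
    mant -= 1.0 ;
    for ( int i = 0 ; i < 6 && mant != 0.0 ; ++ i ) {  // at most six
        mant *= 2.0 ;                                  // mantissa digits
        if ( mant >= 1.0 ) {
            s += '1' ;
            mant -= 1.0 ;
        } else {
            s += '0' ;
        }
    }
    s += 'B' ;
    for ( int bit = 2 ; bit >= 0 ; -- bit ) {   // three exponent digits,
        s += char( '0' + ((exp2 >> bit) & 1) ) ;  // as in the example
    }
    std::cout << s << std::endl ;
}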
 
