analysis of floating point values

K4 Monk

hi, is there a way by which we can see a bit-by bit representation of
how a floating point value is stored? I was thinking something similar
to how we shift bits using the >> and << operators but they only work
for integers. In short, I'd like to first find out what the size of
the value is in bytes and then analyse every bit in every byte. Is
this possible?

thanks!
 
Jens Thoms Toerring

K4 Monk said:
hi, is there a way by which we can see a bit-by bit representation of
how a floating point value is stored? I was thinking something similar
to how we shift bits using the >> and << operators but they only work
for integers. In short, I'd like to first find out what the size of
the value is in bytes and then analyse every bit in every byte. Is
this possible?

Yes, it's possible. One way is

#include <cstdio>
#include <cstring>

int main( )
{
    double d = 42.1419;
    // Buffer with exactly as many bytes as a double occupies
    unsigned char * c = new unsigned char [ sizeof d ];
    memcpy( c, &d, sizeof d );

    // Print each byte as a two-digit hex value
    for ( size_t i = 0; i < sizeof d; ++i )
        printf( "%02x ", c[ i ] );
    printf( "\n" );
    delete [ ] c;
}

(I use printf() here since I'm too lazy to look up the
correct way to get properly formatted hex output from
std::cout).

The 'sizeof d' bit (which could also be written as
'sizeof(double)') tells you how many bytes there are in
a double. To keep things simple, a buffer of unsigned
chars of exactly this size is allocated and the bytes
of the double are copied directly into it. Then
you can print out the hexadecimal value of each of
those bytes. Splitting up a hex value into its bits is
so simple it can be done without a program ;-) (You
could also do without the extra buffer using casts
and a bit of pointer fiddling.)

Of course, those hex values won't make too much sense
without any idea what they mean. And what they mean can
differ from machine to machine. Luckily, most machines
nowadays use the IEEE 754-2008 format, see e.g.

http://en.wikipedia.org/wiki/IEEE_754-2008

and the pages linked in there. Keep in mind that there
can be endianness issues (i.e. on some machines the
low order bytes come first in memory, on others the
high order bytes, and then there are further possible
variations) and that not all machines use this format.

Regards, Jens
 
gwowen

K4 said:
hi, is there a way by which we can see a bit-by bit representation of
how a floating point value is stored? I was thinking something similar
to how we shift bits using the >> and << operators but they only work
for integers. In short, I'd like to first find out what the size of
the value is in bytes and then analyse every bit in every byte. Is
this possible?

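#include <climits>   // CHAR_BIT
#include <cstddef>   // std::size_t
#include <iostream>

// Print every bit of the object at addr, byte by byte in memory order
template <typename T>
void print_representation(T* addr){
    unsigned char* chaddr = (unsigned char*)addr;
    for(std::size_t q=0;q<sizeof(T); ++q){
        // Within each byte, most significant bit first
        for (int bit=CHAR_BIT;bit > 0 ;--bit){
            std::cout << ((*(chaddr+q)&(1U<<(bit-1))) ? 1 : 0);
        }
    }
    std::cout<<std::endl;
}


int main()
{
    float x = -43.43f;
    double y = 43.43;
    print_representation(&x);
    print_representation(&y);
}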
 
K4 Monk

Thanks to all, Leigh, Jens, and gwowen...this is very useful...good
for practising C++ as well.

One quick note, I see you cast to (unsigned char*) but using sizeof
operator I see that "double" is 16 bytes (on my machine a core2duo
running linux x86_64) and unsigned char* is 8 bytes. But still the
code snippets you provided are able to print everything. Ah well, time
to go through the books again!
 
Paul

Jens Thoms Toerring said:
K4 Monk said:
hi, is there a way by which we can see a bit-by bit representation of
how a floating point value is stored? I was thinking something similar
to how we shift bits using the >> and << operators but they only work
for integers. In short, I'd like to first find out what the size of
the value is in bytes and then analyse every bit in every byte. Is
this possible?

Yes, it's possible. One way is

#include <cstdio>
#include <cstring>

int main( )
{
double d = 42.1419;
unsigned char * c = new unsigned char [ sizeof d ];
memcpy( c, &d, sizeof d );

for ( size_t i = 0; i < sizeof d; ++i )
printf( "%02x ", c[ i ] );
printf( "\n" );
delete [ ] c;
}

(I use printf() here since I'm too lazy to look up the
correct way to get properly formatted hex output from
std::cout).

It's simply:
std::cout<<std::hex<< your_integer;

HTH.
 
Jens Thoms Toerring

K4 Monk said:
Thanks to all, Leigh, Jens, and gwowen...this is very useful...good
for practising C++ as well.
One quick note, I see you cast to (unsigned char*) but using sizeof
operator I see that "double" is 16 bytes (on my machine a core2duo
running linux x86_64) and unsigned char* is 8 bytes.

In e.g.

unsigned char* bits = reinterpret_cast<unsigned char*>(&f);

there's a cast from 'double *' (note the '&' in front of 'f')
to 'unsigned char *', so the sizeof a double is irrelevant
here. The size of a double only becomes important when you
then iterate over the array of unsigned chars with

for (std::size_t i = 0; i != sizeof(f); ++i)

to stop when the end of the array (which is just the
double split up into single bytes) is reached.

Regards, Jens
 
K4 Monk

In e.g.

unsigned char* bits = reinterpret_cast<unsigned char*>(&f);

there's a cast from 'double *' (note the '&' in front of 'f')
to 'unsigned char *', so the sizeof a double is irrelevant
here.

On 18/02/2011 14:15, K4 Monk wrote:
The cast is of a *pointer to* the double rather than of the *value of*
the double; an unsigned char pointer is the same size as a double pointer.

/Leigh

Ok I understand now. Thanks. So it's basically because pointers are
always the same size regardless of whether they point to a double or a
char, and we then switch the type of the pointer so it acts as a ptr
to a char. (And the reason we do this is because double* will always
take chunks of memory in 16 bytes whereas for byte-by-byte analysis we
need a pointer which steps in byte-sized chunks, correct?) Wow I feel so much
wiser now. This is exciting, it means we can also analyze the layout
of other objects in memory...cool!
 
Paul

K4 Monk said:
Thanks to all, Leigh, Jens, and gwowen...this is very useful...good
for practising C++ as well.

One quick note, I see you cast to (unsigned char*) but using sizeof
operator I see that "double" is 16 bytes (on my machine a core2duo
running linux x86_64) and unsigned char* is 8 bytes. But still the
code snippets you provided are able to print everything. Ah well, time
to go through the books again!

The outer loop is limited by the sizeof the float or double and not the
sizeof char:

for(size_t q=0;q<sizeof(T); ++q)

The data structure, that is the double or float, is traversed as if it were
an array of chars.
*(chaddr+q), in the inner loop, indexes this data structure in char-sized
chunks.


HTH
 
Jens Thoms Toerring

Ok I understand now. Thanks. So it's basically because pointers are
always the same size regardless of whether they point to a double or a
char

Mostly correct. I'm not sure about the C++ standard, but the
C standard only guarantees that the size of a char or void
pointer is sufficient to allow a cast from pointers to other
object types to them (the back-cast is then also possible).
That means that there's a theoretical possibility that pointers
to objects of other types have smaller sizes. But I haven't seen
any such machine yet, and it most likely doesn't make much sense
to cast from a char pointer to some other pointer anyway (unless
it's a back-cast to the original type).

Be a bit careful with pointers to functions, they aren't
included in this (a function isn't an object). But even there
it also works on most machines.

, and we then switch the type of the pointer so it acts as a ptr
to a char. (And the reason we do this is because double* will always
take chunks of memory in 16 bytes whereas for byte-by-byte analysis we
need a pointer which steps in byte-sized chunks, correct?)

Yes, exactly.
Regards, Jens
 
Juha Nieminen

Leigh Johnston said:
int main()
{
double f = 42.42;
unsigned char* bits = reinterpret_cast<unsigned char*>(&f);
for (std::size_t i = 0; i != sizeof(f); ++i)
{
unsigned char byte = bits[i];
for (std::size_t j = CHAR_BIT; j != 0; --j)
std::cout << (byte >> (j-1) & 1 ? '1' : '0');
}
}


error: 'size_t' is not a member of 'std'
 
Juha Nieminen

None of the presented solutions take into account endianness. Usually when
you want to print the binary representation of something, you want the
most significant bit to be printed first and go down from there (in other
words, you just want the base-2 representation of the value in the same
way you would print a regular base-10 one).

Also, since it's trivial in C++ to make the function work with any type,
not just doubles, why not do that while we are at it?

//---------------------------------------------------------------
#include <iostream>
#include <climits>

template<typename Type>
void printBinaryRepresentation(Type value)
{
// Resolve if this is a big-endian or a little-endian system:
int dummy = 1;
bool littleEndian = (*reinterpret_cast<char*>(&dummy) == 1);

// The trick is to create a char pointer to the value:
const unsigned char* bytePtr =
reinterpret_cast<const unsigned char*>(&value);

// Loop over the bytes in the floating point value:
for(unsigned i = 0; i < sizeof(Type); ++i)
{
unsigned char byte;
if(littleEndian) // we have to traverse the value backwards:
byte = bytePtr[sizeof(Type) - i - 1];
else // we have to traverse it forwards:
byte = bytePtr[i];

// Print the bits in the byte:
for(int bitIndex = CHAR_BIT-1; bitIndex >= 0; --bitIndex)
std::cout << ((byte >> bitIndex) & 1);
}

std::cout << std::endl;
}

int main()
{
printBinaryRepresentation(0.5);
printBinaryRepresentation(-0.5f);
}
//---------------------------------------------------------------
 
Juha Nieminen

Leigh Johnston said:
Leigh Johnston said:
int main()
{
double f = 42.42;
unsigned char* bits = reinterpret_cast<unsigned char*>(&f);
for (std::size_t i = 0; i != sizeof(f); ++i)
{
unsigned char byte = bits[i];
for (std::size_t j = CHAR_BIT; j != 0; --j)
std::cout<< (byte>> (j-1)& 1 ? '1' : '0');
}
}


error: 'size_t' is not a member of 'std'


What are you gibbering about? #includes are usually implied in a code
snippet. std::size_t exists.


And someone learning C++ is supposed to know that how?

It would have cost only a couple of additional lines to post a complete
program.
 
gwowen

None of the presented solutions take into account endianness.

The problem with that is that you assume every machine's endianness/
byte-order can be described completely by looking at the lowest-
addressed byte of the representation of unsigned(1). This can be
wrong on an ARM, and is always wrong on a PDP-11. It also assumes that
the endianness of a floating point type is the same as that of an
integer type. This is untrue on a smattering of
crazier-than-a-bag-of-weasels architectures:
http://www.quadibloc.com/comp/cp0201.htm

Usually when you want to print the binary representation of something, you want the most significant bit to be printed first and go down from there

But sometimes you're printing the representation of something
precisely to poke around at the innards of the processor, to determine
byte-ordering, mantissa format, etc, and you really want the "bytes
ordered-as-they-are-arranged-in-memory". And of course, sometimes you
want to print the representation of something that is not a value type.
 
K4 Monk

    bool littleEndian = (*reinterpret_cast<char*>(&dummy) == 1);

Took me a minute to understand this but now that I do, it's very
clever! I got confused because dummy is an int of value 1, and in the
line above I wasn't sure if 1 was a bool or an int.
 
Joshua Maurice

Mostly correct. I'm not sure about the C++ standard, but the
C standard only guarantees that the size of a char or void
pointer is sufficient to allow a cast from pointers to other
object types to them (the back-cast is then also possible).
That means that there's a theoretical possibility that pointers
to objects of other types have smaller sizes. But I haven't seen
any such machine yet, and it most likely doesn't make much sense
to cast from a char pointer to some other pointer anyway (unless
it's a back-cast to the original type).

Be a bit careful with pointers to functions, they aren't
included in this (a function isn't an object). But even there
it also works on most machines.

Pretty sure that's the same in C++.

However, due to forward declarations, an implementation would likely
have to go out of its way to have different pointer to struct types
which have different sizes or representations. This is true of C and
C++.

Also, IIRC, some crazy mainframes do have void* and char* of a
different size than int*. The reason is that the machine is at the
hardware level only 64-bit addressable, and they didn't want to have
CHAR_BIT be 64. Instead, char has 8 bits, and a "simple"
char read or write is implemented through a hardware assembly load or
store with additional implicit bit manipulation to only change the
right 8 bits. (Needless to say, such a system wouldn't be POSIX
pthread conforming nor C++0x conforming.)
 
James Kanze

On Feb 18, 6:56 am, Jens Thoms Toerring wrote:
Also, IIRC, some crazy mainframes do have void* and char* of a
different size than int*.

Not so much on mainframes, as on smaller, embedded machines. On
machines where not much text handling is to be expected, using word
addressing makes sense even today, at least if words are small.
If you're not using all of the bits for addressing, then you
might as well spend the extra bits to address bytes. On a 16
bit machine, however, word addressing means you can address
128KB, rather than just 64KB.

What did happen in the past (and may still be the case on some
exotic mainframes) is that the basic address was originally word
addressing, but that some of the unused upper bits were later
dedicated to the byte address in a word. In such cases, a char*
wouldn't be bigger than an int*, but it would have a different
representation, and casting a char* to an int* could force the
byte address part to zero. (It also led to the interesting
characteristic that (unsigned)p > (unsigned)(p+1) in some cases.)
 
