reinterpret_cast

saurabh

I am trying to understand reinterpret_cast. Here is what I tried:

#include <iostream>
#include <cstdio>
using namespace std;

int main()
{
    int x = 1;
    int* y = &x;
    char* p = reinterpret_cast<char*>(y);
    /* I expect the following line to access the different bytes of x as
       chars and print them. What should I expect? Am I expecting right? */
    printf("\n the chars are %c %c %c %c\n", p[0], p[1], p[2], p[3]);

    return 0;
}

I want to know what is going behind.
Thanks for any help.
 
Harsh Puri

I am trying to understand reinterpret_cast,here is what i tried,

#include<iostream>
#include<cstdio>
using namespace std;
int main()
{

        int x=1;
        int* y=&x;
        char* p= reinterpret_cast<char*>(y);
        /*I expect the following line to access different bytes of x as chars and
        print them.what should I expect?Am i expecting right? */
        printf("\n the chars are %c %c %c %c\n",p[0],p[1],p[2],p[3]);

        return 0;

}

I want to know what is going behind.
Thanks for any help.

Hi Saurabh,

You are accessing uninitialized memory and the result of doing that is
undefined. That is precisely what will happen here: it will be
undefined behavior.

By the way, on a side note, reinterpret_cast is required in weird
cases like casting function pointers. It is not really advised to use
reinterpret_cast, as it's no better than our old friend the C-style
cast.

Regards
Harsh
 
dizzy

saurabh said:
I am trying to understand reinterpret_cast,here is what i tried,

#include<iostream>
#include<cstdio>
using namespace std;
int main()
{

int x=1;
int* y=&x;
char* p= reinterpret_cast<char*>(y);

So far so good (btw, this is well defined because x is a POD, and you may
convert a pointer to a POD into a pointer to char in a well-defined way
using reinterpret_cast).
/*I expect the following line to access different bytes of x as chars and
print them.what should I expect?Am i expecting right? */
printf("\n the chars are %c %c %c %c\n",p[0],p[1],p[2],p[3]);

Right. Though the reason to print those bytes as characters instead of printing
their numeric values escapes me.
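
For what it's worth, here is a minimal sketch of printing the bytes as numbers instead. The helper names bytes_of and print_bytes are made up for illustration. It reads through unsigned char so each byte comes out as a non-negative value, and it loops over sizeof rather than assuming an int has four bytes. The order of the bytes depends on the platform's endianness.

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

// Return the object representation of an int as numeric byte values,
// in address order. sizeof is used instead of a hard-coded 4.
// (bytes_of is a hypothetical helper name, not a standard function.)
std::vector<unsigned> bytes_of(const int& value)
{
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&value);
    std::vector<unsigned> out;
    for (std::size_t i = 0; i < sizeof value; ++i)
        out.push_back(p[i]);
    return out;
}

// Print each byte as a two-digit hex number rather than as a character.
void print_bytes(const int& value)
{
    for (unsigned b : bytes_of(value))
        std::printf("%02x ", b);
    std::printf("\n");
}
```

Calling print_bytes(x) with x == 1 would typically show a single 01 and the rest 00; which end the 01 appears at tells you the byte order.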

Also, notice that although the standard says you can treat a POD as an array
of chars, this applies to its object representation (the array of bytes that
makes up the POD object's contents, with its length given by sizeof). It does
not apply to the value representation (the bits of the object representation
that are actually used to form the value of the POD). Some bits may be there
just for padding, so printing or interpreting their values in that char array
of yours won't make much sense.
I want to know what is going behind.

What's going behind what?
 
acehreli

On Jun 10, 2:25 pm, saurabh wrote:
It is not really advised to use
the reinterpret_cast as its no better than our old friend the c-style
cast.

Maybe you should give reinterpret_cast another chance. It is a
protective friend who will not allow you to cast away const or
volatile. :)

Dump the C-style cast already! It may stab you in the back one day. :)
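
A small sketch of what I mean, with a made-up helper called zero_bytes. The reinterpret_cast compiles only when the target type keeps the const qualifier; the two commented-out lines show the rejected cast and the C-style cast that would be accepted silently.

```cpp
#include <cstddef>

// Count the zero bytes in the object representation of a const int.
// (zero_bytes is a hypothetical helper, used only to have something to compile.)
std::size_t zero_bytes(const int& value)
{
    // Fine: the target type is still pointer-to-const.
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&value);

    // Would not compile: reinterpret_cast may not cast away const.
    // unsigned char* q = reinterpret_cast<unsigned char*>(&value);

    // Compiles without complaint: a C-style cast is allowed to drop const.
    // unsigned char* r = (unsigned char*)&value;

    std::size_t zeros = 0;
    for (std::size_t i = 0; i < sizeof value; ++i)
        if (p[i] == 0)
            ++zeros;
    return zeros;
}
```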

Ali
 
Fran

On Jun 10, 2:25 pm, saurabh wrote:
int main()
{
int x=1;
int* y=&x;
char* p= reinterpret_cast<char*>(y);
/*I expect the following line to access different bytes of x as chars and
print them.what should I expect?Am i expecting right? */
printf("\n the chars are %c %c %c %c\n",p[0],p[1],p[2],p[3]);
return 0;
}
You are accessing uninitialized memory

This is not true. Every variable in the above program is initialized
before it is used.
 
saurabh

saurabh wrote:
.....SNIP....

So far so good (btw, this is well defined because x is a POD and you may
reinterpret_cast pointers to PODs to pointers to char in a well defined way
using reinterpret cast).
/*I expect the following line to access different bytes of x as chars and
print them.what should I expect?Am i expecting right? */
printf("\n the chars are %c %c %c %c\n",p[0],p[1],p[2],p[3]);

Right. Tho the reason to print those bytes as characters instead of printing
their numeric values avoids me.
Also, notice that tho the standard says that you can treat PODs as array of
chars this applies to their object representation (the bytes array that
make the POD object contents as given by sizeof()) but they do not also
mean the value representation (the bits from the object representation that
are actually used to form the value of the POD, that is, it is to be
expected that some bits are just for padding there and
printing/interpreting their values in that char array of yours won't make
much sense).

Thanks, I didn't know about the padding bits. I thought that by guessing the
binary representation of the integer I could guess which characters would be
printed.
What's going behind what?

By "what's going behind" I mean what is going on in memory: how is that
integer represented in memory? Can I predict the output?
 
dizzy

saurabh said:
Thanks, I didn't know about the padding bits.I thought by guessing the
binary respresentation of the integer I can guess what characters would be
printed.

Exotic platforms have integrals with padding (I think only the unsigned ones
are allowed to do that; the signed ones should not). But one common case of
padding is in a POD struct:

struct S { char a; int b; };

std::cout << sizeof(char) << ", " << sizeof(int) << ", " << sizeof(S);

Run that and try to interpret the values. Surprise? :)
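
If you want to see where the extra bytes sit, here is a rough sketch using offsetof. The helper names padding_in_S and offset_of_b are made up for illustration. On a typical platform with a 4-byte int aligned to 4 bytes you would expect sizes 1, 4 and 8, with b at offset 4, but none of those numbers are guaranteed.

```cpp
#include <cstddef>

struct S { char a; int b; };

// Bytes of S that belong to neither member.
std::size_t padding_in_S()
{
    return sizeof(S) - (sizeof(char) + sizeof(int));
}

// Offset of b from the start of S. Anything beyond sizeof(char)
// is padding inserted after a so that b is suitably aligned.
std::size_t offset_of_b()
{
    return offsetof(S, b);
}
```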
what's going behind means whats going in the memory,How that integer is
represented in the memory?Can I predict the output?

Well, first of all there are some guarantees from the standard. The standard
says that for unsigned integral values (and also for signed types, when they
store a value without sign) a C++ implementation will use a pure binary
system. To store negative values the standard allows 3 representations:
2's complement, 1's complement and sign and magnitude. All these negative
value implementations (you can look them up on Wikipedia or in your student
textbook) use a single bit for the sign.
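
To make the three schemes concrete, here is a small sketch that computes the 8-bit pattern each one would use for a negative number. The function names are made up, and the 8-bit width is only an assumption to keep the patterns short.

```cpp
// Each function returns the 8-bit pattern that represents -magnitude,
// for 0 < magnitude < 128. The field width of 8 bits is an assumption
// made for illustration only.

// 2's complement: invert all bits, then add one.
unsigned twos_complement(unsigned magnitude)
{
    return (~magnitude + 1u) & 0xFFu;
}

// 1's complement: invert all bits.
unsigned ones_complement(unsigned magnitude)
{
    return ~magnitude & 0xFFu;
}

// Sign and magnitude: set the top bit, keep the magnitude bits.
unsigned sign_magnitude(unsigned magnitude)
{
    return 0x80u | (magnitude & 0x7Fu);
}
```

For -1 these give 0xFF, 0xFE and 0x81 respectively; in all three the top bit is the sign.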

Then there are a lot of platform-specific details, such as the size of a byte
(in bits), the size in bytes of a type, and the number of bits used for the
value representation.

For the first you can use std::numeric_limits<unsigned char>::digits. For
the second, of course, you can use sizeof(type) (and with these two you can
find the number of bits that make up the object representation of a type).
std::numeric_limits<int>::digits tells you the number of bits used
for the value of an int (not counting the sign bit).
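
Putting those together, a sketch along these lines reports how many bits of an integer type are padding. The names bits_per_byte, object_bits, value_bits and padding_bits are made up, and it is only meant for the ordinary integer types (not bool).

```cpp
#include <climits>
#include <cstddef>
#include <limits>

// Bits per byte, as reported by the character type.
const int bits_per_byte = std::numeric_limits<unsigned char>::digits;

// Total bits in the object representation of T.
template <typename T>
std::size_t object_bits()
{
    return sizeof(T) * bits_per_byte;
}

// Bits that take part in the value of T, counting the sign bit if there is one.
template <typename T>
std::size_t value_bits()
{
    return std::numeric_limits<T>::digits
         + (std::numeric_limits<T>::is_signed ? 1 : 0);
}

// Whatever is left over is padding.
template <typename T>
std::size_t padding_bits()
{
    return object_bits<T>() - value_bits<T>();
}
```

On mainstream platforms padding_bits<int>() will most likely be 0, but that is a property of the platform, not something this code assumes.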

Is that enough for your needs?
 
James Kanze

On Jun 10, 10:02 am, Harsh Puri wrote:
int main()
{
int x=1;
int* y=&x;
char* p= reinterpret_cast<char*>(y);
/*I expect the following line to access different bytes of x as chars and
print them.what should I expect?Am i expecting right? */
printf("\n the chars are %c %c %c %c\n",p[0],p[1],p[2],p[3]);
return 0;
}
You are accessing uninitialized memory
This is not true. Every variable in the above program is
initialized before it is used.

Only if sizeof(int) >= 4. On the other hand, he's outputting
what is basically binary data to a stream opened in text mode,
which is undefined behavior.
 
James Kanze

Exotic platforms have integrals with padding (I think only the
unsigned ones are allowed to that, the signed ones should
not).

Character types are not allowed to have padding; all other types
can. Signed integral types can also have two 0's (+ and -), and
implementations can trap on -0. (But C++, unlike C, doesn't
allow trapping representations in plain char, so if signed char
has a trapping -0, then plain char must be unsigned.)

BTW: your earlier comments mentioned PODs. There is no
restriction to PODs here: you can access the underlying memory
of any type through an lvalue of character type. (The
difference is that using this to copy a non-POD elsewhere is
not guaranteed to be value preserving.)
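
As a rough illustration of the POD case, the sketch below copies an object out to a buffer of unsigned char and back again. Point and round_trip are made-up names; for a POD like this the copy is guaranteed to hold the same value as the original.

```cpp
#include <cstring>

// Hypothetical POD type, used only for illustration.
struct Point { int x; int y; };

// Copy a Point into a byte buffer and back into a fresh object.
Point round_trip(const Point& source)
{
    unsigned char buffer[sizeof(Point)];
    std::memcpy(buffer, &source, sizeof source);

    Point copy;
    std::memcpy(&copy, buffer, sizeof copy);
    return copy;
}
```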
But one common case of padding is in a POD struct:
struct S { char a; int b; };
std::cout << sizeof(char) << ", " << sizeof(int) << ", " << sizeof(S);
Run that and try to interpret the values. Surprise? :)
Well first of all there are some guarantees from the standard.
The standard says that the unsigned integral values (that is
even for the signed types, if we store a value without sign) a
C++ implementation will use a pure binary system. To store
negative values the standard allows 3 representations 2's
complement, 1's complement and sign and magnitude. All these
negative value implementations (you can look them up on
wikipedia or your student textbook) use a single bit for the
sign.

The C++ standard isn't as clear as one would like in this
regard, but I'm pretty sure that the intent is C compatibility.
Which means that integral types (except for unsigned char, and
char in C++) can have padding bits and trapping representations.
One current implementation has 48 bit ints with only 39 value
bits.
 
dizzy

James said:
Character types are not allowed to have padding; all other types
can.

Good, I never was sure about how to interpret 3.9.1 1. It says "character
types have no padding" but I was never sure if that means all character
types or just "char".
Signed integral types can also have two 0's (+ and -), and
implementations can trap on -0. (But C++, unlike C, doesn't
allow trapping representations in plain char, so if signed char
has a trapping -0, then plain char must be unsigned.)


You are right. I think I confused this with the POSIX/C99 intX_t types. For
those it says that, if they exist, they are native signed integral types of
that number of bits with no padding, but it doesn't say the no-padding thing
for uintX_t.
BTW: your earlier comments mentionned POD's. There is no
restriction to POD's here: you can access the underlying memory
of any type through an lvalue of character type. (The
difference is that using this to copying a non-POD elsewhere is
not guaranteed to be value preserving.)

You are right that I was thinking of what the standard says about copying
the value. But then I looked again and searched the standard, and I can't
seem to find where it says that you can convert a T* rvalue to an
(unsigned/signed) char* rvalue in a well-defined way. Can you point me to it?
The C++ standard isn't as clear as one would like in this
regard, but I'm pretty sure that the intent is C compatibility.
Which means that integral types (except for unsigned char, and
char in C++) can have padding bits and trapping representations.
One current implementation has 48 bit ints with only 39 value
bits.

Getting OT, does this implementation offer POSIX/C99 compliant stdint.h?
Because that 48 object/39 value bit integral does not seem like a good
candidate for int48_t or int39_t :)
 
Jerry Coffin

dizzy said:
Exotic platforms have integrals with padding (I think only the unsigned ones
are allowed to that, the signed ones should not).

Not so. If a signed and unsigned have the same value, they're also
required to have the same representation, so if there are padding bits
in one, there must be corresponding padding bits in the other.

[ ... ]
Well first of all there are some guarantees from the standard. The standard
says that the unsigned integral values (that is even for the signed types,
if we store a value without sign) a C++ implementation will use a pure
binary system. To store negative values the standard allows 3
representations 2's complement, 1's complement and sign and magnitude. All
these negative value implementations (you can look them up on wikipedia or
your student textbook) use a single bit for the sign.

These restrictions are valid for C99, but not (AFAIK) for the current
C++ standard. Though I've never seen or heard of it, I believe the
current C++ standard would allow other pure binary representations of
signed numbers (e.g. a biased representation). Then again, this
restriction was added to C99 largely because there doesn't seem to be
any reason NOT to -- i.e. no hardware out there using any representation
but those three, so worrying about seeing something else is probably
pretty pointless.
 
Jerry Coffin

dizzy said:
Getting OT, does this implementation offer POSIX/C99 compliant stdint.h?
Because that 48 object/39 value bit integral does not seem like a good
candidate for int48_t or int39_t :)

C99 makes the exact-width integer types semi-optional -- if the
implementation HAS types that are exactly 8, 16, 32 and/or 64 bits wide, it's
required to provide typedefs for those types. If it doesn't have such a
type, it must NOT provide a typedef for it.

The standard specifically prohibits an exact-width typedef of any type
that includes padding bits.
 
James Kanze

[ ... ]
Getting OT, does this implementation offer POSIX/C99
compliant stdint.h? Because that 48 object/39 value bit
integral does not seem like a good candidate for int48_t or
int39_t :)
C99 makes the exact-width integer types semi-optional -- iif
the implementation HAS types that are exactly 8, 16, 32 and/or
64 bits, it's required to provide typedefs of those types. If
it doesn't have such a type, it must NOT provide a typedef of
it.
The standard specifically prohibits an exact-width typedef of
any type that includes padding bits.

It also requires 2's complement for the exact-width signed
types.
 
James Kanze

Getting OT, does this implementation offer POSIX/C99 compliant
stdint.h?

C99 compliant, yes. Posix requires the presence of 2's
complement integers without padding, in the sizes 8, 16, 32 and
64, so you can't implement Posix on this hardware.
Because that 48 object/39 value bit integral does
not seem like a good candidate for int48_t or int39_t :)

Given that the representation of signed values is signed
magnitude, and not 2's complement, a conformant C99
implementation may not define any of the exact width types.

If your code needs the exact width types, you should have
something like:

#ifndef INT32_MAX
#error Exact width, 2's complement integral type required
#endif

Of course, if you forget it, the code will still fail to compile
with an error when you try to use the type.
 
James Kanze

[ ... ]
Exotic platforms have integrals with padding (I think only
the unsigned ones are allowed to that, the signed ones
should not).
Not so. If a signed and unsigned have the same value, they're
also required to have the same representation, so if there are
padding bits in one, there must be corresponding padding bits
in the other.

The intent, obviously, is that you can pass a positive int to a
"%x" formatting specifier in printf, and not have undefined
behavior.
[ ... ]
Well first of all there are some guarantees from the
standard. The standard says that the unsigned integral
values (that is even for the signed types, if we store a
value without sign) a C++ implementation will use a pure
binary system. To store negative values the standard allows
3 representations 2's complement, 1's complement and sign
and magnitude. All these negative value implementations (you
can look them up on wikipedia or your student textbook) use
a single bit for the sign.
These restrictions are valid for C99, but not (AFAIK) for the
current C++ standard. Though I've never seen or heard of it, I
believe the current C++ standard would allow other pure binary
representations of signed numbers (e.g. a biased
representation).

Maybe, if it could do so and still ensure that the common subset
of values in the signed and unsigned integral types had the same
representation. :) (Actually, I don't think it can; I think
that the requirement for a "pure binary representation" holds
for both signed and unsigned. But even if it doesn't, the
requirement you mentioned above ensures that at least the
positive values are represented in pure binary.)
Then again, this restriction was added to C99
largely because there doesn't seem to be any reason NOT to --
i.e. no hardware out there using any representation but those
three, so worrying about seeing something else is probably
pretty pointless.

Also because it was felt that the original wording wasn't very
clear and precise. While I don't think everyone really liked
the idea of enumerating all possible representations, no one
proposed wording which was precise enough to specify what was
wanted otherwise. There's a proposal to adopt the wording from
the C99 standard into C++, which I suppose the committee is
discussing at the meeting this week.
 
Jerry Coffin

[ ... ]
Maybe, if it could do so and still ensure that the common subset
of values in the signed and unsigned integral type had the same
representation:). (Actually, I don't think it can; I think
that the requirement for a "pure binary representation" holds
for both signed and unsigned. But even if it doesn't, the
requirement you mentionned above ensures that at least the
positive values be represented in pure binary.)

It's really ugly to implement, but I'm pretty sure it's possible. For
unsigned you use the same bias, but use wraparound so what would be
negative values as signed are large positive values for unsigned. The
hard part is actually implementing the arithmetic -- the only way I've
figured out to do it is undo the bias, do the arithmetic, then reapply
the bias. Needless to say, I don't think it'll be taking over the world
anytime soon...
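
Just to show the arithmetic step I have in mind, here is a toy sketch for the signed type only. It assumes an 8-bit excess-128 encoding, where a value v in [-128, 127] is stored as the pattern v + 128. The names bias, encode, decode and biased_add are made up, and it ignores overflow and the unsigned counterpart entirely.

```cpp
// Hypothetical 8-bit excess-128 ("biased") signed representation:
// the value v in [-128, 127] is stored as the bit pattern v + 128.
const int bias = 128;

// Value -> stored bit pattern.
unsigned encode(int value)
{
    return static_cast<unsigned>(value + bias);
}

// Stored bit pattern -> value.
int decode(unsigned bits)
{
    return static_cast<int>(bits) - bias;
}

// Addition as described above: undo the bias, add, reapply the bias.
// The result is only meaningful while the sum stays in [-128, 127].
unsigned biased_add(unsigned a, unsigned b)
{
    return encode(decode(a) + decode(b));
}
```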
 
