weird encounter on Solaris, Windows and OS X

ts

Hi, I have the following code:

============================
#include <iostream>
#include <fstream>

using namespace std;

int main()
{
    ofstream newFile("test.new", ios::binary);

    int test = 3;

    // Each call writes only the first byte of test's in-memory representation.
    newFile.write(reinterpret_cast<const char *>(&test), 1);
    newFile.write(reinterpret_cast<const char *>(&test), 1);

    cout << endl;
    newFile.close();
    return 0;
}
============================
While OS X and Windows give me 33 inside the file (don't mind the
endianness for now), Solaris gives me 00. Does anyone know what is
causing the difference on the Solaris system? My target is actually
to get the code running on Solaris, but that doesn't look like the
case now.

thanks.
 
Alf P. Steinbach

Formally the result of the reinterpret_cast is undefined, but in practice you're
trying to access address 3.

What makes you think that the same data should be there with different operating
systems, or even processes in the same OS?

All this has to do with C++ is the possibility of writing environment-specific
code. What is it you're trying to achieve, and what result did you expect?


Cheers & hth.,

- Alf
 
Alf P. Steinbach

* Alf P. Steinbach:
Formally the result of the reinterpret_cast is undefined, but in
practice you're trying to access address 3.

Sorry, but such is the problem with UB code: even experienced old C++
programmers stumble when presented with it.

It seems you're trying to access the first byte of the 'int' variable.

The contents of the first byte do indeed depend on the endianness, which
cannot be ignored.


Cheers & again hope this helps,

- Alf (a little too trigger-happy :) )
 
ts

Alf,

Actually, I expected to get 33 when reading the file back on the
Solaris system, but instead I got 00, and I can't figure out what the
actual problem is. My reading code is below:

==============
#include <iostream>
#include <fstream>

using namespace std;

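int main()
{
    ifstream cdlFile("cdl.new", ios::binary);

    while (cdlFile.good())
    {
        // get() returns the next byte as an int; the second good()
        // check keeps the end-of-file marker from being printed.
        int byte = cdlFile.get();
        if (cdlFile.good())
            cout << byte;
    }

    cout << endl;
    cdlFile.close();
    return 0;
}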
============================
 
Michael Oswald

Your actual problem is not the reading, but the writing.

Take the lines in your writing program that set test to 3 and write
its first byte. The variable test is then represented in memory on
your OS X and Windows machines (assuming a 32 bit system) like this:

0x03 0x00 0x00 0x00

and on the Solaris like this:

0x00 0x00 0x00 0x03

Look up byte ordering (endianness). Since you only write one byte,
you only write the first byte and clearly get 3 on one system and 0 on a
system with different byte order.
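
One way to check the layout described above on a given machine is to print
the bytes of an int in address order. This is only a minimal sketch,
assuming 8-bit bytes; the helper name byte_dump is made up for illustration
and does not come from the thread:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the bytes of `value` in address order,
// formatted as two hex digits each and separated by spaces.
// Assumes 8-bit bytes.
std::string byte_dump(int value)
{
    static const char digits[] = "0123456789abcdef";

    // Inspecting an object through a pointer to unsigned char is permitted.
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&value);

    std::string out;
    for (std::size_t i = 0; i < sizeof value; ++i) {
        if (i != 0)
            out += ' ';
        out += digits[(p[i] >> 4) & 0xF];  // high nibble
        out += digits[p[i] & 0xF];         // low nibble
    }
    return out;
}
```

With a 4-byte int, byte_dump(3) gives "03 00 00 00" when the least
significant byte is stored first and "00 00 00 03" when it is stored last.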

If you want the file to be portable, read up about serialisation in the
C++ FAQ


hth,
Michael
 
Jiří Paleček

This is undefined behaviour, because it is dereferencing a pointer to an
object of different type than the pointer base type [4.1.1]. Also see the
strict aliasing rule [3.10.15]. This specifies ways objects can be
accessed, and using lvalue of type const char* is not among them
(interestingly, using a lvalue of type char is). Some compilers use this
rule to optimize variable accesses, in your case, the compiler might just
omit the assignment to test altogether, because this is never read. Try
looking at the assembly to see if this is the case, and turning strict
aliasing rule off if this is possible.

The canonical way of doing this is using a union like this:

union {
    int i;
    char ch [4];
};

Regards
Jiri Palecek
 
Default User

ts wrote:

while os x and windows give me the result of 33 inside the file(dont
mind the endianess for now). Solaris gives me 00. Did anyone know what
causing the difference on the solaris system? cause my target is
actually make the code running on solaris but doesnt looks like the
case now.

Congratulations. You have discovered the endian situation.

<http://en.wikipedia.org/wiki/Endian>



Brian
 
James Kanze

Michael Oswald wrote:
Your actual problem is not the reading, but the writing.
If you take your lines:
then the variable test is represented in memory (assuming a 32 bit
system) like this:
0x03 0x00 0x00 0x00
and on the Solaris like this:
0x00 0x00 0x00 0x03

Just a few nits. (You actually explained the problem quite
well.) But "Solaris" is an OS, not a machine architecture, and
it runs on different machine architectures, both big endian and
little endian. (And of course, not every machine has four byte
ints.)

Michael Oswald wrote:
Google up for different byte ordering. Since you only write
one byte, you only write the first byte and clearly get 3 on
one system and 0 on a system with different byte order.
If you want the file to be portable, read up about
serialisation in the C++ FAQ

Where "portable" means "readable by anything but the binary
image which wrote it". I've seen byte order change from one
version of the compiler to the next, or depend on compiler
flags.
 
James Kanze

Jiri Palecek wrote:
This is undefined behaviour, because it is dereferencing a
pointer to an object of different type than the pointer base
type [4.1.1].

With one exception: you can always access memory as an array of
character types, see §3.10/15. (And it would seem that you've
found a bug in the standard, since 4.1.1 doesn't mention this
exception.)

Jiri Palecek wrote:
Also see the strict aliasing rule [3.10.15]. This specifies
ways objects can be accessed, and using lvalue of type const
char* is not among them (interestingly, using a lvalue of
type char is).

Exactly. What do you think ostream::write does with the pointer
you pass it, if not dereference it (resulting in an lvalue of
type char)?

Jiri Palecek wrote:
Some compilers use this rule to optimize variable accesses,
in your case, the compiler might just omit the assignment to
test altogether, because this is never read. Try looking at
the assembly to see if this is the case, and turning strict
aliasing rule off if this is possible.

A compiler that does this if one of the types is a character
type is not conformant. There are also other tricky cases where
the compiler cannot (according to the standard) count on the
lack of aliasing---some compilers (e.g. g++) get some of the
border cases wrong, however.

Traditionally, the purpose of reinterpret_cast is precisely to
allow this sort of aliasing. The results (except in the case of
character types) are formally undefined behavior, since there's
really not anything the standard can say about them, but as a
quality of implementation issue, they should work, at least in
cases where the reinterpret_cast is visible to the compiler.

Jiri Palecek wrote:
The canonical way of doing this is using a union like this:
union {
    int i;
    char ch [4];
};

Now this *is* undefined behavior. Some compilers (e.g. g++) do
guarantee it, though.
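
For getting at the bytes of an int without reading an inactive union
member, std::memcpy is one alternative. Note that this is a swapped-in
technique, not something proposed in this thread, and the names IntBytes,
to_bytes and from_bytes are made up for the sketch:

```cpp
#include <cstring>

// Hypothetical holder for the object representation of an int.
struct IntBytes
{
    unsigned char b[sizeof(int)];
};

// Copies the bytes of `value`, in address order, into an IntBytes.
// The order of those bytes still depends on the host's endianness.
IntBytes to_bytes(int value)
{
    IntBytes result;
    std::memcpy(result.b, &value, sizeof value);
    return result;
}

// Rebuilds the int from bytes produced by to_bytes on the same host.
int from_bytes(const IntBytes& bytes)
{
    int value;
    std::memcpy(&value, bytes.b, sizeof value);
    return value;
}
```

This only avoids the aliasing question; it does nothing to make the byte
order portable between machines.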

Note that in practice, many compilers (including g++) will fail
at times with code like the following:

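#include <iostream>

union U
{
    int i ;
    double d ;
} ;

int
tricky( int const* pi, double* pd )
{
    int r = *pi ;   // read through the int member
    *pd = 0.0 ;     // then write through the double member
    return r ;
}

int
main()
{
    U u ;
    u.i = 43 ;
    std::cout << tricky( &u.i, &u.d ) ;
    return 0 ;
}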

Note that this code *is* fully conformant, with well defined
behavior according to the standard. I'm not sure that this is
intentional, however, since it pretty much negates the effect of
allowing the compiler to assume a lack of aliasing if the
pointer types are different. I rather suspect that both the C
and the C++ standard need serious rework in this respect. As it
stands, however, the compiler can only assume no aliasing
between two pointers if 1) they have different types, 2) neither
of the types is a character type, and 3) accesses are
sufficiently "interleaved" (e.g. a read of *pi after the write
through *pd, above). Some compilers forget this last
requirement, however.
 
Juha Nieminen

Default said:
Congratulations. You have discovered the endian situation.

<http://en.wikipedia.org/wiki/Endian>

Actually, code similar to what the original poster wrote is sometimes
used to detect the endianness of the target system at runtime. (Of course
the standard does *not* guarantee it will work, but *in practice* it
works, except maybe on some exotic hardware.)

Of course, if you need to care about endianness, then in 99% of the
cases you are doing it wrong. It's rare to have to care. (There might be
some situations in which, e.g., reading the contents of a binary file may
be done faster if the endianness of the values in the binary file matches
that of the architecture, and needs to be done in the slower way if the
endianness does not match. But as said, since the endianness test is not
100% fool-proof, it's always a risk to write such low-level code if you
really intend for the code to be fully portable.)
 
