How to use std::cout to output a char as a number?

Dancefire

Hi, everyone

It might be a simple question, but I really don't know the answer.

char c = '1';
cout << c;

The above code will only output a '1' rather than 0x31;

If I use int cast, it can show the number:

cout << (int) c;

However, if c is 0x80 or greater, as can happen when it is part of an
MBCS (multibyte character set) string, such as:

const char* cc = "\xba\xba";
cout << hex << "0x" << (int)cc[0] << ", 0x" << (int)cc[1] << endl;

the value is sign-extended, and the output is:

0xFFFFFFBA, 0xFFFFFFBA
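
The effect can be reproduced in isolation. This is only a minimal sketch: it assumes a 32-bit int, uses signed char explicitly so that it does not depend on whether plain char is signed, and hex_of_int_cast is a made-up helper name. Note that the stream prints lowercase digits (ffffffba) unless std::uppercase is set.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: returns what `cout << hex << (int)c` would print.
std::string hex_of_int_cast(signed char c)
{
    std::ostringstream out;
    // The conversion to int keeps the sign, so a negative char becomes a
    // negative int, and the hex inserter shows all of its bits.
    out << std::hex << static_cast<int>(c);
    return out.str();
}
```

With a 32-bit int, hex_of_int_cast(-70) gives "ffffffba", while hex_of_int_cast('1') gives "31" on an ASCII system.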

I currently avoid the sign extension, and get the numeric value of a
plain char, with a clumsy double cast:

const char* cc = "\xba\xba";
cout << hex << "0x" << (unsigned short)((unsigned char)cc[0])
     << ", 0x" << (unsigned short)((unsigned char)cc[1]) << endl;

This time the output looks OK. But it doesn't feel right: I only want
to output the char as a number, and that should not need something as
complicated as a double cast.

Did I miss anything? What is the most direct way to output a char as
a number with std::cout?

Thanks.
 
Victor Bazarov

Dancefire said:
Hi, everyone

It might be a simple question, but I really don't know the answer.

char c = '1';
cout << c;

The above code will only output a '1' rather than 0x31;

If I use int cast, it can show the number:

cout << (int) c;

however, if the c is > 0x80,[..]

See, this is where the logic fails you. An 8-bit char cannot
have the value > 0x7f, unless it's implemented as unsigned. If
it's a signed char (as you seem to show), 0x7f is its highest
possible value.
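
Whether plain char is signed is implementation-defined, so it can be worth checking on the platform at hand. A small sketch follows; the two function names are made up for illustration, and only <limits> and <climits> are used.

```cpp
#include <climits>
#include <limits>

// True when plain char behaves like signed char on this implementation.
bool plain_char_is_signed()
{
    return std::numeric_limits<char>::is_signed;
}

// Largest value a plain char can hold: 127 for a signed 8-bit char,
// 255 for an unsigned 8-bit char.
int plain_char_max()
{
    return CHAR_MAX;
}
```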

V
 
Dancefire

If I use int cast, it can show the number:
cout << (int) c;
however, if the c is > 0x80,[..]

See, this is where the logic fails you. An 8-bit char cannot
have the value > 0x7f, unless it's implemented as unsigned. If
it's a signed char (as you seem to show), 0x7f is its highest
possible value.

Yes, declaring it as 'unsigned char' works for the case above. Thank you.

But I still have a problem. The original problem came up when I
tried to print the content of a std::string as a number for each
character. The standard defines string as basic_string<char, ...>, not
basic_string<unsigned char, ...>. Currently, I have to use the
following loop to show each character as a number:

string test("\xba\xba");
for (std::string::iterator iter = test.begin(); iter != test.end(); ++iter)
{
    cout << "0x" << hex << (unsigned short)((unsigned char) *iter) << " ";
}

Here *iter is a char, not an unsigned char, so I have to use a double
cast to do the job. With the C-style function, I can simply specify
"%x" in the format string of printf() to print the char without any
unnecessary casting, which looks much more straightforward. With C++'s
std::cout, there should be an easy way to specify whether I want to
print a character or a numeric value, rather than leaving the decision
to cout, since 'char' is really special: sometimes it is a character,
and sometimes it is an 8-bit integer.
 
Gennaro Prota

On 30 Apr 2007 04:52:22 -0700, Dancefire wrote:

[...]
const char* cc = "\xba\xba";
cout << hex << "0x" << (unsigned short)((unsigned char)cc[0])
     << ", 0x" << (unsigned short)((unsigned char)cc[1]) << endl;

This time the output looks ok. But it doesn't make any sense, since I
only want to output the char as a number, should not so complicated by
double casting.

Did I miss anything? What is the most directly way to use std::cout
output a char as a number?

What do you mean by most direct way? You could write something like
the following:

unsigned
as_unsigned( char c )
{
return static_cast< unsigned char >( c );
}

(note: if c is negative, the cast to unsigned char will yield the
mathematical value UCHAR_MAX + c, regardless of whether char uses a
two's complement representation or not)
 
Gennaro Prota

(note: if c is negative, the cast to unsigned char will yield the
mathematical value UCHAR_MAX + c, regardless of whether char uses a
two's complement representation or not)

Of course I meant to write UCHAR_MAX + 1 + c, sorry.
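
The corrected rule can be checked over the whole range of char. This is a sketch only; predicted is a hypothetical helper that spells the rule out, and as_unsigned is the function from the earlier post.

```cpp
#include <climits>

// Same function as in the earlier post.
unsigned as_unsigned(char c)
{
    return static_cast<unsigned char>(c);
}

// Hypothetical helper: the value the conversion rule predicts, which is
// c itself when c is non-negative and UCHAR_MAX + 1 + c otherwise.
long predicted(char c)
{
    return c >= 0 ? static_cast<long>(c)
                  : static_cast<long>(UCHAR_MAX) + 1 + c;
}
```

For example, with an 8-bit signed char holding -70, the prediction is 255 + 1 - 70 = 186, which is 0xba.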
 
Dancefire

What do you mean by most direct way? You could write something like
the following:

unsigned
as_unsigned( char c )
{
return static_cast< unsigned char >( c );
}

(note: if c is negative, the cast to unsigned char will yield the
mathematical value UCHAR_MAX + c, regardless of whether char uses a
two's complement representation or not)

Thanks for the reply. Do I have to use a function to do this? It involves
much more overhead: not only the function call, but I also have to put
the function somewhere that every cout << char in my code can see it,
which means I have to put it into my function library.

And the above function still cannot output the number; it will still
output the character, unless I add another cast from unsigned char to
unsigned short. That is no better than double casting the char at the
place where I output it, as in my original post:

cout << " 0x" << hex
     << static_cast<unsigned short>(static_cast<unsigned char>(c));

But this form is not semantically correct: I only want to output an
8-bit unsigned integer, not a 16-bit unsigned integer. And do I really
need two casts just to output the char as a number? Is there any
other way to do that? Is there anything like printf("%x", c) for
cout, meaning that I set something and then a char is output as
a number? Such as:

cout << " 0x" << hex << char_is_number << c;

or

cout << " 0x" << hex << static_cast<*a simple type here*>(c);

Thanks.
 
Gennaro Prota

Did I miss anything? What is the most directly way to use std::cout
output a char as a number?

What do you mean by most direct way? You could write something like
the following:

unsigned
as_unsigned( char c )
{
return static_cast< unsigned char >( c );
}

(note: if c is negative, the cast to unsigned char will yield the
mathematical value UCHAR_MAX + c, regardless of whether char uses a
two's complement representation or not)

[sig snipped...]

Thanks for reply, do I have to use a function to do this? It involve
much more overhead. Not only the function call,

You shouldn't be concerned with that. Try for instance googling for
"premature optimization".

but also I should put
the function to some where my every time cout << char can see the
code, which means I should put the function into my function lib.

What's wrong with that?

And the above function still cannot output the number, it's will still
output the character, Unless I call another casting to cast the
unsigned char to unsigned short.

No. First the function performs no output, just a (double) conversion.
Secondly, the return type is unsigned int, not unsigned char. Try

std::cout << as_unsigned( c );

It's not better than just double casting the char at the place

It's better than double casting "in place" for at least two reasons:
a) it gives a name to the operation you perform b) it encapsulates the
way you do it: should you later discover that the way you did it is
incorrect, you just have to fix it in one place.
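
As an illustration of point b), here is how the std::string loop from earlier in the thread might look once the conversion has a name. It is a sketch, not a recommendation of a particular interface; to_hex_list is a made-up helper that writes into a string so the result can be inspected.

```cpp
#include <sstream>
#include <string>

// The conversion lives in one named place.
unsigned as_unsigned(char c)
{
    return static_cast<unsigned char>(c);
}

// Hypothetical helper: renders every char of `s` as "0x<hex> ".
std::string to_hex_list(const std::string& s)
{
    std::ostringstream out;
    out << std::hex;
    for (std::string::const_iterator it = s.begin(); it != s.end(); ++it)
        out << "0x" << as_unsigned(*it) << ' ';
    return out.str();
}
```

If the conversion ever turns out to be wrong, only as_unsigned needs to change; the loop stays as it is.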

And, of course, you should anyway use new-style casts.

I output it. which is provide at original post:

cout << " 0x" << hex
     << static_cast<unsigned short>(static_cast<unsigned char>(c));

But this form is not semantically correct,

You lost me here. Didn't you say in your original post that this gives
what you want?

I only want to output the 8-bit unsigned integer, rather than 16-bit
unsigned integer.

What's the difference on output if your unsigned int has at most 8
bits on?

And I use 2
casting only for correct output the char as a number? Is there any
other way to do that? is there anything just like the printf("%x", c)
in cout, which means I set something, and then the output for char is
a number. Such as:

cout << " 0x" << hex << char_is_number << c;

or

cout << " 0x" << hex << static_cast<*a simple type here*>(c);

You could obtain the latter, yes, though I don't see what advantage it
gives.

#include <ostream>

struct simple_type
{
    typedef unsigned short number_type;

    number_type m_n;

    explicit simple_type( char c )
        : m_n( static_cast< unsigned char >( c ) )
    {
    }
};

std::ostream &
operator<<( std::ostream & dest, const simple_type & s )
{
    return dest << s.m_n;
}
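
A usage sketch for such a wrapper, with the struct repeated in condensed form so that the example compiles on its own. hex_via_wrapper is a hypothetical helper added only to make the result visible as a string.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Condensed copy of the wrapper, so the sketch is self-contained.
struct simple_type
{
    unsigned short m_n;
    explicit simple_type(char c) : m_n(static_cast<unsigned char>(c)) {}
};

std::ostream& operator<<(std::ostream& dest, const simple_type& s)
{
    return dest << s.m_n;   // arithmetic inserter, so a number is printed
}

// Hypothetical helper: formats one char through the wrapper, in hex.
std::string hex_via_wrapper(char c)
{
    std::ostringstream out;
    out << "0x" << std::hex << simple_type(c);
    return out.str();
}
```

At the call site this reads as out << hex << simple_type(c), which is the "cast to a simple type" form asked about.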
 
Gennaro Prota

Currently, I have to use the
following loop to show each character as a number:

string test("\xba\xba");
for (std::string::iterator iter = test.begin(); iter != test.end(); ++iter)
{
    cout << "0x" << hex << (unsigned short)((unsigned char) *iter) << " ";
}

The *iter is char, not unsigned char. So, I have to use double casting
to do the job. If I use c-style function, I can easily specify "%x" in
the format string of printf() to print the char without any
unnecessary casting, it looks much straightforward.

BTW, where did you get that idea from? If you specify "%x" the
corresponding argument must have type unsigned int; otherwise you have
*undefined behavior*.
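
So even with the printf family a conversion is needed for the call to be well defined. A minimal sketch, assuming C++11 for std::snprintf and 8-bit bytes; byte_to_hex is a made-up name.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: formats one char as two hex digits.
std::string byte_to_hex(char c)
{
    char buf[sizeof(unsigned) * 2 + 1];
    // "%x" expects an unsigned int, so convert explicitly (through
    // unsigned char, to avoid sign extension) instead of passing the char.
    unsigned value = static_cast<unsigned char>(c);
    std::snprintf(buf, sizeof buf, "%02x", value);
    return buf;
}
```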
 
Gianni Mariani

Dancefire wrote:
....
Did I miss anything? What is the most directly way to use std::cout
output a char as a number?

No, you didn't miss anything. There may be slightly more elegant
solutions but this is what you get when the compiler picks your output
format by overloading the << operator.

To make it a little more palatable, this is one alternative.

unsigned OutAsUnsigned( char val )
{
return 0xff & unsigned( val );
}


cout << hex << "0x" << OutAsUnsigned( cc[0] )
     << ", 0x" << OutAsUnsigned( cc[1] );
 
Dancefire

And the above function still cannot output the number, it's will still
No. First the function performs no output, just a (double) conversion.
Secondly, the return type is unsigned int, not unsigned char. Try

std::cout << as_unsigned( c );

Oops, yes, you're right: the function returns "unsigned", which is
"unsigned int", and that makes it work. But it is actually still a
double cast of the char: one explicit cast inside as_unsigned() and
one implicit conversion when the function returns. After the compiler
optimizes away the function call, the result will be the same as the
in-place double cast.

It's better than double casting "in place" for at least two reasons:
a) it gives a name to the operation you perform b) it encapsulates the
way you do it: should you later discover that the way you did it is
incorrect, you just have to fix it in one place.

And, of course, you should anyway use new-style casts.

Yes, you are right, encapsulating the operation in a function makes
debugging much easier.

You lost me here. Didn't you say in your original post that this gives
what you want?

The original solution in my first post:

cout << " 0x" << hex << (unsigned short)((unsigned char)c);

It works (I have changed the casts to static_cast<>() now), and I'm
currently using it, but I think this form is not elegant and might not
be necessary. I thought there might be a flag in the standard library,
something like "hex", that can be set on cout, so that I can write
cout << c directly, without any (logically) unnecessary casting in my code.

What's the difference on output if your unsigned int has at most 8
bits on?

No, there is no difference in the output; it just might not be elegant
in the code.

You could obtain the latter, yes, though I don't see what advantage it
gives.

#include <ostream>

struct simple_type
{
typedef unsigned short number_type;

number_type m_n;

explicit simple_type( char c )
: m_n( static_cast< unsigned char >( c ) )
{
}
};

std::ostream &
operator<<( std::ostream & dest, const simple_type & s )
{
return dest << s.m_n;
}

Yes, I can get the latter form with this code, but I don't see the
advantage of it either. It actually does the same thing as above, a
double cast, just wrapped in a struct.

So, in conclusion, there is no way to avoid the double cast if I
want std::cout to output a char as a number, right?

Thank you.
 
Dancefire

BTW, where did you get that idea from? If you specify "%x" the
corresponding argument must have type unsigned int; otherwise you have
*undefined behavior*.

Yes, you are right, I was wrong here. I mentioned %x only as an example
of something like the flag or type field in printf(), which I hoped
could be set on cout.
 
Dancefire

Dancefire wrote:

...
Did I miss anything? What is the most directly way to use std::cout
output a char as a number?

No, you didn't miss anything. There may be slightly more elegant
solutions but this is what you get when the compiler picks your output
format by overloading the << operator.

To make it a little more palatable, this is one alternative.

unsigned OutAsUnsigned( char val )
{
return 0xff & unsigned( val );

}

cout << hex << "0x" << OutAsUnsigned( cc[0] )
     << ", 0x" << OutAsUnsigned( cc[1] );

Thank you, yes, it works. If I use a function, I think Gennaro's
version is better. Modified to make the double cast more explicit, it
would look like:

unsigned int to_number(char c)
{
return static_cast<unsigned int>(static_cast<unsigned char>(c));
}
 
Gianni Mariani

Dancefire said:
Dancefire wrote:

...
Did I miss anything? What is the most directly way to use std::cout
output a char as a number?
No, you didn't miss anything. There may be slightly more elegant
solutions but this is what you get when the compiler picks your output
format by overloading the << operator.

To make it a little more palatable, this is one alternative.

unsigned OutAsUnsigned( char val )
{
return 0xff & unsigned( val );

}

cout << hex << "0x" << OutAsUnsigned( cc[0] )
     << ", 0x" << OutAsUnsigned( cc[1] );

Thank you, yes it works. If I use a function, I think Gennaro's
version is better, if I modified it to more explicit for double
casting, it will like:

unsigned int to_number(char c)
{
return static_cast<unsigned int>(static_cast<unsigned char>(c));
}

What you wrote is equivalent to the following, because there is an
implicit conversion to the return type:

unsigned int to_number(char c)
{
return static_cast<unsigned char>(c);
}


From the compiler's standpoint, the code static_cast<unsigned char>(c)
is the same as 0xff & unsigned( val ).
 
Gennaro Prota

It works, (I modified them to static_cast<>() way now), I'm currently
using this way, but I think this form is not elegant and might be not
necessary. I thought there may be a flag or something like "hex" in
STL can be set on cout, so I can directly cout << c, without any
(logically) unnecessary casting in my code.

No such flag :) The issue, so to say, is "hardcoded" in the logic of
the stream inserters and their overloading: all the overloads for
char, signed char and unsigned char, behave as *character* inserters;
inserters for other built-in types behave as *arithmetic* inserters.
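
The overload behaviour is easy to observe. In the sketch below, streamed is a hypothetical helper and the expected strings assume an ASCII execution character set.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: streams any value and returns the text produced.
template <typename T>
std::string streamed(T value)
{
    std::ostringstream out;
    out << value;   // overload resolution on T picks the inserter
    return out.str();
}
```

Here streamed('A'), streamed(static_cast<signed char>('A')) and streamed(static_cast<unsigned char>('A')) all give "A", while streamed(static_cast<int>('A')) gives "65". Unary plus also promotes to int, so streamed(+'A') gives "65" as well, but for a negative char it still shows the sign-extended value, which is why the cast through unsigned char remains necessary.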

It seems that you were just curious about the theoretical question
here (sorry for somehow missing that in my earlier replies), so the
above should answer it. Practically speaking, don't be afraid to use
the function at all, and don't worry about "optimization".

Of course, depending on your application and/or library design, you
might choose some higher level abstraction, such as a HexDumper class,
with something like a display( std::ostream & ) member function giving
an output along the lines of
> 80 65 66 ...
> A0 FF 4A 7F ...

Take care of restoring the formatting flags if you change them within
a function which accepts a stream argument; see for instance the Boost
state-saving templates at

<http://www.boost.org/libs/io/>

(Frankly I prefer something definitely simpler than that, but that's
another story :) It's not like you have to maintain that code)
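
A simpler variant, saving and restoring the state by hand, could look like the following. This is only a sketch: hex_dump and hex_dump_string are made-up names, and only the flags and the fill character are restored, which is all this function changes.

```cpp
#include <iomanip>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: writes each byte of `data` as two uppercase hex
// digits separated by spaces, then restores the stream's formatting state.
void hex_dump(std::ostream& out, const std::string& data)
{
    const std::ios_base::fmtflags saved_flags = out.flags();
    const char saved_fill = out.fill('0');

    out << std::hex << std::uppercase;
    for (std::string::size_type i = 0; i < data.size(); ++i) {
        if (i != 0) out << ' ';
        out << std::setw(2)
            << static_cast<unsigned>(static_cast<unsigned char>(data[i]));
    }

    // Put back what the caller had set before the call.
    out.fill(saved_fill);
    out.flags(saved_flags);
}

// Convenience wrapper returning the dump as a string.
std::string hex_dump_string(const std::string& data)
{
    std::ostringstream out;
    hex_dump(out, data);
    return out.str();
}
```

After the call, a later out << 255 on the same stream prints 255 again rather than FF.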
 
Dancefire

What you write is the same as because there is a conversion to the
return value type :

unsigned int to_number(char c)
{
return static_cast<unsigned char>(c);

}

From the compiler's standpoint, the code static_cast<unsigned char>(c)
is the same as 0xff & unsigned( val ).

Yes, they are actually the same thing. The only difference is that my
function does not hardcode 0xff, to avoid future portability issues. (But
practically, in the foreseeable future, nobody is going to define char as
anything other than 8 bits, so the two functions give the same result. :) )
 
James Kanze

On 30 Apr 2007 07:07:41 -0700, Dancefire wrote:
BTW, where did you get that idea from? If you specify "%x" the
corresponding argument must have type unsigned int; otherwise you have
*undefined behavior*.

In practice, of course, since unsigned int is required to have
the same size as int (and integral promotion guarantees that
he'll pass the char as int), about the only thing he is risking
is funny values in the display: things like ffe9, instead of
just e9.

The fundamental problem, of course, is that we use a type which
might be signed for character values, with the result that we
end up lying about things. Thus, for example, in ISO 8859-1,
the character LATIN SMALL LETTER E WITH ACUTE is 0xE9; 233, if
you prefer. But if CHAR_MAX is 127 (not an infrequent case),
this simply doesn't fit; you can't have a char with this value.
In practice, however, most systems will convert this value to a
char, without any problem, giving the value -23, and when you
convert -23 to unsigned char, on a machine with 8 bit char's,
you end up with 233. (The conversion in this direction is well
defined by the standard.) But the fact remains that we're
lying; the char does NOT contain the correct numeric value.

And it's a well known fact that one lie leads to another, and
the results rapidly become very complicated. Logically, the
requirement should be that plain char can hold any character in
any of the extended character sets without being negative.
Practically speaking, however, this would mean that on 8 bit
machines, plain char would have to be unsigned. Historically,
however, it was signed on the PDP-11, an incredible amount of C
was written in the belief that all the world is a PDP-11, and
used char as a very small, signed integer, and most
implementors, given the choice, don't (or didn't) want to break
such software.
 
James Kanze

Dancefire said:
Dancefire wrote:
...
Did I miss anything? What is the most directly way to use std::cout
output a char as a number?
No, you didn't miss anything. There may be slightly more elegant
solutions but this is what you get when the compiler picks your output
format by overloading the << operator.
To make it a little more palatable, this is one alternative.
unsigned OutAsUnsigned( char val )
{
return 0xff & unsigned( val );
}
cout << hex << "0x" << OutAsUnsigned( cc[0] )
     << ", 0x" << OutAsUnsigned( cc[1] );
Thank you, yes it works. If I use a function, I think Gennaro's
version is better, if I modified it to more explicit for double
casting, it will like:
unsigned int to_number(char c)
{
return static_cast<unsigned int>(static_cast<unsigned char>(c));
}
What you write is the same as because there is a conversion to the
return value type :
unsigned int to_number(char c)
{
return static_cast<unsigned char>(c);

}
From the compiler's standpoint, the code static_cast<unsigned char>(c)
is the same as 0xff & unsigned( val ).

There's a semantic difference with regards to what the code says
to the reader. If the goal is to display an 8 bit character
code as a number, and'ing with 0xFF says this more clearly, at
least IMHO.

Formally, there's also a difference at the language level; the
two are only equivalent on 2's complement machines with 8 bit
bytes. If the goal is to display an 8 bit code, and you're
using a ones complement machine, or a machine with 9 bits, then
the cast to unsigned char may not give the expected results.
In the case of 9 bit bytes, there shouldn't be any problem,
though, since any 8 bit code is representable as a positive
value in a 9 bit byte. And the only ones complement machine I
know that is still being sold has 9 bit bytes, so there
shouldn't be any problems there, either. (There's also signed
magnitude, but I don't know of any machine still being sold
which uses that.)
 
Markus Schoder

James said:
Formally, there's also a difference at the language level; the
two are only equivalent on 2's complement machines with 8 bit
bytes. If the goal is to display an 8 bit code, and you're
using a ones complement machine, or a machine with 9 bits, then
the cast to unsigned char may not give the expected results.

I do not see how ones' or two's complement would make any difference here,
given that conversions to unsigned (no matter whether char or int) are
defined modulo 2^n.
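
Since both conversions work on values rather than bit patterns, the two functions from the thread can be compared directly. A sketch, with both renamed here for clarity:

```cpp
#include <climits>

// Gianni's version: convert to unsigned (value modulo 2^n for the width
// of unsigned), then keep the low 8 bits.
unsigned masked(char val)
{
    return 0xff & unsigned(val);
}

// Gennaro's version: convert to unsigned char (value modulo 2^CHAR_BIT).
unsigned via_unsigned_char(char val)
{
    return static_cast<unsigned char>(val);
}
```

When CHAR_BIT is 8, both reduce the value modulo 256, so they agree for every char value whatever the signed representation; they can differ only when a byte has more than 8 bits.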
 
Gennaro Prota

On 2 May 2007 02:33:51 -0700, James Kanze wrote:

[...]
There's a semantic difference with regards to what the code says
to the reader. If the goal is to display an 8 bit character
code as a number, and'ing with 0xFF says this more clearly, at
least IMHO.

I interpreted the original question as asking to display the
"character code", whatever the char width was (9 bits on
implementations with 9 bit chars). So I'd have used UCHAR_MAX instead
of hardcoding 0xFF. (It is an interesting exercise for language
lawyers establishing if

UCHAR_MAX & static_cast< unsigned >( c )

is enough or there's --in general-- a need for some conversion on
UCHAR_MAX too :))
 
