Converting CHAR_MAX+1 to char?

Johannes Schaub (litb) · Mar 12, 2010

What happens in this case?

char c = CHAR_MAX + 1;

Not sure what happens. The spec only defines behavior for signed and
unsigned integral types it seems. Is the behavior undefined?

Balog Pal · Mar 12, 2010

Johannes Schaub (litb) said:
What happens in this case?

char c = CHAR_MAX + 1;

Not sure what happens. The spec only defines behavior for signed and
unsigned integral types it seems. Is the behavior undefined?

As I read the standard, unless CHAR_MAX == INT_MAX, the result is that c is
assigned an implementation-defined value.

The stuff on the right is promoted to int. Then addition happens. If that
overflows, it is UB. If not, the result is converted to char.
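The steps above can be traced with a minimal sketch. It assumes the common case where char is narrower than int, so the addition itself cannot overflow. The names addition_is_safe, narrow_to_char and trace_steps are made up for illustration and do not come from the thread.

```cpp
#include <cassert>
#include <climits>

// True when CHAR_MAX + 1 fits in int, so the addition cannot overflow.
constexpr bool addition_is_safe() {
    return CHAR_MAX < INT_MAX;
}

// Hypothetical helper: the int -> char conversion that happens on assignment.
// For an out-of-range value the result is whatever the implementation defines.
inline char narrow_to_char(int value) {
    return static_cast<char>(value);
}

// Walks through the steps in order: promote, add, then convert.
inline void trace_steps() {
    if (addition_is_safe()) {
        int sum = CHAR_MAX + 1;        // evaluated as int, no char arithmetic
        char c = narrow_to_char(sum);  // sum is out of range for char
        assert(c != sum);              // so c cannot hold the original value
    }
}
```

Values that do fit, such as CHAR_MAX itself, pass through narrow_to_char unchanged; only the out-of-range case depends on the implementation.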

Johannes Schaub (litb) · Mar 13, 2010

Balog said:
As I read the standard, unless CHAR_MAX == INT_MAX, the result is that c
is assigned an implementation-defined value.

But I can't find it in the spec. It doesn't say what happens if the value
doesn't fit. Can you point me to the parts where it says that about char?

Kai-Uwe Bux · Mar 13, 2010

Paavo said:
Plain char can be signed or unsigned, as defined by the implementation.
If it is signed, then I think 4.7.3 holds: "If the destination type is
signed, the value is unchanged if it can be represented in the
destination type (and bit-field width); otherwise, the value is
implementation-defined."

If char is signed and its size is less than int, then CHAR_MAX+1 can be
represented as an int, but obviously not as a char, thus assigning this
back to a char is implementation-defined. If char is signed and has the
same size as int, then CHAR_MAX+1 would result in integer overflow,
invoking undefined behavior.

I think it's just a tad more twisted. CHAR_MAX + 1 looks like a constant
expression and then the provision from [5/5] should kick in, which says the
program is ill-formed. (However, the standard acknowledges in a note that
most existing implementations are non-conforming in this regard and ignore
integer overflow.)

[...]

Best

Kai-Uwe Bux

Johannes Schaub (litb) · Mar 13, 2010

Paavo said:
Plain char can be signed or unsigned, as defined by the implementation.
If it is signed, then I think 4.7.3 holds: "If the destination type is
signed, the value is unchanged if it can be represented in the
destination type (and bit-field width); otherwise, the value is
implementation-defined."

But where does it say that char can be signed? I only find text where it
says it could hold negative values. It does not seem to say that it can be
included in the list of signed or unsigned integer types by an
implementation.

Johannes Schaub (litb) · Mar 13, 2010

Kai-Uwe Bux said:
I think it's just a tad more twisted. CHAR_MAX + 1 looks like a constant
expression and then the provision from [5/5] should kick in, which says
the program is ill-formed. (However, the standard acknowledges in a note
that most existing implementations are non-conforming in this regard and
ignore integer overflow.)

The addition is done in the int or unsigned int domain so this does not
apply to the range of char.

James Kanze · Mar 14, 2010

Johannes Schaub (litb) said:
What happens in this case?

char c = CHAR_MAX + 1;

Not sure what happens. The spec only defines behavior for
signed and unsigned integral types it seems. Is the behavior
undefined?

Technically, it's implementation-defined. In practice, it's a
pretty safe bet that you'll get the same thing as
char c = CHAR_MIN;
with, perhaps, a compiler warning.
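A minimal sketch of that bet, assuming char is narrower than int and the implementation wraps out-of-range conversions, as mainstream two's-complement compilers do. The function names are illustrative only.

```cpp
#include <cassert>
#include <climits>

// Converts CHAR_MAX + 1 to char.
inline char char_max_plus_one() {
    static_assert(CHAR_MAX < INT_MAX,
                  "sketch assumes the addition does not overflow int");
    int sum = CHAR_MAX + 1;
    return static_cast<char>(sum);
}

// Holds on implementations that wrap; the wording quoted in this thread
// only promises an implementation-defined value.
inline void check_wraps_to_char_min() {
    assert(char_max_plus_one() == CHAR_MIN);
}
```

The check comes out the same whether plain char is signed (128 wraps to -128 for an 8-bit char) or unsigned (256 wraps to 0), since CHAR_MIN is -128 or 0 respectively.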

Kai-Uwe Bux · Mar 14, 2010

Johannes said:
The addition is done in the int or unsigned int domain

Yes, and it can overflow in the signed case.

Johannes said:
so this does not apply to the range of char.

"This" being?

I see, I was not specific. My remark only applies to the case where there is
an overflow (CHAR_MAX == INT_MAX). The overflow is not related to the char
type at all; and I don't think anybody claimed that. The additional point
is just that if an overflow happens in a constant expression, the program is
ill-formed.

Best

Kai-Uwe Bux
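One way to make the overflow case explicit is a compile-time guard. This is only a sketch under the assumption that a C++0x compiler with static_assert is available; the enumerator name kCharMaxPlusOne is hypothetical.

```cpp
#include <climits>

// Guard: if char were as wide as int, CHAR_MAX + 1 would overflow int.
static_assert(CHAR_MAX < INT_MAX,
              "CHAR_MAX + 1 would overflow int on this platform");

// An enumerator initializer must be a constant expression, so this is a
// context where the overflow rule for constant expressions would apply.
enum { kCharMaxPlusOne = CHAR_MAX + 1 };
```

On a platform with CHAR_MAX == INT_MAX the guard fires first and names the problem, rather than leaving it to however the compiler treats the overflowing constant.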

James Kanze · Mar 14, 2010

Balog Pal said:
As I read the standard, unless CHAR_MAX == INT_MAX, the result
is that c is assigned an implementation-defined value.

The stuff on the right is promoted to int. Then addition
happens. If that overflows, it is UB.

If that overflows, the program is ill-formed, because it is a
constant expression. (And you're right to point out this
possibility. I tend to forget it, because it isn't the case on
any of the machines I work on. But from what I understand, it's
a fairly frequent case on embedded platforms.)

If not, the result is converted to char.

Except that the results are guaranteed not to fit into a char,
so the results of the conversion are implementation-defined.
The C standard goes further, and limits what those results can
be: either an implementation-defined value, or an implementation-defined
signal. (The C standard neglects to say what happens if
the conversion occurs at compile time. I would expect a
compiler error if the runtime implementation would raise a
signal, but I'm not sure that the standard actually allows
this.)

From a quality point of view, the only reasonable response would
be the signal. But for a number of historical reasons, that
would break so much code that it's not going to happen. In
practice, I think it's safe to say that you'll always get
CHAR_MIN.

James Kanze · Mar 14, 2010

Johannes Schaub (litb) said:
But where does it say that char can be signed? I only find
text where it says it could hold negative values. It does not
seem to say that it can be included in the list of signed or
unsigned integer types by an implementation

Last sentence in §3.9.1/1: "a plain char object can take on
either the same values as a signed char or an unsigned char;
which one is implementation-defined."

Kai-Uwe Bux · Mar 14, 2010

James said:
Last sentence in §3.9.1/1: "a plain char object can take on
either the same values as a signed char or an unsigned char;
which one is implementation-defined."

I was thinking of the same, but I think the problem runs deeper than
defining the set of values. Plain char, signed char, and unsigned char are
three distinct types. Only signed char and unsigned char are listed as
signed and unsigned integral types. Plain char, like bool, is an integral
type that appears to be neither signed nor unsigned. This poses a problem
for interpreting integer conversions from and to plain char because even if
plain char cannot hold negative values and behaves like unsigned char, there
is formally no guarantee that, e.g., arithmetic with plain char is mod 2^N
or that integer values are converted mod 2^N. For that, one would need
further-reaching provisions.

Best

Kai-Uwe Bux
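That the three character types are distinct can be checked directly. A minimal sketch, assuming a C++0x compiler with <type_traits>; the overload set char_kind is hypothetical and exists only to show that overload resolution tells the types apart.

```cpp
#include <type_traits>

// Plain char is its own type, even though it shares its value range
// with one of the other two.
static_assert(!std::is_same<char, signed char>::value,
              "char and signed char are different types");
static_assert(!std::is_same<char, unsigned char>::value,
              "char and unsigned char are different types");
static_assert(std::is_integral<char>::value,
              "char is still an integral type");

// Three overloads can coexist because the types differ.
constexpr int char_kind(char)          { return 0; }
constexpr int char_kind(signed char)   { return 1; }
constexpr int char_kind(unsigned char) { return 2; }
```

A character literal such as 'a' has type char, so char_kind('a') selects the first overload regardless of whether plain char is signed on the platform.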

Johannes Schaub (litb) · Mar 14, 2010

Kai-Uwe Bux said:
I see, I was not specific. My remark only applies to the case where there
is an overflow (CHAR_MAX == INT_MAX). The overflow is not related to the
char type at all; and I don't think anybody claimed that. The additional
point is just that if an overflow happens in a constant expression, the
program is ill-formed.

Ah, I see now. Somehow I thought that if CHAR_MAX == INT_MAX, it must promote
to unsigned int. But I'm wrong, of course (CHAR_MAX is actually already of
the promoted type, which I missed too). If they are equal, only UCHAR_MAX
must be of unsigned int type, I think. Good catch on this, mate!
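The point about the macro types can be sketched as follows, assuming a C++0x compiler with decltype. The function name sum_has_promoted_type is made up for illustration.

```cpp
#include <climits>
#include <type_traits>

// The <climits> macros have the type that an object of the corresponding
// character type gets after integral promotion (written here as unary +).
static_assert(std::is_same<decltype(CHAR_MAX), decltype(+char())>::value,
              "CHAR_MAX already has the promoted type");
static_assert(std::is_same<decltype(UCHAR_MAX),
                           decltype(+static_cast<unsigned char>(0))>::value,
              "UCHAR_MAX already has the promoted type");

// Consequently CHAR_MAX + 1 never involves arithmetic in char itself.
constexpr bool sum_has_promoted_type() {
    return std::is_same<decltype(CHAR_MAX + 1), decltype(+char())>::value;
}
```

On typical platforms the promoted type is int in both cases; unsigned int for UCHAR_MAX would only arise where unsigned char is as wide as int.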

Johannes Schaub (litb) · Mar 14, 2010

Kai-Uwe Bux said:
I was thinking of the same, but I think the problem runs deeper than
defining the set of values. Plain char, signed char, and unsigned char are
three distinct types. Only signed char and unsigned char are listed
as signed and unsigned integral types. Plain char, like bool, is an
integral type that appears to be neither signed nor unsigned. This poses a
problem for interpreting integer conversions from and to plain char
because even if plain char cannot hold negative values and behaves like
unsigned char, there is formally no guarantee that, e.g., arithmetic with
plain char is mod 2^N or that integer values are converted mod 2^N. For
that, one would need further-reaching provisions.

Exactly, I was worried about that one. I haven't found a solution for it in
the spec.

Bo Persson · Mar 14, 2010

Johannes said:
Exactly, I was worried about that one. I haven't found a solution for
it in the spec.

If we look at the type traits (C++0x), is_signed and is_unsigned are
defined for all arithmetic types, which includes plain char. There the
expression char(-1) < char(0) defines its signedness.

Bo Persson
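A minimal sketch of what the traits report for plain char, assuming a C++0x library that provides <type_traits>. The helper plain_char_is_signed is hypothetical.

```cpp
#include <climits>
#include <type_traits>

// For arithmetic types, is_signed<T> gives the same result as T(-1) < T(0).
static_assert(std::is_signed<char>::value == (char(-1) < char(0)),
              "trait agrees with the comparison");

// The same answer can be read off <climits>.
static_assert(std::is_signed<char>::value == (CHAR_MIN < 0),
              "trait agrees with CHAR_MIN");

// Exactly one of the two traits holds for plain char.
static_assert(std::is_signed<char>::value != std::is_unsigned<char>::value,
              "char is classified as one or the other");

constexpr bool plain_char_is_signed() {
    return std::is_signed<char>::value;
}
```

This classifies plain char by its value range on the platform; it does not by itself place char in the core language's lists of signed or unsigned integer types.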

Johannes Schaub (litb) · Mar 14, 2010

Bo said:
If we look at the type traits (C++0x), is_signed and is_unsigned are
defined for all arithmetic types, which includes plain char. There the
expression char(-1) < char(0) defines its signedness.

Indeed, and bool is stated to be unsigned by that. I think that is_signed
doesn't refer to the core-language notion of "signed", though.

So if no one can find the bug in the spec, should someone file an issue
report on it?
