cout << -2147483648 ; strange output in g++

Victor Bazarov

Rolf said:
Victor said:
Rolf said:
Victor Bazarov wrote:

suresh wrote:
For this code snippet, I get the following output, which I am
unable to understand.
(2^31 = 2147483648)

cout << -2147483648 << endl;
cout << numeric_limits<int>::min() << ',' << numeric_limits<int>::max() << endl;

The outputs I get are:
2147483648
-2147483648,2147483647

with this compile time warning: this decimal constant is unsigned
only in ISO C90

I am surprised that the negative number is printed as positive.

Could you please explain this behaviour?

The literal '2147483648' does not fit in an int, therefore it is
promoted to 'long int'.

My compiler promoted it to unsigned long.

That's against the Standard. See [lex.icon]/2.

It says there that the behavior is undefined. Converting to unsigned
long seems perfectly valid for that.

Yes, it is. Sorry. I guess I ought to say that it's by no means
an explanation of what happens in the OP's program. "My compiler
does this" is not a valid argument of what _shall_ happen in all
possible cases.

V
 
James Kanze

The literal '2147483648' does not fit in an int, therefore it
is promoted to 'long int'. If your system's 'long' has the
same size as 'int' (like on Windows, for example), the
behaviour is undefined.

My copy of the standard says that the program is ill formed if a
decimal literal does not fit in a signed integral type. Ill
formed means that it requires a diagnostic. (It's undefined
behavior in C, however, since the C standard lacks the one
sentence that says that it is ill formed.)

Of course, the current draft of the standard (and C, since C90)
support longer integral types, as do most current compilers. In
which case, the type of the literal should be long long. Which
leads to an interesting anomaly: we all know that 2147483648 is
0x80000000 in hex. But if both the compiler and the library
support long long (and both will be required to real soon now),
then:

std::cout << -2147483648 << std::endl ;
// outputs "-2147483648"
std::cout << -0x80000000 << std::endl ;
// outputs "2147483648"

Formally, if his machine is a 32 bit 2's complement machine,
then if the compiler supports long long (or if long has more
than 32 bits), it should output -2147483648, as he expects, and
if it doesn't, he should get a diagnostic when he compiles.
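
The contrast between the decimal and hexadecimal spellings can be checked at compile time. This is a minimal sketch, assuming a 32-bit int and a compiler in C++11 mode or later (so that long long is in the type list); the helper names negated_decimal and negated_hex are hypothetical and not from the thread.

```cpp
#include <type_traits>

// Assumption: int is 32 bits, so neither literal below fits in int.

// Decimal literal: every candidate type (int, long, long long) is signed.
static_assert(std::is_signed<decltype(2147483648)>::value,
              "decimal literal picks a signed type");

// Hexadecimal literal: unsigned int is tried before long.
static_assert(std::is_same<decltype(0x80000000), unsigned int>::value,
              "hex literal picks unsigned int");

// Hypothetical helpers (names are illustrative only).
// Negation happens in a signed type wider than int.
long long negated_decimal() { return -2147483648; }

// Negation happens in unsigned int and wraps modulo 2^32.
long long negated_hex() { return -0x80000000; }
```

Under those assumptions, negated_decimal() yields -2147483648 and negated_hex() yields 2147483648, matching the two outputs described above.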

Practically: C leaves this behavior undefined, and historically,
a lot of compilers have treated a literal which didn't fit into
a long, but did fit into an unsigned long, as an unsigned. I
wouldn't be too surprised if many C++ compilers did this as well, even if
the standard doesn't allow it. And of course, this would result
in the behavior he is seeing.
Try adding the suffix 'U' to the literal.

That certainly won't help in getting the expression be treated
as signed, which is apparently what he wants.
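
A small sketch of why the 'U' suffix does not help, assuming a 32-bit unsigned int and C++11 or later; negated_with_u_suffix is a hypothetical name used only for illustration.

```cpp
#include <type_traits>

// With the U suffix the literal is unsigned, and unary minus keeps it unsigned.
static_assert(std::is_unsigned<decltype(-2147483648U)>::value,
              "still unsigned after unary minus");

// Hypothetical helper: value of the negated unsigned literal.
// Assumption: unsigned int is 32 bits, so the result is 2^32 - 2^31 = 2^31.
unsigned long long negated_with_u_suffix() { return -2147483648U; }
```

The result is the positive value 2147483648, so the suffix reproduces the surprising output rather than avoiding it.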
 
James Kanze

My compiler promoted it to unsigned long.

Presumably because it won't fit in a signed long either. That's
not legal C++ (the standard requires a diagnostic), but it does
correspond to a lot of traditional C implementations.

Try compiling in the strictest standard-conformant mode.
 
James Kanze

As I understand, the range of 32 bit signed number is -2147483648 to
2147483647. So it should work correctly.

You're correct as to the range. The problem here is that
-2147483648 is two tokens, "-" and "2147483648". The second is
a decimal integral literal. Which means (according to C++98),
that if it fits in an int, it has type int, otherwise if it fits
in a long, it has type long, otherwise, the program is ill
formed.

Historically, in older C compilers, the sequence of types was
int, long, unsigned long. The C standard leaves the behavior
undefined if the value doesn't fit, so that a C compiler can
legally implement the pre-standard behavior. I suspect that
your compiler is implementing the classical, pre-standard C
rules, which is permissible in C, but not in C++.

And it is a recognized problem that there is no way of entering
the smallest possible long as a decimal constant in the
language.
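
The usual workaround is to build the minimum from a literal that does fit. The sketch below shows it for int rather than long, assuming a 32-bit two's complement int and C++11 or later; smallest_int is a hypothetical name. Many implementations spell INT_MIN in a similar way, but that is an observation about common headers, not something stated in this thread.

```cpp
#include <limits>

// Assumption: int is 32 bits, two's complement.
// 2147483647 fits in int, so every subexpression here has type int
// and no wider or unsigned type is ever involved.
constexpr int smallest_int() { return -2147483647 - 1; }

static_assert(smallest_int() == std::numeric_limits<int>::min(),
              "matches the library's minimum");
```

Using std::numeric_limits<int>::min() directly avoids the literal altogether and does not depend on the width of int.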
Again, the code snippet,
signed int x = -2147483648;
cout << x << endl;
works correctly though the warning "this decimal constant is
unsigned only in ISO C90" is produced.

This forces a conversion to signed int. Supposing for an
instance that long has more than 32 bits, or that the compiler
supports long long, then the arithmetic in the expression works,
and the results of the arithmetic on the two tokens does result
in a value that fits in a signed int. Supposing that your
compiler (illegally) treats your expression as an unsigned long
(with 32 bit longs), then the resulting value of the expression
"-2147483648" is an unsigned long with a value that doesn't fit
in a signed int. According to the standard, the results of the
conversion are implementation defined, but on most
implementations, all that happens is that the bit pattern is
stuffed into the target type. Which "happens" to give the
correct results in this case, on a 32 bit 2's complement
machine.
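
The second scenario can be modelled explicitly. This is a sketch only: std::uint32_t stands in for a 32-bit unsigned long, and stuffed_into_signed is a hypothetical name. The final conversion is implementation-defined before C++20 and defined as modular from C++20 on.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical model of the scenario described above.
std::int32_t stuffed_into_signed() {
    std::uint32_t literal = 2147483648u;  // the literal, typed as unsigned
    std::uint32_t negated = -literal;     // unsigned negation: 2^32 - 2^31 = 2^31
    // Out-of-range conversion to a signed type: on common two's complement
    // implementations the bit pattern is simply reinterpreted.
    return static_cast<std::int32_t>(negated);
}
```

On such implementations the function returns the minimum 32-bit value, which is why the assignment to a signed int appears to work.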

Just out of curiosity, what does:

std::cout << typeid( -2147483648 ).name() << std::endl ;

output on your machine?
 
James Kanze

suresh wrote:

People keep saying it, but...
Undefined.

There is no undefined behavior in any of the code I've seen so
far, at least according to the C++ standard. There is a very
high probability of a compiler "error", however, or rather, a
case of the compiler intentionally being non-compliant, for
various historical reasons.
 
Victor Bazarov

James said:
My copy of the standard says that the program is ill formed if a
decimal literal does not fit in a signed integral type. Ill
formed means that it requires a diagnostic. (It's undefined
behavior in C, however, since the C standard lacks the one
sentence that says that it is ill formed.)

You're correct. I didn't read the next paragraph, [lex.icon]/3,
which states that the code is ill-formed if the literal cannot
be represented.

It would seem that the OP's compiler did something similar to the
compiler used by Rolf Magnus, i.e. promote the literal that it
couldn't fit into 'long' to 'unsigned long', which then gave the
expected result when unary minus was applied to it.

V
 
James Kanze

[...]
Yes, but the '-' is not part of the integer literal. It's an operator that
is applied to it. Now the C++ standard says:
"The type of an integer literal depends on its form, value, and suffix. If
it is decimal and has no suffix, it has the first of these types in which
its value can be represented: int, long int; if the value cannot be
represented as a long int, the behavior is undefined."

Where does it say that? I've got both the official C++ 98 and
the latest draft on line, and both clearly say that "A program
is ill-formed if one of its translation units contains an
integer literal that cannot be represented by any of the allowed
types". (In C99, it is undefined, because the standard doesn't
say what the behavior should be. There's no explicit statement
that the behavior is undefined there either, that I can find.)
 
Victor Bazarov

James said:
[...]
Yes, but the '-' is not part of the integer literal. It's an
operator that is applied to it. Now the C++ standard says:
"The type of an integer literal depends on its form, value, and
suffix. If it is decimal and has no suffix, it has the first of
these types in which its value can be represented: int, long int; if
the value cannot be represented as a long int, the behavior is
undefined."

Where does it say that? I've got both the official C++ 98

...which has been superseded by C++ 03 (although that part didn't
change, most likely; I don't have the outdated document handy to
check).
and
the latest draft on line, and both clearly say that "A program
is ill-formed if one of its translation units contains an
integer literal that cannot be represented by any of the allowed
types".

See the previous paragraph.
(In C99, it is undefined, because the standard doesn't
say what the behavior should be. There's no explicit statement
that the behavior is undefined there either, that I can find.)

V
 
Rolf Magnus

James said:
[...]
Yes, but the '-' is not part of the integer literal. It's an operator
that is applied to it. Now the C++ standard says:
"The type of an integer literal depends on its form, value, and suffix.
If it is decimal and has no suffix, it has the first of these types in
which its value can be represented: int, long int; if the value cannot be
represented as a long int, the behavior is undefined."

Where does it say that?

I've quoted the above from C++ 98, 2.13.1/2.
I've got both the official C++ 98 and the latest draft on line,
and both clearly say that "A program is ill-formed if one of its
translation units contains an integer literal that cannot be represented
by any of the allowed types".

What exactly does it mean by "the allowed types"?
 
Erik Wikström

James said:
[...]
Yes, but the '-' is not part of the integer literal. It's an operator
that is applied to it. Now the C++ standard says:
"The type of an integer literal depends on its form, value, and suffix.
If it is decimal and has no suffix, it has the first of these types in
which its value can be represented: int, long int; if the value cannot be
represented as a long int, the behavior is undefined."

Where does it say that?

I've quoted the above from C++ 98, 2.13.1/2.
I've got both the official C++ 98 and the latest draft on line,
and both clearly say that "A program is ill-formed if one of its
translation units contains an integer literal that cannot be represented
by any of the allowed types".

What exactly does it mean by "the allowed types"?

That would be int and long int, as specified in 2.13.1/2 (long long will
probably be added in the next version of the standard).
 
James Kanze

James said:
[...]
Yes, but the '-' is not part of the integer literal. It's
an operator that is applied to it. Now the C++ standard
says:
"The type of an integer literal depends on its form, value,
and suffix. If it is decimal and has no suffix, it has the
first of these types in which its value can be represented:
int, long int; if the value cannot be represented as a
long int, the behavior is undefined."
Where does it say that?
I've quoted the above from C++ 98, 2.13.1/2.

Interesting. The standard contradicts itself in two successive
paragraphs. The latest draft fixes this, however, and there is
no undefined behavior.

I took the occasion last night to look up the text in some of my
older documents (K&R, first edition, and the ARM). The history
surrounding this seems somewhat curious, to put it mildly:

-- In K&R C, an integral literal is either int (if it fits) or
long (if it doesn't fit in an int). This holds for all
integral literals, regardless of base (but one could append
an L or an l to force long). K&R doesn't say what happens
if it doesn't fit in a long, but unsigned long isn't an
option, since unsigned didn't exist yet.

-- The ARM (1988 or 1989, I think---I didn't think to check in
my original copy of TC++PL, ed. 1, which would definitely be
pre-ARM) and C90 give the list of types as int, long,
unsigned long; the type is, again, the first one that fits.
Neither says what happens if the value won't fit in an
unsigned long; at least in C90, if it isn't otherwise
specified (the case here), it is undefined behavior.

C90 and the ARM also introduce the U/u suffix, to force
unsigned (in addition to continuing to support L/l). C90,
at least (and I think the ARM as well) also uses a different
list (int, unsigned int, long, unsigned long) for octal and
hexadecimal literals (which means that 2147483648 and
0x80000000 will behave differently on a machine with 32 bit
ints but 64 bit longs). This distinction depending on the
base is maintained in all later documents.

-- The C++98 and the C++03 standards give the list for decimal
literals as simply int, long, without the unsigned long.
Both standards contradict themselves, saying that it is
undefined behavior if the value doesn't fit where they
specify the list, but stating that the program is ill formed
if it doesn't fit in the next paragraph.

-- C99 adds long long and extended integer types, and also
drops unsigned types from the list for decimal literals,
resulting in int, long, long long as the list. It goes on
to say:

If an integer literal cannot be represented by any
type in its list and an extended integer type can
represent its value, it may have that extended
integer type. If all of the types in the list for
the literal are signed, the extended integer type
shall be signed. If all of the types in the list for
the literal are unsigned, the extended integer type
shall be unsigned. If the list contains both signed
and unsigned types, the extended integer type may be
signed or unsigned.

Since the standard doesn't say what happens if the given
value doesn't fit in any of the available types, it is
undefined behavior.

-- The latest draft of the standard copies the C99 text
verbatim, but adds the sentence "A program is ill-formed if
one of its translation units contains an integer literal
that cannot be represented by any of the allowed types." to
the end of the preceding paragraph. (In C++98 and C++03,
that sentence is in a paragraph of its own.)

The version of g++ that I have readily available here (4.1.)
implements long long, but still uses the type list int, long,
unsigned long (ignoring long long) from C90/ARM C++ (although it
supports the LL suffix). It does give a warning, at least with
the options I usually use, and since it documents that anything
the compiler outputs is a diagnostic in the sense of the
standard, it is conformant (at least if you specify -pedantic to
turn off long long); according to the standard, what happens
after the compiler has issued a diagnostic is undefined
behavior.
What exactly does it mean by "the allowed types"?

Those that are listed as possible types for the value. The list
of allowed types depends on the base and the suffixes (u/U and
l/L).
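
How the base and the suffix select the type can be seen with a few compile-time checks. This is a sketch assuming C++11 or later and, for the two hexadecimal cases, a 32-bit int; the constant names are hypothetical.

```cpp
#include <type_traits>

// For a small value such as 1, the suffix alone decides the type.
constexpr bool plain_is_int =
    std::is_same<decltype(1), int>::value;
constexpr bool u_is_unsigned_int =
    std::is_same<decltype(1u), unsigned int>::value;
constexpr bool l_is_long =
    std::is_same<decltype(1L), long>::value;
constexpr bool ul_is_unsigned_long =
    std::is_same<decltype(1uL), unsigned long>::value;

// For hexadecimal (and octal) literals the unsigned types are candidates too.
// Assumption: int is 32 bits.
constexpr bool small_hex_is_int =
    std::is_same<decltype(0x7FFFFFFF), int>::value;
constexpr bool large_hex_is_unsigned_int =
    std::is_same<decltype(0xFFFFFFFF), unsigned int>::value;
```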
 
suresh

I just get output "m" when this code segment is executed. what does
this mean?
suresh
 
Rolf Magnus

Please don't top-post, and don't quote signatures.
I just get output "m" when this code segment is executed. what does
this mean?

In g++, it stands for "unsigned long".
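
With g++ (and Clang) the mangled name can be turned into readable text with abi::__cxa_demangle from <cxxabi.h>. The sketch below is specific to those compilers, and readable_name is a hypothetical helper name.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC and Clang only)
#include <cstdlib>    // std::free
#include <string>
#include <typeinfo>

// Hypothetical helper: turn a mangled type name such as "m" into readable text.
std::string readable_name(const std::type_info& info) {
    int status = 0;
    char* text = abi::__cxa_demangle(info.name(), nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) {
        return info.name();  // fall back to the raw name if demangling fails
    }
    std::string result(text);
    std::free(text);  // the returned buffer is allocated with malloc
    return result;
}
```

Calling readable_name(typeid(-2147483648)) then reports whichever type the compiler actually chose for the expression.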
 
Lars Uffmann

Hiya James!


Just thought I'd mention it, since Rolf pointed out Suresh quoted your
signature: Your sig is missing a blankspace after the initial two dashes
- newsreaders won't recognize this as a signature so they could
auto-remove it upon responding.

Best Regards,

Lars
 
Ioannis Gyftos

Hiya James!
[...]
Just thought I'd mention it, since Rolf pointed out Suresh quoted your
signature: Your sig is missing a blankspace after the initial two dashes
- newsreaders won't recognize this as a signature so they could
auto-remove it upon responding.

I'm pretty sure he doesn't miss it :)
 
Erik Wikström

Hiya James!


Just thought I'd mention it, since Rolf pointed out Suresh quoted your
signature: Your sig is missing a blankspace after the initial two dashes
- newsreaders won't recognize this as a signature so they could
auto-remove it upon responding.

Best Regards,

Lars

And you are missing both the dashes and the blankspace :)
 
Ron Natalie

suresh said:
Hi Victor,

Could you elaborate a little? The max value of int on my
Linux machine is 2147483648. Then why did you say that the literal
'2147483648' does not fit in an int?
Unless you have some screwball 32-bit signed representation,
it DOES NOT.

2**31 - 1 or 2147483647 is the largest positive int value.
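
The range claim is easy to check. This is a minimal sketch assuming a 32-bit int and C++11 or later; fits_in_int is a hypothetical helper name.

```cpp
#include <limits>

// Hypothetical helper: does a value lie within the range of int?
constexpr bool fits_in_int(long long value) {
    return value >= std::numeric_limits<int>::min() &&
           value <= std::numeric_limits<int>::max();
}
```

With a 32-bit int, fits_in_int(2147483647) is true and fits_in_int(2147483648) is false, while the negative value -2147483648 does fit.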
 
Ron Natalie

Rolf Magnus wrote:
My compiler promoted it to unsigned long.
Your compiler is bogus then.

2.13 Literals

A decimal literal with no suffix has the type of the first of these it
fits in: int, long int. If it doesn't fit in either, then the
behavior is undefined.

If it were octal or hexadecimal (i.e., begin with 0 or 0x), then it
would be typed as the first of int, unsigned int, long, unsigned long
that it fits in.

"promoted" is probably the wrong term here. There's no promotion,
the type is defined by the size.
 
Ron Natalie

James said:
Presumably because it won't fit in a signed long either. That's
not legal C++ (the standard requires a diagnostic), but it does
correspond to a lot of traditional C implementations.

Try compiling in the strictest standard-conformant mode.
No diagnostic is required. The behavior is UNDEFINED.

I doubt seriously his compiler makes it unsigned. I've never seen
one that does it. Most do, as the OP's does, end up with an int
or long int number that doesn't actually represent the literal
expressed.
 
