'Unsigned decimal'

Joe Pfeiffer

Les Cargill said:
Keith said:
Les Cargill said:
Keith Thompson wrote:
int a = -2147483648;
long long int b = 2147483648;

And gcc gives me this warning for each:

"warning: this decimal constant is unsigned only in ISO C90"

Thanks for the replies. I've replaced instances of -2147483648 with
0x80000000 where an int is expected. And using an LL suffix for any long
long ints.

That eliminates the warning (because 0x80000000 is of type
unsigned int in both C90 and C99), but you're still doing an
implementation-defined conversion from unsigned to signed.
It's *probably* safe to assume that 0x80000000 converts to
(int)-2147483648; you'll have to decide whether that's good enough.

Using 2147483648LL should eliminate both the warning and the
implementation-definedness of the conversion -- if you can assume
support for the LL suffix.

In my own defense :), I tried that and it failed.
C:\c\usenet>gcc -v [snip]
gcc version 3.4.5 (mingw-vista special r3)

How did it fail?

For me, this program:

#include <stdio.h>
int main(void) {
    int a = -2147483648LL;
    long long int b = 2147483648LL;
    printf("a = %d\n", a);
    printf("b = %lld\n", b);
    return 0;
}

compiles without warnings and produces this output:

a = -2147483648
b = 2147483648

It warning-ed.

C:\c\usenet>gcc -o long.exe long.c
long.c:2: warning: this decimal constant is unsigned only in ISO C

C:\c\usenet>cat long.c

signed long int a = -2147483648L;

The code Keith exhibited had a LL on this constant (see above)
 
Keith Thompson

Les Cargill said:
It warning-ed.

C:\c\usenet>gcc -o long.exe long.c
long.c:2: warning: this decimal constant is unsigned only in ISO C

C:\c\usenet>cat long.c

signed long int a = -2147483648L;
signed long long int b = 2147483648LL;

Well there's your problem. You used a single 'L' suffix, and
2147483648 is outside the range of both signed int and signed long
(given that both are 32 bits). In C90, that causes it
to be of type unsigned long; in C99 and later, it's of type signed
long long.

It's not surprising that different code would behave differently.
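
For what it's worth, here is a minimal sketch of one way to get the value -2,147,483,648 without ever writing the out-of-range constant 2147483648, so the C90/C99 type difference never comes into play. The function names are hypothetical, and the sketch assumes long can represent -2147483648 (true on two's-complement systems); the assert checks that assumption at run time.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: every constant below fits in long, so each one
   has type long in both C90 and C99 and no "unsigned only in ISO C90"
   situation can arise. */
long min32_by_subtraction(void)
{
    /* Assumption: long goes one step below -2147483647
       (two's complement). The standard only guarantees
       LONG_MIN <= -2147483647. */
    assert(LONG_MIN < -2147483647L);
    return -2147483647L - 1L;
}

/* Small self-check that can be called from main(). */
void check_min32(void)
{
    long v = min32_by_subtraction();
    assert(v < 0);
    assert(v + 2147483647L == -1L);
}
```

This is the same trick many <limits.h> implementations use when defining INT_MIN or LONG_MIN.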

[...]
There are two strings embedded in the source code - -2147483648
and 2147483648 - which map to the same hex value, for the right toolsets
and declarations. As you note, -2147483648 is actually an expression,
but it is still, SFAIK, a constant value.

There is no such thing as a "hex value", at least not in this
context. Hexadecimal, like decimal, is a human-readable way of
*representing* values.

Yes, the C expressions -2147483648 and 2147483648 can evaluate to
the same value in certain circumstances. In C90, with 32-bit int and
long, 2147483648 is of type unsigned long and has the *mathematical*
value 2,147,483,648. (I add the commas to emphasize that I'm
talking about a number, not a C value -- but "2,147,483,648" is
still just a human-readable representation of that value. I'd hold
up 2,147,483,648 fingers if I could.) Negating that value (i.e.,
applying C's unary "-" operator to it) happens to yield the same
value. The same is true for `0` and `-0`, but for different reasons.
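
A rough sketch of that wraparound, assuming a C99 compiler that provides uint32_t (the thread's C90 case used a 32-bit unsigned long instead). The names negate_u32 and check_negation are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned negation is arithmetic modulo 2^32 once the result is
   converted back to uint32_t, so it is fully defined behavior. */
uint32_t negate_u32(uint32_t x)
{
    return (uint32_t)(0u - x);
}

/* Small self-check that can be called from main(). */
void check_negation(void)
{
    /* 2^32 - 2^31 == 2^31: negation maps this value to itself. */
    assert(negate_u32(0x80000000u) == 0x80000000u);
    /* 0 is the only other fixed point, for the obvious reason. */
    assert(negate_u32(0u) == 0u);
    /* Every other value changes, e.g. 1 becomes 2^32 - 1. */
    assert(negate_u32(1u) == 0xFFFFFFFFu);
}
```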

But for a given version of the C language, and a given set of ranges of
the basic types, there is no ambiguity in the meaning of any of the C
expressions `-2147483648`, `2147483648`, or `0x80000000`.

[...]

Agreed -- except that hex notation doesn't solve as many problems
as you might expect. It's easy to think of hexadecimal notation as
denoting an actual bit-level representation, but in fact it merely
represents C values of C types, and the rules are as complex as
(and different from) the rules for decimal notation.
 
Les Cargill

Joe said:
Les Cargill said:
Keith said:
Keith Thompson wrote:
int a = -2147483648;
long long int b = 2147483648;

And gcc gives me this warning for each:

"warning: this decimal constant is unsigned only in ISO C90"

Thanks for the replies. I've replaced instances of -2147483648 with
0x80000000 where an int is expected. And using an LL suffix for any long
long ints.

That eliminates the warning (because 0x80000000 is of type
unsigned int in both C90 and C99), but you're still doing an
implementation-defined conversion from unsigned to signed.
It's *probably* safe to assume that 0x80000000 converts to
(int)-2147483648; you'll have to decide whether that's good enough.

Using 2147483648LL should eliminate both the warning and the
implementation-definedness of the conversion -- if you can assume
support for the LL suffix.

In my own defense :), I tried that and it failed.
C:\c\usenet>gcc -v
[snip]
gcc version 3.4.5 (mingw-vista special r3)

How did it fail?

For me, this program:

#include <stdio.h>
int main(void) {
    int a = -2147483648LL;
    long long int b = 2147483648LL;
    printf("a = %d\n", a);
    printf("b = %lld\n", b);
    return 0;
}

compiles without warnings and produces this output:

a = -2147483648
b = 2147483648

It warning-ed.

C:\c\usenet>gcc -o long.exe long.c
long.c:2: warning: this decimal constant is unsigned only in ISO C

C:\c\usenet>cat long.c

signed long int a = -2147483648L;

The code Keith exhibited had a LL on this constant (see above)


You're absolutely right! I filtered that out, apparently.
 
Les Cargill

Keith said:
Well there's your problem. You used a single 'L' suffix, and
2147483648 is outside the range of both signed int and signed long
(given that both are 32 bits). In C90, that causes it
to be of type unsigned long; in C99 and later, it's of type signed
long long.

It's not surprising that different code would behave differently.

True! I mentally erased the L because we were assigning to an int
on a 32 bit machine, which missed the point considerably... :)
[...]
There are two strings embedded in the source code - -2147483648
and 2147483648 - which map to the same hex value, for the right toolsets
and declarations. As you note, -2147483648 is actually an expression,
but it is still, SFAIK, a constant value.

There is no such thing as a "hex value", at least not in this
context. Hexadecimal, like decimal, is a human-readable way of
*representing* values.

Yes, the C expressions -2147483648 and 2147483648 can evaluate to
the same value in certain circumstances. In C90, with 32-bit int and
long, 2147483648 is of type unsigned long and has the *mathematical*
value 2,147,483,648. (I add the commas to emphasize that I'm
talking about a number, not a C value -- but "2,147,483,648" is
still just a human-readable representation of that value. I'd hold
up 2,147,483,648 fingers if I could.) Negating that value (i.e.,
applying C's unary "-" operator to it) happens to yield the same
value. The same is true for `0` and `-0`, but for different reasons.

But for a given version of the C language, and a given set of ranges of
the basic types, there is no ambiguity in the meaning of any of the C
expressions `-2147483648`, `2147483648`, or `0x80000000`.

I am not sure I follow your reasoning here, but that's okay. I mean
by ambiguity "is part of a non-one-to-one-and-onto map from strings
in source code to binary values in memory."

If we use those three r-values - -2147483648, 2147483648 and 0x80000000
in 'C' source code, then use objdump to view how code was generated for
them, we'd see the same representation ( assuming certain compilers,
machine word sizes, yadda yadda).

If I may, you seem to be holding that the mapping is much more
inscrutable than I think it is. It was pretty clear to me
that the OP was on a 32 bit machine that worked like GNU
on this subject.
[...]

Agreed -- except that hex notation doesn't solve as many problems
as you might expect. It's easy to think of hexadecimal notation as
denoting an actual bit-level representation, but in fact it merely
represents C values of C types, and the rules are as complex as
(and different from) the rules for decimal notation.

I am thinking that hex notation fully specifies values, while decimal
has one more degree of freedom. There is still the problem of sign
extension, but since we happen to be "int is 32 bits, so is long",
for *that* bit pattern, it washes out.

Regardless of that, thanks for the comments. It's easy to get into
a rut on these things and use shortcuts that are not always correct.
After a couple of decades of 32 bit targets, you get used to certain
things...
 
Keith Thompson

Les Cargill said:
I am not sure I follow your reasoning here, but that's okay. I mean
by ambiguity "is part of a non-one-to-one-and-onto map from strings
in source code to binary values in memory."

But expressions map to binary values in memory only when they're *used*,
and the way they're used can matter.

If we use those three r-values - -2147483648, 2147483648 and 0x80000000
in 'C' source code, then use objdump to view how code was generated for
them, we'd see the same representation ( assuming certain compilers,
machine word sizes, yadda yadda).

If I may, you seem to be holding that the mapping is much more
inscrutable than I think it is. It was pretty clear to me
that the OP was on a 32 bit machine that worked like GNU
on this subject.

I've been assuming (explicitly enough, I hope) the same thing: 32-bit
int and long, 2's-complement, and either 64-bit long long or no long
long at all for C90.

I'll use hexadecimal in square brackets to denote bits in memory, to
avoid confusion with C syntax.

Even a simple constant like 7 can, depending on the context in
which it's used, result in a number of different in-memory bit
patterns: [07], [0007], [00000007], [0000000000000007]. Given the
assumptions we're making, it's always of type int and therefore
always [00000007], but any of the others can show up (with trivial
optimization) if you use 7 to initialize an object of type char,
short, int, or long long, respectively.
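
To make that concrete, here is a small sketch that inspects the stored bytes of objects initialized with 7. It assumes a C99 compiler (for long long) and types without padding bits; nonzero_bytes and check_seven are hypothetical names. Counting nonzero bytes keeps the check independent of byte order.

```c
#include <assert.h>
#include <stddef.h>

/* Counts how many bytes of an object's representation are nonzero. */
size_t nonzero_bytes(const void *obj, size_t size)
{
    const unsigned char *p = (const unsigned char *)obj;
    size_t i, count = 0;
    for (i = 0; i < size; i++) {
        if (p[i] != 0) {
            count++;
        }
    }
    return count;
}

/* Small self-check that can be called from main(). */
void check_seven(void)
{
    char c = 7;
    short s = 7;
    int i = 7;
    long long ll = 7;

    /* The same constant 7 ends up in objects of different widths;
       in each one exactly one byte is nonzero (assuming no padding
       bits), the rest is zero fill. */
    assert(nonzero_bytes(&c, sizeof c) == 1);
    assert(nonzero_bytes(&s, sizeof s) == 1);
    assert(nonzero_bytes(&i, sizeof i) == 1);
    assert(nonzero_bytes(&ll, sizeof ll) == 1);
    assert(sizeof c == 1);
}
```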
[...]
The point I was trying to make is that hex notation is a strategy to
use when the other conventions break down. But as you note,
"you gotta know the territory."

Agreed -- except that hex notation doesn't solve as many problems
as you might expect. It's easy to think of hexadecimal notation as
denoting an actual bit-level representation, but in fact it merely
represents C values of C types, and the rules are as complex as
(and different from) the rules for decimal notation.

I am thinking that hex notation fully specifies values, while decimal
has one more degree of freedom. There is still the problem of sign
extension, but since we happen to be "int is 32 bits, so is long",
for *that* bit pattern, it washes out.

I suggest that this is where you're going a bit astray.
C hexadecimal constants are no more or less ambiguous than C
decimal constants. They do map more directly to bit patterns,
but a hex literal can be of *more* different types than a decimal
literal can (unsuffixed decimal literals, as of C99, are always of
some signed type). But now that I think about it, the rules for
hex literals can lead to fewer surprising results.

An example: This program:

#include <stdio.h>
int main(void) {
    printf("sizeof 0x80000000 = %d\n", (int)sizeof 0x80000000);
    printf("sizeof 2147483647 = %d\n", (int)sizeof 2147483647);
    printf("sizeof 2147483648 = %d\n", (int)sizeof 2147483648);
    return 0;
}

produces this output in C90:

sizeof 0x80000000 = 4
sizeof 2147483647 = 4
sizeof 2147483648 = 4

and this output in C99:

sizeof 0x80000000 = 4
sizeof 2147483647 = 4
sizeof 2147483648 = 8

Hex constants are more likely to behave the way most people expect,
resulting in a stored representation whose bits correspond directly to
the hex digits. On the other hand, if you're trying to initialize an
int object to the value -2,147,483,648 with the representation
[80000000], then this:

int x = 0x80000000;

*probably* works, but if you think of a hexadecimal constant as a
direct portrayal of a bit pattern (rather than of a *value* of a
specific *type*), then it's easy to miss the fact that the implicit
unsigned-to-signed conversion has an implementation-defined result.
(And in case anyone was wondering, adding a cast to int doesn't help.)
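
If the implementation-defined conversion is a concern, one possible workaround is to compute the signed value arithmetically, so that no out-of-range unsigned value is ever converted to a signed type. This is only a sketch: to_signed32 and check_to_signed32 are hypothetical names, and it assumes long can represent -2147483648 (two's complement).

```c
#include <assert.h>

/* Interprets the low 32 bits of 'bits' as a two's-complement number
   and returns its value as a long. */
long to_signed32(unsigned long bits)
{
    bits &= 0xFFFFFFFFUL;
    if (bits <= 0x7FFFFFFFUL) {
        /* In range for long: the conversion is value-preserving. */
        return (long)bits;
    }
    /* bits is in [0x80000000, 0xFFFFFFFF]; its signed value is
       bits - 2^32, computed here as -(0xFFFFFFFF - bits) - 1.
       The operand of the cast is at most 0x7FFFFFFF, so it fits. */
    return -(long)(0xFFFFFFFFUL - bits) - 1L;
}

/* Small self-check that can be called from main(). */
void check_to_signed32(void)
{
    assert(to_signed32(0x00000000UL) == 0L);
    assert(to_signed32(0x7FFFFFFFUL) == 2147483647L);
    assert(to_signed32(0x80000000UL) == -2147483647L - 1L);
    assert(to_signed32(0xFFFFFFFFUL) == -1L);
}
```

The result type is long rather than int because long is guaranteed to be at least 32 bits wide, while int is not.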

Regardless of that, thanks for the comments. It's easy to get into
a rut on these things and use shortcuts that are not always correct.
After a couple of decades of 32 bit targets, you get used to certain
things...

Indeed. "All the world's a VAX^H^H^H x86."
 
