>Okay, but if we are to believe this hocus pocus, there is no way to
>avoid the hypothetical trapping. It makes no difference whether you
>use (unsigned char)-1, (unsigned char)~0, or UCHAR_MAX, since they all
>evaluate to the same thing. All are equally right or equally wrong.
You appear to have an incorrect "mental model" of how C works.
This is not surprising; I suspect most people do. (One needs to
have worked with those oddball ones'-complement machines to really
have a feel for this stuff.)
The language is not defined in terms of "what happens on a PDP-11",
nor "what happens on a VAX", nor even "what happens on an x86 or
other CPU produced within the last few years". Rather, it is defined
in terms of an "abstract machine". A C compiler writer must map
from "abstract machine" to "real machine" in some way.
The section quoted above (along with others) defines how the abstract
machine is to work. In the abstract machine, writing:
~0
means:
- make an int with the value 0
- now, flip all the bits
This process *can* give rise to a "trap representation" on a ones'
complement machine.
On the other hand, writing:
-1
means:
- make an int with the value 1
- now, negate it
This process *must* produce the (ordinary signed int) value -1.
On a ones' complement machine, this value in binary is a sequence
of 1 bits followed by a zero, e.g., 111111111111111110 -- 17 1 bits
and then a 0 -- on an 18-bit-int ones' complement CPU. (The CPU
I am using as a model here is the Univac 11xx, which has 9, 18,
and 36 bit integers and *does* use ones' complement.)
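To see the difference in code, here is a minimal sketch (the variable
names are mine; nothing here is Univac-specific) of the portable and
the theoretically risky ways to ask for an all-one-bits unsigned char:

	#include <limits.h>

	unsigned char ok1 = (unsigned char)-1;   /* -1 is a well-defined int
	                                            value, whatever its bits */
	unsigned char ok2 = UCHAR_MAX;           /* the same value, directly */
	unsigned char risky = (unsigned char)~0; /* ~0 is computed in type int
	                                            first; on a ones' complement
	                                            machine an all-one-bits int
	                                            is "negative zero", which may
	                                            be a trap representation */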
Converting any ordinary signed int to type "unsigned char" *must*
produce a valid unsigned char bit pattern and value -- these are
defined as more or less the same thing in the abstract machine --
and the process by which a negative signed int is transformed into
a (positive) unsigned char is defined mathematically. If the
signed int has value -1, the result must be UCHAR_MAX, which is
a valid bit pattern that consists of all-1-bits, e.g., 111111111
(9 ones) on a 9-bit-byte ones' complement CPU.
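If you want to watch that rule happen, a throwaway test program (mine,
not part of any standard) will print the same number twice on every
conforming implementation, whatever the CPU's notion of negative
numbers:

	#include <limits.h>
	#include <stdio.h>

	int main(void) {
	    /* the conversion works on *values*: -1 + (UCHAR_MAX + 1)
	       is UCHAR_MAX, regardless of the bit patterns involved */
	    unsigned char uc = (unsigned char)-1;
	    printf("%u %u\n", (unsigned)uc, (unsigned)UCHAR_MAX);
	    return 0;
	}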
At this point you are probably ready to hit your "post follow-up"
key or mouseable button or whatnot, saying: "What?! HOLD ON! JUST
A CONSARNED MINUTE! That all-1-bits pattern, you just said it's
a trap representation, now you say it's a valid value?!?" Yep.
How can it be both?
The answer lies in the *type* of the value. When the *type* of
the value is "signed int", an all-one-bits pattern is allowed to
be a "trap representation". When the type is "unsigned int", this
is *not* allowed. If the target CPU makes this a royal pain in
the butt, well, too bad for the C compiler implementor and/or user
-- "unsigned"s are going to be difficult and/or slow. But if you
*need* all-one-bits patterns, you -- as a C programmer -- should
use "unsigned" arithmetic, which is well-behaved and avoids all
these "trap representation" things. Moreover, given:
unsigned int ui = UINT_MAX;
the sequence:
ui++;
is *guaranteed* to cause ui to "roll over" to zero, without trapping
at runtime with an overflow error. With ordinary signed ints there
is no guarantee -- they may "roll over" (from positive to negative
or vice versa) or they may trap at runtime, whichever the implementor
finds easier or "better".
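Here is the contrast as a compilable sketch (the function is just a
container for the two statements):

	#include <limits.h>

	void demo(void) {
	    unsigned int ui = UINT_MAX;
	    int si = INT_MAX;

	    ui++;   /* well-defined: ui is now 0 */
	    si++;   /* no guarantee: may wrap to a negative value,
	               or may trap at runtime */
	}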
At the edges, the rules for C can get pretty complicated, but
there *are* simple answers for the common cases:
- If you need an ordinary signed integer and do not believe
you will overflow it, use an ordinary signed integer. (Use
"long" if your range is -2 billion to +2 billion; in C99, use
"long long" if your range is -9 quintillion to +9 quintillion.
Numerically these are 2147483647 and 9223372036854775807
respectively, in case you-the-reader are someone who uses
"milliard" and so means something different by "billion".
Ordinary "int" is only guaranteed to handle
[-32767..+32767], even though it often handles the 2 billion
number.)
- If you need modular "clock arithmetic", use an unsigned integer.
- If you need to do bitwise operations, use an unsigned integer.
- If you need exact, precisely defined behavior in *all* cases,
use an unsigned integer, synthesizing your own signed values
from these if desired. (In other words, build your own ones'
or two's complement or sign-and-magnitude system; a sketch of
the two's complement flavor follows this list.)
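For that last item, one possible shape of the trick (the width, mask,
and function name are my own choices for illustration): decode 16 bits
as two's complement using only unsigned arithmetic, which is exact
everywhere:

	/* interpret the low 16 bits of u as a two's complement number;
	   everything here is unsigned math or in-range longs, so there
	   is no implementation-defined or undefined behavior */
	long s16_value(unsigned long u)
	{
	    u &= 0xFFFFUL;                 /* keep exactly 16 bits */
	    if (u & 0x8000UL)              /* "sign" bit set? */
	        return (long)u - 0x10000L; /* e.g., 0xFFFF becomes -1 */
	    return (long)u;
	}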
Incidentally, one trick proposed (but not actually used on the
Univac) for unsigned integers vs. trap representations is, e.g.,
to have "unsigned int" be only 17 bits, while ordinary signed int
is 18 bits. Then UINT_MAX and INT_MAX are the same number (!),
and "unsigned"ness is achieved mainly by forcing the sign bit to
stay off. This appears to be allowed by the C standard. It is
therefore possible that the "simple rules" *still* do not achieve
the desired effect, depending on what that desired effect might
be.
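If that possibility matters to your code, the preprocessor can at
least detect it, since on such an implementation UINT_MAX == INT_MAX.
A guard like this (my own idiom, not a common one) refuses to compile
there:

	#include <limits.h>

	#if UINT_MAX == INT_MAX
	#error "unsigned int has no extra range over int on this system"
	#endif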