unsigned char

Back9

Hi,

Can someone explain the following statement to me?

unsigned char c = -1;
printf("\n c is %d", c);

c is 255

Seems to be related to 2's complement above.
It's been a long time since I learned the C language.

tia
 
Seebs

Can someone explain the following statement to me?
unsigned char c = -1;

For unsigned types, the range is always 0 to <type>_MAX. On your system,
the chances are UCHAR_MAX is 255. The way numbers outside that range are
converted is by adding/subtracting multiples of (UCHAR_MAX+1) until you get
something in range.

So...
-1 => (-1) + (UCHAR_MAX + 1) => UCHAR_MAX
printf("\n c is %d", c);

Here, c is promoted to int, which doesn't change it.

However, you did this wrong. You need a newline AFTER the output
on some systems, and there is no obvious reason to print a newline
BEFORE the output.
Seems to be related to 2's complement above.

Not in the least. It does not matter what representation the system uses;
-1, converted to an unsigned type, is TYPE_MAX.

-s
 
Keith Thompson

Seebs said:
Can someone explain the following statement to me?
unsigned char c = -1;

For unsigned types, the range is always 0 to <type>_MAX. On your system,
the chances are UCHAR_MAX is 255. The way numbers outside that range are
converted is by adding/subtracting multiples of (UCHAR_MAX+1) until you get
something in range.

So...
-1 => (-1) + (UCHAR_MAX + 1) => UCHAR_MAX
[...]
Seems to be related to 2's complement above.

Not in the least. It does not matter what representation the system uses;
-1, converted to an unsigned type, is TYPE_MAX.

Actually it *is* related to 2's complement in the least. 8-)}

Conversion of a signed or unsigned integer to an unsigned type, where
the value isn't already within the target type's range, is defined as
follows (C99 6.3.1.3p2):

Otherwise, if the new type is unsigned, the value is converted by
repeatedly adding or subtracting one more than the maximum value that
can be represented in the new type until the value is in the range of
the new type.

with a footnote:

The rules describe arithmetic on the mathematical value, not the
value of a given type of expression.

Note that the conversion is described in terms of values, not
representation.

Of course the implementation doesn't actually have to do these
repeated additions or subtractions. On a system that uses
2's-complement, converting from a signed type to an unsigned type
either copies the low-order bits, copies the entire representation,
or sign-extends the representation, depending on the relative sizes
of the source and target. This is a very simple and fast operation;
the compiler doesn't even need to take signedness into account.
(For sign-and-magnitude or ones'-complement representations, the
compiler has to do a little more work, but such systems are rare.)

The standard very carefully doesn't refer to representation when it
discusses the conversion rules, but the rules were almost certainly
originally *motivated* by the natural way to do conversions on a
2's-complement system.
 
Peter Nilsson

Keith Thompson said:
Seebs said:
unsigned char c  = -1;

For unsigned types, the range is always 0 to <type>_MAX.
On your system, the chances are UCHAR_MAX is 255.  The way
numbers outside that range are converted is by adding/
subtracting multiples of (UCHAR_MAX+1) until you get
something in range.

So...
-1 => (-1) + (UCHAR_MAX + 1) => UCHAR_MAX
[...]
Seems to be related to 2's complement above.
Not in the least.  It does not matter what representation
the system uses; -1, converted to an unsigned type, is
TYPE_MAX.

Actually it *is* related to 2's complement in the least.
 8-)}

No, it really isn't and your citations simply emphasise that.
...(C99 6.3.1.3p2):

Formalises what Seebs said.
with a footnote:

    The rules describe arithmetic on the mathematical value,
not the value of a given type of expression.

...

Of course the implementation doesn't actually have to do
these repeated additions or subtractions.  On a system that
uses 2's-complement, converting from a signed type to an
unsigned type either copies the low-order bits, copies the
entire representation, or sign-extends the representation,
depending on the relative sizes of the source and target.
 This is a very simple and fast operation; the compiler
doesn't even need to take signedness into account.

Except for negative integers when, as you say, it needs to
sign extend the representation. Note that right shifting
a negative value is implementation defined precisely because
there were architectures that weren't capable of doing the
sign extension.
(For sign-and-magnitude or ones'-complement representations,
the compiler has to do a little more work, but such systems
are rare.)

You should be emphasising the point that signed and
unsigned integer types should generally not be mixed.
[Beyond occasionally using a non-negative literal
constant.]

It's only the unnecessary fixation on the virtues of 2's
complement that makes people mix them unnecessarily. There
is hardly ever an actual need to convert between signed
and unsigned types. The only exception I can think of is converting a
ptrdiff_t to size_t, and even then, the need is rare.

Unsigned types have properties of modulo arithmetic. Those
properties are useful independent of whether signed types
are 2's complement or not.
The standard very carefully doesn't refer to representation
when it discusses the conversion rules, but the rules were
almost certainly originally *motivated* by the natural way
to do conversions on a 2's-complement system.

As I understand it, they mirror address arithmetic which is
often modulo a power of 2, even on non 2's complement systems.
Pointers were used as a cheap form of unsigned integer prior
to unsigned integers being introduced to the language.
 
Seebs

Actually it *is* related to 2's complement in the least. 8-)}

Okay, fair enough.

I think what I was trying to get at is that the visible behavior has nothing
to do with whether or not your machine is 2's complement.
The standard very carefully doesn't refer to representation when it
discusses the conversion rules, but the rules were almost certainly
originally *motivated* by the natural way to do conversions on a
2's-complement system.

Seems plausible.

-s
 
Keith Thompson

Seebs said:
Okay, fair enough.

I think what I was trying to get at is that the visible behavior has nothing
to do with whether or not your machine is 2's complement.

Yes, absolutely.
Seems plausible.

Note that this is largely speculation on my part, but since the machines
on which C was first implemented used 2's-complement, it does seem
plausible. Early C was much less strongly typed than modern C (I don't
think it originally even had unsigned types).
 
Keith Thompson

Peter Nilsson said:
Keith Thompson said:
Seebs said:
unsigned char c  = -1;

For unsigned types, the range is always 0 to <type>_MAX.
On your system, the chances are UCHAR_MAX is 255.  The way
numbers outside that range are converted is by adding/
subtracting multiples of (UCHAR_MAX+1) until you get
something in range.

So...
-1 => (-1) + (UCHAR_MAX + 1) => UCHAR_MAX
[...]

Seems to be related to 2's complement above.
Not in the least.  It does not matter what representation
the system uses; -1, converted to an unsigned type, is
TYPE_MAX.

Actually it *is* related to 2's complement in the least.
 8-)}

No, it really isn't and your citations simply emphasise that.
...(C99 6.3.1.3p2):

Formalises what Seebs said.
Yes.

[snip]

You should be emphasising the point that signed and
unsigned integer types should generally not be mixed.
[Beyond occasionally using a non-negative literal
constant.]

That's a very good point; it's just not the one I was making.
It's only the unnecessary fixation on the virtues of 2's
complement that makes people mix them unnecessarily. There
is hardly ever an actual need to convert between signed
and unsigned types. The only exception I can think of is converting a
ptrdiff_t to size_t, and even then, the need is rare.

Unsigned types have properties of modulo arithmetic. Those
properties are useful independent of whether signed types
are 2's complement or not.

Agreed. I was merely speculating on the historic motivation for the
current rules. I didn't mean to suggest that modern programmers
should be guided by that history rather than by the current
definition. Except that you can be fairly sure that conversions
between signed and unsigned types will be quite efficient on almost
all modern systems (and will work correctly on *all* systems).
As I understand it, they mirror address arithmetic which is
often modulo a power of 2, even on non 2's complement systems.
Pointers were used as a cheap form of unsigned integer prior
to unsigned integers being introduced to the language.

Right. Though I actually have no idea how address arithmetic would
have worked on non-2's-complement machines. (I speculate that it
would have acted like unsigned arithmetic.)
 
Chad

For unsigned types, the range is always 0 to <type>_MAX.  On your system,
the chances are UCHAR_MAX is 255.  The way numbers outside that range are
converted is by adding/subtracting multiples of (UCHAR_MAX+1) until you get
something in range.

So...
-1 => (-1) + (UCHAR_MAX + 1) => UCHAR_MAX


Here, c is promoted to int, which doesn't change it.

How do you get UCHAR_MAX + 1 from a range of 0 to <type>_MAX ?
 
bart.c

Back9 said:
Hi,

Can someone explain the following statement to me?

unsigned char c = -1;
printf("\n c is %d", c);

c is 255

What did you expect? You're storing a signed value into a location which
should contain only unsigned values.

Maybe the compiler should have said something..

Anyway the bit pattern for -1 presumably is the same as the one for 255.

And printf("\n c is %d", (signed char)c);

will likely show -1.
 
Seebs

How do you get UCHAR_MAX + 1 from a range of 0 to <type>_MAX ?

Think about minutes on a clock. Minutes run from 0 to 59. The minute before
0 is -1, which is 59 of the previous hour. 59 = -1 + 60. Therefore, the
multiples to add or subtract are (MINUTE_MAX+1).

Basically, it's FOO_MAX+1 because there are FOO_MAX+1 items, because 0 is
an item.

-s
 
Keith Thompson

bart.c said:
What did you expect? You're storing a signed value into a location which
should contain only unsigned values.

Maybe the compiler should have said something..

Anyway the bit pattern for -1 presumably is the same as the one for 255.

It's not about bit patterns (except perhaps historically).
It's about the Standard's specification of the semantics of
conversions from signed types to unsigned types.
And printf("\n c is %d", (signed char)c);

will likely show -1.

Likely, but not guaranteed. Unlike signed-to-unsigned conversions,
unsigned-to-signed conversions don't yield well-defined results in
all cases.
 
Chad

Think about minutes on a clock.  Minutes run from 0 to 59.  The minute before
0 is -1, which is 59 of the previous hour.  59 = -1 + 60.  Therefore, the
multiples to add or subtract are (MINUTE_MAX+1).

Basically, it's FOO_MAX+1 because there are FOO_MAX+1 items, because 0 is
an item.

Bear with this. My brain isn't fully functional (yet) because I just
got done working 13 hours. So now, in your example, would MINUTE_MAX
be 59 or 60?
 
Seebs

Bear with this. My brain isn't fully functional (yet) because I just
got done working 13 hours. So now, in your example, would MINUTE_MAX
be 59 or 60?

59, of course -- that's the highest minute value that exists. 60 => 0,
120 => 0, 180 => 0, 179 => 59.... If there are N of something, they are
numbered 0 through N-1 inclusive. 60 minutes, numbered 0-59.

-s
 
