efficiency concern: when to really use unsigned ints and when not to

CBFalconer

Dan said:
.... snip ...

Signed integers and unsigned integers are fairly different beasts,
intended for different purposes. Avoid unsigned integers unless
you need their special semantics (or the additional range), even
if you're only manipulating positive values. After all, the
prototype of main() isn't

int main(unsigned argc, char **argv);

despite the fact that argc is not supposed to have negative values.

If it's intended for usual arithmetic operations, use signed
integer. If it's intended for bit manipulation operations and/or
modulo arithmetic, use unsigned integer.

Avoid as much as possible mixing the two flavours in the same
expression, because very nasty bugs may arise.
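As a minimal sketch of the kind of bug meant here (the helper names `naive_less` and `safe_less` are made up for illustration): when an int meets an unsigned int in a comparison, the int is converted to unsigned first, so a negative value turns into a very large positive one.

```c
/* Hypothetical helper: compares a signed value with an unsigned one
   the "obvious" way. The usual arithmetic conversions convert s to
   unsigned int before the comparison, so -1 becomes UINT_MAX. */
int naive_less(int s, unsigned int u)
{
    return s < u;
}

/* Hypothetical helper: treats both operands as mathematical integers.
   A negative s is smaller than any unsigned value; otherwise the cast
   to unsigned int does not change the value of s. */
int safe_less(int s, unsigned int u)
{
    return s < 0 || (unsigned int)s < u;
}
```

With these definitions, naive_less(-1, 1u) yields 0 while safe_less(-1, 1u) yields 1. Most compilers can flag the first form if signed/unsigned comparison warnings are enabled.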

The fact that size_t is unsigned is a real pain in the ass, because
this type is seldom used in a genuine unsigned context. It should
have been signed, for the same reason that argc is signed.

There are two fundamental problems without compatible resolution.
First, int to unsigned conversion is always possible, while the
reverse may not be. This encourages the use of ints. Secondly,
size_t usually represents things that can be addressed in the
system, and this is very unlikely to waste any bit positions.
This more or less mandates the use of size_t (and thus unsigned)
in many places.
 
pete

Christian said:
Completely brainlessly wrong. The code checked that the value of a
variable was in a certain constant range. If I want to know whether a
value x is in a range from a to b then I write
"if (x >= a && x <= b)".
It is absolutely idiotic to have a special case for a = 0.

It's idiotic not to think about the meaning of your code.

It's worth remembering that an unsigned object compared
against a constant zero with >=, is almost certainly a bug.

http://groups.google.com/[email protected]

As you know, I have a policy against using relational operators
to compare unsigned objects against a constant zero.
It's a good policy.
Changing the code would have created maintenance problems: If a variable
can hold values, let's say, from 0 to 100, then any integer type whether
signed or unsigned can hold all the values. However, changing the type
from unsigned to signed makes the code incorrect if the test x >= 0 is
missing.

Not buying it.
 
CBFalconer

Christian said:
Completely brainlessly wrong. The code checked that the value
of a variable was in a certain constant range. If I want to
know whether a value x is in a range from a to b then I write
"if (x >= a && x <= b)". It is absolutely idiotic to have a
special case for a = 0.

Then I suggest you turn off *all* warnings, and consider any
source that does not cause a syntax error to be correct. Please
keep us informed as to any improvements in your productivity.

The next time you meet a skunk stamping its forefoot with its tail
upraised, advise it that the warning is ridiculous and rush
towards it. (I assume the existence of skunks in your part of the
world.)
 
Tim Prince

Dan Pop said:
In <[email protected]> (e-mail address removed)
(Neil Zanella) said:
The fact that size_t is unsigned is a real pain in the ass, because this
type is seldom used in a genuine unsigned context. It should have been
signed, for the same reason that argc is signed.
Interesting

Speed is not a concern. On two's complement architectures, the two are
usually handled identically. It's the intended purpose of the variable
that dictates the choice (again, unless you need the extra range provided
by the unsigned flavour, but this doesn't seem to be the case).

Where unsigned int is used as an array subscript, compilers may insert extra
code to ensure that overflowing the range will wrap correctly. That may
have a large impact on performance. Signed int does not have defined
behavior for range overflow, so no special code generation is needed.

On many architectures, casting unsigned int to a (signed) floating point
type requires a library function call, while the signed int version has
direct hardware support.

In both cases, casts may be employed to persuade the compiler to drop the
precautions implied by unsigned.
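A rough sketch of the cast being described, assuming the count is known to fit in an int (the function name `sum_first` is hypothetical, and whether this changes the generated code at all depends on the compiler and target):

```c
/* Hypothetical example: sums the first n elements of a.
   The unsigned count is cast to int once, so the loop index is a
   signed int. ASSUMPTION: the caller guarantees n <= INT_MAX;
   otherwise the result of the cast is implementation-defined. */
double sum_first(const double *a, unsigned int n)
{
    double s = 0.0;
    int count = (int)n;

    for (int i = 0; i < count; i++)
        s += a[i];
    return s;
}
```

This only shows the shape of the idea. It makes no claim about speed; that would have to be measured on the target in question.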
 
Christian Bau

pete said:
It's idiotic not to think about the meaning of your code.

It's worth remembering that an unsigned object compared
against a constant zero with >=, is almost certainly a bug.

http://groups.google.com/[email protected]

As you know, I have a policy against using relational operators
to compare unsigned objects against a constant zero.
It's a good policy.


Not buying it.

Not my problem. Your problem.
 
Christian Bau

CBFalconer said:
Then I suggest you turn off *all* warnings, and consider any
source that does not cause a syntax error to be correct. Please
keep us informed as to any improvements in your productivity.

That is an absolutely stupid suggestion. Why would I ignore useful
warnings, just because there are a few stinkers in between?
 
Neil Zanella

Christian Bau said:
For signed integers, the C operators +, -, * produce exactly the same
results as the mathematical operators (as long as the results are not
too large). For unsigned integers, the C operators do some pretty weird
things. A trivial example: For which numbers is

x >= y - 1

Good point, but the set of nonnegative numbers is not closed under subtraction.
The way int and unsigned int work is that they model a set of representatives
for the integers modulo 2^n, where n is the word size in bits (usually 32 or 64,
or 16 on older machines). The integers modulo 2^n are not ordered, however, so
when we use comparisons, we regard them as plain integers, not as integers
modulo 2^n. But when we perform arithmetic, we are doing arithmetic modulo 2^n
and using the appropriate representative from either [-2^(n-1), 2^(n-1)-1] or
[0, 2^n-1]. (For int this is what two's complement hardware typically does; the
C standard guarantees the wrap-around only for unsigned types.)

true if x and y are both signed, both unsigned, one signed and the other
unsigned? For signed numbers, you are quite safe. Both unsigned, and
there is a strange special case for y = 0. One signed and the other
unsigned, and you have to study the C Standard.
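To make the y = 0 special case concrete, here is a small sketch (the function names are invented for this example) of the same expression with both operands signed and with both operands unsigned:

```c
/* Both operands signed: x >= y - 1 behaves like ordinary integer
   arithmetic, as long as y - 1 itself does not overflow
   (that is, y is not INT_MIN). */
int ge_pred_signed(int x, int y)
{
    return x >= y - 1;
}

/* Both operands unsigned: for y == 0, y - 1 wraps around to
   UINT_MAX, so the test fails for every x except UINT_MAX. */
int ge_pred_unsigned(unsigned int x, unsigned int y)
{
    return x >= y - 1u;
}
```

ge_pred_signed(5, 0) is 1, as one would expect from 5 >= -1, while ge_pred_unsigned(5u, 0u) is 0.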

I believe that when a signed and an unsigned are used together and they are
of the same length both operands are promoted to... unsigned int, which can
cause problems. Perhaps someone can confirm.
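One way to check this without reading the Standard is to ask the compiler for the type of the mixed expression. The sketch below assumes a C11 compiler (for _Generic); the function names are made up for illustration.

```c
/* Returns 1 if the expression (int) + (unsigned int) has type
   unsigned int, 0 for any other type. _Generic (C11) selects a
   branch based on the type of its controlling expression. */
int sum_is_unsigned(void)
{
    int s = -1;
    unsigned int u = 1u;

    return _Generic(s + u, unsigned int: 1, default: 0);
}

/* The int operand is converted to unsigned int before the addition,
   so the sum is computed modulo UINT_MAX + 1. */
unsigned int int_plus_unsigned(int s, unsigned int u)
{
    return s + u;
}
```

sum_is_unsigned() returns 1, and int_plus_unsigned(-2, 1u) returns the same value as (unsigned int)-1, i.e. UINT_MAX rather than -1.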
It seems what you have are "positive" numbers. "unsigned" is something
completely different; unsigned numbers can behave in completely
unexpected ways. x - 1 is not one less than x in very common cases.

Agreed, but nevertheless, if you're going to be using negative numbers in
the same context then perhaps it's best to use signed ints.
(And some people will be in for some nasty surprises if they switch to a
compiler where unsigned int and unsigned long have different sizes. )

Such as...?

Regards,

Neil
 
Neil Zanella

Martin Dickopp said:
I fully agree with you that if there is a risk of "underflowing" an
unsigned integer (and the wrap-around behavior is not deliberately
wanted), a signed integer type should be used. However, IMHO unsigned
integers should be used when there's no such risk.

Well said.
When I was just beginning to learn programming and C, I did in fact
write a `while (n >= 0)' loop, where `n' was an unsigned integer (and
the compiler I used at that time did /not/ warn about it). When I saw
my mistake, I didn't conclude that I should just avoid unsigned integers
(either in this special case or in general), but that I should think
again what I'm doing.

In the case you describe I would not suggest switching to signed. Rather I
would suggest doing something such as the following:

for (;;) { /* stuff */ if (n-- == 0) break; }
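Another common way to count down with an unsigned index, shown here only as a sketch (the function name `reverse_copy` is hypothetical), is to test and decrement in the loop condition so that the index is never compared against zero with >=:

```c
#include <stddef.h>

/* Copies src[n-1], ..., src[0] into dst[0], ..., dst[n-1].
   The condition i-- > 0 compares the old value of i with zero and
   then decrements, so the body sees i running from n-1 down to 0.
   For n == 0 the body is never entered. */
void reverse_copy(int *dst, const int *src, size_t n)
{
    for (size_t i = n; i-- > 0; )
        dst[n - 1 - i] = src[i];
}
```

After the loop ends, i has wrapped around to SIZE_MAX, which is harmless here because i is not used again.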

Regards,

Neil
 
Neil Zanella

Christian Bau said:
Completely brainlessly wrong. The code checked that the value of a
variable was in a certain constant range. If I want to know whether a
value x is in a range from a to b then I write "if (x >= a && x <= b)".
It is absolutely idiotic to have a special case for a = 0.

Changing the code would have created maintenance problems: If a variable
can hold values, let's say, from 0 to 100, then any integer type whether
signed or unsigned can hold all the values. However, changing the type
from unsigned to signed makes the code incorrect if the test x >= 0 is
missing.

Well, at least gcc 3.3.2 does not mind the following code, and I agree that
that is what it should do because the user can change the value of a in one
place at a later time to some other value as the user pleases. It is up to
the compiler to optimize the code and delete useless comparisons with zero
of unsigned numbers. But in the example below, I agree with Christian Bau
that a warning would be useless, and in fact even with -Wall gcc does not
issue a warning.

#include <stdio.h>

int main(void) {
    unsigned int a = 0, b = 8, c;
    printf("Please enter an unsigned integer value: ");
    if (scanf("%u", &c) != 1)
        return 1; /* no valid input: c would be uninitialized */
    if (a <= c && c <= b)
        printf("Value is within range.\n");
    else
        printf("Value is not within range.\n");
    return 0;
}

On the other hand the following code is stupid, and the compiler should issue
some warning:

#include <stdio.h>

int main(void) {
    unsigned int b = 8, c;
    printf("Please enter an unsigned integer value: ");
    if (scanf("%u", &c) != 1)
        return 1; /* no valid input: c would be uninitialized */
    if (0 <= c && c <= b)
        printf("Value is within range.\n");
    else
        printf("Value is not within range.\n");
    return 0;
}

Values that the user expects to change at a later time should be made into
variables as a matter of good style. But gcc issues no warning here, not even
with the -Wall option. The only value of the 0 <= c is as a reminder to the
user that that part is being checked, and the compiler can take it out.
But I still think that the compiler should give a warning if some flag
is turned on in this latter case, don't you think?

Regards,

Neil
 
Neil Zanella

Christian Bau said:
File header.h:
#define MIN_VALUE 0 // Or any other value
#define MAX_VALUE 99 // Or any other value

File mycode.c:
#include "header.h"
int in_range (unsigned int x) {
#if MIN_VALUE <= 0
    return x <= MAX_VALUE;
#else
    return x >= MIN_VALUE && x <= MAX_VALUE;
#endif
}

Is that how you would want to write code?

Good point, even if you didn't use variables, you could use macros,
so the compiler should not complain in the case of comparisons with
zero either.

Regards,

Neil
 
CBFalconer

Neil said:
.... snip ...

for (;;) { /* stuff */ if (n-- == 0) break; }

Much clearer is:

while (n--) { /* stuff */ }

Department of simplistic clarification.
 
pete

CBFalconer said:
Much clearer is:

while (n--) { /* stuff */ }

Department of simplistic clarification.

Your while loop has different semantics from that particular for loop.

I'm not sure whether I prefer
do { /* stuff */ } while (n-- != 0);
or
do { /* stuff */ } while (n--);

I think that the meaning of while (n--)
is just as clear as that of while (n-- != 0)
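To make the difference in semantics visible, here is a small sketch that counts how often the body runs in each form (the counting functions are invented for this illustration; "stuff" is replaced by an increment):

```c
/* The for (;;) form: the body runs before the test,
   so it executes n + 1 times. */
unsigned int count_for_break(unsigned int n)
{
    unsigned int runs = 0;
    for (;;) { runs++; if (n-- == 0) break; }
    return runs;
}

/* The while (n--) form: the test comes first,
   so the body executes n times. */
unsigned int count_while(unsigned int n)
{
    unsigned int runs = 0;
    while (n--) { runs++; }
    return runs;
}

/* The do ... while (n-- != 0) form: same count as the
   for (;;) form, n + 1 times. */
unsigned int count_do_while(unsigned int n)
{
    unsigned int runs = 0;
    do { runs++; } while (n-- != 0);
    return runs;
}
```

For n = 3 the three functions return 4, 3 and 4; for n = 0 they return 1, 0 and 1.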
 
CBFalconer

pete said:
Your while loop has different semantics from that particular for loop.

I'm not sure whether I prefer
do { /* stuff */ } while (n-- != 0);
or
do { /* stuff */ } while (n--);

Well caught. However the point is that for loops are not best
suited in many places. Also that break is almost as confusing as
goto, and often needless.
 
Christian Bau

Such as...?

unsigned long x = -1u;

You might expect this to create the largest possible unsigned long, but
it doesn't if unsigned long has more bits than unsigned int. Until quite
recently, machines where this is the case had been getting rarer (16
bit vs. 32 bit), but now their number is growing again (32 bit vs. 64).
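A short sketch of the difference (function names are made up for the example): -1u is evaluated in unsigned int first and only then widened, whereas -1 and -1ul are converted to unsigned long directly.

```c
#include <limits.h>

/* -1u has type unsigned int and value UINT_MAX; converting it to
   unsigned long keeps that value, it does not become ULONG_MAX
   unless the two types happen to have the same range. */
unsigned long from_minus_one_u(void)
{
    return -1u;
}

/* The int value -1 converted to unsigned long is ULONG_MAX. */
unsigned long from_minus_one(void)
{
    return -1;
}

/* -1ul is computed in unsigned long, so it is ULONG_MAX as well. */
unsigned long from_minus_one_ul(void)
{
    return -1ul;
}
```

On a system where unsigned int is 32 bits and unsigned long is 64 bits, the first function returns 4294967295 and the other two return 18446744073709551615. Where both types have the same width, all three agree.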
 
Old Wolf

For signed integers, the C operators +, -, * produce exactly the same
Which case happens more often, INT_MIN or 0?

If it is code relating to my bank balance: INT_MIN :)
My point: this problem pertains to any sort of programming
with range-limited integral types. Every time you perform
an arithmetic operation you must consider the overflow cases.
Each case is on its own merits and in the context of the
application (eg. deciding whether you are more likely to have
problems with 0, INT_MIN or UINT_MAX).

If I judge that the code will never be near an overflow (unless
the code has UB'd already) then I will write something as you
did above. If it is important to notice overflows then I will
write some small functions to perform the operation and handle
overflow in a way appropriate for that application. (for example,
in code dealing with # of seconds since 1970, or # of milliseconds
since whenever).
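A small function of that sort might look like the following. This is only a sketch with a hypothetical name and interface (`checked_add_int`, returning a success flag); what to do on overflow depends on the application.

```c
#include <limits.h>

/* Hypothetical helper: stores a + b in *result and returns 1 if the
   sum fits in an int; returns 0 and leaves *result untouched
   otherwise. The range checks are done before adding, so the
   addition itself never overflows. */
int checked_add_int(int a, int b, int *result)
{
    if ((b > 0 && a > INT_MAX - b) ||
        (b < 0 && a < INT_MIN - b))
        return 0;
    *result = a + b;
    return 1;
}
```

A caller would test the return value and handle the failure case in whatever way suits the program, for example by clamping or by reporting an error.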


I have a couple of other rules of thumb:
- enable compiler warnings for signed vs. unsigned comparison
- prefer to use + instead of - (I find it easier to comprehend)
x + 1 >= y

NB. Of course these are all my own guidelines to avoid overflow
bugs, they should not be interpreted as expert advice
 
Dan Pop

CBFalconer said:
There are two fundamental problems without compatible resolution.
First, int to unsigned conversion is always possible, while the
reverse may not be. This encourages the use of ints. Secondly,
size_t usually represents things that can be addressed in the
system, and this is very unlikely to waste any bit positions.
This more or less mandates the use of size_t (and thus unsigned)
in many places.

How many times have you defined an object occupying half of your address
space or more?

Dan
 
CBFalconer

Dan said:
How many times have you defined an object occupying half of your
address space or more?

Quite often. Of course it wasn't in C, and it didn't have large
virtual memory.
 
Mark McIntyre

Quite often. Of course it wasn't in C, and it didn't have large
virtual memory.

FWIW in the olden days, when 512K was standard on Amstrad PCs, I did find
myself allocating 256K in one go. Hardly a truly enormous object when you
think about it.
 
Christian Bau

Dan Pop said:
How many times have you defined an object occupying half of your address
space or more?

And if you have done so, did you run into problems because ptrdiff_t is
not capable of representing differences between pointers in all cases?
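The limits involved can be inspected directly. The sketch below (hypothetical function names, assuming a C99 <stdint.h>) checks whether a given object size is small enough that any pointer difference inside it is representable as a ptrdiff_t:

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if ptrdiff_t can represent every size that size_t can,
   0 otherwise. On typical implementations the two types have the
   same width, so this returns 0. */
int ptrdiff_covers_size(void)
{
    return (uintmax_t)PTRDIFF_MAX >= (uintmax_t)SIZE_MAX;
}

/* Returns 1 if the difference between any two pointers into an
   object of n bytes is guaranteed to fit in a ptrdiff_t. */
int span_fits_ptrdiff(size_t n)
{
    return (uintmax_t)n <= (uintmax_t)PTRDIFF_MAX;
}
```

An object larger than PTRDIFF_MAX bytes is exactly the "half of your address space or more" case being discussed: subtracting its end pointers would overflow ptrdiff_t.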
 
