Performance of signed vs unsigned types


Michael Press

Ike Naar said:
[...] However, that leaves me with
no idea why you thought it would work. Would you care to explain what
assumptions you were making about the results of signed integer overflow
when you wrote that code?

It did work, on the machines where I tested it, and it worked precisely
because x+step did produce a negative value.

Perhaps he's hinting at saturating arithmetic.

No, I never was.
 

Keith Thompson

Michael Press said:
And sometimes not. Offer a compiler flag that
makes arithmetic on integer types closed.

It would be perfectly reasonable for a compiler to offer such a flag.

The point is that a conforming compiler needn't do so, and even
if it does, it can still be a conforming compiler when that flag
is disabled.

A program that depends on integer types being closed under various
arithmetic operations *can* work correctly, if it's running under an
implementation that satisfies that dependency, whether by default or
in whatever configuration you choose to use. But it can't be 100%
portable, because not all implementations are required to implement
closed arithmetic.

If you want most compilers to implement closed arithmetic, then
you already have your wish.

If you want the standard to require all compilers to do so, then
I don't think that's going to happen; trapping on overflow is too
useful (at least in some contexts) for the standard to forbid it.
(And the standard, at least currently, has nothing to say about
compiler flags; as far as the standard is concerned, for example,
"gcc -unsigned-char" and "gcc -fsigned-char" are two distinct
implementations.)
 

lawrence.jones

Michael Press said:
People writing safety-critical code do not
range-check arguments and examine return values?

Of course they do, but that isn't sufficient to prevent overflow.
 

Ben Bacarisse

Michael Press said:
Why not offer a compiler flag that guarantees
arithmetic is closed on integer types?
We're not talking bus errors or segmentation faults
where execution cannot be made to continue.

Yes, indeed, why not? I am not sure what your point is. I agree that
many implementations will want to provide programmers with "the
expected" overflow behaviour, but they don't have to.
 

Keith Thompson

lawrence.jones said:
Of course they do, but that isn't sufficient to prevent overflow.

Well, carefully range-checking each operand of each arithmetic operator
(assuming that "arguments" includes operands) should be sufficient to
prevent overflow. But it's hardly practical if the compiler doesn't
generate the checking code for you.

int x = rand();
int y = rand();
/* checking here */
int z = x * y;

Checking whether x * y will overflow without actually performing
the multiplication is certainly possible, but (a) it's tedious, and
(b) the check is likely to be more expensive than the multiplication.
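
(Sketch added for concreteness, not from the original post: for this
particular example the pre-check is at least short, relying only on the
fact that rand() never returns a negative value; even the easy
nonnegative case costs a division per call.)

#include <limits.h>

/* Pre-check for nonnegative operands only, as with values from
 * rand().  The division is typically slower than the multiplication
 * it guards. */
int would_overflow_nonneg(int x, int y)
{
    return y != 0 && x > INT_MAX / y;
}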
 

James Kuyper

Michael Press said:
I do not see why overflow is any kind of impediment.
Whatever is in the register is what you get.
Physical machines that can trap on overflow do not
have to trap on overflow.

I'm familiar with the details of only a few platforms at that level,
none of which have that feature, so I cannot claim to know anything very
specific about what it would take to make a conforming implementation on
such a platform.

However, I believe that such platforms generally have to either be told
explicitly not to trap, or a handler must be set up to intercept the
trap. If the platform has a good reason to make trapping the default, a
compiler would generally want to restore that default reasonably
quickly, so the code would have lots of "trap off", "trap on" command pairs.

In any event, that was not really the kind of thing I had in mind. What
I was thinking about was more subtle. Consider the following code, on an
implementation where INT_MAX == 32767, and INT_MIN == -32768:

int func(int a)
{
    if(a < 300)
    {
        printf("a is small:%d\n", a);
    }
    return 200*a;
}

Because integer overflow has undefined behavior, and the value of 'a'
does not change anywhere between the if() statement and the return
statement, an implementation is permitted to build into its translation
of this code the assumption that 200*a will NOT produce overflow. This
implies that -164 < a && a < 164, so the test a < 300 is always true,
the initial if() is redundant, and it can be optimized away. In other
words, a fully conforming
implementation is permitted to optimize this code into the equivalent of:

int func(int a)
{
    printf("a is small:%d\n", a);
    return 200*a;
}

As a result, it's perfectly feasible for this code to be translated into
a program which prints out:

a is small:21000

This is not just a hypothetical possibility. Problems caused by real
defective code that triggered a similar fully conforming optimization by
a real and widely used compiler have been a topic for discussion in this
very forum within the past year or two. I think it involved
dereferencing a null pointer rather than integer overflow, but the basic
principle was the same. I can't remember enough of the context to
provide a proper citation - does anyone else recognize what I'm
referring to?
 

Michael Press

Keith Thompson said:
It would be perfectly reasonable for a compiler to offer such a flag.
[...] If you want the standard to require all compilers to do so,
then I don't think that's going to happen; trapping on overflow is
too useful (at least in some contexts) for the standard to forbid
it. [...]

The standard could demand that an option for closed arithmetic
be available, or if overflow must be trapped, the programmer can
provide his own trap code, as with signal handlers.
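
(Sketch added for illustration, not from the original post: nothing
requires an implementation to raise SIGFPE on integer overflow, but on
a hypothetical one that did, "provide his own trap code" might look
like this.)

#include <signal.h>
#include <stdlib.h>

/* Hypothetical: assumes the implementation raises SIGFPE on integer
 * overflow.  Returning normally from a handler for a computational
 * exception is undefined, so about all it can safely do is bail out. */
static void overflow_trap(int sig)
{
    (void)sig;
    abort();
}

int main(void)
{
    signal(SIGFPE, overflow_trap);
    /* ... arithmetic that might trap ... */
    return 0;
}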
 

Michael Press

lawrence.jones said:
Of course they do, but that isn't sufficient to prevent overflow.

In the field? Are you serious?
I can see overflow traps for development,
but not in the field where failure is costly.
 

Ian Collins

Michael Press said:
In the field? Are you serious?
I can see overflow traps for development,
but not in the field where failure is costly.

The real world is where problems are more likely to be found. Failure
may be costly, but errors can be disastrous.
 

Michael Press

Keith Thompson said:
Well, carefully range-checking each operand of each arithmetic operator
(assuming that "arguments" includes operands) should be sufficient to
prevent overflow. But it's hardly practical if the compiler doesn't
generate the checking code for you.

int x = rand();
int y = rand();
/* checking here */
int z = x * y;

No, no, no, no, no.
Reduce x and y to < sqrt(maximum rand()) so that overflow does not occur.
Range check input arguments so the computation does not overflow.
 

Michael Press

Ian Collins said:
The real world is where problems are more likely to be found. Failure
may be costly, but errors can be disastrous.

Are you playing word games now? Error, failure.
If it traps in the field it is a disaster whether
it is called an error or a failure.
 

Ian Collins

Michael Press said:
Are you playing word games now? Error, failure.
If it traps in the field it is a disaster whether
it is called an error or a failure.

Not at all. What would you rather do, reset the X-ray machine, or
overdose the patient?

The controllers I used to work on, while not safety critical, did power
lots of expensive and often remote equipment. The wrong voltage at the
wrong time could kill it. The system would carry on with default settings
if the controller went away, so a reset was the safest failure mode.
 

Ben Bacarisse

Michael Press said:
No, no, no, no, no.
Reduce x and y to < sqrt(maximum rand()) so that overflow does not occur.
Range check input arguments so the computation does not overflow.

But that changes the computation. If a multiplication might overflow,
you can't simply decide to alter the operands so that it does not.

Since I am posting, I'll make a remark that I decided was not worth a
post of its own... This example is not ideal because x, y and the
correct value of z are all positive, so the programmer could use
unsigned arithmetic making the test for "overflow" simpler, in part
because it can then be done after the multiplication. Maybe input or a
conversion from argv[x] is a better way to suggest arbitrary int values.
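
(Sketch added for concreteness, not from the original post: with
unsigned operands the multiplication itself is always well defined,
since unsigned arithmetic wraps modulo UINT_MAX+1, so the test can
indeed follow the multiplication.)

/* The product is computed first; a wrap is detected by dividing back.
 * *wrapped is set nonzero if the true product did not fit. */
unsigned mul_wrapped(unsigned x, unsigned y, int *wrapped)
{
    unsigned z = x * y;                 /* wraps, never undefined */
    *wrapped = (x != 0 && z / x != y);
    return z;
}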
 

Malcolm McLean

Michael Press said:
Are you playing word games now? Error, failure.
If it traps in the field it is a disaster whether
it is called an error or a failure.

No, errors are usually worse (unless you are programming video games).

The reason is that "inject 500 ml of saline" (instead of 500 µl) will
kill the patient, whereas "arithmetic overflow at line 1000" is no
different from the needle breaking or the bag bursting. Mechanical
failures happen; the system as a whole is usually robust to them.
 

Keith Thompson

Michael Press said:
The standard could demand that an option for closed arithmetic
be available, or if overflow must be trapped, the programmer can
provide his own trap code, as with signal handlers.

Yes, it could. Feel free to advocate it, but IMHO it's not going to
happen.
 

Keith Thompson

Michael Press said:
No, no, no, no, no.

Yes, yes, yes, yes, yes.

Michael Press said:
Reduce x and y to < sqrt(maximum rand()) so that overflow does not occur.
Range check input arguments so the computation does not overflow.

In other words, if the result you want to compute will cause an
overflow, compute a different result? Sorry, you don't get to do that.

The relevant bound is sqrt(INT_MAX), not sqrt(RAND_MAX). But even then,
restricting the ranges of both operands that way rejects many cases that
would not overflow. For example, if x==1, any value of y is ok. If x
== 7, and INT_MAX == 2**31-1, any value of y up to 306783378 is ok. And
so forth. And as Ben points out, my use of rand() obscures the case
where x and/or y can be negative.

Suppose you're asked to write a program with the following
specification:

The program takes exactly two command-line arguments, both of which
are string representations of int values. If it doesn't receive
exactly two arguments, it terminates with an error message and
returns a failure indication to the environment; otherwise it may
assume that both arguments are valid int values. If the product of
the two arguments is in the range INT_MIN..INT_MAX, the product is
printed to stdout followed by a new-line. Otherwise, the program
aborts with an error message and returns a failure indication to
the environment.

You don't get to change the requirements to avoid overflow. You don't
get to print an error message for any arguments that would not overflow.
You must print an error message for any arguments that would overflow.

Here's a partial solution:

#include <stdio.h>
#include <stdlib.h>

int multiplication_would_overflow(int x, int y)
{
    /*
     * TODO: Implement this.
     */
    return 0;
}

int main(int argc, char **argv)
{
    if (argc != 3) {
        fprintf(stderr, "Usage: %s num num\n", argv[0]);
        exit(EXIT_FAILURE);
    }
    /*
     * Note: atoi() does not perform error checking.
     */
    const int x = atoi(argv[1]);
    const int y = atoi(argv[2]);

    if (multiplication_would_overflow(x, y)) {
        fprintf(stderr, "%d * %d would overflow\n", x, y);
        exit(EXIT_FAILURE);
    }
    else {
        printf("%d\n", x * y);
    }
    exit(EXIT_SUCCESS);
}
 

Malcolm McLean

int multiplication_would_overflow(int x, int y)
{
    /* Pseudo-assembly: a widening multiply gives a double-width
     * result, and the high half of that result reveals overflow. */
    asm {
        mul x, y                 ; hi:lo = x * y
        returnregister = hi      ; significant high half => overflow
    }
}
It's easy in most assemblers, because usually there's an instruction
which gives a double-width result. It's hard in C, but that's just a
quirk of the language.
However, most people don't check for overflow; otherwise
multiplication_would_overflow() would be in the standard library.
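
(Sketch added for concreteness, not from the original post: the same
double-width idea can be written in C wherever a wider type exists,
assuming long long is strictly wider than int, which is common but not
guaranteed by the standard.)

#include <limits.h>

/* Compute the exact product in a wider type, then test whether it
 * fits in int.  Only valid where long long really is wider than int. */
int multiplication_would_overflow(int x, int y)
{
    long long p = (long long)x * y;   /* exact, given the assumption */
    return p > INT_MAX || p < INT_MIN;
}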
 

Keith Thompson

Keith Thompson said:
int multiplication_would_overflow(int x, int y)
{
    /*
     * TODO: Implement this.
     */
    return 0;
}
[...]

I forgot to mention that, of course, it's entirely possible
to implement the multiplication_would_overflow() function in
portable C. I think I've seen versions posted here (I'm just too
lazy to track them down or recreate them). The point is that it's
very difficult to get it right (for example, INT_MIN * -1 overflows
on 2's-complement systems but not on others), and computationally
expensive, far more expensive than the multiplication itself.
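
(For the record, a sketch of one portable formulation, not a version
posted in the thread; it assumes C99's truncate-toward-zero division,
and the case analysis plus a division per call is exactly the tedium
and expense described above.)

#include <limits.h>

/* Returns nonzero if x * y would overflow int.  Also gets
 * INT_MIN * -1 right on two's-complement systems. */
int multiplication_would_overflow(int x, int y)
{
    if (x == 0 || y == 0) {
        return 0;                         /* product is 0 */
    }
    if (x > 0) {
        return (y > 0) ? x > INT_MAX / y  /* positive product too big */
                       : y < INT_MIN / x; /* negative product too small */
    }
    else {
        return (y > 0) ? x < INT_MIN / y  /* negative product too small */
                       : y < INT_MAX / x; /* positive product too big */
    }
}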

On most CPUs, there's a way (which varies from one CPU to another)
to compute x * y safely and determine after the fact whether there
was an overflow, but C doesn't expose that. There typically isn't
even a non-portable way to do overflow checking in C, unless you
resort to inline assembly.
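
(An aside, not from the thread: gcc and clang later grew builtins that
expose exactly this after-the-fact CPU check, e.g.
__builtin_mul_overflow; still an extension rather than portable C.)

/* gcc/clang extension: returns nonzero on overflow, and *z receives
 * the wrapped result either way. */
int checked_mul(int x, int y, int *z)
{
    return __builtin_mul_overflow(x, y, z);
}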

Note that I'm talking about the way C is currently defined, not
arguing that it *should* be defined that way. Personally, I'd like
to see something similar to C#'s "checked" keyword, which requires
overflow checking for an expression; the problem is that it depends
on C#'s exception mechanism, and adding exceptions to C seems to
be more than the committee is willing to do.
 

Malcolm McLean

Keith Thompson said:
Note that I'm talking about the way C is currently defined, not
arguing that it *should* be defined that way.  Personally, I'd like
to see something similar to C#'s "checked" keyword, which requires
overflow checking for an expression; the problem is that it depends
on C#'s exception mechanism, and adding exceptions to C seems to
be more than the committee is willing to do.

What you could do is add limits

int x = 100 <between 0 and INT_MAX>;

exceeding the limits is undefined behaviour, so cheapo compilers can
add the functionality trivially. A decent compiler, however, will
crash out with an error message.
 

Keith Thompson

Malcolm McLean said:
What you could do is add limits

int x = 100 <between 0 and INT_MAX>;

exceeding the limits is undefined behaviour, so cheapo compilers can
add the functionality trivially. A decent compiler, however, will
crash out with an error message.

That's what we have now, but with the added capability of specifying
a narrower range than INT_MIN .. INT_MAX. And it needs a much
more precise specification. Does the "between" clause apply to an
object? An expression? A type?

What I'm talking about is a way to guarantee that a given expression
will *either* yield a mathematically correct result *or* indicate, in
some way that I can handle in my program, that it's unable to do so.

Second-best would be a guarantee that in this code fragment:

int x = INT_MAX + 1;
puts("OOPS!");

the string "OOPS!" will not be printed.
 
