Performance of signed vs unsigned types

M

Michael Press

Thomas Scrace said:
Does not assume it goes negative.

By assuming that adding two positive ints, or multiplying a positive int
by 2, will produce a negative result when the operation overflows

No, it does not.

Well, that's what I see on both my home machine (AMD64 Ubuntu Linux) and
my work machine (SGI IRIX) when I instrument the else clause with a
"printf("%d\n", y)". If something else happens on the machines that
you're familiar with, that's perfectly conforming, too - which is
precisely my point.

Your responses are just a little too laconic. Could you expand a little
on what it is you're trying to communicate?

Not until you prove or retract your claim that I and my code
assume addition or multiplication on int produces negative results.

I wouldn't call it a "claim" as such,

We disagree.
when it was explicitly qualified
by the phrase "seems to", explicitly acknowledging that what seems to be
the case, might not be. A "guess" might be a more appropriate way of
describing it. I can't imagine how I could possibly prove anything about
what your motivations were while writing that code, and I have no
intention of trying.

It was not my intention to suggest any certainty about why you thought
that code should work. I said as much, in words you chose to clip:
However, I suppose that's not the only assumption that would make it work.

Since I never wrote something such as if( x < 0), there are no grounds
for assuming I wrote about negatives or overflow.
My comment that:
If something else happens on the machines that you're familiar with, that's perfectly conforming, too

was also intended to concede that you might have relied upon some other
assumption. Since you seem offended by the idea that you were relying on
that particular assumption, I'll happily retract my guesses as to what
you were thinking when you wrote that code. However, that leaves me with
no idea why you thought it would work. Would you care to explain what
assumptions you were making about the results of signed integer overflow
when you wrote that code?

It did work, on the machines where I tested it, and it worked precisely
because x+step did produce a negative value.

Here is a list of assumptions. They are as complete and consistent
as I could make them in the time spent; but very likely reducible.

Assumptions.

* There is a set called int.
* int is finite.
* There are binary operators {+, *} on int.
* int is closed under {+, *}.
* + is commutative
* * is commutative
* Multiplication is distributive over addition.
* There is an additive identity 0
* There is a multiplicative identity 1.
* There is an order relation < on int.
* The relation < is transitive.
* If a,b in int, then exactly one of a < b, b < a, a = b obtains.
* 0 < 1.
* If a < b, then a < a + 1 and there is no x such that a < x < a + 1.
* Addition is repeated addition of 1.
- That is if 0 < b, then there is a positive integer (not int!), n, such that
b = 1 + 1 + ... + 1 (n times).
- If 0 < a + b, then a + b = a + [1 + 1 + 1 + ... + 1 (n times)].
* Multiplication is repeated addition.

Those axioms seem awfully familiar to me somehow...

Yes, but they are not exactly anything.
For instance, int is not an additive group; but is a semigroup.
I should have included associativity of addition and multiplication.
 
J

James Kuyper

I am aware of overflow, but simply ignored it.

If you ignore the possibility of overflow, the loop never ends; x and
step just keep doubling in size. I would normally say "until they
overflow", but if you're ignoring overflow, there's nothing else to
terminate the doubling. On those grounds, I think I'm well justified in
assuming that overflow was not merely relevant to the program, but was
in fact central to the whole purpose of that program. Writing such code
without thinking about overflow is more than a little crazy. I
considered it far less insulting to assume that you were making
unportable assumptions about the behavior on overflow, than to assume
that you were insane.
Yes associativity of additions and multiplication.
An additive inverse is not guaranteed in 2's complement arithmetic.

I didn't realize that; I thought that, in 2's complement arithmetic, if
INT_MIN < x <= INT_MAX, the additive inverse of x was -x, while INT_MIN
is it's own additive inverse. The behavior of INT_MIN+INT_MIN is, of
course, undefined as far as the C standard is concerned, but on the 2's
complement machines I tried it on, it came up as 0, as I expected. Am I
missing something here?

....
No and I do not know.

If you don't have any idea which way your code will branch at it's if()
statements, how can you have any confidence in it's ability to produce
the desired results, whatever those might be?

....
Basically I assume.
addition and multiplication are closed on int,

Which is also not guaranteed. However, the implementation I described
which always returns the contents of register 7 does not violate closure
of int under addition and multiplication; yet I doubt that your program
would work properly on such an implementation.

So you made a total of at least five assumptions not guaranteed to be
true by the C standard, in code posted as a response to my assertion
that "most of the ways I can think of for computing the range of a
signed type are subject to serious portability issues.". I will grant
you that you did not label your code as a counter-example to that
assertion, but if it was not so-intended, what was the point of posting
that code?
 
K

Keith Thompson

Michael Press said:
[snip]
Assumptions.

* There is a set called int.
* int is finite.
* There are binary operators {+, *} on int.
* int is closed under {+, *}.

signed ints aren't gaurenteed to be closed under {+, *}.

[snip]

So the program can bomb.
Is it time to fix the C standard to make int closed under {+,*}.

The argument against doing so is that different hardware handles
overflow in different ways. Imposing a single mechanism on all
implementations could cause serious performance issues on systems that
don't behave as the standard specifies. (Or, perhaps more likely, we
just wouldn't have conforming C implementations on those platforms.)

It's not quite as bad if we just require + and * to yield int results,
without specifying what those results are, but so far the feeling has
been that this is best left up to individual implementations.

Is it time to change that? Maybe. Can all the relevant parties agree
on that? Not likely.
 
M

Michael Press

James Kuyper said:
If you ignore the possibility of overflow, the loop never ends; x and
step just keep doubling in size. I would normally say "until they
overflow", but if you're ignoring overflow, there's nothing else to
terminate the doubling. On those grounds, I think I'm well justified in
assuming that overflow was not merely relevant to the program, but was
in fact central to the whole purpose of that program. Writing such code
without thinking about overflow is more than a little crazy. I
considered it far less insulting to assume that you were making
unportable assumptions about the behavior on overflow, than to assume
that you were insane.

I know that the set int is finite, so eventually x cannot grow
any larger. Call it overflow if it pleases you.
I didn't realize that; I thought that, in 2's complement arithmetic, if
INT_MIN < x <= INT_MAX, the additive inverse of x was -x, while INT_MIN
is it's own additive inverse. The behavior of INT_MIN+INT_MIN is, of
course, undefined as far as the C standard is concerned, but on the 2's
complement machines I tried it on, it came up as 0, as I expected. Am I
missing something here?

No, I did not think through the case INT_MIN + INT_MIN.
...

If you don't have any idea which way your code will branch at it's if()
statements, how can you have any confidence in it's ability to produce
the desired results, whatever those might be?

I expect to get consistent results from a+b and a<b.
Which is also not guaranteed. However, the implementation I described
which always returns the contents of register 7 does not violate closure
of int under addition and multiplication;

An implementation that always returns register 7,
or only when !( x < x + step).

If the implementation always returns register 7,
then register 7 always contains INT_MAX. Does the
standard require that INT_MAX exist?
yet I doubt that your program
would work properly on such an implementation.

I do not see why not.
Machine returns contents of register 7.
Therefore register 7 contains INT_MAX,
or else it register 7 < x and the program
keeps plugging away, eventually adding 1 successively.
So you made a total of at least five assumptions not guaranteed to be
true by the C standard, in code posted as a response to my assertion
that "most of the ways I can think of for computing the range of a
signed type are subject to serious portability issues.". I will grant
you that you did not label your code as a counter-example to that
assertion, but if it was not so-intended, what was the point of posting
that code?

The only assumption I know to be violated is closure of addition.
The code can be made to work using only addition.
Tell me which implementations bomb on overflow.
I will be content to avoid them.

If the implementation does not bomb on overflow
what prevents my program from returning INT_MAX.
 
M

Michael Press

Keith Thompson said:
Michael Press said:
[snip]

Assumptions.

* There is a set called int.
* int is finite.
* There are binary operators {+, *} on int.
* int is closed under {+, *}.

signed ints aren't gaurenteed to be closed under {+, *}.

[snip]

So the program can bomb.
Is it time to fix the C standard to make int closed under {+,*}.

The argument against doing so is that different hardware handles
overflow in different ways. Imposing a single mechanism on all
implementations could cause serious performance issues on systems that
don't behave as the standard specifies. (Or, perhaps more likely, we
just wouldn't have conforming C implementations on those platforms.)

It's not quite as bad if we just require + and * to yield int results,
without specifying what those results are, but so far the feeling has
been that this is best left up to individual implementations.

Is it time to change that? Maybe. Can all the relevant parties agree
on that? Not likely.

Exactly what do an implementations do when they
do not return an int to on x + y?
 
S

Seebs

Exactly what do an implementations do when they
do not return an int to on x + y?

Well, one common form would be an arithmetic trap, same kind of thing
as you get when dividing by zero.

-s
 
J

James Kuyper

....
I know that the set int is finite, so eventually x cannot grow
any larger. Call it overflow if it pleases you.

The fact that x cannot grow any larger is not the issue. The fact that
your code attempts to calculate x+step when it's mathematical value is
too large to be stored in an int is an an issue, and overflow is the
term for that situation. It's not about pleasing me, the term is used by
the standard itself.

....
I expect to get consistent results from a+b and a<b.

That's yet another unjustified assumption, if a+b overflows. As
indicated below, the fact that it's unjustified is more important than I
had originally thought.
An implementation that always returns register 7,
or only when !( x < x + step).\

Sorry, my first statement was that it returns the value stored in
register 7 "for any int expression that overflows". My second statement
didn't include that qualifying clause, which may have caused confusion -
in particular, I can't make sense of several of your next few paragraphs
- that may be due to this confusion.

Note that what the comparison operator is comparing, when x+step
overflows, is the value of x and the value stored in register 7; the
comparison can be either true or false, depending upon what's in
register 7. There's no necessary relationship between x+step overflowing
and the value that results from the comparison expression.
If the implementation always returns register 7,
then register 7 always contains INT_MAX.

There's no such requirement. There are, in fact, no applicable
requirements when x+step overflows; that's what "undefined behavior" means.
... Does the
standard require that INT_MAX exist?

It requires that a conforming implementation provide a definition for
INT_MAX in that expands into a constant expression of type int. That
rules out, among other things, the possibility that INT_MAX might change
during the execution of the program.
I do not see why not.
Machine returns contents of register 7.
Therefore register 7 contains INT_MAX,

That does not follow.
or else it register 7 < x and the program
keeps plugging away, eventually adding 1 successively.

It wasn't until I sat down to answer this message that I took a really
close look at the program you wrote; I knew that it depended upon
undefined behavior, which guarantees that is non-portable; but I had not
previously realized how robust it was. Your program's correct behavior
depends on upon an assumption, and one that's not guaranteed to be true
by the standard; but it's an a far weaker assumption than I had thought
it was: it assumes that the result of an int expression that overflows
is an valid int value.

A constant value doesn't prevent your program from working, regardless
of what that value is. A random value doesn't prevent it from working,
though the process gets a little slower. The absolute worst case
behavior occurs when an overflowing x+step expression always has a value
of x+1. If int happens to be a 64-bit type, that behavior would make
your program take an absurdly long time determining the value of INT_MAX
- but it would still eventually determine it.

I wish I'd done that analysis earlier, so I wouldn't have wasted my time
arguing about what the possible values of x+step are.

The code is still not portable, because the result of an int expression
that overflows is not restricted to valid int values. Keith has already
given an example based upon real-life exposure to a system that used
64-bit hardware to emulate 48-bit integers; on such an implementation,
your code would end up setting x to 2^63-1, rather than the correct
value of 2^48-1. Since x does not have a valid int value, there are no
requirements on what value the printf() expression would print out when
x == 2^63-1.

The alternative is not simply having your program blow up. Other,
subtler possibilities allowed by the fact that the behavior of your
program is undefined include:
1. (y = x+step) > x might return true, even though y is actually less
than x.
2. (y = 2*step) > step might return true, even though y is actually less
than step; in particular, y might be negative, causing step to become
negative.
3. step might also, arbitrary, become negative for no particularly good
reason other than the fact that the behavior is undefined.
4. The lines of your program might start executing in an arbitrary
order, unrelated to the order that's required when the behavior is
actually defined by the standard.

I should have brought up these other possibilities in the first place; I
apologize for wasting your time on irrelevant possibilities.
The only assumption I know to be violated is closure of addition.
The code can be made to work using only addition.
Tell me which implementations bomb on overflow.
I will be content to avoid them.

That you choose to avoid them does not make your code portable, if it
won't work as intended on those implementations.
If the implementation does not bomb on overflow
what prevents my program from returning INT_MAX.

See above. The 64-bit/48-bit implementation Keith mentioned is probably
the single most plausible alternative behavior that would actually
prevent your program from printing INT_MAX, without bombing.
 
M

Michael Press

Seebs said:
Well, one common form would be an arithmetic trap, same kind of thing
as you get when dividing by zero.

Who benefits from having such an implementation available?
 
M

Michael Press

James Kuyper said:
The alternative is not simply having your program blow up. Other,
subtler possibilities allowed by the fact that the behavior of your
program is undefined include:
1. (y = x+step) > x might return true, even though y is actually less
than x.
2. (y = 2*step) > step might return true, even though y is actually less
than step; in particular, y might be negative, causing step to become
negative.
3. step might also, arbitrary, become negative for no particularly good
reason other than the fact that the behavior is undefined.
4. The lines of your program might start executing in an arbitrary
order, unrelated to the order that's required when the behavior is
actually defined by the standard.

Who benefits from having these implementations available?
 
B

Barry Schwarz

Who benefits from having such an implementation available?

I don't recall ever taking advantage of the feature so I can't say who
benefits but every system I have worked on since 1967 has add a trap
for integer overflow and integer underflow built into the hardware.
 
J

James Kuyper

Who benefits from having these implementations available?

Those were mere examples of possibilities, and I can't claim to know of
any specific benefit associated any of those particular possibilities.

However, the committee decided to leave the behavior on overflow
undefined, by failing to define it. It was not an accidental omission.
It is in fact cited, in the standard itself, as the prime example of how
behavior can be undefined by reason of a lack of an applicable
definition. The committee could, instead, have stated that the result on
overflow is unspecified, in which case your program would be guaranteed
to work perfectly. The fact that they chose to leave the behavior
undefined strongly suggests that they knew of particular contexts in
which it would have been difficult to implement C, if the integer
overflow was only allowed to generate an unspecified result.

In general, not having to worry about making the compiled code work
properly when there is an overflow (for any particular definition of
properly), removes a barrier to producing better optimization. I have no
idea whether there's any worthwhile optimizations on the other side of
that barrier, but I wouldn't recommend betting against the possibility
that there are.
 
B

Ben Bacarisse

Michael Press said:
Who benefits from having such an implementation available?

They are useful for debugging and testing. Arithmetic overflow is very
often an indicator of a program fault rather than being the intended
behaviour. It's handy to be able to catch it early.

There are program where, particularly during testing, it would be
helpful to treat even unsigned wrap-round as a trap condition and I
would not be surprised to find this option on some safety-critical
systems.

The question has another answer: they benefit people who have hardware
that behaves like this! This is, presumably, the main reason signed
arithmetic is defined this way in C.
 
S

Seebs

Who benefits from having such an implementation available?

People who make chips with that behavior?

There is a reason that so many processors trap on invalid operands: It
turns out to be a desireable behavior sometimes.

-s
 
L

lawrence.jones

Michael Press said:
Who benefits from having such an implementation available?

People writing safety-critical code usually prefer a trap (which can be
caught and handled) to quietly getting the wrong answer (which is harder
to detect and could have serious consequences).
 
M

Malcolm McLean

People writing safety-critical code usually prefer a trap (which can be
caught and handled) to quietly getting the wrong answer (which is harder
to detect and could have serious consequences).
Yes, video games are about the only application area where wrong
results are preferable to no results.
 
M

Michael Press

James Kuyper said:
Those were mere examples of possibilities, and I can't claim to know of
any specific benefit associated any of those particular possibilities.

However, the committee decided to leave the behavior on overflow
undefined, by failing to define it. It was not an accidental omission.
It is in fact cited, in the standard itself, as the prime example of how
behavior can be undefined by reason of a lack of an applicable
definition. The committee could, instead, have stated that the result on
overflow is unspecified, in which case your program would be guaranteed
to work perfectly. The fact that they chose to leave the behavior
undefined strongly suggests that they knew of particular contexts in
which it would have been difficult to implement C, if the integer
overflow was only allowed to generate an unspecified result.

In general, not having to worry about making the compiled code work
properly when there is an overflow (for any particular definition of
properly), removes a barrier to producing better optimization. I have no
idea whether there's any worthwhile optimizations on the other side of
that barrier, but I wouldn't recommend betting against the possibility
that there are.

I do not see why overflow is any kind of impediment.
Whatever is in the register is what you get.
Physical machines that can trap on overflow do not
have to trap on overflow.
 
M

Michael Press

Barry Schwarz said:
I don't recall ever taking advantage of the feature so I can't say who
benefits but every system I have worked on since 1967 has add a trap
for integer overflow and integer underflow built into the hardware.

Yes, the traps are there. The trap need not be taken.
If it is taken, control can still be returned to the
program.
 
M

Michael Press

Seebs said:
People who make chips with that behavior?

There is a reason that so many processors trap on invalid operands: It
turns out to be a desireable behavior sometimes.

And sometimes not. Offer a compiler flag that
makes arithmetic on integer types closed.
 
M

Michael Press

People writing safety-critical code usually prefer a trap (which can be
caught and handled) to quietly getting the wrong answer (which is harder
to detect and could have serious consequences).

People writing safety-critical code do not
range check arguments, and examine return values?
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
474,086
Messages
2,570,598
Members
47,221
Latest member
LashundaCh

Latest Threads

Top