mixing signed and unsigned

John Harrison · Feb 18, 2007

Ok, let me think this through carefully. I just feel in my bones that
something is wrong with the line

if (A <= MAX_A && n < static_cast<ptrdiff_t>(-A) ||

as a means of checking whether n below [A,0).

We know A is unsigned and

A <= MAX_A

where MAX_A == 1+static_cast<size_t>(std::numeric_limits<ptrdiff_t>::max())

i.e.

A in [0,MAX_A]

Thus -A in [2^N-1,2^N-MAX_A] or -A == 0.

Now, what happens if we cast -A to the signed type? For values of -A above
numeric_limits<ptrdiff_t> (and that is almost all values that occur here),
we have an implementation defined value [4.7/3].

This is probably bad.

Best

Kai-Uwe Bux

Oh well, back to the drawing board. Why can't the standard just say that
result of signed/unsigned comparisons is the mathematically correct one?

john

Kai-Uwe Bux · Feb 18, 2007

John said:
Ok, let me think this through carefully. I just feel in my bones that
something is wrong with the line

if (A <= MAX_A && n < static_cast<ptrdiff_t>(-A) ||

as a means of checking whether n below [A,0).

We know A is unsigned and

A <= MAX_A

where MAX_A ==
1+static_cast<size_t>(std::numeric_limits<ptrdiff_t>::max())

i.e.

A in [0,MAX_A]

Thus -A in [2^N-1,2^N-MAX_A] or -A == 0.

Now, what happens if we cast -A to the signed type? For values of -A
above numeric_limits<ptrdiff_t> (and that is almost all values that occur
here), we have an implementation defined value [4.7/3].

This is probably bad.

Best

Kai-Uwe Bux

Click to expand...

Oh well, back to the drawing board. Why can't the standard just say that
result of signed/unsigned comparisons is the mathematically correct one?

Now, that would take all the fun out of the built-in types, wouldn't it?

As far as I can tell, the standard make the following choice: the signed
types are supposed to be the machines native types. Accordingly, there are
very few guarantees about what they do. Being cynical, once coud say that
the best thing about them, is that it is mathematically defined how they
convert to the unsigned types. The unsigned types make string guarantees:
they implement arithmetic mod 2^N (N is the bitlength). This goes even for
multiplication. Thus unsigned arithmetic is perfectly well-defined.

The standard then had to decide what to do about mixed expressions. The
choice was made that in this case, predictability comes before performance.
The way to ensure predictability is to cast all operands in a mixed binary
expression to unsigned. You have a point with regard to the comparison
operators: it is rather un-natural.

My way to deal with this, is to test first whether the signed value is
negative. Then, the very next step has to be a cast. From then on,
everything is well-defined.

BTW: now, I think, I got a somewhat way of doing the problem. The code
reflects the natural order of the ranges, and uses only one cast. It may
trigger tons of warnings, though, about mixed comparisons.

#include <iostream>
#include <limits>

int main ( void ) {
unsigned long A = 30;
unsigned long B = 20;
unsigned long C = 100;
int x = 0;
while ( std::cin >> x ) {

if ( x < 0 ) {
// we cast first so that unary - does not cause an overflow:
if ( -static_cast<unsigned long>(x) > A ) {
// underflow: treated like overflow!
goto out_of_range;
} else {
std::cout << "range 1";
}
} else {
if ( x < B ) {
std::cout << "range 2";
} else if ( x < C ) {
std::cout << "range 3";
} else {
// overflow:
out_of_range:
std::cout << "out of range";
}
}

std::cout << '\n';
}
}

(You may consider the goto a hook for the maintenace programmer should the
need arise in the future to distinguish between overflow and underflow

Best

Kai-Uwe Bux

John Harrison · Feb 18, 2007

Kai-Uwe Bux said:
John Harrison wrote:

Ok, let me think this through carefully. I just feel in my bones that
something is wrong with the line

if (A <= MAX_A && n < static_cast<ptrdiff_t>(-A) ||

as a means of checking whether n below [A,0).

We know A is unsigned and

A <= MAX_A

where MAX_A ==
1+static_cast<size_t>(std::numeric_limits<ptrdiff_t>::max())

i.e.

A in [0,MAX_A]

Thus -A in [2^N-1,2^N-MAX_A] or -A == 0.

Now, what happens if we cast -A to the signed type? For values of -A
above numeric_limits<ptrdiff_t> (and that is almost all values that occur
here), we have an implementation defined value [4.7/3].

This is probably bad.

Best

Kai-Uwe Bux

Click to expand...

Oh well, back to the drawing board. Why can't the standard just say that
result of signed/unsigned comparisons is the mathematically correct one?

Click to expand...

Now, that would take all the fun out of the built-in types, wouldn't it?

As far as I can tell, the standard make the following choice: the signed
types are supposed to be the machines native types. Accordingly, there are
very few guarantees about what they do. Being cynical, once coud say that
the best thing about them, is that it is mathematically defined how they
convert to the unsigned types. The unsigned types make string guarantees:
they implement arithmetic mod 2^N (N is the bitlength). This goes even for
multiplication. Thus unsigned arithmetic is perfectly well-defined.

The standard then had to decide what to do about mixed expressions. The
choice was made that in this case, predictability comes before performance.
The way to ensure predictability is to cast all operands in a mixed binary
expression to unsigned. You have a point with regard to the comparison
operators: it is rather un-natural.

My way to deal with this, is to test first whether the signed value is
negative. Then, the very next step has to be a cast. From then on,
everything is well-defined.

BTW: now, I think, I got a somewhat way of doing the problem. The code
reflects the natural order of the ranges, and uses only one cast. It may
trigger tons of warnings, though, about mixed comparisons.

#include <iostream>
#include <limits>

int main ( void ) {
unsigned long A = 30;
unsigned long B = 20;
unsigned long C = 100;
int x = 0;
while ( std::cin >> x ) {

if ( x < 0 ) {
// we cast first so that unary - does not cause an overflow:
if ( -static_cast<unsigned long>(x) > A ) {
// underflow: treated like overflow!
goto out_of_range;
} else {
std::cout << "range 1";
}
} else {
if ( x < B ) {
std::cout << "range 2";
} else if ( x < C ) {
std::cout << "range 3";
} else {
// overflow:
out_of_range:
std::cout << "out of range";
}
}

std::cout << '\n';
}
}

(You may consider the goto a hook for the maintenace programmer should the
need arise in the future to distinguish between overflow and underflow

Best

Kai-Uwe Bux

Yes, I think that's the way to go. Thanks for your help.

john

Greg Herlihy · Feb 18, 2007

I have a problem. I want to compare an integral value, n, against three
half open ranges as follows

[-A, 0) // range 1
[0, B) // range 2
[B, C} // range 3

Each range corresponds to a different outcome and if the integral value
isn't within any of the ranges, that's a fourth outcome. So far so easy,
the problem is that n is a signed quantity and A, B, C are unsigned
quantities. Apart from this obscure corner of my code this makes perfect
sense, so I don't want to change the signedness of anything.

How to I write these tests so that my code is reasonably understandable,
rather than a horrible mess of casts and compiler warnings?

One more point, of the unsigned quantity, only B is guaranteed small
enough that it could be safely cast to a signed integer.

As long as the comparisons return the expected results - the values
being compared need not match the original ones:

if (( n - (-A)) >= 0 and n < 0)
{
// 1
}
else
if ( n >= 0 and n < B - 0 )
{
// 2
}
else
if ( n - B >= 0 and n - B < C - B))
{
// 3
}
else
{
// 4
}

Greg

Kai-Uwe Bux · Feb 18, 2007

Greg said:
I have a problem. I want to compare an integral value, n, against three
half open ranges as follows

[-A, 0) // range 1
[0, B) // range 2
[B, C} // range 3

Each range corresponds to a different outcome and if the integral value
isn't within any of the ranges, that's a fourth outcome. So far so easy,
the problem is that n is a signed quantity and A, B, C are unsigned
quantities. Apart from this obscure corner of my code this makes perfect
sense, so I don't want to change the signedness of anything.

How to I write these tests so that my code is reasonably understandable,
rather than a horrible mess of casts and compiler warnings?

One more point, of the unsigned quantity, only B is guaranteed small
enough that it could be safely cast to a signed integer.

Click to expand...

As long as the comparisons return the expected results - the values
being compared need not match the original ones:

if (( n - (-A)) >= 0 and n < 0)

Does not work for n == -40, A == 30: Will yield true where it should say
false.

{
// 1
}
else
if ( n >= 0 and n < B - 0 )

What is the "-0" supposed to accomplish?

{
// 2
}
else
if ( n - B >= 0 and n - B < C - B))

This also looks fishy: in "n - B" the signed n will be cast to unsigned
first. The difference will then be taken mod 2^N (N is the bitlength). The
result will _always_ be >= 0. Thus, the first test has no chance to fail.
It would be really surprising if a one-sided test can check membership in
an interval.

{
// 3
}
else
{
// 4
}

Best

Kai-Uwe Bux

Greg Herlihy · Feb 18, 2007

Greg said:
Greg said:

I have a problem. I want to compare an integral value, n, against three
half open ranges as follows

[-A, 0) // range 1
[0, B) // range 2
[B, C} // range 3

Each range corresponds to a different outcome and if the integral value
isn't within any of the ranges, that's a fourth outcome. So far so easy,
the problem is that n is a signed quantity and A, B, C are unsigned
quantities. Apart from this obscure corner of my code this makes perfect
sense, so I don't want to change the signedness of anything.

How to I write these tests so that my code is reasonably understandable,
rather than a horrible mess of casts and compiler warnings?

One more point, of the unsigned quantity, only B is guaranteed small
enough that it could be safely cast to a signed integer.

Click to expand...

As long as the comparisons return the expected results - the values
being compared need not match the original ones:

if (( n - (-A)) >= 0 and n < 0)

Click to expand...

Does not work for n == -40, A == 30: Will yield true where it should say
false.

True, the first test should be:

n >= -A and n - (-A) < 0 - (-A)

....which can be simplified (see below).

What is the "-0" supposed to accomplish?

Perhaps another 0 would help:

if ( n >= 0 and n - 0 < B - 0 )

The goal is to illustrate the formula being applied - which is:

(n >= lower) and ((n - lower) > (upper - lower))

This also looks fishy: in "n - B" the signed n will be cast to unsigned
first. The difference will then be taken mod 2^N (N is the bitlength). The
result will _always_ be >= 0. Thus, the first test has no chance to fail.
It would be really surprising if a one-sided test can check membership in
an interval.

True, I misapplied the formula in the last case. So with the corrections,
the entire sequence of tests would be:

if ( n >= -A and n + A < A )
{
// 1
}
else
if ( n >= 0 and n - 0 < B - 0 )
{
// 2
}
else
if ( n >= B and n - B < C - B )
{
// 3
}
else
{
// 4
}

Greg

Dave Rahardja · Feb 18, 2007

I have a problem. I want to compare an integral value, n, against three
half open ranges as follows

[-A, 0) // range 1
[0, B) // range 2
[B, C} // range 3

Each range corresponds to a different outcome and if the integral value
isn't within any of the ranges, that's a fourth outcome. So far so easy,
the problem is that n is a signed quantity and A, B, C are unsigned
quantities. Apart from this obscure corner of my code this makes perfect
sense, so I don't want to change the signedness of anything.

How to I write these tests so that my code is reasonably understandable,
rather than a horrible mess of casts and compiler warnings?

One more point, of the unsigned quantity, only B is guaranteed small
enough that it could be safely cast to a signed integer.

Click to expand...

Not the most efficient technique, but quite understandable:

if (n < 0) // signed comparison
{
if (n >= -A) // signed
{
// 1
}
else
{
// 4
}
}
else if (n < B) // unsigned
{
// 2
}
else if (n < C) // unsigned
{
// 3
}
else
{
// 4
}

Note that case 4 appears twice--hence the inefficiency. But I don't think you
can state the logic in any simpler manner.

-dr

enums and signed/unsigned	4	May 12, 2013
mixing signed and unsigned	3	Mar 6, 2006
signed vs. unsigned multiplication	8	Jun 17, 2012
Mixing signed and unsigned code in an applet	0	Mar 7, 2011
[half OT] About the not-in-common range of signed and unsigned char	6	Jul 14, 2010
Why is the return type of count_if() "signed" rather than "unsigned"?	2	Jun 22, 2010
Performance of signed vs unsigned types	84	Apr 20, 2011
conversion from unsigned to signed with a value range shift	5	Sep 21, 2009

mixing signed and unsigned

John Harrison

Kai-Uwe Bux

John Harrison

Greg Herlihy

Kai-Uwe Bux

Greg Herlihy

Dave Rahardja

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads