Unsigned types

Malcolm McLean

On Saturday, 21 July 2012 18:27:33 UTC+1, Bart wrote:
Malcolm McLean <[email protected]> wrote in message


Not completely. A 1 pixel x 3 billion pixel black-and-white image only uses
375MB.
That's why int should be 64 bits on 64 bit machines.

You could technically have a 3 billion x 1 pixel image on a 32 bit machine, but you'll only have one of them in memory at any one time, and it can't be a coloured image. But a lot of operating systems won't allow such a big array. It's annoying that signed int doesn't handle this case, but the situation is rare enough that we can live with it.

Most people with a need to handle such images will move to 64 bit operating systems with more memory. If int is 64 bits, the code will work without any need for a rewrite.
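The arithmetic behind the 375MB figure, and the reason a 3 billion pixel count is awkward for a 32-bit signed int, can be sketched roughly as follows. The helper names are hypothetical, and the fixed-width types are an assumption standing in for the "32-bit int" under discussion.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bytes needed for a 1-bit-per-pixel image, rounded up to whole bytes.
   64-bit arithmetic keeps a 3 billion pixel count representable. */
uint64_t bytes_for_1bpp(uint64_t width, uint64_t height)
{
    uint64_t pixels = width * height;
    return (pixels + 7) / 8;
}

/* Would this count fit in a 32-bit signed int? */
bool fits_int32(uint64_t n)
{
    return n <= (uint64_t)INT32_MAX;
}

/* Would it fit in a 32-bit unsigned int? */
bool fits_uint32(uint64_t n)
{
    return n <= (uint64_t)UINT32_MAX;
}
```

With a width of 3,000,000,000 and a height of 1, the image needs 375,000,000 bytes. The pixel count is above INT32_MAX (2,147,483,647) but below UINT32_MAX (4,294,967,295), which is the gap being argued over.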
 
Seungbeom Kim

It's also a way to help with some calculations. For example,
you can start from a given date, subtract fourteen from tm_mday,
re-normalize, and easily find out "What date was a fortnight
before January 5?"
Exactly.


You can't do that with signed integers, either.

Of course, signed integers are not without their limits (and the design
limits should lie safely within the environmental limits).

However, *most* of the values we deal with are in small ranges that are
not too far from zero; ranges near INT_MAX, for example, but far from
zero are uncommon, if any, and they may need an upgrade to a larger
integer type anyway.

Given that, getting around in signed is usually safe, as you're near
the center, far from the edges, of the field. Doing that in unsigned,
however, is like playing near the edge and risks falling off.


                           ,- usually here
                        *****
signed    [-------+-------+-------+-------]XXXX DANGER
          |               |               |
      -INT_MAX            0            INT_MAX         UINT_MAX
                          |               |               |
unsigned       DANGER XXXX[-------+-------+-------+-------]

And in the cases where naive subtraction doesn't work, you don't
just get a possibly surprising but predictable outcome: You get
undefined behavior.

True. But (2U - 5U) yielding 4294967293U is not very useful, no matter
how predictable it is, and failing to take that into account is a bug,
just as getting undefined behavior from signed arithmetic overflow is.
Oh, come on! You might just as well complain about double-ness
being "contagious."

Their "contagiousnesses" per se may be similar, but their effects are
different; conversion from integer to double doesn't (usually) change
the value being converted, or the result of the comparison at least,
but conversion from signed to unsigned often does, often in a very
surprising way, which is exactly my point in the paragraph above.
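A minimal sketch of the difference described above, assuming an ordinary implementation where int and unsigned int have the same width. The function names are hypothetical and exist only for illustration.

```c
#include <stdbool.h>
#include <limits.h>   /* UINT_MAX, handy when checking the results */

/* int -> double: the value survives the conversion. */
double as_double(int x)
{
    return (double)x;
}

/* int -> unsigned: a negative value is reduced modulo UINT_MAX + 1. */
unsigned int as_unsigned(int x)
{
    return (unsigned int)x;
}

/* Both operands are int: an ordinary signed comparison. */
bool less_signed(int a, int b)
{
    return a < b;
}

/* Mixed comparison. C converts the int operand to unsigned implicitly
   in `a < b`; the cast here only makes that conversion visible. */
bool less_mixed(int a, unsigned int b)
{
    return (unsigned int)a < b;
}
```

Converting -1 to double gives -1.0, while converting it to unsigned gives UINT_MAX. As a result less_signed(-1, 5) is true but less_mixed(-1, 5u) is false.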
Besides, why blame the surprises on the unsigned operand? They
don't arise from either one of the operands in isolation, but from
the combination of the two -- so the signed operand is every bit as
much to blame as the unsigned. If you blame one, you should blame
the other equally.[*]

[*] Okay, that doesn't always happen in real life: Doheny was
acquitted of offering the bribe that Fall was convicted of taking.
But Roaring Twenties jurisprudence is a poor model for programming!

Sorry, I don't understand the footnote.

Anyway, you're right that the surprises arise from the combination.
If your values are usually closer to the upper limit of signed (say,
INT_MAX) than to zero, then signed gets in the way more often and may
deserve more "blame." If, on the other hand, they are usually closer
to the lower limit of unsigned, i.e. zero, then unsigned more often
gets in the way. If both cases happen equally, you may want to blame
them equally. In most of the cases I encounter, they don't.
Your mileage may vary, though.
... or when mixing signed integer with long double complex, or
when mixing unsigned long with pointer-to-pointer-to-T, or ... In
fact, your recommendation to "be careful when..." can be improved
by deleting "when" and everything after it. Just be careful, okay?

It's not a unanimous improvement, as it makes my statement more general
and more "vacuously true." Maybe I should have said "be *more* careful"
to be clearer, but I won't delete the when-clause. I don't object to
your being careful in everything, though.
 
BartC

Seungbeom Kim said:
On 7/20/2012 5:11 PM, Seungbeom Kim wrote:

You can't do that with signed integers, either.

Of course, signed integers are not without their limits (and the design
limits should lie safely within the environmental limits).

However, *most* of the values we deal with are in small ranges that are
not too far from zero; ranges near INT_MAX, for example, but far from
zero are uncommon, if any, and they may need an upgrade to a larger
integer type anyway.

Given that, getting around in signed is usually safe, as you're near
the center, far from the edges, of the field. Doing that in unsigned,
however, is like playing near the edge and risks falling off.


                           ,- usually here
                        *****
signed    [-------+-------+-------+-------]XXXX DANGER
          |               |               |
      -INT_MAX            0            INT_MAX         UINT_MAX
                          |               |               |
unsigned       DANGER XXXX[-------+-------+-------+-------]

And in the cases where naive subtraction doesn't work, you don't
just get a possibly surprising but predictable outcome: You get
undefined behavior.

True. But (2U - 5U) yielding 4294967293U is not very useful, no matter
how predictable it is

You've put the point across much better than I've ever been able to, in many
more posts (-1<5 is true, -1<5u is false, etc.)

But, the language is apparently always right. Even if it's due to 'existing
practice'.
 
Ben Bacarisse

Malcolm McLean said:
On Saturday, 21 July 2012 17:48:51 UTC+1, Ben Bacarisse wrote:
Malcolm McLean (e-mail address removed) writes:

These arguments are about code that no one has written, or should write.
It's a straw man. Why not just show how much simpler your code is than
mine when getcursorposition uses int *s rather than unsigned int *s?
That will make it clear just how much damage the using of unsigned has
introduced.
void drawoctogonroundcursor(void)
{
    int octx[8];
    int octy[8];
    int cx, cy;
    int d = 3; /* this gives the size of the octogon step */

    getcursorposition(&cx, &cy);
    octx[0] = cx-d;   octy[0] = cy-2*d;
    octx[1] = cx+d;   octy[1] = cy-2*d;
    octx[2] = cx+2*d; octy[2] = cy-d;
    octx[3] = cx+2*d; octy[3] = cy+d;
    octx[4] = cx+d;   octy[4] = cy+2*d;
    octx[5] = cx-d;   octy[5] = cy+2*d;
    octx[6] = cx-2*d; octy[6] = cy+d;
    octx[7] = cx-2*d; octy[7] = cy-d;

    drawpolygon(octx, octy, 8);

}

There's no messing about.

So to sum up, the messing about caused by an (in my opinion
inappropriate) use of unsigned is a declaration and two assignments. I
don't see that as a strong reason to discourage their use, especially in
more appropriate situations. For example, is the third parameter of
drawpolygon signed or unsigned?

I don't want to be dogmatic about this -- I am sure there are cases
where someone having chosen the "wrong" signedness causes problems, but
I don't think this is a clear-cut example. In contrast, the cases where
you need the pure binary semantics of C's unsigned types are, in my
experience, more common (but then I used to write cryptographic code).
We can focus completely on the drawing
logic, which I might have got wrong.

Well, I assumed you meant a regular octagon, which makes a loop the
obvious way to go, but that's not really the issue here.
 
Malcolm McLean

On Saturday, 21 July 2012 22:35:35 UTC+1, Ben Bacarisse wrote:
Malcolm McLean <[email protected]> writes:

So to sum up, the messing about caused by an (in my opinion
inappropriate) use of unsigned is a declaration and two assignments.

Well, I assumed you meant a regular octagon, which makes a loop the
obvious way to go, but that's not really the issue here.
A regular octagon would have to take reals. Then you don't see the problem, as there isn't an unsigned real type. A drawrectangle() could reasonably be written to take four parameters, so here you don't really see the problem either. You might need an ugly cast, but that's not too damaging.
The problem comes when you start passing integers about by indirection. A sane integer-based drawpolygon() function has to accept a pointer to signed integers, maybe wrapped in a structure. But cx and cy are unsigned. So either you cast every assignment, which is a lot of extra characters, or you create special variables signed_cx and signed_cy. Both are a real nuisance.
I know that you can specify the octagon as offsets and then fill the buffer in a loop. There are ways round it. But the types of integers shouldn't be a factor in that consideration about how to write the function.
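The two workarounds mentioned here, signed copies of the cursor position and an offset table filled in a loop, might look something like the sketch below. The getcursorposition() body is a hypothetical stand-in that returns a fixed position through unsigned pointers, and octagon_around() and octagon_round_cursor() are names invented for this illustration. The offsets reproduce the eight vertices of the earlier drawoctogonroundcursor() code.

```c
/* Hypothetical stand-in: a cursor API that reports unsigned coordinates. */
void getcursorposition(unsigned int *x, unsigned int *y)
{
    *x = 100u;
    *y = 50u;
}

/* Offsets of the eight vertices, in units of the step d. */
static const int OCT_DX[8] = { -1,  1,  2, 2, 1, -1, -2, -2 };
static const int OCT_DY[8] = { -2, -2, -1, 1, 2,  2,  1, -1 };

/* Fill the vertex buffers around (cx, cy) using signed arithmetic. */
void octagon_around(int cx, int cy, int d, int octx[8], int octy[8])
{
    for (int i = 0; i < 8; i++) {
        octx[i] = cx + OCT_DX[i] * d;
        octy[i] = cy + OCT_DY[i] * d;
    }
}

/* The "signed copies" workaround: convert once, then work in int. */
void octagon_round_cursor(int octx[8], int octy[8])
{
    unsigned int ucx, ucy;
    getcursorposition(&ucx, &ucy);

    int signed_cx = (int)ucx;   /* assumes the position fits in int */
    int signed_cy = (int)ucy;

    octagon_around(signed_cx, signed_cy, 3, octx, octy);
}
```

The conversion happens in exactly one place, which is the "declaration and two assignments" cost discussed above. Vertices to the left of or above the origin come out negative, which is why the buffers themselves are signed.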
 
Tim Rentsch

Eric Sosman said:
Things that you assume should be non-negative often turn out to be
not always so. For example, though no one thinks of a negative day
of the month, the tm_* members of struct tm are signed, because you
sometimes need to be able to represent out-of-range values.

It's also a way to help with some calculations. For example,
you can start from a given date, subtract fourteen from tm_mday,
re-normalize, and easily find out "What date was a fortnight
before January 5?" [snip unrelated]

Can also easily be done using unsigned types.
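For reference, the signed struct tm normalization quoted above can be sketched with the standard mktime() function. The helper name days_before() and the choice of year in the usage note are assumptions for illustration; the thread does not specify a year.

```c
#include <time.h>

/* Return the calendar date `days` days before the given date by letting
   mktime() normalize an out-of-range tm_mday. The return value of
   mktime() is not checked in this sketch. */
struct tm days_before(int year, int month, int day, int days)
{
    struct tm t = {0};
    t.tm_year  = year - 1900;  /* tm_year counts from 1900 */
    t.tm_mon   = month - 1;    /* tm_mon is 0-based */
    t.tm_mday  = day - days;   /* may be zero or negative, e.g. 5 - 14 = -9 */
    t.tm_hour  = 12;           /* midday, to stay clear of DST transitions */
    t.tm_isdst = -1;           /* let the library work out DST */
    mktime(&t);                /* normalizes the fields in place */
    return t;
}
```

Calling days_before(2012, 1, 5, 14) stores -9 in tm_mday before normalization and yields 22 December 2011 afterwards (tm_mon of 11, tm_mday of 22).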
 
Tim Rentsch

Seungbeom Kim said:
On 2012-07-20 14:41, Eric Sosman wrote: [snip]
And in the cases where naive subtraction doesn't work, you don't
just get a possibly surprising but predictable outcome: You get
undefined behavior.

True. But (2U - 5U) yielding 4294967293U is not very useful, no matter
how predictable it is, [snip]

That's an overstatement. It may not be useful in ways that
you would like it to be, or in all the cases you would like
it to be, but there still are lots of cases where it is
useful. To give one example, if we want to test if
variable 'i' is in the range [20 .. 30), with signed
arithmetic that takes two comparisons, but with unsigned
arithmetic this can be done with one comparison:

if( (i-20u) < 10 ) ...

Unsigned arithmetic doesn't behave in a way that most
people are used to, but the behavior it has still is
quite useful, in lots of different ways.
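A small sketch comparing the two forms of the range test. The function names are hypothetical. The general helper assumes lo <= hi, which is a precondition added here and not something stated in the post.

```c
#include <stdbool.h>

/* Two comparisons on a signed value. */
bool in_20_30_signed(int i)
{
    return i >= 20 && i < 30;
}

/* One comparison: anything below 20 wraps round to a very large
   unsigned number after the subtraction, so it fails the test. */
bool in_20_30_unsigned(int i)
{
    return ((unsigned int)i - 20u) < 10u;
}

/* The same idea for an arbitrary half-open range [lo, hi).
   Assumes lo <= hi. */
bool in_half_open(unsigned int x, unsigned int lo, unsigned int hi)
{
    return (x - lo) < (hi - lo);
}
```

Both versions of the [20 .. 30) test agree for every int value, including negative ones, because the conversion to unsigned maps negative values far above 30.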
 
BartC

Tim Rentsch said:
Seungbeom Kim said:
On 2012-07-20 14:41, Eric Sosman wrote: [snip]
And in the cases where naive subtraction doesn't work, you don't
just get a possibly surprising but predictable outcome: You get
undefined behavior.

True. But (2U - 5U) yielding 4294967293U is not very useful, no matter
how predictable it is, [snip]

That's an overstatement. It may not be useful in ways that
you would like it to be, or in all the cases you would like
it to be, but there still are lots of cases where it is
useful. To give one example, if we want to test if
variable 'i' is in the range [20 .. 30), with signed
arithmetic that takes two comparisons, but with unsigned
arithmetic this can be done with one comparison:

if( (i-20u) < 10 ) ...

You're replacing two comparisons, with a comparison and a subtraction, and
now have code that will cause people to scratch their heads (with the
mysterious introduction of '10' and '20u') instead of just writing:

if (i>=20 && i<30)

and leaving it to the compiler to sort out, as we're constantly told to do.
It's possible also that i will be <20, so only one comparison gets done
anyway.

(I do sometimes use the idiom, in assembly, but when i needs to be offset
to start at zero anyway.)
Unsigned arithmetic doesn't behave in a way that most
people are used to

'Unintuitive'
 
Phil Carmody

Seungbeom Kim said:
True. But (2U - 5U) yielding 4294967293U is not very useful, no matter
how predictable it is, and failing to take that into account is a bug,
just as getting undefined behavior from signed arithmetic overflow is.

unsigned int find_new_right(unsigned int old_left,
                            unsigned int old_right,
                            unsigned int new_left)
{
    return new_left + (old_right - old_left);
}

In calling find_new_right(5,2,100) I have "failed to take into account"
that the 2U-5U within its evaluation is 4294967293U.

However, there's no bug.

So it's not comparable to the undefined behaviour from overflow when
using signed arithmetic.
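To make the cancellation concrete, here is a sketch that repeats the function so the block stands alone and adds a hypothetical helper, width_of(), exposing the intermediate difference. Unsigned arithmetic in C is defined modulo UINT_MAX + 1, so the wrapped difference and the later addition cancel whatever the width of unsigned int is.

```c
#include <limits.h>   /* UINT_MAX, handy when checking the results */

/* The function from the post, repeated so the sketch is self-contained. */
unsigned int find_new_right(unsigned int old_left,
                            unsigned int old_right,
                            unsigned int new_left)
{
    return new_left + (old_right - old_left);
}

/* Hypothetical helper: the intermediate difference on its own.
   It wraps modulo UINT_MAX + 1 when old_right < old_left. */
unsigned int width_of(unsigned int old_left, unsigned int old_right)
{
    return old_right - old_left;
}
```

Here width_of(5, 2) is UINT_MAX - 2 (4294967293 with a 32-bit unsigned int), yet find_new_right(5, 2, 100) is 97, the same answer ordinary arithmetic gives for 100 + (2 - 5).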

Phil
--
I'd argue that there is much evidence for the existence of a God.
Pics or it didn't happen.
-- Tom (/. uid 822)
 
BartC

Phil Carmody said:
unsigned int find_new_right(unsigned int old_left,
                            unsigned int old_right,
                            unsigned int new_left)
{
    return new_left + (old_right - old_left);
}

In calling find_new_right(5,2,100) I have "failed to take into account"
that the 2U-5U within its evaluation is 4294967293U.

However, there's no bug.

Yes, you sometimes get these seriously weird-looking intermediate results
that often magically get cancelled out. But not always.

Try (2u - 5u) > 100u;
 
Phil Carmody

BartC said:
Yes, you sometimes get these seriously weird-looking intermediate
results that often magically get cancelled out. But not always.

Try (2u - 5u) > 100u;

Did. Worked as expected. As TR points out, you've just verified
that 2 is not in the range [5..105].

Phil
 
Malcolm McLean

On Sunday, 22 July 2012 12:29:21 UTC+1, Phil Carmody wrote:
Seungbeom Kim <[email protected]> writes:
unsigned int find_new_right(unsigned int old_left,
                            unsigned int old_right,
                            unsigned int new_left)
{
    return new_left + (old_right - old_left);
}

In calling find_new_right(5,2,100) I have failed to take into account
that the 2U-5U within its evaluation is 4294967293U.

However, there's no bug.

So it's not comparable to the undefined behaviour from overflow when
using signed arithmetic.
Let's say int is 8 bits to make the numbers human-readable.

We pass in 0, 255 and 10. We get the result 9. The program then chugs through with a wrong value.

Now let's examine the signed case. We pass in 0 and 127 and 10 (we can't represent numbers greater than 127, but let's say 100 is the sane limit). 10 + 127 is 137. So if we've got a decent platform, the program comes juddering to a halt with the message "arithmetical overflow". Is this better or worse? It depends what you're doing. If you're writing a video game then the baddy jumps to column 9 of the screen and you hope that the player just thinks it's a feature of the game. If you're giving a dose of a radioactive substance to a cancer patient, it's much better to give the error message.
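The 8-bit thought experiment can be approximated in real C with the fixed-width types from <stdint.h>. This is only a sketch: the function names are hypothetical, and the explicit range check stands in for the "decent platform" that reports overflow, since standard C itself leaves signed overflow undefined and does not promise a trap.

```c
#include <stdbool.h>
#include <stdint.h>

/* 8-bit unsigned version: the result is silently reduced modulo 256. */
uint8_t find_new_right_u8(uint8_t old_left, uint8_t old_right,
                          uint8_t new_left)
{
    return (uint8_t)(new_left + (old_right - old_left));
}

/* 8-bit signed version with an explicit range check. Returns false,
   leaving *result untouched, when the true answer does not fit. */
bool find_new_right_i8(int8_t old_left, int8_t old_right,
                       int8_t new_left, int8_t *result)
{
    /* Operands are promoted to int, so this sum cannot overflow. */
    int wide = new_left + (old_right - old_left);

    if (wide < INT8_MIN || wide > INT8_MAX)
        return false;           /* the "arithmetical overflow" case */

    *result = (int8_t)wide;
    return true;
}
```

With inputs 0, 255 and 10 the unsigned version returns 9 without complaint. With inputs 0, 127 and 10 the signed version reports failure, because 137 is outside the int8_t range.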
 
gwowen

To give one example, if we want to test if
variable 'i' is in the range [20 .. 30), with signed
arithmetic that takes two comparisons, but with unsigned
arithmetic this can be done with one comparison:

One comparison and a subtraction.
    if(  (i-20u) < 10  ) ...

That's awful code. I mean, it'd be fine for an obfuscated C contest as
it's correct, but awful. It's created for a computer to understand,
but not for a human. You've written an arithmetic expression whose
meaning changes when you reorder the terms. Yes, it's technically
correct but *that's not the only sort of correct*.

Is the best reason for unsigned types so that you can write truly
awful code that demonstrates how au fait with the standard you are?
 
Malcolm McLean

On Monday, 23 July 2012 19:02:54 UTC+1, Kenneth Brody wrote:
On 7/21/2012 2:37 PM, Malcolm McLean wrote:

On the other hand, there's probably more software that assumes that int
is 32 bits which would be 'broken'[1] than there would be
software ? as you describe.
There's a lot of code which assumes that int can index an array. Strictly you should use size_t as a general-purpose indexing variable. But almost no-one does this. The implications are just too horrible.
But data represents something in the real world. It's very rare to have more than 2 billion measurements or observations, and to want to handle them all in one pass. An important exception is fake data designed by malicious users trying to bring down the system.
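As a rough illustration of what indexing with size_t involves, the sketch below shows a forward loop and a backward loop. The function names are hypothetical. The backward loop is one place where an unsigned index needs extra care, because a test such as i >= 0 is always true for size_t.

```c
#include <stddef.h>

/* Sum an array using a size_t index. */
double sum_forward(const double *a, size_t n)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Index of the last element equal to x, or n if there is none.
   The loop tests i > 0 and then decrements, so i never wraps
   while it is being used as an index. */
size_t find_last(const double *a, size_t n, double x)
{
    for (size_t i = n; i-- > 0; )
        if (a[i] == x)
            return i;
    return n;
}
```

Returning n for "not found" avoids needing a negative sentinel such as -1, which a size_t cannot represent.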
 
Tim Rentsch

gwowen said:
To give one example, if we want to test if
variable 'i' is in the range [20 .. 30), with signed
arithmetic that takes two comparisons, but with unsigned
arithmetic this can be done with one comparison:

One comparison and a subtraction.

Why, yes, thank you for pointing out that obvious and completely
incidental fact.
That's awful code. I mean, it'd be fine for an obfuscated C contest as
it's correct, but awful. It's created for a computer to understand,
but not for a human. You've written an arithmetic expression whose
meaning changes when you reorder the terms. Yes, it's technically
correct but *that's not the only sort of correct*.

Is the best reason for unsigned types so that you can write truly
awful code that demonstrates how au fait with the standard you are?

Personally I think this idiom is easy to learn and also a good
one to know, but that's really beside the point, which was to
illustrate a property that unsigned types have that is useful.
Whether this is done in open code, or encapsulated inside a
macro, or in code that is programmatically generated, doesn't
change the key result, viz., that unsigned types have some useful
properties that signed types don't.
 
Tim Rentsch

BartC said:
Tim Rentsch said:
Seungbeom Kim said:
On 2012-07-20 14:41, Eric Sosman wrote: [snip]
And in the cases where naive subtraction doesn't work, you don't
just get a possibly surprising but predictable outcome: You get
undefined behavior.

True. But (2U - 5U) yielding 4294967293U is not very useful, no matter
how predictable it is, [snip]

That's an overstatement. It may not be useful in ways that
you would like it to be, or in all the cases you would like
it to be, but there still are lots of cases where it is
useful. To give one example, if we want to test if
variable 'i' is in the range [20 .. 30), with signed
arithmetic that takes two comparisons, but with unsigned
arithmetic this can be done with one comparison:

if( (i-20u) < 10 ) ...

You [snip] now have code that will cause people to scratch their heads
[snip]

Perhaps so, but the point was to illustrate the property,
not to illustrate code that might take advantage of it.
In this case, for example, people who aren't used to this
idiom could bundle it up inside a macro definition.
 
