unsigned short addition/subtraction overflow

O

Old Wolf

If we have two "unsigned short"s, values 65535 and 3 respectively,
and go to add them, we continue to have "case a" and "case b".
In case (a), the sum is 65535U + 3U, which has type unsigned int
and value 2. In case (b), the sum is 65535 + 3, which has type
signed int and value 65538.

Just to make sure this is absolutely clear in my mind:
You are saying that the type of the expression
(unsigned short)65535 + (unsigned short)3
could be something other than "unsigned short" ?

Also, if we have
unsigned short a = 65535, b = 3;
then could the type of
(a + b)
not be "unsigned short" ?

This would be relevant to templates and/or function overloading, eg.
void f(unsigned int t) { /*...*/ }
void f(signed int t} { /*...*/ }

and you went: f((unsigned short)65535 + (unsigned short) 3)
or: f(65535u + 3u)
or, with the above declarations:
f(a+b) // might not even know till runtime


Finally, I had been taught that "65535u" means (unsigned int)65535
which is a whole different kettle of fish to (unsigned short)65535.
Is this correct?
 
P

Peter Nilsson

Old Wolf said:
Just to make sure this is absolutely clear in my mind:

You should get a good C book, or check out the Standard(s) or a draft like
N869.
You are saying that the type of the expression
(unsigned short)65535 + (unsigned short)3
could be something other than "unsigned short" ?

It _has_ to be.
Also, if we have
unsigned short a = 65535, b = 3;
then could the type of
(a + b)
not be "unsigned short" ?

The type of (a + b) will _never_ be unsigned short, since integer operands
of + are always subject to integral promotion.

Try the following program on an implementation where short and int have
different sizes:

#include <stdio.h>

int main(void)
{
unsigned short a = 65530;
unsigned short b = 3;

printf("%u\n", (unsigned) sizeof( a ));
printf("%u\n", (unsigned) sizeof( + a ));
printf("%u\n", (unsigned) sizeof( a + b ));

return 0;
}
This would be relevant to templates and/or function overloading, eg.
void f(unsigned int t) { /*...*/ }
void f(signed int t} { /*...*/ }

Ask a C++ group.
Finally, I had been taught that "65535u" means (unsigned int)65535
which is a whole different kettle of fish to (unsigned short)65535.
Is this correct?

Define 'kettle of fish'.

A singular u suffix means the constant (if valid) is an unsigned integral
type starting in rank from unsigned int, through the higher ranked standard
unsigned integers, then (for C99) suitable extended integer types(*). For
65535u, it's an unsigned int.

(*) It seems that constants larger than the standard integer type range need
not necessarily attain the lowest ranked eligible extended integer type.
 
A

Andy

Peter Nilsson said:
It _has_ to be.


The type of (a + b) will _never_ be unsigned short, since integer operands
of + are always subject to integral promotion.

Yes, but if you assign the result to an unsigned short variable, then
the result will be truncated and is still unsigned short and you still have
modulus 2^n (n=bit size of unsigned short) arithmetic. Am I right? eg.

unsigned short i = (unsigned short)65535 + (unsigned short)3;

result of i is 2. Correct?

Andy
 
P

pete

Andy said:
Yes, but if you assign the result to an unsigned short variable,
then the result will be truncated and is
still unsigned short and you still have
modulus 2^n (n=bit size of unsigned short) arithmetic.
Am I right? eg.

unsigned short i = (unsigned short)65535 + (unsigned short)3;

result of i is 2. Correct?

The cast operates on the constants.
The results of the casts, are the operands to the addition operator.
The operands of the addition operator are promoted to either
int or unsigned.
The right operand of the assignment operator
is converted to the type of the left.

If INT_MAX is 32767
then the operands will be promoted to unsigned.
If the result of the addition is greater than UINT_MAX
then the result will be reduced to 2,
otherwise the result of the addition will be 65538.

if INT_MAX is greater than 32767
then the operands will be promoted to int.
If the result of the addition is greater than INT_MAX,
then the behavior is undefined,
otherwise the result of the addition will be 65538.

If the result of the addition is greater than USHRT_MAX,
then i will be assigned a value of 2,
otherwise, i will be asigned the value of the result of the addition,
unless the addition caused undefined behavior.
 
A

Andy

pete said:
if INT_MAX is greater than 32767
then the operands will be promoted to int.
If the result of the addition is greater than INT_MAX,
then the behavior is undefined,
otherwise the result of the addition will be 65538.

If the result of the addition is greater than USHRT_MAX,
then i will be assigned a value of 2,
otherwise, i will be asigned the value of the result of the addition,
unless the addition caused undefined behavior.

Can unsigned short be more than 2 bytes long? If not, then
(unsigned short)c1 + (unsigned short)c2 will never cause undefined
behavior because
1. If INT_MAX == USHRT_MAX, then c1 and c2 will be promoted
to unsigned short and unsigned short additions will never
cause undefined behaviors.
2. If INT_MAX > USHRT_MAX, then c1 and c2 will be promoted to
int, but c1+c2 is always <= 2*65535 so the addition can
never overflow because INT_MAX is >= 4 bytes.

So then is correct to say that (unsigned short)c1 + (unsigned short)c2
will never cause undefined behaviors? The reason is because in the
book "C, Traps and Pitfalls", the author states that unsigned arithmetic
shall always be done modulus 2^n manner where n is the number of bits
for the unsigned variable. But the edition I got is old (~1991) and
predates the ANSI standard. How about (unsigned long)c1 + (unsigned long)c2?

Andy
 
R

Richard Heathfield

Andy said:
Can unsigned short be more than 2 bytes long?

Yes. On a Cray, for example, it might easily be eight bytes long.

book "C, Traps and Pitfalls", the author states that unsigned arithmetic
shall always be done modulus 2^n manner where n is the number of bits
for the unsigned variable. But the edition I got is old (~1991) and
predates the ANSI standard.

The ANSI Standard dates from 1989. The author's statement is, however,
correct.
 
P

pete

Andy said:
Can unsigned short be more than 2 bytes long? If not, then
(unsigned short)c1 + (unsigned short)c2 will never cause undefined
behavior because
1. If INT_MAX == USHRT_MAX, then c1 and c2 will be promoted
to unsigned short and unsigned short additions will never
cause undefined behaviors.

Nothing ever gets promoted to unsigned short.
"the integer promotions" are either to int or to unsigned int.

N869
6.3.1 Arithmetic operands
6.3.1.1 Boolean, characters, and integers
[#2]
If an int can represent all values of the original type, the
value is converted to an int; otherwise, it is converted to
an unsigned int. These are called the integer
promotions.
2. If INT_MAX > USHRT_MAX, then c1 and c2 will be promoted to
int, but c1+c2 is always <= 2*65535 so the addition can
never overflow because INT_MAX is >= 4 bytes.

So then is correct to say that
(unsigned short)c1 + (unsigned short)c2
will never cause undefined behaviors?
No.

The reason is because in the
book "C, Traps and Pitfalls",
the author states that unsigned arithmetic
shall always be done modulus 2^n manner where n is the number of bits
for the unsigned variable. But the edition I got is old (~1991) and
predates the ANSI standard.

The problem is that you don't know
if the operands to the addition operator above, are unsigned.
How about (unsigned long)c1 + (unsigned long)c2?

The operands of the addition operator above, are unsigned types
which are not subject to the integer promotions,
so it's OK.
 
P

Peter Nilsson

....
Can unsigned short be more than 2 bytes long?

Yes. The only _size_ criteria is that sizeof(short) >= 1. Of course, it has
to satisfy the minimum range requirement [0..65535], so the size of a short
(and therefore unsigned short) must be at least 16 bits. But since CHAR_BIT
may be anything >= 8, like 16 or 32 (e.g. DSP chipsets often have
implementations where CHAR_BIT is 32), it is possible for sizeof(short) to
be 1.

Theoretically, sizeof(short) > sizeof(long) is allowed in a conforming
implementation without contradiction.

Basically, the byte size of unsigned short is totally irrelevent to the
issue. The semantics of C integer expressions are defined by type and value,
not by the size of the representations.

You would do well to avoid forming preconceptions about the size of various
integer types. Not because there are theoretical possibilities, but because
there are very real practical problems that can result as a consequence.
E.g., when the world went from 16 to 32-bit home computers an awful lot of
poor code had to be painstakingly rewritten. Now that we sit on the edge of
64-bit machines becoming the 'norm', you should appreciate that making
assumptions about integer sizes may cost you (or the people who have to
maintain your code) in a few years time.
If not, then
(unsigned short)c1 + (unsigned short)c2 will never cause undefined
behavior because
1. If INT_MAX == USHRT_MAX, then c1 and c2 will be promoted
to unsigned short and unsigned short additions will never
cause undefined behaviors.

INT_MAX == USHRT_MAX is extremely unlikely[*], but in such a case, the
operands of + will both be promoted to int since all the values of unsigned
short _are_ representable as an int. The promotion to int applies to the
operands, which in this case are...

(unsigned short) c1 and (unsigned short) c2

A cast will not _override_ integral promotion.
2. If INT_MAX > USHRT_MAX, then c1 and c2 will be promoted to
int, but c1+c2 is always <= 2*65535

Strictly speaking, in this case, the sum of two unsigned short values is
always less than or equal to 2*USHRT_MAX+1.
so the addition can
never overflow because INT_MAX is >= 4 bytes.

There is nothing in the standards that state that an int must be twice the
size of an unsigned short. There are plenty of implementations where short
and int have the same properties except for their 'rank'.
So then is correct to say that (unsigned short)c1 + (unsigned short)c2
will never cause undefined behaviors? The reason is because in the
book "C, Traps and Pitfalls", the author states that unsigned arithmetic
shall always be done modulus 2^n manner where n is the number of bits
for the unsigned variable.

Yes, but you have to be careful to assertain whether the arithmetic is
indeed being performed on unsigned operands. In the case of adding two
unsigned short values, that isn't a given.
But the edition I got is old (~1991) and
predates the ANSI standard.

I've never read the book, but what it says is true. What the author may not
have known at the time of original writing was whether unsigned short would
possibly promote to int, instead of always promoting to unsigned int. [This
was apparently contested by the C committee.]
How about (unsigned long)c1 + (unsigned long)c2?

The two operands have type unsigned long and are therefore not subject to
integral promotion, nor indeed any other promotion. So the result is an
unsigned long calculated modulo ULONG_MAX+1.

[*] C90 apparently has no concept of padded integers or integer trap
representations, so you're not likely to find a conforming C implementation
where INT_MAX == USHRT_MAX any time soon. It's possible in C99, but I doubt
anyone will ever actually build such an implementation. [It would be
possible on a machine which used floating point instructions to mimic
integer calculations, but such a machine would probably have a hard time
implementing unsigned integer types as a whole, so the compiler writers
would probably give it up as a bad job. ;)]
 
P

Peter Nilsson

Andy said:
It _has_ to be.


The type of (a + b) will _never_ be unsigned short, since integer operands
of + are always subject to integral promotion.

Yes, but if you assign the result to an unsigned short variable, then
the result will be truncated and is still unsigned short and you still have
modulus 2^n (n=bit size of unsigned short) arithmetic. Am I right?[/QUOTE]

Yes, but only assuming the value to be assigned can itself be computed
without undefined behaviour.
eg.

unsigned short i = (unsigned short)65535 + (unsigned short)3;

result of i is 2. Correct?

Not necessarily. Even after casting, the operands of + are _still_ subject
to integral promotion. So, the expression is effectively...

#if USHRT_MAX <= INT_MAX

unsigned short i = (int) (unsigned short) 65535 + (int) (unsigned short)
3;

#else

unsigned short i = (unsigned) (unsigned short) 65535 + (unsigned)
(unsigned short) 3;

#endif

It's highly unlikely, but it is possible for INT_MAX to equal USHRT_MAX, in
which case, the addition may overflow an int, invoking undefined behaviour.
 
A

Andy

Peter Nilsson said:
#if USHRT_MAX <= INT_MAX

unsigned short i = (int) (unsigned short) 65535 + (int) (unsigned short)
3;

#else

unsigned short i = (unsigned) (unsigned short) 65535 + (unsigned)
(unsigned short) 3;

#endif

It's highly unlikely, but it is possible for INT_MAX to equal USHRT_MAX, in
which case, the addition may overflow an int, invoking undefined behaviour.

Yes. I got it now. Thanks for clearing things up with the <=
point. I guess, then I can say that unsigned short
addition/subtraction on a machine where sizeof(unsigned short) ==
sizeof(int) will always yield valid modulus 2^n results. Correct?

Thanks
Andy
 
A

Andy

Richard Heathfield said:
Yes. On a Cray, for example, it might easily be eight bytes long.

Really? From my 1st day at work, I've been taught that
unsigned short is always 2 bytes long while int is variable in length.
You learn something everyday.

Thanks
Andy
 
B

Ben Pfaff

Really? From my 1st day at work, I've been taught that
unsigned short is always 2 bytes long while int is variable in length.

Interesting. `unsigned short' is commonly 2 bytes in size, and 2
bytes is commonly 16 bits, but neither is required. 16 bits is
the minimum size of an `unsigned short', though.
 
R

Richard Heathfield

Andy said:
Really? From my 1st day at work, I've been taught that
unsigned short is always 2 bytes long while int is variable in length.
You learn something everyday.

Here's another one for your collection of strange facts. On a 32-bit DSP,
you might easily find that unsigned short is just *one* byte long (that
byte being 32 bits wide, of course).
 
C

Chris Torek

... I guess, then I can say that unsigned short
addition/subtraction on a machine where sizeof(unsigned short) ==
sizeof(int) will always yield valid modulus 2^n results. Correct?

Well, C99 allows for "padding bits" in various representations, in
which case it is possible (though bizarre) to have, e.g.,
CHAR_BIT = 9, sizeof(short) = sizeof(unsigned short) = 2, and
sizeof(int) = 2; yet INT_MAX could be 131071 (no padding bits)
while USHRT_MAX is 65535 (two padding bits). In this case, since
USHRT_MAX < INT_MAX, "unsigned short" promotes to (signed) int
under the ANSI/ISO widening rules!

Such an implementation is certainly bizarre, but it appears to meet
the Standard's requirements.

Note that unless an implementation is *really* strange, we either
have USHRT_MAX > INT_MAX (e.g., 16-bit PDP-11), or 2*USHRT_MAX <=
INT_MAX (e.g., 16-bit "short", 32-bit "int" systems). If the first
is true -- if USHRT_MAX exceeds INT_MAX -- then unsigned short
promotes to unsigned int and arithmetic works mod 2**n. Otherwise,
double USHRT_MAX is still a valid "int" value, so the sum of any
two unsigned short values after they are promoted to signed int
values is a valid, nonnegative "int" value. For instance,
65535 + 65535 is 131070, and on 18-bit-or-more machines this should
be strictly less than INT_MAX (131071 or more). Of course, the
difference (subtraction) of two such values can be a negative "int",
and the sum of three or more such values can overflow.
 
K

Kelsey Bjarnason

Really? From my 1st day at work, I've been taught that
unsigned short is always 2 bytes long while int is variable in length.
You learn something everyday.

Y'all been taught wrong. Types in C and C++ (apart from the new
specified-size types in C99) are defined by minimum required ranges (and
the caveat that some types must be as large or larger than others).

Net result: an unsigned can be one byte, or 128 bytes, as long as it
handles _at least_ the minimum required range.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,129
Messages
2,570,770
Members
47,329
Latest member
FidelRauch

Latest Threads

Top