Integer Promotion?

  • Thread starter Frederick Gotham

Frederick Gotham

I set about trying to find a portable way to set the value of UCHAR_MAX. At
first, I thought the following would work:

#define UCHAR_MAX ~( (unsigned char)0 )


However, it didn't work for me. Could someone please explain to me what's
going on? I would have thought that the following happens:

(1) The literal, 0, whose type is int, gets converted to an unsigned char.

0000 0000

(2) The resultant unsigned char then has all its bits flipped.

1111 1111


My hunch is that there's some sort of integer promotion at work, but I
don't know exactly how it works.

Could someone please enlighten me?
 

Ben Pfaff

Frederick Gotham said:
I set about trying to find a portable way to set the value of UCHAR_MAX. At
first, I thought the following would work:

#define UCHAR_MAX ~( (unsigned char)0 )


However, it didn't work for me. Could someone please explain to me what's
going on?

The operand of ~ is subject to the integer promotions, which
means that the `unsigned char' value 0 is converted to int (or,
possibly, to unsigned int) before the complement happens.

An expression with the right value is (unsigned char)-1, but
that's not suitable for UCHAR_MAX because it contains a cast (so
it can't be used in #if directives) and because it has the type
unsigned char, whereas UCHAR_MAX should have type int (or,
possibly, unsigned int).
 

Lew Pitcher

Frederick said:
I set about trying to find a portable way to set the value of UCHAR_MAX. At
first, I thought the following would work:

#define UCHAR_MAX ~( (unsigned char)0 )


However, it didn't work for me.

First off, why are you trying to /set/ a value that is (or should be)
supplied to you by your C implementation? Are you just trying to
second-guess your compiler, or are you trying to change the limit that
the compiler imposes? FWIW, that manifest constant is (or should be)
defined by the compiler, and externalized for your information only.

Secondly, what do you mean by "it didn't work for me"? What did you get?
What did you expect to get? How did the #define fail you?

[snip]


--

Lew Pitcher, IT Specialist, Corporate Technology Solutions,
Enterprise Technology Solutions, TD Bank Financial Group

(Opinions expressed here are my own, not my employer's)
 

Mark Odell

Frederick said:
I set about trying to find a portable way to set the value of UCHAR_MAX. At
first, I thought the following would work:

#define UCHAR_MAX ~( (unsigned char)0 )

What's wrong with the version in limits.h?
 

Eric Sosman

Frederick Gotham wrote On 06/09/06 15:24,:
I set about trying to find a portable way to set the value of UCHAR_MAX. At
first, I thought the following would work:

#define UCHAR_MAX ~( (unsigned char)0 )
[...]

You were right to suspect the integer promotions, and
others have explained how they bollix things up for you.
As for the portable expression, try

#define MY_UCHAR_MAX ((unsigned char)-1)
 

Frederick Gotham

Eric Sosman posted:
Frederick Gotham wrote On 06/09/06 15:24,:
I set about trying to find a portable way to set the value of
UCHAR_MAX. At first, I thought the following would work:

#define UCHAR_MAX ~( (unsigned char)0 )
[...]

You were right to suspect the integer promotions, and
others have explained how they bollix things up for you.
As for the portable expression, try

#define MY_UCHAR_MAX ((unsigned char)-1)


I initially wrote code which used UCHAR_MAX, but then I made a C++
template function out of it. Once I had made the template, I needed a
universal way to get the maximum value for an unsigned integer type. At
first, I had something like:

(Don't worry, it's C code...)

typedef unsigned char AnyUnsignedIntegerType;

void ArbitraryFunc()
{
    AnyUnsignedIntegerType max_val = ~(AnyUnsignedIntegerType)0;
}


But it didn't work.
 

Frederick Gotham

Ben Pfaff posted:

The operand of ~ is subject to the integer promotions, which
means that the `unsigned char' value 0 is converted to int (or,
possibly, to unsigned int) before the complement happens.


Is there any sort of guide I can read which would show me when and why
integer promotion happens?
 

pete

Frederick said:
Eric Sosman posted:
void ArbitraryFunc()
{
    AnyUnsignedIntegerType max_val = ~(AnyUnsignedIntegerType)0;
}

But it didn't work.

As Eric Sosman implied, it's supposed to be:

AnyUnsignedIntegerType max_val = (AnyUnsignedIntegerType)-1;

That works for any and all unsigned integer types.
 

Eric Sosman

Frederick Gotham wrote On 06/09/06 16:59,:
Ben Pfaff posted:






Is there any sort of guide I can read which would show me when and why
integer promotion happens?

Well, there's the Standard ...

Informally, when a "narrow" integer is used as an operand
or is passed to a function without a prototype or is passed as
one of the "..." arguments to a variable-argument function,
the value is first promoted to int or to unsigned int. But
let's not rush ahead: First, what's a "narrow" type?

Formally, the types subject to promotion are those whose
"rank" is less than that of int. Informally, these are the
types with "fewer bits" than int: char and short (on most
systems), bit-fields with small widths, and all of these in
both signed and unsigned flavors. If the compiler decides not
to use full-sized ints for some enum types, they are also
subject to promotion. In C99, the promotable list adds _Bool
and perhaps some implementation-defined narrow types that the
Standard doesn't enumerate.

All right, those are the types that get promoted. What
do they get promoted to? The rule is straightforward:

- If all the possible values of a "narrow" type can be
represented by an int, the type promotes to int.

- Otherwise, the type promotes to unsigned int.

The potentially tricky part is that stuff about ranges,
because the range of values a particular type can represent is
up to the implementation (subject to some requirements), and
the ranges vary from one implementation to another. Let's
work through a few examples.

First, an easy one: What happens to a short? The range
of short can be different on different implementations, but
we know it will never be wider than the range of the same
implementation's int: Every short value is also a legitimate
int value. Therefore, short always promotes to int, on every
C implementation everywhere. That wasn't so bad, was it?

How about unsigned short? Now things get tougher. On
many systems an int has 32 bits and an unsigned short has 16.
On these systems, every unsigned short value is also an int
value, so unsigned short promotes to int. But what about a
system where int and short are both 16 bits wide? On such
systems INT_MAX will be 32767 but USHRT_MAX will go all the
way up to 65535, so there exist unsigned short values that
are too large for int. On these systems, unsigned short will
promote to unsigned int.

Similar issues arise with other narrow types: you usually
know that they are "narrow," but not what they'll promote to.
Even the lowly char promotes to int on some systems and to
unsigned int on others -- yes, Virginia, there are systems
where char and int have the same bit count, and if char is
unsigned (remember, the signedness of char is also at the
discretion of the implementation) CHAR_MAX will be equal to
UCHAR_MAX and both will be bigger than INT_MAX.

Why does C introduce all this confusion about promotion?
Hard to say for certain, but the confusion is probably an
attempt to make things simpler. ("Say, what?") Consider:
The underlying hardware probably doesn't have an instruction
that multiplies an unsigned short by a signed bit-field of
width eleven. Yet the language allows you as a programmer
to write such an expression, so what is the compiler to do
with it? C's answer is to promote the values to some flavor
of int, the machine's "natural" word size and the one most
likely to be supported best by the instruction set.

As a language, C is fairly close to the hardware. The
matter of promotion is one place where the hardware becomes
more visible to the programmer than it is elsewhere, and
sometimes more troublesome, too.
 

Coos Haak

On Fri, 09 Jun 2006 21:55:01 GMT, pete wrote:
As Eric Sosman implied, it's supposed to be:

AnyUnsignedIntegerType max_val = (AnyUnsignedIntegerType)-1;

That works for any and all unsigned integer types.

With one (albeit minor) proviso: that the machine uses two's-complement
arithmetic ;-)
 

Ben Pfaff

Coos Haak said:
On Fri, 09 Jun 2006 21:55:01 GMT, pete wrote:

With one (albeit minor) proviso: that the machine uses two's-complement
arithmetic ;-)

No, it has no such requirement. The C standard (C89 and C99)
specifies that a negative value is converted from a signed to an
unsigned type by repeatedly adding Utype_MAX + 1 until the result
is in the range of the unsigned type.
 

Eric Sosman

Coos Haak wrote On 06/09/06 18:10,:
On Fri, 09 Jun 2006 21:55:01 GMT, pete wrote:

With one (albeit minor) proviso: that the machine uses two's-complement
arithmetic ;-)

I repeat: "For any and all unsigned integer types."
And I add: "... no matter what representation the machine
uses for negative signed integer types."
 

Frank Silvermann

Eric said:
Frederick Gotham wrote On 06/09/06 16:59,:

Well, there's the Standard ...

[snip]


#include <stdio.h>

int main(void)
{
    int i, a, b, d;
    char c;
    short e;

    /* Out-of-range conversions to signed types are implementation-defined. */
    i = 0xFFFFFE3C;
    c = i;              /* narrowed to char */
    e = i;              /* narrowed to short */
    a = sizeof(i);
    b = sizeof(c);
    d = sizeof(e);
    printf("ints: %d while chars: %d and shorts: %d\n", a, b, d);
    printf("c is %c\n", c);
    printf("e is %hd\n", e);
    return 0;
}
/* end source */
Am I getting closer to your point here, or closer to the confusion? The
char becomes 0x3C and the short 0xFE3C.

frank
 

Frank Silvermann

Eric said:
Frederick Gotham wrote On 06/09/06 16:59,:

Well, there's the Standard ...

[snip]


#include <stdio.h>

int main(void)
{
    int i, a, b, d;
    char c;
    short e;

    /* Widening conversions: the value is preserved at every step. */
    c = 'J';
    e = c;
    i = e;

    a = sizeof(i);
    b = sizeof(c);
    d = sizeof(e);
    printf("ints: %d while chars: %d and shorts: %d\n", a, b, d);
    printf("c is %c\n", c);
    printf("e is %hd\n", e);
    printf("i is %d\n", i);
    return 0;
}

Good grief. I'm not going to find an example of this trouble with the
integer demotion of the previous post. So I think I need to look at signed
versus unsigned, and at the fringes of the ranges.

frank
 

Frederick Gotham

Eric Sosman posted:

- If all the possible values of a "narrow" type can be
represented by an int, the type promotes to int.

- Otherwise, the type promotes to unsigned int.


What if a signed short had 56 value bits and no padding bits, and a signed
int had 48 value bits and 8 padding bits? I believe this would satisfy:

sizeof(int) >= sizeof(short)

but at the same time, a "short" would have greater range than an "int". In
such circumstances, you'd be promoting to a narrower type!


(Unless of course the Standard says that a signed int must have greater or
equal range than a signed short... ?)
 

Frederick Gotham

Eric Sosman posted:

First, an easy one: What happens to a short? The range
of short can be different on different implementations, but
we know it will never be wider than the range of the same
implementation's int: Every short value is also a legitimate
int value. Therefore, short always promotes to int, on every
C implementation everywhere. That wasn't so bad, was it?


Wups, disregard my previous post -- I spoke too soon!

How about unsigned short? Now things get tougher. On
many systems an int has 32 bits and an unsigned short has 16.
On these systems, every unsigned short value is also an int
value, so unsigned short promotes to int. But what about a
system where int and short are both 16 bits wide? On such
systems INT_MAX will be 32767 but USHRT_MAX will go all the
way up to 65535, so there exist unsigned short values that
are too large for int. On these systems, unsigned short will
promote to unsigned int.


Understood.

Similar issues arise with other narrow types: you usually
know that they are "narrow," but not what they'll promote to.
Even the lowly char promotes to int on some systems and to
unsigned int on others -- yes, Virginia, there are systems
where char and int have the same bit count, and if char is
unsigned (remember, the signedness of char is also at the
discretion of the implementation) CHAR_MAX will be equal to
UCHAR_MAX and both will be bigger than INT_MAX.

Why does C introduce all this confusion about promotion?
Hard to say for certain, but the confusion is probably an
attempt to make things simpler. ("Say, what?") Consider:
The underlying hardware probably doesn't have an instruction
that multiplies an unsigned short by a signed bit-field of
width eleven. Yet the language allows you as a programmer
to write such an expression, so what is the compiler to do
with it? C's answer is to promote the values to some flavor
of int, the machine's "natural" word size and the one most
likely to be supported best by the instruction set.

As a language, C is fairly close to the hardware. The
matter of promotion is one place where the hardware becomes
more visible to the programmer than it is elsewhere, and
sometimes more troublesome, too.


Thanks for the explanation, very helpful.
 

pete

Frederick said:
Eric Sosman posted:


What if a signed short had 56 value bits and no padding bits,
and a signed int had 48 value bits and 8 padding bits?

Can't be that way.

I believe this would satisfy:

sizeof(int) >= sizeof(short)

There is no "sizeof(int) >= sizeof(short)" requirement.

The part of the standard titled "Sizes of integer types"
says nothing about the sizes of integer types;
it only describes their minimum ranges.

but at the same time,
a "short" would have greater range than an "int". In
such circumstances, you'd be promoting to a narrower type!

(Unless of course the Standard says
that a signed int must have greater or
equal range than a signed short... ?)

The standard does say that.

The other concept that needs to be introduced here, is "rank"

N869
6.2.5 Types
[#8] For any two types with the same signedness and
different integer conversion rank (see 6.3.1.1), the range
of values of the type with smaller integer conversion rank
is a subrange of the values of the other type.

6.3 Conversions
6.3.1.1 Boolean, characters, and integers
[#1] Every integer type has an integer conversion rank
defined as follows:
  -- No two signed integer types shall have the same rank,
     even if they have the same representation.
  -- The rank of a signed integer type shall be greater than
     the rank of any signed integer type with less
     precision.
  -- The rank of long long int shall be greater than the
     rank of long int, which shall be greater than the rank
     of int, which shall be greater than the rank of short
     int, which shall be greater than the rank of signed
     char.
  -- The rank of any unsigned integer type shall equal the
     rank of the corresponding signed integer type, if any.
  -- The rank of any standard integer type shall be greater
     than the rank of any extended integer type with the
     same width.
  -- The rank of char shall equal the rank of signed char
     and unsigned char.
  -- The rank of _Bool shall be less than the rank of all
     other standard integer types.
  -- The rank of any enumerated type shall equal the rank of
     the compatible integer type (see 6.7.2.2).
  -- The rank of any extended signed integer type relative
     to another extended signed integer type with the same
     precision is implementation-defined, but still subject
     to the other rules for determining the integer
     conversion rank.
  -- For all integer types T1, T2, and T3, if T1 has greater
     rank than T2 and T2 has greater rank than T3, then T1
     has greater rank than T3.
 

Eric Sosman

Frederick said:
Eric Sosman posted:






What if a signed short had 56 value bits and no padding bits, and a signed
int had 48 value bits and 8 padding bits? I believe this would satisfy:

sizeof(int) >= sizeof(short)

There is no such requirement to be satisfied. What *is*
required is that the *values* of short must be a subset (not
necessarily a proper subset) of the *values* of int. Note that
the promotion rule is given in terms of *values*, not of padding
bits, value bits, little bits, itty bits, or manic fits.

From several of your recent posts, I get the impression that
you are now in that phase of fascination with representations
that besets many beginning programmers (and from which some few
never seem to recover). My advice, and I hope you will take it
seriously, is to FORGET about representational issues perhaps 90%
of the time. Don't think of an int as a collection of bits with
various purposes, think of it as a little iron box containing a
number. Most of the time YOU DON'T CARE how that number is
represented: two's complement, signed magnitude, Roman numerals,
or a carefully-calibrated pressure on the walls of the box. All
you SHOULD care about is that you can use the value in an expression,
and perhaps discard the box's current contents and fill it with a
new value.

There are exceptions. There are times when it is in fact
helpful or even necessary to think about the individual bits and
bit fields in a larger entity. But IMHO programmers do themselves
a disservice by thinking far more about the representation of
an integer than about its value.

C programmers seem (after years of entirely non-scientific
observation) to be somewhat more likely than users of other
languages to persist in this infantile preoccupation. They are
always fretting about an integer's bits, about a pointer's bits,
about the arrangement of bit-field struct elements, about padding
bytes, about endianness, and so on and so on. Yes, it is true
that all of these issues sometimes need attending to -- but they
don't need the degree of worry that C programmers devote to them.
Oddly, C programmers don't seem unusually obsessed with the bits
in floating-point representations; they are usually content to let
the number just be itself, which I consider a healthy attitude.
I suggest that you worry about the bits of an int only SLIGHTLY
more often than you worry about the bits of a double. You'll be
better for it.
 

Frederick Gotham

Eric Sosman posted:



Many C++ books explain it like that. (But more on this below...)

From several of your recent posts, I get the impression that
you are now in that phase of fascination with representations
that besets many beginning programmers (and from which some few
never seem to recover).


I think it's because I want as firm an understanding as I can grasp of
the core features and functionality of the language.

Plus I just find it downright interesting.

While I can program proficiently at the moment, I feel a little
incompetent when the conversation switches to things like integer
promotion, and bitwise operations on signed integers.

I want to understand these things.

C programmers seem (after years of entirely non-scientific
observation) to be somewhat more likely than users of other
languages to persist in this infantile preoccupation. They are
always fretting about an integer's bits, about a pointer's bits,
about the arrangement of bit-field struct elements, about padding
bytes, about endianness, and so on and so on.


Which is exactly the reason why I'm on this newsgroup.

I program in C++ -- but if you've ever taken a look at comp.lang.c++, all
the conversation is to do with vectors, multiple inheritance, virtual
base class templates...

I want to talk about the core functionality of the language, and this is
the best place to do it, and it has the greatest minds when it comes to
it too. If I ask about integer promotion on this newsgroup rather than on
a C++ one, I'll get about 500% more responses.

C programmers don't seem unusually obsessed with the bits
in floating-point representations; they are usually content to let
the number just be itself, which I consider a healthy attitude.


I actually wanted to learn about floating-point representations at one
stage... but it only took one minute of reading through some very
technical documentation to put me off. I guess I just don't have the
"mathematical" kind of brain. Sure, I'm good at maths, but I'd rather drink
bleach than spend my day on permutations and other multi-syllabic words
of that nature... ; )

People who do a PhD in maths have a unique kind of brain; the kind of
brain that doesn't want to vomit when it comes to topics like
differentiation, integration, etc..
 

Roberto Waltman

Eric said:
From several of your recent posts, I get the impression that
you are now in that phase of fascination with representations
that besets many beginning programmers (and from which some few
never seem to recover). My advice, and I hope you will take it
seriously, is to FORGET about representational issues perhaps 90%
of the time.

I was about to post asking Mr. Gotham what was the reason for these
questions. I suspect there may be a problem he is trying to solve
that has not been explicitly described yet.

PS: I would change your "90% of the time" to either 99% or 100%,
depending on the rounding algorithm used.
 
