Bit-fields and integral promotion

Carsten Hansen

Suppose I'm using an implementation where an int is 16 bits.
In the program below, what function is called in the first case,
and what is called in the second case?
Also, if there is a difference between C89 and C99, I would
like to know.
I have tried with different compilers, and I see some differences.
Before I file a bug report with my C vendor, I would like to
know what the correct behavior is.

struct S
{
    unsigned int a:4;
    unsigned int b:16;
};

void foo(void);
void bar(void);

int main(void)
{
    struct S s;
    s.a = 0;
    s.b = 0;

    if (s.a - 5 < 0)
        foo();
    else
        bar();

    if (s.b - 5 < 0)
        foo();
    else
        bar();

    return 0;
}


Carsten Hansen
 
CBFalconer

Carsten said:
Suppose I'm using an implementation where an int is 16 bits.
In the program below, what function is called in the first case,
and what is called in the second case?
Also, if there is a difference between C89 and C99, I would
like to know.
I have tried with different compilers, and I see some differences.
Before I file a bug report with my C vendor, I would like to
know what the correct behavior is.

struct S
{
    unsigned int a:4;
    unsigned int b:16;
};

void foo(void);
void bar(void);

int main(void)
{
    struct S s;
    s.a = 0;
    s.b = 0;

    if (s.a - 5 < 0) foo();
    else bar();
    if (s.b - 5 < 0) foo();
    else bar();
    return 0;
}

(quote slightly reformatted for clarity)

foo and bar, respectively. An unsigned can never be less than 0.
 
Chris Williams

Carsten Hansen said:
Suppose I'm using an implementation where an int is 16 bits.

That doesn't change anything. The problem isn't the "16" in there
but rather the "unsigned". Unsigned variables can only be zero or
positive (they have no sign). A signed variable can be either
positive or negative, though it can't hold as large a number
as an unsigned one (explained more at the bottom).

Carsten Hansen said:
In the program below, what function is called in the first case,
and what is called in the second case?

Looks like bar() bar() to me, which means that either I or Falconer is
spacing out. If it can never be negative, then it will never be less
than 0, so the condition in both of these statements should be false and
bar() will be called.

Carsten Hansen said:
Also, if there is a difference between C89 and C99, I would
like to know.

I'm a lightweight on the specifications, so no comment.

Carsten Hansen said:
I have tried with different compilers, and I see some differences.

If different compilers are giving different results, my best bet would
be that some are doing what they should do and not allowing an unsigned
value to be less than 0, whereas others are trying to be intelligent
and correct the programmer's mistakes invisibly. While I wouldn't say
that the second, "intelligent" compiler is buggy, I certainly wouldn't
want to use it. Bugs are best corrected by you, rather than made to
disappear by some particular quirk of your development environment.
Otherwise, the instant your environment changes, everything
comes to a halt.

And back to the topic of signed versus unsigned, the way that these
values are represented is exactly the same:

16 bit value
1111111111111111 = 65535
1111111111111111 = -1

The only way for your computer to know which one you meant is when you
specify "int" or "unsigned int." Based on that it will treat the same
exact bits in two entirely different ways.
To be specific, an unsigned number grows like this:

0 = 0
1 = 1
10 = 2
11 = 3
100 = 4
101 = 5
110 = 6
....
1111111111111101 = 65533
1111111111111110 = 65534
1111111111111111 = 65535

A signed number is the same up until the 16th bit

0 = 0
1 = 1
10 = 2
11 = 3
....
111111111111101 = 32765 //15 bits
111111111111110 = 32766
111111111111111 = 32767
1000000000000000 = -32768 //16 bits
1000000000000001 = -32767
1000000000000010 = -32766
....
1111111111111101 = -3
1111111111111110 = -2
1111111111111111 = -1

In the end, the word "hello" <- right there is just a bunch of ones and
zeroes. It is all just a matter of how you tell your computer it should
interpret it.
0010 1101 0100 0011 0110 1000 0111 0010 0110 1001 0111 0011
 
Wojtek Lerch

Chris Williams said:
Looks like bar() bar() to me which means that either I or Falconer is
spacing out. If it can never be negative, then it will never be less
than 0 so the first statement for both of these should be false and
bar() will be called.

What can never be negative? s.a can't, but s.a-5 can.


6.7.2.1p9: "A bit-field is interpreted as a signed or unsigned integer type
consisting of the specified number of bits." The type of s.a is a 4-bit
unsigned type. Since all the values of such a type can be represented by
int, the integer promotions convert it to int rather than to unsigned int,
and the value of s.a-5 is -5 rather than UINT_MAX-4.
 
CBFalconer

CBFalconer said:
(quote slightly reformatted for clarity)

foo and bar, respectively. An unsigned can never be less than 0.

I was wrong - bar and bar. For some reason I assumed the a field
was specified as int. Chris Williams made me look again.
 
Carsten Hansen

CBFalconer said:
I was wrong - bar and bar. For some reason I assumed the a field
was specified as int. Chris Williams made me look again.

Assume you are using an implementation with a 32-bit int, and you change
the width of b to 32. Does that change your answer?

gcc, Intel's compiler, Comeau's compiler and Metrowerks compiler all
give the answers
foo
bar

whereas Microsoft's compiler gives the answer
bar
bar

Carsten Hansen
 
Wojtek Lerch

CBFalconer said:
I was wrong - bar and bar. For some reason I assumed the a field
was specified as int. Chris Williams made me look again.

Since all the possible values of a four-bit unsigned bit-field can be
represented by an int, I'd expect it to get promoted to int, rather than
unsigned. What's my mistake?
 
CBFalconer

Wojtek said:
What can never be negative? s.a can't, but s.a-5 can.

6.7.2.1p9: "A bit-field is interpreted as a signed or unsigned integer type
consisting of the specified number of bits." The type of s.a is a 4-bit
unsigned type. Since all the values of such a type can be represented by
int, the integer promotions convert it to int rather than to unsigned int,
and the value of s.a-5 is -5 rather than UINT_MAX-4.

Aha. The installation can interpret a bit field as either signed
or unsigned, but I believe it needs to document which choice it has
taken. Thus either original behaviour can be legitimate, depending
on the system documentation. This could be used as a test for what
the documentation should say.
 
xarax

Carsten Hansen said:
Assume you are using an implementation with a 32-bit int, and you change
the width of b to 32. Does that change your answer?

That should not change the answer. Remember that the
field is unsigned, regardless of its bit width. Therefore,
the subtraction expression is unsigned.

Carsten Hansen said:
gcc, Intel's compiler, Comeau's compiler and Metrowerks compiler all
give the answers
foo
bar

Check their documentation to see if they support
unsigned bit fields. They may be ignoring the
"unsigned" qualifier.

Carsten Hansen said:
whereas Microsoft's compiler gives the answer
bar
bar

That's what I would expect from a compiler that
supports unsigned bit fields.
 
xarax

Wojtek Lerch said:
What can never be negative? s.a can't, but s.a-5 can.


6.7.2.1p9: "A bit-field is interpreted as a signed or unsigned integer type
consisting of the specified number of bits."

That's apparently what the standard says.

Wojtek Lerch said:
The type of s.a is a 4-bit unsigned type. Since all the values of such a type
can be represented by int, the integer promotions convert it to int rather
than to unsigned int, and the value of s.a-5 is -5 rather than UINT_MAX-4.

You added that part.

Are you saying that, say, an 8-bit unsigned char can
be promoted to a wider signed int? Please quote the
standard that says unsigned integers can be widened
to signed integers.
 
TTroy

Wojtek said:
Since all the possible values of a four-bit unsigned bit-field can be
represented by an int, I'd expect it to get promoted to int, rather than
unsigned. What's my mistake?

My compiler runs foo and foo (quite different from the others'). On my
system, ints are 32 bits. Since a signed int can hold all possible
values of a 16-bit field, I am presuming it's integrally promoting s.a
and s.b to signed int within the if conditions. Then subtracting 5
(a signed int constant) yields -5, which is less than 0.

I guess that's where I get foo, foo from. Then again, I just learned
the term "integral promotion" yesterday.
 
Wojtek Lerch

xarax said:
That's apparently what the standard says.


You added that part.

The part outside the quotation marks? Yes, of course.

xarax said:
Are you saying that, say, an 8-bit unsigned char can
be promoted to a wider signed int? Please quote the
standard that says unsigned integers can be widened
to signed integers.

Yes.

6.3.1.1p2: "If an int can represent all values of the original type, the
value is converted to an int; otherwise, it is converted to an unsigned int.
These are called the integer promotions."
 
TTroy

xarax said:
That's apparently what the standard says.

You added that part.

Are you saying that, say, an 8-bit unsigned char can
be promoted to a wider signed int? Please quote the
standard that says unsigned integers can be widened
to signed integers.

Like I said in my other post, there is total confusion around integral
promotions and arithmetic conversions. All 5 of the 5 C programmers at
my workplace don't understand it, which says a lot (about them and the
topic). I bet Chris Torek is the only one who truly understands it.
 
Wojtek Lerch

CBFalconer said:
Aha. The installation can interpret a bit field as either signed
or unsigned, but I believe it needs to document which choice it has
taken. Thus either original behaviour can be legitimate, depending
on the system documentation. This could be used as a test for what
the documentation should say.

No, the implementation only has choice when the bit-field is declared as
plain "int":

6.7.2p5: "Each of the comma-separated sets designates the same type, except
that for bit-fields, it is implementation-defined whether the specifier int
designates the same type as signed int or the same type as unsigned int."
 
CBFalconer

Wojtek said:
Since all the possible values of a four-bit unsigned bit-field can
be represented by an int, I'd expect it to get promoted to int,
rather than unsigned. What's my mistake?

IMO the field was specified as unsigned int, so the 4 bits are
expressed as such a type. Then the arithmetic with an int is done,
and arithmetic between unsigned and signed requires that the signed
be promoted to unsigned. Thus the result is unsigned.
 
TTroy

CBFalconer said:
IMO the field was specified as unsigned int, so the 4 bits are
expressed as such a type. Then the arithmetic with an int is done,
and arithmetic between unsigned and signed requires that the signed
be promoted to unsigned. Thus the result is unsigned.

But isn't that only true if we're converting between types of the same
width?
In this case, the 4-bit field is to be converted to an int. Since an int
can hold any value a 4-bit field can, doesn't the integral promotion
default to int (signed)?
 
James Kuyper

CBFalconer said:
IMO the field was specified as unsigned int, so the 4 bits are
expressed as such a type. Then the arithmetic with an int is done,
and arithmetic between unsigned and signed requires that the signed
be promoted to unsigned. Thus the result is unsigned.


6.5.6p4, describing subtraction: "If both operands have arithmetic type,
the usual arithmetic conversions are performed on them."

6.3.1.8p1, defining the usual arithmetic conversions: "... Otherwise,
the integer promotions are performed on both operands. ..."

6.3.1.1p2, defining the integer promotions, for among other things, bit
fields of type unsigned int: "... If an int can represent all values of
the original type, the value is converted to an int; otherwise, it is
converted to an unsigned int. These are called the _integer promotions_."

What am I missing?
 
Kevin Bracey

TTroy said:
But isn't that only true if we're converting between types of the same
width? In this case, the 4-bit field is to be converted to an int. Since an
int can hold any value a 4-bit field can, doesn't the integral promotion
default to int (signed)?

Section 6.3.1.1p2:

"The following may be used in an expression wherever an int or unsigned
int may be used:

- An object or expression with an integer type whose integer
conversion rank is less than the rank of int and unsigned int.
- A bit-field of type _Bool, int, signed int or unsigned int.

If an int can represent all values of the original type, the value
is converted to an int; otherwise it is converted to an unsigned int.
These are called the integer promotions."

The ambiguity arises from what the "original type" is, and hence what "all
values" are. In the case of the 4-bit unsigned bitfield, is it of type
unsigned int, so all values are 0..UINT_MAX, or are all values 0..15?

I'd always understood it to be the latter interpretation, so it promotes
to int. A quick check of our compiler agrees with this: it promotes to int,
unless in pcc compatibility mode, where it promotes to unsigned int.

This may be supported by 6.7.2.1p9 which states:

"A bit-field is interpreted as a signed or unsigned integer type
consisting of the specified number of bits."

That at least gives wording strong enough to allow the bit-field to have
a distinct type with range 0..15 for the purposes of 6.3.1.1p2.
 
Wojtek Lerch

CBFalconer said:
IMO the field was specified as unsigned int, so the 4 bits are
expressed as such a type. Then the arithmetic with an int is done,
and arithmetic between unsigned and signed requires that the signed
be promoted to unsigned. Thus the result is unsigned.

The definition of integer promotion says quite specifically that for integers
whose conversion rank is less than the rank of int, as well as for
bit-fields, whether the value is converted to int or unsigned int depends on
whether int can represent all the values of the "original" type. So I guess
the question is what exactly the "original" type is for bit-fields.

Unfortunately, the standard doesn't seem pedantically consistent about the
type of bit-fields. In 6.7.2.1p9, it says that "a bit-field is interpreted
as a signed or unsigned integer type consisting of the specified number of
bits". On the other hand, 6.7.2.1p4 says, "A bit-field shall have a type
that is a qualified or unqualified version of _Bool, signed int, unsigned
int, or some other implementation-defined type", and 6.3.1.1p2 also talks
about "a bit-field of type _Bool, int, signed int, or unsigned int". Why
does the latter make the distinction between "int" and "signed int" while
the former does not? If a four-bit field is declared as "int" on an
implementation that makes it unsigned, does that mean that the bit-field is
"of type int" but "has type unsigned int" and is "interpreted" as a four-bit
unsigned type? Which of those three different types is the "original" type
that the definition of integer promotions refers to?...
 
Chris Williams

Gah. Knew I shouldn't have posted. Or at least not without first
plugging it into a compiler to verify I wasn't talking out of ...my
ear.

Apologies,
Chris
 
