is "typedef int int;" illegal????

Wojtek Lerch

Eric Sosman said:
Does the Standard require that the 1's bit and the
2's bit of an `int' reside in the same byte?

No.

Or is the
implementation free to scatter the bits of the "pure
binary" representation among the different bytes as it
pleases? (It must, of course, scatter the corresponding
bits of signed and unsigned versions in the same way.)

Of course.

If the latter, I think there's the possibility (a
perverse possibility) of a very large number of permitted
"endiannesses," something like

(sizeof(type) * CHAR_BIT) !
-----------------------------
(CHAR_BIT !) ** sizeof(type)

Argument: There are `sizeof(type) * CHAR_BIT' bits (value,
sign, and padding) in the object, so the number of ways to
permute the bits is the factorial of that quantity. But C
cannot detect the arrangement of individual bits within a
byte, so each byte of the object divides the number of
detectably different arrangements by `CHAR_BIT!'.

But of course you can detect the order, at least in cases where padding bits
don't obscure the view. Take a look at the representation of a power of
two. If there are no padding bits, only one of the bytes has a non-zero
value, and that value is a power of two as well. And of course you can
easily detect which power of two it is.

If you assume a clear distinction between padding bits and value bits, the
correct answer is

( sizeof(type) * CHAR_BIT ) !
------------------------------------------
( number_of_padding_bits ) !

But if you don't, things can get a little fuzzy. For instance, imagine an
implementation that requires that for any valid int representation, the top
two bits of its first byte must be either both set or both unset. It
doesn't matter which one you choose to consider a value bit and which one a
padding bit; but my formula counts those two choices as two distinct
combinations.
 
David R Tribble

Wojtek said:
Of course.

True, but ISO C also encompasses existing practice. And existing
practice throughout all of the history of mechanical computers has
been to arrange the digits of machine words in some reasonably
practical, if not completely obvious, way.

There is a limit to how far a language standard can go in covering all
implementations, and that limit is usually dictated by actual existing
implementations. ISO C does not cover trinary implementations
(as far as I can tell) for the simple reason that it does not have to.

All of which makes it entirely reasonable and possible to invent
a handful of standard macros that could adequately describe the
salient characteristics of the underlying native hardware words
used to implement the standard abstract datatypes of C.

But if there really did exist some arcane architecture that just
could not be described in this way, we can always provide a
macro like __STDC_NO_ENDIAN. I'm willing to bet, though,
that such a system could not support a conforming C
implementation in the first place.

-drt
 
Wojtek Lerch

David R Tribble said:
True, but ISO C also encompasses existing practice. And existing
practice throughout all of the history of mechanical computers has
been to arrange the digits of machine words in some reasonably
practical, if not completely obvious, way. ....
All of which makes it entirely reasonable and possible to invent
a handful of standard macros that could adequately describe the
salient characteristics of the underlying native hardware words
used to implement the standard abstract datatypes of C.

A language standard needs to be consistent about how far it allows
conforming implementations to deviate from the currently existing practice.
It doesn't make sense for one part of the standard to allow implementations
with strange bit orders, while another part contains definitions or
requirements that turn into meaningless gibberish when applied to such
implementations. If you want to mandate existing practice in this regard,
propose adding a requirement to the standard that bans strange bit orders.
Without such a ban, you'll need to carefully pick the words that specify
your macros, to make sure that they make sense even for implementations that
use strange bit orders. Or implementations that use strings and pulleys
rather than electric current in a semiconducting material.

And keep in mind that unless the types your macros describe have no padding
bits, they're completely useless anyway. (Or do you disagree?) Perhaps you
want to propose a ban on padding bits, too?
 
tedu

Wojtek said:
But of course you can detect the order, at least in cases where padding bits
don't obscure the view. Take a look at the representation of a power of
two. If there are no padding bits, only one of the bytes has a non-zero
value, and that value is a power of two as well. And of course you can
easily detect which power of two it is.

how could you do this?

assume i have a 4 bit unsigned int, to make things easy. the bits are
ordered 1423. so decimal to binary:
1 == 1000
2 == 0010
3 == 1010
....

how can you detect that 1 is bit pattern 1000?
 
Jordan Abel

how could you do this?

assume i have a 4 bit unsigned int, to make things easy. the bits are
ordered 1423. so decimal to binary:
1 == 1000
2 == 0010
3 == 1010
...

how can you detect that 1 is bit pattern 1000?

How about a 16-bit unsigned int that shows up as
16 15 14 13 12 11 10 9 87651423

If you set it to 1, you could look at it as an array of two unsigned
chars and it shows up as 0x00 0x04
 
Keith Thompson

tedu said:
how could you do this?

assume i have a 4 bit unsigned int, to make things easy. the bits are
ordered 1423. so decimal to binary:
1 == 1000
2 == 0010
3 == 1010
...

how can you detect that 1 is bit pattern 1000?

You can detect the value of each bit (at run time, probably not at
compile time) by using an array of unsigned char to construct N
values, each with exactly one bit set to 1 and all the others set to 0.
This works only if there are no trap representations.
 
Keith Thompson

David R Tribble said:
True, but ISO C also encompasses existing practice. And existing
practice throughout all of the history of mechanical computers has
been to arrange the digits of machine words in some reasonably
practical, if not completely obvious, way.

There is a limit to how far a language standard can go in covering all
implementations, and that limit is usually dictated by actual existing
implementations. ISO C does not cover trinary implementations
(as far as I can tell) for the simple reason that it does not have to.

All of which makes it entirely reasonable and possible to invent
a handful of standard macros that could adequately describe the
salient characteristics of the underlying native hardware words
used to implement the standard abstract datatypes of C.
[...]

There is precedent for introducing additional constraints on integer
representations. The C90 standard said very little about how integer
types are represented; C99 added a requirement that signed integers
must be either sign and magnitude, two's complement, or ones'
complement, and (after the standard was published) that all-bits-zero
must be a representation of 0.

If there are good reasons to do so, it might be reasonable to have
additional constraints in a new version of the standard, as long as no
existing or likely implementations violate the new assumptions. For
example, I doubt that any conforming C99 implementation would have a
PDP-11-style middle-endian representation (unless somebody's actually
done a C99 implementation for the PDP-11).

Figuring out what restrictions would be both reasonable and useful is
another matter.
 
Wojtek Lerch

Jordan Abel said:
How about a 16-bit unsigned int that shows up as
16 15 14 13 12 11 10 9 87651423

If you set it to 1, you could look at it as an array of two unsigned
chars and it shows up as 0x00 0x04

Run this program on your implementation:

#include <stdio.h>
#include <limits.h>

typedef unsigned short TYPE;

int main( void ) {
    union {
        TYPE bit;
        unsigned char bytes[ sizeof(TYPE) ];
    } u;
    unsigned i, j;
    /* Walk a single set bit through every value bit of TYPE. */
    for ( u.bit = 1; u.bit != 0; u.bit <<= 1 )
        /* Find where that bit shows up in the byte representation. */
        for ( i=0; i<sizeof(TYPE); ++i )
            for ( j=0; j<CHAR_BIT; ++j )
                if ( u.bytes[i] & 1 << j )
                    printf( "%u\n", i * CHAR_BIT + j + 1 );
    return 0;
}

If your machine has an N-bit unsigned short with no padding, the program
will print out a permutation of the numbers from 1 to N. The C standard
doesn't forbid any of the N! possible permutations, and the program allows
you to detect which one of them you're dealing with.
 
tedu

Jordan said:
How about a 16-bit unsigned int that shows up as
16 15 14 13 12 11 10 9 87651423

If you set it to 1, you could look at it as an array of two unsigned
chars and it shows up as 0x00 0x04

ah, thanks, i hadn't checked that unsigned char uses "pure binary
notation". but it would be 0x08, right?
 
Joe Wright

tedu said:
how could you do this?

assume i have a 4 bit unsigned int, to make things easy. the bits are
ordered 1423. so decimal to binary:
1 == 1000
2 == 0010
3 == 1010
...

how can you detect that 1 is bit pattern 1000?

Bits are not arbitrarily ordered the way bytes might be. Your four bits
are ordered 3210 (as powers of 2) and you couldn't change it if you
wanted to.
 
Wojtek Lerch

Joe Wright said:
Bits are not arbitrarily ordered the way bytes might be. Your four bits
are ordered 3210 (as powers of 2) and you couldn't change it if you wanted
to.

Bytes aren't ordered arbitrarily either; they're naturally ordered according
to their addresses. Each bit can be uniquely identified by giving the byte
offset from the beginning of the object and the power of two it represents
in its byte, when you look at it as a byte (i.e. an unsigned char).

At the same time, all the value bits of your integer type have a natural
order based on the powers of two they represent in that type. In principle,
that order has nothing to do with the order of bytes and bits within
bytes -- implementations are free to pick whatever mapping they want. It
just happens that taking contiguous ranges of the value bits and mapping
them to whole bytes without reordering the bits is the simplest way of
implementing C in silicon-based hardware -- and that's how all the existing
implementations map them, even though the C standard doesn't require it that
way.
 
Joe Wright

Wojtek said:
Bytes aren't ordered arbitrarily either; they're naturally ordered according
to their addresses. Each bit can be uniquely identified by giving the byte
offset from the beginning of the object and the power of two it represents
in its byte, when you look at it as a byte (i.e. an unsigned char).

At the same time, all the value bits of your integer type have a natural
order based on the powers of two they represent in that type. In principle,
that order has nothing to do with the order of bytes and bits within
bytes -- implementations are free to pick whatever mapping they want. It
just happens that taking contiguous ranges of the value bits and mapping
them to whole bytes without reordering the bits is the simplest way of
implementing C in silicon-based hardware -- and that's how all the existing
implementations map them, even though the C standard doesn't require it that
way.

At the hardware design level, manufacturers decide, by their own lights,
how to order the bytes in an integer object. Long ago Intel decided on
low byte low address (little endian). In the same era and for their own
reasons, Motorola decided to put the high byte at the low address (big
endian). I understand that DEC, CDC, Cray and others came up with their
own organizations of bytes within an object. But..

...The byte is the atomic object. The bits within the byte can't be moved
around like the bytes in a long. A byte with value one hundred will have
a binary bitset of 01100100 on all systems where byte is eight bits. And
you couldn't change it if you wanted to.
 
Jordan Abel

Bits are not arbitrarily ordered the way bytes might be. Your four bits
are ordered 3210 (as powers of 2) and you couldn't change it if you
wanted to.

But they could be ordered differently when you look at it as an int than
when you look at it as chars.
 
Jordan Abel

At the hardware design level, manufacturers decide, by their own lights,
how to order the bytes in an integer object. Long ago Intel decided on
low byte low address (little endian). In the same era and for their own
reasons, Motorola decided to put the high byte at the low address (big
endian). I understand that DEC, CDC, Cray and others came up with their
own organizations of bytes within an object. But..

..The byte is the atomic object. The bits within the byte can't be moved
around like the bytes in a long. A byte with value one hundred will have
a binary bitset of 01100100 on all systems where byte is eight bits.
And you couldn't change it if you wanted to.

But a byte with value one hundred followed by some more bytes each with
value zero could be a word with value three.
 
Joe Wright

Jordan said:
But they could be ordered differently when you look at it as an int than
when you look at it as chars.

No, they can't. The bits of a byte are ordered as they are. The bit
order cannot change between int and char.
 
pete

Joe said:
The bit order cannot change between int and char.

I don't think that there's any requirement
for the two lowest order bits of an int type object,
to be in the same byte,
if sizeof(int) is greater than one.
 
Jordan Abel

No, they can't. The bits of a byte are ordered as they are. The bit
order cannot change between int and char.

int may have padding bits. unsigned char may not. necessarily, the
padding bits in the int show up as value bits in the unsigned char.
 
Wojtek Lerch

Joe Wright said:
..The byte is the atomic object. The bits within the byte can't be moved
around like the bytes in a long. A byte with value one hundred will have a
binary bitset of 01100100 on all systems where byte is eight bits. And you
couldn't change it if you wanted to.

That's simply because you insist on displaying the bits in the conventional
order, with the most significant one on the left and the least significant
one on the right. By the same token, a 16-bit unsigned short with value
three hundred has to be displayed as the bit pattern 0000000100101100, and
there's no way to change that. But if you decide to order the bits
according to how they're laid out in the bytes, you might end up with
something like 00000001 00101100, or 00101100 00000001, or maybe even
11000010 00010000.
 
