Is "typedef int int;" illegal?

kuyper

Joe said:
No, they can't. The bits of a byte are ordered as they are. The bit
order cannot change between int and char.

Citation please?

I don't see anything in the standard that requires the value bits of
any two unrelated integer types to be in the same order. It's certainly
feasible, though very expensive, for an implementation to have 'int'
values represented using the bits within each byte in the reverse of
the order in which those bits would be interpreted as unsigned char. Such
an implementation would be very unnatural, but it would be perfectly
feasible, and could be done in a way that's perfectly conforming. If
you can find a clause in the standard prohibiting such an
implementation, please cite it.

It would be much more plausible at the hardware level: I think it would
be quite feasible to design a chip where instructions that work on
2-byte words interpret the values of bits in precisely the reverse
order of the way that they're interpreted by instructions that work on
one byte at a time. I can't come up with any good reason to do so, but
I suspect it could be done fairly efficiently, achieving almost the
same speeds as more rationally-designed hardware.

The point isn't that there's any good reason to do this; I can't come
up with any. The point is that the standard deliberately fails to
specify such details. I believe that the people who wrote the standard
worked on the principle that it should avoid specifying anything that
it doesn't have a pressing need to specify. That makes it possible to
implement C on a wide variety of platforms, including ones using
technologies that didn't even exist when the standard was first
written. Can you think of any reason why the standard should specify
that unrelated integer types order their bits within each byte the same
way?
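
One way to see what a given implementation actually does is to inspect the object representation through unsigned char. The following is only a sketch, not something posted in the thread; the names value_bit_count and locate_value_bit are made up for illustration. It sets one value bit of an unsigned int at a time and reports which stored bit changes relative to the representation of zero. Nothing in it assumes that the bits of unsigned int and unsigned char line up.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Number of value bits in unsigned int, counted rather than assumed. */
unsigned value_bit_count(void)
{
    unsigned n = 0;
    for (unsigned int m = UINT_MAX; m != 0; m >>= 1)
        n++;
    return n;
}

/* Find where value bit k of an unsigned int is stored.
   On success returns 0 and sets *byte_index and *bit_in_byte.
   Returns -1 if k is out of range, or if setting the bit changes
   anything other than exactly one stored bit (e.g. padding bits). */
int locate_value_bit(unsigned k, size_t *byte_index, unsigned *bit_in_byte)
{
    unsigned int zero = 0, one_bit;
    unsigned char z[sizeof zero], b[sizeof one_bit];
    int changed = 0;

    assert(byte_index != NULL && bit_in_byte != NULL);
    if (k >= value_bit_count())
        return -1;

    one_bit = 1u << k;
    memcpy(z, &zero, sizeof zero);
    memcpy(b, &one_bit, sizeof one_bit);

    /* Compare the two representations one stored bit at a time. */
    for (size_t i = 0; i < sizeof one_bit; i++) {
        for (unsigned j = 0; j < CHAR_BIT; j++) {
            if (((z[i] ^ b[i]) >> j) & 1u) {
                *byte_index = i;
                *bit_in_byte = j;
                changed++;
            }
        }
    }
    return changed == 1 ? 0 : -1;
}
```

Calling it for every k below value_bit_count() and printing the results gives the whole map. On common little-endian hardware one would expect bit k to land in byte k / CHAR_BIT at position k % CHAR_BIT, but the code does not rely on that.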
 
Joe Wright

Wojtek said:
That's simply because you insist on displaying the bits in the conventional
order, with the most significant one on the left and the least significant
one on the right. By the same token, a 16-bit unsigned short with value
three hundred has to be displayed as the bit pattern 0000000100101100, and
there's no way to change that. But if you decide to order the bits
according to how they're laid out in the bytes, you might end up with
something like 00000001 00101100, or 00101100 00000001, or maybe even
11000010 00010000.

Displaying bits of a byte in conventional order is a "good thing"
because it allows you and I to know what we are talking about. My main
point is that at the byte level, we must do that. The value five is
always 00000101 at the byte level. Always.

CPU "design" will determine the byte order of objects in memory. The
"design" cannot determine the bit order of a byte simply because byte is
the finest granularity available. The CPU cannot address a 'bit'.
 
Joe Wright

pete said:
I don't think that there's any requirement
for the two lowest order bits of an int type object,
to be in the same byte,
if sizeof(int) is greater than one.

Ok, I'll play. Assume sizeof (int) is 2.

int i;
char c = 3;

Assume c looks like 00000011

i = c;

I suppose little endian i looks like 00000011 00000000
and big endian i looks like 00000000 00000011

Your turn.
 
pete

Joe said:
Ok, I'll play. Assume sizeof (int) is 2.

int i;
char c = 3;

Assume c looks like 00000011

i = c;

I suppose little endian i looks like 00000011 00000000

If the two lowest order bits are in separate bytes, then it's:
00000001 00000001
in either endian
 
pete

pete said:
If the two lowest order bits are in separate bytes, then it's:
00000001 00000001
in either endian

For sizeof(int) == 2, CHAR_BIT == 8

c = 0: 00000000 00000000
c = 1: 00000001 00000000
c = 2: 00000000 00000001
c = 3: 00000001 00000001
c = 4: 00000010 00000000
c = 5: 00000011 00000000
c = 6: 00000010 00000001
c = 7: 00000011 00000001
c = 8: 00000000 00000010
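
The rows above follow a regular rule: value bit k is stored in byte k % 2, at bit position k / 2. A small sketch of that rule, written here only to check the table, might look like the following. The names scatter_encode, scatter_decode and scatter_check are made up, and the rule is inferred from the nine rows shown, not stated anywhere in the thread.

```c
#include <assert.h>

/* Hypothetical 16-bit layout inferred from the table:
   value bit k lives in byte (k % 2) at bit position (k / 2). */
void scatter_encode(unsigned value, unsigned char out[2])
{
    out[0] = 0;
    out[1] = 0;
    for (unsigned k = 0; k < 16; k++) {
        if ((value >> k) & 1u)
            out[k % 2] |= (unsigned char)(1u << (k / 2));
    }
}

/* Inverse mapping: rebuild the value from the two stored bytes. */
unsigned scatter_decode(const unsigned char in[2])
{
    unsigned value = 0;
    for (unsigned k = 0; k < 16; k++) {
        if ((in[k % 2] >> (k / 2)) & 1u)
            value |= 1u << k;
    }
    return value;
}

/* Spot-check two rows of the table and a round trip. */
void scatter_check(void)
{
    unsigned char b[2];

    scatter_encode(3, b);               /* c = 3: 00000001 00000001 */
    assert(b[0] == 0x01 && b[1] == 0x01);

    scatter_encode(8, b);               /* c = 8: 00000000 00000010 */
    assert(b[0] == 0x00 && b[1] == 0x02);

    scatter_encode(300, b);
    assert(scatter_decode(b) == 300);
}
```

Arithmetic on such a representation would have to decode, operate, and re-encode, which is why it is expensive but still possible.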
 
Wojtek Lerch

pete said:
Joe Wright wrote: ....

If the two lowest order bits are in separate bytes, then it's:
00000001 00000001

Why couldn't it be

10000000 10000000

or

00000001 10000000

?

in either endian

I don't think "endian" applies here. How do you define "endian" without
assuming that bits are grouped in bytes according to their value?
 
pete

Wojtek said:
Why couldn't it be

10000000 10000000

or

00000001 10000000

?

Those are fine.

I don't think "endian" applies here.
How do you define "endian" without
assuming that bits are grouped in bytes according to their value?

The bits *are* grouped in bytes according to their values.

It's just that
"the two lowest order bits are in separate bytes"
is an incomplete specification.
 
Wojtek Lerch

Joe said:
Displaying bits of a byte in conventional order is a "good thing"
because it allows you and I to know what we are talking about. My main
point is that at the byte level, we must do that. The value five is
always 00000101 at the byte level. Always.

Displaying bits in the conventional order is often a "good thing"
because it simplifies communication by allowing you to assume that the
convention doesn't need to be explained. But that doesn't make it the
only possible order, or even the only useful order. In a discussion
about serial transmission of data, it may be more appropriate to
display the bits in the order they're transmitted; and if the protocol
being discussed transmits the least significant bit first, you'll end
up displaying a byte with the value five as 10100000. Or maybe just
1010000, if it's a seven-bit protocol. Similarly, if you were
explaining how the bits are represented by the state of transistors in
some chip, you might prefer to display them in the order they're laid
out in the chip. There are many ways to order the bits of a byte, and
there's no rule in the C standard that forbids displaying them in an
unconventional order.
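
As a small illustration of the two display orders, here is a sketch that formats the same value both ways. The function names are invented for this example, and the limit of 16 bits is an arbitrary choice to keep the shifts well defined.

```c
#include <assert.h>
#include <string.h>

/* Write the low nbits bits of value into out as '0'/'1' characters,
   least significant bit first. out must have room for nbits + 1 chars. */
void format_lsb_first(unsigned value, unsigned nbits, char *out)
{
    assert(nbits >= 1 && nbits <= 16);
    memset(out, '0', nbits);
    for (unsigned i = 0; i < nbits; i++) {
        if ((value >> i) & 1u)
            out[i] = '1';
    }
    out[nbits] = '\0';
}

/* Same bits in the conventional order: most significant bit first. */
void format_msb_first(unsigned value, unsigned nbits, char *out)
{
    assert(nbits >= 1 && nbits <= 16);
    memset(out, '0', nbits);
    for (unsigned i = 0; i < nbits; i++) {
        if ((value >> (nbits - 1 - i)) & 1u)
            out[i] = '1';
    }
    out[nbits] = '\0';
}
```

With a 17-byte buffer, format_lsb_first(5, 8, buf) yields "10100000" and format_lsb_first(5, 7, buf) yields "1010000", matching the two strings above, while format_msb_first(5, 8, buf) yields the conventional "00000101".
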

CPU "design" will determine the byte order of objects in memory. The
"design" cannot determine the bit order of a byte simply because byte is
the finest granularity available. The CPU cannot address a 'bit'.

*Which* CPU cannot address a bit? My understanding is that some can.
Anyway, what does that have to do with the C standard?

The bits are just some physical circuits in silicon. Some operations
of the CPU are designed to implement some mathematical operations, in
which case the bits are designed to represent some mathematical values
-- typically, various powers of two. Depending on the operation, the
same physical bit may represent different values: for
instance, the bit that represents the value 1 in an 8-bit operation may
represent the value 0x100 in a 16-bit operation. The exact rules of
how the various operations assign values to the various pieces of
silicon are not the business of the C standard; the only thing the C
standard does require is that if you look at the contents of a region
of memory as a single value of an integer type T and then as sizeof(T)
values of type unsigned char, then there must be a mapping between
those values that can be described in terms of the bits of the binary
representations of the values. The text doesn't say how the mapping
must order the bits, only that it must exist. If you believe that
there is a requirement there that I have missed, please let me know
where to find it.
 
kuyper

Joe Wright wrote:
....
Displaying bits of a byte in conventional order is a "good thing"
because it allows you and I to know what we are talking about. My main
point is that at the byte level, we must do that. The value five is
always 00000101 at the byte level. Always.

An implementation can generate code for integer arithmetic which
handles a bit pattern of 00000101 as if it represented, for example, a
value of 160. This is not at all "natural", or efficient, but it could
still conform to the C standard. The standard doesn't say what it would
need to say to make it nonconforming. That fact is precisely what
ensures that it would also be feasible (though difficult) to create a
conforming implementation of C for a platform which implements
trinary arithmetic or binary-coded-decimal (BCD) arithmetic at the
hardware level (I mention those two, out of an infinity of other
possibilities, because actual work has been done on both of those kinds
of hardware, though I'm not sure trinary computers were ever anything
but a curiosity).

CPU "design" will determine the byte order of objects in memory. The
"design" cannot determine the bit order of a byte simply because byte is
the finest granularity available. The CPU cannot address a 'bit'.

I agree about bits not being addressable (at least on most
architectures - I remember vaguely hearing about machines where they
were addressable). However, the implementation can generate code which
extracts the bits, and interprets them in any fashion that the
implementation chooses, regardless of what interpretation the hardware
itself uses for those bits. Any hardware feature which made that
impossible would also render it impossible to implement C's bitwise
operators, because support for arbitrary reinterpretation of bit
patterns can be built up out of those operators.
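
To make the 00000101-as-160 example concrete: 160 is 10100000, the same eight bits in reverse order. The sketch below uses only shifts, masks and ordinary addition to do arithmetic on bytes whose stored pattern is the bit-reversal of the conventional one. The names reverse8 and add_reversed are illustrative and eight-bit patterns are assumed; it is meant to show feasibility, not to suggest that any real implementation works this way.

```c
#include <assert.h>

/* Reverse the order of the low 8 bits of x using only bitwise operators. */
unsigned reverse8(unsigned x)
{
    unsigned r = 0;

    assert(x <= 0xFFu);
    for (unsigned i = 0; i < 8; i++)
        r = (r << 1) | ((x >> i) & 1u);
    return r;
}

/* Add two values that are stored bit-reversed: decode both patterns,
   add modulo 256, and re-encode the sum in the same reversed form. */
unsigned add_reversed(unsigned stored_a, unsigned stored_b)
{
    unsigned sum = (reverse8(stored_a) + reverse8(stored_b)) & 0xFFu;
    return reverse8(sum);
}
```

Here reverse8(0x05) is 160, and add_reversed(0x05, 0x80) adds 160 and 1 to give 161, which is stored as 0x85.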

And, as I said before, nothing prevents the hardware itself from
interpreting bit patterns in different ways depending upon which
instructions are used, or which mode of operation has been turned on. I
know I've seen hardware with the ability to interpret bytes as either
binary or BCD, depending upon which instructions were used.
 
kuyper

Joe said:
Ok, I'll play. Assume sizeof (int) is 2.

What is it you're "playing"? You didn't address the point he raised.

int i;
char c = 3;

Assume c looks like 00000011

i = c;

I suppose little endian i looks like 00000011 00000000
and big endian i looks like 00000000 00000011

And I suppose that another possibility is that i looks like 00010000
00000100. What does the standard say that rules out my supposition?
What does it say to make your two suppositions the only possibilities?
You aren't "playing" until you actually cite the relevant text which my
supposition would violate.
 
Dave Thompson

Jordan Abel wrote:
...

They aren't (6.7.2p2), but conceptually it would be a meaningful
concept, and I suspect there are certain obscure situations where
they'd be useful.

PL/I provides both COMPLEX FLOAT and COMPLEX FIXED, where FIXED is
already a generalization from integer.

I can't decide whether that's a point for or against the idea.

- David.Thompson1 at worldnet.att.net
 
Douglas A. Gwyn

Keith said:
There is precedent for introducing additional constraints on integer
representations. The C90 standard said very little about how integer
types are represented; C99 added a requirement that signed integers
must be either sign and magnitude, two's complement, or ones'
complement, and (after the standard was published) that all-bits-zero
must be a representation of 0.

Actually C90 required use of a "binary numeration system", which
according to the reference DP dictionary for C89 was almost the
same thing (we overlooked the possibility of a "bias" added to
the pure-binary interpretation of the bit values). C99 just
made it clear that we didn't mean for there to be a bias. The
all-bits-zero requirement was always our intent; indeed we
often said (among ourselves at least) that calloc() would
properly initialize all integer-type members within the
allocated structure. It was only when it was pointed out that
the intent wasn't actually guaranteed by the spec that we
decided to fix that, since it is considered to be an important
property that is widely exploited in practice.
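
The property being relied on can be sketched as follows. The struct and function names are invented for illustration. The guarantee discussed here covers integer-type members only; all-bits-zero is not guaranteed to be a null pointer or a floating-point zero, so the example sticks to integers.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical record containing only integer-type members. */
struct counters {
    int hits;
    long misses;
    unsigned short flags;
};

/* Allocate n records with every byte zero; returns NULL on failure. */
struct counters *new_counters(size_t n)
{
    return calloc(n, sizeof(struct counters));
}

/* Return 1 if every integer member of every record compares equal to 0. */
int all_members_zero(const struct counters *c, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (c[i].hits != 0 || c[i].misses != 0 || c[i].flags != 0)
            return 0;
    }
    return 1;
}

/* Example use: allocate, check, release. */
void counters_check(void)
{
    struct counters *c = new_counters(4);

    if (c == NULL)
        return;                 /* allocation failure is beside the point */
    assert(all_members_zero(c, 4));
    free(c);
}
```
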

... I doubt that any conforming C99 implementation would have a
PDP-11-style middle-endian representation (unless somebody's actually
done a C99 implementation for the PDP-11).

But there is little value in excluding the PDP-11 longword
format if you're going to allow any variability at all.

In fact I do work with a PDP-11 C (native and cross-) compiler
that is migrating in the direction of C99 conformance.

If you want tighter specification of (integer) types, do so with
some additional mechanism (as we did with <stdint.h>), not by
trying to reduce the range of platforms that can reasonably
conform to the general standard.
 
Douglas A. Gwyn

Joe said:
..The byte is the atomic object. The bits within the byte can't be moved
around like the bytes in a long. A byte with value one hundred will have
a binary bitset of 01100100 on all systems where byte is eight bits. And
you couldn't change it if you wanted to.

There is more to it than that.

01100100(2) is *by convention* just a way of denoting a
mathematical entity that can also be denoted 100(10).
On virtually every current architecture, there is no
actual ordering of bits within hardware storage units
(which often are wider than 8 bits). The ALU imposes
an interpretation on some of the bit "positions" when
it performs carry upon addition, for example, and if
the programming language is going to properly map
human notions of arithmetic operations onto storage
bits *and* exploit ALU arithmetic operations, then it
will have to arrange for values to be represented in
the "native" format. But in principle the PL could
store the bits in some different order and simulate
those few operations where that would affect the
result. You could tell the difference only by
externally probing the data bus (or some similar means).

The C standard requires that the human-notion integer
values act with respect to arithmetic operations the
way that we normally think of them operating, and since
binary/octal/hexadecimal/decimal notations all have a
well-known standard interrelationship, that determines
a lot of the properties that integers will appear to
have. Combine that with the implementor's wanting to
conform with the machine architectural conventions so
that he can maximally exploit the available hardware,
and that nails down most choices -- but differently for
different architectures.
 
Keith Thompson

Douglas A. Gwyn said:
Keith Thompson wrote: [...]
... I doubt that any conforming C99 implementation would have a
PDP-11-style middle-endian representation (unless somebody's actually
done a C99 implementation for the PDP-11).

But there is little value in excluding the PDP-11 longword
format if you're going to allow any variability at all.

In fact I do work with a PDP-11 C (native and cross-) compiler
that is migrating in the direction of C99 conformance.

Well, bang goes that idea.

If you want tighter specification of (integer) types, do so with
some additional mechanism (as we did with <stdint.h>), not by
trying to reduce the range of platforms that can reasonably
conform to the general standard.

My thought was that if the requirements could be tightened up without
affecting any implementations, it might be worth doing. The fact that
there are still PDP-11 C implementations means that's not a
possibility in this case.

Out of idle curiosity, is it still necessary for PDP-11
implementations to use a middle-endian representation? It's
implemented purely in software, right? I suppose compatibility with
existing code and with data files is enough of a reason not to change
it now.

What byte ordering is used for 64-bit integers (long long)?
 
Douglas A. Gwyn

Keith said:
Out of idle curiosity, is it still necessary for PDP-11
implementations to use a middle-endian representation? It's
implemented purely in software, right? I suppose compatibility with
existing code and with data files is enough of a reason not to change
it now.

It is almost never logically "necessary" to use any particular
representation. In fact, DEC PDP-11 FORTRAN used a strictly
little-endian layout for 32-bit integers, largely in order to
permit "pass by reference" punning pointer-to-long as pointer-to-
short or pointer-to-byte, which was convenient for some libraries.
Ritchie's PDP-11 C implementation uses the mixed-endian layout,
which reduces generated code size (and time consumed) in some
cases, since the FP11 long-integer instructions assume that
memory layout. *Some* operations are simulated in software and
might have similar code either way, but when the hardware
long-arithmetic operations are used there is a benefit to
conforming with their expectations.
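
For readers unfamiliar with the layout, here is a rough sketch of the byte orders involved. It assumes 8-bit bytes and the usual description of the PDP-11 long format: two 16-bit words, more significant word first, each word stored little-endian. The function names are made up, and none of this is taken from Ritchie's compiler or any DEC source.

```c
#include <assert.h>
#include <stdint.h>

/* Mixed-endian: high 16-bit word first, each word low byte first. */
void store_mixed_endian(uint32_t v, unsigned char out[4])
{
    out[0] = (unsigned char)((v >> 16) & 0xFFu);
    out[1] = (unsigned char)((v >> 24) & 0xFFu);
    out[2] = (unsigned char)(v & 0xFFu);
    out[3] = (unsigned char)((v >> 8) & 0xFFu);
}

/* Strictly little-endian, for comparison. */
void store_little_endian(uint32_t v, unsigned char out[4])
{
    for (unsigned i = 0; i < 4; i++)
        out[i] = (unsigned char)((v >> (8 * i)) & 0xFFu);
}

/* Rebuild the value from the mixed-endian byte sequence. */
uint32_t load_mixed_endian(const unsigned char in[4])
{
    return ((uint32_t)in[1] << 24) | ((uint32_t)in[0] << 16) |
           ((uint32_t)in[3] << 8)  |  (uint32_t)in[2];
}

/* Round-trip check on an arbitrary value. */
void mixed_endian_check(void)
{
    unsigned char b[4];

    store_mixed_endian(UINT32_C(0x0A0B0C0D), b);
    assert(load_mixed_endian(b) == UINT32_C(0x0A0B0C0D));
}
```

For 0x0A0B0C0D the mixed-endian bytes come out as 0B 0A 0D 0C, against 0D 0C 0B 0A for strict little-endian. Only in the little-endian form do the first two bytes, read as a 16-bit value, give the low half of the number, which is the pointer-punning property mentioned for the FORTRAN layout.
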
And yes, compatibility with existing data formats, particularly
on the same platform, is a significant constraint when making
such choices in the real world.

What byte ordering is used for 64-bit integers (long long)?

That's not yet implemented in my version of Ritchie's compiler,
due to too much knowledge of the specific original architecture
being exploited throughout the lower levels of the implementation.
(It was hard enough to find the implicit assumptions that the host
is a PDP-11 in the original code and replace those.)
I do parse "long long" but map it onto plain "long", which isn't
C99 conformant since that's only 32 bits wide.
My inclination would be to represent the 64-bit integer type as
strictly little-endian and use trimmed-down multiple-precision
algorithms for the run-time support.
Perhaps the GCC PDP-11 target (developed by somebody else) has
64-bit support for long long, in which case other implementations
should probably follow its lead.
 
