Keith Thompson said:
(snip)
In a conforming C implementation, as you say, a byte must be at
least 8 bits. That implies that a pointer must be at least 8 bits,
but not that all those bits are significant.
A hosted environment must support objects of at least 65535
bytes, which would imply at least 16-bit pointers even if only
one object existed -- but that doesn't apply to freestanding
(embedded) environments, as a 4-bit system is almost certain to
be. The requirement to support string literals of at least 4095
characters, assuming the resulting string can be supported at run
time, implies at least 12-bit pointers -- and more than that if
you want more than that single object in your running program.
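The arithmetic behind those figures can be sketched in C. This is
only an illustration: address_bits is a hypothetical helper, not a
standard function, and it counts only the bits needed to tell apart
the bytes of a single object, ignoring null pointers and
one-past-the-end addresses.

```c
#include <assert.h>

/* Smallest number of address bits that can distinguish `count`
   byte addresses. Hypothetical helper for illustration only. */
unsigned address_bits(unsigned long count)
{
    unsigned bits = 0;

    assert(count > 0);
    /* Find the smallest `bits` with 2^bits >= count. The cap at 32
       keeps the shift defined where unsigned long is 32 bits wide. */
    while (bits < 32 && (1UL << bits) < count)
        bits++;
    return bits;
}
```

With this helper, address_bits(65535) gives 16 for the hosted object
limit, and address_bits(4096) gives 12 for a 4095-character string
literal plus its terminating null.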
Seems to me that the phrase "N bit system" is mostly useful when
discussing machines where the data register width, address width,
memory bus width, and ALU width are all the same, or maybe three
of them are the same. Otherwise, it is more useful if you
qualify the statement with which width(s) you mean.
As mentioned, the 4004, generally considered a 4 bit processor,
has 12 bit addresses but, I believe, a 4 bit ALU, 4 bit registers,
and a 4 bit memory bus.
The 8080 (and the 6800, 6502, and others of the time), generally
considered 8 bit processors, have a 16 bit address space,
but 8 bit registers (sometimes used in pairs), an 8 bit ALU,
and an 8 bit memory bus.
IBM S/360 and S/370, generally considered 32 bit architectures,
have 32 bit registers and a 24 bit address space, with
implementations whose ALU and memory data bus widths range from
8 to 64 bits.
The PDP-11, considered 16 bit, has 16 bit registers, 16 bit
addresses (though possibly with bank switching to allow for
a larger physical address space), byte addressable, but
commonly implemented with a 16 bit memory bus, and, I
believe, usually a 16 bit ALU.
The expansion of microprocessors from 8 to 16 bits tended
to also require a larger address space. The TI 9900 was one
of the earlier ones, but it was neither cheap enough nor easy
enough to use to catch on.
The 8086, more or less an extension to the 8080, with a 16 bit
ALU, 16 bit memory bus, and 16 bit registers, but a 20 bit
address space, popularized the transition by making 16 bit systems
affordable to more users. (The PDP-11, down to the LSI 11/03,
had been available, but not priced to compete with existing 8
bit systems.)
The PDP-10, generally considered a 36 bit machine, has 36 bit
registers, presumably a 36 bit ALU, and, I believe, a 36 bit data
bus, but an 18 bit (user) address space (addressing 36 bit words).
The address
space was expanded with an extended addressing system in later
processors.
I can imagine a conforming implementation with 16-bit data pointers,
of which 13 bits are significant, permitting 8192 distinct addresses.
On the other hand, I doubt that it would be worth the effort
to provide a conforming C implementation for a 4-bit system.
For example, it wouldn't be able to handle arrays of 4-bit objects
without compiler extensions.
Yes. For the usual uses, such as BCD calculators, it would be
very nice to have a 4 bit data type.
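Without such a type, standard C has to pack two 4 bit values into
each byte by hand. A minimal sketch, assuming the low nibble is
stored first (an arbitrary choice for this example); nibble_get and
nibble_set are hypothetical names, not part of any library:

```c
#include <assert.h>
#include <stddef.h>

/* Read the 4 bit value at position `index` from an array that packs
   two nibbles per byte, even indices in the low half (assumed). */
unsigned nibble_get(const unsigned char *packed, size_t index)
{
    unsigned char byte = packed[index / 2];

    return (index % 2 == 0) ? (byte & 0x0Fu) : ((byte >> 4) & 0x0Fu);
}

/* Store a 4 bit value at position `index`, leaving the other
   nibble in the same byte unchanged. */
void nibble_set(unsigned char *packed, size_t index, unsigned value)
{
    unsigned char *byte = &packed[index / 2];

    assert(value <= 0x0Fu);
    if (index % 2 == 0)
        *byte = (unsigned char)((*byte & 0xF0u) | value);
    else
        *byte = (unsigned char)((*byte & 0x0Fu) | (value << 4));
}
```

A BCD digit string then takes half a byte per digit, at the cost of
a shift and mask on every access, which is roughly what a compiler
extension for 4 bit objects would have to generate anyway.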
It might make sense to provide an implementation of a C-like language
that *doesn't* conform to the C standard, but that's closer to
the semantic level of the underlying hardware. Portability isn't
likely to be much of a concern; any software written for a 4-bit
system is likely intended to run only on that system.
In any case, the size of a pointer is going to be whatever it
needs to be. Saying it's a "4-bit system" doesn't provide enough
information to answer the question. (I think such systems typically
have much more ROM than RAM; there might be different kinds of
addresses to refer to them.)
As noted above, for smaller systems it is usually the address width
that is larger than the other three. It seems likely that the
address width (though not necessarily the amount of actual memory
available) would be a multiple of four. Assuming 4 bit nybbles are
addressed, the program space likely has to be larger than 256
addressable units, so 12 or 16 bit addresses. Even setting aside
what we already know about the 4004, 12 makes somewhat more sense.
With the extra hardware for 16 bit addressing, the cost of an 8 bit
data bus isn't so far off, even with a fundamental 4 bit data type.
But for such systems, including later ones such as the 8048
and successors, it is more usual to have separate data and
instruction spaces (though maybe with one data bus), and
so separate widths for data and instruction pointers.
So, one should not ask for "the size of a pointer" but for
the "size of a data pointer" or "size of an instruction
pointer."
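C already keeps the two apart: an object pointer and a function
pointer are different types and need not be the same size. A small
sketch that reports each separately; the function names are
hypothetical, and the result is the storage width, which is only an
upper bound on the number of significant address bits:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Storage width in bits of an object occupying `size_in_bytes`. */
unsigned long storage_bits(size_t size_in_bytes)
{
    assert(size_in_bytes > 0);
    return (unsigned long)size_in_bytes * CHAR_BIT;
}

/* Width of a pointer into data space. */
unsigned long data_pointer_bits(void)
{
    return storage_bits(sizeof(void *));
}

/* Width of a pointer into instruction space. */
unsigned long code_pointer_bits(void)
{
    return storage_bits(sizeof(void (*)(void)));
}
```

On a flat-memory desktop system the two values usually match; on a
part with separate data and instruction spaces they may well differ.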
But as a question to get a discussion started, maybe it
isn't so bad. Just don't expect a simple answer.
-- glen