[given a particular problem of constructing a "pointer to address 0"
where CPU-address-0 holds the Interrupt Vector Table or "ivt":]
The --ivt expression will (just as the snippet of Mr Tobin) evaluate to 0,
with the same results (in the hypothetical case of a non-0x00..0
null pointer). Would that not give the same result? I.e., my pointer pointing to
some unwanted part of memory (or worse)?
Hypothetical. I already checked the real-life situation.
There are a number of flawed ideas behind the question to start
with.
First, all we know about this hypothetical machine is that it has
an Interrupt Vector Table at CPU-address 0.
We need to know more. In order to make progress, I will define
some more about Version 1 of this particular hypothetical machine.
This is a word-addressed machine, with 32-bit words and 8-bit
"char"s.
The C compiler addresses chars with "byte pointers" that are
made by taking the machine's native "word pointers" and shifting
them left two bits. The two low-order bits are then used as
the byte index within the 32-bit word. Converting a byte
pointer to a word pointer uses a right-shift operation, discarding
the byte offset. Code of the form:
    int *ip;
    void *vp;
    int x;
    ip = &x;
    vp = ip;
    printf("(unsigned int)ip: %x (unsigned int)vp: %x\n",
        (unsigned int)ip, (unsigned int)vp);
compiles to assembly of the form:
    mov  [addr_of_x], r1      # ip = &x
    sll  r1, 2, r2            # vp = ip
    mov  [addr_of_str], a0    # string in arg0 register
    mov  r1, a1               # arg1 in arg1 register
    mov  r2, a2               # arg2 in arg2 register
    call printf               # invoke printf()
and hence prints things like:
    (unsigned int)ip: 100412c1 (unsigned int)vp: 40104b04
i.e., the value in vp is numerically four times greater than
that in ip. Pointer-to-integer casts simply take the raw value
stored in the pointer; it is up to the programmer to make sure
that he knows whether he is dealing with a byte pointer (with
the extra low-order bits) or a word pointer.
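The shift-based conversion can be modeled on an ordinary host with
plain 32-bit integers. This is only a sketch: the names
word_to_byte_ptr and byte_to_word_ptr are made up for illustration,
and the "registers" are uint32_t values rather than real pointers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of Version 1: a byte pointer is the word
   pointer shifted left two bits, with the byte index (0..3)
   stored in the two low-order bits. */
uint32_t word_to_byte_ptr(uint32_t word_ptr, unsigned byte_index)
{
    assert(byte_index < 4);            /* only four chars per word */
    return (word_ptr << 2) | byte_index;
}

/* Converting back shifts right, discarding the byte index. */
uint32_t byte_to_word_ptr(uint32_t byte_ptr)
{
    return byte_ptr >> 2;
}
```

With the values from the printf example, word_to_byte_ptr(0x100412c1, 0)
gives 0x40104b04, and byte_to_word_ptr() of any of the four byte
pointers into that word gives 0x100412c1 back.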
Structure pointers are always word pointers; structures are
always a multiple of four of the 8-bit bytes long. A struct
holding a single "char" has three bytes of padding.
We are almost there, but we still need to know how integer-to-pointer
conversions work. Here things are a bit odd: the integral constant
zero converts, at compile time, to the machine's internal nil
pointer, which is 0x3fffffff as a word pointer, and thus 0xfffffffc
as a byte pointer:
    int *ip;
    void *vp;
    ip = 0;
    vp = 0;
compiles to:
    mov  #3fffffff, r1        # ip = NULL
    mov  #fffffffc, r2        # vp = NULL
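The relation between the two nil values is just the same two-bit
shift. A small model, again using uint32_t in place of the machine's
registers (NIL_WORD_PTR and both function names are assumptions of
this sketch, not anything a real compiler provides):

```c
#include <assert.h>
#include <stdint.h>

#define NIL_WORD_PTR 0x3fffffffu   /* hypothetical internal nil, word form */

/* Model of "ip = 0": the compiler substitutes the nil word pointer. */
uint32_t null_as_word_ptr(void)
{
    return NIL_WORD_PTR;
}

/* Model of "vp = 0": the same nil in byte-pointer form (shifted left 2). */
uint32_t null_as_byte_ptr(void)
{
    uint32_t byte_ptr = NIL_WORD_PTR << 2;
    assert((byte_ptr >> 2) == NIL_WORD_PTR);   /* converts back to the same nil */
    return byte_ptr;
}
```

Neither result is numerically zero, which is the whole difficulty.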
Since we need to address the IVT structure at CPU-location-zero
(not CPU-location-0x3fffffff), we cannot just write:
    struct iv_s *ivt = 0; /* doesn't work - sets register to 0x3fffffff */
Adding a cast does not help, because we are still using a "null
pointer constant" as the C standard defines the term. So we
might resort to Old Wolf's attempt:
    struct iv_s *ivt = (struct iv_s *) sizeof *ivt;
    --ivt;
Unfortunately, this does not work either. Here sizeof *ivt is,
say, 64 -- big enough to hold 16 4-byte vector entries -- but
we need to set the register to 16, not 64. The reason is that
"--ivt" moves it down by 64 bytes, which is 16 words:
    mov  64, r1               # ivt = (struct iv_s *)sizeof *ivt;
    sub  16, r1               # --ivt (leaves 48, not 0)
Hence the correct C code is:
    struct iv_s *ivt = (struct iv_s *)16;
    --ivt;
which compiles to:
    mov  16, r1
    sub  16, r1
leaving r1 set to 0 as desired. Or, equivalently, we can try:
    const int i = 0;
    struct iv_s *ivt = (struct iv_s *)i;
because in C, "i" is not an "integer constant" at all (despite
the red-herring "const" keyword), hence it is not an integer
constant zero. This might compile to:
    mov  0, r1                # i = 0
    mov  r1, r2               # ivt = (struct iv_s *)i
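The distinction can be written down in C that compiles on an ordinary
host. This is a sketch only: the 16-entry layout of struct iv_s and
the function names are assumptions for illustration, and I go through
uintptr_t (from <stdint.h>) simply to avoid size-mismatch warnings on
machines where int and pointers differ in width.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct iv_s { uint32_t vec[16]; };   /* hypothetical: 16 4-byte vectors */

/* A null pointer constant: always yields a null pointer, whatever
   bit pattern the machine uses internally for it. */
struct iv_s *from_constant_zero(void)
{
    return (struct iv_s *)0;
}

/* A run-time zero: the conversion is implementation-defined, so the
   result need not be a null pointer. */
struct iv_s *from_runtime_zero(void)
{
    const int i = 0;                 /* "const", but not an integer constant */
    return (struct iv_s *)(uintptr_t)i;
}

/* The only guarantee portable code gets concerns the first form. */
void check_guarantee(void)
{
    assert(from_constant_zero() == NULL);
}
```

On most real machines the two functions return pointers that compare
equal, which is exactly why the difference goes unnoticed until
someone builds a machine like this one.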
Now we move from Version 1 of this machine to Version 2. Here the
compiler-writer has decided that he regrets his multiple pointer
formats with shift operations at every conversion. But he has not
chosen to make bytes be 32 bits long; instead, he has decided to
smuggle the 8-bit-byte offset into the *high* two bits of a 32-bit
word. (The hardware makes this particularly easy because the top
two bits are never put out on the address bus -- which is only 30
bits wide. The hardware uses 32-bit words, after all, so the
machine still addresses 4 giga-octets of memory.)
On Version 2 of the machine, we still need the same 16 that will
get subtracted by "--ivt", and the same C-source-level tricks work.
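A model of the Version 2 format, under the same caveats as before
(uint32_t stands in for a register, and the v2_ names and
WORD_ADDR_MASK are invented for this sketch):

```c
#include <assert.h>
#include <stdint.h>

#define WORD_ADDR_MASK 0x3fffffffu   /* low 30 bits reach the address bus */

/* Hypothetical Version 2 byte pointer: byte index in the top two bits. */
uint32_t v2_make_byte_ptr(uint32_t word_addr, unsigned byte_index)
{
    assert(word_addr <= WORD_ADDR_MASK && byte_index < 4);
    return ((uint32_t)byte_index << 30) | word_addr;
}

/* What the bus sees: the top two bits are dropped, with no shift. */
uint32_t v2_word_addr(uint32_t byte_ptr)
{
    return byte_ptr & WORD_ADDR_MASK;
}

/* The byte index is recovered from the top two bits. */
unsigned v2_byte_index(uint32_t byte_ptr)
{
    return (unsigned)(byte_ptr >> 30);
}
```

A structure pointer has byte index 0, so its value is just the word
address; this is why the register still has to start at 16 for
"--ivt" to reach 0.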
Version 3 of this machine, on the other hand, has new and different
hardware. The builders of Version 1 and Version 2 got sick of
trying to deal with an external 8-bit-wide world using 32-bit-wide
instructions, so they have rewired everything to use conventional
byte addressing. The machine's internal null pointers are still
0x3fffffff, for backwards-compatibility with Version 2, so:
    struct iv_s *ivt = 0;
still does not work; but now instead of setting ivt to 16, we have
to set it to 64, because "--ivt" generates a "sub 64,r1":
    struct iv_s *ivt = (struct iv_s *)64;
    --ivt;
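The arithmetic behind all three versions fits in one small function.
This is a model, not anything the compiler exposes: predecrement and
its parameters are names I am making up here, with unit_bytes being 4
on the word-addressed Versions 1 and 2 and 1 on the byte-addressed
Version 3.

```c
#include <assert.h>
#include <stdint.h>

/* Model of "--ivt": the register drops by one object, measured in
   the machine's address units (4 bytes per unit when word-addressed,
   1 byte per unit when byte-addressed). */
uint32_t predecrement(uint32_t reg, uint32_t object_bytes, uint32_t unit_bytes)
{
    assert(unit_bytes != 0 && object_bytes % unit_bytes == 0);
    return reg - object_bytes / unit_bytes;
}
```

With sizeof *ivt being 64, predecrement(64, 64, 4) gives 48 (the
failed attempt on Versions 1 and 2), predecrement(16, 64, 4) gives 0,
and predecrement(64, 64, 1) gives 0 on Version 3.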
Of course, after spending dozens of man-years of work fixing broken
C code, the builders of this machine finally make Version 4, which
uses 8-bit-byte-addressed memory and has its internal null pointers
as all-bits-zero. They do this because any other arrangement is
just too painful. This is, of course, the same reason the IA32
architecture is still bug-for-bug compatible with the 80186: as it
turns out, hardware is quite soft, but software is almost impossibly
hard.