IVT at address 0

dandelion

Hi,

Just another question for the standards jockeys...

Suppose I have an Interrupt Vector Table located at address 0x0000 (16-bit
machine). I want to dump the context of the IVT, by treating it as an array
starting at (you guessed it) 0x0000. So I would have

struct iv_s* ivt = (struct iv_s *) 0x0000;

Which will yield a NULL pointer which may not be dereferenced, lest
undefined behavior would result.

Question: How do I get by this? Is it possible without copying the table (in
assembly to avoid dereferencing any pointers) as the current solution does
or blatantly ignore any UB and just count on my compiler not to mind too
much?
 
Richard Tobin

dandelion said:
Suppose I have an Interrupt Vector Table located at address 0x0000 (16-bit
machine).

You can't access such a thing without going outside what standard C
defines, so you shouldn't worry about it being address 0 any more than
if it was address 123456. Either your operating system and compiler
make it work, or they don't; the C standard doesn't come into it.

-- Richard
 
dandelion

Richard Tobin said:
You can't access such a thing without going outside what standard C
defines, so you shouldn't worry about it being address 0 any more than
if it was address 123456. Either your operating system and compiler
make it work, or they don't; the C standard doesn't come into it.

That's what I figured. Thanks.
 
Richard Bos

dandelion said:
Suppose I have an Interrupt Vector Table located at address 0x0000 (16-bit
machine). I want to dump the context of the IVT, by treating it as an array
starting at (you guessed it) 0x0000. So I would have

struct iv_s* ivt = (struct iv_s *) 0x0000;

Which will yield a NULL pointer which may not be dereferenced, lest
undefined behavior would result.

Well, first of all, if your system allows you to read and write from
address 0x0000 at all, its way to handle this case of UB should be to
behave as if you used a valid pointer.
However, beware of the snark. For all other integer values, I would
expect the above line to yield the expected pointer. A literal 0,
however, is special. It must, as you say (albeit using the wrong
capitalisation), evaluate to a null pointer. And a null pointer need not
be address zero. So, to make doubly sure that you do get address zero, I
suggest you use

int i=0;
struct iv_s *ivt=(struct iv_s *)i;

Of course, the pointer value resulting from such a conversion is system-
specific in any case, so even this is no guarantee. However, on the
(admittedly rare) systems where a null pointer is not all bits zero
internally, I would expect the one-line snippet to fail, whereas the
two-line code will probably work even there.
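
To make the distinction concrete, here is a minimal sketch of the two conversions side by side. The layout of struct iv_s is a made-up placeholder (the thread never shows the real one), the function names are hypothetical, and uintptr_t from <stdint.h> is used in place of int on the assumption that the implementation provides it. Nothing here dereferences the result.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout; the real struct iv_s is not shown in the thread. */
struct iv_s {
    unsigned short offset;
    unsigned short segment;
};

/* A literal 0 cast to a pointer type is a null pointer constant, so this
 * always yields a null pointer, whatever bit pattern that happens to be. */
struct iv_s *from_literal_zero(void)
{
    return (struct iv_s *)0;
}

/* A run-time integer is not a null pointer constant. The conversion is
 * implementation-defined, and would be expected to produce the raw
 * address 0 on most implementations. */
struct iv_s *from_runtime_zero(void)
{
    volatile uintptr_t zero = 0;  /* volatile: the value is read at run time */
    return (struct iv_s *)zero;
}
```

On a machine where the null pointer is all bits zero, the two functions return the same thing and the difference is invisible.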

Richard
 
dandelion

Richard Bos said:
int i=0;
struct iv_s *ivt=(struct iv_s *)i;

<snip>

Don't worry...

I've just checked. UB or no UB, it works fine. NULL is defined as ((void *)
0) and portability is not an issue. The SW is laced with direct hw I/O
anyway.

Of course, the pointer value resulting from such a conversion is system-
specific in any case, so even this is no guarantee. However, on the
(admittedly rare) systems where a null pointer is not all bits zero
internally, I would expect the one-line snippet to fail, whereas the
two-line code will probably work even there.

The compiler has no way of knowing (well, that's a bit strong, but it
probably doesn't know) that i is 0 at runtime, and therefore treats it like
any other pointer. As soon as I find a compiler for which NULL != ((void *)
0) I'll try your suggestion.

For now I'd like to keep it simple.

Ok. Thanks.
 
Old Wolf

dandelion said:
The compiler has no way of knowing (well, that's a bit strong, but it
probably doesn't know) that i is 0 at runtime, and therefore treats it like
any other pointer. As soon as I find a compiler for which NULL != ((void *)
0) I'll try your suggestion.

The only options for NULL are 0 and (void *)0 .
(And any other expression that evaluates to that, eg. (void *)(3 - 3)).
The expression (NULL != (void *)0) must always be false.
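
As a small sketch of that claim (the function names are made up for illustration), both comparisons below should yield 1 on any conforming implementation, regardless of whether NULL is defined as 0 or as (void *)0:

```c
#include <stddef.h>

/* NULL compared with the most common spelling of a null pointer constant. */
int null_matches_void_zero(void)
{
    return NULL == (void *)0;
}

/* (3 - 3) is an integer constant expression with value 0, so casting it
 * to void * also gives a null pointer constant. */
int null_matches_folded_zero(void)
{
    return NULL == (void *)(3 - 3);
}
```

Note that this says nothing about the bit pattern of a null pointer; it only compares null pointers with each other.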

I'm not convinced that the above code won't give you a null
pointer anyway. If your system has a null pointer as
all-bits-zero then the difference is probably academic, but
I would do it this way:

struct iv_s *ivt = (struct iv_s*) sizeof *ivt;
--ivt;
 
dandelion

Old Wolf said:
The only options for NULL are 0 and (void *)0 .
(And any other expression that evaluates to that, eg. (void *)(3 - 3)).
The expression (NULL != (void *)0) must always be false.

Yes. "NULL != ((void *) 0)" is not the correct way to phrase what I had in
mind. I meant a situation where the null pointer is not all bits
zero. Sorry for the confusion (I *did* read the FAQ).

I'm not convinced that the above code won't give you a null
pointer anyway. If your system has a null pointer as
all-bits-zero then the difference is probably academic, but
I would do it this way:

struct iv_s *ivt = (struct iv_s*) sizeof *ivt;
--ivt;

The question *is* academic, since on this platform 0x0000 is the
null-pointer, but i'll ask anyway...

The --ivt expression will (just like Mr Bos's snippet) evaluate to 0,
with the same results (in the hypothetical case of a non-0x00..0
null pointer). Would that not give the same result? I.e. my pointer pointing to
some unwanted part of memory (or worse)?

Hypothetical. I already checked the real-life situation.
 
Old Wolf

dandelion said:
The --ivt expression will (just like Mr Bos's snippet) evaluate to 0,
with the same results (in the hypothetical case of a non-0x00..0
null pointer). Would that not give the same result? I.e. my pointer pointing to
some unwanted part of memory (or worse)?

I don't think so. If there is actually an object of type 'struct iv_s'
at address 0, and the first line above worked as expected,
then you have a pointer to one-past-the-end of that object,
so it must be legal to decrement it and then be pointing to
the object.
This code doesn't ever convert an int 0 to a pointer.

Hypothetical. I already checked the real-life situation.

You found a compiler whose null pointer is not all-bits-zero?
 
Chris Torek

[given a particular problem of constructing a "pointer to address 0"
where CPU-address-0 holds the Interrupt Vector Table or "ivt":]

The --ivt expression will (just like Mr Bos's snippet) evaluate to 0,
with the same results (in the hypothetical case of a non-0x00..0
null pointer). Would that not give the same result? I.e. my pointer pointing to
some unwanted part of memory (or worse)?

Hypothetical. I already checked the real-life situation.

There are a number of flawed ideas behind the question to start
with.

First, all we know about this hypothetical machine is that it has
an Interrupt Vector Table at CPU-address 0.

We need to know more. In order to make progress, I will define
some more about Version 1 of this particular hypothetical machine.

This is a word-addressed machine, with 32-bit words and 8-bit
"char"s.

The C compiler addresses chars with "byte pointers" that are
made by taking the machine's native "word pointers" and shifting
them left two bits. The two low-order bits are then used as
the byte index within the 32-bit word. Converting a byte
pointer to a word pointer uses a right-shift operation, discarding
the byte offset. Code of the form:

int *ip;
void *vp;
int x;

ip = &x;
vp = ip;

printf("(unsigned int)ip: %x (unsigned int)vp: %x\n",
(unsigned int)ip, (unsigned int)vp);

compiles to assembly of the form:

mov [addr_of_x], r1 # ip = &x
sll r1, 2, r2 # vp = ip

mov [addr_of_str], a0 # string in arg0 register
mov r1, a1 # arg1 in arg1 register
mov r2, a2 # arg2 in arg2 register
call printf # invoke printf()

and hence prints things like:

(unsigned int)ip: 100412c1 (unsigned int)vp: 40104b04

i.e., the value in vp is numerically four times greater than
that in ip. Pointer-to-integer casts simply take the raw value
stored in the pointer; it is up to the programmer to make sure
that he knows whether he is dealing with a byte pointer (with
the extra low-order bits) or a word pointer.
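
The shifting scheme of this hypothetical Version 1 machine can be modelled in ordinary C on plain 32-bit integers. This is only a sketch of the arithmetic described above; the function names are invented for illustration and do not correspond to any real compiler.

```c
#include <stdint.h>

/* Word pointer -> byte pointer: shift left two bits, byte index 0. */
uint32_t word_to_byte_ptr(uint32_t word_ptr)
{
    return word_ptr << 2;
}

/* Byte pointer -> word pointer: shift right, discarding the byte index. */
uint32_t byte_to_word_ptr(uint32_t byte_ptr)
{
    return byte_ptr >> 2;
}

/* The two low-order bits select the 8-bit byte within the 32-bit word. */
unsigned byte_index(uint32_t byte_ptr)
{
    return (unsigned)(byte_ptr & 3u);
}
```

With these definitions, word_to_byte_ptr(0x100412c1) gives 0x40104b04, matching the example values printed above.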

Structure pointers are always word pointers; structures are
always a multiple of four of the 8-bit bytes long. A struct
holding a single "char" has three bytes of padding.

We are almost there, but we still need to know how integer-to-pointer
conversions work. Here things are a bit odd: the integral constant
zero converts, at compile time, to the machine's internal nil
pointer, which is 0x3fffffff as a word pointer, and thus 0xfffffffc
as a byte pointer:

int *ip;
void *vp;
ip = 0;
vp = 0;

compiles to:

mov #3fffffff, r1 # ip = NULL
mov #fffffffc, r2 # vp = NULL

Since we need to address the IVT structure at CPU-location-zero
(not CPU-location-0x3fffffff), we cannot just write:

struct iv_s *ivt = 0; /* doesn't work - sets register to 0x3fffffff */

Adding a cast does not help, because we are still using a "null
pointer constant" as the C standard defines the term. So we
might resort to Old Wolf's attempt:

struct iv_s *ivt = (struct iv_s*) sizeof *ivt;
--ivt;

Unfortunately, this does not work either. Here sizeof *ivt is,
say, 64 -- big enough to hold 16 4-byte vector entries -- but
we need to set the register to 16, not 64. The reason is that
"--ivt" moves it down by 64 bytes, which is 16 words:

mov 64, r1 # ivt = (struct iv_s *)sizeof *ivt;
sub 16, r1 # ivt--

Hence the correct C code is:

struct iv_s *ivt = (struct iv_s *)16;
--ivt;

which compiles to:

mov 16, r1
sub 16, r1

leaving r1 set to 0 as desired. Or, equivalently, we can try:

const int i = 0;
struct iv_s *ivt = (struct iv_s *)i;

because in C, "i" is not an "integer constant" at all (despite
the red-herring "const" keyword), hence it is not an integer
constant zero. This might compile to:

mov 0, r1 # i = 0
mov r1, r2 # ivt = (struct iv_s *)i
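
The "--ivt" arithmetic discussed above is easy to check with a small model. This sketch assumes, as in the description of Version 1, that the register holds an address in units of 4-byte words while sizeof counts 8-bit bytes; the function and parameter names are hypothetical. Passing 1 as the unit size models a conventional byte-addressed machine instead.

```c
#include <stdint.h>

/* Register value after "--p" on a struct pointer.
 * reg          : current raw register value
 * struct_bytes : sizeof the struct, in 8-bit bytes
 * unit_bytes   : bytes per address unit (4 on the word-addressed machine,
 *                1 on a byte-addressed one) */
uint32_t decrement_struct_ptr(uint32_t reg, uint32_t struct_bytes,
                              uint32_t unit_bytes)
{
    return reg - struct_bytes / unit_bytes;
}
```

Starting from 64 on the word-addressed machine leaves 48 in the register, which is why the sizeof trick misses; starting from 16 leaves 0.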

Now we move from Version 1 of this machine to Version 2. Here the
compiler-writer has decided that he regrets his multiple pointer
formats with shift operations at every conversion. But he has not
chosen to make bytes be 32 bits long; instead, he has decided to
smuggle the 8-bit-byte offset into the *high* two bits of a 32-bit
word. (The hardware makes this particularly easy because the top
two bits are never put out on the address bus -- which is only 30
bits wide. The hardware uses 32-bit words, after all, so the
machine still addresses 4 giga-octets of memory.)

On Version 2 of the machine, we still need the same 16 that will
get subtracted by "--ivt", and the same C-source-level tricks work.

Version 3 of this machine, on the other hand, has new and different
hardware. The builders of Version 1 and Version 2 got sick of
trying to deal with an external 8-bit-wide world using 32-bit-wide
instructions, so they have rewired everything to use conventional
byte addressing. The machine's internal null pointers are still
0x3fffffff, for backwards-compatibility with Version 2, so:

struct iv_s *ivt = 0;

still does not work; but now instead of setting ivt to 16, we have
to set it to 64, because "--ivt" generates a "sub 64,r1":

struct iv_s *ivt = (struct iv_s *)64;
--ivt;

Of course, after spending dozens of man-years of work fixing broken
C code, the builders of this machine finally make Version 4, which
uses 8-bit-byte-addressed memory and has its internal null pointers
as all-bits-zero. They do this because any other arrangement is
just too painful. This is, of course, the same reason the IA32
architecture is still bug-for-bug compatible with the 80186: as it
turns out, hardware is quite soft, but software is almost impossibly
hard. :)
 
dandelion

Chris Torek said:
[given a particular problem of constructing a "pointer to address 0"
where CPU-address-0 holds the Interrupt Vector Table or "ivt":]

The --ivt expression will (just like Mr Bos's snippet) evaluate to 0,
with the same results (in the hypothetical case of a non-0x00..0
null pointer). Would that not give the same result? I.e. my pointer pointing to
some unwanted part of memory (or worse)?

Hypothetical. I already checked the real-life situation.

There are a number of flawed ideas behind the question to start
with.

Ok. I'm always eager for flawed ideas on my part to be corrected. However,
after a couple of readings it's still not very clear to me what the flaws
were. Or the ideas, for that matter. But that does not make your post less
interesting.

Oh, apropos 'flawed idea'... I wanted to make sure whether the null-pointer
conversion (from 0 to whatever) took place at compile-time. Your post more
than confirmed that.

This is, of course, the same reason the IA32
architecture is still bug-for-bug compatible with the 80186: as it
turns out, hardware is quite soft, but software is almost impossibly
hard. :)

:). In a flippant mood I once wrote a floppy-boot loader. All in 8086
assembly.
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to
spammers.
 
Old Wolf

Chris Torek said:
[given a particular problem of constructing a "pointer to address 0"
where CPU-address-0 holds the Interrupt Vector Table or "ivt":]
[snip - explanation of problems with that]

How well would this one fare?

struct iv_s *ivt;
memset(&ivt, 0, sizeof ivt);
 
Chris Torek

Ok. I'm always eager for flawed ideas on my part to be corrected. However,
after a couple of readings it's still not very clear to me what the flaws
were. Or the ideas, for that matter. ...

I remember writing this, and trying to pick out what the assumptions
behind the question might have been, but not precisely what I thought
they could be and thus what was wrong with them. :)

Seriously, one thing that stood out was the idea that all pointers
were "byte pointers", as it were. The old PR1ME machines had 32-bit
"word pointers" and 48-bit "byte pointers", and even in the mid-1980s,
a new machine that came on the market -- the Data General Eclipse --
had separate *formats* for pointers, with word pointers pointing to
16-or-more-bit data, and byte pointers pointing to 8-bit data. A
word pointer's value was always half as much as the corresponding
byte pointer's value.

Oh, apropos 'flawed idea'... I wanted to make sure whether the null-pointer
conversion (from 0 to whatever) took place at compile-time. Your post more
than confirmed that.

Yes. I always like to imagine what C might have been like if Dennis
had added a "nil" keyword. Instead of the klunky Standard C three-word
phrase, "null pointer constant", we would just say "nil". A null
pointer would be obtained by using the keyword "nil" wherever a
pointer is required:

char *p = nil; /* sets p to the "null pointer of type char *" */

or:

(int *)nil /* produces the "null pointer of type int *" */

but if you wrote:

printf("nil prints out as %p\n", nil);

you would get a compile-time diagnostic (error message), because the
prototype for printf() is:

int printf(const char *, ...);

and placing the "nil" keyword in an untyped context would be an
error -- the compiler doesn't know *which* "nil" to use ("byte
pointer null", "word pointer null", etc.) on machines where there
are multiple kinds of "null pointer". You would fix this with:

printf("nil prints out as %p\n", (void *)nil);

The cast provides the context, telling the compiler "use the kind
of null pointer needed for the type `void *'".
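
In today's C, which has no such keyword, the same cast is what makes the variadic call well-defined. A minimal sketch follows; the helper name is made up, and the text that %p produces is implementation-defined, so no particular output is assumed.

```c
#include <stdio.h>

/* Formats a null pointer with %p into buf. The (void *) cast supplies the
 * pointer type that the "..." part of the prototype cannot provide.
 * Returns the number of characters snprintf would have written. */
int format_null_pointer(char *buf, size_t size)
{
    return snprintf(buf, size, "null prints out as %p", (void *)0);
}
```

Without the cast, a bare 0 would be passed as an int, which is exactly the missing-context case described above.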

Dennis did not do this, though. Instead of a keyword that means
"null pointer of type supplied by context, or error if context is
missing" we have "integral constant expression with value zero".
The problem *only* occurs when the required context is missing.
With a keyword -- whether it were spelled "nil" or "__builtin_null"
or even "_KUMQUAT" -- the compiler would know: aha, context missing,
must complain! With "integral constant zero", on the other hand,
what remains when the context is removed is a valid "int": 0.

Someday (in my copious spare time perhaps :) ) I should add a
trick to gcc, so that __builtin_nil (or however it is to be spelled)
is an integral constant expression with value 0, except that it
produces a diagnostic message whenever it is used in something
other than a pointer context. Then we can:

#define NULL __builtin_water_buffalo /* or however it is spelled */

and get what we would have now if Dennis had just added the keyword
a long time ago.
 
Chris Torek

Chris Torek said:
[given a particular problem of constructing a "pointer to address 0"
where CPU-address-0 holds the Interrupt Vector Table or "ivt":]
I would do it this way:

struct iv_s *ivt = (struct iv_s*) sizeof *ivt;
--ivt;
[snip - explanation of problems with that]

Make that "potential problems" -- maybe it works, maybe not. :)

How well would this one fare?

struct iv_s *ivt;
memset(&ivt, 0, sizeof ivt);

We would need to know one other thing about the hardware and/or
compiler involved: do pointers have any "special" bits that mark
them as valid pointers, for instance. If not -- if we just need
all-bits-zero in the pointer -- this would work (and work on all
four of the proposed variants of the hardware, making it a "better
answer" than the other tricks shown).
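
A sketch of the memset approach, with a check that the stored representation really is all-bits-zero. The struct layout and function names are placeholders, and the code assumes the "no special bits" case just described: on hardware where all-bits-zero is not a valid pointer representation, even reading the pointer back could misbehave.

```c
#include <string.h>

/* Hypothetical layout; the real struct iv_s is not shown in the thread. */
struct iv_s {
    unsigned short offset;
    unsigned short segment;
};

/* Builds a pointer whose object representation is all-bits-zero,
 * without ever converting an integer 0 to a pointer. */
struct iv_s *all_bits_zero_pointer(void)
{
    struct iv_s *ivt;
    memset(&ivt, 0, sizeof ivt);
    return ivt;
}

/* Returns 1 if every byte of the object is zero, 0 otherwise.
 * Inspecting an object through unsigned char is always permitted. */
int representation_is_all_zero(const void *object, size_t size)
{
    const unsigned char *bytes = object;
    for (size_t i = 0; i < size; i++) {
        if (bytes[i] != 0)
            return 0;
    }
    return 1;
}
```

Whether the resulting pointer compares equal to a null pointer is a separate question, and depends on the machine's null pointer representation.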

Again, the most important thing to note here is that accessing
this Interrupt Vector Table is *already* inherently machine-dependent,
so we can and should simply look at the compiler's documentation to
find out the machine-dependent method by which we accomplish this
machine-dependent task.
 
dandelion

Chris Torek said:
I remember writing this, and trying to pick out what the assumptions
behind the question might have been, but not precisely what I thought
they could be and thus what was wrong with them. :)

Yes... That helps. ;-).

Seriously, one thing that stood out was the idea that all pointers
were "byte pointers", as it were.

On today's systems, that's a pretty fair assumption, for reasons your post
makes very clear.

The old PR1ME machines had 32-bit
"word pointers" and 48-bit "byte pointers", and even in the mid-1980s,
a new machine that came on the market -- the Data General Eclipse --
had separate *formats* for pointers, with word pointers pointing to
16-or-more-bit data, and byte pointers pointing to 8-bit data. A
word pointer's value was always half as much as the corresponding
byte pointer's value.

word_ptr == byte_ptr >> 1 ?

Yes. I always like to imagine what C might have been like if Dennis
had added a "nil" keyword.

Personally I like the C++ definition of NULL. Any value that is not a valid
pointer, with a warning on the side that you should not assume NULL to have
any specific value. I spent quite a lot of time fishing "((void *) 0)" out
of C++ programs (C+- would be a better term) last weekend.

I do agree with your 'nil' idea, though. Some *proper* support for
null-pointer-constants would be appreciated, especially now the problems
with an 'on the fly' substitution of integer constant 0 have been shown.

Someday (in my copious spare time perhaps :) ) I should add a
trick to gcc, so that __builtin_nil (or however it is to be spelled)
is an integral constant expression with value 0, except that it
produces a diagnostic message whenever it is used in something
other than a pointer context. Then we can:

#define NULL __builtin_water_buffalo /* or however it is spelled */

and get what we would have now if Dennis had just added the keyword
a long time ago.

Make sure you post it. I'd be more than willing to invest some of my
precious spare time to test it.

Sparetime project # 3829192/D, by the way...
 
