...malloc ...from Rome :-)


Keith Thompson

The Intel *86 architecture (in 16-bit or 32-bit protected mode) is
an excellent example of this. Loading a (so-called "far") pointer
containing a no-longer-valid segment into, say, ES:SI or ES:ESI
will cause a trap.

Does loading a null pointer into ES:SI or ES:ESI cause a trap?
Yes, but you didn't need the declaration of p to get undefined
behavior. Just leaving off the argument to printf() matching %p
is enough.

Yes, but that illustrates a different point.
 

Chris Torek

Does loading a null pointer into ES:SI or ES:ESI cause a trap?

It could be arranged to do so, by making NULL's segment invalid.

If a compiler/system did set this up, so that in:

char *p, *q;
p = NULL;
q = p;

the assignment "q = p" trapped by loading p's NULL-value segment
into a segment register, that compiler/system would fail to conform.
The easiest way to fix this is to make sure that the "null segment"
(whatever segment number the compiler-writer chooses) is always
valid, if for some reason it is important to allow "q = p" to load
p's segment into a segment register.

In other words, simply loading a bad pointer into a segment register
causes a trap, but it should be OK to "load NULL", as long as you do not
indirect through it:

p = NULL;
use(p); /* OK */
use2(*p); /* ERROR */

so a compiler must not trap on the call to use(), in case it
longjmp()s away. If it traps on the call to use2() (assuming this
call is actually attempted), that is fine. Since the hardware
traps on a load of the segment register with an invalid segment,
either NULL's segment has to be invalid, or the call to use()
has to not load the segment register (or both).
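
As a minimal sketch of the point above (the names escape, use_calls and run_example are hypothetical, and use() is simply assumed to longjmp() away), control never reaches use2(*p), so the null pointer is passed around but never dereferenced:

```c
#include <setjmp.h>
#include <stddef.h>

static jmp_buf escape;   /* hypothetical: where use() jumps back to */
static int use_calls;    /* counts how often use() actually ran */

/* Receives the null pointer by value and never indirects through it. */
static void use(char *p)
{
    (void)p;
    use_calls++;
    longjmp(escape, 1);  /* control leaves here and does not return */
}

/* Would need the value of *p, which is undefined for a null pointer. */
static void use2(char c)
{
    (void)c;
}

/* Returns 1 if use() escaped before the use2(*p) call was attempted. */
int run_example(void)
{
    char *volatile p = NULL;

    if (setjmp(escape) == 0) {
        use(p);      /* OK: only the pointer value is passed */
        use2(*p);    /* never reached in this sketch */
        return 0;
    }
    return 1;
}
```

If the implementation trapped while merely passing p to use(), run_example() could not return 1, which is why a conforming implementation has to tolerate the call.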
 

Chris Torek

Typo alert (low blood sugar? :) ... was posting just post-gym):

... Since the hardware
traps on a load of the segment register with an invalid segment,
either NULL's segment has to be invalid, or the call to use()
has to not load the segment register (or both).

Of course, that should read "either NULL's segment has to be VALID"
(not "invalid").
 

Gordon Burditt

yeah i can follow that now its kind of like
Does loading a null pointer into ES:SI or ES:ESI cause a trap?

That depends on what bit pattern you use for a null pointer. The
short answer is NO for an all-bits-zero null pointer.

I think some of the logic in handling selectors on the x86 architecture
was designed for C.

The all-bits-zero selector (actually, the two low-order bits of 16
are a don't care in this situation) generates a trap if you load
it into CS or SS. It doesn't cause a trap if you load it into DS,
ES, FS or GS. You *CANNOT* map that selector to anything. You
also can't dereference the pointer without a trap.
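
For readers who want to see the bit layout, here is a rough sketch of how a 16-bit protected-mode selector splits into fields. The function names are hypothetical; the layout assumed is the documented one (bits 1..0 requested privilege level, bit 2 table indicator, bits 15..3 table index). It only decodes numbers and touches no segment register:

```c
#include <stdint.h>

/* Bits 15..3: index into the descriptor table. */
unsigned selector_index(uint16_t sel)
{
    return sel >> 3;
}

/* Bit 2: 0 selects the global table, 1 selects the local table. */
unsigned selector_uses_local_table(uint16_t sel)
{
    return (sel >> 2) & 1u;
}

/* Bits 1..0: requested privilege level (the "don't care" bits above). */
unsigned selector_rpl(uint16_t sel)
{
    return sel & 3u;
}

/* A null selector is global-table index 0, whatever the low two bits are. */
int is_null_selector(uint16_t sel)
{
    return (sel & 0xFFFCu) == 0;
}
```

So the values 0 through 3 all count as the null selector, while 4 (local table, index 0) and 8 (global table, index 1) do not.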

You can't load an all-bits-zero selector into CS since loading
CS:EIP is a branch instruction, and it's impossible to avoid
dereferencing it on the next instruction. The processor is rather
picky about keeping a valid stack pointer and it's difficult to
avoid dereferences into the stack segment if SS is NULL. I suppose
you could come up with a useful code sequence that loads SS with
NULL and manages not to dereference it, but the processor doesn't
allow it.

Gordon L. Burditt
 

Gordon Burditt

The Intel *86 architecture (in 16-bit or 32-bit protected mode) is
The easiest way to fix this is to make sure that the "null segment"
(whatever segment number the compiler-writer chooses) is always
valid, if for some reason it is important to allow "q = p" to load
p's segment into a segment register.

In other words, simply loading a bad pointer into a segment register
causes a trap, but it should be OK to "load NULL", as long as you do not
indirect through it:

p = NULL;
use(p); /* OK */
use2(*p); /* ERROR */

so a compiler must not trap on the call to use(), in case it
longjmp()s away. If it traps on the call to use2() (assuming this
call is actually attempted), that is fine. Since the hardware
traps on a load of the segment register with an invalid segment,
either NULL's segment has to be invalid [valid], or the call to use()
has to not load the segment register (or both).

BUT, there's an odd halfway-in-between case for the all-bits-zero
selector (Intel x86 architecture): it's always a valid segment (so
you can load the pointer into a segment register without a trap),
but you can't dereference it. Just what you'd want for a null
pointer. And that particular selector ALWAYS acts like that: you
can't change it.

Gordon L. Burditt
 

CBFalconer

Gordon said:
.... snip ...

BUT, there's an odd halfway-in-between case for the all-bits-zero
selector (Intel x86 architecture): it's always a valid segment (so
you can load the pointer into a segment register without a trap),
but you can't dereference it. Just what you'd want for a null
pointer. And that particular selector ALWAYS acts like that: you
can't change it.

Oh, you can dereference it, but you probably shouldn't. If you do
you are messing with the zero-divide interrupt vector, IIRC.

--
"I support the Red Sox and any team that beats the Yankees"
"Any baby snookums can be a Yankee fan, it takes real moral
fiber to be a Red Sox fan"
"I listened to Toronto come back from 3:0 in '42, I plan to
watch Boston come back from 3:0 in 04"
 

James Stevenson

The Intel *86 architecture (in 16-bit or 32-bit protected mode) is
an excellent example of this. Loading a (so-called "far") pointer
containing a no-longer-valid segment into, say, ES:SI or ES:ESI
will cause a trap.


Yes, but you didn't need the declaration of p to get undefined
behavior. Just leaving off the argument to printf() matching %p
is enough.

Yeah please note the time on the posting

Date: Wed, 20 Oct 2004 00:43:43 +0100
From: James Stevenson <[email protected]>
 

Dave Vandervies

Chris Torek said:
In other words, simply loading a bad pointer into a segment register
causes a trap, but it should be OK to "load NULL", as long as you do not
indirect through it:

p = NULL;
use(p); /* OK */
use2(*p); /* ERROR */

so a compiler must not trap on the call to use(), in case it
longjmp()s away.

Isn't it enough for use() to involve user-visible side effects like
input or output?

Or would that lead us to a long and twisty thread arguing about what
constitutes a strictly conforming use of user-visible side effects,
rather than simply invalidating a trap on the call to use()?


dave
 

Mark McIntyre

I suggest you get a 'backup' provider.

I have one. Problem was that my normal provider was *silently*
rejecting-but-accepting posts, and didn't advise my newsclient till I tried
to end the session....
 

Chris Torek

Isn't it enough for use() to involve user-visible side effects like
input or output?

Or would that lead us to a long and twisty thread arguing about what
constitutes a strictly conforming use of user-visible side effects,
rather than simply invalidating a trap on the call to use()?

It gets particularly complicated if we allow "undefined behavior"
to violate the laws of physics. :)

Seriously, I figured longjmp() was a simple example of "control
may never reach use2()". Other possibilities include exit()ing,
and doing irreversible user-visible side effects... but consider
how tricky "irreversible" might be. Suppose use(p) does:

write(1, "use() called\n", 13); /* POSIX */

so that "use() called" appears on stdout. This is certainly
user-visible -- or is it? What if it goes into a terminal-simulator
window that, before the pixels are even drawn on the screen, is
completely destroyed when use2(*p) causes the system to crash and
reboot?

Or, to rephrase the old saw about a tree in a forest: If a windows
box bluescreens before you can see a particular result, did it even
compute the result?
 

Gordon Burditt

BUT, there's an odd halfway-in-between case for the all-bits-zero
Oh, you can dereference it, but you probably shouldn't. If you do
you are messing with the zero-divide interrupt vector, IIRC.

No, you AREN'T messing with the zero-divide interrupt vector. The
global table doesn't have an entry 0 (even though where it would
logically go may overlap that vector). If you try to dereference a
NULL pointer (all-zero-bit selector), the processor won't look at
that entry, and even if you screw up the zero-divide vector to be
a valid table entry with associated memory, the dereference won't
work.

The all-bits zero selector (low-order two bits actually are don't-cares)
is a special case designed into the chip.

Gordon L. Burditt
 

Keith Thompson

Herbert Rosenau said:
[...]
if(start!=NULL)
{
memset((char *)start->acNome,'\0',sizeof(start->acNome));

Using memset to fill anything other than an array of char (or a string)
is undefined behavior too. E.g. filling a pointer with binary 0 bytes
can be an access violation.

Any object can be accessed as if it were an array of unsigned char.
Using memset() to fill a pointer object with zero bytes is perfectly
legal; if you then read the bytes back *as bytes*, you'll get back the
same zero bytes you wrote into it.
There is no guarantee that a NULL pointer
has any bits set to 0. It may have padding bits set to nonzero to be a
valid pointer.

That's correct. Using memset() to fill a pointer with zero bytes is
not necessarily useful (it can give you a null pointer on some
implementations, but that's not guaranteed), and if you do so any
attempt to use the pointer *as a pointer* can invoke undefined
behavior. But the undefined behavior doesn't occur until you attempt
to use the pointer (where "use" includes both dereferencing it and
just examining its value).
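
A small sketch of that distinction, with hypothetical function names: the first function only ever touches the pointer object as bytes, which is fine everywhere; the second gets a null pointer the portable way, by assignment rather than by memset():

```c
#include <stddef.h>
#include <string.h>

/* Fill a pointer object with zero bytes and read it back *as bytes*.
   Returns 1 if every byte reads back as zero. The pointer is never
   used as a pointer, so no undefined behavior is involved. */
int zero_bytes_read_back(void)
{
    char *p;
    unsigned char bytes[sizeof p];
    size_t i;

    memset(&p, 0, sizeof p);      /* writes the object representation */
    memcpy(bytes, &p, sizeof p);  /* reads it back as unsigned char */

    for (i = 0; i < sizeof bytes; i++)
        if (bytes[i] != 0)
            return 0;
    return 1;
}

/* The portable way to get a null pointer: assign NULL and let the
   compiler choose whatever bit pattern the implementation uses. */
int assigned_null_compares_equal(void)
{
    char *p = NULL;
    return p == NULL;
}
```

What the sketch deliberately does not do is compare the memset()-filled p against NULL, since that would be using the value as a pointer.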
 

Raymond Martineau

No, you AREN'T messing with the zero-divide interrupt vector. The
global table doesn't have an entry 0 (even though where it would
logically go may overlap that vector).

Take a look at the following output from "debug.exe":
-d 0000:0000
:0000:0000 68 10 A7 00 BB 13 53 05-16 00 9A 03 B1 13 53 05 h.....S.......S.
:0000:0010 8B 01 70 00 B9 06 10 02-40 07 10 02 FF 03 10 02 ..p.....@.......
:0000:0020 46 07 10 02 0A 04 10 02-3A 00 9A 03 54 00 9A 03 F.......:...T...
:0000:0030 6E 00 9A 03 88 00 9A 03-A2 00 9A 03 FF 03 10 02 n...............
:0000:0040 A9 08 10 02 A4 09 10 02-AA 09 10 02 5D 04 10 02 ............]...
:0000:0050 B0 09 10 02 0D 02 DF 02-C4 09 10 02 8B 05 10 02 ................
:0000:0060 0E 0C 10 02 14 0C 10 02-1F 0C 10 02 AD 06 10 02 ................
:0000:0070 AD 06 10 02 A4 F0 00 F0-37 05 10 02 42 42 00 C0 ........7...BB..
:-g=00a7:1068
:
:Divide overflow

If it looks like a duck, walks like a duck and quacks like a duck...
If you try to dereference a
NULL pointer (all-zero-bit selector), the processor won't look at
that entry, and even if you screw up the zero-divide vector to be
a valid table entry with associated memory, the dereference won't
work.

If the processor didn't allow access to the null pointer, then I wouldn't
have been able to retrieve the address of the interrupt vector, let alone
call it. If I wanted to, I could set the Zero-divide interrupt vector
to whatever I wanted, even something that could potentially reset the
computer.

The only reason the processor would prevent the address from being touched
is if it is running in protected mode. In this case, the interrupt table
is moved elsewhere in memory, and applications running under the
protected-mode kernel are given their own private address space. However,
it can still be possible to crash the system if the kernel allows it (e.g.
running debug under the default memory setting under Windows 95.)
 

CBFalconer

Gordon said:
No, you AREN'T messing with the zero-divide interrupt vector. The
global table doesn't have an entry 0 (even though where it would
logically go may overlap that vector). If you try to dereference a
NULL pointer (all-zero-bit selector), the processor won't look at
that entry, and even if you screw up the zero-divide vector to be
a valid table entry with associated memory, the dereference won't
work.

The all-bits zero selector (low-order two bits actually are
don't-cares) is a special case designed into the chip.

Then how does the zero-divide vector get set in the first place?

 

Chris Torek

(This is getting off-topic, and probably should move to some
x86-specific newsgroup...)

Then how does the zero-divide vector get set in the first place?

I am not an expert on the x86 architecture, and would have to haul
out my Pentium book to go into all the details, but this is mixing
up several different concepts.

The x86 has three "descriptor tables" involved here, called the GDT,
LDT, and IDT. These stand for "Global", "Local", and "Interrupt"
Descriptor Table.

Exceptions, including zero-divide, indirect through the IDT. Segments
loaded into data registers indirect through the GDT or LDT.

A "logical" (user-supplied) memory address -- this may be a virtual
memory address, depending on processor mode -- as specified by a
segment register plus another 16 or 32 bits of "address" register
(depending on mode) is handled by combining the DT-lookup-result
(from the segment) with the rest of the address. If this is a
virtual address, it is then run through a virtual-to-physical
translation. The final physical address, if valid, determines what
memory is read or written.
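
To make the lookup step concrete, here is a toy model in C. Everything in it is an assumption for illustration: the struct, its field names and toy_translate() are hypothetical, there is only one table, the table-indicator and privilege bits are ignored, and there is no paging step. It only shows "descriptor base plus offset, with the unused entry 0 refusing to translate":

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, much-simplified descriptor: where the segment starts,
   the largest valid offset, and whether the entry is usable at all. */
struct toy_descriptor {
    uint32_t base;
    uint32_t limit;
    int present;
};

/* Returns 0 and stores the linear address on success. Returns -1 in
   the cases where real hardware would raise a fault instead. */
int toy_translate(const struct toy_descriptor *table, size_t entries,
                  uint16_t selector, uint32_t offset, uint32_t *linear)
{
    size_t index = selector >> 3;        /* drop the low three bits */

    if (index == 0 || index >= entries)  /* entry 0 is never used */
        return -1;
    if (!table[index].present || offset > table[index].limit)
        return -1;

    *linear = table[index].base + offset;
    return 0;
}
```

With a two-entry table whose entry 1 has base 0x1000 and limit 0xFF, selector 8 and offset 0x10 give 0x1010, while selector 0 fails whatever the table contains.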

It is possible to have the GDT and IDT overlap, in which case IDT#0
is the zero-divide vector while IDT#1 is also GDT#1. Here GDT#0 is
unused (as Gordon Burditt pointed out -- this was not something I was
aware of), but you can access it by going through some nonzero GDT
entry. That is, just set DS to (say) 5000, making sure that the
GDT has at least 5001 entries starting with the unused #0, with
GDT[5000] mapping that same memory, and you are all set.

(I think it makes more sense to just set the IDT somewhere else,
myself, but it can be done this way.)
 

Gordon Burditt

BUT, there's an odd halfway-in-between case for the all-bits-zero
Take a look at the following output from "debug.exe":
[snip]

My bet is that MS-DOS is not running in protected mode. "selectors" only
exist in protected mode. The context of this thread is 16-bit or 32-bit
protected mode, where loading an invalid selector into a segment register
will cause a trap.
If the processor didn't allow access to the null pointer, then I wouldn't
have been able to retrieve the address of the interrupt vector, let alone
call it. If I wanted to, I could set the Zero-divide interrupt vector
to whatever I wanted, even something that could potentially reset the
computer.

Selectors only exist in protected mode, and that's not what MS-DOS is running.
The only reason the processor would prevent the address from being touched
is if it is running in protected mode. In this case, the interrupt table
is moved elsewhere in memory, and applications running under the
protected-mode kernel are given their own private address space. However,
it can still be possible to crash the system if the kernel allows it (e.g.
running debug under the default memory setting under Windows 95.)

Gordon L. Burditt
 

James Stevenson

Take a look at the following output from "debug.exe":
[snip]

My bet is that MS-DOS is not running in protected mode. "selectors" only
exist in protected mode. The context of this thread is 16-bit or 32-bit
protected mode, where loading an invalid selector into a segment register
will cause a trap.

Ok explain how memory is accessed in MS-DOS at an address greater than
0xFFFF since it is running in 16-Bit real mode 0xFFFF is the highest
memory address you can possibly have. But yet it was possible to use
the whole < 640Kb (subtract tsr's of course) region using far pointers and
memory selectors.
 

Gordon Burditt

Take a look at the following output from "debug.exe":
-d 0000:0000
[snip]

My bet is that MS-DOS is not running in protected mode. "selectors" only
exist in protected mode. The context of this thread is 16-bit or 32-bit
protected mode, where loading an invalid selector into a segment register
will cause a trap.

Ok explain how memory is accessed in MS-DOS at an address greater than
0xFFFF since it is running in 16-Bit real mode 0xFFFF is the highest
memory address you can possibly have.

Incorrect. You can address 1 megabyte in real mode (using "far"
pointers, which in some memory models are the default for C). Ever
wonder why the addresses for the BIOS and hardware devices start
at 1 megabyte and reach down to 640k?
But yet it was possible to use
the whole < 640Kb (subtract tsr's of course) region using far pointers and
memory selectors.

NO, you're not using memory selectors. You're using the segment
register to refer to the base of a segment in terms of multiples
of 16-byte "pages". Selectors exist only in protected mode.

In real mode, physical address = 16*segment register + offset .
The segment register and offset are each 16 bits in 16-bit real
mode. Thus, you can address 1 megabyte. Well, ok, in some cases
1 megabyte plus almost 64k. But what you put in a segment register
in real mode is *NOT* a selector.

In real mode, 0x0000:0x0080 and 0x0008:0x0000 refer to the same
byte. In 16-bit protected mode, these addresses may be nowhere
near each other, as each refers to a different potentially 64k-long
segment.
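
The real-mode formula is easy to check in C. This is just the arithmetic from the paragraphs above; real_mode_linear() is a hypothetical helper name, and nothing here touches actual segment registers:

```c
#include <stdint.h>

/* Real mode: physical address = 16 * segment + offset.
   Both inputs are 16 bits wide, so the result needs up to 21 bits. */
uint32_t real_mode_linear(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}
```

real_mode_linear(0x0000, 0x0080) and real_mode_linear(0x0008, 0x0000) both give 0x80, and the largest possible result, real_mode_linear(0xFFFF, 0xFFFF), is 0x10FFEF, which is the "1 megabyte plus almost 64k" case.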

Some extended memory managers use protected mode but run MS-DOS in
virtual 8086 mode, where segment registers act like they are in
real mode, but "windows" in memory (typically in the 640k - 1024k
range) can be remapped to memory above 1 meg ("extended memory").
Others use some kind of hardware remapping not provided by the CPU
("expanded memory"). It is also possible to escape from 16-bit
real mode into 32-bit mode using instruction prefixes on Intel
[3456]86 processors.

In protected mode, there are 2**13 (8192) possible local table
entries and 2**13-1 (8191) possible global table entries (table
entry 0 is not used in the global table). Without changing the
table, you can address, in 16-bit protected mode, 8192*2**16 = about
0.5 gigabytes local memory, and almost the same in global memory,
using a 32-bit "far" pointer (16-bit selector, 16-bit offset). In
32-bit protected mode (16-bit selector, 32-bit offset), that's about
32 terabytes each, local and global, for a 48-bit "far" pointer.
(The processor may not be able to address that much *PHYSICAL*
memory, though).
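
The table-size arithmetic can be checked the same way. A small sketch, with segmented_space_bytes() as a hypothetical helper: it multiplies the number of table entries by the size of the largest segment each entry can describe.

```c
#include <stdint.h>

/* Bytes addressable through one descriptor table, assuming every entry
   maps a distinct segment of 2**offset_bits bytes (offset_bits < 64). */
uint64_t segmented_space_bytes(unsigned table_entries, unsigned offset_bits)
{
    return (uint64_t)table_entries << offset_bits;
}
```

With 8192 entries and 16-bit offsets this is 2**29 bytes (0.5 gigabytes); with 8192 entries and 32-bit offsets it is 2**45 bytes.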

Gordon L. Burditt
 

CBFalconer

Gordon said:
[snip]

My bet is that MS-DOS is not running in protected mode.
"selectors" only exist in protected mode. The context of this
thread is 16-bit or 32-bit protected mode, where loading an
invalid selector into a segment register will cause a trap.

No, this is c.l.c, and the presence or absence of MS-DOS,
selectors, protected mode etc. is not germane. Dereferencing NULL
is UB. Anything further is idle back room gossip about what form
that UB is likely to take.
 
