Format of Pointers in Unix

Kevin Bracey · May 18, 2004

Ralmin said:
I would have expected the implicit division of 1 by 527 to always round
down to zero in practice. It appears to produce zero on most of my
implementations to hand (lcc-win32, microsoft vc++, borland bcc32), but
surprisingly not on cygwin gcc, where it produces a ptrdiff_t value of
-2086359825. Weird.

<OT> Any explanations? </OT>

Yes. Because the compiler knows that b2 and b1 must be pointing to elements
of the same array, it knows that the difference must be a multiple of 527,
so the division is exact. It exploits this knowledge to turn the division
into a (faster) modulo multiplication, as described in a paper by Granlund &
Montgomery (1994).

In this case, it is performing a modulo-2^32 multiplication by 0x83A4ACEF
(-2086359825), which is equivalent to dividing by 527, but ONLY if the
division is exact.
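
The arithmetic above can be checked with a short sketch. It assumes 32-bit two's-complement integers, and the helper names (inverse_mod_2_32, scaled_difference) are made up for illustration; this is not what gcc actually emits, only the same multiplication written out in C.

```c
#include <assert.h>
#include <stdint.h>

/* Multiplicative inverse of an odd d modulo 2^32, by Newton's iteration.
   (Hypothetical helper; a compiler would precompute this constant.) */
uint32_t inverse_mod_2_32(uint32_t d)
{
    assert(d & 1u);          /* only odd numbers have an inverse mod 2^32 */
    uint32_t x = d;          /* d*d == 1 (mod 8), so x is right to 3 bits */
    for (int i = 0; i < 4; i++)
        x *= 2u - d * x;     /* each step doubles the correct low bits */
    return x;
}

/* Byte difference times the inverse of the element size, truncated to
   32 bits. Equals byte_diff / elem_size only when the division is exact.
   The conversion back to int32_t assumes the usual two's-complement wrap. */
int32_t scaled_difference(int32_t byte_diff, uint32_t elem_size)
{
    return (int32_t)((uint32_t)byte_diff * inverse_mod_2_32(elem_size));
}
```

With these definitions, scaled_difference(1581, 527) is 3, because 1581 is an exact multiple of 527, while scaled_difference(1, 527) is -2086359825, the value Ralmin saw.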

An excellent example of a compiler optimisation taking advantage of being
allowed to produce undefined behaviour for incorrect code.

Dan Pop · May 18, 2004

An earlier poster said:
And somewhere around the time of 05/17/2004 23:02, the world stopped and
listened as August Derleth contributed the following to humanity:

On the 8086, a memory page is 16 bytes. Every segment starts and ends
on a 16-byte boundary. This is why 3186:3DE0 = 35640 and 3187:3DD0
also = 35640 (all hex). It was much easier to send the CPU out into the weeds and
crash those machines because there was *NO* memory protection enforced
by the hardware. AFAIK, there was no swap either because the paging
mechanism doesn't exist on that ancient hardware.

You don't need any paging mechanism to implement swapping. PDP-11 Unix
supported swapping, despite the lack of a paging mechanism.

Dan
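
For reference, the segment:offset arithmetic quoted above can be written as a small sketch. It reads the numbers in the post as hexadecimal and assumes the 8086's 20-bit address bus; the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Real-mode 8086 translation: linear = segment * 16 + offset,
   truncated to the 20 address lines of the 8086. */
uint32_t linear_address(uint16_t segment, uint16_t offset)
{
    return (((uint32_t)segment << 4) + offset) & 0xFFFFFu;
}

/* Prints both spellings from the post and checks that they name the
   same memory cell. */
void show_alias(void)
{
    uint32_t a = linear_address(0x3186, 0x3DE0);
    uint32_t b = linear_address(0x3187, 0x3DD0);
    assert(a == b);
    printf("3186:3DE0 -> %05lX, 3187:3DD0 -> %05lX\n",
           (unsigned long)a, (unsigned long)b);
}
```

Raising the segment by one and lowering the offset by 16 always lands on the same cell, which is why one location has many segment:offset spellings.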

Gordon Burditt · May 18, 2004


I agree with you, that's not flat.

But...

Lots of architectures with MMUs allow the kernel to map multiple addresses
in userland space to the same hardware address. Is this flat
or not flat?

I'd say it's not flat if the multiple mappings are visible to a
single userland program: e.g., sharing the C runtime library between
multiple programs doesn't make it non-flat; mmap()ing the same file
into the same program multiple times makes it non-flat. However,
this is something the program normally does to itself; the OS or
the hardware wouldn't do this on its own.
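
A minimal sketch of the mmap() case, assuming a POSIX system where tmpfile(), ftruncate() and MAP_SHARED mappings are available. The function name aliased_mapping_demo is made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Maps one temporary file twice into this process and reports whether a
   write through the first mapping is visible through the second.
   Returns 1 if it is, 0 if not, -1 if the setup failed. */
int aliased_mapping_demo(void)
{
    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;
    int fd = fileno(fp);
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || ftruncate(fd, (off_t)page) != 0) {
        fclose(fp);
        return -1;
    }
    size_t len = (size_t)page;
    char *a = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    char *b = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (a == MAP_FAILED || b == MAP_FAILED) {
        if (a != MAP_FAILED) munmap(a, len);
        if (b != MAP_FAILED) munmap(b, len);
        fclose(fp);
        return -1;
    }
    assert(a != b);            /* two distinct virtual addresses ... */
    a[0] = 'X';                /* ... that name the same underlying page */
    int seen = (b[0] == 'X');
    munmap(a, len);
    munmap(b, len);
    fclose(fp);
    return seen;
}
```

Here a and b compare unequal as pointers yet refer to the same storage, which is the kind of program-visible aliasing described above.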

Many designs without MMUs but with incomplete address decoding
will map two addresses to the same memory cell.

OK, but the 16*segment+offset business is a bit more complex than
simply ignoring some bits. And you CAN reasonably come up with
pointers to the same place with different segment/offset values by
invoking undefined behavior, i.e., by breaking the C rules about
out-of-bounds pointer arithmetic.

I suspect that "flat" vs "not flat" is sometimes a continuum rather
than a one-or-the-other choice.

I'll agree here. One not-that-obvious non-flattedness present on
many x86-based protected-mode OSs is mapping code space and data
space into the same set of memory, but this is really required to
get what is ordinarily called a flat address space.

Gordon L. Burditt

Tim Shoppa · May 19, 2004

Gordon Burditt said:
I'll agree here. One not-that-obvious non-flattedness present on
many x86-based protected-mode OSs is mapping code space and data
space into the same set of memory, but this is really required to
get what is ordinarily called a flat address space.

I think the C language specification should work fine on either a
Harvard or von Neumann architecture... I space and D space pointers
can clearly be very different animals. Given that the original
architectures with C compilers were all von Neumann maybe this is
just happy coincidence, but the more I think about it maybe it's really
a good clean well-thought-out design.

The later PDP-11s had split I/D, which didn't hurt, but none of the
C compilers at that time would complain at all if you did pointer
arithmetic between different spaces. That's not saying much, because
they let you do just about anything without issuing useful warnings,
and there was certainly no ANSI spec to say what was undefined behavior.

Tim.

James Kanze · May 20, 2004

|> On Mon, 17 May 2004, Nils O. Selåsdal wrote:

|> > > How many Unices for the 8086 and 286 have you seen?
|> > None. And I've neither touched AIX as someone else mentioned here.

|> Don't forget SCO's XENIX; that was available for the 80286, IIRC.

You mean Microsoft's XENIX, don't you? And it was available not only
for the 80286, but even for the 8086.

Gary Schmidt · May 20, 2004

James said:
|> On Mon, 17 May 2004, Nils O. Selåsdal wrote:

|> > > How many Unices for the 8086 and 286 have you seen?
|> > None. And I've neither touched AIX as someone else mentioned here.

|> Don't forget SCO's XENIX; that was available for the 80286, IIRC.

You mean Microsoft's XENIX, don't you? And it was available not only
for the 80286, but even for the 8086.

No, he means SCO Xenix.

As I recall, there was Microsoft Xenix, IBM Xenix, and SCO Xenix.
("SCO" in this context is the "Santa Cruz Operation," not the "SCO
Group" of recent infamy).

I used IBM Xenix and SCO Xenix on the 80286 way back in the dark ages of
1986.

SCO Xenix was quite a usable product, much better than the IBM variant
(which vague memory says was just a re-badge of MS Xenix).

It was strange coming from VAXen and 68000 systems to this weird
segmented stuff, but I had been used to the 64K limit (and 16-bit
addressing) on the PDP-11, so it was the lack of an overlaying linker
that left me scratching my head.

I think I've even got the 5 1/4 inch floppies somewhere...

Cheers,
Gary B-)

J. J. Farrell · May 21, 2004

Gary Schmidt said:
("SCO" in this context is the "Santa Cruz Operation," not the "SCO
Group" of recent infamy).

The SCO Group includes most of the legacy of the Santa Cruz Operation
along with the remains of Caldera. It was renamed back to the SCO
brand since the majority of its software products derive from the SCO
heritage.

And much of their code is written in C ...

Dan Pop · May 21, 2004

Gary Schmidt said:
No, he means SCO Xenix.

As I recall, there was Microsoft Xenix, IBM Xenix, and SCO Xenix.
("SCO" in this context is the "Santa Cruz Operation," not the "SCO
Group" of recent infamy).

I used IBM Xenix and SCO Xenix on the 80286 way back in the dark ages of
1986.

SCO Xenix was quite a usable product, much better than the IBM variant
(which vague memory says was just a re-badge of MS Xenix).

Didn't SCO buy Xenix from Microsoft and then continue developing it
independently?

Dan

Jerry Feldman · May 31, 2004

And, I think, Microsoft made XENIX for the 8086. (One of my older UNIX
books mentions it, in a discussion of how far the *nix idea had gone
by the mid-80s.)

I worked on a Microsoft Xenix port to a Raytheon machine back in 1981.
