Null pointers


Joe Wright

CBFalconer said:
Mabden wrote:

... snip ...



Because I have no idea what is actually at location 0 and what it
does, and no interest in finding out. I expect reading to be
non-harmful to my system. I have great qualms about writing
thereto.

Indeed. There is no correlation (that we can know) between a C
pointer value and a physical memory address. The C program is too
far from the metal.

The likelihood that NULL will actually point to physical address 0
in memory is remote if not impossible.

We are all talking about 'hosted' environments, right?
 

Christian Bau

It's not obvious to me. (char*)0 is a null pointer constant, cast to
a char*. (char*)zero is an int variable value cast to char*. zero
is not a constant here, it does not meet the C99 definition of a null
pointer constant (as it is not an integer constant expression, nor
such an expression cast to void*), and operations on it need not obey
the null pointer constant rules in any fashion.

1. Conversion is implementation defined
2. Converting a null pointer constant produces a null pointer.

I interpret this as "The implementation defines what the result of
conversion from integer to pointer types is. However, it is not
completely free to do this in arbitrary ways: Whatever rules the
implementation chooses, it must make sure that these rules convert null
pointer constants to null pointers. "

The other IMO incorrect interpretation would be "The implementation
defines what the result of conversion from integer to pointer types is.
It is free to do so any way it likes. However, when the value converted
happens to be a null pointer constant, that definition is overridden,
and instead the result must be a null pointer."
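To make the two cases concrete, here is a minimal sketch. The function names `from_constant` and `from_variable` are made up for illustration, and `intptr_t` stands in for the plain `int` variable discussed above only to avoid a size-mismatch warning on platforms where `int` is narrower than a pointer. Only the first conversion involves a null pointer constant, so only its result is guaranteed to be a null pointer. What the second one yields is implementation-defined, although most mainstream implementations also produce a null pointer.

```c
#include <stdint.h>

/* (char *)0 converts a null pointer constant, so the result is
   guaranteed to be a null pointer. */
char *from_constant(void)
{
    return (char *)0;
}

/* Here the operand is an integer object that merely holds 0. It is not
   an integer constant expression, so it is not a null pointer constant
   and the result of the conversion is implementation-defined. */
char *from_variable(void)
{
    intptr_t zero = 0;
    return (char *)zero;
}
```

Portable code can rely on `from_constant()` comparing equal to `NULL`, but should not rely on the same for `from_variable()`.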
 

Keith Thompson

Christian Bau said:
1. Conversion is implementation defined
2. Converting a null pointer constant produces a null pointer.

I interpret this as "The implementation defines what the result of
conversion from integer to pointer types is. However, it is not
completely free to do this in arbitrary ways: Whatever rules the
implementation chooses, it must make sure that these rules convert null
pointer constants to null pointers. "

The other IMO incorrect interpretation would be "The implementation
defines what the result of conversion from integer to pointer types is.
It is free to do so any way it likes. However, when the value converted
happens to be a null pointer constant, that definition is overridden,
and instead the result must be a null pointer."

That's a good description. I agree with it completely -- except for
the minor detail that the first interpretation is incorrect, and the
second one is correct. 8-)}

I think I'll ask about this in comp.std.c, where the *real* language
lawyers hang out.
 

Mabden

I think I'll ask about this in comp.std.c, where the *real* language
lawyers hang out.

Holy Shit! There are worse ones than YOU!!! Well, what kind of chance does
that give the rest of us...
 

Malcolm

Richard Bos said:
You don't know that.

There is no inherent reason why it is impossible to engineer a system on
which location zero is just a normal memory location. However in practice
you always need some locations used for special purposes, and zero is always
chosen as one of those special locations.
 

Keith Thompson

Malcolm said:
There is no inherent reason why it is impossible to engineer a system on
which location zero is just a normal memory location.

Certainly.

However in practice
you always need some locations used for special purposes,

At least one (the null pointer value, whatever it is).

and zero is always
chosen as one of those special locations.

Always? I'm skeptical. Are you familiar enough with all existing
systems to be sure of that?

Several examples have been given on this thread of architectures where
addresses are represented as signed integers. On such a system, address
0 is right in the middle of the address space. (Admittedly, that
doesn't necessarily imply that it's not reserved, but I wouldn't bet
either way.)

If address all-bits-zero were always reserved on all systems, the C
standard might as well have mandated that a null pointer is always
represented as all-bits-zero. As we all know, it didn't.

But even if the all-bits-zero address happens to be reserved on all
past, present, and future systems, there's no reason for anyone
writing portable code to care. There is nothing you can portably do
with *any* specific address other than the null pointer (which of
course can vary from one system to another).
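For anyone curious what their own implementation does, the following is a small sketch that inspects the bytes of a null `void *`. The function name `null_is_all_bits_zero` is hypothetical, and the check only covers the representation this implementation happens to produce for `NULL` converted to `void *`. It is a curiosity, not something portable code should depend on.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if this implementation stores a null void * as
   all-bits-zero, 0 otherwise. */
int null_is_all_bits_zero(void)
{
    void *p = NULL;
    unsigned char zeros[sizeof p];

    memset(zeros, 0, sizeof zeros);
    /* Compare the object representation of p with a run of zero bytes. */
    return memcmp(&p, zeros, sizeof p) == 0;
}
```

This is also why `memset` or `calloc` should not be relied on to produce null pointers inside a structure; assigning `NULL` explicitly is the portable way.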
 

Michael Wojcik

1. Conversion is implementation defined
2. Converting a null pointer constant produces a null pointer.

Agreed.

I interpret this as "The implementation defines what the result of
conversion from integer to pointer types is. However, it is not
completely free to do this in arbitrary ways: Whatever rules the
implementation chooses, it must make sure that these rules convert null
pointer constants to null pointers. "

A plausible interpretation; however, "zero" in your example is not a
constant; hence it cannot be a null pointer constant (and indeed does
not meet the C99 definition of a null pointer constant); hence this
interpretation is irrelevant to the question at hand.

The null pointer constant rules need only apply to null pointer
constants, which are a special case of constants. And the standard
says nothing about converting non-constant integer expressions that
happen to have value 0.
 

Malcolm

Keith Thompson said:
At least one (the null pointer value, whatever it is).


Always? I'm skeptical. Are you familiar enough with all existing
systems to be sure of that?

Several examples have been given on this thread of architectures where
addresses are represented as signed integers. On such a system, address
0 is right in the middle of the address space. (Admittedly, that
doesn't necessarily imply that it's not reserved, but I wouldn't bet
either way.)

You'd have to know absolutely every platform out there to be sure, and if
there is an exception it is probably a signed address system. I would still
be surprised if location zero is not chosen for some special use, because it
is perfectly natural to map code to above zero and data to below, or put
fast RAM below and slow RAM above. The only exception would be if you want
to allow huge objects taking up more than half your address space, in which
case the only way is to treat zero as an ordinary memory location.

If address all-bits-zero were always reserved on all systems, the C
standard might as well have mandated that a null pointer is always
represented as all-bits-zero. As we all know, it didn't.

The special feature might be that a value of all bits zero automatically
traps when loaded into an address register, so obviously this would be a
poor choice of value for the null pointer. Alternatively execution may
always start at 0, which might mean that a function pointer set to main
would be all bits zero.
 

Michael Wojcik

There is no inherent reason why it is impossible to engineer a system on
which location zero is just a normal memory location. However in practice
you always need some locations used for special purposes, and zero is always
chosen as one of those special locations.

Always? There has never been and will never be a system where address
0 isn't used for some special purpose? There are no embedded systems
which would permit the designer to put ordinary RAM at location 0?

I think you'll find you're wrong about that. A little Google searching
turned up a couple of memory maps with RAM banks starting at address 0.
 

Emmanuel Delahaye

Michael Wojcik wrote on 08/08/04 :
A little Google searching
turned up a couple of memory maps with RAM banks starting at address 0.

The x86 Intel architecture is a good example. The boot address is at
FFFF0 (FFFF:0000) and the RAM starts at 0 (0000:0000).

But the memory starting at 0 holds the interrupt vectors. At boot time it
is just ordinary data, but that area is reserved by Intel for the
interrupt vector table.

It's different on a 68k where the address 0 holds the boot vector. Must
be a *ROM, not a RAM.
 

Richard Tobin

Christian Bau said:
1. Conversion is implementation defined
2. Converting a null pointer constant produces a null pointer.

I interpret this as "The implementation defines what the result of
conversion from integer to pointer types is. However, it is not
completely free to do this in arbitrary ways: Whatever rules the
implementation chooses, it must make sure that these rules convert null
pointer constants to null pointers. "

Your argument relies on the assumption that the implementation-defined
conversion must return the same result for a null pointer constant as
for non-constant zeros. Can you show that from the standard?

After all, if the standard's own definition of the conversion of
integers to pointers can make use of this distinction, why can't
the implementation's?

-- Richard
 

CBFalconer

Emmanuel said:
.... snip ...

It's different on a 68k where the address 0 holds the boot vector.
Must be a *ROM, not a RAM.

That does not necessarily follow. An 8080 starts execution at 0,
but that address also normally holds an interrupt vector. My
solution was to have the power on reset set a circuit which jammed
the high order address byte (from a dipswitch) onto the bus for
the first 3 cycles. Now I could set an initializing ROM anywhere
on 256 byte intervals, as long as the first instruction was a jmp
absolute.
 

Richard Bos

Christian Bau said:
Quite possible in C90, but most definitely not in C99. In C90, the
wording was such that in an assignment, or within an equality operator,
and probably some cases that I forgot, a null pointer constant was
replaced with a null pointer. (char*)0 was _not_ one of these cases and
in C90 not guaranteed to be a null pointer; in C99 they added that
_every_ conversion of a null pointer constant to a pointer produces a
null pointer.

Ok, but whatever makes you think that (char *)zero involves a null
pointer constant? A null pointer constant is not an integer expression,
it is an integer _constant_ expression.

Richard
 

Keith Thompson

Malcolm said:
The special feature might be that a value of all bits zero automatically
traps when loaded into an address register, so obviously this would be a
poor choice of value for the null pointer. Alternatively execution may
always start at 0, which might mean that a function pointer set to main
would be all bits zero.

Depending on the details of the architecture, trapping on loading
all-bits-zero into an address register might be a good argument in
favor of using all-bits-zero for the null pointer, assuming that
legitimate operations on a null pointer (such as assignment and
equality comparison) can be done without loading it into an address
register.

But you're right; even if all-bits-zero were always reserved on all
systems, that wouldn't necessarily mean it's always a good idea to use
all-bits-zero as the null pointer.

But I still maintain that we can't assume (and, more importantly,
shouldn't care) that all-bits-zero is always reserved on all systems.
Even if it is, there's nothing useful we can do with that information.
 

Malcolm

Michael Wojcik said:
Always? There has never been and will never be a system where address
0 isn't used for some special purpose? There are no embedded systems
which would permit the designer to put ordinary RAM at location 0?

I think you'll find you're wrong about that. A little Google searching
turned up a couple of memory maps with RAM banks starting at address 0.

The "system" is the hardware plus the C compiler. If the hardware doesn't
use all bits zero for something special the C compiler probably will, often
for the null pointer but maybe just as a control block for the beginning of
the heap or something similar.
 

Keith Thompson

Malcolm said:
The "system" is the hardware plus the C compiler. If the hardware doesn't
use all bits zero for something special the C compiler probably will, often
for the null pointer but maybe just as a control block for the beginning of
the heap or something similar.

If you're arguing that the all-bits-zero address is *usually* reserved
for something special (whether for the null pointer or for something
else), I completely agree.

If you're arguing, as I think you have been, that it's *always*
reserved, I suspect you're mistaken, but it doesn't really matter one
way or the other. It's not always reserved for the same thing on
different systems, so the assumption that it's always reserved doesn't
lead to any useful conclusions.

If you're arguing that it *should* always be reserved, I strongly
disagree. There's no point in imposing this kind of restriction.
 

Michael Wojcik

The "system" is the hardware plus the C compiler. If the hardware doesn't
use all bits zero for something special the C compiler probably will, often
for the null pointer but maybe just as a control block for the beginning of
the heap or something similar.

You claimed address 0 was *always* reserved. This "probably" is just
weaseling out.

And yes, I think this *is* important here. Some people like to make
pronouncements on c.l.c of the form "X isn't guaranteed by the
standard, but in practice it's always true". That sort of thing
leads to bad practice. I've known plenty of C programmers who think
that in practice the alpha characters are contiguous (in a single
case), but anyone who's had to port C to an EBCDIC machine, as I
have, knows that's not the case, and that there's plenty of code
which breaks on EBCDIC systems because of such assumptions.

It is rarely a good idea to declare that something is always true in
C, if it's not specified that way by the standard. And in a case
like this it wouldn't be useful anyway if it were true. So why make
that claim? It's more useful to be precise and say, "look, don't
count on it; write your code correctly".
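As an illustration of writing it correctly, here is one possible sketch of a lowercase-letter test that does not assume the letters have contiguous codes. The name `is_basic_lower` is hypothetical, and the function deliberately covers only the 26 unaccented letters. It expects a value that fits in a `char`.

```c
#include <string.h>

/* Returns 1 if ch is one of the 26 unaccented lowercase letters.
   Searching a string literal avoids assuming that 'a'..'z' occupy
   consecutive code values, which is false in EBCDIC. */
int is_basic_lower(int ch)
{
    static const char letters[] = "abcdefghijklmnopqrstuvwxyz";

    /* strchr also matches the terminating '\0', so exclude it. */
    return ch != '\0' && strchr(letters, ch) != NULL;
}
```

A range test such as `ch >= 'a' && ch <= 'z'` accepts some non-letters on EBCDIC systems, because the letters there are split into several runs with other codes in between.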
 

Malcolm

Michael Wojcik said:
You claimed address 0 was *always* reserved. This "probably" is just
weaseling out.

No one has posted a counter example, which strongly suggests that there
isn't one.

And yes, I think this *is* important here. Some people like to make
pronouncements on c.l.c of the form "X isn't guaranteed by the
standard, but in practice it's always true". That sort of thing
leads to bad practice.

But it also helps understanding. If a newbie thinks that something is likely
which he will never in fact encounter, that causes confusion. Humans are
not natural lawyers who can learn a rule divorced from practical examples.

I've known plenty of C programmers who think
that in practice the alpha characters are contiguous (in a single
case), but anyone who's had to port C to an EBCDIC machine, as I
have, knows that's not the case, and that there's plenty of code
which breaks on EBCDIC systems because of such assumptions.

A lot of code is like that. For instance IFF files have 4-byte ASCII
identifiers. It is natural to write if(!strcmp(tag, "HEAD")), but of course
this will break on non-ASCII machines. It is rarely much of a problem since
in most environments you can assume ASCII. If you put the ASCII codes in, the
tag would become unreadable.
 

Keith Thompson

Malcolm said:
No one has posted a counter example, which strongly suggests that there
isn't one.

Maybe, but so what? There is nothing to be gained by assuming that
address all-bits-zero is always reserved.

I'd be willing to bet (but not much) that no real-world system will
have a C object at address all-bits-one, but that's no more or less
useful.

There can be some benefit, I suppose, in knowing that the
all-bits-zero address is *probably* reserved for something, even on a
system where a null pointer has some other representation. If you
examine a pointer variable in a debugger (or display it with printf's
"%p" format, depending on how that works), you see that its value is
all-bits-zero, and you happen to know that a null pointer isn't
all-bits-zero on the current system, then it's likely that something
has gone wrong. But if you're doing that kind of low-level
examination of pointer representations, you really should know
something about the system you're working on without reference to any
hypothetical universal principle about all-bits-zero pointers. That
kind of system-specific knowledge is very different from your original
claim, that the all-bits-zero address is *always* reserved (for
something or other) on all systems.
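For that kind of low-level examination, a rough sketch of dumping a pointer's object representation might look like the following. The helper name `ptr_repr_hex` is hypothetical, and the two-digits-per-byte layout assumes 8-bit bytes. Unlike "%p", whose output format is implementation-defined, this shows the stored bytes in memory order.

```c
#include <stdio.h>
#include <stddef.h>

/* Writes the bytes of p's object representation into out as hex digits,
   in memory order. out must have room for 2 * sizeof(void *) + 1 chars.
   Assumes 8-bit bytes, so each byte prints as exactly two digits. */
void ptr_repr_hex(void *p, char *out)
{
    const unsigned char *bytes = (const unsigned char *)&p;
    size_t i;

    for (i = 0; i < sizeof p; i++)
        sprintf(out + 2 * i, "%02x", (unsigned)bytes[i]);
    out[2 * sizeof p] = '\0';
}
```

Calling it with `NULL` and then with a pointer that has been zeroed by `memset` would show whether the two representations differ on a given system.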
 

Flash Gordon

No one has posted a counter example, which strongly suggests that
there isn't one.

I've used embedded systems where it was ordinary RAM from location 0.
These were embedded systems designed with RAM at the bottom of the
memory map and ROM at the top because that was the simplest way to do
it. In such circumstances it would definitely make sense to have NULL as
being something other than all bits zero to avoid wasting a location.

But it also helps understanding. If a newbie thinks that something is
likely which he will never in fact encounter, that causes confusion.
Humans are not natural lawyers who can learn a rule divorced from
practical examples.

To be a good programmer you have to learn to understand and work with
abstractions, which is all you have to think of NULL, the null pointer
constant and null pointers as being. They are all just ways of
representing a guaranteed invalid pointer value.

You also have to learn to accept that some things which are incorrect
will work 99.9% of the time, but the time they fail is almost certainly
going to be the worst possible time for you.

A lot of code is like that. For instance IFF files have 4-byte ASCII
identifiers. It is natural to write if(!strcmp(tag, "HEAD")), but of
course this will break on non-ASCII machines.

No, you write something like:
if(!strcmp(tag, IFF_HEAD_TAG))

with a #define in an appropriate header.
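One way such a header might look is sketched below. This is only an assumption about what the #define could contain: the tag is spelled out as ASCII code values, and the helper name `iff_tag_is_head` is made up. It uses memcmp rather than strcmp because a 4-byte identifier read from a file is not necessarily null-terminated.

```c
#include <string.h>

/* "HEAD" spelled out as ASCII code values, so the comparison does not
   depend on the execution character set. Hypothetical header content. */
#define IFF_HEAD_TAG "\x48\x45\x41\x44"

/* Returns 1 if the 4-byte identifier read from the file is HEAD. */
int iff_tag_is_head(const unsigned char tag[4])
{
    /* Compare exactly four bytes; the macro's trailing '\0' is ignored. */
    return memcmp(tag, IFF_HEAD_TAG, 4) == 0;
}
```

The unreadable codes are confined to a single commented line in the header, and the call sites stay readable.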
It is rarely much of a
problem since in most environments you can assume ASCII. If you put
the ASCII codes in, the tag would become unreadable.

If you look at the Chinese character sets you might find this is not
true. Also, if you look at, for example, the character sets as used by
Germans you will find that not all characters counted as letters are
caught by:
if ((ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z'))
since you have various accented letters.
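A sketch of the usual alternative is to ask the library instead of testing ranges. The function name `count_letters` is hypothetical. It relies on isalpha from <ctype.h>, which classifies characters according to the current locale.

```c
#include <ctype.h>
#include <stddef.h>

/* Counts the characters in s that the current locale classifies as
   letters. */
size_t count_letters(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; s++) {
        /* Cast to unsigned char: passing a negative char value to
           isalpha is undefined behavior. */
        if (isalpha((unsigned char)*s))
            n++;
    }
    return n;
}
```

In the default "C" locale this still recognizes only the 52 unaccented letters. A program that wants accented letters in a single-byte character set would call setlocale(LC_CTYPE, "") first, and multibyte encodings need the wide-character functions such as iswalpha instead.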
 
