Null pointers


Chris Dollin

Mabden said:
I assume that is sarcasm? I would LOVE to know what the R thinks, if you
can do that. I suspect it was a joke, and the rest of us use emoticons to
make that clear

[sudden snip]

No, we don't. At least, *this* one doesn't.

[Very, very, rarely, and not in newsgroups.]
 

Michael Wojcik

I'm trying to let it go, but you just keep baiting me...

No one (unless it's someone I've killfiled) in this thread is baiting
you. They're pointing out, in explicit and excruciating detail, how
and why you're wrong - wrong about the facts of the C language, and
wrong to advocate a change to the language.

If you feel you're being baited, that's your problem, not ours.
I know all about zero, pointers, etc. My opinion about location zero is not
due to any ignorance in programming C - although I have freely admitted I do
not own a copy of any "Standards" documents other than K&R and K&R2.

You're certainly ignorant about many C implementations - probably the
majority of implementations, which are freestanding ones for embedded
processors.

If you haven't read the standard, then you're ignorant about how the
C language is actually defined.

And while you claim you've read K&R2, you apparently don't understand
the difference between the tutorial portions and the language
specification in the appendix, nor do you show an understanding of
what's contained in the latter.
Well, then, you admit I am creative, if nothing else. Brand new? You mean
nobody else read the K&R?

I expect everyone else interpreted the passage in question either in
accordance with the standard (that is, that a zero pointer constant
never refers to a valid object), or misinterpreted it in the common
way (that is, believing that the all-bits-zero address is the same as
a null pointer). Your apparent novelty is in believing that the
authors intended to make the all-bits-zero address forbidden in all C
implementations, and the standard committee somehow neglected to
honor this desire.
K&R is known to be subtle

It's known to be wrong on various points. See Dennis Ritchie's own
list of errata, the link to which has been posted in this very thread.
and I think this is
merely an overlooked sentence that the "Standard" has ignored for their own
purposes.

It's an incorrect, or at least misleading, sentence that the committee
wisely chose to ignore, since honoring it would have broken existing
implementations and made (standard) C unusable on a wide range of
platforms. That seems like an excellent purpose.
Can you point me to a web site that discusses whether location
zero should be considered a special case?

No, because there isn't one, if Keith is correct and this is a novel
misconception on your part.
I think it should. Why do you have
to argue so fiercely against this idea?

I've yet to see anyone in this thread be particularly fierce. Even
the more aggressive c.l.c regulars are being remarkably restrained.
And, of course, they're arguing against your proposal because it's
a terrible idea - it would break existing implementations and make
standard C unusable on a wide range of platforms. (Repeat that to
yourself until you get it.)
Are you on the Standards committee?

How would this be relevant?
Am I a C heretic?

Let's see: you're incorrect about the existing definition and how
it's established; you're advocating a change to the language that
would reduce its utility for no benefit; you're complaining of
persecution... What would you call it?
Even if I am wrong-headed, isn't it still true that location zero cannot be
written to or read from on YOUR machine.

At least one poster (Jack Klein, IIRC) has recently noted that he's
worked on systems where location zero is a valid address. I believe
that was in this thread.
All my critics, please just write a simple program to read from location
zero in memory. If you can do it, post what you find there.

Try it on an older VAX VMS C implementation, and you'll find a zero
there. Because some old C programs relied on this, there are more
recent implementations which support the "zero-address hack" as an
option - an option which does NOT render them non-conforming. IBM's
C for AIX had this option (actually implemented by the linker); I
don't know if it still does, but I have a machine sitting around with
a sufficiently old version of AIX to still provide it.
I am saying it *should not be* a valid address. IN MY OPINION.
I am not saying that any machine or any compiler or any standard does this.
I am stating that this is how *it should be*.

And we're stating that your opinion is wrongheaded, since making
address all-bits-zero invalid on all platforms would be a bad idea.
Name one.

Besides the two hosted implementations I mentioned above, there are
many freestanding implementations - as Jack Klein HAS ALREADY NOTED -
which put something useful at address all-bits-zero.

Embedded processors *greatly* outnumber general-purpose processors.
Many of those embedded processors have freestanding C implementations,
and there are many C programs running on them. I haven't seen any
statistics in this regard, but it's plausible that there are more C
programs running on embedded systems than there are on general-purpose
systems, or more freestanding C implementations than hosted
implementations.

Now, I have no idea how many of those freestanding implementations
make use of address all-bits-zero, or are used on platforms where the
use of address all-bits-zero is determined by the hardware design (so
it might be useful to a C program). However, it is certain that quite
a number of them do.
I'm not sure how you can know that "each platform has a well-defined unique
pointer value that denotes no object (the null pointer)".

Keith knows that's true for all platforms with a conforming C
implementation, because all such implementations must define such a
value. In some cases the platform itself may provide a suitable
location; in others, the implementation may have to pick a location and
make sure that no C object is ever placed there.
I assume that is sarcasm? I would LOVE to know what the R thinks, if you can
do that.

Dennis Ritchie reads Usenet, including sometimes c.l.c (he's been
active in alt.folklore.computers lately); he reads and responds to
email; and he maintains a list of known errors in K&R2, *as has
already been noted in this thread*. Did you look at the errata
list that someone posted a link to?
 

CBFalconer

Mabden said:
.... snip ...

All my critics, please just write a simple program to read from
location zero in memory. If you can do it, post what you find there.

That's simple, isn't it?

Yup. May not be functional on your system.

c:\c\junk>cat junk.c
#include <stdio.h>

int main(void)
{
    long p;

    for (p = 0; p < 16; p++) {
        printf("{%p) = %x\n", (void*)p, *((char*)p));
    }
    return 0;
}

c:\c\junk>gcc junk.c

c:\c\junk>.\a
{0) = 50
{1) = ffffff8b
{2) = 45
{3) = 8
{4) = 50
{5) = ffffffe8
{6) = ffffffa6
{7) = fffffff6
{8) = ffffffff
{9) = ffffffff
{a) = ffffff8b
{b) = 45
{c) = ffffffec
{d) = ffffff83
{e) = ffffffc4
{f) = 10
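
The `ffffff8b`-style values in the output above are most likely sign extension: where plain `char` is signed, a byte such as 0x8b holds a negative value, is promoted to `int`, and `%x` then prints the full width. A minimal sketch of a formatter that avoids this by converting through `unsigned char` follows. The name `format_byte` is hypothetical, and the sketch takes an ordinary byte value rather than reading from address zero.

```c
#include <stdio.h>

/* Hypothetical helper: write the two-digit hex form of one byte into out.
   Converting through unsigned char keeps the value in the range 0..255,
   so no sign-extended ffffffxx text appears even where plain char is
   signed. */
void format_byte(char byte, char out[3])
{
    snprintf(out, 3, "%02x", (unsigned)(unsigned char)byte);
}
```

With this helper, a byte holding 0x8b is rendered as `8b` rather than `ffffff8b`.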
 

Chris Torek

I have a number of embedded-systems boards accessible to me that have
ordinary RAM at *(char *)0. Put something in there and it is in there;
read from it and you get the same data back:

void showZeros(void) {
    char *p = 0;
    int i = 0;

    *p = 2;
    printf("p = %p, *p = %d\n", (void *)p, *p);
    p = i;
    *p = 3;
    printf("p = %p, *p = %d\n", (void *)p, *p);
}

Compiled and run (from the target shell) under vxWorks on various
single-board machines, this produces:

-> showZeros()
p = 0x0, *p = 2
p = 0x0, *p = 3
->

(Both Diab and GNU use all-bits-zero for NULL, even on these
machines that have RAM at location 0.) On other single-board
machines RAM starts "above" 0; where there is ROM at 0 this
produces things like:

p = 0x0, *p = -72
p = 0x0, *p = -72

and where there is nothing at all at zero the task gets a fault
and gets suspended.

Try it on an older VAX VMS C implementation, and you'll find a zero
there. Because some old C programs relied on this, there are more
recent implementations which support the "zero-address hack" as an
option - an option which does NOT render them non-conforming. IBM's
C for AIX had this option (actually implemented by the linker); I
don't know if it still does, but I have a machine sitting around with
a sufficiently old version of AIX to still provide it.

4.1BSD on the VAX also had *(char *)0 == 0 (in fact there was a
short-word, 16 bits long, of all-zero-bits at 0). I thought most
VMS systems mapped page zero away, though.

Version 6 Unix on the PDP-11 (other than split I&D) had the magic
number for the a.out format at 0; for OMAGIC binaries (mode 0407)
we had *(char *)0 == 7 (remember that the PDP-11 was "PDP-endian",
little-endian within 16-bit words and big-endian for "long"s stored
as two 16-bit little-endian words). Since C had its main development
on the PDP-11, one might even say that *(char *)0 == 7 was the
*expected* result. :)

Most peculiar of all, I think, was some code that crept into
System III or System V Unix (not sure which). This is a paraphrase
(I have no recollection of the actual second argument to strcmp()):

if (strcmp(p, "#\307x") == 0)
...

The reason this code got in was that the 3B system on which
the programmer wrote it happened to have an odd sequence of bytes
at *(char *)0, and he wrote that strcmp() call instead of the
correct test:

if (p == NULL)

In other words, whoever wrote this Unix utility, whichever utility
it was (cpio?), thought that *(char *)0 was supposed to contain a
weird string! (I believe I heard this story from Doug Gwyn, who
might remember which utility it was and what the code was. It may
have been Guy Harris. Whoever it was, found the problem by porting
the particular utility and discovering that it did not work on the
new machine.)
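
For contrast with the strcmp() story above, here is a minimal sketch of the correct kind of test. The helper name `name_or_default` and its fallback parameter are hypothetical and are not taken from the utility in question.

```c
#include <stddef.h>

/* Hypothetical helper: return p, or fallback when p is a null pointer.
   The test compares the pointer value itself and never reads through p,
   so it does not depend on whatever bytes a given machine happens to
   keep at address zero. */
const char *name_or_default(const char *p, const char *fallback)
{
    if (p == NULL)
        return fallback;
    return p;
}
```

Because nothing is read through `p` before the comparison, this behaves the same way on every conforming implementation.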

The history and "current state of the world" is clear enough, and
mabden@sbc_global.net is simply wrong. The C Standards are a
little tougher to read and interpret, but overall, the facts are:

- The NULL macro, and the null pointer constants, are source
code constructs.

- A compiler's job is to convert source code constructs to
suitable machine code. This allows the compiler to change
"what you see in the source" to "what you will see if you
disassemble the machine code". That is, there is some
mapping between "external" representation -- what you type
into a C program, or see when you printf() -- and "internal"
representation, as used by machine-level code.

- The Standards allow the machine code to use any particular bit
pattern(s) the implementor chooses to represent various null
pointers internally, *provided* that no valid C object's address
(nor function pointer) compares equal to any such null pointer.

- Most implementors use all-bits-zero for internal null pointers.
It is usually the easiest thing to do, and most people usually
do the easiest thing. But other bit patterns are allowed, and
some machines have particularly complicated pointers (e.g.,
IBM AS/400) so that all-bits-zero is not easiest after all.

- Many machines without virtual memory, and even some with, have
ROM or RAM at physical address 0.

- C implementations that use all-bits-zero for all their internal
null pointers *and* that have useable RAM at address 0 must
make sure not to put any C object or function at address zero,
which is typically easily achieved by putting some "non-C"
thing there, such as startup code or a "shim".

- Many implementations with virtual memory simply map out address
0 so that improper attempts to access it are caught right away.
This is a good thing, but is not required by the C Standards.

- C implementations that use something other than "hardware
address 0" for their internal null pointers *and* that have
useful stuff at "hardware address zero" are not actually
obligated to let you get at the useful stuff -- nobody ever
said C *has* to be useful for systems programmers -- but will
likely have some trick(s) you can use to do that, because
most systems programmers *like* their C systems to be useful
to them. (After all, why buy a C compiler if you cannot *use*
it?)
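
The distinction drawn in the list above, between the source-level null pointer constant and the internal bit pattern, can be probed on a given implementation with a sketch like the following. The function name `null_is_all_bits_zero` is hypothetical. It only inspects the stored bytes of a null `char *` and dereferences nothing.

```c
#include <string.h>

/* Returns 1 if this implementation stores a null char pointer as
   all-bits-zero, and 0 if it uses some other bit pattern. Either
   answer is allowed by the C Standards. */
int null_is_all_bits_zero(void)
{
    char *null_pointer = 0;   /* null pointer constant converted to a null pointer */
    unsigned char zeros[sizeof null_pointer];

    memset(zeros, 0, sizeof zeros);
    return memcmp(&null_pointer, zeros, sizeof null_pointer) == 0;
}
```

Most implementations will presumably return 1 here, but portable code should not depend on that.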
 

Keith Thompson

Mabden said:
I'm trying to let it go, but you just keep baiting me...

You persist in making inaccurate statements. Nobody is baiting you;
we're only interested in accuracy.

BTW, Michael Wojcik and Chris Torek have answered several of your
questions; I've read their responses and I believe they're entirely
accurate.
I just haven't stated my point correctly using lawyer-like, perfect
semantics.

Now Keith, I was enjoying the fact that you didn't start assuming
ignorance on my part because you don't agree with my opinion, but
you are skating really close to the line here.

I know all about zero, pointers, etc. My opinion about location zero
is not due to any ignorance in programming C - although I have
freely admitted I do not own a copy of any "Standards" documents
other than K&R and K&R2.

You could if you wanted to. Visit
<http://www.open-std.org/jtc1/sc22/wg14/www/docs/n869/>
and you can get a copy of the latest public draft of the C99 standard
in PDF, Postscript, or plain text. The actual standard differs in
some areas, but I don't believe there are any differences relevant to
the current discussion.

Or you could pay $18 or so for a copy of the actual standard, but
that's not necessary for the current discussion.

In the above quoted paragraph where I wrote that "it's easy to assume
that you share the more common misconception and aren't expressing it
very well", I was not accusing you of expressing yourself unclearly.
I was trying to explain why, due to (what I see as) the unusual nature
of your misconception, a lot of us might have assumed that you're
expressing yourself unclearly.

My strong opinion is that, on this particular point, I'm right and
you're wrong, and that my opinion is based on better sources of
information than yours. I'm not implying that you're a generally
ignorant person, only that you happen to be ignorant on this
particular point. It's nothing personal. Let's both continue working
to keep it that way.

[...]
Even if I am wrong-headed, isn't it still true that location zero cannot be
written to or read from on YOUR machine. I assume you write C programs, so
just do it and tell me I'm wrong!

All my critics, please just write a simple program to read from location
zero in memory. If you can do it, post what you find there.

That's simple, isn't it?

Others have done so in this thread. I'm unable to do so myself,
because all the C implementations to which I currently have access
happen to use all-bits-zero to represent null pointers, and happen to
trap on attempts to dereference a null pointer. (I don't work with
embedded systems other than as an end user.)

Note that C doesn't require a trap for *any* pointer dereference,
whether it's null, all-bits-zero, deallocated, uninitialized, or
random. Most such dereferences invoke undefined behavior. One of the
infinitely many possible results of undefined behavior is to quietly
fetch a valid value from the specified location.

(BTW, there's a point that I've ignored so far. A null pointer
doesn't necessarily have the same representation for all pointer
types. Not all pointer types even have to have the same
representation; for example, function pointers might be bigger than
data pointers. This isn't critical to the current discussion, and we
can safely restrict the discussion to systems where all data pointers
look alike and have a unique representation for the null pointer.
This restriction still lets us talk about systems where the
representation of a null pointer is not all-bits-zero.)
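
One practical consequence of the point above is that pointers are best set to null by value rather than by zeroing their bytes. The following is a minimal sketch. The `struct handler` type and the `empty_handler` function are hypothetical names used only for illustration.

```c
#include <stddef.h>

/* Hypothetical record type holding one data pointer and one function
   pointer; the two may differ in size and in representation. */
struct handler {
    char *name;
    void (*callback)(void);
};

/* Returns a handler whose members are all null pointers. The "= {0}"
   initializer works in terms of values, so each pointer member becomes
   a null pointer of its own type, whatever bit pattern that takes.
   A memset(&h, 0, sizeof h) would only guarantee all-bits-zero. */
struct handler empty_handler(void)
{
    struct handler h = {0};
    return h;
}
```

On implementations where null pointers are all-bits-zero the two approaches give the same result; the initializer is the one the language guarantees.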
Christ, you really are a semantics nitpicker. Yes, not "necessarily" at
zero. Do you want this thread to end or are YOU the troll?

We're talking about a fairly subtle point, so precision is
extraordinarily important. We all make mistakes; whining when someone
corrects them is not constructive.

[...]
I am saying it *should not be* a valid address. IN MY OPINION. I am
not saying that any machine or any compiler or any standard does
this. I am stating that this is how *it should be*.

A serious question: if that sentence on page 102 of K&R2 weren't
there, would you still hold this opinion? What if the errata page
were updated to clarify that there's nothing special about an
all-bits-zero address (unless that happens to be the representation of
a null pointer)?

I have a lot of opinions about the way C *should be* (or, more
accurately, about the way it should have been). Many of them directly
contradict the actual language as defined by the standard and
described by various books including K&R2. I try to make it very
clear that these are just my opinions, and they're not about what the
language *is*, merely about what I might like it to be in an ideal
world.

I think you've been saying one (or both) of two things:

(1) The C language actually requires an attempt to read or write to the
all-bits-zero address to fail; or

(2) In your opinion, the C language *should* require an attempt to read
or write to the all-bits-zero address to fail.

I've been unsure about which of these things you're actually saying.
(That may very well be my fault.) Can you clarify, even if you think
you already have?

If your point is (1), my response is that you are mistaken, and that
this is easily documented. If your point is (2), I suppose I can't
say that you're mistaken, but I strongly believe that such a
requirement would be a bad idea; it would add confusion to an already
confusing area of the language with no real benefit that I can think
of. Systems that should trap on references to the all-bits-zero
address already do; systems that don't, probably shouldn't.
Name one.

I can't, but others have provided concrete examples.

[...]
I'm not sure how you can know that "each platform has a well-defined
unique pointer value that denotes no object (the null
pointer)". What is that value on the Palm OS? What did NeXT use?

On Palm OS, that value is the null pointer. On NeXT, that value is
the null pointer. I don't know how that value is represented, but
it's likely that it's all-bits-zero in both cases.

I should have said "each C implementation" rather than "each
platform". I know that each C implementation has a null pointer value
because the standard requires it.

(Again, I'm glossing over the possibility of different null pointer
values for different pointer types. Strictly speaking, there's a
distinct null pointer value for each pointer type in each C
implementation. Realistically, it's likely to have the same
representation for void* and all object pointer types for almost all
implementations, even those where a null pointer is something other
than all-bits-zero, but I wouldn't be astonished if there were
counterexamples.)
I assume that is sarcasm? I would LOVE to know what the R thinks, if
you can do that. I suspect it was a joke, and the rest of us use
emoticons to make that clear, so a ;-) would have been appropriate.

No, I'm completely serious. Why do you assume that I'm being
sarcastic?
 

Malcolm

Old Wolf said:
What has that got to do with the location zero?
Read all of a post before hitting the reply button. The answer is of course
nothing, according to a literal interpretation of the modern standard,
though historically 0 represents the null pointer because it was always
absolute location zero.
 

Christian Bau

Keith Thompson said:
I'm going to use the notation $12345678$ to refer to a pointer whose
internal representation is the same as the integer 0x12345678.

Even in an implementation with the null pointer represented as
$12345678$, integer to pointer conversions could still leave the bits
unchanged, except in the case of converting a null pointer constant to
a pointer type. Remember that a null pointer constant is a source
construct, and the conversion of a null pointer constant to a pointer
value takes place during compilation.

No, it does not, at least not in C99.

For example, in the definition of "equality operator" it says

Constraints: (three other possibilities and)
One operand is a pointer and the other is a null pointer constant.

Semantics: (two other possibilities and)
Otherwise, at least one operand is a pointer. If one operand is a
null pointer constant, it is converted to the type of the other operand.

Conversion is done according to exactly the same rules as all other
conversions; since a null pointer constant is either of an integral type
or of type void*, that conversion is either a conversion from an
integral type to a pointer type or from void* to another pointer type.
But there is no difference between this conversion and any other
conversion. It is conceptually not a compile time construct. (Of course
an optimiser will determine the result of the conversion at compile
time, just as 2+3 will be determined at compile time by practically
every compiler).
There's no requirement to
duplicate that conversion at run time.

So we could have:

(char*)0 --> $12345678$ (null pointer)
int zero = 0;
(char*)zero --> $00000000$ (non-null pointer)
(char*)0x12345678 --> $12345678$ (happens to be a null pointer)

Quite possible in C90, but most definitely not in C99. In C90, the
wording was such that in an assignment, or within an equality operator,
and probably some cases that I forgot, a null pointer constant was
replaced with a null pointer. (char*)0 was _not_ one of these cases and
in C90 not guaranteed to be a null pointer; in C99 they added that
_every_ conversion of a null pointer constant to a pointer produces a
null pointer.
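
The equality-operator rule quoted above means that the usual spellings of a null test all compare against a null pointer of the operand's own type. A minimal sketch, assuming the C99 wording (under which `(char *)0` is also guaranteed to be a null pointer), follows. The function name `null_tests_agree` is hypothetical.

```c
#include <stddef.h>

/* Returns 1 if the usual spellings of a null test agree for p. In each
   comparison the null pointer constant is converted to the type of p
   before the comparison is made. */
int null_tests_agree(const char *p)
{
    int a = (p == 0);           /* integer constant expression 0 */
    int b = (p == NULL);        /* the NULL macro */
    int c = (p == (char *)0);   /* constant converted by an explicit cast */
    int d = !p;                 /* !p is defined as (0 == p) */

    return a == b && b == c && c == d;
}
```

None of these forms looks at the bit pattern of `p` directly, so they agree even where a null pointer is not all-bits-zero.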
 

Mabden

Keith Thompson said:
My strong opinion is that, on this particular point, I'm right and
you're wrong, and that my opinion is based on better sources of
information than yours.

Well then I guess I had better change my opinion on this matter then. Thank
you for the enlightenment. I will leave this thread a wiser mabden.
Others have done so in this thread. I'm unable to do so myself,
because all the C implementations to which I currently have access
happen to use all-bits-zero to represent null pointers, and happen to
trap on attempts to dereference a null pointer.

Aha, gotcha! Just kidding.

(BTW, there's a point that I've ignored so far. A null pointer
doesn't necessarily have the same representation for all pointer
types. Not all pointer types even have to have the same
representation; for example, function pointers might be bigger than
data pointers. This isn't critical to the current discussion, and we
can safely restrict the discussion to systems where all data pointers
look alike and have a unique representation for the null pointer.
This restriction still lets us talk about systems where the
representation of a null pointer is not all-bits-zero.)

This is a good point, tho, and really caused me to change my mind about the
zero location issue. I realized I had no answer for the question, "How BIG
should a zero location null pointer be, exactly?" Especially considering
moving into C++ objects; how would that be backwards-compatible? I actually
wish you HAD brought this up earlier.
[...]
I am saying it *should not be* a valid address. IN MY OPINION. I am
not saying that any machine or any compiler or any standard does
this. I am stating that this is how *it should be*.

A serious question: if that sentence on page 102 of K&R2 weren't
there, would you still hold this opinion? What if the errata page
were updated to clarify that there's nothing special about an
all-bits-zero address (unless that happens to be the representation of
a null pointer)?

No. My opinion is from that statement alone. I see^h^haw it as a Good Thing,
and believed it to be true. In fact, in my experience it HAS been true. I
don't work on embedded devices (other than Palms, if they are considered
such) or VAXen so my world view encompassed this notion fully.
I have a lot of opinions about the way C *should be* (or, more
accurately, about the way it should have been). Many of them directly
contradict the actual language as defined by the standard and
described by various books including K&R2. I try to make it very
clear that these are just my opinions, and they're not about what the
language *is*, merely about what I might like it to be in an ideal
world.

I think you've been saying one (or both) of two things:

(1) The C language actually requires an attempt to read or write to the
all-bits-zero address to fail; or

Yes, I believed this, but modified my belief to the next one when I was told
I was mistaken and it does not. I thought it was a Rule.
(2) In your opinion, the C language *should* require an attempt to read
or write to the all-bits-zero address to fail.

As a general "fail-safe", no matter what platform, no matter what compiler,
lifeline. An assurance that I could KNOW where a null pointer is, since as
we've discussed (void *)0 can be anywhere the compiler decides. I see the
light now, and repent my ways. No Real Programmer needs such a construct.
If your point is (1), my response is that you are mistaken, and that
this is easily documented. If your point is (2), I suppose I can't
say that you're mistaken, but I strongly believe that such a
requirement would be a bad idea; it would add confusion to an already
confusing area of the language with no real benefit that I can think
of. Systems that should trap on references to the all-bits-zero
address already do; systems that don't, probably shouldn't.
Gotcha.


I can't, but others have provided concrete examples.

More like sandstone.
Some have read from the location, which may be system code. I didn't see
anyone write a value and read it back out, but I may have missed something.
Chris did this:
void showZeros(void) {
    char *p = 0;
    int i = 0;

    *p = 2;
    printf("p = %p, *p = %d\n", (void *)p, *p);
    p = i;
    *p = 3;
    printf("p = %p, *p = %d\n", (void *)p, *p);
}
But that's just a 0 in memory, isn't it? Especially p=i; Anyway it's run on
an emulator, so how valid is that?

CBFalconer just reads. I've never said that zero location can't be system
code, altho I desired it to be unreadable in C (before my conversion). But
my main point was for zero to be unwritable (which is wrong, which is wrong,
which is wrong!).
No, I'm completely serious. Why do you assume that I'm being
sarcastic?

I guess it's like talking to God or something. Unbelievable, that mere
mortals could engage The Writer of the Holy K&R.

Well, thanks again for your patient explanations. I have been converted and
I will try not to spout nonsense about Location Zero again.

Of course, other nonsense may slip through my fingers from time to time...
;)
 

Keith Thompson

Mabden said:
Well, thanks again for your patient explanations. I have been converted and
I will try not to spout nonsense about Location Zero again.

But, but ... you mean you've changed your mind, and admitted you were
wrong? This is Usenet; we don't do that here! Arguments are supposed
to continue without resolution until everybody has killfiled everybody
else and we're all left with nothing but simmering contempt.

Seriously, I'm glad to hear it. It's nice to see that sticky
questions can actually be resolved sometimes.
Of course, other nonsense may slip through my fingers from time to time...
;)

And mine.
 

Chris Torek

Chris did this:
void showZeros(void) {
    char *p = 0;
    int i = 0;

    *p = 2;
    printf("p = %p, *p = %d\n", (void *)p, *p);
    p = i;
    *p = 3;
    printf("p = %p, *p = %d\n", (void *)p, *p);
}
But that's just a 0 in memory, isn't it?

What do you mean by "a 0 in memory"? These machines (PowerPC, MIPS32,
ARM32, Pentium, SPARC32) all have 32-bit-integer 32-bit-pointer flat
address space architectures (as configured anyway; some are capable of
other arrangements). Some of them have RAM at hardware address zero,
some have ROM, some have nothing at all.
Especially p=i; Anyway it's run on an emulator,

What makes you think it is "run on an emulator"? While we (Wind
River) do have two "vxsim" systems, one for Linux and one for SPARC,
we also supply code for MIPS, PowerPC, ARM, and other machines.
I have a PowerPC-based single-board computer here (the "wrsbc8260"),
and, via the "virtual lab manager", access to various other machines.
(I happen to *prefer* using vxsim on the Linux box, as it is fast
and convenient. But we have to test everything on everything, as
it were -- Diab and gcc, on simulators and real hardware, compiled
with each of the various supported "bsp"s as they are called.)
 

RCollins

Christian said:
No, it does not, at least not in C99.

For example, in the definition of "equality operator" it says

Constraints: (three other possibilities and)
One operand is a pointer and the other is a null pointer constant.

Semantics: (two other possibilities and)
Otherwise, at least one operand is a pointer. If one operand is a
null pointer constant, it is converted to the type of the other operand.

Conversion is done according to exactly the same rules as all other
conversions; since a null pointer constant is either of an integral type
or of type void*, that conversion is either a conversion from an
integral type to a pointer type or from void* to another pointer type.
But there is no difference between this conversion and any other
conversion. It is conceptually not a compile time construct. (Of course

OK, you lost me here. When talking about the null pointer constant (or
any constant for that matter), doesn't that mean that they are available
(as constants) at compile time? Does it make sense to talk about a
non-compile-time constant?
an optimiser will determine the result of the conversion at compile
time, just as 2+3 will be determined at compile time by practically
every compiler).




Quite possible in C90, but most definitely not in C99. In C90, the
wording was such that in an assignment, or within an equality operator,
and probably some cases that I forgot, a null pointer constant was
replaced with a null pointer. (char*)0 was _not_ one of these cases and

I thought that casting constant 0 to any pointer type produced the
null pointer; is this not the case?
 

Keith Thompson

Christian Bau said:
No, it does not, at least not in C99.

You're right that the language doesn't require the conversion to be
done during compilation; I was sloppy there. (But of course it can
be, and it typically is.)
For example, in the definition of "equality operator" it says

Constraints: (three other possibilities and)
One operand is a pointer and the other is a null pointer constant.

Semantics: (two other possibilities and)
Otherwise, at least one operand is a pointer. If one operand is a
null pointer constant, it is converted to the type of the other operand.

Conversion is done according to exactly the same rules as all other
conversions; since a null pointer constant is either of an integral type
or of type void*, that conversion is either a conversion from an
integral type to a pointer type or from void* to another pointer type.
But there is no difference between this conversion and any other
conversion. It is conceptually not a compile time construct. (Of course
an optimiser will determine the result of the conversion at compile
time, just as 2+3 will be determined at compile time by practically
every compiler).

Conversion of a null pointer constant to a pointer type is explicitly
a special case, at least in C99. (I think that was the intent in C90
as well, but C99 expresses it better; this is speculation on my part.)
Quite possible in C90, but most definitely not in C99. In C90, the
wording was such that in an assignment, or within an equality operator,
and probably some cases that I forgot, a null pointer constant was
replaced with a null pointer. (char*)0 was _not_ one of these cases and
in C90 not guaranteed to be a null pointer; in C99 they added that
_every_ conversion of a null pointer constant to a pointer produces a
null pointer.

Here's the C90 wording, with underscores denoting italics:

An integral constant expression with the value 0, or such an
expression cast to type void *, is called a _null pointer
constant_. If a null pointer constant is assigned to or compared
for equality to a pointer, the constant is converted to a pointer
of that type. Such a pointer, called a _null pointer_, is
guaranteed to compare unequal to a pointer to any object or
function.

Two null pointers, converted through possibly different sequences
of casts to pointer types, shall compare equal.

C99 says:

An integer constant expression with the value 0, or such an
expression cast to type void *, is called a null pointer constant.
If a null pointer constant is converted to a pointer type, the
resulting pointer, called a null pointer, is guaranteed to compare
unequal to a pointer to any object or function.

Conversion of a null pointer to another pointer type yields a null
pointer of that type. Any two null pointers shall compare equal.

In my opinion, C99's statement that a null pointer constant yields a
null pointer when converted to a pointer type does not imply that all
expressions of type int with value 0 yield a null pointer when
converted to a pointer type.

In my example above, I presume the part you disagree with is my
assertion that, given "int zero = 0;", the expression "(char*)zero"
needn't yield a null pointer. Since "zero" is not a null pointer
constant

Your assumption, I think, is that conversion of a given value from a
given type to another given type (in this case, respectively, the
value 0, the type int, and the type char*) must always yield the same
result. That's not an unreasonable expectation, but I think it's
overridden by the fact that conversion of a "null pointer constant" is
explicitly a special case.

More concretely:

#include <stdio.h>
#include <string.h>

static int equal(char *x, char *y)
{
    return memcmp(&x, &y, sizeof(char*)) == 0;
}

int main(void)
{
    char *null_pointer = 0;
    char *all_bits_zero_pointer;
    int zero = 0;
    char *maybe_null_pointer = (char*)zero;

    memset(&all_bits_zero_pointer, 0, sizeof all_bits_zero_pointer);

    if (equal(null_pointer, all_bits_zero_pointer)) {
        printf("A null pointer is all-bits-zero\n");
    }
    else {
        printf("A null pointer is not all-bits-zero\n");
    }

    if (equal(null_pointer, maybe_null_pointer)) {
        printf("Conversion of int zero yields a null pointer\n");
    }
    else {
        printf("Conversion of int zero does not yield a null pointer\n");
    }

    return 0;
}

I'm using memcmp rather than direct pointer comparison to avoid any
possibility of undefined behavior on attempts to access the value of
an invalid pointer.

I think we all agree that the first line of output from this program
may be either
A null pointer is all-bits-zero
or
A null pointer is not all-bits-zero

I assert that, regardless of the first line of output, the second line
can be either
Conversion of int zero yields a null pointer
or
Conversion of int zero does not yield a null pointer
under a conforming implementation -- though an implementation would
have to be particularly perverse to produce
A null pointer is not all-bits-zero
Conversion of int zero does not yield a null pointer

(The output happens to be
A null pointer is all-bits-zero
Conversion of int zero yields a null pointer
on all systems I'm familiar with; that's not very illuminating.)

To put it another way, I believe that choosing a value other than
all-bits-zero for the null pointer does not imply that
integer-to-pointer conversion has to do anything other than a bitwise
copy *except* in the case of a null pointer constant.
 
Keith Thompson

RCollins said:
OK, you lost me here. When talking about the null pointer constant (or
any constant for that matter), doesn't that mean that they are available
(as constants) at compile time? Does it make sense to talk about a
non-compile-time constant?

The null pointer constant itself exists only in the C program source.
Christian's point, I think, is that the *conversion* doesn't
necessarily take place at compile time (and he's correct).

For example, given:

char *ptr = 0;

the expression 0 is implicitly converted to char*. If this conversion
is non-trivial (something other than just copying the bits), it could
easily take place at run time, as long as it yields a null pointer
value.
I thought that casting constant 0 to any pointer type produced the
null pointer; is this not the case?

Yes, that is the case. The question is whether casting a non-constant
value 0 to a pointer type necessarily yields a null pointer.

Given "int zero = 0;", if you assume that all conversions are
performed at run time, it's difficult to imagine that (char*)0 and
(char*)zero could yield different results; it seems obvious that they
should be equivalent. Likewise, if you assume that all conversions
are done *as if* they were performed at run time, the situation is the
same. The question is whether the standard's treatment of conversion
of a null pointer constant as a special case is enough to allow this
equivalence to be broken. I think it is; Christian thinks it isn't.

I'm beginning to think this may call for a DR.
 
Mabden

Keith Thompson said:
But, but ... you mean you've changed your mind, and admitted you were
wrong? This is Usenet; we don't do that here! Arguments are supposed
to continue without resolution until everybody has killfiled everybody
else and we're all left with nothing but simmering contempt.

<best Homer Simpson voice> Doh!

But, not to worry, I think I saw someone else pick up the "(void *)0 isn't
location zero" argument again, so we can sit back and watch the
merry-go-round again, and again, and again, and again, and again, and again,
and again. <falls over dizzy>

Is there any magic way to say, "Hey! I started this thread, and I'm done, so
ya'll stop now, y'hear."?

Wait! I know how!!!!

You're all a bunch of Language Nazis!!!

That should do it!
 
RCollins

Keith said:
The null pointer constant itself exists only in the C program source.
Christian's point, I think, is that the *conversion* doesn't
necessarily take place at compile time (and he's correct).

Ah ... gotcha. I was mis-reading the earlier posts about where the
conversion takes place.
 
CBFalconer

Mabden said:
.... snip ...

CBFalconer just reads. I've never said that zero location can't
be system code, altho I desired it to be unreadable in C (before
my conversion). But my main point was for zero to be unwritable
(which is wrong, which is wrong, which is wrong!).

Because I have no idea what is actually at location 0 and what it
does, and no interest in finding out. I expect reading to be
non-harmful to my system. I have great qualms about writing
thereto.
 
Mabden

CBFalconer said:
Because I have no idea what is actually at location 0 and what it
does, and no interest in finding out. I expect reading to be
non-harmful to my system. I have great qualms about writing
thereto.

Not to worry, it's guaranteed not to work!
Doh! Backsliding...!
Sorry, I had to do that... :)

It'd probably just overwrite some lowlevel boot loader or something. Here's
an idea, tho, read from Location Zero and write back the same value. Or
better yet write back something else, to show it can change, then put back
the original value. Then reboot.
 
Christian Bau

RCollins said:
I thought that casting constant 0 to any pointer type produced the
null pointer; is this not the case?

In C90 it was _not_ the case. There was _no_ rule in C90 that said that
casting constant 0 to any pointer type produced a null pointer. There
were rules saying that in certain situations a null pointer constant is
replaced by a null pointer, casting 0 to char* was not one of these
situations. In C99, this has been changed. A null pointer constant is
not replaced by a null pointer anymore, a conversion takes place
instead. Some other rule then guarantees that _all_ conversions from
null pointer constants to pointer types yield null pointers.
 
Michael Wojcik

Given "int zero = 0;", if you assume that all conversions are
performed at run time, it's difficult to imagine that (char*)0 and
(char*)zero could yield different results; it seems obvious that they
should be equivalent.

It's not obvious to me. (char*)0 is a null pointer constant, cast to
a char*. (char*)zero is an int variable value cast to char*. zero
is not a constant here, it does not meet the C99 definition of a null
pointer constant (as it is not an integer constant expression, nor
such an expression cast to void*), and operations on it need not obey
the null pointer constant rules in any fashion.

Consider in particular:

{
extern int zero;
char *p;

p = (char*)zero;
}

The value of zero when this block is executed may be 0, but that
can't be known at compile time, which would mean that if Christian
is correct the implementation would *have* to generate code to
check if the current value of zero is 0 before performing the
conversion. It's obvious *to me* that such a requirement is
unnecessarily burdensome, and I don't see it anywhere in the n869.
Perhaps it's in the final standard, but I haven't seen any quoted
text from that in this thread that would lead me to believe that,
either.

What it comes down to is simply that no expression involving a
variable is a null pointer constant, because null pointer constants
are always integer constant expressions. More specifically:

[#6] An integer constant expression shall have integer
type and shall only have operands that are integer
constants, enumeration constants, character constants,
sizeof expressions whose results are integer constants, and
floating constants that are the immediate operands of casts.
(n869 6.5.17)

So no expression containing a variable can be a null pointer
constant. Further, my interpretation of n869 6.5.4 #4 is that
an implementation would be non-conforming if it treated a non-
constant integer expression with value 0 as a null pointer
constant, if that implementation did not use all-bits-zero as
the null pointer value, because a cast "converts the value of
the expression to the named type". If the value of the integer
variable zero is 0, then casting it to char* should produce a
char* type object with value all-bits-zero.

--
Michael Wojcik (e-mail address removed)

There are many definitions of what art is, but what I am convinced art is not
is self-expression. If I have an experience, it is not important because it
is mine. It is important because it's worth writing about for other people,
worth sharing with other people. That is what gives it validity. (Auden)
 
Keith Thompson

It's not obvious to me. (char*)0 is a null pointer constant, cast to
a char*. (char*)zero is an int variable value cast to char*. zero
is not a constant here, it does not meet the C99 definition of a null
pointer constant (as it is not an integer constant expression, nor
such an expression cast to void*), and operations on it need not obey
the null pointer constant rules in any fashion.

Note the qualification: "if you assume that all conversions are
performed at run time ...". With that assumption, it follows, I
think, that (char*)0 and (char*)zero would yield the same value (which
must be a null pointer).

In fact, I don't make that assumption. In my opinion, an
implementation could legitimately perform the conversion in (char*)0
at compilation time (yielding a null pointer), and perform the
conversion in (char*)zero at run time (yielding, most likely, an
all-bits-zero pointer).

I was trying to present the other side of the argument. It looks like
I inadvertently convinced you that I believe it. At least I didn't
convince you that it's correct. 8-)}
 
