C Standard Regarding Null Pointer Dereferencing

Seebs · Jul 24, 2010

You may have conceded his point by admitting to the specialized
exceptions. Evaluating *(char *)0 is undefined; none-the-less it
has a defined type. The statement

sizeof(*(char *)0);

does not (or at least should not) invoke undefined behaviour
because the expression is not evaluated.

Right.

But that's because sizeof() is magical in that it uses the type, not the
value. The expressions under discussion *are* evaluated. Basically,
sizeof()'s special language is the exception that proves the rule -- and
also explains why you might care about the type of an expression which
cannot be evaluated without producing undefined behavior. It's perfectly
reasonable to use such an expression *as the operand of sizeof* -- because
that uses the type rather than the result.

-s

Shao Miller · Jul 24, 2010

You may have conceded his point by admitting to the specialized
exceptions. Evaluating *(char *)0 is undefined; none-the-less it
has a defined type. The statement

sizeof(*(char *)0);

does not (or at least should not) invoke undefined behaviour
because the expression is not evaluated.

Thanks for suggesting that, Richard. That seems reasonable.

It's really unfortunate that on-topic, civil discussion resulted in
upset, here. Evidently the word "invented" can be perceived by some
folks as un-civil. I wish that I'd known of that possibility
beforehand. I honestly meant it in the context of definitions without
references, either made-up on-the-spot or not helping due to a lack of
citation. Also, there should be the possibility of "mistakenly
interpreting" and "mistakenly remembering" versus "intentionally
misrepresenting."

Communications can be challenged when one doesn't understand another
person's frame of reference. For Peter, he claims to have a wealth of
knowledge and experience. For me to boldly state that parts of his
discussion were invented might have come as a most unexpected and most
unlikely circumstance. I did not anticipate that possibility and now
suffer the loss of any future valuable discussion from him.

I try to interact much of the time through questions-only. This often
works out extremely effectively. In this discussion, I was a bit
desperate to understand the situation, especially after many
unsatisfactory (to me) responses, and must admit having neglected this
nice strategy, which has traditionally kept things civil, in my
experience. This lesson will stick with me.

Now you are quite right by me, Richard, when you suggest that if anyone
at all had managed to recognize that my frame of reference
included the point that a result need not have both value and type,
then a simple response of "Your result has a defined type but no
defined value. Defined results must have both. Here's the
reference..." might have shortened the path to a question:

Is the result of evaluating an expression with void type defined to
have all three of: 1. A result, 2. a type, 3. a value, given that all
constraints are met and the evaluation can be performed entirely
according to the semantics without invoking undefined behaviour?

Also, to be fair, there really was some ambiguity in the definitions
used by the discussants, due to ambiguity in the referenced draft. At
some point along I had assumed that there might be an oversight found
in some implementations and peoples' interpretations. Instead, I
discovered that these things, in fact, define the reality for C. The
draft standard is perhaps a possible _target_ for conformance, but
perhaps not the best definition to discuss in regards to adherence,
without some guess-work or asking around.

Having said that, it only confused me further when kind responders
both suggested that the draft is clear/obvious/unambiguous and that it
isn't. It really would have been better (for my benefit, anyway) to
agree on one or the other as soon as possible. Perhaps this feedback
will benefit future discussion, with good fortune.

Shao Miller · Jul 24, 2010

Well I might as well document this bit of trivia, since there've been
a couple of other bits of trivia mentioned.

01. void foo(void) {
02. return;
03. }
04.
05. int main(void) {
06. int i = 13;
07. void *v = &i;
08. (void)13;
09. foo();
10. *v;
11. return 0;
12. }

Does evaluation of the cast operator on line 08 yield a defined result
with a defined type and a defined value?

Does evaluation of the function call operator on line 09 yield a
defined result with a defined type and a defined value?

Do all defined results require a defined type and a defined value?

Does the evaluation of the indirection operator on line 10 yield a
defined result with a defined type?

Only posted as a trivial reference. Feel free to respond or ignore,
at your capable discretion.

Tim Rentsch · Jul 24, 2010

Shao Miller said:
After your fine reference to the text below, I'd have to agree.

I am not aware of anyone who's reading it like a math textbook and I'd
have to agree. It could be worth-while reading its fine detail and
discussing and resolving perceived ambiguities, for the case where one
might be interested in developing a translator for C.

Indeed I did.

This is to me an extremely valuable reference to the text of
'n1256.pdf'. I agree that with this reference in mind, it's
nonsensical to treat "If an invalid value has been assigned to the
pointer" as being intended to mean anything other than "If the operand
has an invalid value"... If only the text said "operand." It
doesn't. It says "pointer."

Is there any doubt that the operand has a value? We can assign '(char
*)0' or even '(void *)0' to an object. I don't think there's any
doubt that the operand has a value.

This could potentially be a cause for confusion, since sentences 2 and
3 explicitly use "operand" and "points to" and "has type". The next
sentence could very well mean, "if the value of the operand _is_ an
invalid value..." (Emphasis mine.) It could also mean, "if the value
of the operand was an invalid value assigned to the operand..."

Do you understand why I am asking about all of this? In the execution
environment if we attempt to access an object at an invalid location,
it should be undisputed as undefined behaviour. But expression
evaluation != execution. Evaluation of a constant scalar expression
such as '(char *)0' need not be "executed" at all. That is to say,
the text defines an attempted object access to an invalid location as
undefined behaviour. It could even be trapped by the best
implementation. But evaluation of an expression which is an
application of the unary '*' operator does not necessitate an object
access to any location. If it did, the text should include something
like:

"The result of evaluation of the unary '*' operator shall be the value
of an object pointed to by the operand, if the operand points to an
object."

But that might not be the case. Consider these:

(*p).f();
(*q)->x = 10;
*r = 11;
(*s)();

For 'p', 'q' and 'r', if they point to an object, the result is an
lvalue. It's not a "value". There's no need to "fetch" the "value"
during the indirection at all, is there? Thus we only get undefined
behaviour if they _don't_ point to an object, which is a determination
that might only be possible during execution.

For 's', the indirection is intended to result in a function
designator. Not an lvalue. Not a "value".

It is clear that many people have tied evaluation of the unary '*'
operator to "yielding an object, pointed-to by the operand" in their
thinking. But this is not the case.

Also consider a Turing machine implementation with a tape and a head.
In the 'q' example above, if 'q' were assigned the value '(struct foo
*)0', the head might move to position zero, where "read" and "write"
are invalid. No read nor write is attempted. Then the head moves by
the offset of the 'x' member. At last, we attempt a write when we
assign, assuming that reads and writes are valid at that position.
Why should there be undefined behaviour by moving the head to position
0 any more so than to any other location which is invalid for objects
or for which the validity is not guaranteed?

Does anyone understand why "has been assigned" could be important?

char *p;
*p = 'Y';

If the Turing machine's head attempts to move to the location as per
'p', that location might not be a valid location for the head to move
to. Undefined behaviour. But how can you have _an_expression_ with a
_constant_scalar_value_ at _translation_time_ (let alone during
execution) possibly represent an invalid location for the head to move
to?

Your thinking seems very confused. I suggest you stop
thinking about execution on real machines or Turing
machines, and focus on reading the Standard to understand
what it says about semantics on the abstract machine. The
questions you've asked can be answered by reading the
Standard carefully, not just isolated sections but all of
sections 1-6, and considering what it's trying to say about
semantics on the abstract machine, which is the only one
that really counts. I'm not inclined to spend any more
effort responding to someone who seems to want other people
to do the work for something that he appears to be capable
of doing himself, if only he would put in more effort on
that and less on raising captious objections.

Ben Bacarisse · Jul 24, 2010

Shao Miller said:
On Jul 23, 11:36 pm, Ben Bacarisse wrote:
Ok. But I'd rather that even this was clearer.
1. There is a sentence which specifies a value for the result.

Check. 6.5.3.2 p4 "If the operand points to a function, the result is a
function designator; if it points to an object, the result is an lvalue
designating the object."

2. There is a sentence which specifies a type for the result.

Check (well there are two, in fact). 6.5.3.2 p2 "The operand of the
unary * operator shall have pointer type." And 6.5.3.2 p4 "If the
operand has type 'pointer to type', the result has type 'type'."

3. If the sentence regarding the value does not apply, the sentence
regarding the type is _insufficient_ to define a whole result.

Check. 4 p2 "Undefined behavior is otherwise indicated in this
International Standard by the words 'undefined behavior' or by the
omission of any explicit definition of behavior."

Underlying the specific issue you have is a problem that the standard
has never quite managed to resolve. There are three attributes that
matter about an expression and/or its "result": (a) the quantity (which
character, which integer, etc.), (b) the type, and (c) whether it is an
lvalue (there is a detail about whether it is also a modifiable lvalue
but let's simplify for the moment).

(b) and (c) can be determined from the syntactic form of the expression
along with some type analysis whereas (a) is a dynamic property of the
expression at run time. To use "result" for all of these clouds this
distinction and has led you to think that a "result" can be defined when
only the type is known.

I'd prefer the wording to be done like this:

Form: *E
Constraints: The operand, E, must have type 'pointer to T'.
Type: An expression of the form *E has type T and is an
lvalue if T is an object type.
Result: If the result of evaluating E is a pointer to a function,
the result is a function designator denoting the pointed
to function. If the result of evaluating E is a
pointer to an object, the result denotes that object.

It would then be clear that the type is not really "part of the result"
but a property of the expression form -- something essentially static
and not associated with the evaluation. I'm not suggesting it -- the
work would be monstrous and there would be endless details to get right
(variably modified array types spring to mind) but this highlights what
the current wording is dealing with.

Of course, the way it is done now is much more intuitive. For most
expressions, it suggests that the result is a quantity tagged with a type
and lvalue-ness. But this does not work for sizeof, for example. It
does not (usually) evaluate its operand so the dynamic view of a
type-tagged result has no meaning. People know that the type can be
determined without evaluation so they apply common sense to understand
the sizeof operator. It's a shame that the wording is not perfect, but
it is not nearly as confusing as you seem to think.

<snip another discussion about void expressions. I don't want to get
into that here>

Agreed conditional upon acceptance of at least one of:
1. "...has been assigned..." really means something more like "is an
invalid value"

That is what most people take it to mean. Why? Because making special
provision for when a pointer is an object that has been assigned to
makes no sense when taken literally. Given:

const int *ip = 0;

*ip would not be covered but it would be after:

int *ip;
ip = 0;

Both would remain undefined by omission, so what value would the literal
interpretation serve?

OR:

2. Casting to 'void' and application of the unary '*' operator are
treated differently. Both may fail to define a value for the result
of an evaluation, but the cast is permitted as defined behaviour.

I don't see how this makes any difference but it does not matter because
I'm choosing (1) not (2)!

Each of these points feels like a blow, including any failure on my
part to treat the referenced draft as anything more than a guide to be
supplemented by popular consensus.

Hmmm. Now I doubt your sincerity again. Read what you wrote. You are
suggesting that I (and indirectly Tim) want you to treat the largely
unpaid work of dozens of experts over more than two decades as no more
than a guide to the language.

Furthermore, neither of us is suggesting that "popular consensus" is
the main tool to be used when there is ambiguity. That would be absurd.
Do you see how that comes over?

"Common sense" meaning "popular interpretation" to me. Very well;
accepted.

I think you need to refine your understanding of the term "common
sense".

If writing a translator, I might have a 'struct result' with a pointer
to a type and a pointer to a value. I might initialize these with
NULL each. If an "operator" for a 'struct result' demanded one of
these properties but it was not defined, I might diagnose undefined
behaviour.

That's hard to get right though it can be done. You need to make sure
that sizeof (1/0) works properly and that you distinguish between
"plain" values and lvalues. Your eval function needs flags to say what
sort of "evaluation" to do. In the example, 1/0 needs to be "evaluated"
for its type alone. I put evaluation in quotes because it is not C's
notion of evaluate but one that comes from the interpreter you are
writing.

It simply seemed to me that there were circumstances in
which some code path for "evaluation" might not ever use one of the
properties, which would lead me to question the validity of diagnosing
as UB if one compares as NULL but there was no expectation for it to
be non-NULL... Such as a void expression, which appears to be more
limited than I thought (casts to void and functions returning void,
for example).

I don't follow all of this. Yes, there are cases where one would not
ever use one of the properties such as in sizeof (1/0), but in
(void)(1/0) you need to evaluate 1/0 for its value property so that you
can throw it away.

<snip>

| It would be nice if a constraint for the
| unary '*' operator were that the pointer must either point to an
| object or to a function. Perhaps some kind reader could introduce
| such a constraint into a future standard.

A constraint that: "except when the '*' operator is used as the
operand to the 'sizeof' operator, an expression evaluating to a null
pointer constant or to a null pointer constant cast to any pointer
type shall not be the operand," might do, mightn't it?

Either this can't be a constraint (because you mean to include something
that can't be tested at compile time) or you have now changed the
suggestion to catch only a few cases. It all depends on what you mean
by "evaluating to a null pointer constant or to a null pointer constant
cast to any pointer type".

Yes. The function call assigns the value of an argument to the 'ip'
parameter. Passing in an invalid value would result in UB.

You proposed a constraint (which I have put back since you cut it)
"were that the pointer must either point to an object or to a
function". My example (yes, we both agree it is UB) shows that you
can't tell *at compile time* if the constraint you originally proposed
is or is not violated.

Well actually, it does explain what you can do with the results. I
had made earlier references to these. "Cast operators"' first
constraint says "Unless...the operand shall have scalar type". Its
first semantic point talks about "the value of the expression."
"Simple assignment" talks about "type" and "value" for the
"operands". That explicitness (along with void expressions) was part
of why a result was not required to have both, against the consensus
here.

There is no mention of your suggested type-only results. They would be
a major part of the language. How do they work? What can we do with
them?

However it was the _consumers_ of the results that I was taking
to give meaning to constraint-valid and semantically valid
expressions. The consensus appears to be that the results are defined
or not, regardless of the consumers or their properties, except for
'sizeof' (which nobody has disputed).

If it is a consensus, it is born out of the wording. An evaluation of
*E is defined when E points to an object or a function. (char *)0 does
neither. To allow the fact the expression form has a type to mean that
it also has some sort of valueless, type-only result is to invent a whole
new language.

<snip>

Ben Bacarisse · Jul 24, 2010

Shao Miller said:
Well I might as well document this bit of trivia, since there've been
a couple of other bits of trivia mentioned.

01. void foo(void) {
02. return;
03. }
04.
05. int main(void) {
06. int i = 13;
07. void *v = &i;
08. (void)13;
09. foo();
10. *v;
11. return 0;
12. }

Does evaluation of the cast operator on line 08 yield a defined result
with a defined type and a defined value?

Does evaluation of the function call operator on line 09 yield a
defined result with a defined type and a defined value?

Do all defined results require a defined type and a defined value?

Does the evaluation of the indirection operator on line 10 yield a
defined result with a defined type?

Only posted as a trivial reference. Feel free to respond or ignore,
at your capable discretion.

Let me propose something else. Post an example where you think that
some significant part of the meaning of the program depends on the
answers to your questions. I.e. find an example that matters. This
will interest people.

Everyone here (I am guessing) has their favourite examples of where the
literal wording in the standard falls short of giving an answer to some
question or other but most people want to write effective well-defined C
programs and they somehow manage to do that despite these details.

Shao Miller · Jul 24, 2010

The word "claim" is rather loaded. Peter /does/ have a wealth of
knowledge and experience. That doesn't mean he is necessarily right, but
it does mean that the probability of his being right is significantly
high. When I find myself disagreeing with Peter, it always gives me
pause for thought.

Is it possible that "claims" could be injected with additional meaning
by the reader but not by the writer? If I don't fully know the whole
picture regarding everyone's status, can I reasonably say "has"
instead of "claims to have"? In other words, I am a newcomer, here.
I don't know any of you. It might be beneficial to my understanding
of C to get to know some of you.

But can I possibly know beforehand
what magic words will set people off, such as "invent" and
"claims"?

Is it reasonable for me to simply take note of these words and avoid
them in the future? Can I do so without feedback like yours,
Richard? My answer would be "no." And so I thank you.

And also you have provided some evidence for the claim; this is
recorded. Would you have offered that evidence without my use of
"claims"? Does that last question suggest that I used "claims"
intentionally towards that end? I didn't. It was meant as a simple
statement of fact.

"Claims" will be dropped from my vocabulary here now, surely. Do you
happen to know of a nice list of words to avoid, like these ones?
That would be great.

Shao Miller · Jul 24, 2010

No. Operators yield values, but they are not themselves evaluated.
Therefore, there is no "evaluation of the cast operator".

No, for the same reason.

What do you mean by "result" in this context? Is a successful write to
stdout a "result"? Some would say yes.

Previous discussion led me to believe that there was a common
understanding of the term "result". If that's so, that common
understanding is what I intended to ask about, here.

No - see above.

By your correction above, it would appear that these questions are
broken. I shall attempt to ask different ones, instead. Thank you!

01. void foo(void) {
02. return;
03. }
04.
05. int main(void) {
06. int i = 13;
07. void *v = &i;
08. (void)13;
09. foo();
10. *v;
11. return 0;
12. }

When line 08 is evaluated by a conforming implementation, is the
behaviour well-defined to produce a result for the expression
'(void)13'? Does that definition define both a type for the result as
well as a value for the result?

When line 09 is evaluated by a conforming implementation, is the
behaviour well-defined to produce a result for the expression
'foo()'? Does that definition define both a type for the result as
well as a value for the result?

Using a conforming implementation, is the evaluation of every
expression within a strictly conforming program well-defined to
produce a result? In this same circumstance, is that result well-
defined to possess both a type as well as a value?

When line 10 is evaluated by a conforming implementation, is the
behaviour well-defined to produce a result for the expression '*v'?
Does that definition define a type for the result?

Shao Miller · Jul 24, 2010

The standard only mentions values for expressions of object type.

The expression in the expression statement
on line 08 is of type void.

Agreed. How can a conforming implementation make that very
determination?

The value of such an expression
is described by the standard as being "nonexistent".

N869
6.3.2.2 void
[#1] The (nonexistent) value of a void expression (an
expression that has type void) shall not be used in any way,
and implicit or explicit conversions (except to void) shall
not be applied to such an expression.

Thanks, pete!

Shao Miller · Jul 24, 2010

Your thinking seems very confused. I suggest you stop
thinking about execution on real machines or Turing
machines,

Turing machines were only mentioned as a possible explanation for why
the author of the "has been assigned..." piece of semantic for unary
'*' might have intentionally meant the text to be taken literally.

and focus on reading the Standard to understand
what it says about semantics on the abstract machine. The
questions you've asked can be answered by reading the
Standard carefully, not just isolated sections but all of
sections 1-6, and considering what it's trying to say about
semantics on the abstract machine, which is the only one
that really counts.

Have you taken my Turing machine example to mean that I am worried
about anything but the abstract machine?

I'm not inclined to spend any more
effort responding to someone who seems to want other people
to do the work for something that he appears to be capable
of doing himself, if only he would put in more effort on
that and less on raising captious objections.

If one perceives a gross ambiguity in the draft of a standard for C,
what is the best course of action to gain clarity on it after
exhausting the material in that draft? Do the references cited
through the original post and thereafter by its author suggest that
the draft material was read thoroughly? How many references by other
posters appear in the original poster's initial handful of posts? How
many do not? How many times should a person read something before
asking for others to help by sharing their interpretations and their
reasoning for those interpretations?

I can appreciate that you have made your judgment about me here and I
thank you for what you did contribute, Tim.

Seebs · Jul 24, 2010

The word "claim" is rather loaded. Peter /does/ have a wealth of
knowledge and experience. That doesn't mean he is necessarily right, but
it does mean that the probability of his being right is significantly
high. When I find myself disagreeing with Peter, it always gives me
pause for thought.

Which, given who's saying it, I find quite flattering.

I guess that word, again, strengthens my point: If those remarks really
aren't intended as offensive (and "claim" has the same sorts of connotations
of dishonesty that "invented" does), then that implies a level of familiarity
with English inconsistent with arguing with more fluent speakers when they
tell you that a given text is clear in its meaning.

-s

Shao Miller · Jul 24, 2010

Check. 6.5.3.2 p4 "If the operand points to a function, the result is a
function designator; if it points to an object, the result is an lvalue
designating the object."

Thank you. It does appear that this point defines a value for the
result under certain circumstances.

Check (well there are two, in fact). 6.5.3.2 p2 "The operand of the
unary * operator shall have pointer type." And 6.5.3.2 p4 "If the
operand has type 'pointer to type', the result has type 'type'."
Agreed.

Check. 4 p2 "Undefined behavior is otherwise indicated in this
International Standard by the words 'undefined behavior' or by the
omission of any explicit definition of behavior."

Agreed, and originally intended as the meaning.

Underlying the specific issue you have is a problem that the standard
has never quite managed to resolve. There are three attributes that
matter about an expression and/or its "result"...
... ... ...
I'd prefer the wording to be done like this:

Form: *E
Constraints: The operand, E, must have type 'pointer to T'.
Type: An expression of the form *E has type T and is an
lvalue if T is an object type.
Result: If the result of evaluating E is a pointer to a function,
the result is a function designator denoting the pointed
to function. If the result of evaluating E is a
pointer to an object, the result denotes that object.

It would then be clear that the type is not really "part of the result"
but a property of the expression form -- something essentially static
and not associated with the evaluation...
... ... ...

Agreed. Could your suggestion be met with criticisms for your simple
preference? Would it be enjoyable if anyone suggested that your
opinion on perceived ambiguity for readers of this material is not
worth offering? I thank you for it.

Of course, the way it is done now is much more intuitive...
... ... ...
It's a shame that the wording is not perfect, but
it is not nearly as confusing as you seem to think.

What is it exactly that causes you to believe that I find the wording
confusing?

<snip another discussion about void expressions. I don't want to get
into that here>

That's unfortunate for me, but I accept it.

That is what most people take it to mean. Why? Because making special
provision for when a pointer is an object that has been assigned to
makes no sense when taken literally. Given:

const int *ip = 0;

I see the truth of it. This initialization specifies the value
initially stored in the object but there is no assignment expression.
If we take the text to mean "...assigned to the pointer by an
assignment expression...", then "the pointer" could not have been
assigned-to.

*ip would not be covered but it would be after:

int *ip;
ip = 0;

Both would remain undefined by omission, so what value would the literal
interpretation serve?

I don't follow you here. You said '*ip' "would be" covered "after"
and then said both would remain undefined by omission.

I don't see how this makes any difference but it does not matter because
I'm choosing (1) not (2)!

I will choose (1) as well, as that's the overwhelming consensus, even
though I perceive the possibility of a different intention by the
author. I won't bore anyone by repeating it. Consider me convinced.

Hmmm. Now I doubt your sincerity again. Read what you wrote. You are
suggesting that I (and indirectly Tim) want you to treat the largely
unpaid work of dozens of experts over more than two decades as no more
than a guide to the language.

Please excuse me. Have I suggested something incorrectly? Have I
suggested that this "unpaid work of dozens of experts over more than
two decades" is not a valuable resource? If so, what would cause you
to suggest that I have implied/meant/stated that? I believe this
resource to be _the_ most valuable resource for C. Is it a guide? In
what way(s) is it more than a guide? Is it a math textbook? I would
answer that with "no."

Furthermore, neither of us is suggesting that "popular consensus" is
the main tool to be used when there is ambiguity. That would be absurd.

Do you see how that comes over?

Then I have misinterpreted and apologize. What should be the main
tool in case of ambiguity?

I think you need to refine your understanding of the term "common
sense".

An accepted possibility. So do you suggest that common sense and
perusal of the draft/standard is all that is required to develop a
conforming implementation? If so, do you see how that might come over
to someone? Would you be willing to help to refine an interpretation
of the term "common sense"?

That's hard to get right though it can be done. You need to make sure
that sizeof (1/0) works properly and that you distinguish between
"plain" values and lvalues. Your eval function needs flags to say what
sort of "evaluation" to do. In the example, 1/0 needs to be "evaluated"
for its type alone. I put evaluation in quotes because it is not C's
notion of evaluate but one that comes from the interpreter you are
writing.

Good tips.

I don't follow all of this. Yes, there are cases where one would not
ever use one of the properties such as in sizeof (1/0), but in
(void)(1/0) you need to evaluate 1/0 for its value property so that you
can throw it away.

You have explained that you are not interested in discussing 'void' at
this time. I accept this and won't trouble you, here.

Either this can't be a constraint (because you mean to include something
that can't be tested at compile time) or you have now changed the
suggestion to catch only a few cases. It all depends on what you mean
by "evaluating to a null pointer constant or to a null pointer constant
cast to any pointer type".

Can an expression at translation-time evaluate to a null pointer
constant? Can an expression at run-time evaluate to a null pointer
constant? 6.3.2.3 p3 includes detail concerning an "integer constant
expression". Could the value of an object be an integer constant
expression? If everyone knows or at least is adamant that "you can't
dereference a null pointer," could this constraint hurt the standard
for C? Could it help to prevent questions like mine from accumulating
more responses than a single response with a citation?

You proposed a constraint (which I have put back since you cut it)
"were that the pointer must either point to an object or to a
function". My example (yes, we both agree it is UB) shows that you
can't tell *at compile time* if the constraint you originally proposed
is or is not violated.

Agreed. I have abandoned that constraint in favour of the "null
pointer constant" constraint mentioned.

There is no mention of your suggested type-only results. They would be
a major part of the language. How do they work? What can we do with
them?

Could a void expression be considered a type-only result? 6.3.2.2 p1
describes an expression with no value (an empty set) and type 'void'.
Did we agree that such an expression is evaluated? The way I
perceive them to work is: if an expression is defined to have a
type "T", no value is defined for its evaluation, and that type "T"
is 'void', the expression is a void expression. Is this a reasonable
perspective? Sorry that we wound up with 'void', here. I apologize
but wished to answer your questions.

If it is a consensus, it is born out of the wording....

Agreed.

Thanks, Ben!

Shao Miller · Jul 24, 2010

Let me propose something else. Post an example where you think that
some significant part of the meaning of the program depends on the
answers to your questions. I.e. find an example that matters. This
will interest people.

Are you suggesting that the above example doesn't matter?

Everyone here (I am guessing) has their favourite examples of where the
literal wording in the standard falls short of giving an answer to some
question or other but most people want to write effective well-defined C
programs and they somehow manage to do that despite these details.

One such well-defined program might even be itself a C
implementation. It would be good to do it right by everyone here.

Shao Miller · Jul 24, 2010

Which, given who's saying it, I find quite flattering.

I guess that word, again, strengthens my point: If those remarks really
aren't intended as offensive (and "claim" has the same sorts of connotations
of dishonesty that "invented" does), then that implies a level of familiarity
with English inconsistent with arguing with more fluent speakers when they
tell you that a given text is clear in its meaning.

There's no pleasing all of the people, all of the time. If someone is
misinterpreting my words to carry connotations that weren't intended,
I can only try to address such in the future based on feedback, but
it's really _their_ constraint on interpretation. If someone chooses
to interpret negative connotations, I don't know exactly where that
comes from. Perhaps it is congruent with the same perspective that
makes it all right to presume homogeneity amongst English speakers'
use of English in discussion of a subject matter.

There is nothing non-objective about using "claims," but it'll be
dropped nonetheless in the future.

Please have tolerance for other people and the way they might write,
even if you've previously been worn down by abusers. If Bayes tells
you an abuser is likely at hand, by all means, fine. That's entirely
reasonable.

Keith Thompson · Jul 25, 2010

Shao Miller said:
Agreed. How can a conforming implementation make that very
determination?

I don't understand the question. The expression statement in question
was
(void)13;
Determining the type of (void)13 is just one of the thousands
of things an implementation is required to do.

Seebs · Jul 25, 2010

You get to know, after a while, who tends to be right more often than
not. Seebs is right more often than not. So is Keith Thompson. So is
Eric Sosman. So is David Thompson. (Non-exhaustive list.) If you are
ever lucky enough to see the return of Lawrence Kirby or Chris Torek or
Steve Summit, you will find that they are hardly ever wrong (but "hardly
ever" is not "never").

My boss once found a bug in some of Chris Torek's code!

.... once.

I haven't yet, although I have now at least once managed to sneak a bug
past his code review. (That said, my ability to sneak retroactively-obvious
bugs past code review has become something of a local legend.)

You can't, I suppose. But what you /can/ do is learn.

That said, I'm pretty sure the usual convention of using "invented" to mean
"made up" (and thus, not derived from reality, and thus implicitly dishonest)
is sufficiently widespread to not need special newsgroup-specific knowledge.

It isn't so much a word-list as an exercise in objectivity. It can be
quite difficult to stand back from the text you write and see it as
others will see it. But it is a worthwhile exercise, nonetheless.

Very much so.

As a quick starting point, just check the things you say to see whether they
would make any sense at all if you were confident the people you're talking
to were being honest. If they wouldn't, people will reasonably assume you
to be implying that they are dishonest.

Accusing someone of having "invented" something right after they've said
that a given text communicates it implies clearly that the text didn't
communicate that at all -- meaning it implies that they're being dishonest.

Interesting side note: Several of our protagonist's problems with the
Standard reflect a similar problem -- not being aware of the logical
implications of what is said and what is unsaid. If I were going to try to
address such a thing, I'd start by studying the Gricean Maxims, because
they're the underlying substrate over which words create meaning.

-s

Shao Miller · Jul 25, 2010

I don't understand the question. The expression statement in question
was
(void)13;
Determining the type of (void)13 is just one of the thousands
of things an implementation is required to do.

Absolutely agreed that it is just one of those things. The challenge
I perceive from this code is:

Why is a cast to 'void' well-defined but "dereferencing" a 'void *'
not well-defined?

If I'm not mistaken, an expression with 'void' type has no value
(6.3.2.2,p1 "nonexistent", 6.2.5,p19 "empty set"). When we read
about the "Cast operators" (6.5.4), what I directly observe from its
text is that this operator "converts the value of the expression to
the named type." The named type in '(void)13' is, of course, 'void'.
We know that an expression with 'void' type has no value, so that must
mean that the value is discarded during the conversion of the value,
would you agree?

Now we look at the text for the unary '*' operator (6.5.3.2,p4).
There we see a definition for the type of the result (of an
evaluation), just as we do for casting, would you agree? Thus if the
operand has type pointer-to-void, it suggests that the result has type
'void'. I would suggest that we must use the very same reasoning as
we do in our interpretation of cast operators to conclude that the
result is thus a void-expression (6.3.2.2,p1). The sentences
describing the result if pointing to an object and pointing to a
function do not apply. Nonetheless, the text appears to define the
result of evaluating application of unary '*' to an expression with
type pointer-to-void as being a result with type 'void'. A result
with type 'void' can be considered a void expression just as much as
the cast can be, can it not?

Same thing with a "function returning void" (6.5.2.2,p1). Its
evaluation is defined to be a result with type 'void' (6.5.2.2,p5).

Why is it that at least the C implementation named "GCC" appears to
distinguish between these three scenarios? Do other implementations,
as well?
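
A minimal sketch of the three scenarios being compared, under the assumption that the first two are uncontroversial void expressions and the third is the disputed one. The names note_call and void_expression_forms are hypothetical. The indirection through a pointer-to-void is left inside a comment because its status is exactly what is in question, and at least one implementation is reported in this thread to warn about it.

```c
static int calls;                  /* counts calls to note_call() */

/* A function returning void: the call expression has type void. */
static void note_call(void)
{
    calls++;
}

/* Exercises the two undisputed forms of void expression and returns
   how many times note_call() ran. */
int void_expression_forms(void)
{
    int x = 13;
    void *p = &x;

    calls = 0;

    (void)13;        /* 1: cast to void, the value 13 is discarded */
    note_call();     /* 2: call to a function returning void */

    /* 3: the disputed form would be written as
     *        *p;
     *    It is not compiled here, since whether it is well-defined
     *    is the open question.
     */
    (void)p;         /* keeps p used without dereferencing it */

    return calls;
}
```

The sketch does not settle anything; it only puts the two accepted forms next to the place where the third would go.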

My experience and my "common sense" suggests that "you cannot
dereference a 'void *'." Unfortunately, the text of the referenced
draft does not make that explicit (as far as I've yet been able to
determine; 6.5.3.2). Then thinking it through, it appears that
there's really _no_need_ for such indirection to yield UB. It could
simply be another form of void expression, like the two others.

If we can agree on this, what implications might there be for existent
implementations? I can only perceive a change to treat the behaviour
(of dereferencing a pointer-to-void) as well-defined, rather than
undefined. That really doesn't seem like a big deal, to me.

If we forget about _any_ of the debate regarding the "...has been
assigned..." business and accept it to mean what a quick glance might
suggest, we would _still_ have UB if the operand was a null pointer
value, if we additionally accept the non-normative footnote regarding
a null pointer being an invalid value. No gigantic implications for
implementations, there.

I fear that someone might respond as though the "points to an object"
and "points to a function" combined sentence somehow has some type of
priority over the following "has type" sentence... But _please_ note
the lack of "shall"s and "shall not"s.

I also fear that someone might respond that any 'void *' value would
be an invalid value, because such a value neither points to an object
nor to a function. That argument would also ignore the equal
precedence of the sentence regarding "has type".

Direct comparison with the "cast operators" might be useful. The text
does not define a value when the conversion is to type 'void'. But
rather than being undefined behaviour, we know from other parts of the
draft that 'void' represents no value(s). So defining the type of an
expression to be 'void' essentially defines the expression to be a
void expression. I cannot see how this could be any different for the
unary '*' operator (which we might have to temporarily detach our
familiarity with in order to study in this regard).

Please take your time to consider the implications before responding.
I am very hopeful for an agreement here, but assume that if major
overhauls and negative implications might be perceived, that an
agreement is less likely to happen. I really don't see how:

int x;
void *p = &x;
*p;

could be a huge deal as well-defined behaviour.

It is also possible that someone will point out a reference in the
text that I've read several times but have interpreted differently,
which makes the above proposal impossible. I'm happy either way.

Shao Miller · Jul 25, 2010

... ... ...

Nope. No point in rebutting a non-existent statement.

You might have misunderstood me, here. The "evidence" I meant to
refer to was another person offering that Peter _does_ have a wealth
of knowledge and experience for C. Regardless of this evidence, to be
honest, it was never a question to me. I have been operating under
the _assumption_ that _any_ intelligent response received in this
forum (as Peter Seebach's posts after his first are) is that of a
seasoned C developer. The benefit of the doubt is there; my
expectation would be for a responder to have to establish themselves
as _less_than_that_ before I would begin to associate less
credibility. Additionally, I shall not mistakenly associate personal
incompatibilities with a lack of C "seasoning".

... ... ...

I agree 100% with the entirety of this response post of yours,
Richard. It is, in my opinion, a worth-while read for any newcomer in
my situation. I will try to keep what you have stated in mind and
consider your references. I can't thank you enough, really. I did
not particularly anticipate these experiences when posting about
interests in C.

I'll also not be drawn into personal back-and-forth, because my focus
here is C.

It's possible that part of what has landed me in "hot water" here is
my expectation that discussion could take the form of directly
addressing a response's points in a manner of debate. Also, that I
sometimes attempt to calibrate my responses based on inferences about
the posters. This can obviously backfire, big-time. For example, one
way to establish rapport might be to mimic some of the attitudes
perceived as being possessed by a discussant. If a poster appears to
me to demonstrate authority without references, I might respond in the
same tone, insofar as it doesn't violate any of my personal
constraints for civil discussion. It's quite possible to fail to
determine what constitutes civil discussion in the perspective of the
other poster.

Anyway, back to C... Because that's what the subject-at-hand is...

Shao Miller · Jul 25, 2010

Because the Standard defines the one but not the other. It's not clear
what you're trying to get at here.

I cannot refer to the standard at this time, I'm afraid. I can only
refer to the draft with filename 'n1256.pdf'. In the rest of my post
I have provided quite a bit of detail regarding my interpretation that
this draft _does_ fully define the result of evaluating the (sole)
unary-expression on the third line below of something like:

int i;
void *v = &i;
*v;

13 has a value. (void)13 has no value. Neither, it appears, does your point.

Did you read the rest of the post? If so, it would be beneficial to
me if you would address exactly where you perceive faults in my
reasoning about the subject matter.

"The Standard defines the one but not the other" and the implication
that my point has no value really do not help me. My post provides
detail for how the draft _does_ define both. I have even met with
agreement on this point by another discussant in another C-devoted
forum, whose pedantry and accuracy I have historically valued.

Why is it that you have snipped so much of my post instead of pointing
out statements you disagree with?

My point is that there is a form of void expression in C which is not
commonly considered. I have observed at least one implementation to
print a warning, where I see no trouble. I have reasoned that this
and any other implementation's treatment of this form of void
expression may need addressing.

Is that clear?

Shao Miller · Jul 25, 2010

Yes, but you are rapidly using it up.

I am sorry to report that I do not understand this statement. Could
you please clarify what you mean here?

No, that's the usual way of things in clc, so that is not the
explanation for the high temperature of your entry into the group.

You have not provided any alternative explanation, but that's fine. I
shall continue to focus on my concerns with C.
