C Standard Regarding Null Pointer Dereferencing

Shao Miller · Jul 26, 2010

Let's agree to differ then.

Very well. Agreed.

This is
very different to what I take the phrase "popular consensus" to mean.
For example, I'd bet that the popular consensus amongst C programmers is
that
int two_d_array[10][10];
two_d_array[0][20] = 42;
is well-defined, but it's hard to find well-reasoned arguments from C
experts to support that view.


Aha. My take would make good use of 6.5.2.1,p2:


"The definition of the subscript operator [] is that E1[E2] is
identical to (*((E1)+(E2)))."


So 'two_d_array[0][20]' becomes (ugh) '(*((*((two_d_array)+(0)))
+(20)))', evaluation of which proceeds as follows:
1. 'two_d_array' slides easily through its parentheses
'(*((*(two_d_array+(0)))+(20)))', 6.5.1,p6.
2. 'two_d_array' is not the operand to 'sizeof' nor '&', so it becomes
an expression having type pointer-to-int[20], pointing to the first
element, per 6.3.2.1,p3. 'sizeof two_d_array[0]' should confirm this.
3. '0' easily slides out of its parentheses.
4. We add '0' to the pointer result in step 2. The declaration
provided an array object with 10 array-of-int objects. The element
with offset 0 is within the bounds (+1) and we have a result with type
pointer-to-int[20]. '(*((*(result))+(20)))' per 6.5.6,p8.
5. The result of step 4 easily slides. '(*((*result)+(20)))'
6. The unary '*' operator yields a result with type int[20].
'(*((result)+(20)))', per 6.5.3.2,p4.
7. The result from step 6 slides. '(*(result+(20)))'
8. So does '20'. '(*(result+20))'
9. The result of step 6 is not an operand to 'sizeof' or '&', has
array type, and is thus an expression having type pointer-to-int,
congruent with step 2's reference.
10. There is an 'int' object within the bounds of the array object at
offset 20, a pointer to this object is the result, congruent with step
4's reference. '(*(result))'
11. The result of step 10 slides. '(*result)'
12. The unary '*' operator yields a result having type 'int' and which
is an lvalue, congruent with the reference from step 6. '(result)'
13. The result of step 12 slides. It's still an lvalue. 'result'
14. The rest is assignment to that lvalue.
Where along here is there sometimes debate, if we can digress for at
least a moment?
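
The rewriting of 'E1[E2]' into '(*((E1)+(E2)))' walked through above can be checked for in-bounds indices with a small sketch. The function names here are hypothetical, and the indices are deliberately kept below 10 so that nothing contested is evaluated:

```c
#include <assert.h>
#include <stddef.h>

int two_d_array[10][10];

/* Returns 1 if E1[E2] and (*((E1)+(E2))) designate the same element.
   Both i (row) and j (column) must be below 10. */
int subscript_matches_pointer_form(size_t i, size_t j)
{
    assert(i < 10 && j < 10);
    return &two_d_array[i][j] == &(*((*((two_d_array) + (i))) + (j)));
}

/* Number of int elements in one row, computed from types only;
   'sizeof' does not evaluate 'two_d_array[0]' here. */
size_t row_length(void)
{
    return sizeof two_d_array[0] / sizeof two_d_array[0][0];
}
```

This says nothing about index 20 in row 0; it only shows the equivalence of the two spellings where both are uncontroversial.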


I don't think there is much debate. If there is any, I think it is
about what array to use as the bounds.

Ok. My impression of 'two_d_array[0] + 20' would be that
'two_d_array[0]' should yield a result with type 'int *', with an
intermediary state as an 'int[10]'. So adding 20 should try to move
to the 20th 'int' element in the entire array object. I believe that
6.5.6,p2 excludes an aggregate type, such as an array type.

"For addition, either both operands shall have arithmetic type, or one
operand shall be a pointer to an object type and the other shall have
integer type."

So I don't think an 'int[10]' could be an operand, here. I think the
case for trouble would be:

two_d_array + 20 /* Having an operand with a type going from
'int[10][10]' to 'int(*)[10]' */

versus:

two_d_array[0] + 20 /* Having an operand with a type going from
'int[10]' to 'int *' */

The array-to-pointer conversions come from 6.3.2.1,p3. That's an
example of an implicit conversion.

"...an expression that has type ‘‘array of type’’ is converted to an
expression with type ‘‘pointer to type’’..."
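
A rough sketch of what the two conversions mean for pointer arithmetic, assuming nothing beyond the declaration used in this thread. The function names are made up for illustration. Each one measures, in bytes, how far adding 1 moves the converted pointer:

```c
#include <assert.h>
#include <stddef.h>

int two_d_array[10][10];

/* 'two_d_array' converts from 'int[10][10]' to 'int(*)[10]',
   so adding 1 steps over one whole row of ten ints. */
size_t outer_step_bytes(void)
{
    return (size_t)((char *)(two_d_array + 1) - (char *)two_d_array);
}

/* 'two_d_array[0]' converts from 'int[10]' to 'int *',
   so adding 1 steps over a single int. */
size_t inner_step_bytes(void)
{
    return (size_t)((char *)(two_d_array[0] + 1) - (char *)two_d_array[0]);
}
```

Both additions stay inside the declared object, so the sketch does not depend on either side of the bounds argument.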

I got lost in your explanation because you have phrases like "having
type pointer-to-int[20]" and "a result with type int[20]". I'm not sure
what these mean. The last is a C type but it does not occur in the
evaluation so I think you meant something else by it.

Well I apologize. I mixed styles here. "pointer-to-int[20]" meaning
"int(*)[20]" and "int[20]" meaning just that. Unfortunately, it's
actually a typo. I should have written '10' instead of '20' in both
places.

Shao Miller · Jul 26, 2010

I understand that you believe that it does. None-the-less you
are confused. The name of the type of void * can be considered
to be "pointer to void" since that is what the syntax indicates.

Beyond syntax, as well as the use of "pointer to void" in 6.3.2.3,p1
and 6.5.3.4,p5 and 6.5.15,p6, etc. This last even goes so far as to
detail a qualified version of 'void', which might confuse someone
until they read 6.7.3,p3. That seems like there's a decent effort to
integrate 'void' as a type, though it also "comprises an empty set of
values", as per 6.2.5,p19.

None-the-less something that has type void * is not a pointer to
void. In the code

void * p;

p is a pointer to a storage object of unspecified type. That is,
the type of p is "pointer to a storage object of unspecified
type".

That is roughly the way in which I interpret it, due to the type
partitioning in 6.2.5,p1, where 'void' would be an incomplete type,
and 'void *' a pointer to such a type.

The essence of the matter is that "void *" and "void" are
primitive types. Despite the similarity in names they are
unrelated.

This is very persuasive. I would find it difficult to drop the
literal interpretation of 6.5.3.2,p2 (a constraint with a "shall"
which does not restrict the operand to a pointer to any of function,
object, or incomplete type) as well as 6.5.3.2,p4 (which describes a
type for the result, based on the type of the operand, and using
quotes to surround these).

Are there further references that might help to drop this literal
interpretation? I believe that there is a major question regarding
some implied suggestion that the expression of a unary '*' operator
and its operand must have one of: Function type, object type, else
undefined. But not incomplete type, as in 'void'.

Would you agree that you believe that 6.5.3.2 implies that the
expression must have one of those two types, despite the further
sentence which appears to define the type, also? Is it remotely
possible that this sentence (the third of p4) can define the type
independently of the previous sentence, or is that not possible?

Shao Miller · Jul 26, 2010

No.

Where does this response come from? Are you suggesting that casting
to 'void' was not an original issue but came later on? If so, this
sounds like a good history lesson worth reading. I will try to find
it, since I could only "guess," above.

In the line
(void)func();
the (void) is not required and does nothing. The line could have
read:
func();
and the result would have been the same - a value would be
returned and discarded. Caveat: If your warning level is set
high enough the compiler might give you a warning about ignoring
a returned value.
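
A minimal sketch of the two forms being compared. The body of 'func' and the counter are assumptions added only so that the effect can be observed:

```c
#include <assert.h>

int call_count = 0;

/* Hypothetical function whose return value the caller does not need. */
int func(void)
{
    return ++call_count;
}

/* Calls func() twice, discarding the result both ways. */
void discard_both_ways(void)
{
    (void)func(); /* explicit discard; may silence an "unused result" warning */
    func();       /* implicit discard; the value is returned and dropped */
}
```

After 'discard_both_ways()' returns, 'call_count' is 2: both calls happened, and the cast changed nothing about the behaviour.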

Right... So I'd also guess that you don't share my aesthetic
preference for how we could discard a 'void *' result with a single
keystroke, given such warning levels and compilers, and iff we (and
implementations) _did_ agree to the allowance of dereferencing a 'void
*'. I only really meant to suggest a usefulness similar to that of
casting to 'void'. You have described one such situation where
warnings result from an implementation without casting to 'void'.

Ben Bacarisse · Jul 26, 2010

Shao Miller said:

On Jul 25, 9:02 pm, Ben Bacarisse <[email protected]> wrote:


For example, I'd bet that the popular consensus amongst C programmers is
that


int two_d_array[10][10];
two_d_array[0][20] = 42;

is well-defined, but it's hard to find well-reasoned arguments from C
experts to support that view.


<snip>

Just as everything else was all wrapped up...

Ok. My impression of 'two_d_array[0] + 20' would be that
'two_d_array[0]' should yield a result with type 'int *', with an
intermediary state as an 'int[10]'. So adding 20 should try to move
to the 20th 'int' element in the entire array object. I believe that
6.5.6,p2 excludes an aggregate type, such as an array type.

"For addition, either both operands shall have arithmetic type, or one
operand shall be a pointer to an object type and the other shall have
integer type."

So I don't think an 'int[10]' could be an operand, here.

The + is never applied to an array type, no. I don't see why that means
you think the + 20 is done in relation to the whole array. It seems
more logical to me that it is done in relation to two_d_array[0] and
that therefore the result is undefined.

<snip>

Shao Miller · Jul 26, 2010

Shao Miller said:

On Jul 25, 9:02 pm, Ben Bacarisse <[email protected]> wrote:


For example, I'd bet that the popular consensus amongst C programmers is
that
int two_d_array[10][10];
two_d_array[0][20] = 42;
is well-defined, but it's hard to find well-reasoned arguments from C
experts to support that view.


<snip>

Just as everything else was all wrapped up...

Only between you and me. I would worry that this response might be
accidentally misread as "there are no outstanding questions in the
thread." I doubt that's at all what you meant, here.

Ok. My impression of 'two_d_array[0] + 20' would be that
'two_d_array[0]' should yield a result with type 'int *', with an
intermediary state as an 'int[10]'. So adding 20 should try to move
to the 20th 'int' element in the entire array object. I believe that
6.5.6,p2 excludes an aggregate type, such as an array type.


"For addition, either both operands shall have arithmetic type, or one
operand shall be a pointer to an object type and the other shall have
integer type."


So I don't think an 'int[10]' could be an operand, here.


The + is never applied to an array type, no. I don't see why that means
you think the + 20 is done in relation to the whole array. It seems
more logical to me that it is done in relation to two_d_array[0] and
that therefore the result is undefined.

I cannot think of a way in which any evaluation of the second array
subscript operator does not demand evaluation of the first,
beforehand. Can you?

'two_d_array[0]' has type 'int[10]'. 'sizeof two_d_array[0] / sizeof
(int)' will confirm.

'two_d_array[0]' thus has array type. That means that for anything
other than 'sizeof' or '&', it becomes 'int *'. That pointer points
to a valid object and so does incrementing that pointer by '20'.

Ben Bacarisse · Jul 26, 2010

Shao Miller said:
Shao Miller <[email protected]> writes:

Ok. My impression of 'two_d_array[0] + 20' would be that
'two_d_array[0]' should yield a result with type 'int *', with an
intermediary state as an 'int[10]'. So adding 20 should try to move
to the 20th 'int' element in the entire array object. I believe that
6.5.6,p2 excludes an aggregate type, such as an array type.

"For addition, either both operands shall have arithmetic type, or one
operand shall be a pointer to an object type and the other shall have
integer type."


So I don't think an 'int[10]' could be an operand, here.


The + is never applied to an array type, no. I don't see why that means
you think the + 20 is done in relation to the whole array. It seems
more logical to me that it is done in relation to two_d_array[0] and
that therefore the result is undefined.

'two_d_array[0]' has type 'int[10]'. 'sizeof two_d_array[0] / sizeof
(int)' will confirm.

'two_d_array[0]' thus has array type. That means that for anything
other than 'sizeof' or '&', it becomes 'int *'. That pointer points
to a valid object and so does incrementing that pointer by '20'.

Yes, I know that is your position. There is no point in our repeating
the same things over and over.

Tim Rentsch · Jul 26, 2010

Shao Miller said:
[snip] How many times should a person read something before
asking for others to help by sharing their interpretations and their
reasoning for those interpretations?

Continue reading until the urge to post purely captious
rhetoric subsides.

Shao Miller · Jul 26, 2010

Shao Miller said:

[snip] How many times should a person read something before
asking for others to help by sharing their interpretations and their
reasoning for those interpretations?


Continue reading until the urge to post purely captious
rhetoric subsides.

I have to ask non-rhetorical questions like these because it is
intended to establish at least one of: an agreement or a disagreement.

Hey, if you don't think I'm trying to bring some ambiguities in the
draft with filename 'n1256.pdf' to attention, you'd be wrong. I'm
trying to do it at the same time as looking for an interpretation by
another C pedant (see the first couple of lines of post) which could
convince me that one particular interpretation can be demonstrated to
be less valuable than another. Why? Because there is always that
possibility that one interpretation is less valuable than another.
Many response posts have suggested that interpretations I have offered
are less valuable than theirs, so at least we're all in agreement
about differences in the value of interpretations.

Think again about an implementor for C. To target the highest
conformance possible, is a thorough study of a standard (or draft) of
C warranted or not?

You suggest that there is little significance in the discussion I've
presented. That's your opinion and you're welcome to it, clearly.
What I don't understand is why you (and others) think that it's
important to express that opinion repeatedly.

You "snipped" my saying that I perceived a "gross ambiguity" in
'n1256.pdf'. Why else would I even ask the seemingly silly question
about dereferencing a null pointer, then later about a 'void *'?
Because it's important to me that the perceived ambiguity has at least
one of:
1. The perception corrected, or
2. the ambiguity corrected

Because I enjoy C.

That I need to use "I" this much in posts means the focus is in the
wrong place.

Seebs · Jul 26, 2010

Shao Miller said:

'two_d_array[0]' thus has array type. That means that for anything
other than 'sizeof' or '&', it becomes 'int *'. That pointer points
to a valid object and so does incrementing that pointer by '20'.


Yes, I know that is your position. There is no point in our repeating
the same things over and over.

But since I haven't pointed it out yet:

Type is not all there is to a pointer. There is also a question of what
object the pointer points into. two_d_array[0] is an array[10], so when
it decays into a pointer, it decays into a pointer into that array of 10
objects. An implementation is welcome to detect attempts to move outside
the boundaries of that array.

To put it another way:

int two_d_array[10][10];

/* should be fine, this is guaranteed to be true */
assert((two_d_array[0] + 10) == &(two_d_array[1][0]));
two_d_array[1][0] = 1; /* valid */
two_d_array[0][10] = 1; /* undefined behavior */

Comparison between pointers does not compare their boundaries. (The assert
is safe because you are allowed to calculate the address of the object one
past the end of an array.)
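
For anyone who accepts that reading, one way to reach the int at flat offset 20 without ever leaving a row is to split the offset into a row and a column. This is only a sketch; 'ROWS', 'COLS' and the function names are introduced here for illustration:

```c
#include <assert.h>
#include <stddef.h>

#define ROWS 10
#define COLS 10

int two_d_array[ROWS][COLS];

/* Stores v at flat position i (0..99) using only in-bounds subscripts. */
void set_flat(size_t i, int v)
{
    assert(i < (size_t)ROWS * COLS);
    two_d_array[i / COLS][i % COLS] = v;
}

/* The comparison from the post: one past the end of row 0 against the
   start of row 1. Only addresses are computed; nothing is dereferenced. */
int one_past_row0_equals_row1(void)
{
    return (two_d_array[0] + COLS) == &two_d_array[1][0];
}
```

'set_flat(20, 42)' stores into 'two_d_array[2][0]', which is the element that 'two_d_array[0][20]' was presumably aiming at.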

-s

Tim Rentsch · Jul 26, 2010

[snip] The name of the type of void * can be considered
to be "pointer to void" since that is what the syntax indicates.
None-the-less something that has type void * is not a pointer to
void. In the code

void * p;

p is a pointer to a storage object of unspecified type. That is,
the type of p is "pointer to a storage object of unspecified
type".

The essence of the matter is that "void *" and "void" are
primitive types. Despite the similarity in names they are
unrelated.

As a way of looking at things I think this point of view is not
unreasonable. However, it is at odds with how the Standard
itself describes the relationship between void and void* (which
is just the same as any other type and its derived pointer type).
It's not possible to access an object that has type void (and
even if it were doing so wouldn't do anything) but that doesn't
mean the notion of a void object is automatically nonsensical.
By analogy, if we have a pointer 'struct void_s *pvs;', where
'struct void_s' is never defined, there certainly can be pointers
to objects that are 'struct void_s' objects as far as the
compiler is concerned and as far as the code that uses them can
tell. The situation with void and void* is different only in
that converting to (void) has a special meaning, and void* is
freely interconvertible with other pointer types (with the
obvious qualifiers [no pun intended] about const etc). Array
types provide another example:

typedef unsigned char UCA[];

The type UCA is very much like void, an incomplete type that will
never be completed, and also having the property that objects of
the type cannot be accessed directly but only through other kinds
of pointer type. Despite that surely there can be objects of
type UCA. Is it unreasonable to use similar reasoning with void?
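
As a rough illustration of the analogy, here is a sketch that handles an object through a pointer to the never-completed type UCA and, separately, through a 'void *'. The function names are hypothetical. In both cases the actual access goes through some other pointer type, which is the property described above:

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char UCA[]; /* incomplete array type, as in the post */

/* Reads element i through a pointer to the incomplete type UCA.
   '*p' has type 'unsigned char[]' and converts to 'unsigned char *',
   so the element is reached by way of that converted pointer. */
unsigned char read_via_uca(UCA *p, size_t i)
{
    return (*p)[i];
}

/* Round-trips an int pointer through 'void *' and reads the int back. */
int read_via_void(void *p)
{
    int *ip = p; /* 'void *' converts back without a cast */
    return *ip;
}
```

Neither 'sizeof *p' for a 'UCA *p' nor '*p' for a 'void *p' used as a value would be accepted, which is where the two incomplete types behave alike.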

Shao Miller · Jul 26, 2010

Shao Miller said:

'two_d_array[0]' thus has array type. That means that for anything
other than 'sizeof' or '&', it becomes 'int *'. That pointer points
to a valid object and so does incrementing that pointer by '20'.


Yes, I know that is your position. There is no point in our repeating
the same things over and over.


But since I haven't pointed it out yet:

Type is not all there is to a pointer.
Agreed.

There is also a question of what
object the pointer points into.
Agreed.

two_d_array[0] is an array[10],

Agreed. Taking as valid our declaration, the sub-expression
'two_d_array[0]' is an expression with the type "array of int with ten
elements." Another useful notation in addition to Peter's could be
'int[10]'. Section 6.7.6 of the C standard draft with filename
'n1256.pdf'.

so when
it decays into a pointer, it decays into a pointer into that array of 10
objects.

Agreed...

It could possibly be confusing for a reader of the draft when
regarding "array type" versus "array object," if only "array" is
mentioned. For example, 5.1.2.2.1,p2 "argv array", where "object" is
perhaps intended (there's no mention of 'argv' as a type).

Also, if only "an array object" is specified (as in 6.5.6,p8), it
might not be clear to a reader whether _any_ array object will do, or
if only _a_particular_ array object will do.

Unfortunately, 6.5.6,p8 results in a circular definition for "Array
subscripting", 6.5.2.1,p2, which defines the subscript operator in
terms including pointer arithmetic via the binary '+' operator, which
in turn defines pointer arithmetic in terms including the "difference
of the subscripts". Oops.

For an object with "allocated" "storage duration" (6.2.4,p1), there
isn't even a "declared type" for an array object to work with, but
only the type of an lvalue used to access it (6.5,p6). If we accept
"only _a_particular_ array object" above, it might be difficult to
accept any accesses to elements within an array object with allocated
storage, since:

The effective type is the type of the lvalue used to access the
object, in the case of an object with allocated storage.
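
A sketch of the allocated-storage case, under the assumption (commonly relied upon, though it is part of what is being questioned here) that storage written through 'int' lvalues may be read back through an 'int(*)[10]' view. The function name and the -1 failure value are choices made for this example:

```c
#include <assert.h>
#include <stdlib.h>

/* Allocates room for 100 ints, stores v at flat offset k through an
   'int *', then reads it back through an 'int(*)[10]' view.
   Returns the value read, or -1 if allocation fails. */
int store_flat_read_rows(size_t k, int v)
{
    int *flat = malloc(100 * sizeof *flat);
    int (*rows)[10];
    int result;

    if (flat == NULL)
        return -1;
    assert(k < 100);
    flat[k] = v;              /* access through an lvalue of type int */
    rows = (int (*)[10])flat; /* same storage viewed as 10 rows of 10 */
    result = rows[k / 10][k % 10];
    free(flat);
    return result;
}
```

There is no declared array type anywhere in this function; the only bounds come from the size passed to 'malloc' and from the types of the lvalues used.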

An implementation is welcome to detect attempts to move outside
the boundaries of that array.

Agreed. Some bounds of arrays are known at translation-time and some
are not, also.

Furthermore, the decay Peter describes above looks like:

int[10] ---> int *

If an implementation attempts to keep track of the 'int *'-typed
result as "must point within an object with type 'int[10]'" instead of
discarding the bound, we can at least work around this by casting the
result to 'void *' or 'char *', then to 'int(*)[sizeof two_d_array /
sizeof two_d_array[0][0]]', then back to 'void *' or 'char *', then
back to an 'int *' like we started with. The middle-measure should
discard any bound the implementation might have been attempting to
track.

To put it another way:

int two_d_array[10][10];

/* should be fine, this is guaranteed to be true */
assert((two_d_array[0] + 10) == &(two_d_array[1][0]));
two_d_array[1][0] = 1; /* valid */
two_d_array[0][10] = 1; /* undefined behavior */

Well at least there's a work-around (detailed above) for concerns of
UB, here.

Comparison between pointers does not compare their boundaries. (The assert
is safe because you are allowed to calculate the address of the object one
past the end of an array.)

I would suggest that implementations should attempt to work with the
knowledge of which locations are valid for which object types
('two_d_array' being declared for 100 contiguous 'int's), OR to drop
any tracked bound during the conversion of 'int[10]' to 'int *'. Just
a suggestion.

In a four-dimensional array:

int fd[10][10][10][10];

'fd' (alone) is an expression with type 'int[10][10][10][10]', which
might become 'int(*)[10][10][10]'.
'fd[0]' (alone) is an expression with type 'int[10][10][10]', which
might become 'int(*)[10][10]'.
'fd[0][0]' (alone) is an expression with type 'int[10][10]', which
might become 'int(*)[10]'.
'fd[0][0][0]' (alone) is an expression with type 'int[10]', which
might become 'int *'.
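
The four conversions listed above can be sketched as initialisations. A compiler accepts each one without a diagnostic only when the converted type matches the declared pointer type. The function name and the 'out' parameter are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

int fd[10][10][10][10];

/* Fills out[0..3] with the number of ints spanned by one step of each
   converted pointer. 'sizeof' does not evaluate its operand here. */
void steps_in_ints(size_t out[4])
{
    int (*p3)[10][10][10] = fd;          /* from 'int[10][10][10][10]' */
    int (*p2)[10][10]     = fd[0];       /* from 'int[10][10][10]' */
    int (*p1)[10]         = fd[0][0];    /* from 'int[10][10]' */
    int *p0               = fd[0][0][0]; /* from 'int[10]' */

    out[0] = sizeof *p3 / sizeof(int);
    out[1] = sizeof *p2 / sizeof(int);
    out[2] = sizeof *p1 / sizeof(int);
    out[3] = sizeof *p0 / sizeof(int);
}
```

The step sizes come out as 1000, 100, 10 and 1 ints, so each level of subscripting narrows the pointed-to type by one dimension.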

Shao Miller · Jul 28, 2010

Shao Miller said:

Shao Miller <[email protected]> writes:


Ok. My impression of 'two_d_array[0] + 20' would be that
'two_d_array[0]' should yield a result with type 'int *', with an
intermediary state as an 'int[10]'. So adding 20 should try to move
to the 20th 'int' element in the entire array object. I believe that
6.5.6,p2 excludes an aggregate type, such as an array type.
"For addition, either both operands shall have arithmetic type, or one
operand shall be a pointer to an object type and the other shall have
integer type."
So I don't think an 'int[10]' could be an operand, here.
The + is never applied to an array type, no. I don't see why that means
you think the + 20 is done in relation to the whole array. It seems
more logical to me that it is done in relation to two_d_array[0] and
that therefore the result is undefined.


'two_d_array[0]' has type 'int[10]'. 'sizeof two_d_array[0] / sizeof
(int)' will confirm.


'two_d_array[0]' thus has array type. That means that for anything
other than 'sizeof' or '&', it becomes 'int *'. That pointer points
to a valid object and so does incrementing that pointer by '20'.


Yes, I know that is your position. There is no point in our repeating
the same things over and over.

Ben! I think I see your point! Is it roughly:

int main(void) {
    union {
        char bar[1];
        char baz[2];
    } foo;

    foo.baz[0] = 'z';
    foo.baz[1] = 'z';
    foo.bar[0] = 'r';
    return foo.bar[1]; /* Undefined behaviour */
}

? That the implementation is free to reject/diagnose the 'return'
line above as out-of-bounds?

Thanks for bringing this one up; it's been neat to think about.

Shao Miller · Jul 29, 2010

Just a summary of some of the valuable discussion from respondents.
"The wording" below usually means for either the C draft with filename
'n1256.pdf' or the C Standard.

In regards to '(void)*(char *)0;':

Richard Heathfield: {

- An implementation can handle "any way it likes"

- The expression in the expression statement is evaluated

- '*(char *)0' alone is undefined behaviour

- '*(char *)0' has the side effect of undefined behaviour

- Any evaluation with UB means UB regardless of use

- A null pointer is not a valid operand for unary '*', hence UB

- Experience and knowledge accumulated towards C is valuable

- Operators yield values during evaluation

- Membership during one's lifetime on the committee responsible for
the C Standard does not guarantee correctness of one's detail about C,
but makes it likely

}

Keith Thompson: {

- '(void)42' is well-defined

- The wording for unary '*' needs to be improved

- The intent is unambiguous

- The "has been assigned" wording only makes sense for a pointer
object as the operand

- In the wording, "pointer" is used to mean one or both of "pointer
value" and "pointer object"

- Likely a small mistake by the author of "has been assigned"

- The intent of "has been assigned" is to define undefined behaviour
for an invalid pointer value [regardless of any assignment]

- If the wording meant the case of an lvalue having been assigned an
invalid value, it could have been worded more clearly for that

- A result has a value, as well as a type

- The result of unary '*(char *)0' has no defined value

- A value cannot be assigned to the pointer operand of unary '*'
unless it's a pointer object

- The operand of unary '*' is not a pointer object [lvalue]

- The "has been assigned" wording was simply a mistaken assumption

- Expressions are not precisely defined by the wording. Keith
provides examples. Keith suggests that the definition for expressions
should refer to the syntax

}

Tim Rentsch: {

- It is undefined behaviour

- '*(struct foo*)0' is undefined behaviour for the same reason

- The draft's wording about the matter might be imprecise

- '*(char *)0' is undefined if evaluated

- The wording could be better but should not be interpreted the way
one might interpret a math textbook

- The pointer operand for unary '*' is never assigned, since it's a
value and not an lvalue. (I might have misremembered who brought this
up first during discussion. Sorry, Tim.)

- The wording for "has been assigned" thus cannot be taken literally

- Reading the Standard carefully and considering what it entails for
the abstract machine can answer questions about subjects like
'(void)*(char *)0;'

- Objections raised towards the C Standard can be considered captious

}

Christian Bau: {

- '*(char *)0' is well-defined to yield an lvalue

- When that lvalue becomes an rvalue, we get undefined behaviour

- C++ differs

- Don't do it if in doubt

- A compiler might attempt to read '(char *)0'

}

Daniel "Stargazer": {

- The location '0' might not be valid for an object of any type

- A null pointer is guaranteed not to point to an object

- Dereferencing a null pointer yields undefined behaviour

}

Ben Bacarisse: {

- '*(char *)0' is well-defined for use with '&'

- '*(char *)0' is well-defined for use with 'sizeof'

- The wording might allow an implementation to define the behaviour;
to optimize side-effect-free expressions

- [Something about] '*(char *)0' is undefined

- The wording omits definition of the case where the operand for unary
'*' does not point to an object [or function], so [evaluation of] the
result is undefined behaviour

- The wording about "has been assigned" is clumsy

- '*E' is only defined when 'E' points to an object or a function

- A null pointer points to neither an object nor a function

- The sentence in the wording for unary '*' about "If...the result is
a function designator; if...the result is an lvalue..." are the only
definitions for the result

- The sentence in the wording for unary '*' about "If the operand has
type..." is the only definition for the type [of the result, I assume]

- These two sentences are two separate attributes for the expression,
not three ["if object, if function, if type"]

- The type for an expression is defined in parts of the wording where
the result is not defined

- Knowing the type does not guarantee defined behaviour

- The '<<' and '>>' operators are an example of defined type but a
possibly undefined result/behaviour

- Failure to define behaviour in the wording means undefined behaviour

- Even knowing the type does not mean that the expression has a
defined result

- The result of '*p' is the entire object [when the operand is a
pointer to a type suitable for an object, presumably]

- A compiler needn't fetch anything from an object designated by '*p',
but might only fetch what is needed (maybe only a member)

- Because it's an expression, '*(char *)0' must either generate side-
effects, designate an object or a function, or specify the computation
of a value. Since it does none of those, it's undefined behaviour for
evaluation

- We know the type for '*(char *)0' is 'char', but we don't know which
'char', so it's undefined

- When the pointer operand for unary '*' points to neither a function
nor an object, there is no defined behaviour

- Every expression does not require a value. Ben provides an example
of casting to 'void'

- The wording should not be interpreted the way one might interpret a
math textbook. Common sense is required while reading

- The wording does not define "result" on its own, but it is taken by
most people to mean some notion of a quantity with a type

- A constraint for the unary '*' operator to point to either an object
or to a function is not possible

- There are three attributes of an expression and its [evaluated]
result: The quantity [value], the type, and whether it's an lvalue

- The type and lvalue-ness can be determined by syntax alone. [They
are translation-time determinable]

- The value is a run-time property

- The word "result" clouds the distinction between these three
attributes

- Revising the C Standard to make these distinctions would be a huge
challenge

- The "has been assigned..." wording for unary '*' should be
interpreted to mean an invalid pointer value

- The C Standard is more than a guide

- A "shall not be a null pointer constant..." constraint is not
possible for unary '*' since it is not determinable at compile time

- Type-only results are not mentioned [in the wording]

- There are interesting cases where the wording fails to give an
answer to a question about C

}
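
The first two points in Ben's list, about '&' and 'sizeof', can be sketched without evaluating the dereference at all. The function names are made up for this example, and the '&*' case leans on 6.5.3.2,p3 of 'n1256.pdf':

```c
#include <assert.h>
#include <stddef.h>

/* 'sizeof' does not evaluate its operand here, so no dereference
   happens; only the type 'char' matters. */
size_t size_of_deref_null(void)
{
    return sizeof *(char *)0;
}

/* For '&*E', neither operator is evaluated and the result is as if
   both were omitted (6.5.3.2,p3), so this yields a null pointer. */
char *address_of_deref_null(void)
{
    return &*(char *)0;
}
```

In both functions the type of '*(char *)0' is used and its value never is, which fits the distinction drawn in the list between knowing the type and having a defined result.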

Richard Harter: {

- '*(char *)0' has a defined type

- Evaluation of '*(char *)0' is undefined

}

Peter "Seebs" Seebach: {

- '(char *)0' is a pointer that doesn't point to an object, so '*(char
*)0' is undefined behaviour for evaluation, regardless of use in a
cast, or of the value

- The wording might be poorly-phrased, but the intention is clear

- Failure to define behaviour in the wording means undefined behaviour

- '(char *)0' points to an object type

- "Has been assigned to" could be considered as being clumsy wording,
but obviously includes pointer values regardless of assignment

- The wording for unary '*' about "If...the result is a function
designator; if...the result is an lvalue..." could be considered the
sum total of definitions. No definition for the null pointer case
means undefined behaviour.

- The wording for unary '*' regarding "If the operand has type..."
could be considered to define '*(char *)0' to have an lvalue result,
but lack of designating an object would yield undefined behaviour

- Dereferencing null pointers yields undefined behaviour

- The Standard is clear on the matter

- The wording does not define behaviour for when the operand for unary
'*' does not point to an object [or function]

- The wording could doubtlessly be improved and can be asserted
confidently as being poor, but is not ambiguous and the meaning is
clear

- Only knowing the type of the result (and nothing else) means an
undefined result

- Implementation consistency makes the meaning of the wording for
unary '*' obvious

- 30 years of experience and knowledge accumulation makes the wording
for unary '*' obvious

- Membership during one's lifetime on the committee responsible for
the C Standard makes the wording obvious

- The committee responsible for the C Standard has always agreed that
dereferencing a null pointer is undefined behaviour

- An invalid pointer not pointing to an object [or function] causes
the undefined behaviour when an operand for unary '*'

- The wording is consistent about invalid pointers meaning pointers
not pointing to an object [or function]

- The Standard of C is the total definition for defined behaviour

- Value and type are two different things. We might even know the
value [of something] and not its type

- '(char *)' is a pointer to an object type

- 'sizeof' is magical because it uses a type but not a value. It
doesn't evaluate its operand's expression

}
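
A minimal sketch of the last point in the list above, that 'sizeof'
uses the type of its operand without evaluating it. The function names
are hypothetical, chosen only for illustration, and the operands are
assumed not to be variable length arrays (the one case in which C99
does evaluate the operand).

```c
#include <stddef.h>

/* Hypothetical helper: only the type of '*(char *)0' (char) is used.
   The operand of sizeof is not evaluated, so nothing is dereferenced. */
size_t size_of_null_deref(void)
{
    return sizeof *(char *)0;
}

/* Hypothetical helper: the assignment inside sizeof is never
   performed, because the operand is not evaluated. Some compilers
   warn about this, which is rather the point. */
int value_after_sizeof(void)
{
    int n = 0;
    size_t s = sizeof (n = 5);
    (void)s;            /* explicitly discard the unused size */
    return n;
}
```

Under those assumptions 'size_of_null_deref()' returns 1, since
'sizeof (char)' is 1 by definition, and 'value_after_sizeof()'
returns 0.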

Morris Keesan: {

- If evaluation of '*(char *)0' is undefined, so are any side-effects

}

Rich Webb: {

- Implementations might choose to define behaviour for evaluation of
'*(char *)0' or '*(void *)0'

- '*(char *)0' and '*(void *)0' are constraint violations

}

pete: {

- The C Standard only mentions values for expressions with an object
type

}

Concerning '*(void *)0':

Peter "Seebs" Seebach: {

- '*(void *)0' is a constraint violation

}

Richard Heathfield: {

- 'void *' cannot be dereferenced

- The C Standard does not define dereferencing a 'void *'

- The C Standard does define casting to 'void'

- Richard discusses an interesting example of C code which is disputed
to be either well-behaved or undefined behaviour

}

Richard Harter: {

- 'void' and 'void *' are hacks, developed during standardization of C
to deal with generic pointers and functions returning nothing

- Dereferencing almost universally means that something of type 'T *'
[as an operand] yields something of type 'T'

- A 'void *' is a pointer to a storage object of unspecified type

- 'void' and 'void *' are unrelated

- Casting to 'void' was not the solution to any issue of discarding an
uninteresting result during development of the C Standard

- It is easier to call a function and discard its result than to
explicitly discard the result [using additional code]

- Compilers might warn about implicitly discarding a function call's
result

}
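
A rough sketch of the 'T *' yields 'T' point and of 'void *' as a
generic pointer, assuming the caller really does pass the address of
an 'int' object. The helper names are hypothetical and not taken from
anyone's post.

```c
/* Hypothetical helper: stores 'value' through a generic pointer that
   the caller promises points to an int object. */
void store_int(void *generic, int value)
{
    /* The pointer is converted back to 'int *' first. With an operand
       of type 'T *', unary '*' yields an lvalue of type 'T' (int). */
    int *p = generic;   /* implicit conversion from 'void *' */
    *p = value;
}

/* Hypothetical helper: reads the int back through a generic pointer,
   again converting to an object pointer type before dereferencing. */
int load_int(const void *generic)
{
    return *(const int *)generic;
}
```

Neither helper applies unary '*' to the 'void *' itself, which is the
case the posters above dispute.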

Ben Bacarisse: {

- Evaluating an expression and throwing away the result is commonly
done

- Dereferencing a pointer to get nothing is useless

- The purpose of dereferencing is to recover an object or function
designator

- 'void' is not an object [or function] type, so unary '*' application
is counter-intuitive and undefined

- A cast converts and a pointer dereference interprets

- Casting and dereferencing both have definitions for type

- Defining a result for evaluation of unary '*' with a 'void *'
operand would allow for bugs but might not yield any gain

- Type is a static property of an expression form (roughly compile-
time versus run-time)

- Reasoning, logic, and common sense are all better than popular
consensus, when considering a potential case of ambiguity [in the C
Standard/draft]

- There are formal defect reports for the C Standard/draft

- A null pointer can point to neither an object nor a function

- It is unlikely that an additional constraint for the unary '*'
operator, to help clarify it, will be added to the C Standard/draft

- A constraint to prevent dereferencing null pointer constants (or
casts of such) is not likely to help C programmers, since they avoid
doing so anyway

- Allowing unary '*' to yield a void expression has a value for C that
cannot be used

- Discarding a 'void *' result from a function call is easier to
accomplish by simply calling the function versus dereferencing that
'void *' to accomplish a void expression [were that "defined," which
is disputed]

- That unary '*' defines the type of result as 'void' upon a 'void *'
operand is insufficient to produce a void expression

- Conversions should be discussed for values only; such would be
confusing for discussion of types

- The result of application of unary '*' to anything other than a
function pointer [type] or an object pointer [type] is undefined

- It is pointless to know the type of a result if the result is not
defined

- Allowing to dereference a 'void *' would constitute a relaxation of
a current rule, which allows for currently invalid programs to become
valid. There appears to be little benefit for such an allowance

- Ben provides an interesting example of C code which is disputed as
being well-defined or undefined behaviour

- Types are partitioned into three groups: Incomplete, object, and
function

- Casts specify conversions, but the wording for the cast operator
does not define a result. The section about conversions defines the
results

}
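
One possible reading of "a cast converts and a pointer dereference
interprets", sketched in C. The helper names are hypothetical, and the
second helper relies on the rule that any object may be inspected
through an 'unsigned char' lvalue.

```c
/* Hypothetical helper: the cast converts the value of 'i' to the
   nearest representable float value; no stored bytes are reread. */
float convert_value(int i)
{
    return (float)i;
}

/* Hypothetical helper: the dereference interprets the byte stored at
   the lowest address of '*p' as an unsigned char. The pointer cast
   converts the pointer; the unary '*' does the interpreting. */
unsigned char lowest_addressed_byte(const int *p)
{
    return *(const unsigned char *)p;
}
```

For example, 'convert_value(3)' gives 3.0f. Assuming a 32-bit 'int'
with 8-bit bytes and no padding bits, passing the address of an 'int'
holding 0x01010101 to 'lowest_addressed_byte' gives 1 whatever the
byte order.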

Tim Rentsch: {

- Comparing 'void' and 'void *' to other 'type' and 'type *'
relationships might be reasonable from reading the wording

- Tim provides examples of types which either are not or cannot be
completed, but which pointer types for pointing-to can exist and be
used in C code

}

A thought that still sits with me is from 6.3.2.2:

"The (nonexistent) value of a void expression (an expression that has
type void) shall not be used in any way, and implicit or explicit
conversions (except to void) shall not be applied to such an
expression. If an expression of any other type is evaluated as a void
expression, its value or designator is discarded. (A void expression
is evaluated for its side effects.)"

In retrospect, the majority of my questions and points seem to have
had this piece of the draft as their basis, didn't they?

Nobody agreed to a suggestion that if the type for an expression is
defined to be 'void', that expression is a 'void' expression by
definition, period; that ((no value) is defined) versus (no (value is
defined)). That's fine, isn't it?

Nobody touched on any other form of void expression beyond a function
returning 'void', or a cast to 'void', until simply 'func();' was
mentioned, but it was not explicitly identified as a void expression
(6.8.3,p2). That's fine, isn't it?
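
For reference, a small sketch of the three forms of void expression
mentioned here: a call to a function returning 'void', a cast to
'void', and a plain expression statement such as 'func();' (6.8.3,p2).
The function names and the counter are hypothetical, made up for the
example.

```c
static int counter;     /* hypothetical state, to make side effects visible */

int bump(void)          /* returns a value the caller may discard */
{
    return ++counter;
}

void reset(void)        /* returns nothing at all */
{
    counter = 0;
}

int demo_void_expressions(void)
{
    reset();        /* call to a function returning void */
    bump();         /* expression statement: the int value is discarded */
    (void)bump();   /* cast to void: the value is explicitly discarded */
    return counter; /* only the side effects remain */
}
```

Each statement is evaluated for its side effects only, so
'demo_void_expressions()' returns 2.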

I find it most unfortunate that some of the discussion involved
misinterpreted "tone" and my use of English for perceived purposes of
offense in my posts, and that some of the discussion involved insults
or accusations of lesser intelligence or reasoning faculties. That's
likely a risk in any Usenet group, however. Life moves on, doesn't
it?

As for the valuable discussion above, now's as good a time as any to
pass on my most sincere thanks for the time and effort of those who
attempted to provide answers and clarification.

It was valuable of you all to do so, for what that's worth. Learning
happened, didn't it? Thankyouthankyouthankyouthankyouthankyou!

Shao Miller · Jul 29, 2010

Please forgive any errors, omissions, and lack of context that might
be in the previous "summary" post. Any such are unintentional. The
context of the original posts should be considered the only authority
for interpretation of meaning of the "summary" points. Thanks.

Ben Bacarisse · Jul 29, 2010

Shao Miller said:
On Jul 26, 10:52 am, Ben Bacarisse <[email protected]> wrote:

Yes, I know that is your position. There is no point in our repeating
the same things over and over.

Ben! I think I see your point! Is it roughly:

int main(void) {
    union {
        char bar[1];
        char baz[2];
    } foo;

    foo.baz[0] = 'z';
    foo.baz[1] = 'z';
    foo.bar[0] = 'r';
    return foo.bar[1]; /* Undefined behaviour */
}

?

No. I see no connection to what I was saying about 2D arrays.

That the implementation is free to reject/diagnose the 'return'
line above as out-of-bounds?

I can't see any permission to do that in the standard.

Shao Miller · Jul 29, 2010

Shao Miller said:

On Jul 26, 10:52 am, Ben Bacarisse <[email protected]> wrote:
Ben! I think I see your point! Is it roughly:

int main(void) {
    union {
        char bar[1];
        char baz[2];
    } foo;

    foo.baz[0] = 'z';
    foo.baz[1] = 'z';
    foo.bar[0] = 'r';
    return foo.bar[1]; /* Undefined behaviour */
}

?

No. I see no connection to what I was saying about 2D arrays.

That the implementation is free to reject/diagnose the 'return'
line above as out-of-bounds?

I can't see any permission to do that in the standard.

Ok, I misunderstood. Thanks anyway, Ben! It was still neat to think
about.

Ben Bacarisse · Jul 29, 2010

Shao Miller said:
Just a summary of some of the valuable discussion from respondents.

Why not summarise what you have concluded to be the standard's meaning,
rather than paraphrasing other people and taking their comments out of
context (which often makes them unintelligible)?

<snip>

Ben Bacarisse · Jul 29, 2010

Shao Miller said:
Please forgive any errors, omissions, and lack of context that might
be in the previous "summary" post. Any such are unintentional. The
context of the original posts should be considered the only authority
for interpretation of meaning of the "summary" points. Thanks.

I agree, but why post it then?

Shao Miller · Jul 29, 2010

Why not summarise what you have concluded to be the standard's meaning,
rather than paraphrasing other people and taking their comments out of
context (which often makes them unintelligible)?

<snip>

Ok, perhaps I'll try that next time... Unless you want one now, for
this subject. Hopefully the summary does demonstrate how valuable the
discussion was, however.

Shao Miller · Jul 29, 2010

I agree, but why post it then?

Because if I'd come across a thread like this when I first had these
questions, a summary post like that would give me a quick digest. It
might answer the questions right away, or I might have to dig deeper
into the thread. However it's important to be sensitive to
perceptions of misrepresentation, hence this "disclaimer" for those
who provided such good discussion.
