Shao Miller said:
On Jul 23, 11:36 pm, Ben Bacarisse <
[email protected]> wrote:
Ok. But I'd rather that even this was clearer.
1. There is a sentence which specifies a value for the result.
Check. 6.5.3.2 p4 "If the operand points to a function, the result is a
function designator; if it points to an object, the result is an lvalue
designating the object."
2. There is a sentence which specifies a type for the result.
Check (well there are two, in fact). 6.5.3.2 p2 "The operand of the
unary * operator shall have pointer type." And 6.5.3.2 p4 "If the
operand has type 'pointer to type', the result has type 'type'."
3. If the sentence regarding the value does not apply, the sentence
regarding the type is _insufficient_ to define a whole result.
Check. 4 p2 "Undefined behavior is otherwise indicated in this
International Standard by the words 'undefined behavior' or by the
omission of any explicit definition of behavior."
Underlying the specific issue you have is a problem that the standard
has never quite managed to resolve. There are three attributes that
matter about an expression and/or its "result": (a) the quantity (which
character, which integer, etc.), (b) the type, and (c) whether it is an
lvalue (there is a detail about whether it is also a modifiable lvalue
but let's simplify for the moment).
(b) and (c) can be determined from the syntactic form of the expression
along with some type analysis whereas (a) is a dynamic property of the
expression at run time. To use "result" for all of these clouds this
distinction and has led you to think that a "result" can be defined when
only the type is known.
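
A minimal sketch of that static/dynamic split, assuming a C11 compiler (the thread is discussing a C99 draft, which has no _Generic). The names KIND_OF, enum kind and kind_of_deref_null are made up for illustration. _Generic selects on the type of its controlling expression without evaluating it, so the null pointer below is never dereferenced:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical labels for a few types; not taken from the thread. */
enum kind { KIND_CHAR, KIND_INT, KIND_DOUBLE, KIND_OTHER };

/* Classifies the static type of an expression. The controlling
   expression of _Generic is not evaluated. */
#define KIND_OF(e) _Generic((e),   \
        char:    KIND_CHAR,        \
        int:     KIND_INT,         \
        double:  KIND_DOUBLE,      \
        default: KIND_OTHER)

/* The type of *E is available at translation time, even though
   evaluating *E here would be undefined. */
static_assert(KIND_OF(*(int *)0) == KIND_INT,
              "type of *E is known without evaluating it");

/* Returns the kind of the expression *p where p is a null pointer.
   Only the type of *p is used; no object is accessed. */
enum kind kind_of_deref_null(void)
{
    double *p = NULL;
    return KIND_OF(*p);
}
```

The type falls out of the form of the expression alone. Nothing here says anything about what evaluating *p would produce.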
I'd prefer the wording to be done like this:
Form: *E
Constraints: The operand, E, must have type 'pointer to T'.
Type: An expression of the form *E has type T and is an
lvalue if T is an object type.
Result: If the result of evaluating E is a pointer to a function,
the result is a function designator denoting the pointed
to function. If the result of evaluating E is a
pointer to an object, the result denotes that object.
It would then be clear that the type is not really "part of the result"
but a property of the expression form -- something essentially static
and not associated with the evaluation. I'm not suggesting it -- the
work would be monstrous and there would be endless details to get right
(variably modified array types spring to mind) but this highlights what
the current wording is dealing with.
Of course, the way it is done now is much more intuitive. For most
expressions, it suggests that the result is a quantity tagged with a type
and lvalue-ness. But this does not work for sizeof, for example. It
does not (usually) evaluate its operand so the dynamic view of a
type-tagged result has no meaning. People know that the type can be
determined without evaluation so they apply common sense to understand
the sizeof operator. It's a shame that the wording is not perfect, but
it is not nearly as confusing as you seem to think.
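
The sizeof point can be illustrated with a short sketch, assuming C11 for static_assert. The function name sizeof_side_effects is hypothetical. The operand is used for its type only, so neither the null pointer dereference nor the increment is ever executed:

```c
#include <assert.h>
#include <stddef.h>

/* sizeof needs only the type of its operand, so the operand is not
   evaluated (unless it has variable length array type). */
static_assert(sizeof *(double *)0 == sizeof(double),
              "sizeof *E depends only on the type of *E");

/* Returns the value of the counter after it has appeared as the
   operand of sizeof. The increment is never performed. */
int sizeof_side_effects(void)
{
    int counter = 0;
    size_t n = sizeof (counter++);  /* counter++ is not evaluated */
    (void)n;                        /* silence unused-variable warnings */
    return counter;
}
```

Some compilers warn about a side effect in an unevaluated operand, which is the same observation made by the tool rather than the reader.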
<snip another discussion about void expressions. I don't want to get
into that here>
Agreed conditional upon acceptance of at least one of:
1. "...has been assigned..." really means something more like "is an
invalid value"
That is what most people take it to mean. Why? Because making special
provision for when a pointer is an object that has been assigned to
makes no sense when taken literally. Given:
const int *ip = 0;
*ip would not be covered but it would be after:
int *ip;
ip = 0;
Both would remain undefined by omission, so what value would the literal
interpretation serve?
OR:
2. Casting to 'void' and application of the unary '*' operator are
treated differently. Both may fail to define a value for the result
of an evaluation, but the cast is permitted as defined behaviour.
I don't see how this makes any difference but it does not matter because
I'm choosing (1) not (2)!
Each of these points feels like a blow, including any failure on my
part to treat the referenced draft as anything more than a guide to be
supplemented by popular consensus.
Hmmm. Now I doubt your sincerity again. Read what you wrote. You are
suggesting that I (and indirectly Tim) want you to treat the largely
unpaid work of dozens of experts over more than two decades as no more
than a guide to the language.
Furthermore, neither of us is suggesting that "popular consensus" is
the main tool to be used when there is ambiguity. That would be absurd.
Do you see how that comes over?
"Common sense" meaning "popular interpretation" to me. Very well;
accepted.
I think you need to refine your understanding of the term "common
sense".
If writing a translator, I might have a 'struct result' with a pointer
to a type and a pointer to a value. I might initialize these with
NULL each. If an "operator" for a 'struct result' demanded one of
these properties but it was not defined, I might diagnose undefined
behaviour.
That's hard to get right though it can be done. You need to make sure
that sizeof (1/0) works properly and that you distinguish between
"plain" values and lvalues. Your eval function needs flags to say what
sort of "evaluation" to do. In the example, 1/0 needs to be "evaluated"
for its type alone. I put evaluation in quotes because it is not C's
notion of evaluate but one that comes from the interpreter you are
writing.
It simply seemed to me that there were circumstances in
which some code path for "evaluation" might not ever use one of the
properties, which would lead me to question the validity of diagnosing
as UB if one compares as NULL but there was no expectation for it to
be non-NULL... Such as a void expression, which appears to be more
limited than I thought (casts to void and functions returning void,
for example).
I don't follow all of this. Yes, there are cases where one would not
ever use one of the properties such as in sizeof (1/0), but in
(void)(1/0) you need to evaluate 1/0 for its value property so that you
can throw it away.
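
A toy sketch of the "evaluation with flags" idea, under loud assumptions: every name here (struct node, struct result, enum mode, eval, eval_wrapped_div_by_zero) is invented for illustration, only int constants, division, sizeof and a cast to void are modelled, and the result carries flags rather than the pair of pointers described above. It is meant only to show the two cases side by side: sizeof asks for the type of its operand, while the void cast asks for the value and then discards it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum type { TY_INT, TY_SIZE, TY_VOID };
enum mode { MODE_TYPE_ONLY, MODE_VALUE };   /* what the caller wants */
enum node_kind { N_CONST, N_DIV, N_SIZEOF, N_VOID_CAST };

struct node {
    enum node_kind kind;
    long value;                    /* used by N_CONST only */
    const struct node *lhs, *rhs;  /* operands; rhs used by N_DIV only */
};

struct result {
    enum type type;   /* always filled in: a static property */
    bool has_value;   /* true only if a value was computed */
    long value;
    bool undefined;   /* value evaluation hit undefined behaviour */
};

static long size_of(enum type t)
{
    switch (t) {
    case TY_INT:  return (long)sizeof(int);
    case TY_SIZE: return (long)sizeof(size_t);
    default:      return 0;        /* void has no size in this toy */
    }
}

/* Toy evaluator; overflow such as LONG_MIN / -1 is not modelled. */
struct result eval(const struct node *n, enum mode mode)
{
    struct result r = { TY_INT, false, 0, false };
    assert(n != NULL);
    switch (n->kind) {
    case N_CONST:
        if (mode == MODE_VALUE) { r.has_value = true; r.value = n->value; }
        break;
    case N_DIV:
        if (mode == MODE_VALUE) {
            struct result a = eval(n->lhs, MODE_VALUE);
            struct result b = eval(n->rhs, MODE_VALUE);
            if (a.undefined || b.undefined || b.value == 0)
                r.undefined = true;          /* e.g. 1/0 */
            else { r.has_value = true; r.value = a.value / b.value; }
        }
        break;
    case N_SIZEOF: {
        /* The operand is "evaluated" for its type alone. */
        struct result t = eval(n->lhs, MODE_TYPE_ONLY);
        r.type = TY_SIZE;
        if (mode == MODE_VALUE) { r.has_value = true; r.value = size_of(t.type); }
        break;
    }
    case N_VOID_CAST:
        r.type = TY_VOID;
        if (mode == MODE_VALUE) {
            /* The operand's value is computed, then thrown away. */
            struct result v = eval(n->lhs, MODE_VALUE);
            r.undefined = v.undefined;
        }
        break;
    }
    return r;
}

/* Builds 1/0, wraps it in the given operator, and evaluates for value. */
struct result eval_wrapped_div_by_zero(enum node_kind wrapper)
{
    struct node one  = { N_CONST, 1, NULL, NULL };
    struct node zero = { N_CONST, 0, NULL, NULL };
    struct node div  = { N_DIV, 0, &one, &zero };
    struct node top  = { wrapper, 0, &div, NULL };
    return eval(&top, MODE_VALUE);
}
```

In this sketch eval_wrapped_div_by_zero(N_SIZEOF) comes back defined, with the size of int as its value, while eval_wrapped_div_by_zero(N_VOID_CAST) is flagged undefined because the division had to be performed before its value could be discarded.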
<snip>
| It would be nice if a constraint for the
| unary '*' operator were that the pointer must either point to an
| object or to a function. Perhaps some kind reader could introduce
| such a constraint into a future standard.
A constraint that: "except when the '*' operator is used as the
operand to the 'sizeof' operator, an expression evaluating to a null
pointer constant or to a null pointer constant cast to any pointer
type shall not be the operand," might do, mightn't it?
Either this can't be a constraint (because you mean to include something
that can't be tested at compile time) or you have now changed the
suggestion to catch only a few cases. It all depends on what you mean
by "evaluating to a null pointer constant or to a null pointer constant
cast to any pointer type".
Yes. The function call assigns the value of an argument to the 'ip'
parameter. Passing in an invalid value would result in UB.
You proposed a constraint (which I have put back since you cut it)
"were that the pointer must either point to an object or to a
function". My example (yes, we both agree it is UB) shows that you
can't tell *at compile time* if the constraint you originally proposed
is or is not violated.
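
A hypothetical illustration of why that cannot be a constraint (this is not the example that was snipped from the thread; read_through and read_if are invented names). Whether *ip is defined inside read_through depends on what the caller passes, and that can depend on a run-time value, so the best a program can do is check at run time:

```c
#include <assert.h>
#include <stddef.h>

/* Whether *ip is defined depends on the argument, which is not
   known when this function is translated. */
int read_through(const int *ip)
{
    assert(ip != NULL);   /* a run-time check, not a constraint */
    return *ip;
}

/* The pointer chosen here depends on a run-time value. Returns the
   pointed-to object's value, or -1 when no object is selected. */
int read_if(int use_object)
{
    static const int object = 42;
    const int *ip = use_object ? &object : NULL;
    return ip ? read_through(ip) : -1;
}
```

A translator looking at read_through alone has no way to decide whether ip will point to an object, which is what a constraint would require.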
Well actually, it does explain what you can do with the results. I
had made earlier references to these. "Cast operators"' first
constraint says "Unless...the operand shall have scalar type". Its
first semantic point talks about "the value of the expression."
"Simple assignment" talks about "type" and "value" for the
"operands". That explicitness (along with void expressions) was part
of why a result was not required to have both, against the consensus
here.
There is no mention of your suggested type-only results. They would be
a major part of the language. How do they work? What can we do with
them?
However it was the _consumers_ of the results that I was taking
to give meaning to constraint-valid and semantically valid
expressions. The consensus appears to be that the results are defined
or not, regardless of the consumers or their properties, except for
'sizeof' (which nobody has disputed).
If it is a consensus, it is born out of the wording. An evaluation of
*E is defined when E points to an object or a function. (char *)0 does
neither. To allow the fact that the expression form has a type to mean that
it also has some sort of valueless, type-only result is to invent a whole
new language.
<snip>