Anand Hariharan said:
You are over-thinking this. What I ask is not anything remotely as
expensive, definitely not meriting a new keyword or a significant
language change. In fact, as Ben showed in his post, what I propose
would be fairly easy to subvert (only requiring some additional
syntax), and I am not convinced that anyone would have a valid
use case to subvert it.
Let me reproduce what I wrote earlier, this time also indicating what
I would like it to be:
int a;
int *p = &a;
extern int n;
p; /* Okay */
p + 1; /* Okay */
p + 100; /* UB, but not CV */
p + n; /* May or may not be UB */
Ok, so far that's the way it is now (as you know).
&a; /* Okay */
&a + 0; /* Currently neither UB nor CV, but should be CV */
&a + 1; /* Currently neither UB nor CV, but should be CV */
&a + 100; /* Currently neither UB nor CV, but should be CV */
&a + n; /* May or may not be UB. Currently not CV, but should be CV */
Let me try to state it a bit more precisely.
Constraint (proposed):
The operand of any arithmetic operation shall not be of the
form &obj, where obj is an identifier that is the name of
a declared object. (Presumably it should also apply to any
member of a struct or union.) Note: This affects operations
of the form pointer+integer, integer+pointer, pointer-integer,
and pointer-pointer.
What I am convinced about:
* That's how the language is.
I don't know what you mean by this. You explicitly acknowledged above
that this *isn't* how the language is ("Currently neither UB nor CV").
Can you clarify this point?
* The language stays simple.
But not quite as simple as it is now; it requires a new restriction to
be added to the existing language.
It also means that two expressions with exactly the same type and
value would be treated very differently: pointer arithmetic on ``&a''
is a constraint violation, but pointer arithmetic on ``p'' is ok.
That's not a fatal flaw, but I'm uncomfortable with the idea.
(Note that there's already a case of this: a constant 0 yields a
null pointer when converted to a pointer type, but a non-constant
expression of type int with the value 0 may or may not do so.
I'm not a big fan of that either, but we're firmly stuck with it.)
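To make that existing asymmetry concrete, here is a minimal sketch.
The function names are made up for illustration, and the result of the
second conversion is implementation-defined, so I make no claim about
what it returns on any given system:

```c
#include <stddef.h>

/* Hypothetical name. The literal 0 is a null pointer constant, so the
   conversion is guaranteed to yield a null pointer. */
int constant_zero_is_null(void)
{
    int *p = 0;
    return p == NULL;   /* always 1 */
}

/* Hypothetical name. Here the operand is a non-constant expression of
   type int whose value happens to be 0. The result of the conversion
   is implementation-defined; it compares equal to NULL on most
   systems, but the standard does not promise that. */
int variable_zero_is_null(void)
{
    int zero = 0;
    int *q = (int *)zero;   /* may draw a compiler warning */
    return q == NULL;
}
```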
What I am not convinced of yet [NB: In all the below statements,
pronouns such as 'it', 'this' etc. refer to "doing pointer
arithmetic on a pointer value yielded by applying the unary & operator
on a single object"]:
* Why anyone would want to do this.
Suppose you have a function that takes two pointers, one to the first
element of an array on which it's to act and one just past the last
element of the same array. (Think C++ iterators.) To process an array:
some_type arr[N];
func(arr, arr+N);
To process a single object:
some_type obj;
func(&obj, &obj+1);
Why forbid this useful construct?
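Here is a minimal sketch of the kind of function I have in mind.
The names (sum_range and the two demo functions) are hypothetical, and
I've used int in place of some_type. It relies on the existing rule
that, for pointer arithmetic, a single object behaves like an array of
one element, so ``&obj + 1'' is a valid one-past-the-end pointer as
long as it is never dereferenced:

```c
#include <stddef.h>

/* Hypothetical helper: sums the ints in the half-open range
   [first, last). `last` is only compared, never dereferenced. */
long sum_range(const int *first, const int *last)
{
    long total = 0;
    for (const int *it = first; it != last; ++it)
        total += *it;
    return total;
}

/* Whole array: arr + 4 points just past the last element. */
long sum_array_demo(void)
{
    int arr[4] = { 1, 2, 3, 4 };
    return sum_range(arr, arr + 4);
}

/* Single object: &obj + 1 points just past obj. This is the
   expression the proposed constraint would reject. */
long sum_object_demo(void)
{
    int obj = 42;
    return sum_range(&obj, &obj + 1);
}
```

Both demo functions go through exactly the same loop; the callee cannot
tell whether it was handed an array or a lone object.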
* Why ratify the language definition by incorporating a new clause in
the standard that actually highlights a language loophole and pretty
much gives license to a programmer to do this.
Um, why not?
"Trust the programmer." -- C99 Rationale, page 3.
* Why it is insurmountably difficult for an implementation to
identify this, or impossible without changing the language in a
drastic or incompatible way.
Ok, it's not terribly difficult to add a very limited check for this
kind of thing. But it's far too easy to work around the check.
If you could change the language in a way that would detect such
errors with some reliability, it would be worth considering.
But under your proposal, as soon as you copy an address into a
pointer variable, there's no trace of its origin and any "misuse"
is undetectable.
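For example, something like the following sketch (the function name is
hypothetical) computes exactly the value that ``&a + 1'' would, yet
contains nothing the proposed constraint could object to:

```c
#include <stddef.h>

/* Hypothetical name. Returns the distance between a single object's
   address and its one-past-the-end pointer. */
ptrdiff_t past_the_end_distance(void)
{
    int a = 0;
    int *p = &a;        /* plain unary &: allowed under the proposal */
    int *end = p + 1;   /* same value as &a + 1, but the operand is
                           the variable p, not an expression of the
                           form &obj, so the constraint is silent */
    return end - p;     /* 1 */
}
```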
* How requiring an implementation to issue a diagnostic when it
detects this would seriously impact existing code -- code that one
could argue is already broken.
It would be interesting to look for examples of pointer arithmetic
on single object addresses in existing code, and determine how
many cases are deliberate and how many are unintentionally broken.
I have no idea what the results of such a survey might be.