In said:
I find this a bit upsetting, if true. This means that we can have two
pointers that compare equal, one of which is known to point to a valid
object, and yet dereferencing the other has undefined behaviour.
Yup, C99 *explicitly* mentions this possibility (6.5.9, Equality
operators):
6 Two pointers compare equal if and only if both are null pointers,
both are pointers to the same object (including a pointer to an
object and a subobject at its beginning) or function, both are
pointers to one past the last element of the same array object,
or one is a pointer to one past the end of one array object and
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
the other is a pointer to the start of a different array object
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
that happens to immediately follow the first array object in
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
the address space.91)
^^^^^^^^^^^^^^^^^
____________________
91) Two objects may be adjacent in memory because they are
adjacent elements of a larger array or adjacent members of
a structure with no padding between them, or because the
implementation chose to place them so, even though they
are unrelated. If prior invalid pointer operations (such as
accesses outside array bounds) produced undefined behavior,
subsequent comparisons also produce undefined behavior.
For example, in the following, loop 2 has (according to the above)
undefined behaviour, while loop 3 does not.
char s1[3] = "123";
char s2[4] = "456";
if (s2 == s1 + sizeof s1) {
char *p = s1, *q = s2;
Stylistic issue: the code is much more readable if you name the pointers
p1 and p2, to be consistent with the way they are initialised.
/* loop 1 */
for (; p != q; p++) {
putchar(*p);
}
assert (p == q);
What for?!? Don't you trust the compiler to get the exit condition of
loop 1 right, or do you suspect that both != and == can evaluate to
false on the same pointer operands?
/* loop 2 */
for (; p != s2 + sizeof s2; p++) {
putchar(*p);
}
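You can increment p to one past the end of its object, but the result
can be neither dereferenced nor incremented further. Loop 2 starts
with p already one past the end of s1, so its very first *p is the
violation. From 6.5.6 (Additive operators):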
8 When an expression that has integer type is added to or subtracted
from a pointer, the result has the type of the pointer operand. If
the pointer operand points to an element of an array object,
and the array is large enough, the result points to an element
offset from the original element such that the difference of the
subscripts of the resulting and original array elements equals
the integer expression. In other words, if the expression P
points to the i-th element of an array object, the expressions
(P)+N (equivalently, N+(P)) and (P)-N (where N has the value n)
point to, respectively, the i+n-th and i-n-th elements of the
array object, provided they exist. Moreover, if the expression
P points to the last element of an array object, the expression
(P)+1 points one past the last element of the array object, and
if the expression Q points one past the last element of an array
object, the expression (Q)-1 points to the last element of the
array object. If both the pointer operand and the result point to
elements of the same array object, or one past the last element of
the array object, the evaluation shall not produce an overflow;
otherwise, the behavior is undefined. If the result points one
^^^^^^^^^^^^^^^^^^^^^^^^
past the last element of the array object, it shall not be used
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
as the operand of a unary * operator that is evaluated.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
There is no possible doubt that, according to the standard, your code
invokes undefined behaviour.
/* loop 3 */
for (; q != s2 + sizeof s2; q++) {
putchar(*q);
}
}
No problems here.
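For comparison, here is a rough sketch of how both arrays could be
printed without depending on where the implementation happens to place
them. The helper name put_chars is made up for illustration; the point
is only that each pointer stays within (or one past) the array it was
derived from and is never dereferenced at the one-past position.

```c
#include <stdio.h>

/* Hypothetical helper: write the n characters of ONE array to out.
   p is derived from a, never moves beyond a + n, and is never
   dereferenced when it equals a + n.  Returns the number of
   characters written. */
static int put_chars(const char *a, size_t n, FILE *out)
{
    const char *p;
    int count = 0;

    for (p = a; p != a + n; p++) {
        putc(*p, out);
        count++;
    }
    return count;
}
```

Calling put_chars(s1, sizeof s1, stdout) and then
put_chars(s2, sizeof s2, stdout) writes the same characters the
original code was after, whether or not s2 immediately follows s1.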
Imagine that you were writing a bounds-checking implementation. It is
obvious, from these quotes, that pointer equality checking would have to
ignore the bounds information, but the indirection operator would have to
take it into account, as would the addition and subtraction operators.
If your implementation silently executed loop 2, it would fail to
report undefined behaviour caused by a bounds violation.
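A minimal sketch of what such an implementation might do internally,
assuming a "fat pointer" that carries the bounds of the object it was
derived from. The type checked_ptr and the cp_* functions are invented
for this illustration and do not describe any real compiler; they only
show which operations consult the bounds and which do not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical fat pointer: an address plus the bounds of the
   object it was derived from. */
typedef struct {
    char *addr;   /* current address */
    char *base;   /* first byte of the underlying object */
    char *limit;  /* one past the last byte of the underlying object */
} checked_ptr;

/* Build a checked pointer to the start of an object of n bytes. */
static checked_ptr cp_make(char *obj, size_t n)
{
    checked_ptr p;

    assert(obj != NULL);
    p.addr = obj;
    p.base = obj;
    p.limit = obj + n;
    return p;
}

/* Equality compares addresses only; the bounds are ignored. */
static bool cp_equal(checked_ptr a, checked_ptr b)
{
    return a.addr == b.addr;
}

/* Addition and subtraction: the result must stay within
   [base, limit].  Returns false (a reported violation) otherwise
   and leaves *p unchanged. */
static bool cp_add(checked_ptr *p, ptrdiff_t n)
{
    if (n < p->base - p->addr || n > p->limit - p->addr)
        return false;
    p->addr += n;
    return true;
}

/* Indirection: allowed only strictly inside the object, so a
   one-past-the-end pointer is rejected. */
static bool cp_load(checked_ptr p, char *out)
{
    if (p.addr < p.base || p.addr >= p.limit)
        return false;
    *out = *p.addr;
    return true;
}
```

With two such pointers, one advanced to the end of a 3-byte object and
one at the start of a 4-byte object that immediately follows it,
cp_equal reports them equal, cp_load succeeds only on the second, and
cp_add refuses to move the first any further. That is exactly the
asymmetry between loop 2 and loop 3.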
This is a typical example of how a very common mental image of the C
language is at odds with the C standard. Most people would expect
loop 2 to work, and it will work on most (if not all) implementations
without bounds checking, but it will work by accident, not by design.
Dan