pointer-in-array test

M

Martin Dickopp

Irrwahn Grausewitz said:
Bernhard Holzmayer said:
Irrwahn said:
/* as far as I understand, it's possible that p-a returns
the difference in multiples of sizeof(T), so that if element size
were 2, a[2]-a[1] would give 1 instead of 2.
^^^^^^^^^
I assume you mean `&a[2]-&a[1]' here? Otherwise, the statement doesn't
make much sense in this context, as the /contents/ of the array `a' are
not necessarily correlated to the size of its elements.
<snip>

Uck. IIRC gcc 2.xx is indeed broken in several ways.

While gcc 2.95.3 was indeed broken in some ways, it wasn't *that*
broken. :) In fact, I have sucessfully used pointer arithmentic with
more or less every gcc version since at least 2.6.3.

There's likely a different explanation for Bernhard's observation, e.g.
UB at a different position in the program.

Martin
 
C

Chris Torek

[where "p" and "a" are pointers to compatible types]

Forget about the cast. Declare index as ptrdiff_t instead.

Yes.

Also, the Standard requires (of the programmer) that p and a both
"point into" the same underlying object, and the difference between
two such pointers is the (integral) number of objects separating
them. That is, if p == &a (for any valid i), "p - a" must
produce i.

Compilers generally implement this internally by turning:

(ptr2 - ptr1)

into:

(((intptr_t)ptr2 - (intptr_t)ptr1) / sizeof *ptr1)

This underlying division produces the desired integral answer, and
always has a remainder of zero. Years ago, gcc compiled this code
to ordinary signed integer division, which often became a right-shift
with a "remainder correction" step during optimization. I sent
email to RMS pointing out that the remainder was necessarily zero
-- so that the correction never corrected anything and was a waste
of CPU time -- and he added what is now, in gcc3, the "EXACT_DIV_EXPR"
expression-type. An "exact divide" is one whose remainder is known
in advance to be zero (by the rules of the source language).

Since gcc believes the C programmer got this right, the attempt to
do this:

is fundamentally flawed. Converting both pointers to to long --
or, better but C99-only, intptr_t -- will "do the right thing" on
ordinary byte-addressed machines, though.
 
M

Martin Dickopp

Irrwahn Grausewitz said:
Bernhard Holzmayer said:
Irrwahn said:
ISO/IEC 9899:1999 (E)
6.5.6 Additive operators
[...]
9 When two pointers are subtracted, both shall point to elements
of the same array object, or one past the last element of the
array

Here it reads "shall", not "must".

"Shall" is Standardese for: "must, under all circumstances,
otherwise demons might fly out of your nose". :)

Or, as the authors of the standard put it:

ISO/IEC 9899:1999 (E)
4. Conformance
1 In this International Standard, "shall" is to be interpreted as
a requirement on an implementation or on a program; conversely,
"shall not" is to be interpreted as a prohibition.

The standard then goes on:
| 2 If a "shall" or "shall not" requirement that appears outside of
| a constraint is violated, the behavior is undefined.

This is the case here, as 6.5.6#9 is not a constraint.

Undefined behavior is defined in 3.4.3:
| 1 undefined behavior
| behavior, upon use of a nonportable or erroneous program construct
| or of erroneous data, for which this International Standard imposes
| no requirements
|
| 2 NOTE Possible undefined behavior ranges from ignoring the
| situation completely with unpredictable results, to behaving during
| translation or program execution in a documented manner
| characteristic of the environment (with or without the issuance of
| a diagnostic message), to terminating a translation or execution
| (with the issuance of a diagnostic message).

Martin
 
C

Christian Bau

Bernhard Holzmayer said:
Thomas said:
Bernhard said:
Sorry,
pardon me for not checking my code intensively enough.
I just checked with a 'gcc -pedantic' which didn't throw an error
:(

What about this solution:
int ptrInArrSafe(T *p,T *a,int max)
/* check whether p points to an element of a[max] */
{
long px = (long)p;
long ax = (long)a;
if (px<ax) return 0; /*out*/
if (px> (long)&a[max-1]) return 0; /* out */
if ( (px-ax) % sizeof(T)) return 0; /* not at element
start*/
return 1; /* in and on element start */
}

After a close look to K&R 5.3 and additional sources, I'd like
to improve the above a little.

Take a copy of the C Standard as an additional source.

What about the following modified
version:

int ptrInArrSafe(T *p,T *a,int max)
/* check whether p points to an element of a[max] */
{
long index;

/* as long as a points to a valid array, this is a legal
test for the lower boundary*/
if (p<a) return 0; /*out */

Wrong. That single comparison can invoke undefined behavior and make
your program crash. No need to read any further.

/* evaluation of element's address is valid for all elements
including the virtual last+1 element, though its content may not be
retrieved, comparison is legal test for upper boundary */
if (!(p<&a[max])) return 0; /* out */

/* now we can be sure that we are inside array, because
standard requires that array is in a logically "dense" order, from
point of logical addresses they must be accessible in ascending
order */

index = (long)(p - a);
/* as far as I understand, it's possible that p-a returns
the difference in multiples of sizeof(T), so that if element size
were 2, a[2]-a[1] would give 1 instead of 2.
Although I never found a compiler doing this, I replace sizeof(T) by
a method which should work in both cases. */
if ( (p-a) % (&a[1]-&a[0]))
return 0; /* not at element boundary*/

return 1; /* in and on element boundary */
}

Bernhard
 
C

Christian Bau

Bernhard Holzmayer said:
pete said:
Bernhard said:
IMHO you have to use pointer-to-long which must be well-defined
according to the standard, which means, that a long is big enough
to hold a pointer on that system. An integer might be too short.

Whether or not a pointer can be converted to any integer type
is implementation defined.
The standard does not guarantee that relationships which hold
for two pointers, will also hold for their converted integer
values.

N869
6.3.2.3 Pointers
[#6] Any pointer type may be converted to an integer
[#type.
Except as previously specified, the result
is
implementation-defined. If the result cannot be
represented
in the integer type, the behavior is undefined. The
result need not be in the range of values of any integer
type.

1) Isn't it correct, that long has to be of the size to hold any
pointer, so that a conversion:
(pType *) --> (void *) ---> (long)
and vice versa is legal?

No. It is entirely possible that there is no integer type big enough to
hold all possible pointer values.

2) Isn't it correct, that arrays must have their elements arranged
subsequently, so that array == &array[0] and
&array[1]>&array[0] are valid?

True. But if you compare pointers that don't point into the same array,
all odds are off.
 
K

Keith Thompson

Irrwahn Grausewitz said:
Bernhard Holzmayer said:
What a pity. Then, I guess, it's up to the caller of the function,
to make sure p is a valid pointer ;-)
<snip>

That's indeed the best solution. ;-)
/* as far as I understand, it's possible that p-a returns
the difference in multiples of sizeof(T), so that if element size
were 2, a[2]-a[1] would give 1 instead of 2.

That's not only possible, it's /required/ for a conforming
implementation to behave like this (C99 6.5.6#9): pointer
subtraction always yields values in units of element size.
I read this, and got it like you. However, it's not my experience
when working with real compilers.
Although I never found a compiler doing this,

Then you only found non-conforming compilers up until now.
one is gcc 2.95.3, tried on a struct with sizeof(T)==8
<snip>

Uck. IIRC gcc 2.xx is indeed broken in several ways.
[...]

I seriously doubt that gcc 2.xx, or any usable C compiler, gets
something as fundamental as pointer arithmetic wrong.

Here's a sample program:

#include <stdio.h>
int main(void)
{
double arr[10];
double *ptr5 = &(arr[5]);
double *ptr8 = &(arr[8]);

/*
* ddiff is the difference between ptr5 and ptr8, interpreted
* as pointers to double.
* cdiff is the difference bewteen ptr5 and ptr8, both interpreted
* as pointers to char.
* The subtraction operator actually yields a result of type ptrdiff_t,
* but it's implicitly converted to int; as long as it's within
* the range, no information is lost.
*/
int ddiff = ptr8 - ptr5;
int cdiff = (char*)ptr8 - (char*)ptr5;

printf("sizeof(double) = %d\n", (int)sizeof(double));
printf("ptr5 = [%p]\n", (void*)ptr5);
printf("ptr8 = [%p]\n", (void*)ptr8);
printf("ddiff = %d\n", ddiff);
printf("cdiff = %d\n", cdiff);

if (cdiff == ddiff * sizeof(double)) {
printf("Looks ok\n");
}
else {
printf("Something is wrong\n");
}

return 0;
}

If it prints "Something is wrong" on your implementation, there is
indeed a serious problem. I just tried it with gcc versions 2.7.2.2,
2.8.1, 2.95.2, 3.0.4, and 3.2.2, and it printed "Looks ok" every time.
 
O

Old Wolf

Ike Naar said:
If p does not point to an element of a[max], all comparisons
(p<a), (p>&a[max-1]) and ((p-a)%sizeof(T)) invoke undefined
behaviour and can evaluate to anything, including 'true'.
So your solution may produce a 'false positive' result.

Is it actually undefined behaviour (ie. program can crash), or
just "undefined result" (ie. unspecified behaviour). I was assured
by one of the WG members for C++ recently that in C++ it is
merely unspecified behaviour. It would be notable if the two
languages differed in this respect.
 
R

RoSsIaCrIiLoIA

Ike said:
Suppose I want to find out whether a given pointer (say, p) of type
*T points to an element of a given array (say, a) of type T[max].
A way to achieve this would be a linear search through the array:

int ptrInArrSafe(T *p,T *a,int max)
/* check whether p points to an element of a[max] */
{
int j=0;
while (j!=max && a+j!=p) ++j;
return j!=max;
}

Obviously, this algorithm is terribly slow for large max.

A more efficient algorithm is

int ptrInArrUnsafe(T *p,T *a,int max)
/* check whether p points to an element of a[max] */
{
int j=p-a; /* possibly undefined */
return 0<=j && j<max && a+j==p;
}
... snip ...
The safe version can possibly be slightly sped up by:

/* check whether p points to an element of a[max] */
int ptrInArrSafe(T *p, const T *a, int max)
{
T *pp;

for (pp = a + max; a < pp; a++)
if (p == a) return 1;
return 0;
}

which treats the valid, but undereferenceable, pointer one past
the end of a (value pp) as not within a. This may or may not be
what you want.

but is it safe compare pointers to different objects?
K&R A7.9 semms to say no. Why?

Is the compare for void* always possible?
Is this better?

size_t ptrInArrSafe(T *p, const T *a, size_t max)
{T *pp;
assert(max!=0);
for (pp = a + max; a < pp; a++)
if( (void*) p == (void*) a ) return (a-pp)+1;
return 0;
}

T a[50], *p, x[89];
p=x; /* p point to a an object != from x */
index_plus1=ptrInArrSafe(p, a, 50);

and if we have
char *a, *b;
a=malloc(39);
b=malloc(59);
is it safe
if((void*) a> (void*) b) ++minor;
or
if( (void*) a == (void*) b) minor = (minor>0 ? 1: -1);
?
Thanks
 
J

Jeremy Yallop

Old said:
Ike Naar said:
If p does not point to an element of a[max],

(or to "one past the end" of a)
all comparisons (p<a), (p>&a[max-1]) and ((p-a)%sizeof(T)) invoke
undefined behaviour and can evaluate to anything, including 'true'.
So your solution may produce a 'false positive' result.

Is it actually undefined behaviour (ie. program can crash), or
just "undefined result" (ie. unspecified behaviour).

It's actual undefined behaviour by (explicit) omission. It falls
under "all other cases" in the following excerpt:

[C99 6.5.8 Relational operators]

5 When two pointers are compared, the result depends on the
relative locations in the address space of the objects pointed
to. If two pointers to object or incomplete types both point to
the same object, or both point one past the last element of the
same array object, they compare equal. If the objects pointed to
are members of the same aggregate object, pointers to structure
members declared later compare greater than pointers to members
declared earlier in the structure, and pointers to array
elements with larger subscript values compare greater than
pointers to elements of the same array with lower subscript
values. All pointers to members of the same union object
compare equal. If the expression P points to an element of an
array object and the expression Q points to the last element of
the same array object, the pointer expression Q+1 compares
greater than P. In all other cases, the behavior is undefined.
I was assured by one of the WG members for C++ recently that in C++
it is merely unspecified behaviour. It would be notable if the two
languages differed in this respect.

Right: it's unspecified behaviour in C++.

[C++98 5.9 Relational operators]

- If two pointers p and q of the same type point to different
objects that are not members of the same object or elements of
the same array or to different functions, or if only one of them
is null, the results of p<q, p>q, p<=q, and p>=q are
unspecified.

I believe the reason is to allow pointers to work reliably as the keys
of associative containers (set, map, etc.), but you should ask in the
C++ groups if you want a more definitive answer.

Jeremy.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,314
Messages
2,571,623
Members
48,449
Latest member
Randy83131

Latest Threads

Top