pointer-in-array test

C

Christian Bau

Bernhard Holzmayer said:
Sorry,
pardon me for not checking my code intensively enough.
I just checked with a 'gcc -pedantic' which didn't throw an error :(

What about this solution:
int ptrInArrSafe(T *p,T *a,int max)
/* check whether p points to an element of a[max] */
{
long px = (long)p;
long ax = (long)a;
if (px<ax) return 0; /*out*/
if (px> (long)&a[max-1]) return 0; /* out */
if ( (px-ax) % sizeof(T)) return 0; /* not at element start*/
return 1; /* in and on element start */
}

compiles, links, runs without problems with my gcc.
And, as far as I remember, it's legal, to compare pointers after
conversion to long.

It just isn't guaranteed to produce any meaningful result.
 
K

Keith Thompson

Bernhard Holzmayer said:
pointer-to-integer is not enough.
IMHO you have to use pointer-to-long which must be well-defined
according to the standard, which means, that a long is big enough
to hold a pointer on that system. An integer might be too short.

By "word pointer", I meant a pointer to a word. There's no necessary
relationship between a integer and a pointer-to-integer -- and the
term "integer" includes all integer types, not just int.

The concrete example I have in mind is Cray vector systems. A machine
word is 64 bits; there are no instructions that operate directly on
8-bit bytes. Nevertheless, the C implementation has CHAR_BIT==8, so
it needs to have a way to represent pointers to 8-bit bytes.

An int* (int is 64 bits) is simply a word pointer with the obvious
representation. If you take an int*, treat it as a 64-bit integer,
increment it, and convert it back to int*, the resulting pointer will
point to the 64-bit word immediately following the original one.

A char* consists of a word pointer, pointing to the 64-bit word
containing the byte, with a 3-bit byte offset (value 0..7) stored in
the high-order 3 bits (which would otherwise be zero, since no such
system has a large enough address space to need them).
 
B

Bernhard Holzmayer

Dik said:
What about this solution:
int ptrInArrSafe(T *p,T *a,int max)
/* check whether p points to an element of a[max] */
{
long px = (long)p;
long ax = (long)a;
if (px<ax) return 0; /*out*/
if (px> (long)&a[max-1]) return 0; /* out */
if ( (px-ax) % sizeof(T)) return 0; /* not at element
start*/
return 1; /* in and on element start */
}

compiles, links, runs without problems with my gcc.
And, as far as I remember, it's legal, to compare pointers
after conversion to long.

But the result is implementation defined. I know of at least one
system where it will fail to give the correct answer, then T is
char. For instance:
(long)(a + 7) > (long)(a + 8)
when a is an array of type char.

That's not the same.
Would you please try the following:
(long)(&a[7]) < (long)(&a[8])

this should give the correct result.

Reason for your strange behaviour is possibly the implicite type
cast, which converts a before adding.
Instead, the later version increments a as a pointer and then
converts it.

Bernhard
 
B

Bernhard Holzmayer

Christian said:
Bernhard Holzmayer said:
Sorry,
pardon me for not checking my code intensively enough.
I just checked with a 'gcc -pedantic' which didn't throw an error
:(

What about this solution:
int ptrInArrSafe(T *p,T *a,int max)
/* check whether p points to an element of a[max] */
{
long px = (long)p;
long ax = (long)a;
if (px<ax) return 0; /*out*/
if (px> (long)&a[max-1]) return 0; /* out */
if ( (px-ax) % sizeof(T)) return 0; /* not at element
start*/
return 1; /* in and on element start */
}

compiles, links, runs without problems with my gcc.
And, as far as I remember, it's legal, to compare pointers after
conversion to long.

It just isn't guaranteed to produce any meaningful result.

Why. Please give arguments.

Bernhard
 
B

Bernhard Holzmayer

Thomas said:
Bernhard said:
Sorry,
pardon me for not checking my code intensively enough.
I just checked with a 'gcc -pedantic' which didn't throw an error
:(

What about this solution:
int ptrInArrSafe(T *p,T *a,int max)
/* check whether p points to an element of a[max] */
{
long px = (long)p;
long ax = (long)a;
if (px<ax) return 0; /*out*/
if (px> (long)&a[max-1]) return 0; /* out */
if ( (px-ax) % sizeof(T)) return 0; /* not at element
start*/
return 1; /* in and on element start */
}

After a close look to K&R 5.3 and additional sources, I'd like
to improve the above a little. What about the following modified
version:

int ptrInArrSafe(T *p,T *a,int max)
/* check whether p points to an element of a[max] */
{
long index;

/* as long as a points to a valid array, this is a legal
test for the lower boundary*/
if (p<a) return 0; /*out */
/* evaluation of element's address is valid for all elements
including the virtual last+1 element, though its content may not be
retrieved, comparison is legal test for upper boundary */
if (!(p<&a[max])) return 0; /* out */

/* now we can be sure that we are inside array, because
standard requires that array is in a logically "dense" order, from
point of logical addresses they must be accessible in ascending
order */

index = (long)(p - a);
/* as far as I understand, it's possible that p-a returns
the difference in multiples of sizeof(T), so that if element size
were 2, a[2]-a[1] would give 1 instead of 2.
Although I never found a compiler doing this, I replace sizeof(T) by
a method which should work in both cases. */
if ( (p-a) % (&a[1]-&a[0]))
return 0; /* not at element boundary*/

return 1; /* in and on element boundary */
}

Bernhard
 
B

Bernhard Holzmayer

pete said:
Bernhard said:
IMHO you have to use pointer-to-long which must be well-defined
according to the standard, which means, that a long is big enough
to hold a pointer on that system. An integer might be too short.

Whether or not a pointer can be converted to any integer type
is implementation defined.
The standard does not guarantee that relationships which hold
for two pointers, will also hold for their converted integer
values.

N869
6.3.2.3 Pointers
[#6] Any pointer type may be converted to an integer
[#type.
Except as previously specified, the result
is
implementation-defined. If the result cannot be
represented
in the integer type, the behavior is undefined. The
result need not be in the range of values of any integer
type.

1) Isn't it correct, that long has to be of the size to hold any
pointer, so that a conversion:
(pType *) --> (void *) ---> (long)
and vice versa is legal?

2) Isn't it correct, that arrays must have their elements arranged
subsequently, so that array == &array[0] and
&array[1]>&array[0] are valid?
 
B

Bernhard Holzmayer

Bernhard said:
Thomas said:
Bernhard said:
Sorry,
pardon me for not checking my code intensively enough.
I just checked with a 'gcc -pedantic' which didn't throw an
error
:(

What about this solution:
int ptrInArrSafe(T *p,T *a,int max)
/* check whether p points to an element of a[max] */
{
long px = (long)p;
long ax = (long)a;
if (px<ax) return 0; /*out*/
if (px> (long)&a[max-1]) return 0; /* out */
if ( (px-ax) % sizeof(T)) return 0; /* not at element
start*/
return 1; /* in and on element start */
}

After a close look to K&R 5.3 and additional sources, I'd like
to improve the above a little. What about the following modified
version:

int ptrInArrSafe(T *p,T *a,int max)
/* check whether p points to an element of a[max] */
{
long index; ignore above line!

/* as long as a points to a valid array, this is a legal
test for the lower boundary*/
if (p<a) return 0; /*out */
/* evaluation of element's address is valid for all
elements
including the virtual last+1 element, though its content may not
be retrieved, comparison is legal test for upper boundary */
if (!(p<&a[max])) return 0; /* out */

/* now we can be sure that we are inside array, because
standard requires that array is in a logically "dense" order, from
point of logical addresses they must be accessible in ascending
order */

index = (long)(p - a); ignore above line!
/* as far as I understand, it's possible that p-a returns
the difference in multiples of sizeof(T), so that if element size
were 2, a[2]-a[1] would give 1 instead of 2.
Although I never found a compiler doing this, I replace sizeof(T)
by a method which should work in both cases.
Keith, would this hold on your Cray with the strange CHAR
representation, that p-a still returns a reasonable result?
*/
if ( (p-a) % (&a[1]-&a[0]))
return 0; /* not at element boundary*/

return 1; /* in and on element boundary */
}

Bernhard
 
C

Christian Bau

Keith Thompson said:
By "word pointer", I meant a pointer to a word. There's no necessary
relationship between a integer and a pointer-to-integer -- and the
term "integer" includes all integer types, not just int.

The concrete example I have in mind is Cray vector systems. A machine
word is 64 bits; there are no instructions that operate directly on
8-bit bytes. Nevertheless, the C implementation has CHAR_BIT==8, so
it needs to have a way to represent pointers to 8-bit bytes.

An int* (int is 64 bits) is simply a word pointer with the obvious
representation. If you take an int*, treat it as a 64-bit integer,
increment it, and convert it back to int*, the resulting pointer will
point to the 64-bit word immediately following the original one.

A char* consists of a word pointer, pointing to the 64-bit word
containing the byte, with a 3-bit byte offset (value 0..7) stored in
the high-order 3 bits (which would otherwise be zero, since no such
system has a large enough address space to need them).

Take a 16 bit x86 system, using the "huge" memory model: A pointer
consists of 16 bit segment and 16 bit offset. When the offset exceeds
65535, it wraps around to 0 and the segment is increased by eight. So it
can happen that

(unsigned long) p == 0x3241ffff
(unsigned long) (p+1) == 0x32490000

Doing pointer arithmetic on unsigned long would prove fatally wrong. In
the case above, if p and p+1 are both valid pointers to char objects,
then

* (char *) (((unsigned long) p) + 1) = '\0';

might crash your program. It will definitely not write to p [1].
 
B

Bernhard Holzmayer

Irrwahn said:
Bernhard Holzmayer said:
After a close look to K&R 5.3 and additional sources, I'd like
to improve the above a little. What about the following modified
version:

int ptrInArrSafe(T *p,T *a,int max)
/* check whether p points to an element of a[max] */
{
long index;

/* as long as a points to a valid array, this is a legal
test for the lower boundary*/
if (p<a) return 0; /*out */

If p and a didn't point into (or one past) the same array in
the first place, you already invoked undefined behaviour here
(C99 6.5.8#5). All bets off.
What a pity. Then, I guess, it's up to the caller of the function,
to make sure p is a valid pointer ;-)
/* evaluation of element's address is valid for all
elements
including the virtual last+1 element, though its content may not
be retrieved, comparison is legal test for upper boundary */
if (!(p<&a[max])) return 0; /* out */

Ditto. You trapped me.
/* now we can be sure that we are inside array, because
standard requires that array is in a logically "dense" order, from
point of logical addresses they must be accessible in ascending
order */

index = (long)(p - a);

Forget about the cast. Declare index as ptrdiff_t instead.
Sure. Anyhow, I just forgot to remove the line after I realized that
it could be done without explicite cast.
/* as far as I understand, it's possible that p-a returns
the difference in multiples of sizeof(T), so that if element size
were 2, a[2]-a[1] would give 1 instead of 2.

That's not only possible, it's /required/ for a conforming
implementation to behave like this (C99 6.5.6#9): pointer
subtraction always yields values in units of element size.
I read this, and got it like you. However, it's not my experience
when working with real compilers.
Then you only found non-conforming compilers up until now.
one is gcc 2.95.3, tried on a struct with sizeof(T)==8
I replace sizeof(T) by
a method which should work in both cases. */
if ( (p-a) % (&a[1]-&a[0]))

Since &a[1]-&a[0] (==a+1-a) always yields 1, the condition
is always false.

However, with gcc, where p-a is 24, if p==&a[3], and &a[1]-&a[0]
==8, this returns true, if p points to an element between
boundaries, so it's useful in both situations on both types of
compilers... which was the intention.
End of the story: if you ever need to verify if a specific
pointed-to object is part of a certain array, carry around
the array length plus indices and compare these instead of
some 'raw' pointers. And, IMHO, if one feels the need to
perform the operation desired by the OP, one should first
rethink the algorithm, because it's most probably broken by
design. I agree.

HTH
Regards

I learned a bit. Thanks.
Bernhard
 
I

Irrwahn Grausewitz

Bernhard Holzmayer said:
Christian said:
Bernhard Holzmayer <[email protected]> wrote:
What about this solution:
int ptrInArrSafe(T *p,T *a,int max)
/* check whether p points to an element of a[max] */
{
long px = (long)p;
long ax = (long)a;
compiles, links, runs without problems with my gcc.
And, as far as I remember, it's legal, to compare pointers after
conversion to long.

It just isn't guaranteed to produce any meaningful result.

Why. Please give arguments.

As has already been pointed out upthread (e.g. by pete in message
<[email protected]>), it's implementation-defined if
pointer values are meaningfully representable in an object of _any_
integral type at all.

HTH
Regards
 
I

Irrwahn Grausewitz

Bernhard Holzmayer said:
After a close look to K&R 5.3 and additional sources, I'd like
to improve the above a little. What about the following modified
version:

int ptrInArrSafe(T *p,T *a,int max)
/* check whether p points to an element of a[max] */
{
long index;

/* as long as a points to a valid array, this is a legal
test for the lower boundary*/
if (p<a) return 0; /*out */

If p and a didn't point into (or one past) the same array in
the first place, you already invoked undefined behaviour here
(C99 6.5.8#5). All bets off.
/* evaluation of element's address is valid for all elements
including the virtual last+1 element, though its content may not be
retrieved, comparison is legal test for upper boundary */
if (!(p<&a[max])) return 0; /* out */

Ditto.
/* now we can be sure that we are inside array, because
standard requires that array is in a logically "dense" order, from
point of logical addresses they must be accessible in ascending
order */

index = (long)(p - a);

Forget about the cast. Declare index as ptrdiff_t instead.
/* as far as I understand, it's possible that p-a returns
the difference in multiples of sizeof(T), so that if element size
were 2, a[2]-a[1] would give 1 instead of 2.

That's not only possible, it's /required/ for a conforming
implementation to behave like this (C99 6.5.6#9): pointer
subtraction always yields values in units of element size.
Although I never found a compiler doing this,

Then you only found non-conforming compilers up until now.
I replace sizeof(T) by
a method which should work in both cases. */
if ( (p-a) % (&a[1]-&a[0]))

Since &a[1]-&a[0] (==a+1-a) always yields 1, the condition
is always false.
return 0; /* not at element boundary*/

return 1; /* in and on element boundary */
}

End of the story: if you ever need to verify if a specific
pointed-to object is part of a certain array, carry around
the array length plus indices and compare these instead of
some 'raw' pointers. And, IMHO, if one feels the need to
perform the operation desired by the OP, one should first
rethink the algorithm, because it's most probably broken by
design.

HTH
Regards
 
M

Martin Dickopp

Bernhard Holzmayer said:
Dik said:
What about this solution:
int ptrInArrSafe(T *p,T *a,int max)
/* check whether p points to an element of a[max] */
{
long px = (long)p;
long ax = (long)a;
if (px<ax) return 0; /*out*/
if (px> (long)&a[max-1]) return 0; /* out */
if ( (px-ax) % sizeof(T)) return 0; /* not at element
start*/
return 1; /* in and on element start */
}

compiles, links, runs without problems with my gcc.
And, as far as I remember, it's legal, to compare pointers
after conversion to long.

But the result is implementation defined. I know of at least one
system where it will fail to give the correct answer, then T is
char. For instance:
(long)(a + 7) > (long)(a + 8)
when a is an array of type char.

That's not the same.
Would you please try the following:
(long)(&a[7]) < (long)(&a[8])

The only difference between these expressions is that the first uses the
`>' operator, while the second uses the `<' operator. But `(long)(a+7)'
is exactly the same as `(long)(&a[7])' (likewise if both 7s are replaced
by 8s).

Also, trying something (in the sense of compiling and running it and
observing what it does) is a very flawed way to determine if something
is correct, as it does not necessarily give the right answer.
this should give the correct result.

The result of the conversion to `long' is implementation-defined, so it
cannot be correct on all possible implementations.

If `long' cannot represent the pointer value, the behavior is undefined.
Reason for your strange behaviour is possibly the implicite type
cast,

C does not have implicit type casts. You mean conversion. :)
which converts a before adding.

There's no conversion before adding. This is the `+' operator between a
pointer and an integer type, which results in a pointer type.
Instead, the later version increments a as a pointer and then converts
it.

So does the first.

Martin
 
M

Martin Dickopp

Bernhard Holzmayer said:
Christian said:
Bernhard Holzmayer said:
Sorry,
pardon me for not checking my code intensively enough.
I just checked with a 'gcc -pedantic' which didn't throw an error
:(

What about this solution:
int ptrInArrSafe(T *p,T *a,int max)
/* check whether p points to an element of a[max] */
{
long px = (long)p;
long ax = (long)a;
if (px<ax) return 0; /*out*/
if (px> (long)&a[max-1]) return 0; /* out */
if ( (px-ax) % sizeof(T)) return 0; /* not at element
start*/
return 1; /* in and on element start */
}

compiles, links, runs without problems with my gcc.
And, as far as I remember, it's legal, to compare pointers after
conversion to long.

It just isn't guaranteed to produce any meaningful result.

Why. Please give arguments.

6.3.2.3#6:
| Any pointer type may be converted to an integer type. Except as
| previously specified, the result is implementation-defined. If the
| result cannot be represented in the integer type, the behavior is
| undefined. The result need not be in the range of values of any
| integer type.

Martin
 
M

Martin Dickopp

Bernhard Holzmayer said:
1) Isn't it correct, that long has to be of the size to hold any
pointer, so that a conversion:
(pType *) --> (void *) ---> (long)
and vice versa is legal?

No. In C99, you have the optional integer types `intptr_t' and
`uintptr_t' which can represent all values of `void *' if they exist at
all. In C89, no integer type is guaranteed to have this property.

Even if a suitable integer type exists, the conversion is not guaranteed
to preserve the ordering, i.e. while `&a[0] < &a[1]' is always true,
`(long)(&a[0]) < (long)(&a[1])' need not be, even if `long' can
represent the pointer values.
2) Isn't it correct, that arrays must have their elements arranged
subsequently, so that array == &array[0] and
&array[1]>&array[0] are valid?

Yes. Both expression evaluate to true (i.e. 1).

Martin
 
I

Irrwahn Grausewitz

Bernhard Holzmayer said:
pete wrote:
N869
6.3.2.3 Pointers
[#6] Any pointer type may be converted to an integer
[#type.
Except as previously specified, the result
is
implementation-defined. If the result cannot be
represented
in the integer type, the behavior is undefined. The
result need not be in the range of values of any integer
type.

1) Isn't it correct, that long has to be of the size to hold any
pointer, so that a conversion:
(pType *) --> (void *) ---> (long)
and vice versa is legal?

As is obvious from above quote: no.
2) Isn't it correct, that arrays must have their elements arranged
subsequently, so that array == &array[0] and
&array[1]>&array[0] are valid?

Yes, err no, it's correct.

Regards
 
I

Irrwahn Grausewitz

Bernhard Holzmayer said:
Bernhard Holzmayer wrote:
Keith, would this hold on your Cray with the strange CHAR
representation, that p-a still returns a reasonable result?
*/
if ( (p-a) % (&a[1]-&a[0]))
return 0; /* not at element boundary*/
<snip>

Well, I'm not Keith, but for two pointers p and a the expression
(p-a) gives a reasonable result if, and only if, both point to
elements of (or one past) the same array object:

ISO/IEC 9899:1999 (E)
6.5.6 Additive operators
[...]
9 When two pointers are subtracted, both shall point to elements of
the same array object, or one past the last element of the array
object; the result is the difference of the subscripts of the two
array elements. The size of the result is implementation-defined,
and its type (a signed integer type) is ptrdiff_t defined in the
<stddef.h> header. If the result is not representable in an object
of that type, the behavior is undefined. In other words, if the
expressions P and Q point to, respectively, the i-th and j-th
elements of an array object, the expression (P)-(Q) has the value
i-j provided the value fits in an object of type ptrdiff_t.
Moreover, if the expression P points either to an element of an
array object or one past the last element of an array object, and
the expression Q points to the last element of the same array
object, the expression ((Q)+1)-(P) has the same value as
((Q)-(P))+1 and as -((P)-((Q)+1)), and has the value zero if the
expression P points one past the last element of the array object,
even though the expression (Q)+1 does not point to an element of
the array object.

HTH
Regards
 
B

Bernhard Holzmayer

Irrwahn said:
SO/IEC 9899:1999 (E)
6.5.6 Additive operators
[...]
9 When two pointers are subtracted, both shall point to elements
of
the same array object, or one past the last element of the
array

Here it reads "shall", not "must".
That's what I have in mind, and what one can read in several
locations where pointers and pointer arithmetics are handled.

Either ISO should be clearer here, if this means optional like
undefined behaviour.

Otherwise, if it is written on purpose, it's an implicite statement
that at least one of the pointers may point elsewhere.

SCNR,
Bernhard
 
I

Irrwahn Grausewitz

Bernhard Holzmayer said:
What a pity. Then, I guess, it's up to the caller of the function,
to make sure p is a valid pointer ;-)
<snip>

That's indeed the best solution. ;-)
/* as far as I understand, it's possible that p-a returns
the difference in multiples of sizeof(T), so that if element size
were 2, a[2]-a[1] would give 1 instead of 2.

That's not only possible, it's /required/ for a conforming
implementation to behave like this (C99 6.5.6#9): pointer
subtraction always yields values in units of element size.
I read this, and got it like you. However, it's not my experience
when working with real compilers.
Then you only found non-conforming compilers up until now.
one is gcc 2.95.3, tried on a struct with sizeof(T)==8
<snip>

Uck. IIRC gcc 2.xx is indeed broken in several ways.
if ( (p-a) % (&a[1]-&a[0]))

Since &a[1]-&a[0] (==a+1-a) always yields 1, the condition
is always false.

However, with gcc, where p-a is 24, if p==&a[3], and &a[1]-&a[0]
==8, this returns true, if p points to an element between
boundaries, so it's useful in both situations on both types of
compilers... which was the intention.
<snip>

Well, I see your point, but IMHO, if one really has to cater for
broken implementations, I'd try to deal with it in the preprocessing
stage in RealWorld[tm] code (not including throw-away NG examples,
of course).
I learned a bit. Thanks.

</me listening to distant barking sounds>
Hopefully I didn't screw up something this time, but the pack will
eventually jump on it and correct it anyway. ;-)

Regards
 
D

Dik T. Winter

> Dik T. Winter wrote: ....
>
> That's not the same.

What is the difference?
> Would you please try the following:
> (long)(&a[7]) < (long)(&a[8])

Returns false on that system.
> this should give the correct result.

Which is the correct result on that system.
> Reason for your strange behaviour is possibly the implicite type
> cast, which converts a before adding.

No, that is *not* the reason, because there is no implicit type cast.
> Instead, the later version increments a as a pointer and then
> converts it.

As does the first version. But to clarify, on that system a pointer
(and a long) are 64 bits. In a pointer the lower 48 bits are the
word pointer (a word is also 64 bits), the upper 16 bits are a
relative byte pointer within a word. Casting a pointer to long
performs no conversion at all.
 
I

Irrwahn Grausewitz

Bernhard Holzmayer said:
Irrwahn said:
SO/IEC 9899:1999 (E)
6.5.6 Additive operators
[...]
9 When two pointers are subtracted, both shall point to elements
of
the same array object, or one past the last element of the
array

Here it reads "shall", not "must".
That's what I have in mind, and what one can read in several
locations where pointers and pointer arithmetics are handled.

Either ISO should be clearer here, if this means optional like
undefined behaviour.

Otherwise, if it is written on purpose, it's an implicite statement
that at least one of the pointers may point elsewhere.

"Shall" is Standardese for: "must, under all circumstances,
otherwise demons might fly out of your nose". :)

Or, as the authors of the standard put it:

ISO/IEC 9899:1999 (E)
4. Conformance
1 In this International Standard, "shall" is to be interpreted as
a requirement on an implementation or on a program; conversely,
"shall not" is to be interpreted as a prohibition.

Regards.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,141
Messages
2,570,817
Members
47,362
Latest member
ChandaWagn

Latest Threads

Top