pointer subtraction and sizeof

junky_fellow

Hi,

In my previous post I asked whether sizeof can be implemented as a
function. Somebody pointed to a link that says sizeof may be calculated
as follows, with a warning that it is not guaranteed to work on all
implementations.

size_t size_obj = (char*)(&obj + 1) - (char*)(&obj);

I wanted to find out on which implementations this would fail. One of
the answers I got was that ptrdiff_t may not be the same size as
size_t, which may cause this to fail.

I was thinking of one more scenario where this may fail, but I am not
sure whether I am right. I just wanted your opinion.
Consider a 32-bit machine on which "obj" lies at the largest
address, i.e. 0xffffffff.
Shouldn't adding 1 to the pointer to obj fail on this machine, so that
we are likely to get a wrong value for the size? Aren't we assuming
that &obj + 1 will always be a valid address by doing that?

This brings one more question to mind.
When we talk about arrays, we say that the address one past the
last element is always valid. Does that mean that, on a 32-bit
machine, the largest array of char would have size 0xffffffff - 1?

Many thanks for any help/comments ...
 
Guest

Hi,

In my previous post I asked whether sizeof can be implemented as a
function. Somebody pointed to a link that says sizeof may be calculated
as follows, with a warning that it is not guaranteed to work on all
implementations.

size_t size_obj = (char*)(&obj + 1) - (char*)(&obj);

I wanted to find out on which implementations this would fail. One of
the answers I got was that ptrdiff_t may not be the same size as
size_t, which may cause this to fail.

I was thinking of one more scenario where this may fail, but I am not
sure whether I am right. I just wanted your opinion.
Consider a 32-bit machine on which "obj" lies at the largest
address, i.e. 0xffffffff.
Shouldn't adding 1 to the pointer to obj fail on this machine, so that
we are likely to get a wrong value for the size? Aren't we assuming
that &obj + 1 will always be a valid address by doing that?

Yes. This is a safe assumption, because it is guaranteed by the C
standard.
This brings one more question to mind.
When we talk about arrays, we say that the address one past the
last element is always valid. Does that mean that, on a 32-bit
machine, the largest array of char would have size 0xffffffff - 1?

Whether this is by design or by accident is open to debate, but the
address one past the end is allowed to double as a null pointer, so if
there is not a single other object (including argc, argv, any string
literals, etc.), it's possible in theory to have an object of size
0xFFFFFFFF on a system with 32-bit pointers.
 
junky_fellow

Harald said:
Yes. This is a safe assumption, because it is guaranteed by the C
standard.
Thanks, Harald, for the answer, but I believe this is true only for an
array and not for any other object, which is why I asked this question.
Whether this is by design or by accident is open to debate, but the
address one past the end is allowed to double as a null pointer, so if
there is not a single other object (including argc, argv, any string
literals, etc.), it's possible in theory to have an object of size
0xFFFFFFFF on a system with 32-bit pointers.

Again, I am talking about an array.
 
Richard Heathfield

(e-mail address removed) said:
Hi,

In my previous post I asked whether sizeof can be implemented as a
function. Somebody pointed to a link that says sizeof may be calculated
as follows, with a warning that it is not guaranteed to work on all
implementations.

size_t size_obj = (char*)(&obj + 1) - (char*)(&obj);

They were wrong. sizeof cannot be calculated because sizeof is not a value.
It is an operator. Furthermore, it works not just on objects but also on
expressions and on parenthesised types. That is why it's an operator in the
first place - you can't implement it as a C function, because the syntax
simply doesn't exist.

This brings one more question to mind.
When we talk about arrays, we say that the address one past the
last element is always valid. Does that mean that, on a 32-bit
machine, the largest array of char would have size 0xffffffff - 1?

No, it doesn't mean that. On a 32-bit machine, or indeed on any machine, the
largest object you can be *guaranteed* to obtain is 32767 bytes (or, if
you've managed to find a C99 compiler, 65535 bytes).
 
junky_fellow

Richard said:
They were wrong. sizeof cannot be calculated because sizeof is not a value.
It is an operator. Furthermore, it works not just on objects but also on
expressions and on parenthesised types. That is why it's an operator in the
first place - you can't implement it as a C function, because the syntax
simply doesn't exist.
I agree with your previous answers (in my previous post) that sizeof
cannot be implemented as a function. I just wanted to discuss where the
above subtraction would fail.
 
junky_fellow

Richard said:
AFAICT it's not allowed to fail.
But how can we guarantee that &obj + 1 will always be valid? As in my
previous example, if obj lies at the largest address, shouldn't
this subtraction fail?
 
Richard Heathfield

(e-mail address removed) said:
But how can we guarantee that &obj + 1 will always be valid? As in my
previous example, if obj lies at the largest address, shouldn't
this subtraction fail?

The implementation is required to allow you to point one past the end of any
array without breaking pointer subtraction, and an object can be thought of
as an array of one object, IIRC (but you might be sufficiently motivated to
check with the csc folks on that).
 
Guest

Thanks, Harald, for the answer, but I believe this is true only for an
array and not for any other object, which is why I asked this question.

As Richard Heathfield also mentioned, any object can be thought of as
an array of one element. The section in the standard is 6.5.6p7, which
reads:
"For the purposes of these operators [plus and minus], a pointer to an
object that is not an element of an array behaves the same as a pointer
to the first element of an array of length one with the type of the
object as its element type."
Again, I am talking about an array.

In this case, that is irrelevant. Every byte of every object can be
addressed using a pointer-to-char, so there must be some character
pointer value that means "the first byte of argc", and the same for all
bytes and all other objects. This cannot be the same value as a pointer
to any element of your large array.
 
Sharath

Hi,

In my previous post I asked whether sizeof can be implemented as a
function. Somebody pointed to a link that says sizeof may be calculated
as follows, with a warning that it is not guaranteed to work on all
implementations.

size_t size_obj = (char*)(&obj + 1) - (char*)(&obj);

I said that about the other method, which tries to find the size of a
type, not the size of an object. I should have made that clear earlier, sorry.
 
Sharath

Richard said:
They were wrong. sizeof cannot be calculated because sizeof is not a value.
It is an operator.

The OP had specifically asked to implement it as a function, not to
implement it as an operator, which of course is not possible.
 
Richard Heathfield

Sharath said:
The OP had specifically asked to implement it as a function, not to
implement it as an operator, which of course is not possible.

Well, it *is* implemented as an operator, by compiler writers. :)

It cannot be implemented in its entirety as a function, however.
 
Yingjian Zhan

Hi,

In my previous post I asked whether sizeof can be implemented as a
function. Somebody pointed to a link that says sizeof may be calculated
as follows, with a warning that it is not guaranteed to work on all
implementations.

size_t size_obj = (char*)(&obj + 1) - (char*)(&obj);
This subtraction never fails, since there are two temporary variables
(maybe just two registers in the compiled assembly) holding the
values of "address of obj + 1" and "address of obj". They are just the same as
other variables. Maybe &obj+1 overflows on limited
word size boxes, but it will compile and run, though the result may be
incorrect. If you try to dereference an invalid address you will
get a segmentation fault on modern OSes.

You cannot calculate the size of obj in this way. You would expect size_obj=1 on
most implementations:
size_obj = (ptr_to_obj+1) - ptr_to_obj. It's done by the compiler writers
too. You can NOT use arithmetic on void *.
sizeof is just an operator that tells the compiler to calculate the value
at compile time.
 
Clark S. Cox III

Yingjian said:
This subtraction never fails, since there are two temporary variables
(maybe just two registers in the compiled assembly) holding the
values of "address of obj + 1" and "address of obj". They are just the same as
other variables. Maybe &obj+1 overflows on the limited

&obj+1 is *always* legal for any valid obj (i.e. it is always safe to
add one to any pointer to a valid object)
word size boxes, but it will compile and run, though the result may be
incorrect. If you try to dereference an invalid address you will
get a segmentation fault on modern OSes.

You cannot calculate the size of obj in this way.

Yes, you can.
You would expect size_obj=1 on most implementations: size_obj = (ptr_to_obj+1) - ptr_to_obj.

That is why there is a cast to (char*) before the subtraction.
 
Yingjian Zhan

Yeah, you are right.
I misunderstood the post.
Clark S. Cox III said:
&obj+1 is *always* legal for any valid obj (i.e. it is always safe to
add one to any pointer to a valid object)


Yes, you can.


That is why there is a cast to (char*) before the subtraction.
 
CBFalconer

junky_fellow

Richard said:
The implementation is required to allow you to point one past the end of any
array without breaking pointer subtraction, and an object can be thought of
as an array of one object, IIRC (but you might be sufficiently motivated to
check with the csc folks on that).
I am still not able to understand how an implementation would
allow such a subtraction if the object lies at the highest possible
address. Can you please explain this with an example?
 
Guest

I am still not able to understand how an implementation would
allow such a subtraction if the object lies at the highest possible
address.

If it can't, then the implementation must not allow an object lying at
the highest possible address. Let's assume it can, though...
Can you please explain this with an example?

One possibility is that pointer subtraction works the same way as
unsigned arithmetic. (Note: I have not carefully checked the example
for correctness. If there are mistakes, I hope the intent is clear.)

#include <stdio.h>
#include <inttypes.h>

char object; /* suppose it happens to be at the highest possible address */

int main(void) {
    char *p1 = &object, *p2 = &object + 1;
    printf("%p\n", (void *) p1); /* pointer to highest possible address */
    printf("%p\n", (void *) p2); /* pointer to one past highest possible address */
    printf("%td\n", p2 - p1);    /* one */

    /* uintptr_t is optional; the converted values are implementation-defined */
    uintptr_t a1 = (uintptr_t) (void *) p1, a2 = a1 + 1;
    printf("%" PRIXPTR "\n", a1);      /* highest possible address */
    printf("%" PRIXPTR "\n", a2);      /* highest possible address plus one */
    printf("%" PRIXPTR "\n", a2 - a1); /* one */
    return 0;
}

gives this output:

FFFFFFFF
00000000
1
FFFFFFFF
0
1
 
Richard Heathfield

(e-mail address removed) said:

I am still not able to understand how an implementation would
allow such a subtraction if the object lies at the highest possible
address. Can you please explain this with an example?

If placing an object at the highest possible address will cause this to
fail, then the implementation is not allowed to place an object at the
highest possible address.
 
junky_fellow

Harald said:
If it can't, then the implementation must not allow an object lying at
the highest possible address. Let's assume it can, though...


One possibility is that pointer subtraction works the same way as
unsigned arithmetic. (Note: I have not carefully checked the example
for correctness. If there are mistakes, I hope the intent is clear.)

#include <stdio.h>
#include <inttypes.h>

char object; /* suppose it happens to be at the highest possible address */

int main(void) {
    char *p1 = &object, *p2 = &object + 1;
    printf("%p\n", (void *) p1); /* pointer to highest possible address */
    printf("%p\n", (void *) p2); /* pointer to one past highest possible address */
    printf("%td\n", p2 - p1);    /* one */

    /* uintptr_t is optional; the converted values are implementation-defined */
    uintptr_t a1 = (uintptr_t) (void *) p1, a2 = a1 + 1;
    printf("%" PRIXPTR "\n", a1);      /* highest possible address */
    printf("%" PRIXPTR "\n", a2);      /* highest possible address plus one */
    printf("%" PRIXPTR "\n", a2 - a1); /* one */
    return 0;
}

gives this output:

FFFFFFFF
00000000
1
FFFFFFFF
0
1

Thanks, Harald and Richard, for your answers. Both make sense and
are satisfying. Thanks a ton again.
 
