Portable pointer arithmetics?

matt

I (think I have) understood the pitfalls of pointer arithmetic.

For example, the output of this code depends on the platform on which
this program will run:

char * cp = "Hello World";
char * c2p = NULL;
int * ip = (int *) cp;
ip = ip + 1;
c2p = (char *) ip;
printf("%c\n", *c2p);

The output of this program is "o" where the size of integer is 4 and "l"
where the size of integer is 2 bytes. However, this problem is intrinsic
in the application, where we scan an array of chars with a pointer to
integers (through the typecasting).

On the other hand, is the following (pseudo)code

mytype mta[5];
...
mytype * mtp = &mta[i]; // 0 <= i <= 4
mytypeAmazingFunction(mtp, mtp-1, <some_other_args>);

*always* portable to any architecture?

It is equivalent to

mytypeAmazingFunction(&mtp[i], &mtp[i-1], <some_other_args>);

which is always portable, isn't it?
 
Gene

When i == 0, mtp - 1 causes undefined behavior.
 
Barry Schwarz

matt said:
char * cp = "Hello World";
char * c2p = NULL;
int * ip = (int *) cp;

This statement will invoke undefined behavior if the string literal
happens to be improperly aligned for an int.

matt said:
ip = ip + 1;
c2p = (char *) ip;
printf("%c\n", *c2p);

The output of this program is "o" where the size of integer is 4 and "l"
where the size of integer is 2 bytes. However, this problem is intrinsic
in the application, where we scan an array of chars with a pointer to
integers (through the typecasting).

You could eliminate the potentially undefined behavior by eliminating
ip and assigning c2p the value cp+sizeof(int). The result is
implementation dependent but at least well defined.
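
A minimal sketch of that suggestion, for illustration only. The helper name `char_at_int_offset` and its length check are assumptions, not part of the thread. It uses only char pointer arithmetic, so no int * is ever formed; which character comes back still depends on sizeof(int).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the character sizeof(int) bytes into s,
   using only char pointer arithmetic (no int * is ever created).
   Returns '\0' if s has no character at that offset. */
char char_at_int_offset(const char *s)
{
    size_t offset = sizeof(int);   /* implementation-defined: often 4, sometimes 2 */

    if (strlen(s) <= offset)
        return '\0';               /* too short: do not read past the string */
    return *(s + offset);          /* well defined: still inside the string */
}
```

With the string from the original post, `char_at_int_offset("Hello World")` yields 'o' where sizeof(int) is 4 and 'l' where it is 2, which are the same results matt reported, but without the alignment risk.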

matt said:
On the other hand, is the following (pseudo)code

mytype mta[5];
...
mytype * mtp = &mta[i]; // 0 <= i <= 4
mytypeAmazingFunction(mtp, mtp-1, <some_other_args>);

*always* portable to any architecture?

If i is 0, then evaluating mtp-1 invokes undefined behavior.

matt said:
It is equivalent to

mytypeAmazingFunction(&mtp[i], &mtp[i-1], <some_other_args>);

which is always portable, isn't it?

No. The two arguments should be &mta[i] and &mta[i-1]. Evaluating
the second argument still invokes undefined behavior when i is 0.

Since the statement invokes undefined behavior on all systems when i
is 0, that is a perverse form of portability. Most use the term
portability to mean producing equivalent results on multiple systems
(taking into account inevitable differences due to implementation
details such as the number of significant digits in floating point
types). Since undefined behavior is not guaranteed to be consistent
across implementations, I think most would say a program invoking
undefined behavior cannot be portable.

I do not see how the second discussion using mytype relates to your first
discussion about the implementation-dependent size of an int.
 
matt

matt said:
It is equivalent to

mytypeAmazingFunction(&mtp[i], &mtp[i-1], <some_other_args>);

which is always portable, isn't it?

Sorry, of course I meant

mytypeAmazingFunction(&mta[i], &mta[i-1], <some_other_args>);
 
matt

Barry Schwarz said:
You could eliminate the potentially undefined behavior by eliminating
ip and assigning c2p the value cp+sizeof(int). The result is
implementation dependent but at least well defined.

OK, thanks.

Barry Schwarz said:
If i is 0, then evaluating mtp-1 invokes undefined behavior.

Barry Schwarz said:
No. The two arguments should be &mta[i] and &mta[i-1]. Evaluating
the second argument still invokes undefined behavior when i is 0.


Sorry for both. Bad cut & paste. I meant

1 <= i <= 4

and

mytypeAmazingFunction(&mta[i], &mta[i-1], <some_other_args>);

Barry Schwarz said:
I do not see how the second discussion using mytype relates to your first
discussion about the implementation-dependent size of an int.

I didn't mean to discuss the implementation-dependent size of an int.

I meant: OK, I understood the common pitfalls of working with pointer
arithmetic (e.g. scanning a vector of chars with a pointer to integers);
however, I'm dealing with pointer arithmetic in a different context
(mytypeAmazingFunction() on the mytype data type). Does
mytypeAmazingFunction(mtp, mtp-1, <some_other_args>) (using pointer
arithmetic) suffer from some problem when ported across multiple
platforms, so that I am constrained to use
mytypeAmazingFunction(&mta[i], &mta[i-1], <some_other_args>) (which is
always portable), or not?
 
Eric Sosman

matt said:
I (think I have) understood the pitfalls of pointer arithmetic.

For example, the output of this code depends on the platform on which
this program will run:

char * cp = "Hello World";
char * c2p = NULL;
int * ip = (int *) cp;
ip = ip + 1;
c2p = (char *) ip;
printf("%c\n", *c2p);

The output of this program is "o" where the size of integer is 4 and "l"
where the size of integer is 2 bytes. However, this problem is intrinsic
in the application, where we scan an array of chars with a pointer to
integers (through the typecasting).

The output might also be "Haddocks' Eyes," or there might not
be any output at all, or your hard drive might fly away like a
Frisbee and get chewed by a frolicking spaniel (in theory, anyhow).
The problem is that the nameless char[] array created by the literal
might not begin at an address that is suitably aligned for an int,
so an int* is not necessarily able to point at that address. The
initialization of `ip' is therefore problematic, and there's really
no telling what might happen.

On "mainstream" machines nowadays the potential misalignment will
cause no trouble (in this case). But if the rapid and turbulent
cascade of change in computerland keeps flowing as swiftly as it has
for the last half-century, today's mainstream is tomorrow's backwater.
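
One way to sidestep the alignment question, sketched here as an assumption rather than anything proposed in the thread, is to copy the bytes into a real int object with memcpy instead of pointing an int * at the char array. The helper name `load_int_bytes` is hypothetical.

```c
#include <string.h>

/* Hypothetical helper: read sizeof(int) bytes starting at p by copying them
   into a genuine int object. p needs no particular alignment, but the caller
   must guarantee that sizeof(int) bytes are readable at p. */
int load_int_bytes(const char *p)
{
    int value;

    memcpy(&value, p, sizeof value);   /* byte copy: no int * into the char data */
    return value;
}
```

The value returned still depends on the size and byte order of int, so the result is implementation dependent, but no misaligned int * is ever formed.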

matt said:
On the other hand, is the following (pseudo)code

mytype mta[5];
...
mytype * mtp = &mta[i]; // 0 <= i <= 4
mytypeAmazingFunction(mtp, mtp-1, <some_other_args>);

*always* portable to any architecture?


Not if `i' is zero, which causes you to try to form a pointer
to the nonexistent element `mta[-1]'. There's a special rule that
lets you form a pointer to the nonexistent element just *after* an
array (`mta[5]', in this case), provided you don't try to access
that fictitious element, but there's no similar special case for
pointing at an imaginary element preceding an array.
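
The one-past-the-end rule can be illustrated with a short sketch. The function names `sum_forward` and `sum_backward` are hypothetical, and int stands in for the element type.

```c
#include <stddef.h>

/* Sum the n elements of a by walking a pointer from a up to, but not
   including, the one-past-the-end pointer a + n. */
int sum_forward(const int *a, size_t n)
{
    const int *end = a + n;   /* legal to form and compare, never dereferenced */
    int total = 0;

    for (const int *p = a; p != end; ++p)
        total += *p;
    return total;
}

/* Same sum walking backwards, without ever forming the pointer a - 1. */
int sum_backward(const int *a, size_t n)
{
    int total = 0;

    for (const int *p = a + n; p != a; )
        total += *--p;        /* decrement first, so p never drops below a */
    return total;
}
```

The backward loop starts at the one-past-the-end pointer and stops when p equals a, which avoids the tempting `p >= a` test that would require forming a pointer before the array.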

matt said:
It is equivalent to

mytypeAmazingFunction(&mtp[i], &mtp[i-1], <some_other_args>);

which is always portable, isn't it?


Not if `i' is zero.
 
Ben Bacarisse

matt said:
mytype mta[5];
...
mytype * mtp = &mta[i]; // 0 <= i <= 4

I meant: ok I understood the common pitfalls working with pointer
arithmetic (e.g. scanning a vector of chars with a pointer to
integers); however, I'm dealing with pointer arithmetic in different
context (mytypeAmazingFunction() on mytype data type). Does
mytypeAmazingFunction(mtp, mtp-1, <some_other_args>) (using pointer
arithmetic) suffer of some problem if ported across multiple
platforms, so that I am constrained to use
mytypeAmazingFunction(&mta[i], &mta[i-1], <some_other_args>) (which is
always portable), or not?


E1[E2] means *((E1) + (E2)). &*(E) means E. Thus &mta[i] means mta + i
and &mta[i-1] means mta + (i-1). Your two function calls are
equivalent -- and they are both undefined when i == 0.

The brackets round (i-1) are significant in some cases. Were i to get a
value of 6, &mta[i-1] == mta + 5 is valid (provided you don't
dereference it) but mtp will already have been set to an invalid pointer
and all bets are off.
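
The equivalence can be sketched with both spellings behind a range check. Everything named here is an assumption for illustration: int stands in for mytype, `diff_with_previous` stands in for mytypeAmazingFunction, and `call_both_ways` is a hypothetical wrapper.

```c
#include <stddef.h>

typedef int mytype;   /* assumption: stand-in for the thread's mytype */

/* Hypothetical stand-in for mytypeAmazingFunction: reads both elements. */
static mytype diff_with_previous(const mytype *cur, const mytype *prev)
{
    return *cur - *prev;
}

/* Make the call in both spellings for index i of an n-element array.
   Returns 0 on success, or -1 if i is outside 1..n-1, in which case
   no pointer is formed at all. */
int call_both_ways(const mytype *mta, size_t n, size_t i,
                   mytype *via_pointer, mytype *via_index)
{
    if (i < 1 || i >= n)
        return -1;                       /* i == 0 would need mta - 1 */

    const mytype *mtp = &mta[i];         /* same pointer as mta + i */
    *via_pointer = diff_with_previous(mtp, mtp - 1);
    *via_index   = diff_with_previous(&mta[i], &mta[i - 1]);
    return 0;
}
```

For any i from 1 to n-1 the two results are identical on every conforming implementation, so neither spelling is more portable than the other; the range check is what matters.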
 
