array indexing anecdote

Ben Bacarisse · Mar 1, 2014

glen herrmannsfeldt said:
OK, but those are ancestors of C.

Sure. Didn't know that was not relevant.

Any that are not directly related?

LISP in about 1965 (i.e. not the very first LISP) comes to mind.

PL/I was the first I knew that allowed one to select the
lower bound, and later Pascal and Fortran 77 also did.

That was quite common I think. Amongst the early ones to do this are
Algol, Simula, Coral and SNOBOL.

<snip>

Ben Bacarisse · Mar 2, 2014

#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>

int main(void)
{
int arr[] = {1, 2, 3, 4};
int *p0 = arr;
int *p3 = p0 + 3;
ptrdiff_t diff;

printf("%d\n", p3 - p0); // 3

C99 (and C11) provide 't' as a length modifier for ptrdiff_t values. If
you don't want to (or can't) use it, cast the result to the right type
for whatever format you choose to use.

printf("%d\n", &*p3 - &*p0); // 3
printf("%d\n", (void *) p3 - (void *) p0); // 12

This is not permitted in standard C but it is a very common extension.
The reason it's non-standard is related to the point you make -- how
many non-existent objects lie between the two pointers?

printf("%d\n", (char *) p3 - (char *) p0); // 12

return EXIT_SUCCESS;
}

<snip>

Keith Thompson · Mar 2, 2014

In other words: plus is not plus... but depends on the types involved in plus.

No, plus *is* plus, *and* its behavior depends on the types involved.

Rosario193 · Mar 2, 2014

Came across this old example of how C (or better said, the compiler) doesn't really
understand array indexing...

#include <stdio.h>
#include <stdlib.h>
#define ARSZ 20
int main(void){
    int ar[ARSZ], i;
    for(i = 0; i < ARSZ; i++){
        ar[i] = i;
        i[ar]++;
        printf("ar[%d] now = %d\n", i, ar[i]);
    }

    printf("15[ar] = %d\n", 15[ar]);
    exit(EXIT_SUCCESS);
}

ar[i]=*(ar+i)=*(i+ar)=i[ar]

This would print the numbers from 1 to 20.

Eric Sosman · Mar 2, 2014

In other words: plus is not plus... but depends on the types involved in plus.

Why should that be even the least bit surprising?

int i = 42 + 17;
unsigned int u = 42u + 17u;
long l = 42L + 17L;
double d = 42.0 + 17e0;
...

Four plus signs (and more not shown), four different operations.

BartC · Mar 2, 2014

Eric Sosman said:
Why should that be even the least bit surprising?

int i = 42 + 17;
unsigned int u = 42u + 17u;
long l = 42L + 17L;
double d = 42.0 + 17e0;
...

Four plus signs (and more not shown), four different operations.

But they all give results which are 17 more than the left operand, which is
what you would generally expect.

Try it with this:

int* p=&i + 17;

I get a value of p which is 68 more than the &i (when printed with %p).

That's why this 'plus' operation is a bit different.

(You might argue we shouldn't be looking at the internal bit-pattern, after
all the bit-pattern for a double, interpreted as an integer, doesn't go up
by 17 either. But there is a too-strong correlation between a pointer and a
memory address to ignore. And there is a high likelihood that internally, 68
*is* added, not 17 or 17.0!)

Eric Sosman · Mar 2, 2014

But they all give results which are 17 more than the left operand, which is
what you would generally expect.

No: The result of the first is 17 more, that of the second
is 17u more, the third is 17L more, and the fourth is 17e0 more.
If you haven't spotted the difference between 17, 17u, 17L, and
17e0, you need to look more closely.

Try it with this:

int* p=&i + 17;

I get a value of p which is 68 more than the &i (when printed with %p).

Since the behavior is undefined, what you get is not especially
persuasive ... But let's modify your example to fix the U.B.:

int i[17];
int *p = i + 17;

Now let's check the result:

assert(p - i == 68); // BartC's prediction ...

BartC · Mar 2, 2014

Eric Sosman said:
On 3/2/2014 9:47 AM, BartC wrote:

Try it with this:

int* p=&i + 17;

I get a value of p which is 68 more than the &i (when printed with %p).

Since the behavior is undefined, what you get is not especially
persuasive ... But let's modify your example to fix the U.B.:

int i[17];
int *p = i + 17;

Now let's check the result:

assert(p - i == 68); // BartC's prediction ...

That doesn't count. Because you're using the correspondingly special 'minus'
operation, which will convert any actual 68 result down to 17.

All I'm saying is that these plus and minus ops for pointers are special
because they often hide a scaling operation in the machine. Sometimes you
need to be aware of this when, for example, adding an actual byte offset to
a pointer.

Eric Sosman · Mar 2, 2014

BartC said:
Eric Sosman said:

On 3/2/2014 9:47 AM, BartC wrote:

Try it with this:

int* p=&i + 17;

I get a value of p which is 68 more than the &i (when printed with %p).

Since the behavior is undefined, what you get is not especially
persuasive ... But let's modify your example to fix the U.B.:

int i[17];
int *p = i + 17;

Now let's check the result:

assert(p - i == 68); // BartC's prediction ...

That doesn't count. Because you're using the correspondingly special
'minus' operation, which will convert any actual 68 result down to 17.

There *is* no "actual 68 result." The result of the
subtraction is (ptrdiff_t)17, and that's that. It makes not the
slightest difference *how* the silicon arrives at that result:
It can subtract something else and scale the difference, it can
scale two somethings and subtract, it can scatter the bits into
a two-dimensional array and do wavelet transforms, ... but the
result it must produce is (ptrdiff_t)17, by whatever means.

All I'm saying is that these plus and minus ops for pointers are special
because they often hide a scaling operation in the machine. Sometimes
you need to be aware of this when, for example, adding an actual byte
offset to a pointer.

There is nothing in the least bit "special" about the fact
that + and - do different things when applied to different types,
and that's what the O.P. found startling.

Eric Sosman · Mar 2, 2014

[...]
There is nothing in the least bit "special" about the fact
that + and - do different things when applied to different types,
and that's what the O.P. found startling.

Sorry: Not the O.P., but (e-mail address removed).

Ken Brody · Mar 3, 2014

It is consistent with this assertion. But the /reason/ is
that C was designed this way. For a possible reason why it
was designed this way, see also:

http://www.purl.org/stefan_ram/pub/zero

We ran some errands today with the kids. We had told them that $LOCATION
was going to be our "first stop". On the way to $LOCATION, we decided to
stop elsewhere beforehand. When the kids complained that they were told
that $LOCATION was to be our "first stop", I told them that $MOM and I are C
programmers, and that we were simply making our "zeroth" stop.

Ken Brody · Mar 3, 2014

In any machine language, the address of the first component
of an array is the address of the array (plus zero).

As I previously noted, it would have been possible to define
a[b] as *(a+b-1), in which case arrays would be origin 1.

The machine doesn't care, in most cases, if you give the right
origin for the array. That is, the address constant doesn't have
to be the address of the first element, though C programmers might
disagree. (At least for the hardware that I know about.)

[...]

Of course, if "a" is a pointer, rather than an array, 1-based subscripts are
less efficient, due to the necessity of an additional subtraction operation.

If for no other reason, zero-based subscripts make more sense because they're
more efficient.

glen herrmannsfeldt · Mar 3, 2014

(snip, then I wrote)

As I previously noted, it would have been possible to define
a[b] as *(a+b-1), in which case arrays would be origin 1.
The machine doesn't care, in most cases, if you give the right
origin for the array. That is, the address constant doesn't have
to be the address of the first element, though C programmers might
disagree. (At least for the hardware that I know about.) [...]

Of course, if "a" is a pointer, rather than an array,
1-based subscripts are less efficient, due to the necessity of
an additional subtraction operation.

If for no other reason, zero-based subscripts make more
sense because they're more efficient.

If "pointer" means a C pointer that can point to either a
scalar value or the first element of an array, and the hardware
has a convenient way to add, but not subtract, an offset,
then I suppose so.

If a pointer is known to point to the beginning of a 1-origin
array, then subtract the appropriate amount when creating it.
(As previously noted, PL/I compilers do this for array
descriptors. PL/I pointers are different.)

Note also that some machines have a convenient way to add a
small constant when referencing data, but not subtract. In that
case, 1-origin wouldn't cost any more, even for a scalar.

-- glen
