array indexing anecdote

James Kuyper

IMO, if something treats something as something else, which allows the
something else to become the something, understanding is at fault.

Thus understanding -> "understanding".

Could you identify the second "something" and the "something else" that
you're referring to? C treats array[index] and index[array] as two
different but equivalent ways of expressing the same concept. It doesn't
treat "index" as if it were "array", which would indicate a lack of
understanding - it simply doesn't care which one comes first. This is a
natural consequence of C's definition of how the subscript operator is
interpreted, which is, in turn, a natural consequence of C's heavy
obsession with pointers.
 
Helmut Tessarek

Nice target language you have!

As mentioned before, this was not my example. Also the explanation was not
mine (which I also mentioned in an update to one of my posts).

Anyway, I found this example and explanation in a book whilst browsing through
it at a friend's place. It was 'The C Book, second edition by Mike Banahan,
Declan Brady and Mark Doran', published by Addison Wesley in 1991.

As I mentioned in my first post, I forgot about it, which implied that I knew
about it at some point and did not need an explanation.

I posted it as an anecdote (hence the subject) and/or amusement to newbies who
read the posts in this newsgroup. That's all.

--
Helmut K. C. Tessarek

/*
Thou shalt not follow the NULL pointer for chaos and madness
await thee at its end.
*/
 
Keith Thompson

Eric Sosman said:
My C compiler translates programs with arrays just fine,
exactly as it is supposed to do this according to C.

I didn't say the compiler didn't compile the program. It does, which is the
example.

As far as the compiler is concerned, an expression like x[n] is translated
into *(x+n) and use made of the fact that an array name is converted into a
pointer to the array's first element whenever the name occurs in an
expression. That's why, amongst other things, array elements count from zero:
if x is an array name, then in an expression, x is equivalent to &x[0], i.e. a
pointer to the first element of the array. So, since *(&x[0]) uses the pointer
to get to x[0], *(&x[0] + 5) is the same as *(x + 5) which is the same as
x[5]. A curiosity springs out of all this. If x[5] is translated into *(x +
5), and the expression x + 5 gives the same result as 5 + x (it does), then
5[x] should give the identical result to x[5]!

Yes, yes, we know this. We've known it for years and years.
Thirty-six years, to be precise: You'll find it on page 94 of the
"The C Programming Language" by Brian Kernighan and Dennis Ritchie,
published in 1978.

Well, almost. That page says that a[i] is by definition equivalent to
(*a+i), but it doesn't explicitly follow that to the conclusion that
a[i] is equivalent to i[a].

Page 210 does say that:

Therefore, despite its asymmetric appearance, subscripting is a
commutative operation.

though no examples are given.

But it needn't have been defined that way. The "+" operator
*could* have been defined so it can take a pointer as its left
operand and an integer as its right operand, but not vice versa.
The result would have been a consistent and usable language whose
only difference from C is that a few obscure expressions would be
invalid (and could trivially be modified to be valid).

[...]
 
Stefan Ram

Keith Thompson said:
Well, almost. That page says that a[i] is by definition equivalent to
(*a+i),


We the author's of that page can't get the parentheses right
in such a simple case, I would not bother to continue reading.
 
Stefan Ram

Supersedes: <[email protected]>
["We the author's"->"When the authors"]

Keith Thompson said:
Well, almost. That page says that a[i] is by definition equivalent to
(*a+i),


When the authors of that page can't get the parentheses right
in such a simple case, I would not bother to continue reading.
 
Kaz Kylheku

Ok, I give up. Tell me a better word then.

Intermediate English generation:

"Doesn't understand array indexing to be something different from
^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
displaced dereference"

Optimization pass:

"Doesn't distinguish array indexing as from displaced dereference"
^^^^^^^^^^^ ^^^^

:)
Also, did you notice the quotes around the word? They can be interpreted as:
sort of <word-in-quotes> or for the lack of a better word.

http://en.wikipedia.org/wiki/Scare_quotes
 
glen herrmannsfeldt

(snip, someone wrote)
As far as the compiler is concerned, an expression like x[n] is
translated into *(x+n) and use made of the fact that an array
name is converted into a pointer to the array's first element
whenever the name occurs in an expression.
(snip)

Well, almost. That page says that a[i] is by definition equivalent to
(*a+i), but it doesn't explicitly follow that to the conclusion that
a[i] is equivalent to i[a].

Page 210 does say that:
Therefore, despite its asymmetric appearance, subscripting is a
commutative operation.
though no examples are given.
But it needn't have been defined that way. The "+" operator
*could* have been defined so it can take a pointer as its left
operand and an integer as its right operand, but not vice versa.
The result would have been a consistent and usable language whose
only difference from C is that a few obscure expressions would be
invalid (and could trivially be modified to be valid).

A non-commutative + reminds me of the non-commutative multiply
in some Cray processors. But yes, a non-commutative pointer
addition could have been done. It also reminds me of the
non-commutative + string concatenation operator in Java.
(But I still think they should have used a different operator
for the operation.)

To continue what they could have done, note that C allows
pointer-integer but not integer-pointer. Maybe that one seems
obvious, but note that the OS/360 assembler knows how to do it,
(that is, absolute-relocatable) and also to add two pointers
(relocatable+relocatable). (Conveniently, the assembler doesn't
scale by the size of an object. That would really complicate
allowing those combinations in C.)

But OK, commutative pointer+integer addition allows C to be C.
That is, to be special, and not just any other language.

-- glen
 
Jorgen Grahn

Jorgen Grahn said:
int ar[ARSZ], i;
for(i = 0; i < ARSZ; i++){
ar[i] = i;
i[ar]++;
...

After over 2 decades of programming in C, I totally forgot about it.

Because it's not something you encounter in the wild. People pull all
kinds of crazy stunts, but for some reason not this one.


Yes, this one too. See, for example, David Korn's entry in the 1987
International Obfuscated C Code Contest (ioccc.org):

main() { printf(&unix["\021%six\012\0"],(unix)["have"]+"fun"-0x60);}

(It depends on the compiler to predefine the macro "unix" to 1.
The output is "unix", but for reasons having nothing to do with
the spelling of the macro name.)


In terms of that metaphor, the IOCCC is a zoo, not "in the wild".


Yes. I would have explicitly excluded the IOCCC, but I didn't remember
its abbreviation and was too lazy to look it up ...

/Jorgen
 
Ben Bacarisse

glen herrmannsfeldt said:
Is C the first language that only allows for array indexing
starting at zero?

Both BCPL and B have/had zero-based indexing.

<snip>
 
glen herrmannsfeldt

Both BCPL and B have/had zero-based indexing.

OK, but those are ancestors of C.

Any that are not directly related?

PL/I was the first I knew that allowed one to select the
lower bound, and later Pascal and Fortran 77 also did.

Given that mathematics like to start indexing at 1, though,
forcing 0 is breaking from tradition.

-- glen
 
Kaz Kylheku

(snip)
As far as the compiler is concerned, an expression like x[n] is
translated into *(x+n) and use made of the fact that an array name
is converted into a pointer to the array's first element whenever
the name occurs in an expression. That's why, amongst other
things, array elements count from zero:

Hmm. If you converted x[n] into *(x+n-1) then arrays would count
from 1, like some other languages. Those who like arrays from zero
could always write *(x+n) instead...

Is C the first language that only allows for array indexing
starting at zero?

Lisp vectors and n-dimensional arrays go from zero. (There are also displaced
arrays that refer to other arrays).

How far back does that zero-based indexing go? At least as far back as Lisp 1.5, 1962.
The manual describes support for arrays that doesn't resemble the modern ones:

http://www.softwarepreservation.org/projects/LISP/book/LISP 1.5 Programmers Manual.pdf

"Indices range from 0 to n-1." (P. 27 "The Array Feature")

I do not see any such thing in the Lisp 1 manual (1960).
 
Stefan Ram

Ben Bacarisse said:
Both BCPL and B have/had zero-based indexing.

In any machine language, the address of the first component
of an array is the address of the array (plus zero).
 
BartC

Hmm. If you converted x[n] into *(x+n-1) then arrays would count
from 1, like some other languages. Those who like arrays from zero
could always write *(x+n) instead...

Well, writing x[n-1] is a bit like counting from 1 too! Except it's
obviously counting from 0.

You can't say a language is 1-based unless you can port a 1-based algorithm
to it without messing with the indexing or the bounds.
(Not counting assembler programs where
the user computes the indexing.)

Those use the natural zero-base of arrays and offsets.

Given that mathematics like to start indexing at 1, though,
forcing 0 is breaking from tradition.

I don't know why this 0- and 1-based business is such a big deal. In all the
languages I've ever created, I've generally allowed both, but the default
base was usually 1. Both are useful.

(Actually I usually allow any lower bound, but anything other than 0 or 1 is
rare.)

But if the choice has to be only 0 or only 1, then 0 is a better bet
(because, with an extra element allocated, you can just ignore the 0th
element and index from 1).
 
Kaz Kylheku

OK, but those are ancestors of C.

Any that are not directly related?

PL/I was the first I knew that allowed one to select the
lower bound, and later Pascal and Fortran 77 also did.

Given that mathematics like to start indexing at 1, though,
forcing 0 is breaking from tradition.

Mathematics does not "like" to start indexing at 1.

There is no such "tradition".

It depends on the situation.

For instance, I see plenty of both zero and one based indexing here:

http://en.wikipedia.org/wiki/Series_(mathematics)
 
Jorgen Grahn

That's why I said in my original post that I came across this again after a
long, long time and that I have forgotten about it.

It was not meant to be the reason for an endless discussion.

Hey, this is comp.lang.c -- /anything/ can cause an endless
discussion ...

FWIW, I was mildly amused by your first posting. I had
forgotten about that little C curiosity.

/Jorgen
 
glen herrmannsfeldt

In any machine language, the address of the first component
of an array is the address of the array (plus zero).

As I previously noted, it would have been possible to define
a[b] as *(a+b-1), in which case arrays would be origin 1.

The machine doesn't care, in most cases, if you give the right
origin for the array. That is, the address constant doesn't have
to be the address of the first element, though C programmers might
disagree. (At least for the hardware that I know about.)

The array descriptors used by IBM PL/I compilers store the
address of array element with all subscripts zero, even if it
isn't inside the array. Once you do that, you can easily find
any array element.

The VAX/VMS array descriptor includes both virtual and physical
origin, allowing for either PL/I or Fortran array argument passing.

-- glen
 
Stefan Ram

glen herrmannsfeldt said:
As I previously noted, it would have been possible to define
a[b] as *(a+b-1), in which case arrays would be origin 1.


int a_[ 10 ], *a = a_ - 1;

Now, you can use a[ 1 ] up to a[ 10 ], but a[ 0 ] would be
an error. Disclaimer: The evaluation of »a_ - 1« has
undefined behavior.
 
jononanon

#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>

int main(void)
{
    int arr[] = {1, 2, 3, 4};
    int *p0 = arr;
    int *p3 = p0 + 3;

    /* Pointer differences have type ptrdiff_t, so print them with %td. */
    printf("%td\n", p3 - p0); // 3
    printf("%td\n", &*p3 - &*p0); // 3
    /* Arithmetic on void * is a GNU extension, not standard C. */
    printf("%td\n", (void *) p3 - (void *) p0); // 12 when sizeof(int) == 4
    printf("%td\n", (char *) p3 - (char *) p0); // 12 when sizeof(int) == 4

    return EXIT_SUCCESS;
}

In other words: a pointer is *NOT* just an address value to which an integer is added in an UNSCALED manner.

Rather the integer is scaled
(int *)p + 3 <<-->> (int *)(p + 3*sizeof(int))
 
jononanon

Rather the integer is scaled
(int *)p + 3 <<-->> (int *)(p + 3*sizeof(int))

Ah sorry, rather like this:
(int *)p + 3 <<-->> (int *)((char*)p + 3*sizeof(int))
 
jononanon

(int *)p + 3 <<-->> (int *)((char*)p + 3*sizeof(int))

In other words: plus is not plus... but depends on the types involved in plus.
 
