Thomas David Rivers
Paul said:
To return to that, do you agree that in this declaration:
int arr[5];
`arr' is the name of the contiguous set of 5 `int' elements?
And, to go to the next step, would you agree that in this
declaration:
int i;
`i' is the name of single `int'?
Yes I agree, you may continue.
So - we should then consider something fairly important... what does
"name" mean?
I would ask that we consider the 'name' of something to be the
designation of the location that contains that 'something'. And, I
want to motivate that definition with the following scenario.
If we can imagine that variables, in the C/C++ sense, are simply
"boxes" that contain their values - like shoeboxes on a shelf,
then the "name" associated with the variable is simply the name
associated with a "box." Let's imagine that this shelf can contain
many boxes, and that the boxes are numbered... box #0, box #1, box
#2..
A box can be referred to by its number - its location on the shelf,
or if
we associate a name with a box #; we can use the name. For example,
box #51 might have the name `i' associated with it.
Thus, if we consider this snippet of C code:
int i;
i = 5;
the semantics of that would be something similar to:
define a box named 'i', allocate space for it, let's say that is
box #51, and associate location #51 with the name 'i'.
produce the manifest `int' constant 5
Find the box with the name 'i', oh - it's box #51.
store the manifest constant 5 into box #51.
Similarly, the snippet:
int i,j;
i = j;
might be realized in this way:
define a box named 'i' that can contain int-typed objects, say,
box #51.
define a box named 'j' that can contain int-typed objects, say,
box #52.
Given the name 'j'; find its box - oh - it's box #52.
Retrieve (fetch) the value from box #52.
Given the name 'i'; find its box - oh - it's box #51.
Store the previously fetched value into box #51.
So far, I've only talked about boxes that can contain `int'-typed
values.
Let's introduce a new box, and say that it can contain the location
(the box #)
of another box, and the other box can contain `int'-typed values.
We would
call this new box a "pointer" to an `int'-typed box, because it
doesn't contain
the `int'-type value itself, but instead tells us the location (the
box #) of the
value.
Now - let's consider this:
int *ip;
int i;
ip = &i;
in this mindset - what does this mean? To discuss this, we need to
define
the &-operator. Let's say that the &-operator is named 'address of'
and that
it returns the location (the box #) of its operand. Say, for
example, if `i'
happened to be box #51, then the expression "&i" would produce
location #51.
Remember - I can have two kinds of boxes on my shelf, boxes that
contain
`int'-typed values, and boxes that contain the location of other
`int'-typed boxes.
So, &i doesn't produce an `int'-typed value, it produces a
location. In this
example, it produced #51. (I have been careful to use '#' to
indicate location numbers
instead of simple `int'-typed values.. this distinction will become
important.)
So - the semantics of that statement would be (if I may be so bold
as to be a little loose on the details):
define a box named 'ip' - it can contain the location of other
`int'-typed boxes,
let's say it is box #52.
define a box named 'i' - it can contain `int' values, let's say
it's box #51.
Apply the &-operator to the box named `i' - oh - i is box #51, so
&i produces
the location #51.
Look up the name `ip' - oh - that's box #52.
Store our previously found location (#51) into box #52.
OK - now - let's say that I happen to buy my boxes from a
manufacturer that can only make two kinds of boxes: those boxes that
can contain `int' elements, and those boxes that can contain the
location of other boxes that contain `int' elements (that is, `int'
boxes and `int *' boxes).
So, let's look at this:
int *ip;
int arr[5];
What can this do? I propose it does the following:
define a box named 'ip', that box is allowed to contain the
location of
other `int'-type boxes; let's say that is box #52.
define a space of 5 boxes, each of these boxes is on the shelf
starting
at, say, location #53. Each of these boxes is an `int'-typed
box. The
boxes are contiguous, starting at location #53 and going thru
location
#57 (inclusive.) Since `arr' (at the C/C++ level) is a
single object,
the location associated with `arr' is #53 (the start of the 5
contiguous
boxes on the shelf.)
That is, in this scenario, the location of 'ip' is box #52, the
location of 'arr'
is box #53. `ip' uses one box, `arr' uses 5 boxes.
With this, we can provide a definition of "name".... the "name" of
the variable
provides a mechanism for finding precisely which "box" (the box #,
or location
of the box) houses the value of the variable. The "name" is
nothing more than
a convenience for mapping the location of the variable.
So - in our example, the name 'ip' is associated with box #52, the
name 'arr'
is associated with box #53.
Before I go on, I want to check to see if you agree with everything
so far...
I agree with you that the "name" provides a mechanism for mapping
the location but....
The names 'i', 'ip' and 'arr' have different name mechanisms.
The name 'i' refers to box#51, when used in an expression it
evaluates to whatever is contained in box#51.
The name 'ip' refers to box#52, it evaluates to whatever is
contained in box#52.
The name 'arr' refers to box#53.... but it does not evaluate to the
contents of box#53. It evaluates to the address of box#53.
I left all of the context above, so we have something to refer to....
And - yes - you are absolutely right... the name 'arr' refers to
the start of
the array... that is; as you correctly point out above; 'arr' would
refer to box #53.
To echo your statement, the name 'arr' does not refer to the contents of
boxes #53 thru #57 - it, in many contexts, refers to the location #53.
Similarly, the name 'ip' denotes box #52... but, as you correctly
point out,
the context in which the name is referenced is key to understanding
what is
happening.
For example, in this snippet:
int i, j;
i = j;
On the right-hand-side of the assignment operator, we find 'j' - that
refers
to the contents of the box at the location associated with the name
'j'. In C, we call
that an "rvalue". On the left-hand-side, we find 'i'. That refers
to the location of 'i',
the location used for storing the value computed on the
right-hand-side. In C, we call that an "lvalue". Note that, in both
instances, the names do refer to the location
of the boxes - it is simply what is done with that location that
changes. In the rvalue
context, we fetch the contents of the box, on the lvalue side, we
store the value into
the box.
(Note that 'j' in the assignment expression is first an "lvalue" and
is converted
into an "rvalue" by the rules of the language. It is this treatment
that indicates
that the value is to be retrieved from the box.)
As you mentioned above, the context is critical to understanding what
is going
on. I commend you on recognizing that distinction.
Thus - we now need to consider what it means in C/C++ when we encounter
an expression using the name of an array.
The way this is encapsulated in the C/C++ language is the rule regarding
conversion of an lvalue with an array type to a pointer to the first
element
of the array.
So - if we now consider this:
int *ip;
int arr[5];
ip = arr;
What would happen in our "shoebox" world? If we use your absolutely
correct
observation, then it would be this:
define a box named "ip", say - box #52 - that box can contain the
location of
other `int'-typed boxes
define a space of 5 boxes, beginning at box #53 thru box #57;
name that
area 'arr' by associating 'arr' with box #53
Now - how do we evaluate the expression "arr" in the context of the
assignment statement? Well, according to the C language rules, when
you discover an lvalue that has a type 'array'; you convert that into
the address of its first element.
Thus, we can say that:
ip = arr;
can be evaluated as:
Look up the name 'arr' - oh, it is an array of 5 `int'-typed
boxes. According
to the rules, in this context, we should produce the location
of the
first box of the array, that would be location #53.
Look up the name 'ip' - oh - that is a box (at location #52) that
can contain the location
of other `int'-typed boxes.
Store the previously found location (#53) into box #52.
As you had correctly foreseen, we are able to evaluate this using the
rules
of the programming language.
Note that there wasn't an "extra" shoebox allocated/needed to produce
the result.
Note that the 'name' did not mean some other space was allocated...
it simply
refers to the location, as we have previously shown/agreed to.
This behaviour is different to other C++ names.
For example given the name of an int or an int*:
int x=7;
int* p = &x;
When we use the name of an object in C++ we expect the object to
be addressed and the value accessed, whether it be a read or write
operation.
With an array the situation is different: when we use the name of an
array we do not expect the array to be addressed; instead we expect a
pointer to the array:
int* q = arr;
The identifier or name 'arr' refers to an array of objects yet somehow
its value is an address , not the same value as any of the objects it
refers to.
In C++
You are correct.. the reason for your correctness has been detailed
above.
For arrays, there is a "special" rule.. this rule is only for arrays:
when the array name is used in an expression (other than as the
operand of sizeof or unary &), it does not designate the entire
array; it designates the address of the first element.
I am glad that you have reached this understanding, and that you have
expressed your agreement!
That is, when you reference an array name, it does not denote the entire
array; nor is there any other special object that is created to point to the
array - it is simply that the result of the reference is the address of the
first element (which you pointed out above.)
Now - given that understanding, I suggest that given this
declaration:
int arr[5];
that the expression
(&arr[0])
is also the address of the first element of 'arr' - do you not agree?
- Dave Rivers -