gcc: pointer to array

Lawrence Kirby

[Given "int arr[N];" and considering "&arr" vs "&arr[0]"]

It is important to be clear that &arr and &arr[0] are fundamentally
different things. The only sense in which they are equal is that they
point to (different) objects that share the same starting byte, i.e.
(char *)&arr == (char *)&arr[0]. One points at an int, another points at
an array, so saying they are the same makes as much sense as given:

struct {
int a;
long b;
double c;
} s;

saying that &s is the same as &s.a. And before you say anything, it isn't.
:) One is a pointer to an int, the other is a pointer to a struct. That
makes them very different.
I think the real question boils down to whether &arr and &arr[0]
will compare equal under *all* "well-defined" conversions

The point is that they can't be compared directly. Pointer conversions are
a nasty business, once you've converted to a different pointer type you
can't really say that you're comparing the original value.

If you want to say that they point to objects sharing the same starting
byte address then say that, don't say that the pointers are the same. One
property of a pointer doesn't define the whole thing.
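A minimal sketch of the distinction drawn above, assuming an ordinary hosted implementation. The size `N` and the helper names are illustrative and not from the thread. The two pointers agree only after both are converted to `char *`; their pointed-to objects and their pointer arithmetic differ.

```c
#include <assert.h>
#include <stddef.h>

#define N 4   /* illustrative size, not from the thread */

/* Byte distance covered by "+ 1" on a pointer to the whole array. */
ptrdiff_t array_pointer_step(int (*pa)[N])
{
    return (char *)(pa + 1) - (char *)pa;
}

/* Byte distance covered by "+ 1" on a pointer to one element. */
ptrdiff_t element_pointer_step(int *pe)
{
    return (char *)(pe + 1) - (char *)pe;
}

void check_arr_pointers(void)
{
    int arr[N] = {0};

    /* Equal only after both are converted to a common type. */
    assert((char *)&arr == (char *)&arr[0]);

    /* The pointed-to objects differ, and so does pointer arithmetic. */
    assert(sizeof *(&arr) == N * sizeof(int));
    assert(sizeof *(&arr[0]) == sizeof(int));
    assert(array_pointer_step(&arr) == (ptrdiff_t)(N * sizeof(int)));
    assert(element_pointer_step(&arr[0]) == (ptrdiff_t)sizeof(int));
}
```

Calling `check_arr_pointers()` from `main` should pass silently; a direct `&arr == &arr[0]` comparison is left out on purpose, since the operand types are incompatible.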
Is that actually defined by the standard? I remember that in some
pre-standard C compilers arrays were actually implemented as pointers,
so int arr[5]; would actually expand to the equivalent in
pseudo-assembler:

arr: dw &_arr
_arr: dw ?[5]

(The array pointer itself might be declared in a read-only segment.)

Is this sort of expansion actually banned by the standard, or is it
"just not the done thing"?

The compiler can do whatever magic it likes so long as the program behaves
correctly. In standard C an array doesn't define a separate pointer object
so they must behave as if they don't. That goes for dynamically allocated
arrays and arrays of arrays too.
In those compilers the effect would be the same as happens with "array
parameters", taking &arr would get you the address of the pointer.

There is no such pointer object in standard C so that is not a valid
implementation. &arr gives a result of type pointer to array, not pointer
to pointer.
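As a rough illustration of the contrast with "array parameters", here is a sketch with hypothetical function names. Inside a function, a parameter declared `int param[]` really is a pointer object, so `&param` has type `int **`; for a true array, `&arr` has type `int (*)[5]` and no pointer object exists.

```c
#include <assert.h>
#include <stddef.h>

/* "int param[]" is adjusted to "int *param": here a pointer object does exist. */
size_t size_seen_by_callee(int param[])
{
    int **pp = &param;      /* address of the pointer parameter itself */
    return sizeof *pp;      /* size of that pointer, not of the caller's array */
}

size_t size_seen_by_owner(void)
{
    int arr[5] = {0};
    int (*pa)[5] = &arr;    /* pointer to array, not pointer to pointer */
    return sizeof *pa;      /* size of the whole array */
}

void check_sizes(void)
{
    int a[5] = {0};
    assert(size_seen_by_callee(a) == sizeof(int *));
    assert(size_seen_by_owner() == 5 * sizeof(int));
}
```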

Lawrence
 
Maxim S. Shatskih

What do you mean by this? I don't know of any situation where an array of
5 integers and a pointer to int would be indexed differently.

int Array[10][5];
int** Ptr;

Array[2][3] is evaluated as *( Array + 2 * 5 + 3 )
Ptr[2][3] is evaluated as *( *( Ptr + 2 ) + 3 ) - double indirection

So, a[j] is evaluated differently, depending on whether a is a pointer or an
array.
 
Andrey Tarasevich

Maxim said:
What do you mean by this? I don't know of any situation where an array of
5 integers and a pointer to int would be indexed differently.

int Array[10][5];
int** Ptr;

Array[2][3] is evaluated as *( Array + 2 * 5 + 3 )
Ptr[2][3] is evaluated as *( *( Ptr + 2 ) + 3 ) - double indirection

So, a[j] is evaluated differently, depending on whether a is a pointer or an
array.


Well, strictly speaking, the former is also evaluated as

*( *( Array + 2 ) + 3 )

At purely syntactical level, the similarity between the two is still there.

The difference in terms of the "total number of indirections" exists for a
completely different reason. Take, for example, just the inner part - '*(Array +
2)'. There appears to be an indirection here. However, consider how this
expression is really interpreted: 'Array' decays from 'int[10][5]' to 'int
(*)[5]' (i.e. an implicit pointer is introduced), then 2 is added to that
pointer and then the pointer is dereferenced by '*' operator. In practice this
indirection is purely conceptual - we create a temporary pointer and then
almost immediately dereference it. For this reason, there's no need to generate
any code that will implement any actual indirection here.

In case of 'Ptr' the situation is substantially different. The same inner
subexpression - '*(Ptr + 2)' - has a different meaning in this case. It does
indeed require an "explicit" indirection, i.e. the compiler is required to
generate code that will retrieve the pointer value stored in memory location
pointed by 'Ptr + 2'.
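The point can be sketched in code, assuming the declarations from the quoted post plus a hypothetical pointer table `rows` added so that the `int **` case has something to point at. Both expressions have the same shape, but only the second has to fetch a stored pointer.

```c
#include <assert.h>

static int Array[10][5];
static int row2[5];
static int *rows[3] = { 0, 0, row2 };   /* only rows[2] is used below */

/* Array decays to int (*)[5]; the inner '*' names the row in place, no load. */
int *array_element(void)
{
    return *(Array + 2) + 3;            /* same address as &Array[2][3] */
}

/* Ptr + 2 addresses a stored int *, which must be fetched from memory. */
int *pointer_element(int **Ptr)
{
    return *(Ptr + 2) + 3;              /* same address as &Ptr[2][3] */
}

void check_indexing(void)
{
    assert(array_element() == &Array[2][3]);
    assert(pointer_element(rows) == &row2[3]);
}
```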
 
Andrey Tarasevich

Alexei said:
What's wrong in the original case is perfectly clear here. The warning is a
consequence of C language's const-correctness rules when they are applied to
arrays (and it's been mentioned here already).

Simply speaking, in accordance with C language's const correctness rules, you
can perform an implicit 'T* -> const T*' conversion. Often people assume that,
say, 'int (*)[10] -> const int (*)[10]' conversion also belongs to that category
and should be performed implicitly without any warnings (that's what you
assumed, apparently). Unfortunately, this is not the case. In C language it is
not possible to const-qualify an array type as a whole. Period. Any attempts to
const-qualify an array type will actually const-qualify the type of the array
_elements_ not the array type itself. In other words, there's no situation in C
when the 'T* -> const T*' can be applied to pointers-to-arrays, since the proper
destination type cannot ever exist in C.
...

Which means C's types are wrong since the T that is an array is treated
differently from T that is not.

Yes, it is indeed treated differently.
If we define:
typedef int(tArrayType)[5];
then still an object of type tArrayType is pretty much different from say
typedef struct tArrayType {/*some member(s)*/} tArrayType;
In the first case we'll get a warning when passing &T to a function
declaring its argument as "const T*", but we won't get the warning in the
second case.

That's true. "Hiding" an array type behind a typedef-name does not make it
lose its second-class-citizen properties. It doesn't become assignable, it
doesn't become const-qualifiable as a whole. That's just the way it is in C.
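A small sketch of where the `const` ends up, using the `tArrayType` typedef from the post; `sum5` and `check_const_array` are hypothetical names. Under the C90/C99 rules discussed in this thread the conversion from `int (*)[5]` to `const int (*)[5]` is not implicit, so the cast is spelled out here.

```c
#include <assert.h>

typedef int tArrayType[5];

/* const here qualifies the int elements: the parameter is const int (*)[5]. */
int sum5(const tArrayType *p)
{
    int s = 0;
    for (int i = 0; i < 5; i++)
        s += (*p)[i];
    return s;
}

int check_const_array(void)
{
    const tArrayType ca = {1, 2, 3, 4, 5};  /* really an array of 5 const int */
    const int *first = ca;                  /* decays to const int *: const sits on the elements */
    tArrayType a = {1, 2, 3, 4, 5};

    int s1 = sum5(&ca);                     /* types match exactly, no conversion */
    int s2 = sum5((const tArrayType *)&a);  /* cast spelled out for the non-const array */

    assert(*first == 1);
    assert(s1 == s2);
    return s2;
}
```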
 
Netocrat

What do you mean by this? I don't know of any situation where an array
of 5 integers and a pointer to int would be indexed differently.

int Array[10][5];
int** Ptr;

Array[2][3] is evaluated as *( Array + 2 * 5 + 3 ) Ptr[2][3] is evaluated
as *( *( Ptr + 2 ) + 3 ) - double indirection

So, a[j] is evaluated differently, depending on whether a is a pointer
or an array.


Agreed, but your wording implied 1-dimensional arrays, not 2-dimensional
arrays.
 
Alexei A. Frounze

Netocrat said:
On Mon, 11 Jul 2005 09:29:27 +1000, Netocrat wrote:

[D]ouble indirection can easily be used to gain access to the
supposedly protected const type by assigning the middle pointer to a
non-const pointer. There's no way the compiler can detect this, so
disallowing automatic const-conversion for double-indirection
parameters makes sense.

I'm talking nonsense. You can't assign the middle pointer to a
non-const pointer without a cast. But you can use the automatic
conversion to violate the const protection: quoting "Me" in another
thread:

const char c = 'c';
char * pc;
const char ** pcc = & pc ; /* not allowed */
*pcc = & c;
*pc = 'C'; /* would modify a const object if the conversion above were
* allowed */

Correct, but I don't think it's a good example.

I think it's a great example for what it's intended to show. What it
shows is that the const protection you are giving your c variable is
useless if a char ** variable can be automatically cast to const char **.
So really it's showing: if you want to be able to trust that the const
qualifier actually does what it's supposed to do, then automatic
conversions from char ** to const char ** _cannot_ be allowed.

It may not be as useful in explaining how this relates to your case. You
seem to be arguing that an array should be treated as though it were a
simple type, rather than as an indirected type, in this case. Actually I
don't see any problems with doing that. The trick above can't be used
when pc is declared as a char array rather than char pointer, since then
we can't change its base address. So perhaps you could argue something
like this:

Since the type char [] is fundamentally different from char * in that the
base address of the variable it defines cannot be modified, the automatic
conversion from char (*)[] to const char (*)[] is not unsafe, as the
automatic conversion from char ** to const char ** is, and should be
allowed.

The thing about the base address is that I can use any index/offset to get
outside the allowed memory region, be it array or a pointer, I can simply
write:
ptr[123456] = 0; or
arr[-987654] = 1;
whatever. C will let me shoot myself in the foot :) As I showed, gcc doesn't warn
about writing past the array's last element. At least, it did not do so with
the -Wall option, which I normally use.
I don't see any problems with this statement (others may...). But the
point is, regardless of whether this statement is true or not, it seems to
me (and again I must reiterate that I don't know what the standard says on
this) that C does treat these types equivalently for const-casting
purposes.
Simply being able to
modify something by pc and being unable to modify it by ppc is just fine,
so is in the simpler example:
char c = 'A';
char *pc = &c;
const char *pc_ = &c;
*pc = 'B'; // OK
*pc_ = 'B'; // prohibited

Yes this is all OK because the automatic conversion of &c from type
char* to type const char* can not be used in any way to surreptitiously
write to any const-protected values.

This "simpler" example is indeed just that - because it doesn't deal with
double indirection or pointers to array at all - which is the case in
point. If you really want to use a simple example that applies to our
discussion, we could change the character types to character array types
and extend it thus:

char c[5] = "abcde";
char (*pc)[5] = &c;
const char (*pc_)[5] = &c; /* warning - prohibited */
(*pc)[1] = 'B'; /* OK */
(*pc_)[1] = 'B'; /* prohibited */

This illustrates your case better. As I have argued above - to echo
what I understand to be your fundamental argument - there is no semantic
reason for prohibiting the automatic casting of &c from char (*)[5] to
const char (*)[5].
Yep.

Nevertheless in C it is prohibited. C'est la vie.
Sh!t.

Apparently the casting rules are the same as for char ** to const char **
- for which there _is_ a good reason for prohibition. We might not think
it necessary or appropriate, but that's just the way C works in this case.
:(

Another note - // comments are only allowed by the standard in C99, and
since you didn't specifically mention using C99 I presume you are not
using switches on gcc to invoke standards-compliant behaviour at all,
which you really should. Using -ansi -pedantic makes it reject non-C90
compliant code and using -std=c99 -pedantic does the same for C99. -W and
-Wall are good for additional non-standard-specific warnings.

Well, there used to be a lot of nonstandard (not yet standardized) features,
among which // is more or less harmless. I do not know many good compilers
that do not support // but nevertheless are of some good value to the
developers.
But let's not get away from the topic for now.
I think you mean "const-qualified type" rather than "constant value", but I
understand you.

Well, I may not be expressing myself well in terms of C's jargon, but I hope
I write clearly enough to be understood even though I'm not speaking English
natively :)
Neither am I because it doesn't seem to serve any protective purpose,
but as I've tried to explain it seems to be because the array is being
treated equivalently to the case of a pointer, where a protective purpose
_is_ served.

OK, how about:
const char* * const p
?
Would it be better than simply
const char** p
?

Sorry, I may be talking nonsense at the moment... But if the dereferenced
character is protected by const (or is it not?), then I'm not sure of the
problem, the write/assignment should be prohibited.
You can in some simple cases as your example shows, so I don't know why
you're arguing here that you cannot.

I cannot with arrays. :)

....
Now to respond to the message you posted elsewhere:

Netocrat said:
That's true, but a pointer to an object can be treated as an array of
those objects and this is no less true for pointers to arrays - which
can be treated as arrays of arrays. So for index other than 0

ap[index] != (*ap)[index]

In fact accessing ap[1] in this instance is illegal but conceptually it
is an access of the second "5 element integer array" element of ap.

OK, I agree with this kind of double indirection.

The prob is though that we have different semantics for pointer to some
object being not an array and for pointer to some object being an array in
the sense that these two objects are treated differently.

Well that's not a totally accurate statement of the problem, because in
some cases (ie when that "some object" is a pointer) there really _is_
a need for different treatment as "Me"'s example shows. But if you
rewrite that statement as "We have different semantics for pointer to
array and for pointer to any other non-pointer type", then I agree.

Doesn't matter whether you say pointer or array. We both see the same
problem with [const]char** and [const]int(*)[]. So, yes, we do have
different semantics... To me it's more of an inconsistency. C is a great
language, but it really does have some problems and peculiarities someone
should be aware of (say, two ints are multiplied, and the product is assigned
to a long. what a normal human being expects to get isn't something the
experienced programmer knows to get in reality).
There doesn't seem to be a compelling argument for pointer to array being
treated any differently to other non-simple types, such as structs. To
again modify your simple example:

struct s {
int x;
int y;
};

struct s c;
struct s *pc = &c;
const struct s *pc_ = &c; /* no prohibition here as there is when c is an
* array */
pc->x = 1; /* OK */
pc_->x = 2; /* prohibited */

Sure. Now I think what if I typedef or union the array with something else to
hide the array type... Would I still get the warning? Or would I get a
different one? Sure this is something stupid to do, but what the heck, since
we're into it, there are maybe some possibilities :)
As I've explained I believe that's the cause, but I don't know the
standard well enough to confirm this.


For good reason, as the example I quoted shows.

Can you show *the bad* code, where this automatic pointer type conversion
(if considered to be fully allowed) yields something real nasty? I'm getting
lost...
That's a strong statement, and clearly as it applies to char ** conversion
it's inaccurate, but as it applies to treating pointers to arrays
equivalently with double pointers, in this case your opinion that
"something is wrong in C" appears to be at the least a proposition that
could reasonably be argued.
Maybe...


But Alex you're now ignoring the lesson of the original example I quoted.

I'm lost. That's it. :)
You're effectively saying, "the automatic conversion from char ** to
const char ** may have caused problems in Me's example, but it doesn't
in this case so it should be allowed here."

I might have missed that "Me" example you're referring to...
Wait! This one:
const char c = 'c';
char * pc;
const char ** pcc = & pc ; /* not allowed */
*pcc = & c;
*pc = 'C'; /* would modify a const object if the conversion above were
* allowed */
Shoot. I didn't read it with enough attention. I see it now.
But ... what a perverted mind one must have to do something like that!
Certainly, that wasn't something I expected to see nor do myself!!! At least,
not now, having programmed in C for quite some years.
Well and good, but in general
the compiler can't know when such an automatic conversion would cause
problems and when it wouldn't, which is why there is a rule that applies
in _all_ cases.

So in summary:
(a) it appears that there is no way around your problem without using a
cast. The cast doesn't appear to violate any const-protection semantics
as it would in the case where aInt3 was declared int * rather than int[5].
Casts in general are unwise though because they can mask useful (or
required) warnings, so be wary before using one.
(b) the prohibition of an automatic cast in this case does seem according
to const safety semantics to be unnecessary.

A big thanks,
Alex
P.S. where was I when I saw that weird example, huh? :)
 
Netocrat

[Given "int arr[N];" and considering "&arr" vs "&arr[0]"]

It is important to be clear that &arr and &arr[0] are fundamentally
different things. The only sense in which they are equal is that they
point to (different) objects that share the same starting byte, i.e. (char
*)&arr == (char *)&arr[0].

<snip further explanation>

Perhaps you missed Chris Torek's full post; because he made an identical
point to yours which the abbreviation quoted above omits.
I think the real question boils down to whether &arr and &arr[0] will
compare equal under *all* "well-defined" conversions

If you want to say that they point to objects sharing the same starting
byte address then say that, don't say that the pointers are the same.
One property of a pointer doesn't define the whole thing.

The compiler can do whatever magic it likes so long as the program
behaves correctly. In standard C an array doesn't define a separate
pointer object so they must behave as if they don't. That goes for
dynamically allocated arrays and arrays of arrays too.

So to clarify, would you agree with the assertions that given the
declaration int arr[10];

a) &arr and &arr[0] are required by the standard to always point to
objects sharing the same starting byte address

b) the values of &arr and &arr[0], provided that they can be cast to
equivalent pointer types with no loss of information, are required by the
standard to always be equal?
 
Netocrat

On Thu, 14 Jul 2005 04:48:24 +0400, Alexei A. Frounze wrote:

The thing about the base address is that I can use any index/offset to get
outside the allowed memory region, be it array or a pointer, I can simply
write:
ptr[123456] = 0; or
arr[-987654] = 1;
whatever. C will let me shoot myself in the foot :) As I showed, gcc doesn't warn
about writing past the array's last element. At least, it did not do so
with the -Wall option, which I normally use.

I agree that it is reasonable to expect that the -Wall option of gcc
should warn about obvious cases where a constant index exceeds the array
bounds. In the case where you have assigned the array to a pointer it
isn't so obvious and I wouldn't expect it, but it would be a nice bonus.
Probably a tool such as lint would provide this capability.

C is a great
language, but it really does have some problems and peculiarities
someone should be aware of (say, two ints are multiplied, and the product
is assigned to a long. what a normal human being expects to get isn't
something the experienced programmer knows to get in reality).

Agreed - casting requirements aren't immediately obvious as a C beginner.
Once you understand the paradigm they are - except for a few intricacies
like the case in point, which are not straightforward.
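The multiplication pitfall mentioned above can be sketched as follows. Unsigned types are used deliberately so that the wraparound is well defined (with signed `int` the overflow would be undefined behaviour); the function names are hypothetical.

```c
#include <assert.h>

/* The product is computed in unsigned int, then widened: any high bits are already gone. */
unsigned long long product_narrow(unsigned int a, unsigned int b)
{
    return a * b;
}

/* Converting one operand first makes the multiplication itself happen in the wide type. */
unsigned long long product_wide(unsigned int a, unsigned int b)
{
    return (unsigned long long)a * b;
}

void check_products(void)
{
    /* Assumes unsigned int can hold 100000, i.e. is wider than 16 bits. */
    assert(product_wide(100000u, 100000u) == 10000000000ULL);
    /* The narrow result equals the true product reduced to unsigned int. */
    assert(product_narrow(100000u, 100000u) == (unsigned int)10000000000ULL);
}
```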

Sure. Now I think what if I typedef or union the array with something
else to hide the array type... Would I still get the warning? Or would I
get a different one? Sure this is something stupid to do, but what the
heck, since we're into it, there are maybe some possibilities :)

You could place it as the only element of a struct. I don't see any
advantage of a union over a struct. Then if you wanted to you could
typedef the struct so that declaring the array is simpler (many frown on
hiding structs with typedef when it's not strictly necessary with good
reason, but it's personal preference in the end). It's a workaround but
it will achieve what you want; with the proviso that your array must be of
constant size unless you use C99 - otherwise you will need a separate
structure type for each array size.
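A minimal sketch of that workaround, assuming a fixed size of 5; `struct int5` and the function names are made up for illustration. Because the array is wrapped in a struct, the ordinary `T *` to `const T *` conversion applies and no cast is needed at the call.

```c
#include <assert.h>

/* Hypothetical wrapper: the array is the only member of a struct. */
struct int5 {
    int v[5];
};

/* Read-only access through a pointer to const struct. */
int first_plus_last(const struct int5 *p)
{
    return p->v[0] + p->v[4];
}

int wrapper_demo(void)
{
    struct int5 a = {{10, 0, 0, 0, 32}};

    /* struct int5 * converts to const struct int5 * implicitly, no cast. */
    int r = first_plus_last(&a);

    assert(r == 42);
    return r;
}
```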

I might have missed that "Me" example you're referring to... Wait! This one:
[...] Shoot. I didn't read it with enough attention. I see it now. But
... what a perverted mind one must have to do something like that!
Certainly, that wasn't something I expected to see nor do myself!!! At
least, not now, having programmed in C for quite some years.

Yes it's subtle - creatively perverted.

where was I when I saw that weird example, huh? :)

In the same place we go when we read several pages of a novel that
directly afterwards we can't remember anything from.
 
Maxim S. Shatskih

Agreed - casting requirements aren't immediately obvious as a C beginner.

With "const", all is clear. The "const" attribute of the lvalue cannot be
removed without the explicit cast - neither in C nor in C++.
 
Ben Pfaff

Maxim S. Shatskih said:
With "const", all is clear. The "const" attribute of the lvalue cannot be
removed without the explicit cast - neither in C nor in C++.

Sure it can--try calling a function like strchr().
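To make the strchr() remark concrete without depending on any particular library version, here is a sketch using a hypothetical strchr-like helper, `find_char`. It accepts a `const char *` and hands back a plain `char *`, so the caller ends up with a writable pointer without writing a cast. The buffer is deliberately non-const so the final write is well defined.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical strchr-like helper: takes const char *, returns plain char *. */
char *find_char(const char *s, int c)
{
    for (; *s != '\0'; s++)
        if (*s == (char)c)
            return (char *)s;    /* the only cast lives inside the callee */
    return NULL;
}

void find_char_demo(void)
{
    char buf[] = "hello";        /* modifiable storage, so the write below is defined */
    const char *view = buf;      /* the caller only holds a const-qualified view */

    char *p = find_char(view, 'l');   /* no cast at the call site */
    assert(p != NULL);
    *p = 'L';                    /* const has been shed without the caller casting */
    assert(buf[2] == 'L');
}
```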
 
Peter Nilsson

Ben said:
Sure it can--try calling a function like strchr().

Or just...

char *foo();

void bah(const char *x)
{
char *y = foo(x);
...
}

char *foo(char *x)
{
return x;
}
 
Netocrat

Or just...

char *foo();

void bah(const char *x)
{
char *y = foo(x);
...
}

char *foo(char *x)
{
return x;
}

Neither of these examples removes a const attribute from an lvalue, which
is impossible by definition. An lvalue can be assigned to. If it had a
const attribute that was to be removed, you couldn't assign to it and it
wouldn't be an lvalue in the first place.

I believe though that you've correctly interpreted what Maxim meant as
opposed to what he actually said.
 
Maxim S. Shatskih

char *foo();

Oh, sorry. The only language for which I've read the formal description was C++
(old one - circa 1993) and not C. In C++, such things are impossible.
 
Fabio Alemagna

Andrey said:
It is not "indexed in a different way". All arrays, single- or
multi-dimensional, are indexed in exactly the same way.

Nope, they're not. The equivalence stops at one-dimensional arrays.
Multidimensional arrays aren't addressed the same way as
multi-indirectioned (sp?) pointers.

That is to say, int foo[10][20] doesn't decay to int **.

Multi-dimensional arrays store all of their elements consecutively, the
single elements are accessed by means of pointer math on the dimensions
and size of the elements, whilst multi-indirectioned pointers are simply
pointers (to pointers... to pointers) to the elements:


int foo[3][3] :

+--+--+--+
| | | |
+--+--+--+
| | | |
+--+--+--+
| | | |
+--+--+--+

int **baz :

+--+--+--+ ...
| | | | ...
+--+--+--+ ...
|| || ||
\/ \/ \/
+--+--+--+
| + + |
+--+--+--+
| + + |
+--+--+--+
| + + |
+--+--+--+
...........
...........
...........
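The two pictures can be checked in code. This is only a sketch: `foo` follows the diagram above, while the row arrays and `table` are hypothetical stand-ins for whatever `baz` would point at.

```c
#include <assert.h>
#include <stddef.h>

static int foo[3][3];

/* One contiguous object: nine ints, rows back to back. */
void check_array_layout(void)
{
    int (*row)[3] = foo;         /* foo decays to int (*)[3], not int ** */

    assert(sizeof foo == 9 * sizeof(int));
    assert((char *)(row + 1) - (char *)row == (ptrdiff_t)(3 * sizeof(int)));
    assert((char *)&foo[1][0] == (char *)&foo[0][0] + 3 * sizeof(int));
}

/* Pointer table: each entry is a stored address; rows need not be adjacent. */
void check_pointer_table(void)
{
    static int r0[3], r1[3], r2[3];
    int *table[3] = { r2, r0, r1 };   /* any order works */
    int **baz = table;

    baz[0][1] = 7;                    /* load table[0], then index that row */
    assert(r2[1] == 7);
    assert(sizeof table == 3 * sizeof(int *));
}
```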
 
Netocrat

Oh, sorry. The only language for which I've read the formal description
was C++ (old one - circa 1993) and not C. In C++, such things are
impossible.

Actually, assuming that by "lvalue" you meant "expression", what you
said was correct anyway. You said that the const attribute cannot be
removed from an expression without an explicit cast, which is true.

Looking again at the examples given by Ben Pfaff and Peter Nilsson,
they're showing the const attribute being added, rather than removed,
without an explicit cast.
 
Andrey Tarasevich

Fabio said:
Nope, they're not.

Yes, they are.
The equivalence stops at one-dimensional arrays.
Multidimensional arrays aren't addressed the same way as
multi-indirectioned (sp?) pointers.

That's true. But in my message I'm talking about the difference between
the way single- and multi-dimensional arrays are addressed. No pointers
involved (multi-indirectioned or not). You for some reason start talking
about pointers. Why?
That is to say, int foo[10][20] doesn't decay to int **.

That's true. But, once again, how is this relevant?
[skipped]

Once again, true. But I still don't see how all this applies to what I
said in my message.
 
Andrey Tarasevich

Netocrat said:
...
Neither of these examples removes a const attribute from an lvalue, which
is impossible by definition. An lvalue can be assigned to.

Huh? No. By definition, lvalue is something that has address in storage. In
general case lvalue cannot be assigned to. Only _modifiable_ lvalue can be
assigned to, but there are also non-modifiable lvalues.
If it had a
const attribute that was to be removed, you couldn't assign to it and it
wouldn't be an lvalue in the first place.

Not true. You seem to assume that assignability is a defining property of
"lvalueness". That's simply not true.
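A short sketch of the distinction, with illustrative variable names: all three objects below are lvalues (each can be the operand of unary `&`), but only one of them is a modifiable lvalue.

```c
#include <assert.h>

void check_lvalues(void)
{
    const int limit = 10;     /* lvalue, but not modifiable */
    int arr[3] = {1, 2, 3};   /* lvalue of array type, also not modifiable */
    int n = 0;                /* modifiable lvalue */

    /* Unary & requires an lvalue operand; all three qualify. */
    const int *pl = &limit;
    int (*pa)[3] = &arr;
    int *pn = &n;

    /* Only the modifiable lvalue may appear on the left of '='.
       "limit = 11;" and "arr = arr;" would both be rejected by the compiler. */
    n = limit + (*pa)[2];

    assert(*pl == 10);
    assert(*pn == 13);
}
```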
 
Netocrat

Huh? No. By definition, lvalue is something that has address in storage. In
general case lvalue cannot be assigned to. Only _modifiable_ lvalue can be
assigned to, but there are also non-modifiable lvalues.

Right, my concept of lvalue was slightly out. Non-modifiable being for
example arrays and const-qualified objects.

Maxim's original statement then is accurate:
With "const", all is clear. The "const" attribute of the lvalue cannot
be removed without the explicit cast - neither in C nor in C++.

Not true. You seem to assume that assignability is a defining property of
"lvalueness". That's simply not true.

On reading the standard I see that you are correct.
 
CBFalconer

Fabio said:
Andrey said:
It is not "indexed in a different way". All arrays, single- or
multi-dimensional, are indexed in exactly the same way.

Nope, they're not. The equivalence stops at one-dimensional arrays.
Multidimensional arrays aren't addressed the same way as
multi-indirectioned (sp?) pointers.

That is to say, int foo[10][20] doesn't decay to int **.

Which is because there are no multi-dimensioned arrays in C, there
are just arrays of arrays. An array of pointers is not a
multidimensioned array, or even a fake of one linearized.
 
S.Tobias

In comp.lang.c Peter Nilsson said:
char *foo();

void bah(const char *x)
{
char *y = foo(x);

You raise UB here: (const char*) and (char*) types are
not compatible - you cannot pass incompatible arguments
to a function call (cf. 6.5.2.2#6).
 
