Address syntax

Adam Warner

Hi Andrey Tarasevich,
But frankly, I don't see where Adam says that this function returns an
existing object (I did re-read his follow-up message you mention). All
the time I was assuming that every invocation of 'l_SYM_2B' is supposed
to build and return a new instance of 'struct v1'. I could be wrong, of
course.

You assumed right. `l_SYM_2B' built and returned a new instance of
`struct v1'. Now the caller sets up the return object or objects and
passes its address to each function (which returns void).
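The two calling styles described above can be sketched as follows. This is only an illustration: the real definitions of `struct t1' and `struct v1' never appear in the thread, so the ones below, and the names make_by_value, make_into and styles_agree, are hypothetical stand-ins.

```c
#include <assert.h>

/* Hypothetical stand-ins; the thread does not show the real types. */
struct t1 { int n; };
struct v1 { struct t1 o[1]; };

/* Old style: build a new 'struct v1' and return it by value. */
static struct v1 make_by_value(const struct t1 *a)
{
    struct v1 r;
    assert(a);
    r.o[0] = *a;
    return r;
}

/* New style: the caller owns the result object and passes its address. */
static void make_into(struct v1 *out, const struct t1 *a)
{
    assert(out && a);
    out->o[0] = *a;
}

/* Returns 1 when both styles produce the same result. */
static int styles_agree(int n)
{
    struct t1 a = { n };
    struct v1 by_value = make_by_value(&a);
    struct v1 by_pointer;

    make_into(&by_pointer, &a);
    /* by_pointer is a named object, so &by_pointer.o[0] is valid C. */
    return by_value.o[0].n == (&by_pointer.o[0])->n;
}
```

With the second style the result always lives in an object the caller declared, so taking the address of one of its elements raises none of the lvalue questions discussed in this thread.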

Regards,
Adam
 
Dietmar Schindler

Andrey said:
...
But frankly, I don't see where Adam says that this function returns an
existing object (I did re-read his follow-up message you mention). All

As I tried to express before, Adam didn't say that in his original
message, but neither did he say anything to the contrary, so at that
time both assumptions were possible.
the time I was assuming that every invocation of 'l_SYM_2B' is supposed
to build and return a new instance of 'struct v1'. I could be wrong, of
course.

Now we know that you were right.
You, too, seem to suspect that forming of the new 'struct v1' value from
'a' can be optimized in certain cases (which, as we know now, are
probably different from the matter in hand). May I assume that you do
see how passing a pointer to a to-be-filled structure can impede an
optimization in such cases, and there is no need to elaborate on that?
...

I'm not saying that it can't impede optimization, but at the same time
I'd be glad to see a [simplified] example that would illustrate the
particular optimization you have in mind.

If you are talking about the above situation with 'l_SYM_2B' always
returning "an existing object", then, of course, it can be optimized.
However, I can only see how it can be done manually, and somehow I'm not
sure that we can realistically expect this kind of optimization from the
compiler.

I have in mind an implementation where functions which are defined to
return a struct are compiled to code that just returns an address (of an
existing struct object or of a statically allocated return area) and
where it is left to the calling code to copy the struct at that address
into another object, if needed. Since the calling code here has to do
the copying, it can be optimized away in cases where no copy is needed.

But I must admit that I know of no real compiler doing this, so you may
again be right by not expecting this kind of optimization.
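Since no real compiler is known to work this way, the scheme can only be imitated by hand. The sketch below does that under stated assumptions: the struct definitions and the names make_v1, read_only_use and copy_then_call_again are hypothetical, and the static return area is written out explicitly instead of being generated by a compiler.

```c
#include <assert.h>

struct t1 { int n; };            /* hypothetical element type */
struct v1 { struct t1 o[1]; };   /* hypothetical struct with an array member */

/* Hand-written imitation of the scheme: fill a statically allocated
   return area and hand back its address instead of the struct. */
static const struct v1 *make_v1(const struct t1 *a)
{
    static struct v1 return_area;
    assert(a);
    return_area.o[0] = *a;
    return &return_area;
}

/* A caller that only reads one field needs no copy of the struct. */
static int read_only_use(int n)
{
    struct t1 a = { n };
    return make_v1(&a)->o[0].n;
}

/* A caller that needs the value to survive a second call must copy it. */
static int copy_then_call_again(int first, int second)
{
    struct t1 a = { first }, b = { second };
    struct v1 kept = *make_v1(&a);   /* explicit copy made by the caller */

    (void)make_v1(&b);               /* overwrites the return area */
    return kept.o[0].n;              /* still 'first' */
}
```

The obvious cost of a static return area is that the function is no longer reentrant, which may be one reason the approach is not common.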

Best regards,
Dietmar Schindler
 
Giorgos Keramidas

Andrey Tarasevich said:
Adam said:
print_aesthetic(&(l_SYM_2B(&a).o[0]));

I discovered that the only way to get GCC 3.4 to accept this syntax was to
append the -std=c99 compiler option. Otherwise GCC produced the error
message "invalid operands to binary +", i.e. it appeared to be trying to
interpret the address operator as a binary AND.

Is this a bug in GCC's pre-C99 support or has the syntax of C been changed
in C99 to support the code I wrote above? If so, thank goodness!

AFAIK the code is supposed to be ill-formed in both C89/C90 and C99
because you are trying to apply the address-of operator to a non-lvalue
argument. (Although I couldn't find an exact place in either document
that would explicitly state that a function return value is not an lvalue
or prohibit this application of '&' in some other way. Anyone?).

The calling conventions of the underlying architecture may require that
at least some function arguments are passed in registers. Registers
don't necessarily work with the address-of operator, so foo(&bar()) may
be undefined.

ISO/IEC 9899:1999, p. 78, § 6.5.3.2 (Unary operators // Address and
indirection operators), constraint 1 refers only to expressions that
use the 'register' keyword.

p. 71, § 6.5.2.2 (4) says that "An argument may be an expression of any
object type", but I'm not sure if this means that anything can be
assumed about the storage of argument values before the function call
happens.
 
Chris Torek

(I realize these are a bit old, but I have been away for some time...)

[Given a function l_SYM_2B() that returns a struct that contains
an array...]
Adam said:
The (l_SYM_2B(&a).o[0]) operand appears to be the result of a [] operator.
So the result is evaluated as if the & operator were removed and the []
operator were changed to a + operator.

I found one key difference between C89/C90 and C99 that must be at
play here.

The property of "being a non-lvalue" propagates from the left side of
'.' operator to its result in both C89/90 and C99. This means that the
following expression

l_SYM_2B(&a).o

is not an lvalue in either C89/90 or C99.

Now, the key moment: in C89/C90 the array-to-pointer conversion was only
applicable to _lvalues_ of array type (see 6.2.2.1). In C99 this
limitation was removed (see 6.3.2.1/3), and now the array-to-pointer
conversion is applicable to any values of array type.

This means that the following expression

l_SYM_2B(&a).o[0]

is ill-formed in C89/90 (there's no way to apply operator '[]' to the
result of previous expression) and well-formed in C99.

Right.

This is where "The Rule" is illustrative. I always say that The
Rule describes how array objects are treated in a value context:

In a value context, an object of type "array N of T" is converted
to a value of type "pointer to T", pointing to the first element
of that array, i.e., the one with subscript 0.

When we apply The Rule in typical C code, we find that arrays never
survive to the "value" stage: arrays are only ever objects, and
when you attempt to do anything with their value, you get instead
a pointer value, stripping off the "array N of" part.
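A small sketch of The Rule in action, and of the fact that sizeof is not a value context. The function names here are made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Receives what The Rule produced: a pointer to the first element. */
static size_t size_seen_by_callee(int *p)
{
    return sizeof p;   /* size of a pointer, not of the caller's array */
}

static int rule_demo(void)
{
    int arr[10] = { 0 };
    int *p = arr;      /* value context: arr becomes &arr[0] */

    assert(p == &arr[0]);
    assert(sizeof arr == 10 * sizeof(int));   /* sizeof: no conversion */
    assert(size_seen_by_callee(arr) == sizeof(int *));

    p[3] = 42;         /* writes through to the original array object */
    return arr[3];
}
```

The "array N of" part survives only where the array is used as an object (the operand of sizeof or of unary &); everywhere else only the pointer is left.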

Functions that return a "struct" that contains an array create a
problem. A function's return value is, by definition, a value,
not an object.

C handles values of type "struct S" a bit clumsily, but
consistently: they exist, you can put them on the right hand side
of assignments, you can pass them to functions, and so on:

struct S a, b;
struct T x;
...
a = b; /* valid, copies all the fields of b to a */
x = b; /* ERROR: type mismatch; diagnostic required */
f(b); /* valid, passes the value of the entire struct */

And of course, you can return them from functions as well:

struct S sfunc(void);
b = sfunc();

(Many compilers actually implement this by passing a "secret"
argument to the function, &b in this case, to let the function fill
in the provided struct -- but this is an implementation detail.
Note that in this case, if the return value is discarded, the
compiler has to pass a pointer anyway. Typically this is a pointer
to a compiler-created temporary that is then discarded.)
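Written out by hand, the "secret argument" transformation might look roughly like the sketch below. It is an assumption about what one such compiler could generate, not a description of any particular one; sfunc_lowered, hidden_result and discard_temp are invented names.

```c
#include <assert.h>

struct S { int i[5]; };

/* What the programmer writes. */
static struct S sfunc(void)
{
    struct S s = { { 1, 2, 3, 4, 5 } };
    return s;
}

/* Roughly what a compiler using a hidden argument might generate. */
static void sfunc_lowered(struct S *hidden_result)
{
    struct S s = { { 1, 2, 3, 4, 5 } };
    assert(hidden_result);
    *hidden_result = s;
}

static int lowering_demo(void)
{
    struct S b, b2, discard_temp;

    b = sfunc();                    /* source form */
    sfunc_lowered(&b2);             /* lowered form of the same assignment */

    (void)sfunc();                  /* result discarded in the source ... */
    sfunc_lowered(&discard_temp);   /* ... but it still needs a destination */

    return b.i[4] == b2.i[4] && discard_temp.i[0] == 1;
}
```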

Because of The Rule, this "works as a value" does not hold for
arrays -- you put an array object in a value context and get a
pointer:

int arr[10];
int *p;

p = arr; /* valid, sets p to point to &arr[0]; array NOT copied */
g(arr); /* valid, passes &arr[0] to g(); array NOT copied */

and, by fiat, C simply forbids functions from returning arrays, so
that there is no problem with "function returning array N of T".
But if a function returns a struct and the struct contains an array,
we have the impossible: an array value. Logically, the value is
the value of the entire array -- all the elements at once, as it
were.

Because C applies The Rule to all *other* cases, it "wants", in an
odd nonsentient fashion, to apply it here as well -- but The Rule
converts an array object to a pointer to that object's first element,
and we do not *have* an object to point into. (A number of C89
compilers have bugs here, and apply The Rule anyway, giving you a
pointer to -- well, something. Whether that "something" lasts even
until the next expression is tough to predict, at least without
digging into the compiler innards or disassembling
the compiler's output. But this *is* a bug; it comes about because
the compiler fails to enforce the "object" part of The Rule. After
all, the compiler writer probably forgot that array values exist
at all -- they only occur in this one special case, of a function
returning a struct containing an array.)

For example, the following simple code will be rejected by the Comeau
Online compiler in C89/C90 mode and accepted in C99 mode for this very
reason:

struct S { int i[5]; };

struct S foo(void) { struct S s = { 0 }; return s; }

int main(void) { int i = foo().i[1]; return i; }

Note that this code does not involve operator '&' at all.

C99 once again solves the "what does it mean to get a pointer to
the nonexistent object" problem by fiat, restricting what you can
do: "i = foo().i[index]" is OK; "p = foo().i" is not.

However, this also means that your original code (with intermediate
variable 'val1') was also ill-formed in C89/C90. But you said that you
could compile it without any problems. Apparently, some quirks of
particular compilers are also at play here.

Indeed. In general, it compiles (because, unlike the Comeau
compiler, the compiler fails to check the "must be an object" part)
and then produces unreliable machine code, often in weird and
unpredictable fashions (e.g., change the declaration order of local
variables and it sometimes works).
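One way to stay clear of all this, in C89/C90 and C99 alike, is to copy the returned struct into a named object before touching the array inside it. A minimal sketch, reusing the struct S and foo names from the example above with made-up element values:

```c
#include <assert.h>

struct S { int i[5]; };

static struct S foo(void)
{
    struct S s = { { 10, 20, 30, 40, 50 } };
    return s;
}

static int sum_elements(void)
{
    struct S tmp = foo();   /* the value now lives in a named object */
    const int *p = tmp.i;   /* tmp.i is an lvalue, so The Rule applies */
    int sum = 0;
    int k;

    assert(p == &tmp.i[0]);
    for (k = 0; k < 5; k++)
        sum += p[k];
    return sum;
}
```

The pointer stays valid for as long as tmp does, which is exactly the guarantee that is missing when the array is only a value.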
 
Chris Torek

[Again, we are dealing with a function that returns a struct that
contains an array. Here the function is named l_SYM_2B() and the
array in the struct is named "o", and is the first (and in fact
only) member of the struct.]
Adam Warner wrote:

[corrected for types, but still inherently "wrong" :) ]
I've been considering similar issues after I tried to compile
this to show the addresses were the same:

printf ("&(l_SYM_2B(&a)) is %p\n", (void *)&(l_SYM_2B(&a)));
printf ("&(l_SYM_2B(&a).o[0]) is %p\n", (void *)&(l_SYM_2B(&a).o[0]));

The first statement correctly generates the error: "invalid lvalue
in unary `&'" according to 6.5.3.2 (a meta question is *why* it
should be prohibited to obtain the address of a return value on
the stack. Sometimes C feels frustratingly high level :)

What stack? (Some implementations don't have one). The C90
logic seems clear: the return value might be in a register, so
how can you take the address of it? (or of parts of it).

Indeed: the return value is a value, not an object, and in C, only
objects have addresses. (Well, OK, functions have addresses too.)
Not all objects have addresses -- "register int i; ... &i" is invalid
and requires a diagnostic -- but values *definitely* do not have
addresses.

Reading Andrey Tarasevich's replies, it seems that C99 fixes
this problem by forcing the return value to be an lvalue if
you use operator* (or equivalently, operator[]) on it.

That is not quite right. Instead, C99 says that, given an array
value (which only exists in this special weird case in which a
function returns a struct that contains an array), you *can* use
indirection and/or pointer arithmetic -- i.e., the usual kinds of
subscripting -- to access an element of the array, but this element
is also a value and is as ephemeral as any other value, so that
the only thing you can do is copy the value to some other object.
In particular, you may not take its address.

(All of this implies that:

struct S { int a[3][4]; };
struct S f(void);
...
f().a /* or even f().a[j] */

is probably a fruitful source of bugs in C99 compilers, because
we have multiple levels of "value arrays" going on.)

My question then: suppose we have in C99

struct t1 t = l_SYM_2B(&x).o[0];

does that mean that there must be an lvalue object created to be
the return-value (so that [0] can be applied to its .o),
and then that object is copied to t ? ie. there is a wasted
copy?

Not necessarily. The details are up to the implementation. But
this is just where all those details can get hairy and sticky and
otherwise unpleasant; and indeed, this is the reason for much of
the complexity in C++, with its copy constructors and rules about
lifetimes of temporary objects. C has, so far, avoided having to
encode such rules in the language itself, leaving the details up
to the implementor, by making sure that you are not allowed to
"grab hold" of the temporary object-or-value-or-whatever-it-is.
(In C++, some things are passed by reference, so: is that a copy
of the temporary, or the original?, and so on.)

(A typical C implementation will have an entire temporary
"struct v1", and compile:

struct t1 t = l_SYM_2B(&x).o[0];

as if it read:

struct v1 temp;
struct t1 t;
l_SYM_2B(&temp, &x);
t = temp.o[0];

with the lifetime of the "temp" variable local to the entire
function, or a block created just to enclose the assignment to "t".
In this particular case, since the array o has size 1 and is the
only member, so that sizeof(temp)==sizeof(o), it is possible that
the optimization phase of the compiler will then discard the
temporary "temp" and rewrite the call as:

l_SYM_2B((struct v1 *)&t, &x);

Of course, the very existence of the secret first argument is
compiler-specific as well, and if the struct fits in a register,
that argument may be eliminated in favor of returning the struct
in that register and just using the return-value-register to hold
the variable "t"! This is the best possible case, and I suspect
relatively few compilers will achieve it.)
 
pete

Chris Torek wrote:
When we apply The Rule in typical C code, we find that arrays never
survive to the "value" stage: arrays are only ever objects, and
when you attempt to do anything with their value, you get instead
a pointer value, stripping off the "array N of" part.

A string literal is not converted to a pointer,
when it is used to initialize an array.

char array[] = "string literal";
 
Chris Torek

A string literal is not converted to a pointer,
when it is used to initialize an array.

char array[] = "string literal";

This is true; but note that in this case the string literal never
becomes an anonymous array in the first place. *Other* occurrences
of string literals *do* produce an array object, and then that object
undergoes The Rule.
 
Lawrence Kirby

A string literal is not converted to a pointer,
when it is used to initialize an array.

char array[] = "string literal";

This is true; but note that in this case the string literal never
becomes an anonymous array in the first place.

Strictly it does, according to C99 6.4.5 any instance of a string-literal
in translation phase 7 causes an array of static storage duration to be
initialised. It is just that in the case above that array cannot be
otherwise accessed by the program which provides an opportunity for an
as-if optimisation. But it does exist in the abstract machine. And you get
the feeling that there's something behind the scenes when you execute an
initialiser for an automatic array repeatedly.
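The repeated-initialiser case can be observed with a sketch like this one (the function name is invented for the illustration). Each call has to produce a fresh copy of the characters, so they must be kept somewhere between calls, whether or not the implementation stores them as an actual array.

```c
#include <assert.h>
#include <string.h>

/* Each call re-initialises 'buf' from the literal, so the change made
   by the previous call is never visible. */
static char first_char_then_clobber(void)
{
    char buf[] = "string literal";   /* 15 chars, including the '\0' */
    char seen = buf[0];

    assert(sizeof buf == strlen("string literal") + 1);
    buf[0] = 'X';                    /* modifies the array, not the literal */
    return seen;
}
```

Calling the function twice returns 's' both times, even though the first call overwrote buf[0].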

Lawrence
 
