Address syntax

Adam Warner · Dec 29, 2004

Hi all,

In the code snippet below I successfully determine the address of val1:*

struct o val1=l_SYM_2B(&a).o[0];
print_aesthetic(&val1);

The structure o is heavyweight. I understand (hopefully correctly) that
(barring compiler optimisations) C will shallow copy the structure into
val1.

As I merely wished to pass the structure to print_aesthetic I combined the
statements to avoid creating an intermediate copy of the structure. This
was my attempt:

print_aesthetic(&(l_SYM_2B(&a).o[0]));

I discovered that the only way to get GCC 3.4 to accept this syntax was to
append the -std=c99 compiler option. Otherwise GCC produced the error
message "invalid operands to binary +", i.e. it appeared to be trying to
interpret the address operator as a binary AND.

Is this a bug in GCC's pre-C99 support or has the syntax of C been changed
in C99 to support the code I wrote above? If so, thank goodness!

Regards,
Adam

* Footnote:
a is a local variable of type struct o.
l_SYM_2B(struct o * arg0) returns struct v1, defined as { struct o o[1]; };
print_aesthetic has the prototype struct v1 print_aesthetic(struct o *);

Dietmar Schindler · Dec 29, 2004

Adam said:
In the code snippet below I successfully determine the address of val1:*

struct o val1=l_SYM_2B(&a).o[0];
print_aesthetic(&val1);

The structure o is heavyweight. I understand (hopefully correctly) that
(barring compiler optimisations) C will shallow copy the structure into
val1.

As I merely wished to pass the structure to print_aesthetic I combined the
statements to avoid creating an intermediate copy of the structure. This
was my attempt:

print_aesthetic(&(l_SYM_2B(&a).o[0]));

I discovered that the only way to get GCC 3.4 to accept this syntax was to
append the -std=c99 compiler option. Otherwise GCC produced the error
message "invalid operands to binary +", i.e. it appeared to be trying to
interpret the address operator as a binary AND.

Is this a bug in GCC's pre-C99 support or has the syntax of C been changed
in C99 to support the code I wrote above? If so, thank goodness!
...
* Footnote:
a is a local variable of type struct o.
l_SYM_2B(struct o * arg0) returns struct v1, defined as { struct o o[1]; };
print_aesthetic has the prototype struct v1 print_aesthetic(struct o *);

If l_SYM_2B is a function, you may get an intermediate copy of the
structure anyway. If it is a macro, it might work without ():

print_aesthetic(&l_SYM_2B(&a).o[0]);

Keith Thompson · Dec 29, 2004

Adam Warner said:
In the code snippet below I successfully determine the address of val1:*

struct o val1=l_SYM_2B(&a).o[0];
print_aesthetic(&val1);

The structure o is heavyweight. I understand (hopefully correctly) that
(barring compiler optimisations) C will shallow copy the structure into
val1.

As I merely wished to pass the structure to print_aesthetic I combined the
statements to avoid creating an intermediate copy of the structure. This
was my attempt:

print_aesthetic(&(l_SYM_2B(&a).o[0]));

I discovered that the only way to get GCC 3.4 to accept this syntax was to
append the -std=c99 compiler option. Otherwise GCC produced the error
message "invalid operands to binary +", i.e. it appeared to be trying to
interpret the address operator as a binary AND.

Is this a bug in GCC's pre-C99 support or has the syntax of C been changed
in C99 to support the code I wrote above? If so, thank goodness!

I think the "+" it's complaining about is the one implicit in the
indexing operator. x[y] is equivalent to *(x+y). (Usually x is an
address, often resulting from the implicit conversion of an array
name, and y is an integer.)
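
A small sketch of the equivalence described above, which would explain why the diagnostic mentions a binary +. The helper names index_with_brackets, index_with_plus and same_element are made up for illustration; none of them comes from the original code.

```c
#include <stddef.h>

/* Read element i using the subscript operator. */
int index_with_brackets(const int *x, size_t i)
{
    return x[i];
}

/* Read the same element with the addition spelled out: x[i] is *(x + i). */
int index_with_plus(const int *x, size_t i)
{
    return *(x + i);
}

/* Both spellings designate the same element, so the addresses compare equal. */
int same_element(const int *x, size_t i)
{
    return &x[i] == x + i;
}
```

If the left operand of [] cannot be converted to a pointer, the implied addition has no valid operands, which fits the wording of the GCC message.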

Adam Warner · Dec 29, 2004

Hi Dietmar Schindler,

* Footnote:
a is a local variable of type struct o.
l_SYM_2B(struct o * arg0) returns struct v1, defined as { struct o o[1]; };
print_aesthetic has the prototype struct v1 print_aesthetic(struct o *);

If l_SYM_2B is a function, you may get an intermediate copy of the
structure anyway. If it is a macro, it might work without ():

print_aesthetic(&l_SYM_2B(&a).o[0]);

l_SYM_2B is a function (l_SYM is a namespace prefix and underscore is the
escape character for non-alphanumerics. 0x2B is the ASCII code for +. It
is to be my dynamic Lisp function +).

v1, v2, ... are structures for stack allocating multiple return values:

struct v1 {
    struct o o[1];
};

struct v2 {
    struct o o[2];
};

struct v3 {
    struct o o[3];
};

l_SYM_2B() returns the structure v1, i.e. leaves room for one object o on
the stack, aligned to the same position as if struct o itself was returned.

I sincerely hope no compiler will ever create a copy of struct o before
returning its address. I am only using the struct v1 for symmetry with v2,
v3, etc. The address of the first object o is expected to be the same as
the v1 structure. As I am only obtaining the address of o[0] within the
structure v1 I cannot see how C semantics could result in an intermediate
copy of o.
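
For a named object, the layout expectation above can be checked directly: C guarantees that a pointer to a structure, suitably converted, points to its first member. The sketch below uses a stand-in struct o with a single int field (an assumption, since the real definition is not shown here) and a hypothetical helper first_member_matches. It says nothing about how long a function's return value stays valid.

```c
/* Stand-in for the real, heavier struct o (assumed layout). */
struct o { int type; };

struct v1 { struct o o[1]; };
struct v3 { struct o o[3]; };

/* Returns 1 when each struct and its first array element share an address. */
int first_member_matches(void)
{
    struct v1 one;
    struct v3 three;

    /* Only addresses are taken; the objects are never read. */
    return (void *)&one == (void *)&one.o[0]
        && (void *)&three == (void *)&three.o[0];
}
```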

Regards,
Adam

Andrey Tarasevich · Dec 30, 2004

Adam said:
In the code snippet below I successfully determine the address of val1:*

struct o val1=l_SYM_2B(&a).o[0];
print_aesthetic(&val1);

The structure o is heavyweight. I understand (hopefully correctly) that
(barring compiler optimisations) C will shallow copy the structure into
val1.

As I merely wished to pass the structure to print_aesthetic I combined the
statements to avoid creating an intermediate copy of the structure. This
was my attempt:

print_aesthetic(&(l_SYM_2B(&a).o[0]));

I discovered that the only way to get GCC 3.4 to accept this syntax was to
append the -std=c99 compiler option. Otherwise GCC produced the error
message "invalid operands to binary +", i.e. it appeared to be trying to
interpret the address operator as a binary AND.

Is this a bug in GCC's pre-C99 support or has the syntax of C been changed
in C99 to support the code I wrote above? If so, thank goodness!

AFAIK the code is supposed to be ill-formed in both C89/C90 and C99
because you are trying to apply the address-of operator to a non-lvalue
argument. (Although I couldn't find an exact place in either document
that would explicitly state that function return value is not an lvalue
or prohibit this application of '&' in some other way. Anyone?).

However, even if we'd assume for a second that this code is well-formed,
we'd still have to come to a conclusion that the code is broken at least
in C99. According to C99 (6.5.2.2/5), an attempt to access the returned
value after the next sequence point results in undefined behavior. If
inside 'print_aesthetic' you are actually trying to access the value
pointed by the argument (i.e. the return value of 'l_SYM_2B'), you are
in violation of that rule, since 'print_aesthetic' has a sequence point
at the entry. (Again, I couldn't find an equivalent statement in C89/C90.)

Andrey Tarasevich · Dec 30, 2004

Andrey said:
Adam said:

In the code snippet below I successfully determine the address of val1:*

struct o val1=l_SYM_2B(&a).o[0];
print_aesthetic(&val1);

The structure o is heavyweight. I understand (hopefully correctly) that
(barring compiler optimisations) C will shallow copy the structure into
val1.

As I merely wished to pass the structure to print_aesthetic I combined the
statements to avoid creating an intermediate copy of the structure. This
was my attempt:

print_aesthetic(&(l_SYM_2B(&a).o[0]));

I discovered that the only way to get GCC 3.4 to accept this syntax was to
append the -std=c99 compiler option. Otherwise GCC produced the error
message "invalid operands to binary +", i.e. it appeared to be trying to
interpret the address operator as a binary AND.

Is this a bug in GCC's pre-C99 support or has the syntax of C been changed
in C99 to support the code I wrote above? If so, thank goodness!

AFAIK the code is supposed to be ill-formed in both C89/C90 and C99
because you are trying to apply the address-of operator to a non-lvalue
argument. (Although I couldn't find an exact place in either document
that would explicitly state that function return value is not an lvalue
or prohibit this application of '&' in some other way. Anyone?).

However, even if we'd assume for a second that this code is well-formed,
we'd still have to come to a conclusion that the code is broken at least
in C99. According to C99 (6.5.2.2/5), an attempt to access the returned
value after the next sequence point results in undefined behavior. If
inside 'print_aesthetic' you are actually trying to access the value
pointed by the argument (i.e. the return value of 'l_SYM_2B'), you are
in violation of that rule, since 'print_aesthetic' has a sequence point
at the entry. (Again, I couldn't find an equivalent statement in C89/C90.)

On second thought, the code should probably circumvent the "lvalue
check", since it actually accesses an element of an array contained in
the returned struct. Array element access in C reduces to pointer
arithmetic, which formally turns the operand of '&' into an lvalue
(i.e. 'l_SYM_2B(&a).o[0]' is '*(l_SYM_2B(&a).o + 0)' and the result of
'*' is always an lvalue).

However, the second problem - UB - is still there.

Adam Warner · Dec 30, 2004

Hi Andrey Tarasevich,

Andrey said:
Adam said:

In the code snippet below I successfully determine the address of val1:*

struct o val1=l_SYM_2B(&a).o[0];
print_aesthetic(&val1);

The structure o is heavyweight. I understand (hopefully correctly) that
(barring compiler optimisations) C will shallow copy the structure into
val1.

As I merely wished to pass the structure to print_aesthetic I combined the
statements to avoid creating an intermediate copy of the structure. This
was my attempt:

print_aesthetic(&(l_SYM_2B(&a).o[0]));

I discovered that the only way to get GCC 3.4 to accept this syntax was to
append the -std=c99 compiler option. Otherwise GCC produced the error
message "invalid operands to binary +", i.e. it appeared to be trying to
interpret the address operator as a binary AND.

Is this a bug in GCC's pre-C99 support or has the syntax of C been changed
in C99 to support the code I wrote above? If so, thank goodness!

AFAIK the code is supposed to be ill-formed in both C89/C90 and C99
because you are trying to apply the address-of operator to a non-lvalue
argument. (Although I couldn't find an exact place in either document
that would explicitly state that function return value is not an lvalue
or prohibit this application of '&' in some other way. Anyone?).

However, even if we'd assume for a second that this code is well-formed,
we'd still have to come to a conclusion that the code is broken at least
in C99. According to C99 (6.5.2.2/5), an attempt to access the returned
value after the next sequence point results in undefined behavior. If
inside 'print_aesthetic' you are actually trying to access the value
pointed by the argument (i.e. the return value of 'l_SYM_2B'), you are
in violation of that rule, since 'print_aesthetic' has a sequence point
at the entry. (Again, I couldn't find an equivalent statement in C89/C90.)

I've been considering similar issues after I tried to compile this to show
the addresses were the same:

printf ("&(l_SYM_2B(&a)) is %p\n", &(l_SYM_2B(&a)));
printf ("&(l_SYM_2B(&a).o[0]) is %p\n", &(l_SYM_2B(&a).o[0]));

The first statement correctly generates the error: "invalid lvalue in
unary `&'" according to 6.5.3.2 (a meta question is *why* it should be
prohibited to obtain the address of a return value on the stack. Sometimes
C feels frustratingly high level.)

The second one may however be legitimate in C99. The semantics state (note
that this is the final draft): <http://dev.unicals.com/c99-draft.html#6.5.3.2>

Similarly, if the operand is the result of a [] operator, neither the &
operator nor the unary * that is implied by the [] is evaluated and the
result is as if the & operator were removed and the [] operator were
changed to a + operator. Otherwise, the result is a pointer to the
object or function designated by its operand.

The (l_SYM_2B(&a).o[0]) operand appears to be the result of a [] operator.
So the result is evaluated as if the & operator were removed and the []
operator were changed to a + operator.

print_aesthetic does try to access the value pointed by the argument.
It has the prototype: struct v1 print_aesthetic(struct o *);

There's no reason this can't become struct v1 print_aesthetic(struct o[])
because arrays are also passed by reference. If you believe this change to
print_aesthetic is necessary to be conforming I can make the change. I'd
like to avoid it because the syntax is messier.

With the function argument "struct o * arg" I can use structure
dereferencing notation, e.g. arg->type, and follow pointer members to the
next object. With the function argument "struct o * arg[]" I have to
reference members as arg[0].type, arg[1].type, etc.

Regards,
Adam

Andrey Tarasevich · Dec 30, 2004

Adam said:
...
AFAIK the code is supposed to be ill-formed in both C89/C90 and C99
because you are trying to apply the address-of operator to a non-lvalue
argument. (Although I couldn't find an exact place in either document
that would explicitly state that function return value is not an lvalue
or prohibit this application of '&' in some other way. Anyone?).

However, even if we'd assume for a second that this code is well-formed,
we'd still have to come to a conclusion that the code is broken at least
in C99. According to C99 (6.5.2.2/5), an attempt to access the returned
value after the next sequence point results in undefined behavior. If
inside 'print_aesthetic' you are actually trying to access the value
pointed by the argument (i.e. the return value of 'l_SYM_2B'), you are
in violation of that rule, since 'print_aesthetic' has a sequence point
at the entry. (Again, I couldn't find an equivalent statement in C89/C90.)

I've been considering similar issues after I tried to compile this to show
the addresses were the same:

printf ("&(l_SYM_2B(&a)) is %p\n", &(l_SYM_2B(&a)));
printf ("&(l_SYM_2B(&a).o[0]) is %p\n", &(l_SYM_2B(&a).o[0]));

The first statement correctly generates the error: "invalid lvalue in
unary `&'" according to 6.5.3.2 (a meta question is *why* it should be
prohibited to obtain the address of a return value on the stack. Sometimes
C feels frustratingly high level.)

The second one may however be legitimate in C99. The semantics state (note
that this is the final draft): <http://dev.unicals.com/c99-draft.html#6.5.3.2>

Similarly, if the operand is the result of a [] operator, neither the &
operator nor the unary * that is implied by the [] is evaluated and the
result is as if the & operator were removed and the [] operator were
changed to a + operator. Otherwise, the result is a pointer to the
object or function designated by its operand.

The (l_SYM_2B(&a).o[0]) operand appears to be the result of a [] operator.
So the result is evaluated as if the & operator were removed and the []
operator were changed to a + operator.

I found one key difference between the C89/C90 and C99 that must be at
play here.

The property of "being a non-lvalue" propagates from the left side of
'.' operator to its result in both C89/90 and C99. This means that the
following expression

l_SYM_2B(&a).o

is not an lvalue in both C89/90 and C99.

Now, the key moment: in C89/C90 the array-to-pointer conversion was only
applicable to _lvalues_ of array type (see 6.2.2.1). In C99 this
limitation was removed (see 6.3.2.1/3), and now the array-to-pointer
conversion is applicable to any values of array type.

This means that the following expression

l_SYM_2B(&a).o[0]

is ill-formed in C89/90 (there's no way to apply operator '[]' to the
result of previous expression) and well-formed in C99. For example, the
following simple code will be rejected by Comeau Online compiler in
C89/C90 mode and accepted in C99 mode for this very reason

struct S { int i[5]; };

struct S foo() { struct S s = { 0 }; return s; }

int main() { int i = foo().i[1]; }

Note, that this code does not involve operator '&' at all.

However, this also means that your original code (with intermediate
variable 'val1') was also ill-formed in C89/C90. But you said that you
could compile it without any problems. Apparently, some quirks of
concrete compilers are also at play here.
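
A sketch of a spelling that sidesteps the question in both dialects, reusing struct S and foo from the example above: store the returned structure in a named object first, so that the array member belongs to an lvalue. The function name second_element is made up for illustration, and foo is given non-zero values here (an assumption) so that the result is visible.

```c
struct S { int i[5]; };

struct S foo(void)
{
    struct S s = { { 0, 11, 22, 33, 44 } };
    return s;
}

/* Index the array through a named copy rather than through the
   function-call expression itself. */
int second_element(void)
{
    struct S tmp = foo();   /* tmp is an lvalue, so tmp.i converts to int * */
    return tmp.i[1];
}
```

The price is one copy of the whole structure into tmp, which is the copy the original question was trying to avoid.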

print_aesthetic does try to access the value pointed by the argument.
It has the prototype: struct v1 print_aesthetic(struct o *);

Once again, if I'm not missing something, that definitely leads to UB in
both C89/C90 and C99 for reasons explained in my previous messages.

There's no reason this can't become struct v1 print_aesthetic(struct o[])
because arrays are also passed by reference.

This is an exact equivalent of the previous declaration. It won't make
any difference.

If you believe this change to
print_aesthetic is necessary to be conforming I can make the change. I'd
like to avoid it because the syntax is messier.

With the function argument "struct o * arg" I can use structure
dereferencing notation, e.g. arg->type, and follow pointer members to the
next object. With the function argument "struct o * arg[]" I have to
reference members as arg[0].type, arg[1].type, etc.

Did you mean 'struct o arg[]'? If yes, then you are wrong. The following
two declarations

struct v1 print_aesthetic(struct o *arg)
struct v1 print_aesthetic(struct o arg[])

are exactly equivalent in C. In both cases you can use 'arg->type' and
'arg[0].type'.
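
A minimal sketch of that equivalence. The struct o here is a stand-in with a single 'type' member (an assumption based on the arg->type notation used above), and the function names are hypothetical.

```c
/* Stand-in for the real struct o (assumed to have a 'type' member). */
struct o { int type; };

/* Parameter declared as a pointer. */
int sum_via_pointer(struct o *arg)
{
    return arg->type + arg[1].type;
}

/* Parameter declared with array syntax; it is adjusted to 'struct o *',
   so both notations work here as well. */
int sum_via_array(struct o arg[])
{
    return arg->type + arg[1].type;
}
```

Both functions have the same type, int (struct o *), so either can be assigned to the same function pointer.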

Adam Warner · Dec 30, 2004

Hi Andrey Tarasevich,

On second thought, the code should probably circumvent the "lvalue
check", since it actually accesses an element of an array contained in
the returned struct. Array element access in C reduces to pointer
arithmetic, which formally turns the operand of '&' into an lvalue
(i.e. 'l_SYM_2B(&a).o[0]' is '*(l_SYM_2B(&a).o + 0)' and the result of
'*' is always an lvalue).

I'm glad we agree that this part is legal C99.

However, the second problem - UB - is still there.

6.5.2.2/5: "If an attempt is made to modify the result of a function call
or to access it after the next sequence point, the behavior is undefined."

Therefore I have to copy the return value into a local variable so I can
pass it by reference to a function, thereby defeating the benefits of pass
by reference (which were to avoid copying heavyweight structures and
permit mutation of all objects via function calls).

My mistake is that I can't rely upon a return value remaining on the stack
until the _calling_ function exits (and when you think about it the stack
could blow up horribly otherwise!)

<http://madchat.org/osdevl/stack.html>

As you can see, the data still is on the stack, but once the pop
operation is completed, we consider that part of the data invalid.
Thus, the next push operation overwrites this data. But that's OK,
because we assume that after a pop operation, the data that's popped
off is considered garbage.

If you've ever made the error of returning a pointer to a local
variable or to a parameter that was passed by value and wondered why
the value stayed valid initially, but later on got corrupted, you
should now know the reason.

The data still stays on the garbage part of the stack until the next
push operation overwrites it (that's when the data gets corrupted).

My method, while supremely fast and apparently correct with simple
benchmarks, would have resulted in hideously difficult to track down
data corruption.

Thanks for enlightening me.

Regards,
Adam

Adam Warner · Dec 30, 2004

Hi Andrey Tarasevich,

Did you mean 'struct o arg[]'?

I did, sorry!

If yes, then you are wrong. The following two declarations

struct v1 print_aesthetic(struct o *arg)
struct v1 print_aesthetic(struct o arg[])

are exactly equivalent in C. In both cases you can use 'arg->type' and
'arg[0].type'.

That's great to know! Thanks for helping me appreciate the undefined
behaviour of calling an (at least not inlined) function with the address
of the return value of a nested function.

Regards,
Adam

Andrey Tarasevich · Dec 30, 2004

Adam said:
Hi Andrey Tarasevich,

On second thought, the code should probably circumvent the "lvalue
check", since it actually accesses an element of an array contained in
the returned struct. Array element access in C reduces to pointer
arithmetic, which formally turns the operand of '&' into an lvalue
(i.e. 'l_SYM_2B(&a).o[0]' is '*(l_SYM_2B(&a).o + 0)' and the result of
'*' is always an lvalue).

I'm glad we agree that this part is legal C99.

Well, I'm sorry to say that, but it appears that my above statement is
incorrect in C89/C90 and irrelevant in C99. The reason why it is
incorrect in C89/C90 is explained in my other message (in short,
'l_SYM_2B(&a).o' is an rvalue of array type and '[]' cannot be applied
to it). The reason why it is irrelevant in C99 is explained in your
other message ('&' and '[]' cancel each other out).

However, this doesn't change the fact that we agree that this part is
well-formed in C99.

I would still like to know whether the following code

struct S { int i[5]; };

struct S foo() { struct S s = { 0 }; return s; }

int main() { int i = foo().i[1]; }

is valid C89/C90? Comeau thinks it is not (and the standard, as I
understand it, seems to support this position). GCC, on the other hand,
accepts it. Can anyone comment on it?

6.5.2.2/5: "If an attempt is made to modify the result of a function call
or to access it after the next sequence point, the behavior is undefined."

Therefore I have to copy the return value into a local variable so I can
pass it by reference to a function, thereby defeating the benefits of pass
by reference (which were to avoid copying heavyweight structures and
permit mutation of all objects via function calls).

It looks like you have to do that, yes.

Another approach would be to pass a pointer to the destination structure
to 'l_SYM_2B' function as a separate parameter and fill it in inside
'l_SYM_2B' instead of returning it from 'l_SYM_2B' by value. In other
words, you can (if you can) change the signature of 'l_SYM_2B' to
something like

void l_SYM_2B(<whatever>* lp_a, struct v1* lp_v1)
{
/* fill in '*lp_v1' accordingly */
}

Your original code in this case would look as follows

struct v1 val1;
l_SYM_2B(&a, &val1);
print_aesthetic(&val1.o[0]);

No heavy copying is performed.
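
A runnable sketch of this pattern, under stated assumptions: struct o is a stand-in with one field, add_into is a hypothetical counterpart of l_SYM_2B, and read_value stands in for a consumer such as print_aesthetic. None of these bodies is taken from the real code.

```c
/* Stand-in for the real struct o (assumed; the real one is heavier). */
struct o { long value; };

struct v1 { struct o o[1]; };

/* Hypothetical counterpart of l_SYM_2B: writes its single result into
   caller-owned storage instead of returning a struct by value. */
void add_into(const struct o *arg, struct v1 *result)
{
    result->o[0].value = arg->value + arg->value;
}

/* Hypothetical consumer that takes the object by address. */
long read_value(const struct o *obj)
{
    return obj->value;
}

/* The caller owns val1, so its address stays valid across both calls. */
long doubled(long n)
{
    struct o a;
    struct v1 val1;

    a.value = n;
    add_into(&a, &val1);
    return read_value(&val1.o[0]);
}
```

Because val1 is an ordinary local object, the rule about accessing a function result after the next sequence point does not come into play.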

My mistake is that I can't rely upon a return value remaining on the stack
until the _calling_ function exits (and when you think about it the stack
could blow up horribly otherwise!)

That's an interesting question. In C++ you wouldn't have to worry about
the lifetime of temporary objects (assuming that you switch from using
pointers to using references) because the lifetime of temporary objects
extends at least to the end of the expression. I don't see any reason
why it should be different in C (except for the fact that C doesn't have
references).

<http://madchat.org/osdevl/stack.html>

As you can see, the data still is on the stack, but once the pop
operation is completed, we consider that part of the data invalid.
Thus, the next push operation overwrites this data. But that's OK,
because we assume that after a pop operation, the data that's popped
off is considered garbage.

If you've ever made the error of returning a pointer to a local
variable or to a parameter that was passed by value and wondered why
the value stayed valid initially, but later on got corrupted, you
should now know the reason.

The data still stays on the garbage part of the stack until the next
push operation overwrites it (that's when the data gets corrupted).

I don't think this is relevant in your case. The text you quoted talks
about dangers of returning _pointers_ to local data from functions. You
are not returning any pointers to local data from functions in your
code. Your code returns the data _by_ _value_. This is a significantly
different situation.

Old Wolf · Dec 30, 2004

Adam said:
I've been considering similar issues after I tried to compile this to show
the addresses were the same:

printf ("&(l_SYM_2B(&a)) is %p\n", &(l_SYM_2B(&a)));
printf ("&(l_SYM_2B(&a).o[0]) is %p\n", &(l_SYM_2B(&a).o[0]));

A minor point: %p expects a (void *), you are causing UB
by passing a different pointer type to it. You need to cast
to (void *) in case you are on a platform where (void *) is
bigger than pointer-to-struct.
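
A minimal sketch of the cast, assuming a stand-in struct o and a hypothetical helper format_address. It uses snprintf (C99) rather than printf only so that the result can be inspected.

```c
#include <stdio.h>

/* Stand-in for the real struct o (assumed). */
struct o { int type; };

/* Format an object pointer with %p. The cast to void * gives the
   argument the type that the conversion specifier expects. Returns the
   number of characters needed, or a negative value on error. */
int format_address(char *buf, size_t size, struct o *p)
{
    return snprintf(buf, size, "%p", (void *)p);
}
```

The text produced by %p is implementation-defined, so the sketch makes no claim about what the address looks like.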

The first statement correctly generates the error: "invalid lvalue
in unary `&'" according to 6.5.3.2 (a meta question is *why* it
should be prohibited to obtain the address of a return value on
the stack. Sometimes C feels frustratingly high level.)

What stack? (Some implementations don't have one). The C90
logic seems clear: the return value might be in a register, so
how can you take the address of it? (or of parts of it).
Reading Andrey Tarasevich's replies, it seems that C99 fixes
this problem by forcing the return value to be an lvalue if
you use operator* (or equivalently, operator[]) on it.
My question then: suppose we have in C99

struct t1 t = l_SYM_2B(&x).o[0];

does that mean that there must be an lvalue object created to be
the return-value (so that [0] can be applied to its .o),
and then that object is copied to t ? ie. there is a wasted
copy?

Adam Warner · Dec 30, 2004

Hi Andrey Tarasevich,

However, this doesn't change the fact that we agree that this part is
well-formed in C99.

I would still like to know whether the following code

struct S { int i[5]; };

struct S foo() { struct S s = { 0 }; return s; }

int main() { int i = foo().i[1]; }

is valid C89/C90? Comeau thinks it is not (and the standard, as I
understand it, seems to support this position). GCC, on the other hand,
accepts it. Can anyone comment on it?

6.5.2.2/5: "If an attempt is made to modify the result of a function call
or to access it after the next sequence point, the behavior is undefined."

Therefore I have to copy the return value into a local variable so I can
pass it by reference to a function, thereby defeating the benefits of pass
by reference (which were to avoid copying heavyweight structures and
permit mutation of all objects via function calls).

It looks like you have to do that, yes.

Another approach would be to pass a pointer to the destination structure
to 'l_SYM_2B' function as a separate parameter and fill it in inside
'l_SYM_2B' instead of returning it from 'l_SYM_2B' by value. In other
words, you can (if you can) change the signature of 'l_SYM_2B' to
something like

void l_SYM_2B(<whatever>* lp_a, struct v1* lp_v1)
{
/* fill in '*lp_v1' accordingly */
}

Your original code in this case would look as follows

struct v1 val1;
l_SYM_2B(&a, &val1);
print_aesthetic(&val1.o[0]);

No heavy copying is performed.

Ingenious. Thanks for the tip. In other words I make it the responsibility
of the caller to supply the data structure to fill in. The caller can
therefore create the value or reference semantics it desires. For example
I could supply a reference to a new object to an increment function for
functional semantics. Or I could supply a reference to the old object and
it would be mutated instead.
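
A sketch of how I picture those two styles. The struct o below is a stand-in with one field, and increment, increment_fresh and increment_in_place are hypothetical names, not functions from my real code.

```c
/* Stand-in for the real struct o (assumed). */
struct o { long value; };

/* Hypothetical increment: reads *src and writes the result to *dst.
   src and dst may point to the same object. */
void increment(const struct o *src, struct o *dst)
{
    dst->value = src->value + 1;
}

/* Functional style: the result goes to a new object and the input is
   left untouched. Returns new minus old, which is always 1. */
long increment_fresh(long n)
{
    struct o old_obj, new_obj;

    old_obj.value = n;
    increment(&old_obj, &new_obj);
    return new_obj.value - old_obj.value;
}

/* Mutating style: the same object is both input and output. */
long increment_in_place(long n)
{
    struct o obj;

    obj.value = n;
    increment(&obj, &obj);
    return obj.value;
}
```

The callee is identical in both cases; only the caller decides where the result lands.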

That's an interesting question. In C++ you wouldn't have to worry about
the lifetime of temporary objects (assuming that you switch from using
pointers to using references) because the lifetime of temporary objects
extends at least to the end of the expression.

That would be useful. It wouldn't cause the stack to blow up like I was
imagining (repeatedly calling a function within a loop) as the references
to the return values would become invalid at the end of each expression.
But it would permit me to nest references to return values.

Regards,
Adam

Adam Warner · Dec 30, 2004

The second one may however be legitimate in C99. The semantics state (note
that this is the final draft): <http://dev.unicals.com/c99-draft.html#6.5.3.2>

Similarly, if the operand is the result of a [] operator, neither the &
operator nor the unary * that is implied by the [] is evaluated and the
result is as if the & operator were removed and the [] operator were
changed to a + operator. Otherwise, the result is a pointer to the
object or function designated by its operand.

I've come across "Rationale for International Standard--Programming
Languages--C" which explains some of the decisions of C99:
<http://careferencemanual.com/>
<http://wwwold.dkuug.dk/JTC1/SC22/WG14/www/docs/n897.pdf>

In relation to 6.5.3.2 it states:

6.5.3.2 Address and indirection operators

Some implementations have not allowed the & operator to be applied to
an array or a function. (The construct was permitted in early versions
of C, then later made optional.) The C89 Committee endorsed the
construct since it is unambiguous, and since data abstraction is
enhanced by allowing the important & operator to apply uniformly to any
addressable entity.

Regards,
Adam

Dietmar Schindler · Dec 30, 2004

Andrey said:
Another approach would be to pass a pointer to the destination structure
to 'l_SYM_2B' function as a separate parameter and fill it in inside
'l_SYM_2B' instead of returning it from 'l_SYM_2B' by value. In other
words, you can (if you can) change the signature of 'l_SYM_2B' to
something like

void l_SYM_2B(<whatever>* lp_a, struct v1* lp_v1)
{
/* fill in '*lp_v1' accordingly */
}

Your original code in this case would look as follows

struct v1 val1;
l_SYM_2B(&a, &val1);
print_aesthetic(&val1.o[0]);

No heavy copying is performed.

Still l_SYM_2B has to copy an entire struct v1 into *lp_v1. That is
heavy copying (and, moreover, leaving compilers fewer opportunities to
optimize it away).

Regards,
Dietmar Schindler

Andrey Tarasevich · Dec 30, 2004

Dietmar said:
Andrey said:

Another approach would be to pass a pointer to the destination structure
to 'l_SYM_2B' function as a separate parameter and fill it in inside
'l_SYM_2B' instead of returning it from 'l_SYM_2B' by value. In other
words, you can (if you can) change the signature of 'l_SYM_2B' to
something like

void l_SYM_2B(<whatever>* lp_a, struct v1* lp_v1)
{
/* fill in '*lp_v1' accordingly */
}

Your original code in this case would look as follows

struct v1 val1;
l_SYM_2B(&a, &val1);
print_aesthetic(&val1.o[0]);

No heavy copying is performed.

Still l_SYM_2B has to copy an entire struct v1 into *lp_v1. That is
heavy copying (and, moreover, leaving compilers fewer opportunities to
optimize it away).
...

As I understood it, the OP is not actually doing any dumb "copying" of
anything into the resultant structure. But even if he does, that's still
beside the point. As far as I'm concerned, inside 'l_SYM_2B' he is
somehow _building_ the new 'struct v1' from 'a' (and I don't know what
the type of 'a' is). How exactly he is doing it inside 'l_SYM_2B' - I
don't know, but it is irrelevant anyway.

Note that OP's original code with intermediate value 'val1' did two
things: firstly, it had to form the result of 'l_SYM_2B' function (it is
done inside 'l_SYM_2B' and I, once again, don't know and don't care how
it is done, by "heavy copying" or in some other way), and secondly, the
returned result was copied to variable 'val1', which is definitely a
"heavy copying". The approach that I propose eliminates the second heavy
copying, it builds the value of 'val1' "in place" (through a pointer).
That's the entire point of this approach.

A smart optimizing compiler might be good enough to do the same thing
with the original code automatically. But just to be sure one can do it
manually. I don't see how this can impede any compiler optimizations.
Can you please elaborate?

Whether the first step (forming of the new 'struct v1' value from 'a')
can be optimized is a separate question. So far we didn't even touch it
in this discussion.

Dietmar Schindler · Dec 30, 2004

Hi, Adam!

Adam said:
* Footnote:
a is a local variable of type struct o.
l_SYM_2B(struct o * arg0) returns struct v1, defined as { struct o o[1]; };
print_aesthetic has the prototype struct v1 print_aesthetic(struct o *);

If l_SYM_2B is a function, you may get an intermediate copy of the
structure anyway. ...

l_SYM_2B() returns the structure v1, i.e. leaves room for one object o on
the stack, aligned to the same position as if struct o itself was returned.

I sincerely hope no compiler will ever create a copy of struct o before
returning its address. I am only using the struct v1 for symmetry with v2,
v3, etc. The address of the first object o is expected to be the same as
the v1 structure. As I am only obtaining the address of o[0] within the
structure v1 I cannot see how C semantics could result in an intermediate
copy of o.

I had already typed a different reply when I realized where our
misunderstanding might lie. By "intermediate copy of the structure", I
meant your above-mentioned object o itself (since this is already a copy
of v1), not a copy of o. If you meant it the other way round, then we agree.

Regards,
Dietmar
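
The expectation that o[0] sits at the same address as the enclosing
struct v1 can be checked with a short sketch. The contents of struct o
and the function names are assumptions; the check relies on the C rule
that a pointer to a structure, suitably converted, points to its first
member.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the types discussed in the thread. */
struct o  { int payload[64]; };
struct v1 { struct o o[1]; };

/* Returns 1 when o[0] starts at the same address as the struct itself. */
int first_member_shares_address(const struct v1 *v)
{
    return (const void *)v == (const void *)&v->o[0]
        && offsetof(struct v1, o) == 0;
}

int demo_layout(void)
{
    struct v1 v;   /* only its address is used, so no initializer needed */
    assert(first_member_shares_address(&v));
    return first_member_shares_address(&v);
}
```

This says something about layout only. It does not settle whether a
given compiler materializes a temporary for a struct returned by value.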

Dietmar Schindler · Dec 30, 2004

Andrey said:
Dietmar said:

Andrey said:

Another approach would be to pass a pointer to the destination structure
to 'l_SYM_2B' function as a separate parameter and fill it in inside
'l_SYM_2B' instead of returning it from 'l_SYM_2B' by value. In other
words, you can (if you can) change the signature of 'l_SYM_2B' to
something like

void l_SYM_2B(<whatever>* lp_a, struct v1* lp_v1)
{
/* fill in '*lp_v1' accordingly */
}

Your original code in this case would look as follows

struct v1 val1;
l_SYM_2B(&a, &val1);
print_aesthetic(&val1.o[0]);

No heavy copying is performed.

Still l_SYM_2B has to copy an entire struct v1 into *lp_v1. That is
heavy copying (and, moreover, it leaves compilers fewer opportunities to
optimize it away).
...

As I understood it, the OP is not actually doing any dumb "copying" of
anything into the resultant structure. But even if he does, that's still
beside the point. As far as I'm concerned, inside 'l_SYM_2B' he is
somehow _building_ the new 'struct v1' from 'a' (and I don't know what
the type of 'a' is). How exactly he is doing it inside 'l_SYM_2B' - I
don't know, but it is irrelevant anyway.

According to what Adam Warner (the OP) wrote in a followup about
l_SYM_2B ("It is to be my dynamic Lisp function +"), it seems you
understood it right. I was referring to the case, not ruled out in the
original post, where l_SYM_2B just returned an existing object, and that
was not "beside the point" (see below).

Note that OP's original code with intermediate value 'val1' did two
things: firstly, it had to form the result of 'l_SYM_2B' function (it is
done inside 'l_SYM_2B' and I, once again, don't know and don't care how
it is done, by "heavy copying" or in some other way), and secondly, the
returned result was copied to variable 'val1', which is definitely a
"heavy copying". The approach that I propose eliminates the second heavy
copying, it builds the value of 'val1' "in place" (through a pointer).
That's the entire point of this approach.

A smart optimizing compiler might be good enough to do the same thing
with the original code automatically. But just to be sure one can do it
manually. I don't see how this can impede any compiler optimizations.
Can you please elaborate?

Whether the first step (forming of the new 'struct v1' value from 'a')
can be optimized is a separate question. So far we didn't even touch it
in this discussion.

I meant exactly this "first step". Since we had no conversation about
dividing the original problem (which included avoiding the creation of
intermediate copies of the structure) into distinct questions or steps,
I don't think we can say that we "didn't even touch it in this
discussion" - I certainly did.

You, too, seem to suspect that forming of the new 'struct v1' value from
'a' can be optimized in certain cases (which, as we know now, are
probably different from the matter in hand). May I assume that you do
see how passing a pointer to a to-be-filled structure can impede an
optimization in such cases, and there is no need to elaborate on that?

Best regards,
Dietmar Schindler

Lawrence Kirby · Dec 30, 2004

On Wed, 29 Dec 2004 18:31:17 -0800, Andrey Tarasevich wrote:

....

I would still like to know whether the following code

struct S { int i[5]; };

struct S foo() { struct S s = { 0 }; return s; }

int main() { int i = foo().i[1]; }

is valid C89/C90? Comeau thinks it is not (and the standard, as I
understand it, seems to support this position). GCC, on the other hand,
accepts it. Can anyone comment on it?

Here gcc -ansi -pedantic gives:

retval.c:5: warning: ISO C89 forbids subscripting non-lvalue array

So gcc also generates an appropriate diagnostic.

Lawrence
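
For comparison, a sketch of a form that should be accepted under both
C89 and C99. In C89, array-to-pointer conversion applies only to lvalue
arrays, and the result of foo() is not an lvalue, so foo().i[1] has
nothing to subscript; C99 dropped the lvalue requirement. Copying the
result into a named object sidesteps the issue. The sample data and the
names second_element and demo_named_copy are invented for this sketch.

```c
#include <assert.h>

struct S { int i[5]; };

struct S foo(void)
{
    struct S s = { { 0, 7 } };   /* sample data chosen for this sketch */
    return s;
}

int second_element(void)
{
    struct S tmp = foo();   /* tmp is a named object, hence an lvalue */
    return tmp.i[1];        /* subscripting an lvalue array is fine */
}

void demo_named_copy(void)
{
    assert(second_element() == 7);
}
```

The price is one copy of the whole struct, which is exactly what the
original poster wanted to avoid for a heavyweight type.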

Andrey Tarasevich · Dec 30, 2004

Dietmar said:
...
Your original code in this case would look as follows

struct v1 val1;
l_SYM_2B(&a, &val1);
print_aesthetic(&val1.o[0]);

No heavy copying is performed.

Still l_SYM_2B has to copy an entire struct v1 into *lp_v1. That is
heavy copying (and, moreover, it leaves compilers fewer opportunities to
optimize it away).
...

As I understood it, the OP is not actually doing any dumb "copying" of
anything into the resultant structure. But even if he does, that's still
beside the point. As far as I'm concerned, inside 'l_SYM_2B' he is
somehow _building_ the new 'struct v1' from 'a' (and I don't know what
the type of 'a' is). How exactly he is doing it inside 'l_SYM_2B' - I
don't know, but it is irrelevant anyway.

According to what Adam Warner (the OP) wrote in a followup about
l_SYM_2B ("It is to be my dynamic Lisp function +"), it seems you
understood it right. I was referring to the case, not ruled out in the
original post, where l_SYM_2B just returned an existing object, and that
was not "beside the point" (see below).

Oh, if 'l_SYM_2B' returns an exact copy of an existing object (i.e. an
object that is not local to function 'l_SYM_2B'), then, of course, the
situation can be perceived differently. In that case, maybe, it could be
made to return a [const-qualified] pointer to the object

const struct v1* l_SYM_2B(<whatever>* lp_a);

and the original code would look as follows

print_aesthetic(&l_SYM_2B(&a)->o[0]);

thus eliminating all copying entirely.
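
A minimal sketch of that pointer-returning variant, assuming the
function really does hand out an object that outlives the call. The
static object, the contents of struct o, and the names l_SYM_2B_ptr and
print_aesthetic_stub are all invented for illustration.

```c
#include <assert.h>

/* Hypothetical stand-ins for the types discussed in the thread. */
struct o  { int payload[64]; };
struct v1 { struct o o[1]; };

/* Assumed long-lived object that the function hands out. */
static struct v1 existing = { { { { 5 } } } };

/* Return a pointer to the existing object; no struct is copied. */
const struct v1 *l_SYM_2B_ptr(const struct o *arg0)
{
    (void)arg0;   /* unused in this sketch */
    return &existing;
}

/* Assumed consumer taking a pointer to const. */
int print_aesthetic_stub(const struct o *p)
{
    return p->payload[0];
}

int demo_pointer_return(void)
{
    struct o a = { { 0 } };
    const struct v1 *r = l_SYM_2B_ptr(&a);
    assert(&r->o[0] == &existing.o[0]);   /* same object, not a copy */
    return print_aesthetic_stub(&l_SYM_2B_ptr(&a)->o[0]);
}
```

Because -> yields an lvalue, taking the address of o[0] here does not
depend on C99. Note that the print_aesthetic prototype quoted in the
thread takes a plain struct o *, so with a const-qualified return type
the call would need a const-correct prototype or a cast.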

But frankly, I don't see where Adam says that this function returns an
existing object (I did re-read his follow-up message you mention). All
the time I was assuming that every invocation of 'l_SYM_2B' is supposed
to build and return a new instance of 'struct v1'. I could be wrong, of
course.

...
You, too, seem to suspect that forming of the new 'struct v1' value from
'a' can be optimized in certain cases (which, as we know now, are
probably different from the matter in hand). May I assume that you do
see how passing a pointer to a to-be-filled structure can impede an
optimization in such cases, and there is no need to elaborate on that?
...

I'm not saying that it can't impede optimization, but at the same time
I'd be glad to see a [simplified] example that would illustrate the
particular optimization you have in mind.

If you are talking about the above situation with 'l_SYM_2B' always
returning "an existing object", then, of course, it can be optimized.
However, I can only see how it can be done manually, and somehow I'm not
sure that we can realistically expect this kind of optimization from the
compiler.
