why is it so ?


Robert Gamble

Barry said:
On 21 Jul 2005 09:23:50 GMT, "S.Tobias"

[...]
Personally, I've never been entirely convinced that
`i = f(++i)' is bulletproof. Yes, there's a sequence point
[snip]

The whole purpose of sequence points, I think, is to impose a
reasonable set of restrictions on what optimizations a compiler is
allowed to perform. An optimizer *can* move a side effect across a
sequence point, but it's allowed to do so only if it doesn't destroy
the semantics of the program. The actual program needs to behave,
[snip]

As a programmer, I'll just avoid things like "i = f(++i);". If I were
implementing a compiler, I'd try to be conservative enough in my
optimizations so that "i = f(++i);" works as expected, even if I can
[snip]

I think there's a worse pit-fall:
int i=0;
a[i] = f(i++);
Which element is being set?
I think this is unspecified (6.5.16#4), but the behaviour
is defined.

I don't think the behavior is defined. While i is being updated only
once, there is a second requirement that i be evaluated at most once
as part of the process. Here i is being evaluated twice.


That would mean that the expression "i = i + i;" is undefined as i is
evaluated more than once. The second requirement you speak of actually


I don't think the i on the left of the = operator is evaluated.


Neither do I. I do think that it is evaluated twice on the right-hand
side though.

Robert Gamble
 

CBFalconer

Robert said:
.... snip ...

Neither do I. I do think that it is evaluated twice on the
right-hand side though.

One possible code generation sequence for a stack machine is:

instr.   stack content after
lda i    &i, ....
lda i    &i, &i, ....
load     i, &i, ....
lda i    &i, i, &i, ....
load     i, i, &i, ....
add      i+i, &i, ....
store    ....

where lda is load address, load is *TOS->TOS, etc. Another
possible sequence could be:

lda i
dup
load
dup
add
store

after some elementary optimization.
 

S.Tobias

Barry Schwarz said:
On 21 Jul 2005 09:23:50 GMT, "S.Tobias"
I think there's a worse pit-fall:
int i=0;
a[i] = f(i++);
Which element is being set?
I think this is unspecified (6.5.16#4), but the behaviour
is defined.


I don't think the behavior is defined. While i is being updated only
once, there is a second requirement that i be evaluated at most once
as part of the process. Here i is being evaluated twice.


Thanks, I think you're referring to:
(n8??.txt, 6.5)
[#2] Between the previous and next sequence point an object
shall have its stored value modified at most once by the
evaluation of an expression. Furthermore, the prior value
shall be accessed only to determine the value to be
stored.60)

Note that this requirement is made only between two consecutive
sequence points. Indeed, (only) if lhs is evaluated first, the
last sentence is violated for the `i' object.

Have a look at this, I believe it is correct now:

that's gcc3.3
gcc2.95: 0 1
como: 0 1



#include <stdio.h>

int one(int unused) { return 1; }

int main(void)
{
    int a[2] = {0};
    int i = 0;
#if 0
    a[i] = one(i++);
#endif
    a[i] = one(0 || i++);
    printf("%d %d\n", a[0], a[1]);
    return 0;
}
 

Robert Gamble

S.Tobias said:
Barry Schwarz said:
On 21 Jul 2005 09:23:50 GMT, "S.Tobias"
I think there's a worse pit-fall:
int i=0;
a[i] = f(i++);
Which element is being set?
I think this is unspecified (6.5.16#4), but the behaviour
is defined.


I don't think the behavior is defined. While i is being updated only
once, there is a second requirement that i be evaluated at most once
as part of the process. Here i is being evaluated twice.


Thanks, I think you're referring to:
(n8??.txt, 6.5)
[#2] Between the previous and next sequence point an object
shall have its stored value modified at most once by the
evaluation of an expression. Furthermore, the prior value
shall be accessed only to determine the value to be
stored.60)

Note that this requirement is made only between two consecutive
sequence points. Indeed, (only) if lhs is evaluated first, the
last sentence is violated for the `i' object.

Have a look at this, I believe it is correct now:

that's gcc3.3
gcc2.95: 0 1
como: 0 1



#include <stdio.h>

int one(int unused) { return 1; }

int main(void)
{
    int a[2] = {0};
    int i = 0;
#if 0
    a[i] = one(i++);
#endif
    a[i] = one(0 || i++);
    printf("%d %d\n", a[0], a[1]);
    return 0;
}


Still has the possibility for undefined behavior, the "0 || i++" could
be evaluated followed by i in "a[i]" before the function is called
without an intervening sequence point.

Robert Gamble
 

S.Tobias

Robert Gamble said:
S.Tobias wrote:
a[i] = one(0 || i++);

Still has the possibility for undefined behavior, the "0 || i++" could
be evaluated followed by i in "a[i]" before the function is called
without an intervening sequence point.


I think it can be evaluated in one of the two ways:

(lhs, rhs)
a[i] , 0 [SEQP] i++ [SEQP] one() , return 1 [SEQP] =

or:

(rhs, lhs)
0 [SEQP] i++ [SEQP] one() , return 1 [SEQP] a[i] , =

The first case is more interesting. There's a sequence point
between evaluation of `a[i]' and `i++', so the behaviour is defined.
As for the whole expression, the behaviour is unspecified (I think).

Or is it that lhs and rhs can be evaluated in parallel? Then
there's indeed UB, but then there would be one in `i=f(++i)' too.
 

Tim Rentsch

S.Tobias said:
Robert Gamble said:
S.Tobias wrote:
a[i] = one(0 || i++);

Still has the possibility for undefined behavior, the "0 || i++" could
be evaluated followed by i in "a[i]" before the function is called
without an intervening sequence point.


I think it can be evaluated in one of the two ways:

(lhs, rhs)
a[i] , 0 [SEQP] i++ [SEQP] one() , return 1 [SEQP] =

or:

(rhs, lhs)
0 [SEQP] i++ [SEQP] one() , return 1 [SEQP] a[i] , =


You missed at least one way:

0 [SEQP] a[i] , i++ [SEQP] one() , return 1 [SEQP] =

This ordering shows how this statement could evoke undefined behavior.
 

Robert Gamble

S.Tobias said:
Robert Gamble said:
S.Tobias wrote:
a[i] = one(0 || i++);

Still has the possibility for undefined behavior, the "0 || i++" could
be evaluated followed by i in "a[i]" before the function is called
without an intervening sequence point.


I think it can be evaluated in one of the two ways:

(lhs, rhs)
a[i] , 0 [SEQP] i++ [SEQP] one() , return 1 [SEQP] =

or:

(rhs, lhs)
0 [SEQP] i++ [SEQP] one() , return 1 [SEQP] a[i] , =

The first case is more interesting. There's a sequence point
between evaluation of `a[i]' and `i++', so the behaviour is defined.
As for the whole expression, the behaviour is unspecified (I think).

Or is it that lhs and rhs can be evaluated in parallel? Then
there's indeed UB, but then there would be one in `i=f(++i)' too.


I am not aware of anything in the Standard that states that both ++i in
f(++i) and i in a[i] can't be evaluated before the actual function
call; this would invoke undefined behavior. See Tim's response.

As for i=f(++i), I still think this is well-defined. The i on the lhs
is not evaluated and cannot be assigned to until after f returns which
guarantees a sequence point between the two modifications.

To summarize my stance:
i = f(++i); well-defined
i = f(i++); well-defined
a[i] = f(++i); undefined
a[i] = f(i++); undefined

Also undefined would be:
a[++i] = f(i);
a[i++] = f(i);

Robert Gamble
 

S.Tobias

Tim Rentsch said:
S.Tobias said:
Robert Gamble said:
S.Tobias wrote:
a[i] = one(0 || i++);

Still has the possibility for undefined behavior, the "0 || i++" could
be evaluated followed by i in "a[i]" before the function is called
without an intervening sequence point.


I think it can be evaluated in one of the two ways:

(lhs, rhs)
a[i] , 0 [SEQP] i++ [SEQP] one() , return 1 [SEQP] =

or:

(rhs, lhs)
0 [SEQP] i++ [SEQP] one() , return 1 [SEQP] a[i] , =


You missed at least one way:

0 [SEQP] a[i] , i++ [SEQP] one() , return 1 [SEQP] =


Could you please tell me by which rule you have interleaved evaluation
of lhs and rhs?

# 6.5.16 Assignment operators
# 4 The order of evaluation of the operands is unspecified. [...]

Compare it with:

# 6.5 Expressions
# 3 The grouping of operators and operands is indicated by the
# syntax.71) Except as specified later (for the function-call (),
# &&, ||, ?:, and comma operators), the order of evaluation of
# subexpressions and the order in which side effects take place
# are both unspecified.
#
# 6.5.2.2 Function calls
# 10 The order of evaluation of the function designator, the actual
# arguments, and subexpressions within the actual arguments is
# unspecified, but there is a sequence point before the actual call.

I interpret the above wording to mean that for the "=" operator either
the lhs is fully evaluated and then the rhs, or the rhs is fully
evaluated and then the lhs. If the Standard meant otherwise, 6.5.16p.4
could be dropped (as it is for other operators) - 6.5p.3 would be enough.
Note also the explicit wording for function calls ("and subexpressions...").

(6.5.2.2p.10 is actually needed and is not covered by 6.5p.3, because
function designator and arguments are not operands to a common operator;
rather arguments parameterize the operator, to which the (single)
operand is function expression.)
 

Robert Gamble

S.Tobias said:
Tim Rentsch said:
S.Tobias said:
S.Tobias wrote:

a[i] = one(0 || i++);

Still has the possibility for undefined behavior, the "0 || i++" could
be evaluated followed by i in "a[i]" before the function is called
without an intervening sequence point.

I think it can be evaluated in one of the two ways:

(lhs, rhs)
a[i] , 0 [SEQP] i++ [SEQP] one() , return 1 [SEQP] =

or:

(rhs, lhs)
0 [SEQP] i++ [SEQP] one() , return 1 [SEQP] a[i] , =


You missed at least one way:

0 [SEQP] a[i] , i++ [SEQP] one() , return 1 [SEQP] =


Could you please tell me by which rule you have interleaved evaluation
of lhs and rhs?

# 6.5.16 Assignment operators
# 4 The order of evaluation of the operands is unspecified. [...]

Compare it with:

# 6.5 Expressions
# 3 The grouping of operators and operands is indicated by the
# syntax.71) Except as specified later (for the function-call (),
# &&, ||, ?:, and comma operators), the order of evaluation of
# subexpressions and the order in which side effects take place
# are both unspecified.
#
# 6.5.2.2 Function calls
# 10 The order of evaluation of the function designator, the actual
# arguments, and subexpressions within the actual arguments is
# unspecified, but there is a sequence point before the actual call.

I interpret the above wording to mean that for the "=" operator either
the lhs is fully evaluated and then the rhs, or the rhs is fully
evaluated and then the lhs. If the Standard meant otherwise, 6.5.16p.4
could be dropped (as it is for other operators) - 6.5p.3 would be enough.
Note also the explicit wording for function calls ("and subexpressions...").


I don't think this is a solid argument. On the contrary, one could just
as easily conclude that the wording is there specifically to allow what
you say it doesn't. If the intent was that one operand be fully evaluated
before the other, it would have been very easy to add a clause saying just that.

There is a document entitled "Sequence Point Analysis" authored by
Raymond Mak of IBM and published as WG14 N926 in which he presents a
method of breaking up expressions using partial ordering by creating an
abstract syntax tree to better understand these types of issues
regarding sequence points, how expressions are evaluated, and what
types of expressions fall into the categories of defined, undefined,
and unspecified behavior. The document is freely available at
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n926.htm. I recommend
reading it, the material is well-presented and easy to follow.

The method presented in this document provides a relatively clear and
concise way to understand the nature of sequence points in expressions.
Many examples are discussed including two that are relevant to this
thread. Below are those two examples and the conclusions, refer to the
actual document for the details.

EXAMPLE 4

int x;
extern int f(int);
x = f(x++);

....

There is an intervening sequence point. The expression is
well-defined.

and

EXAMPLE 13

int x[2], *y;
y=x;
*y = f(y++);

....

Even though there is a sequence point in between, the two nodes are
not ordered. The expression is undefined.

Example 13 is equivalent in nature to the example we are discussing
now. It is undefined for the same reasons that a[i] = f(i++); is
undefined.

It might be worth noting that gcc 3.3.5 seems to agree with all of this
(not to imply of course that the compiler dictates the Standard, etc.).

Robert Gamble
 

Tim Rentsch

S.Tobias said:
Tim Rentsch said:
S.Tobias said:
S.Tobias wrote:

a[i] = one(0 || i++);

Still has the possibility for undefined behavior, the "0 || i++" could
be evaluated followed by i in "a[i]" before the function is called
without an intervening sequence point.

I think it can be evaluated in one of the two ways:

(lhs, rhs)
a[i] , 0 [SEQP] i++ [SEQP] one() , return 1 [SEQP] =

or:

(rhs, lhs)
0 [SEQP] i++ [SEQP] one() , return 1 [SEQP] a[i] , =


You missed at least one way:

0 [SEQP] a[i] , i++ [SEQP] one() , return 1 [SEQP] =


Could you please tell me by which rule you have interleaved evaluation
of lhs and rhs? [snip]


Sorry for the late reply here...

I don't have specific text I can point to that shows this
clearly and unambiguously. It's just how expression
evaluation works, with the clause that "order of evaluation
of subexpressions is unspecified".

A way that might be useful as an explanation is this: start
at the top of the abstract syntax tree, and recurse
non-deterministically down both subtrees of ordinary
operators (basically everything but &&, ||, comma, and ?:)
and deterministically down the left branch then the other
branch (if necessary) for the remaining operators; any
order of execution in this non-deterministic expansion
is a possible order of execution.

I also echo the comment made in another reply to look
at the document describing the formal model done by
Raymond Mak.
 
