x=(x=5,11)

google

Barry said:
Barry Schwarz wrote:

<snip and summarize thus>

((*f(0))++, (*f(1))++) + ((*f(2))++, (*f(3))++)

(we're assuming here that f(0) is done before f(1) is done before f(2)
is done before f(3) - other orderings are allowed but are not relevant
to this discussion. f(3) returns a pointer to the same address as f(1))
I don't see how. The only way d[1] can be updated is if f() returns
its address. The expression f(1) will always return &d[1]. The
expression f(3) will return &d[1] if twice is not 0. No other call to
f() will return &d[1].
(I'm using some lazy shorthand here - f(0)++ actually means (*f(0))++
etc)

because f(0) is done before f(1) and there is a comma sequence point
after f(0) and before f(1) the side effects of the f(0)++ must be
completed before f(1).

While the comma sequence point does in fact guarantee this, the
sequence point before the call to f(1) also guarantees it. It doesn't
matter why there is a sequence point, only that there is one.
because f(2) is done before f(3) and there is a comma sequence point
after f(2)++ and before f(3) the side effects of the f(2)++ must be
completed before f(3).

As I said in the portion you chose to snip, neither f(0) nor f(2)
matter since they cannot have any effect on d[1].
But f(1) can be called before f(2) but the increment deferred until the
end of the full expression. What this does mean is that part of f(1)++

No it cannot. There is a sequence point prior to the call to f(2) and
any side effect from f(1) must be completed before this sequence
point.
I agree that the standard is ambiguous about this (which is where we
came in on this discussion anyway) but I believe that the _intent_ of
the standard is that

(*f())++ + (*f())++ has undefined behaviour because a _valid_
optimization for a compiler that can see that f() always returns the
same pointer is:

g=f(); (*g)++ + (*g)++;

i.e. the sequence points before and after the calls to f() only
constrain the side effects that occur in f() itself.

So
S, f(), S, f(), S, inc, inc, S

is allowed.

This is subtly different from the subject line. In that case I think
the standard requires the assignment of 5 to x before the 11 is
evaluated (and so assigned to x) but I don't think the standard
_requires_ that the assignment of 11 to x starts _after_ the assignment
of 5 to x.
Evaluations never cross sequence points. The whole purpose of
sequence points is to ensure that the evaluation is complete.
Yes. But there is confusion when it comes to sub-expressions that
include sequence points. How much do they constrain things?


The standard guarantees that all side effects are complete prior to
the sequence point. Since there is a sequence point prior to the call
to f(2), the increment must be complete.
This is one valid interpretation. A second is that the sequence points
on the RHS have no effect on the LHS whatsoever and vice versa. A third
interpretation is that the sequence points on the RHS require that no
side effects are _still_ in progress on the LHS but do not otherwise
constrain when the side effects on the LHS can occur.

I would expect that it's actually the third interpretation that the
standard authors intended but it doesn't actually sound as reasonable
as either the first or second option when phrased as above.

Since evaluating g in its many forms does not involve a sequence
point, this is nothing but a red herring.

huh? there's a sequence point both before and after the call to f() in
each of these therefore there is a sequence point between the
modifications of g.

I think these are interesting because one example I gave where the
subject line expression could give undefined results is where the
hardware only has the ability to flip bits so a write requires a read
first. For (g=g+1) + f(); clearly we can have:

read g
call f
add 1 to the g we read
write new g

Now if my hardware requires a read before write, does the standard
require a second read before the write in the above or is it legitimate
to reuse the first read? I think the standard should require that
second read. I'm not convinced it does.

Tim.
 
Barry Schwarz

I agree that the standard is ambiguous about this (which is where we
came in on this discussion anyway) but I believe that the _intent_ of
the standard is that

(*f())++ + (*f())++ has undefined behaviour because a _valid_
optimization for a compiler that can see that f() always returns the
same pointer is:

But the standard is not ambiguous at all. There is a sequence point
before each function call and all side effects must be complete at
that point. Optimizations are required to behave "as if" all
requirements are satisfied.
g=f(); (*g)++ + (*g)++;

This was a red herring when it first entered the conversation and is
still a red herring.
i.e. the sequence points before and after the calls to f() only
constrain the side effects that occur in f() itself.

Where did you invent that from? There is no such thing as a partial
sequence point. Whichever call to f() is evaluated second is
guaranteed that all side effects of any previous evaluation are
complete.
So
S, f(), S, f(), S, inc, inc, S

is allowed.

This is subtly different from the subject line. In that case I think
the standard requires the assignment of 5 to x before the 11 is
evaluated (and so assigned to x) but I don't think the standard
_requires_ that the assignment of 11 to x starts _after_ the assignment
of 5 to x.

Yes. But there is confusion when it comes to sub-expressions that
include sequence points. How much do they constrain things?

A sequence point is a sequence point. All side effects are complete.
This is one valid interpretation. A second is that the sequence points
on the RHS have no effect on the LHS whatsoever and vice versa. A third
interpretation is that the sequence points on the RHS require that no
side effects are _still_ in progress on the LHS but do not otherwise
constrain when the side effects on the LHS can occur.

Only for some strange definition of the word all.
I would expect that it's actually the third interpretation that the
standard authors intended but it doesn't actually sound as reasonable
as either the first or second option when phrased as above.

Dream on.

It is still a red herring.
huh? there's a sequence point both before and after the call to f() in
each of these therefore there is a sequence point between the
modifications of g.

Actually, the standard only talks about a sequence point at the return
from a *library* function. But you make my case for me.
I think these are interesting because one example I gave where the
subject line expression could give undefined results is where the
hardware only has the ability to flip bits so a write requires a read
first. For (g=g+1) + f(); clearly we can have:

read g
call f
add 1 to the g we read
write new g

Now if my hardware requires a read before write, does the standard
require a second read before the write in the above or is it legitimate
to reuse the first read? I think the standard should require that
second read. I'm not convinced it does.

There are lots of situations where it becomes murky. I am only
addressing the case described at the top of this message with four
calls to f.


Remove del for email
 
ena8t8si

Harald said:
Thanks. Using your interpretation of sequence points, which is the
intervening sequence point between the two assignments to g in line B?
With "previous" and "next" sequence point referring to the previous and
next in time, the expression is unspecified but not undefined with any
order of evaluation, but using your interpretation, I think the
behaviour would be undefined (and the answer is that it is not
undefined).

The Standard isn't very clear in stating what the semantics are
in such situations, but (as the DRs explain) function calls are
different than non-function call expressions. An expression like

f() + g()

will always (act as if it does) completely
evaluate f() and completely evaluate g() in
one order or the other, but never overlapping.
So function calls are "atomic", at least with
respect to other function calls in the same
expression. This difference is brought out
in the DR's.
 
ena8t8si

Tim said:
On 20 Jun 2006 07:18:51 -0700,

No. I'm wrong here. My example does always have undefined behaviour and
your example is the same.

However, we have agreed something from this (quick bit of face saving :)

The evaluation of the left and right operands of a binary operator can
overlap.

Therefore I still contend that the writing to x=() in the subject can
overlap with the evaluation of (x=5,11), even if the final value cannot
be written until x=5 has completed.

The evaluation of the _operands_ of the = operator can overlap
each other. The evaluation of the operator itself must wait
for the operands to have been evaluated.
 
ena8t8si

Tim said:
How about?


#include <stdio.h>

unsigned int* f(int x)
{
    static unsigned int d[2];
    static unsigned int done[2];

    done[x] = 1;

    if(done[0] && d[0])
        return &d[0];
    else
        return &d[1];
}

int main(void)
{
    printf("%u\n", (*f(0))-- + (*f(1))++ );
    return 0;
}

This question turns on a subtle point about how
function calls behave.

If you believe that evaluating a function call
is atomic relative to any other side effects
in the same expression as the function call,
the behavior is defined.

If you believe that evaluating a function call
is atomic relative to any other function calls
in the same expression but not necessarily
atomic relative to other side effects in the
same expression, the behavior is undefined.

Simple example:

int x;
int f(){ return x; }
int g(){ return (x++)+f(); }

The function g() has either unspecified
behavior or undefined behavior, depending
on which kind of atomicness holds for
function calls relative to side effects in
the same expression.

I know this point has been discussed but
I don't remember a definitive resolution
in any of the official documents. Unfortunately
clarifying language doesn't always make it into
the Standard even when a response to a DR makes
it clear that one point or another needs to be
explained.
 
ena8t8si

Barry said:
On 19 Jun 2006 09:22:13 -0700, (e-mail address removed) wrote:


(e-mail address removed) wrote:
(e-mail address removed) wrote:
Tim Woodall wrote:
On 29 May 2006 06:32:10 -0700,

Of course not. The order of evaluation of the operators
doesn't determine the order in which the side effects
take place. Evaluating an assignment operator causes
a store to take place, but the side effect of updating the
stored value may take place at any point before the
next sequence point (and after the assignment operator
has been evaluated).


Now we're getting somewhere.

"(and after the assignment operator has been evaluated)"

Where is this in the standard?

"Evaluation of an expression may produce side effects."
Obviously if evaluation produces the side effects, then
the side effects can't come first.

But they can come during.

Evaluations of operators are point events. See Wojtek's
explanation in comp.std.c.

(a,b) + (c,d)

if an example implementation evaluates a then b then c then d then you
seem to be saying that the sequence point after c but before d means
that all the side effects of evaluating b will have completed.

I'm not. Both commas must be evaluated before plus. There is
no ordering between either operand of the first comma and
either operand of the second comma. When the Standard says
"the next sequence point" what it means is the first sequence
point that must come afterwards in all possible orderings. The
next sequence point after b is the sequence point of the whole
expression, regardless of evaluation order.

So does this have undefined behaviour, or implementation defined
behaviour?

#include <stdio.h>

int* f(int x)
{
    static int d[4];
    static int done[4];
    static int twice;
    done[x]=1;
    if(x==2 && done[0] && done[1])
        twice=1;
    if(x==3 && twice)
        return &d[1];
    else
        return &d[x];
}

int main(void)
{
    printf("%d\n", ( (*f(0))++, (*f(1))++ ) + ( (*f(2))++, (*f(3))++ ));
    return 0;
}

Undefined behavior.

Since there are sequence points before each function call to f, why do
you think this is any worse than implementation defined behavior?

The object d[1] can be updated more than once after all
the calls to f have completed.

I don't see how. The only way d[1] can be updated is if f() returns
its address. The expression f(1) will always return &d[1]. The
expression f(3) will return &d[1] if twice is not 0. No other call to
f() will return &d[1].

Let us assume that twice is not 0 by the time f(3) is evaluated
(otherwise d[1] is updated only once and we have no multiple updates at
all). The two possibilities are:

f(1) is evaluated before f(3). f(1) returns &d[1]. The
evaluation of f(3) obviously involves a call to the function f. There
is a sequence point before this call.

All ok up to this point.
That means that any side
effects of (*f(1))++ must be complete before this call.

This step is where your reasoning goes wrong.

Consider this sequence of evaluation, writing {f(1)} to
mean the temporary value that resulted from a previous
evaluation of f(1), and similarly {f(0)}, {f(2)}, {f(3)}:

f(0)
(*{f(0)})++
f(2)
(*{f(2)})++
f(1)
f(3)
(*{f(1)})++
(*{f(3)})++

For the last two lines, {f(1)} == {f(3)}. There are
no sequence points, but both lines update what the
pointers point to, in other words they update the
same object. That's undefined behavior.
The side
effect is the update to d[1] and it is completed before the call.
Sometime after the return from f(3), d[1] is updated again but there
was a sequence point between this update and the previous one.

f(3) is evaluated before f(1). The argument is symmetrical.

The bottom line is that at each call to f, all updates to d have been
completed. After the last call to f, there is only one update to be
performed.
The sequence points before the function calls don't change
anything. The function calls have to complete before the side
effects in each case, but the side effects of (*f(1))++ and
(*f(3))++ can still overlap each other. The expression

Since f(1) cannot overlap f(3) and since there is a sequence point
between the end of one call and the start of the next, whichever
expression is evaluated first must complete its side effect before the
next function call.

The relevant side effect operators are outside the function
calls, and both side effects may be done after both function
calls have completed.
((*f(0))++, p=f(1), (*p)++) + ((*f(2))++, q=f(3), (*q)++)

also has undefined behavior, for the same reason.

But this is different. (*p)++ does not involve a function call and
there is no guaranteed sequence point between (*p)++ and (*q)++. The
sequence of evaluation could be
(*f(0))++
p = f(1)
(*f(2))++
q = f(3)
(*p)++
(*q)++
sum
and the penultimate two steps do in fact both update d[1] (if twice is
not 0) with no intervening sequence point.

It's the same, as I have explained above.
 
ena8t8si

Barry said:
Barry Schwarz wrote:

<snip and summarize thus>

((*f(0))++, (*f(1))++) + ((*f(2))++, (*f(3))++)

(we're assuming here that f(0) is done before f(1) is done before f(2)
is done before f(3) - other orderings are allowed but are not relevant
to this discussion. f(3) returns a pointer to the same address as f(1))
I don't see how. The only way d[1] can be updated is if f() returns
its address. The expression f(1) will always return &d[1]. The
expression f(3) will return &d[1] if twice is not 0. No other call to
f() will return &d[1].
(I'm using some lazy shorthand here - f(0)++ actually means (*f(0))++
etc)

because f(0) is done before f(1) and there is a comma sequence point
after f(0) and before f(1) the side effects of the f(0)++ must be
completed before f(1).

While the comma sequence point does in fact guarantee this, the
sequence point before the call to f(1) also guarantees it. It doesn't
matter why there is a sequence point, only that there is one.
because f(2) is done before f(3) and there is a comma sequence point
after f(2)++ and before f(3) the side effects of the f(2)++ must be
completed before f(3).

As I said in the portion you chose to snip, neither f(0) nor f(2)
matter since they cannot have any effect on d[1].
But f(1) can be called before f(2) but the increment deferred until the
end of the full expression. What this does mean is that part of f(1)++

No it cannot. There is a sequence point prior to the call to f(2) and
any side effect from f(1) must be completed before this sequence
point.
is evaluated before the various sequence points on the RHS of the +
operator and part of it is evaluated after, i.e. the evaluation of the
f(1)++ sub expression crosses sequence points.

Evaluations never cross sequence points. The whole purpose of
sequence points is to ensure that the evaluation is complete.
What we are really interested in is whether the standard allows the
increment of f(1)++ to start before f(2) is called but to not complete
until the end of the full expression.

The standard guarantees that all side effects are complete prior to
the sequence point. Since there is a sequence point prior to the call
to f(2), the increment must be complete.
consider g++ + f(); and (g=g+1) + f(); and (g=5) + f(); where f()
modifies g. Do these have defined or undefined behaviour?

Since evaluating g in its many forms does not involve a sequence
point, this is nothing but a red herring.
The side effects of (*f(1))++ do not have to have started by the time
of the call to f(3).

How can you state that when the standard guarantees exactly the
opposite? There is a sequence point before the call to f(3) and side
effects must be complete at the sequence point.

You're utterly confused about sequence points. Go read one
of the formal model documents about how sequence points
work.
 
Guest

The Standard isn't very clear in stating what the semantics are
in such situations, but (as the DRs explain) function calls are
different than non-function call expressions. An expression like

f() + g()

will always (act as if it does) completely
evaluate f() and completely evaluate g() in
one order or the other, but never overlapping.
So function calls are "atomic", at least with
respect to other function calls in the same
expression. This difference is brought out
in the DR's.

Even if f() and g() are atomic, unless there is a sequence point
between assignments to the same variable in both, the behaviour is
explicitly undefined, and you have not explained where there is one. Or
are you saying v = v = 0 is defined when v is declared as volatile
sig_atomic_t, too?
 
Tim Woodall

The evaluation of the _operands_ of the = operator can overlap
each other. The evaluation of the operator itself must wait
for the operands to have been evaluated.
So you keep saying. And I think this is probably what the standard
authors intended but I still can't see where the standard _requires_ it
unless you assume a particular meaning to previous and subsequent in
5.1.2.3 #2


5.1.2.3 #5 and 5.1.2.3 #8 imply to me that x=(x=5,11); must have defined
behaviour if x is volatile but leave the question open if x is not
volatile and so require us to go back to 5.1.2.3 #2

There are at least three reasonable interpretations of previous and
subsequent in 5.1.2.3 #2

1. the strongest - that previous and next refer to the particular
ordering of sequence points that the particular abstract machine that
the implementation is emulating would imply.

2. that previous and next refer to the ordering that is required by all
possible abstract machines. This means that a side effect can cross a
sequence point if, in another abstract machine, the sequence point would
not have fallen between the two sequence points bounding the side
effect. [1]

3. the weakest - that previous and next refer to the sequence points
that _must_ bound the expression being evaluated. (I do not think the
standard authors intended this interpretation because it would leave a
lot of apparently reasonable code undefined but I do not think the
standard prohibits this interpretation)



[1] It is hard to come up with an example where this can differ from 1.
I had thought my previous example achieved this but I now think it
doesn't. The best I've managed (which doesn't really achieve what I want
but does show another place where "before" is ambiguous) is

#include <stdio.h>

int x[2]={0,1};
int y[2]={2,3};

int *p=y;

int f(int* t)
{
    printf("f:%d\n", *p);
    return 0;
}

int g(void)
{
    printf("g:%d\n", *p); /* Line A */
    p=x;
    return 0;
}

int main(void)
{
    return f(p++) + g();
}


There is a sequence point after the p++ is evaluated and before f is
called. Note that there is ambiguity about what this "before" means.
6.5.2.2 #10 states that "there is a sequence point before the actual
call" while annex C states that "The call to a function ..." is a
sequence point but then references 6.5.2.2 for that claim. (Annex C is
informative, not normative)

If f is called before g there is no problem. p gets incremented before
the call to f and the output is 3 3 and p is left pointing at x[0]

If g is called before f and we assume annex C for the sequence point at g
being called then is it required that p++ either not have started or
have completed? If so, what wording in the standard requires it and
cannot be interpreted in any other way?

If we take the meaning that the sequence point at g occurs before the
call to g then the first sequence point after the p++ can be at the
printf (Line A) in g. Does this force the side effects of p++ to have
completed? If we remove Line A then does this make this have undefined
behaviour because the p++ (assuming it has started) and the p=x; both
complete at the same sequence point?

I would have expected that the standard authors intended the main
function to have defined behaviour (assuming well behaved f() and g() )
regardless of what f and g might have done. But the _next_ sequence
point to p++ that any implementation can be sure about is the one before
the call to f. (I think that the compiler can deduce that there must be
a sequence point inside g)

Tim.
 
Tim Woodall

All ok up to this point.


This step is where your reasoning goes wrong.

Consider this sequence of evaluation, writing {f(1)} to
mean the temporary value that resulted from a previous
evaluation of f(1), and similarly {f(0)}, {f(2)}, {f(3)}:

f(0)
(*{f(0)})++
f(2)
(*{f(2)})++
f(1)
f(3)
(*{f(1)})++
(*{f(3)})++

For the last two lines, {f(1)} == {f(3)}. There are
no sequence points, but both lines update what the
pointers point to, in other words they update the
same object. That's undefined behavior.

I think I've worked out what he is on about here. I think he is taking
the "sequence point before a function is called" to mean "sequence point
_at_ a function call". (This is what appendix C says but appendix C is
informative, not normative)

He is then additionally claiming that the sequence point when f(3) is
called forces the evaluation of f(1)++ to complete. (I can see why this
is a valid interpretation of the standard but not why it's the only
interpretation. In fact, AFAICT this would require x=(x=5,11) to have
defined behaviour as it's a stronger constraint on sequence points than
you are proposing. If there is a DR out there that clarifies this then I
wish somebody would point it out)

Unfortunately, his best argument as to why other possible
interpretations are not allowed is "in your dreams".

Tim.
 
ena8t8si

Harald said:
Even if f() and g() are atomic, unless there is a sequence point
between assignments to the same variable in both, the behaviour is
explicitly undefined, and you have not explained where there is one.
...

The sequence points before function calls behave
differently than other sequence points. The Standard
doesn't always make this clear but it is spelled
out in the DR's. Read the DR's.
 
ena8t8si

Tim said:
The evaluation of the _operands_ of the = operator can overlap
each other. The evaluation of the operator itself must wait
for the operands to have been evaluated.
So you keep saying. And I think this is probably what the standard
authors intended but I still can't see where the standard _requires_ it
unless you assume a particular meaning to previous and subsequent in
5.1.2.3 #2

Right, the meaning of previous and subsequent is crucial.
Read on.
5.1.2.3 #5 and 5.1.2.3 #8 imply to me that x=(x=5,11); must have defined
behaviour if x is volatile but leave the question open if x is not
volatile and so require us to go back to 5.1.2.3 #2

There are at least three reasonable interpretations of previous and
subsequent in 5.1.2.3 #2

1. the strongest - that previous and next refer to the particular
ordering of sequence points that the particular abstract machine that
the implementation is emulating would imply.

2. that previous and next refer to the ordering that is required by all
possible abstract machines. This means that a side effect can cross a
sequence point if, in another abstract machine, the sequence point would
not have fallen between the two sequence points bounding the side
effect. [1]

3. the weakest - that previous and next refer to the sequence points
that _must_ bound the expression being evaluated. (I do not think the
standard authors intended this interpretation because it would leave a
lot of apparently reasonable code undefined but I do not think the
standard prohibits this interpretation)

Your choices here are a bit confused. There is only
one abstract machine.

In any case the answer is #3. See example 15 in 5.1.2.3.
In the statement

sum = sum * 10 - '0' + (*p++ = getchar());

the explanation says

but the actual increment of p can occur at any time
between the previous sequence point and the next
sequence point (the ;) and the call to getchar can
occur at any point prior to the need of its returned
value.

Since the call to getchar() can occur after the ++
operator, but the next sequence point after the ++
is the semicolon, sequence point ordering is a static
condition based on the syntax, not evaluation order.

Incidentally, there is always only one next sequence
point, but there can be more than one previous sequence
point. In the expression

t = (x ? p : q)[ j = i+1, j ];

both the ?: and the comma operator are sequence points
that are previous sequence points for the assignment
to t.
[1] It is hard to come up with an example where this can differ from 1.
I had thought my previous example achieved this but I now think it
doesn't. The best I've managed (which doesn't really achieve what I want
but does show another place where "before" is ambiguous) is

#include <stdio.h>

int x[2]={0,1};
int y[2]={2,3};

int *p=y;

int f(int* t)
{
    printf("f:%d\n", *p);
    return 0;
}

int g(void)
{
    printf("g:%d\n", *p); /* Line A */
    p=x;
    return 0;
}

int main(void)
{
    return f(p++) + g();
}


There is a sequence point after the p++ is evaluated and before f is
called. Note that there is ambiguity about what this "before" means.
6.5.2.2 #10 states that "there is a sequence point before the actual
call" while annex C states that "The call to a function ..." is a
sequence point but then references 6.5.2.2 for that claim. (Annex C is
informative, not normative)

If f is called before g there is no problem. p gets incremented before
the call to f and the output is 3 3 and p is left pointing at x[0]

First notice that we don't need f() at all. The
expression

*p++ + g();

has the same potential problems for update of p as
regards the use of p in g().
If g is called before f and we assume annex C for the sequence point at g
being called then is it required that p++ either not have started or
have completed? If so, what wording in the standard requires it and
cannot be interpreted in any other way?

Unfortunately the Standard isn't very good at explaining
what happens with sequence points in the presence of
function calls. The DR's explain the case where both
accesses are inside functions (unspecified, not undefined,
behavior). As far as I know the particular case where
one access is inside a function and another access is
parallel to a call on that function isn't specifically
addressed.
If we take the meaning that the sequence point at g occurs before the
call to g then the first sequence point after the p++ can be at the
printf (Line A) in g. Does this force the side effects of p++ to have
completed? If we remove Line A then does this make this have undefined
behaviour because the p++ (assuming it has started) and the p=x; both
complete at the same sequence point?

Whether the first access in g() is *p or p=x doesn't affect the
undefinedness of behavior; if p=x is undefined then so is *p,
and vice versa.
I would have expected that the standard authors intended the main
function to have defined behaviour (assuming well behaved f() and g() )
regardless of what f and g might have done. But the _next_ sequence
point to p++ that any implementation can be sure about is the one before
the call to f. (I think that the compiler can deduce that there must be
a sequence point inside g)

The DR's make it clear that sequence points before function
calls behave differently than other sequence points.
Unfortunately the DR's don't specifically address the
question of one access inside a function and another
access parallel to a call on that function. Time to
write a new DR.
 
