C Test Incorrectly Uses printf() - Please Confirm

Shao Miller

pete said:
Yes;
it doesn't matter whether or not the undefined value
of the extra expression is shown.
It's the evaluation of the extra expression
which makes the program be undefined.
Some other people have agreed to this, too. Will someone please explain
why 'a + 5' has an undefined value? The value of 'a' is guaranteed
intact between the last sequence point and the sequence point before the
call to 'printf'. Furthermore, no expression in the list of arguments
attempts to simultaneously modify 'a' and read 'a' for some purpose
unrelated to the modification.

'++a' and 'a + 5' are separate expressions.

The value of 'a' is intact until the sequence point. 'a' is not
'volatile', since we can see its declaration.

How about reading 'n1256.pdf', 6.5.3.1,p2:

"The expression ++E is equivalent to (E+=1)."

Then on to 6.5.16,p4:

"...If an attempt is made to modify the result of an assignment operator
or to access it after the next sequence point, the behavior is undefined."

It says "after" there, not "before".
 
Shao Miller

christian.bau said:
Absolutely wrong.
Please read the post which the inner post was a response to. Please
address deficiencies in interpretation of referenced text therein.
There is a sequence point between the end of the preceding statement
and the printf statement. Agreed.

After that sequence point, the arguments to
the printf call and the printf function specifier are evaluated. Agreed.

Then
comes another sequence point, followed by the actual function call. Agreed.

The problem is that during the argument evaluation the object "a" is
modified,
I disagree. The object's value is intact between the previous sequence
point and the one before the function call. See the other post for
references.
and it is accessed without using the accessed value to
determine the new value
I disagree. '++a' is defined as equivalent to '(a+=1)'. '(a+=1)'
differs from 'a=a+1' in that the value of 'a' is only read once. The
accessed value is used to determine the new value. See the prefix
increment operator definitions for details. See the compound assignment
definitions for details.
(a parameter "a = a + 1" would be Ok, because
the access is used to determine the new value).
'++a' is very nearly identical to 'a = a + 1', with the difference noted
just above.
This combination of
modification and access leads to undefined behaviour, as the C
Standard expressly says.
Which combination, exactly? In which section does it expressly say that,
if you please?
Once we have undefined behaviour,
Well do we? Which access leads to the undefined behaviour, exactly?
we need not bother looking at which
function is called and what that function does.
If we agreed to undefined behaviour, I'd agree with this.
So what could go
wrong? According to the rules of the C language the compiler can
_assume_ that you don't do anything that leads to undefined behaviour.
In this case, the compiler can _assume_ that a is not modified,
The side effect of modification does not occur until just before the
sequence point for the 'printf' call. The compiler can assume that with
confidence. Would you agree?
even
though it is perfectly clear that a _is_ modified.
Agreed. Before 'printf' is called, the value of 'a' is modified.
So an optimiser can
for example remove the code that updates the memory location
containing a. So a second printf statement printf ("%d\n", a); could
again print the original value 1.
The modification of the value of 'a' should not be skipped past sequence
points in a conforming implementation. Would you agree?
 
Shao Miller

pete said:
What is the value of (a + 5) ?
6 for a conforming implementation. 6 or 7 for a non-conforming one. In
a conforming implementation, the value of 'a' (which is non-'volatile')
is not changed until the sequence point immediately before the function
call to 'printf'.

No evaluation of any of the multiple expressions in the argument list
leads to undefined behaviour, so the integrity of the abstract machine's
operation is also intact. Would you agree that the expressions in the
argument list are not a single expression (although it might look like
an expression with multiple comma operators, for example)?

Thanks, pete. :)
 
Keith Thompson

Shao Miller said:
6 for a conforming implementation. 6 or 7 for a non-conforming one. In
a conforming implementation, the value of 'a' (which is non-'volatile')
is not changed until the sequence point immediately before the function
call to 'printf'.

Where does the standard say that the side effect of incrementing a must
not occur until the sequence point? (Hint: It doesn't.)
No evaluation of any of the multiple expressions in the argument list
leads to undefined behaviour, so the integrity of the abstract machine's
operation is also intact. Would you agree that the expressions in the
argument list are not a single expression (although it might look like
an expression with multiple comma operators, for example)?

The expression in question is the entire function call.
By themselves, the subexpressions ++a and a + 5 are well defined.
But because they appear together as part of a larger expression
with no intervening sequence points, the behavior is undefined.

Once again, 6.5p2:

Between the previous and next sequence point an object shall
have its stored value modified at most once by the evaluation
of an expression. Furthermore, the prior value shall be read
only to determine the value to be stored.

The printf violates the second sentence; the second subexpression a +
5 reads the value of a, but not to determine the value to be stored.

That second sentence can be confusing (it confused me for a long
time). The point is that if an object is read just to determine
the value to store in that same object, there's no problem, since
the read logically has to occur before the write. But if the object
is read in a part of the expression that doesn't contribute to the
computation of the value to be stored, then the read and the write
could occur in either order, at the whim of the compiler.

The authors of the standard *could* have placed some limitations
on how such expressions are evaluated, saying, for example, that
either ++a must be evaluated (including its side effect) before a +
5, or vice versa. But that would constrain the optimizations that
compilers are able to perform and make the standard more complex.
And it's generally easy enough to avoid writing such code in the
first place.
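
A minimal sketch of how such code is usually avoided: give the increment
its own statement, so that a sequence point separates the write to 'a'
from the later read. The function names below are hypothetical and do not
come from the test question or the thread.

```c
#include <stdio.h>

/* Hypothetical helper: increments *a in its own statement, then reads it.
   The ';' ending the increment statement is a sequence point, so the
   later read is ordered after the write. */
int incremented_plus_five(int *a)
{
    ++*a;            /* the write is complete at the end of this statement */
    return *a + 5;   /* this read sees the incremented value */
}

/* Computes both values before the call to printf, instead of mixing
   ++a and a + 5 in one argument list. Returns the sum for checking. */
int print_separately(int a)
{
    int sum = incremented_plus_five(&a);
    printf("%d %d\n", a, sum);
    return sum;
}
```

With 'a' starting at 1, print_separately(1) prints "2 7" and returns 7 on
every conforming implementation, because no object is both written and
independently read between two adjacent sequence points.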
 
Eric Sosman

[...]
"...At certain specified points in the execution sequence called
sequence points, all side effects of previous evaluations shall be
complete and no side effects of subsequent evaluations shall have taken
place."

The object designated by 'a' has its stored value modified not before
the sequence point just before calling the function. The side effects
are effectively coalesced at that point in time, by this definition.

No, the object designated `a' *definitely* has its stored value
modified before the sequence point that precedes the function call.
Have you confused "before" and "after," or are you smoking something?
In any event, it makes no never-mind: The damage is done before any
sequence point intervenes to prevent it, and that's all she wrote.
The argument list in a function call is a comma-separated list of
expressions to be evaluated, not a single expression (the syntax of
6.5.2,p1 and the semantics of 6.5.2.2,p3). 6.5,p2 states "an expression"
rather than "an expression or a comma-separated list of expressions".

Yes, but the Standard speaks of "sequence points," not of syntactic
forms. There is no sequence point associated with the , that divides
argument expressions from each other. (Do not confuse the , separator
with the , operator.)
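
To illustrate the distinction, here is a small sketch (the function name
is made up for this example). Wrapping the two expressions in parentheses
turns the comma into the comma operator, which does have a sequence point
after its left operand. This is a different program from the one in the
test question, not a reading of it.

```c
/* Hypothetical helper contrasting the comma OPERATOR with the comma
   SEPARATOR of an argument list. */
int comma_operator_demo(int a)
{
    /* The parentheses make this a single expression using the comma
       operator. There is a sequence point after ++a, so a + 5 reads
       the already incremented value. */
    int result = (++a, a + 5);
    return result;
}
```

Called with 1, this returns 7. Written as two separate arguments,
'++a, a + 5' has no such sequence point between the write and the read.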
I do not agree that there is no undefined behaviour. I believe that the
correct answer is (b).

I think you've mis-counted the parity of your negatives, because
your first sentence (as written) contradicts the second (as written).
The value of 'a' is stable between its initializing declaration and the
instant before 'printf' is called. No single expression attempts to
modify the value and read the value for some other purpose. Would you
agree?

No, absolutely not. With the two arguments `++a, a+5', the value
of `a' is *un*stable, unstable to the point where (in theory) it need
not even exist.
 
Eric Sosman

6 for a conforming implementation. 6 or 7 for a non-conforming one. In a
conforming implementation, the value of 'a' (which is non-'volatile') is
not changed until the sequence point immediately before the function
call to 'printf'.

No. The Standard localizes side-effects to the intervals[*] between
sequence points, but no more tightly. The side-effect can occur at any
moment between the preceding and succeeding SP's, or may even be spread
out over a lengthy span between them. There is no requirement that all
side-effects occur at the "barrier" of the succeeding SP, no more than
there is a requirement for them all to occur just after the "admitter"
of the preceding SP.

[*] Note that the "net" of sequence points need not be linear, so
the notion of "interval" pertains only to a particular traversal. In
the situation at hand there is only one branch, but other situations
may be more intricate.
No evaluation of [... fantasy snipped ...]
 
Eric Sosman

Eric said:
[...]
Is it also undefined behaviour for:

int i = 1;
i = i + i;

No. The Standard says "shall be read," not "shall be read
exactly once."
I think I understand. So in:

int i = 1, j;
i = j = i + i * 9 + 3;

we are likewise protected because we can conclude that sooner or later
within some expression, we are modifying the stored value of 'i'. Is
that right?

Sorry; I do not see what you're driving at.
 
Shao Miller

Keith said:
Where does the standard say that the side effect of incrementing a must
not occur until the sequence point? (Hint: It doesn't.)
Thanks for pointing this out, Keith! :) This is a critical flaw to my
argument.

"...At certain specified points in the execution sequence called
sequence points, all side effects of previous evaluations shall be
complete and no side effects of subsequent evaluations shall have taken
place."

This does not imply that side effects are prohibited between sequence
points and only completed as a sequence point is passed.
The expression in question is the entire function call.
By themselves, the subexpressions ++a and a + 5 are well defined.
But because they appear together as part of a larger expression
with no intervening sequence points, the behavior is undefined.
This is another critical flaw to my argument. Treating the entirety of
'printf("%d", ++a, a + 5)' as an expression qualifies it as a single
expression for the treatment of:

"Between the previous and next sequence point an object shall have its
stored value modified at most once by the evaluation of an expression..."

This particular argument you offer requires that:

For any expression, if a side effect during evaluation of that
expression would modify the value of an object, then the previous value
of that object shall be read only as part of a computation for the new
value of the object.

In fact, that's nearly what the text states, and why we could expect:

int i = 1, b;
b = ++i + i;

to be undefined. But please also recognize the possible implication of
"only". Does this prevent a larger expression with the same sequence
point boundaries from doing anything else but a modification of the
object? This larger expression can have other side effects.

So an interpretation could be that the value cannot be used outside of
the sub-expression which directly provides the value to be stored. Then
the latter line of:

int i = 1, j;
i = j = i + i;

would be grouped as:

i = (j = (i + i));

and we are fine in that 'i' is only read within the sub-expression
immediately responsible for the new value. We should not:

int i, j;
/* Uh oh */
j = (i + (i = (1)));

because there's a read of 'i' outside of the sub-expression for the
new value. Would you agree? :)

int i = 1, j = 1, k = 1;
/* Fine */
k = ((j = ((i = (i + j + k)) + j + k)) + k);

int i = 0, j;
int *ip = &i;
/* Uh oh, 'i' reads outside of the sub-expression */
*(ip + i) = (i + 5);
*(ip + (j = i)) = (i + 5);

Would you agree? :)
Once again, 6.5p2:

Between the previous and next sequence point an object shall
have its stored value modified at most once by the evaluation
of an expression. Furthermore, the prior value shall be read
only to determine the value to be stored.

The printf violates the second sentence; the second subexpression a +
5 reads the value of a, but not to determine the value to be stored.

That second sentence can be confusing (it confused me for a long
time). The point is that if an object is read just to determine
the value to store in that same object, there's no problem, since
the read logically has to occur before the write. But if the object
is read in a part of the expression that doesn't contribute to the
computation of the value to be stored, then the read and the write
could occur in either order, at the whim of the compiler.
Agreed. lacos and Willem explained how simultaneous reads and writes
could be a problem.
The authors of the standard *could* have placed some limitations
on how such expressions are evaluated, saying, for example, that
either ++a must be evaluated (including its side effect) before a +
5, or vice versa. But that would constrain the optimizations that
compilers are able to perform and make the standard more complex.
That would be a Bad Thing. :)
And it's generally easy enough to avoid writing such code in the
first place.
Agreed. I appreciate your explanatory efforts here, Keith. :)
 
Shao Miller

Eric said:
[...]
"...At certain specified points in the execution sequence called
sequence points, all side effects of previous evaluations shall be
complete and no side effects of subsequent evaluations shall have taken
place."

The object designated by 'a' has its stored value modified not before
the sequence point just before calling the function. The side effects
are effectively coalesced at that point in time, by this definition.

No, the object designated `a' *definitely* has its stored value
modified before the sequence point that precedes the function call.
Have you confused "before" and "after,"
Thanks, Eric. :) No. That particular argument was based on the faulty
assumption that side effects were coalesced at sequence point boundaries
and that evaluations must be based on the set of values yielded by the
last sequence point before any side effects are possible. This can
hardly be justified, given:

"Accessing a volatile object, modifying an object, modifying a file, or
calling a function that does any of those operations are all side
effects, which are changes in the state of the execution environment.
Evaluation of an expression may produce side effects..."

I apologize for the confusion. :)
or are you smoking something?
In any event, it makes no never-mind: The damage is done before any
sequence point intervenes to prevent it, and that's all she wrote.
Ok.


Yes, but the Standard speaks of "sequence points," not of syntactic
forms. There is no sequence point associated with the , that divides
argument expressions from each other. (Do not confuse the , separator
with the , operator.)
I'd already recommended that, so definitely agree that the two should
not be confused.

The trick here is in treating those sub-expressions as part of a larger
expression to which 6.5,p2 applies, and not each on their own. :) If
each sub-expression were mandated to execute in _a_ sequence (rather
than in parallel), then though we mightn't know what 'a + 5' gets, it
would get _something_. That something would then be discarded in the
'printf'. There doesn't appear to be such a mandate, does there? :)
I think you've mis-counted the parity of your negatives, because
your first sentence (as written) contradicts the second (as written).
I've noticed that some posters give responses to posts where they are
potentially disadvantaged by not having read more than the one which
they wish to reply to. You think I've mis-counted, I think I corrected
it long before your response. No sweat. ;)
No, absolutely not. With the two arguments `++a, a+5', the value
of `a' is *un*stable, unstable to the point where (in theory) it need
not even exist.
Ok. Thanks again, Eric. :) Seems pretty convincing for answer (d) now,
I'll happily admit. Heh.
 
Shao Miller

Eric said:
Eric said:
On 8/8/2010 4:47 PM, Shao Miller wrote:
[...]
Is it also undefined behaviour for:

int i = 1;
i = i + i;

No. The Standard says "shall be read," not "shall be read
exactly once."
I think I understand. So in:

int i = 1, j;
i = j = i + i * 9 + 3;

we are likewise protected because we can conclude that sooner or later
within some expression, we are modifying the stored value of 'i'. Is
that right?

Sorry; I do not see what you're driving at.
And I am sorry for not explaining with the detail you require. :)

What I was trying to ask was if 'i' in the last code line given above is
used only to determine the value to be stored in 'i'. It seems that a
compiler could make that determination by noting that within that
expression, 'i' is modified.

That is, it needn't be the case that both of "'i' is read only to
determine the value to be stored in 'i'" and "'i' is not read to
determine the value to be stored in any other object" are defined.

As in, we can assign to 'j' and even though that's a side effect of its
own, the reading of 'i' is still part of a larger computation with
responsibility for determining the new value for 'i'.

Does that make sense?
 
Seebs

No, the correct answer is (b). Answer (d)
would apply where the printf arguments were
the other way round.

Not so.

There is no sequence point between "++a" and "a + 5".

Therefore the behavior is undefined. Not just the result of the
expression; the *behavior*. The compiler can reject the program, or
compile it into code which dumps core, or ANYTHING ELSE.

-s
 
Shao Miller

pete said:
Shao said:
6 for a conforming implementation.
6 or 7 for a non-conforming one.
In a conforming implementation,
the value of 'a' (which is non-'volatile')
is not changed until the sequence point
immediately before the function call to 'printf'.

There's no ordering of evaluations or side effects
in between sequence points.

N869
5.1.2.3 Program execution

[#16] EXAMPLE 7 The grouping of an expression does not
completely determine its evaluation. In the following
fragment

        #include <stdio.h>
        int sum;
        char *p;
        /* ... */
        sum = sum * 10 - '0' + (*p++ = getchar());

the expression statement is grouped as if it were written as

        sum = (((sum * 10) - '0') + ((*(p++)) = (getchar())));

but the actual increment of p can occur at any time between
the previous sequence point and the next sequence point (the
;), and the call to getchar can occur at any point prior to
the need of its returned value.
Thanks, pete. :)

The difference would be treating each of the arguments as its own
expression. In that example, each of the operators is either unary or
binary, so the whole expression is a tree with at most two branches at
any branching. With a function call, the arguments can be N branches at
the branch-point, and they are all grouped together for theoretically
simultaneous evaluation.

An argument of mine went: the constraints regarding reading previous
values apply to the sub-expression for each argument separately. If you
discard that and say that the constraint is on the whole expression,
well-defined behaviour softly and suddenly vanishes away.
 
Eric Sosman

Eric said:
Eric Sosman wrote:
On 8/8/2010 4:47 PM, Shao Miller wrote:
[...]
Is it also undefined behaviour for:

int i = 1;
i = i + i;

No. The Standard says "shall be read," not "shall be read
exactly once."

I think I understand. So in:

int i = 1, j;
i = j = i + i * 9 + 3;

we are likewise protected because we can conclude that sooner or later
within some expression, we are modifying the stored value of 'i'. Is
that right?

Sorry; I do not see what you're driving at.
And I am sorry for not explaining with the detail you require. :)

What I was trying to ask was if 'i' in the last code line given above is
used only to determine the value to be stored in 'i'. It seems that a
compiler could make that determination by noting that within that
expression, 'i' is modified.

I still don't see what you're driving at. You seem to be saying
that if `i' were not modified -- if the l.h.s. were `k', say -- then
the behavior would be undefined. That's too absurd even for a fever
dream, so you must mean something else -- but I can't imagine what
that something else might be.
That is, it needn't be the case that both of "'i' is read only to
determine the value to be stored in 'i'" and "'i' is not read to
determine the value to be stored in any other object" are defined.

If the first is true, it implies the second. Beyond that, I still
don't discern your point.
As in, we can assign to 'j' and even though that's a side effect of its
own, the reading of 'i' is still part of a larger computation with
responsibility for determining the new value for 'i'.

What "new value for `i'?" If you're assigning to `j' instead,
there's nothing that would modify `i'. If you're going to hypothecate
a different construct, it might help if you'd exhibit same ...
Does that make sense?

I am, as Anna Russell put it, "as befogged as before."
 
Shao Miller

Eric said:
Eric said:
On 8/8/2010 5:39 PM, Shao Miller wrote:
Eric Sosman wrote:
On 8/8/2010 4:47 PM, Shao Miller wrote:
[...]
Is it also undefined behaviour for:

int i = 1;
i = i + i;

No. The Standard says "shall be read," not "shall be read
exactly once."

I think I understand. So in:

int i = 1, j;
i = j = i + i * 9 + 3;

we are likewise protected because we can conclude that sooner or later
within some expression, we are modifying the stored value of 'i'. Is
that right?

Sorry; I do not see what you're driving at.
And I am sorry for not explaining with the detail you require. :)

What I was trying to ask was if 'i' in the last code line given above is
used only to determine the value to be stored in 'i'. It seems that a
compiler could make that determination by noting that within that
expression, 'i' is modified.

I still don't see what you're driving at. You seem to be saying
that if `i' were not modified -- if the l.h.s. were `k', say -- then
the behavior would be undefined. That's too absurd even for a fever
dream, so you must mean something else -- but I can't imagine what
that something else might be.
I'm trying to ask you:
- Does the expression on the last line in the code example have
well-defined behaviour? (I believe it does.)
- If so, the reads of 'i' are being used to determine the value to write
to 'j', right?
- That value computed for 'j' is being used to determine the value to
write to 'i', right?
- The whole line only has a sequence point at the semi-colon, right?
If the first is true, it implies the second. Beyond that, I still
don't discern your point.
In the code, 'i' is being read to determine which, if any, of these:
- The value to be stored in 'j'
- The value to be stored in 'i'
- Both
- Neither

?
What "new value for `i'?" If you're assigning to `j' instead,
there's nothing that would modify `i'. If you're going to hypothecate
a different construct, it might help if you'd exhibit same ...
'i' is being assigned to in that last line of the code example, right?
I am, as Anna Russell put it, "as befogged as before."
Perhaps I can attempt to elaborate based on any kind responses you might
provide to the above. :) I believe that we're in agreement here based
on your earlier points, but do wish to be sure.
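
For reference, a sketch of the chained assignment under discussion,
assuming the usual reading of 6.5p2 under which every read of 'i' here
counts as determining the value stored in 'i' (it reaches 'i' by way of
'j'). The wrapper function and its out-parameters are hypothetical and
exist only so the result can be inspected.

```c
/* Hypothetical wrapper: evaluates the chained assignment from the thread
   and reports both results through out-parameters. */
void chained_assignment(int *i_out, int *j_out)
{
    int i = 1, j;
    /* Groups as i = (j = (i + i * 9 + 3)). 'i' is written once, and each
       read of 'i' feeds the value that is finally stored in 'i'. */
    i = j = i + i * 9 + 3;
    *i_out = i;
    *j_out = j;
}
```

Both 'i' and 'j' end up as 13, since 1 + 1 * 9 + 3 is 13 and the value of
the inner assignment is the value assigned to 'j'.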
 
Nick Keighley

yes

what?!

I think the fact that it exhibits Undefined Behaviour is highly
relevant!
The standard is not intended only for compiler writers.  It is, in
effect, a contract between implementers and programmers; as such,
both implementers and programmers can benefit from being familiar
with it.

In practice, the fact that the presented code is ugly and should
never pass a code review, whether its behavior is defined or not,
may be more important than the reasons why its behavior is undefined.
But I'd be interested in knowing why you think that reading the
standard (and no, footnotes aren't relevant here) is "something
only a compiler writer should do.".

I think reading standards is a pretty productive thing to do. And as
these things go, the C standard is pretty good. It's certainly a very
readable document.
 
Ike Naar

[ about printf("%d", ++a, a + 5); ]
'++a' and 'a + 5' are separate expressions.
The value of 'a' is intact until the sequence point.

It looks like you are mistaken about the nature of sequence points.
A sequence point is a point in time where nothing happens,
a stability point, a point of rest. It is _not_ a point of action.
Transitions happen _between_ sequence points; at the sequence point
itself, the machine for a moment has come to a halt, all transitions
that happened before the sequence point have completed, and all
transitions that will happen after the sequence point have not yet started.

We could draw a timeline, like this:

   activity   rest    activity     rest   activity
---------------+--------------------+--------------
              seq                  seq

From what you write, it looks like you think that all the action
happens _at_ sequence points (sorry if I'm understanding you wrong).

In `` printf("%d", ++a, a + 5); '' the first sequence point, S0, is
before the statement starts. The next sequence point, S1, is before the
call to printf. In between those sequence points, the arguments
(the expressions ``"%d"'', ``++a'' and ``a+5'') are evaluated.
Several accesses to ``a'' will happen here: the ``a'' in ``++a''
will be read to obtain its previous value (event ar0),
the ``a'' in ``++a'' will be modified to store
the incremented value (event aw0) and ``a'' in ``a+5'' will be
read to compute the sum of ``a'' and ``5'' (event ar1).

Necessarily, ar0 must happen before aw0, because the value written
at aw0 depends on the value read at ar0. But ar1 can happen at any
time, so, the following interleavings are possible:

     ar1,ar0,aw0
--+---------------+--
  S0              S1

     ar0,ar1,aw0
--+---------------+--
  S0              S1

     ar0,aw0,ar1
--+---------------+--
  S0              S1

That's why the value of the third argument ``a+5'' is not well-defined.
It depends on the order in which aw0 and ar1 occur, and any order is
possible. In standardese: the variable ``a'' is both read and modified
between two sequence points, and the value is read (at ar1) for another
purpose than to compute the value written (at aw0).

When the write does not depend on the read, the implementation has
the freedom to schedule the read and the write in any order it wishes.
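
The three interleavings can be replayed with a small well-defined model.
This is only a sketch: the enum, the function name, and the idea of
passing the event order as an array are assumptions made for illustration.
It models the abstract accesses ar0, aw0 and ar1 one at a time, whereas
the real program has undefined behaviour and is not limited to the
outcomes shown here.

```c
/* The three abstract accesses named in the post. */
enum event { AR0, AW0, AR1 };

/* Replays the events in the given order, starting from initial_a, and
   returns the value that a + 5 would have under that ordering. */
int third_argument_value(const enum event order[3], int initial_a)
{
    int a = initial_a;
    int read_for_increment = 0;  /* value seen by ar0 */
    int sum = 0;                 /* value computed at ar1 */

    for (int k = 0; k < 3; ++k) {
        switch (order[k]) {
        case AR0: read_for_increment = a;     break; /* read for ++a  */
        case AW0: a = read_for_increment + 1; break; /* store new a   */
        case AR1: sum = a + 5;                break; /* read for a+5  */
        }
    }
    return sum;
}
```

Starting from 1, the orders ar1,ar0,aw0 and ar0,ar1,aw0 give 6, while
ar0,aw0,ar1 gives 7, which matches the point that the result depends on
where ar1 falls relative to aw0.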
 
Malcolm McLean

In practice, the fact that the presented code is ugly and should
never pass a code review, whether its behavior is defined or not,
may be more important than the reasons why its behavior is undefined.
But I'd be interested in knowing why you think that reading the
standard (and no, footnotes aren't relevant here) is "something
only a compiler writer should do.".
Because it could lead to people writing code which is strictly
conforming, but squeaks through on a technicality.

However I referred to thumbing through the standard, in this
particular case, to check the precise rules for complex expressions
involving preincrement operators and two instances of the same
variable. Not to reading the standard generally.
 
Martin O'Brien

My thanks to you all for your responses. I will abstract the details
into an e-mail and send it to the test-setter. Will you be interested
in seeing the response?
 
Nick Keighley

6 for a conforming implementation.  6 or 7 for a non-conforming one.

why is a non-conforming implementation restricted in this way? What
restricts it? Why can't it yield 6.02e23 or a-suffusion-of-yellow ?

<snip>
 
Ersek, Laszlo

Ersek, Laszlo wrote:
int i = 1;
i = i + i;

could yield undefined behaviour due to the double-read and the single
store.

No, both reads are unavoidably necessary to determine the next value.
(We're talking double reads of the abstract machine, not actual machine
instructions.) See also:

From: Eric Sosman <[email protected]>
Date: Sun, 08 Aug 2010 17:10:49 -0400
Message-ID: <[email protected]>

> Is it also undefined behaviour for:
>
> int i = 1;
> i = i + i;

No. The Standard says "shall be read," not "shall be read
exactly once."

The paragraph quoted at the top allows the following:

- any number of reads and no writes,
- a single write and no reads,
- a single write and any number of reads, but all of those reads must
be required for the computation of the value to be stored.

lacos
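
One well-defined instance of each of the three cases, as a sketch. The
function names are hypothetical and chosen only to mirror the list.

```c
/* Any number of reads and no writes. */
int reads_only(int i)
{
    return i + i * i;
}

/* A single write and no reads (the later read in the return statement
   is in a separate statement, after a sequence point). */
int write_only(void)
{
    int i;
    i = 42;
    return i;
}

/* A single write and several reads, all of which are needed to compute
   the value being stored. */
int reads_feed_the_write(int i)
{
    i = i + i;
    return i;
}
```

For example, reads_feed_the_write(1) returns 2, the same result as the
'i = i + i' case discussed above.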
 
