By translation, I mean compiling. What right would any implementation
have to refuse to compile the code? I don't mean run-time.
The implementation could reboot the computer during evaluation of the
second statement, noting multiple writes to the same object.
Do they? It might be syntactically valid, but could a compiler
determine that there are multiple attempts to write to 'i' within the
same expression and the same sequence point bounds? Could it make that
determination at compile-time (translation-time)? It seems like that
would be an easy determination to make. A compiler could then stop
compilation and output an error, such as "Violation of C99 6.5p2", and
the code needn't compile.
If a 'volatile' access is implementation-defined and operand evaluation
order and side effect order are unspecified, how can the run-time
behaviour be within the scope of the Standard? We know which side
effects are warranted, but what else?
arr[arr[0]-1] could be arr[0], meaning the other arr[arr[0]]
references don't have compile-time-defined behaviour [hence it's UB].
I don't follow you here. What is it that's missing in order for
translation behaviour to be well-defined? If you are suggesting that
the run-time behaviour is dependent on factors outside of the scope of
the Standard, then that's part of what I was asking about, for sure.
I think that's what you're suggesting, here.
What is missing here is the standard does not prescribe the order of
evaluation of the individual terms.
But can that impact compilation (translation)?
You meant '-' rather than '+', but I follow you.
What is the difference between these two examples? It seems you are
suggesting a temporal order as the difference. What about:
A1 = ++i and A2 = i++ (same time)
B = i
j = B * A1 - A2
That might be worse than the other two examples, since the two writes
happen at the same time, which might lead to very odd consequences.
Furthermore, even if it were valid [which I contend it's not], it's
horrible and should not in any form be encouraged.
I guess we would have to come to an agreement about "valid" first,
then proceed with agreements from there.
Well it's syntactically valid code, it will compile, but the standard
doesn't prescribe the BEHAVIOUR of the code.
There are different bits we can call "behaviour", including behaviour
during compilation and behaviour during execution. I believe that you
are referring to the execution behaviour, here.
One possible use is to attempt to determine characteristics about an
implementation which are not required to be documented. A battery of
tests that exercise potentially awkward situations can possibly reveal
some statistics that could be used towards Bayesian inferences about
behaviour to rely upon. There are no guarantees, so such tests might
give the next best thing, when an implementation's vendor is not
interested or no longer available. Who will do this? Maybe nobody.
Other uses are possible, but I don't think it matters. We can invent
uses on-the-fly, so "horrible" is subject to change, or is a predicate
in certain contexts such as "in most cases". That's fine by me.
Well you wouldn't have to guess what is UB or not UB if you read the
damn spec. It specifically denotes things like that [lacking a
sequence point] as causing UB.
We have a disconnect. The "guessing" was about the implementation's
design decisions and operational characteristics, not about a C
Standard. Sorry for the confusion.
Basically the rule is simple: if a single statement [more or less]
uses a variable [or, through an array, potentially the same variable]
multiple times with modification, it's likely to be UB.
Right.
So, just like i = ++i * i--; is likely to be a bad idea, so is i =
arr[0]++ * --arr[0]; or i = arr[arr[0]]++ * --arr[arr[0]]; ...
A bad idea indeed and in general, sure.
That is indeed what I meant. Thanks for helping to clarify where my
terminology was failing.
[...]
What is missing here is the standard does not prescribe the order of
evaluation of the individual terms. Even in something like
j = ++i * i - i++;
What's missing is that it's not just about the order of evaluation.
The behavior is completely undefined; it's not just a choice among
the possible orders of evaluation.
[...]
Well it's syntactically valid code, it will compile, but the standard
doesn't prescribe the BEHAVIOUR of the code.
It won't necessarily compile. See the note under the standard's
definition of "undefined behavior":
NOTE Possible undefined behavior ranges from ignoring the
situation completely with unpredictable results, to behaving
during translation or program execution in a documented manner
characteristic of the environment (with or without the issuance
of a diagnostic message), to terminating a translation or
execution (with the issuance of a diagnostic message).
[...]
Well I don't quite see how the compilation behaviour could be undefined.
How can the compiler make the determination that the original post's
code is a violation of 6.5p2? By completely discarding 'volatile' and
assuming that the array values are what they were at the last sequence
point?
If we used a user-input value as an index, would that leave a chance for
undefined behaviour during compilation? That was roughly one of the
ideas for the code example; a determination that cannot be made at
compile-time.
If 'volatile' can be discarded, then perhaps a compiler could refuse to
compile because it determines a violation of 6.5p2. Is that really
"allowed"? It seems like it is, but seems like a fair question to ask.
Fair enough. But that's mostly a way of saying "the compiler ain't
broke so stop trying to use that code."
However, I think most compilers will compile said UB code just fine.
The output code will be unpredictable garbage but it'll translate just
fine.
What makes it undefined behaviour, exactly? In the original post, we
have the assumption that no factor modifies the 'arr' array. With that
assumption, it's a clear violation of 6.5p2 and thus undefined
behaviour. Is that what you are referring to? Or are you suggesting
that it's undefined behaviour by the Standard, regardless of any
assumptions or factors?
I was mostly trying to correct the point that just because it compiles
doesn't mean it's not UB.
Where was that point that you intended to correct in the thread? I
think that perhaps it was just a miscommunication due to terminology.
Translation-time/compile-time versus execution-time/run-time.
Thanks, Mr. T. St. Denis.