Nobody said:
The problem is that the compiler is *not* free to do this (as far as I can
see). Surely clearing the buffer *is* a side effect?
That seems to me to be what C99 says:
5.1.2.3 Program execution
...
[#2] Accessing a volatile object, modifying an object,
modifying a file, or calling a function that does any of
those operations are all side effects,10) which are changes
in the state of the execution environment. Evaluation of an
expression may produce side effects. At certain specified
points in the execution sequence called sequence points, all
side effects of previous evaluations shall be complete and
no side effects of subsequent evaluations shall have taken
place. (A summary of the sequence points is given in annex
C.)
This implies that all writes are side-effects, as are reads of volatile
objects.
If buffer originally contained non-zero values, and another thread
was monitoring its contents via a "volatile char *", it should be
guaranteed to see the elements being cleared in ascending order, with
buffer_ready only being set after all elements of buffer were cleared.
The statement about seeing elements cleared in ascending order is
wrong. Even under the most stringent reading of 5.1.2.3 p 5 and the
description of volatile in 6.7.3 p 6,
I never mentioned 5.1.2.3 p 5 or 6.7.3 p 6. I did mention 5.1.2.3 p 2,
which you decline to address.
Maybe I'm misinterpreting it; if you think so, say so (saying *why* would
also be useful).
Yes, 5.1.2.3 p 2 certainly bears on the discussion, and it would
be good to address it.
It's important to understand, when considering how 'volatile' behaves,
that there are two "machines" under consideration: the physical
machine, and the abstract machine.
The physical machine is the computer as we experience it through our
programs and how they behave. (Note: I'm speaking as though there is
only one perspective on a physical machine, but in actuality there are
(at least) several. I'm going to ignore these distinctions for the
moment.) A physical machine always does /something/ -- possibly only
probabilistically, but still something -- and we can find out what it
does through experimentation. The physical machine exists in the
physical universe, and we can discover what it does in different
situations.
The abstract machine is a conceptual notion; it has no physical
existence but "exists" mainly in the minds of implementors. The
abstract machine is sort of a mathematical tool for defining
behavior -- C is defined in terms of how the "abstract machine"
behaves, not how a physical machine behaves.
The first and most important point of contact between the abstract
machine and the physical machine is the so-called "as-if" rule.
What this rule says, basically, is that the physical machine can
do anything at all, as long as the 'outputs' of a program match
what would happen if the physical machine and abstract machine
were always in lock step agreement.
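As a rough illustration of the "as-if" rule (a sketch only; the
function and variable names are made up for this example and do not
come from the code under discussion): in the abstract machine every
store to 'scratch' below happens, once per iteration, but nothing
observable depends on it, so a physical machine that never performs
those stores is still conforming.

```c
/* Hypothetical example: sum the integers 0 .. n-1. */
unsigned sum_below(unsigned n)
{
    unsigned total = 0;
    unsigned scratch = 0;       /* stored to, but never affects the result */

    for (unsigned i = 0; i < n; i++) {
        scratch = i * 2;        /* abstract machine: a store on every pass */
        total += i;             /* this is all that the output depends on  */
    }
    (void)scratch;              /* only here to silence "unused" warnings  */
    return total;
}
```

An implementation could drop 'scratch' entirely, or even replace the
loop with a closed-form computation, and the caller could not tell the
difference from the returned value.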
The second point of contact between the abstract machine and
the physical machine is volatile-qualified access. Basically,
using volatile places additional restrictions on how aligned
(or unaligned) the abstract machine and the physical machine
may be.
The question you raised (about another thread monitoring the state of
different elements in the 'buffer' array) is concerned with the
physical machine. The reason for this is that the abstract machine
concerns only what happens /inside/ an implementation, so what happens
for another thread is determined not by the abstract machine but by
the physical machine. Threads are not a part of C; the Standard
doesn't say anything about them (at least not directly).
The paragraph you mention (5.1.2.3 p 2) imposes a requirement on the
/abstract/ machine, not on the /physical/ machine. In the abstract
machine the writes to 'buffer' must occur before the subsequent
assignment to 'buffer_ready'. However, they don't have to actually
occur that way in the physical machine. In fact, frequently they
don't, because (to name one example) stores done in a particular order
can be rearranged by the processor or memory subsystem. The stores are /in
order/ as seen by the abstract machine, but /out of order/ as seen by
the actual memory -- that is, the physical machine of the other
thread.
So, what the other thread sees has to match what the abstract machine
does (as explained in 5.1.2.3 p 2) /only if/ the physical machine is
required to match the abstract machine through additional requirements
that occur because of using 'volatile'. Because (as I explained
earlier) the use of 'volatile' in the example is not enough to make
the 'buffer' writes in the /abstract/ machine match up with what
happens in the /physical/ machine, in the physical machine (which is
what the other thread sees) those writes can happen in any order.
Does that all make sense?
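For reference, here is a minimal sketch of the kind of code being
discussed. The original code is not quoted in this part of the
thread, so this is a reconstruction from the names 'buffer' and
'buffer_ready'; the array size and the function name are assumptions.

```c
#include <stddef.h>

#define BUFFER_SIZE 16          /* assumed size; not given in the thread */

char buffer[BUFFER_SIZE];       /* plain object: NOT volatile-qualified  */
volatile int buffer_ready;      /* the only volatile-qualified object    */

void clear_and_publish(void)    /* hypothetical name */
{
    /* Abstract machine: elements are cleared in ascending order, and
     * every one of these side effects is complete before the store
     * to buffer_ready below. */
    for (size_t i = 0; i < BUFFER_SIZE; i++)
        buffer[i] = 0;

    /* Volatile store.  The plain stores above need not become visible
     * to an outside observer in ascending order, or in any other
     * particular order. */
    buffer_ready = 1;
}
```

Within the program itself the result always looks as the abstract
machine says it should: after the call, every element of 'buffer' is
zero and 'buffer_ready' is 1. What is not promised is the order in
which an observer outside the abstract machine sees the individual
element stores.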
the stores into buffer are
not guaranteed to occur in any particular order, because the
assignments to buffer are not made through a volatile-qualified
type.
5.1.2.3 p2 (which no-one seems to want to mention) seems to imply that
volatile makes no difference to writes, only to reads:
[#2] Accessing a volatile object, modifying an object, ...
are all side effects ...
At certain specified
points in the execution sequence called sequence points, all
side effects of previous evaluations shall be complete and
no side effects of subsequent evaluations shall have taken
place.
Modifying an object is a side-effect, and side-effects are supposed to
have completed at the end of an expression statement (e.g. "buffer=0;").
AFAICT, most of the problems with "volatile" appear to rely upon ignoring
5.1.2.3 p2, which may be why everyone seems to avoid mentioning 5.1.2.3 p2.
Again, 5.1.2.3 p 2 is talking only about the abstract machine, not
about the physical machine. Using 'volatile' doesn't affect what
happens in the abstract machine (except for the two special cases
named explicitly in the Standard, setjmp/longjmp and signal
handlers). Using 'volatile' does impose additional requirements on
how and where the physical machine and the abstract machine must be
in alignment, but those requirements do not extend to imposing
5.1.2.3 p 2 on each previous statement (that doesn't use a
volatile-qualified access) before a volatile access. There are
different opinions about just how lax or how strict these additional
requirements are, but even in the most strict interpretation it's
only required that all the assignments to 'buffer' be completed
before the store into the (volatile) buffer_ready; the previous
stores don't have to be done in any particular order in the
/physical/ machine, even though they must occur in a particular
order in the /abstract/ machine.
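By way of contrast, here is a sketch of a variant in which the buffer
itself is volatile-qualified, so that every element store is a
volatile access. The names 'vbuffer', 'vbuffer_ready' and the size
are made up for this illustration. Note the limits of what this buys:
it constrains how the implementation must carry out those accesses,
but what constitutes an access to a volatile-qualified object is
implementation-defined (6.7.3), and C99 still says nothing about what
another thread or processor will observe.

```c
#include <stddef.h>

#define VBUFFER_SIZE 16             /* assumed size, for illustration */

volatile char vbuffer[VBUFFER_SIZE];    /* every element access is volatile */
volatile int vbuffer_ready;

void clear_and_publish_volatile(void)   /* hypothetical name */
{
    /* Each store is now a volatile access, so the implementation may
     * not elide these stores or reorder them relative to one another
     * or to the store to vbuffer_ready. */
    for (size_t i = 0; i < VBUFFER_SIZE; i++)
        vbuffer[i] = 0;

    vbuffer_ready = 1;
}
```

Whether the hardware then makes those stores visible to another
processor in the same order is a separate, platform-specific matter
that this sketch does not address.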
Furthermore 5.1.2.3 p3 says:
[#3] In the abstract machine, all expressions are evaluated
as specified by the semantics. An actual implementation
need not evaluate part of an expression if it can deduce
that its value is not used and that no needed side effects
are produced (including any caused by calling a function or
accessing a volatile object).
IOW, if an implementation wishes to elide any side-effects as "unneeded",
the onus is on the implementation to deduce that the side-effects really
are unneeded (e.g. if the value isn't used inside the translation unit and
there is no way it could be used from outside of the translation unit).
In a sense this paragraph is just a special case of the "as if"
rule -- in the abstract machine certain operations are required
to happen, and in a particular order, but in the physical machine
they don't have to happen in that order, or even happen at all,
/provided/ the end result is "as if" they happened as the abstract
machine would do them.
Note that using 'volatile' either would, or might, (some people
would say "would", others would only say "might") force some
expressions to be evaluated that could remain unevaluated if
'volatile' weren't used. (I think most people would say "would",
and personally I believe that's the most defensible interpretation.
However I don't want to dismiss the considered statements of
those who have expressed the less restrictive viewpoint here.)
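A small sketch of the difference, with made-up names ('status_reg'
stands in for something like a device register; it is not from the
code under discussion): both expression statements below discard
their value, but only the second one reads a volatile-qualified
object.

```c
int plain_counter;              /* ordinary object                       */
volatile int status_reg;        /* hypothetical volatile-qualified object */

int poll_status(void)           /* hypothetical name */
{
    /* Value unused and no side effects: under 5.1.2.3 p 3 an
     * implementation need not evaluate this at all. */
    (void)(plain_counter + 1);

    /* Value unused, but reading a volatile object is itself a side
     * effect; on the stricter reading this read must be performed. */
    (void)status_reg;

    return status_reg;          /* a second, separate volatile read */
}
```

On the stricter interpretation a call to poll_status() reads
'status_reg' twice; on the less restrictive one, the first of those
reads is the one whose fate is in dispute.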