J
Joshua Maurice
Ok. This is coming from a recent thread in comp.programming.threads,
talking about C++0x. (You don't need to review it. I'll provide the
relevant context here.)
"questions about memory_order_seq_cst fence"
http://groups.google.com/group/comp.programming.threads/browse_thread/thread/0158f1f2bca1019e#
Consider the following code.
/*** code 1 ***/
// Initially
atomic<int> x(0), y(0);
// Thread 1:
y.store(1, memory_order_release);
atomic_thread_fence(memory_order_seq_cst); //fence 1
r1 = x.load(memory_order_acquire);
// Thread 2:
x.store(1, memory_order_release);
// Thread 3:
r2 = x.load(memory_order_acquire);
atomic_thread_fence(memory_order_seq_cst); //fence 3
r3 = y.load(memory_order_acquire);
/***************/
Q1. First, am I right in concluding that in the above code, all of the
loads and stores might as well be memory_order_relaxed, right?
Q2. Second, is the following end condition possible?
r1 == 0 && r2 == 1 && r3 == 0
Let's try a simple proof.
Sequentially consistent memory operations exist in a single global
total order S. (29.3 / 3)
Thus fence 1 precedes fence 2 in S, or fence 2 precedes fence 1 in S.
Case 1 - fence 1 precedes fence 3 in S.
I believe that 29.3/6 means, in my own translation:
[]
If atomic write A on M is sequenced-before seq_cst fence X, and
seq_cst fence X precedes seq_cst fence Y in the total order S, and
seq_cst fence Y is sequenced-before atomic read B on M,
then B sees the value of the write A, or B sees a write later in the
modification order.
[/]
We know that:
"y.store(1, memory_order_release);" is sequenced-before fence 1,
and
fence 1 precedes fence 3 in S, and
fence 3 is sequenced-before "r3 = y.load(memory_order_acquire);"
Thus we can use 29.3/6 to conclude: r3 sees the value from "y.store(1,
memory_order_release);", or it sees a writer later in the modification
order. There is no such later write, and thus r3 must be 1.
Case 2 - fence 3 precedes fence 1 in S.
Now, if we could prove that NOT (r1 == 0 && r2 == 1), then we would be
done with the proof overall. The difficulty is in trying to prove
this.
The most relevant text appears to be 29.3/7. The exact text from n3242
is:
[]
For atomic operations A and B on an atomic object M, if there are
memory_order_seq_cst fences X and Y such that A is sequenced before X,
Y is sequenced before B, and X precedes Y in S, then B occurs later
than A in the modification order of M.
[/]
There is a typo, or at least an ambiguity. Only writes (or read-modify-
write operations) appear in a modification order. Reads do not. I
initially thought that the intention was the following reading:
[]
For atomic >>write (and/or atomic read-modify-write operations)<< A
and B on an atomic object M, if there are memory_order_seq_cst fences
X and Y such that A is sequenced before X, Y is sequenced before B,
and X precedes Y in S, then B occurs later than A in the modification
order of M.
[/]
With that understanding, we cannot prove NOT (r1 == 0 && r2 == 1) with
just 29.3/7.
Anthony Williams in the comp.programming.threads (who is apparently on
the concurrency working group), initially had a different
interpretation. I just reread it several times, and I'm not quite sure
what he's trying to say. He initially thought that you can use it
directly to prove that NOT (r1 == 0 && r2 == 1).
I /think/ I might be able to phrase Anthony's interpretation in the
following way. Here's Q3:
For
seq_cst fences X and Y,
atomic reads A and B on atomic object M,
if
A sees the value from an atomic write C, and
B sees the value from an atomic write D (which may be C), and
A is sequenced-before X, and
X precedes Y in the total order S on seq_cst operations, and
Y is sequenced-before B,
then
Can D precede C in the modification order of M?
I /think/ Anthony is interpreting 29.3/7 as meaning that: D must be
the same write as C, or C must precede D in the modification order of
M. If we have that, then we can prove that NOT (r1 == 0 && r2 == 1).
Is that the intent of 29.3/7? Should we submit this as a defect?
PS: As normal please forgive me for any mistakes or typos, and please
point them out. Non-sequentially consistent stuff is /really hard/ to
reason about.
talking about C++0x. (You don't need to review it. I'll provide the
relevant context here.)
"questions about memory_order_seq_cst fence"
http://groups.google.com/group/comp.programming.threads/browse_thread/thread/0158f1f2bca1019e#
Consider the following code.
/*** code 1 ***/
// Initially
atomic<int> x(0), y(0);
// Thread 1:
y.store(1, memory_order_release);
atomic_thread_fence(memory_order_seq_cst); //fence 1
r1 = x.load(memory_order_acquire);
// Thread 2:
x.store(1, memory_order_release);
// Thread 3:
r2 = x.load(memory_order_acquire);
atomic_thread_fence(memory_order_seq_cst); //fence 3
r3 = y.load(memory_order_acquire);
/***************/
Q1. First, am I right in concluding that in the above code, all of the
loads and stores might as well be memory_order_relaxed, right?
Q2. Second, is the following end condition possible?
r1 == 0 && r2 == 1 && r3 == 0
Let's try a simple proof.
Sequentially consistent memory operations exist in a single global
total order S. (29.3 / 3)
Thus fence 1 precedes fence 2 in S, or fence 2 precedes fence 1 in S.
Case 1 - fence 1 precedes fence 3 in S.
I believe that 29.3/6 means, in my own translation:
[]
If atomic write A on M is sequenced-before seq_cst fence X, and
seq_cst fence X precedes seq_cst fence Y in the total order S, and
seq_cst fence Y is sequenced-before atomic read B on M,
then B sees the value of the write A, or B sees a write later in the
modification order.
[/]
We know that:
"y.store(1, memory_order_release);" is sequenced-before fence 1,
and
fence 1 precedes fence 3 in S, and
fence 3 is sequenced-before "r3 = y.load(memory_order_acquire);"
Thus we can use 29.3/6 to conclude: r3 sees the value from "y.store(1,
memory_order_release);", or it sees a writer later in the modification
order. There is no such later write, and thus r3 must be 1.
Case 2 - fence 3 precedes fence 1 in S.
Now, if we could prove that NOT (r1 == 0 && r2 == 1), then we would be
done with the proof overall. The difficulty is in trying to prove
this.
The most relevant text appears to be 29.3/7. The exact text from n3242
is:
[]
For atomic operations A and B on an atomic object M, if there are
memory_order_seq_cst fences X and Y such that A is sequenced before X,
Y is sequenced before B, and X precedes Y in S, then B occurs later
than A in the modification order of M.
[/]
There is a typo, or at least an ambiguity. Only writes (or read-modify-
write operations) appear in a modification order. Reads do not. I
initially thought that the intention was the following reading:
[]
For atomic >>write (and/or atomic read-modify-write operations)<< A
and B on an atomic object M, if there are memory_order_seq_cst fences
X and Y such that A is sequenced before X, Y is sequenced before B,
and X precedes Y in S, then B occurs later than A in the modification
order of M.
[/]
With that understanding, we cannot prove NOT (r1 == 0 && r2 == 1) with
just 29.3/7.
Anthony Williams in the comp.programming.threads (who is apparently on
the concurrency working group), initially had a different
interpretation. I just reread it several times, and I'm not quite sure
what he's trying to say. He initially thought that you can use it
directly to prove that NOT (r1 == 0 && r2 == 1).
I /think/ I might be able to phrase Anthony's interpretation in the
following way. Here's Q3:
For
seq_cst fences X and Y,
atomic reads A and B on atomic object M,
if
A sees the value from an atomic write C, and
B sees the value from an atomic write D (which may be C), and
A is sequenced-before X, and
X precedes Y in the total order S on seq_cst operations, and
Y is sequenced-before B,
then
Can D precede C in the modification order of M?
I /think/ Anthony is interpreting 29.3/7 as meaning that: D must be
the same write as C, or C must precede D in the modification order of
M. If we have that, then we can prove that NOT (r1 == 0 && r2 == 1).
Is that the intent of 29.3/7? Should we submit this as a defect?
PS: As normal please forgive me for any mistakes or typos, and please
point them out. Non-sequentially consistent stuff is /really hard/ to
reason about.