atomic flag

Michael Furman

Jerry Coffin said:
[ ... ]


In the specific case of a boolean, most of these problems are likely to
be covered up by the fact that in a boolean, only one bit is really
significant. The rest of the storage unit in which the boolean is
stored may be modified non-atomically, but it's difficult for the
modification of one bit to be non-atomic.

OTOH, I don't think there's any requirement or guarantee that a boolean
be stored as a single bit -- an implementation might (for example) store
0 and 0xffff for false and true respectively. In this case, if it reads
what's supposed to be a boolean, but happens to contain (for example)
0x00ff, it might decide something is defective, and halt the program
entirely. I'm not sure such an implementation exists, but I'm far from
certain it can't either.

And even more: even in the case of one bit, I don't think there's any
requirement or guarantee that, if you change false->true, every reader
will see the sequence "false.false.....true.true..." rather than
"false.false.true.false.true.true.....".
For example, I worked with flash memory that gave strange values
for some time after a write (actually one bit was flashing and the other
bits were undefined).

Regards,
Michael
 
SenderX

In fact, if thread 1 writes
successfully to a variable, and thread 2 immediately reads from the same
variable and sees anything other than the value written by thread 1,
then thread 2 is not accessing the same variable.

:O

Try that on an Itanium...

Ever heard of memory visibility?!?!?!

;(
 
Attila Feher

Jeff Schwab wrote:
[SNIP]
Mutexes can be provided without any such instructions. It might
interest you to look at some of the many-splendored implementations of
the "wait" and "signal" semaphore functions. At any rate, you don't
need a mutex to make a write atomic, unless you're implementing a
cache and are using "write" to mean "input to cache."

And as far as portable C++ goes, a write certainly is atomic, since
the standard provides no support for multiple simultaneous threads of
execution. Beyond that, I don't believe I've ever seen a system
where a write was not an atomic operation.

I have only seen such systems. For example, on Intel, writing a word-sized
object in C/C++ is not guaranteed to be atomic, since it might not
start on a word boundary. But being atomic is more than being
non-interruptible!
In fact, if thread 1
writes successfully to a variable, and thread 2 immediately reads from
the same variable and sees anything other than the value written by
thread 1, then thread 2 is not accessing the same variable.

And this is exactly what happens in a multi-CPU (SMP) system if you write
without locking and flushing (memory read/write barriers). Even reading
such a variable might need a barrier, depending on the architecture.
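
As a minimal sketch of the barrier pairing described above, assuming a C++11 or later compiler (std::atomic postdates this discussion), a release store on the writer side can be paired with an acquire load on the reader side. The names payload, ready, producer, consumer and publish_demo are hypothetical and chosen only for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names for illustration; none of them come from the thread.
int payload = 0;                  // ordinary, non-atomic data
std::atomic<bool> ready{false};   // flag that publishes payload

void producer() {
    payload = 42;  // plain write, ordered before the release store below
    ready.store(true, std::memory_order_release);
}

int consumer() {
    // The acquire load pairs with the release store: once it reads true,
    // the earlier write to payload is visible to this thread.
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return payload;
}

// Runs one writer and one reader and returns what the reader saw.
int publish_demo() {
    payload = 0;
    ready.store(false);
    int seen = 0;
    std::thread reader([&seen] { seen = consumer(); });
    std::thread writer(producer);
    writer.join();
    reader.join();
    assert(seen == 42);
    return seen;
}
```

Calling publish_demo() from main exercises both sides. Without the release/acquire pair (for example with a plain bool flag), the program would have a data race and the reader would have no guarantee of seeing either the flag or the payload.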
A write is atomic by definition.
If you don't believe so, then we mean
different things by "write."

A write does not even happen - by definition - when you think it should.
CPUs might rearrange reads and writes to different areas into a more
preferable (faster) sequence. Deferred reads and writes, they are called,
IIRC. In any case: a write (on the C and C++ level) is not atomic, and it is
definitely not guaranteed to be atomic with regard to threads. Especially on
SMP. In C and C++ *only* reading and writing an object of type sig_atomic_t
is guaranteed to be atomic, and *only* in the face of signals.
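
To illustrate the sig_atomic_t guarantee mentioned above, here is a small sketch using only the standard <csignal> facilities. The names got_signal, on_signal and signal_demo are hypothetical. It shows the signal case only and says nothing about threads.

```cpp
#include <cassert>
#include <csignal>

// Hypothetical flag name. volatile std::sig_atomic_t is the type the
// standard allows a signal handler to write safely.
volatile std::sig_atomic_t got_signal = 0;

extern "C" void on_signal(int) {
    got_signal = 1;  // a single, indivisible store with respect to signals
}

// Installs the handler, raises SIGINT in this same thread, and reports
// whether the handler ran.
int signal_demo() {
    got_signal = 0;
    std::signal(SIGINT, on_signal);
    std::raise(SIGINT);            // returns only after the handler has returned
    std::signal(SIGINT, SIG_DFL);  // restore the default disposition
    assert(got_signal == 1);
    return got_signal;
}
```

This guarantee covers a handler interrupting the code that reads or writes the flag. It does not make the flag safe to share between threads on an SMP machine, which is the distinction drawn in the post.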

And I guess I know what I am talking about, because I have implemented
atomic read, write and increment/decrement etc. in assembly for Sparc as well
as for Intel. And - believe me - it looks pretty different from the code a
C or C++ compiler generates.
 
SenderX

Try that on an Itanium...
Ever heard of memory visibility?!?!?!

This applies only to CPUs that require barriers.

In order to get it to work you would use simple atomic ops coupled with
acquire/release memory semantics on the shared value.

Mutexes of all sorts have to use some sort of acquire/release barriers:

Lock()
{
    int iSpins = 0;

    // Atomic, with acquire. CAS is taken to return the previous value
    // (as the assert in Unlock implies), so nonzero means "already held".
    while ( CAS_Acquire( &m_lockstate, 0, 1 ) )
    {
        // Contention point!

        // See if we should spin or wait
        // (NUM_CPUS stands for the number of processors)
        if ( ( ++iSpins ) < NUM_CPUS * 2 )
        {
            // pause opcode

            sched_yield();
        }

        else
        {
            // Atomic add to kernel waitset, and wait
        }
    }

    // Acquire says we own all the updates, AFTER good cas
}


Unlock()
{
    // Atomic, with release
    int iRet = CAS_Release( &m_lockstate, 1, 0 );

    assert( iRet == 1 );

    // Release says we released all the updates we made, BEFORE good cas

    // Atomic check for waiters, and wake one
}



For a lock-free version you would simply omit the slow path and treat it as
a traditional CAS loop, and have the CAS work DIRECTLY on your own provided
state.
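
A traditional CAS loop of the kind mentioned above might look like the following sketch, assuming C++11 std::atomic rather than the CAS_Acquire/CAS_Release primitives from the post. The state here is a hypothetical "highest value seen" counter; high_water, record and cas_loop_demo are illustrative names only.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical shared state: the largest sample recorded so far.
std::atomic<int> high_water{0};

// Lock-free update: retry the CAS until it succeeds or the stored value
// is already at least as large as the sample.
void record(int sample) {
    int seen = high_water.load(std::memory_order_relaxed);
    while (sample > seen &&
           !high_water.compare_exchange_weak(seen, sample,
                                             std::memory_order_release,
                                             std::memory_order_relaxed)) {
        // On failure, seen now holds the current value; the loop
        // condition re-checks whether an update is still needed.
    }
}

// Four threads record disjoint ranges; the largest sample is 3999.
int cas_loop_demo() {
    high_water.store(0);
    std::vector<std::thread> workers;
    for (int t = 0; t < 4; ++t) {
        workers.emplace_back([t] {
            for (int i = 0; i < 1000; ++i) record(t * 1000 + i);
        });
    }
    for (auto& w : workers) w.join();
    assert(high_water.load() == 3999);
    return high_water.load();
}
```

There is no slow path and no kernel wait set here: a thread that loses the race simply retries against the refreshed value, which is the difference from the Lock()/Unlock() pair.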
 
SenderX

And I guess I know what I am talking about, because I have implemented
atomic read, write and increment/decrement etc. in assembly for Sparc as well
as for Intel. And - believe me - it looks pretty different from the code a
C or C++ compiler generates.

Imagine if the compiler treated every ' someval++ ' as:

lock xadd [...], ... /* with a barrier */

!

:O
 
Attila Feher

SenderX said:
And I guess I know what I am talking about, because I have
implemented atomic read, write and increment/decrement etc. in
assembly for Sparc as well as for Intel. And - believe me - it
looks pretty different from the code a C or C++ compiler generates.

Imagine if the compiler treated every ' someval++ ' as:

lock xadd [...], ... /* with a barrier */

Still would not work on certain architectures.
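
For comparison, a portable way to request an atomic increment explicitly, rather than relying on what the compiler emits for a plain ' someval++ ', is sketched below. It assumes C++11 std::atomic, which was not available when this thread was written; hits and count_demo are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical shared counter.
std::atomic<long> hits{0};

// Four threads each perform 100000 atomic increments.
long count_demo() {
    hits.store(0);
    std::vector<std::thread> workers;
    for (int t = 0; t < 4; ++t) {
        workers.emplace_back([] {
            for (int i = 0; i < 100000; ++i) {
                // Indivisible read-modify-write; relaxed ordering is enough
                // here because only the final total is inspected.
                hits.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& w : workers) w.join();
    assert(hits.load() == 400000);
    return hits.load();
}
```

How fetch_add is implemented is left to the compiler and target: on x86 it is typically a lock-prefixed instruction, while other architectures may use a load-linked/store-conditional loop, which is why a single hard-coded instruction sequence would not carry over.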
 
