How to make this program more efficient?

James Kanze · Sep 24, 2008

The two are quite different. Posix (and doubtless Windows as
well) guarantees memory synchronization across a lock. Lock-free
algorithms exist, but they still require synchronization as
well.

Are memory barriers a form of synchronization?

On many machines (e.g. Sparc), they're the only form of memory
synchronization. (I think that Intel refers to them as fences.
I think that Intel also offers some additional guarantees, and
that in particular---if I've understood correctly---it
implicitly generates full memory synchronization around an xchg
instruction. I'm more familiar with Sparc: for Sparc, you
should read section 3.2 of the "Sparc Architecture Manual",
http://www.sparc.org/standards/SPARCV9.pdf)

courpron · Sep 24, 2008

Not really. Almost all of the actual specification of volatile
is implementation defined, so an implementation can pretty much do
what it wants. The expressed intent in C++ is that it should
work as in C, and the expressed intent in C corresponds pretty
much to what you say, but it remains intent, and not a formal
requirement.

It is implementation defined, but there is one formal requirement:

[1.9/11]:
The least requirements on a conforming implementation are:
— At sequence points, volatile objects are stable in the sense that
previous evaluations are complete and subsequent evaluations have not
yet occurred.

So, what gpderetta said is correct in the sense of the standard (but,
yes, in practice you have to check the compiler's behavior).

Alexandre Courpron.

courpron · Sep 24, 2008

Not quite: that part of 1.9/11 says that modifications of volatile
objects can't be reordered if there is an intervening sequence point.
There is no ordering requirement for two modifications without an
intervening sequence point.

Between sequence points there is no order at all, so the compiler
cannot be said to *RE*order accesses there. The compiler can reorder
accesses across sequence points unless the accesses concern a volatile
variable.

Alexandre Courpron.

James Kanze · Sep 24, 2008

It is implementation defined, but there is one formal requirement:

[1.9/11]:
The least requirements on a conforming implementation are:
- At sequence points, volatile objects are stable in the sense that
previous evaluations are complete and subsequent evaluations have not
yet occurred.

Which means? (For starters, a sequence point is a compile time
characterization, and doesn't exist at runtime.)

Accessing a volatile object is observable behavior, but what
constitutes an access is implementation defined. Technically,
that means that the implementation is required to document it;
I've never been able to find such documentation, but judging
from the code generated by the implementations I've used, the
definition is something along the lines of: a load or store
instruction has been emitted. Since on a modern machine a load
instruction doesn't guarantee a read from memory, and a store
doesn't guarantee a write to memory, that's a totally useless
(albeit legal) definition.

Erik Wikström · Sep 24, 2008

But what is a sequence point in terms of C or C++ source code? AFAICT, they
have no way to express sequence points.

From what I can gather, GCC is not reordering the stores of volatile
variables in this case when optimizing but it is also not emitting memory
barriers that would prevent the CPU from breaking the ordering. So the
lock-free code output by GCC is useless.

This raises some questions for me:

1. Is volatile completely useless?

No, it can among other things be useful when writing device-drivers
where a certain memory-location is mapped to registers on the device.
The usage of volatile then ensures that the value of the memory-location
is used and not some value in a register or cache. You should probably
also use it for data shared between threads (such as your pointer) but
you need to provide separate synchronisation.
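
The device-register use described above might look roughly like the following sketch. The register and the ready bit are hypothetical, and the "register" is simulated with an ordinary variable so the code can run anywhere; a real driver would take the address from the hardware documentation. It only shows what volatile asks of the compiler (emit every read); whether the hardware also needs barriers is platform specific.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for a device status register (assumption: simulated here
// with a normal variable instead of a real memory-mapped address).
std::uint32_t g_fake_register = 0;

// Hypothetical "device ready" bit, chosen only for this illustration.
const std::uint32_t kReadyBit = 0x1;

// Polls the register up to max_polls times. Because 'status' points to
// a volatile object, the compiler must emit a load on every iteration
// instead of caching the first value in a CPU register.
bool wait_until_ready(volatile std::uint32_t* status, int max_polls) {
    assert(status != nullptr);
    for (int i = 0; i < max_polls; ++i) {
        if (*status & kReadyBit) {
            return true;
        }
    }
    return false;
}
```

A caller would pass the register's address, for example `wait_until_ready(&g_fake_register, 1000)`.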

Erik Wikström · Sep 24, 2008

Is it possible to provide the necessary synchronization without locks from C
or C++?

Not in C++03 (nor C99 AFAIK), since neither has any notion of multiple
threads of execution. POSIX extends C to allow threaded execution, and
you can use MS-specific facilities on Windows, but C++0x is the first
version of the C++ standard that handles threads, and it does have some
primitives which can be used to synchronise without using normal locks.
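
As a rough sketch of such primitives, here is how a pointer could be published between threads with the atomics that C++0x eventually shipped (as C++11). The `Config` type and all function names are hypothetical, invented for this illustration; only `std::atomic`, `std::thread` and the memory-order constants come from the standard library.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical payload type, used only for this illustration.
struct Config {
    int value;
};

std::atomic<Config*> g_config{nullptr};  // pointer shared between threads

// Writer: fill in the object, then publish the pointer with release
// semantics, so the plain store to 'value' is visible to any reader
// that observes the published pointer.
void publish(Config* c, int v) {
    assert(c != nullptr);
    c->value = v;
    g_config.store(c, std::memory_order_release);
}

// Reader: spin until the pointer is published. The acquire load pairs
// with the release store in publish().
int consume() {
    Config* c;
    while ((c = g_config.load(std::memory_order_acquire)) == nullptr) {
        std::this_thread::yield();
    }
    return c->value;
}

// Runs one writer and one reader, and returns what the reader saw.
int run_once(int v) {
    Config storage{0};
    g_config.store(nullptr, std::memory_order_relaxed);
    int seen = 0;
    std::thread reader([&] { seen = consume(); });
    std::thread writer([&] { publish(&storage, v); });
    writer.join();
    reader.join();
    return seen;
}
```

No lock and no volatile appear here; the ordering comes from the release/acquire pair on the atomic pointer.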

courpron · Sep 24, 2008

Also, IIRC the standard prohibits only reordering of
accesses to volatile objects. Reads and writes to non
volatile objects can still be reordered even across a
volatile.
Not really. Almost all of the actual specification of
volatile is implementation defined, so an implementation
can pretty much do what it wants. The expressed intent in C++
is that it should work as in C, and the expressed intent in
C corresponds pretty much to what you say, but it remains
intent, and not a formal requirement.

It is implementation defined, but there is one formal requirement:
[1.9/11]:
The least requirements on a conforming implementation are:
- At sequence points, volatile objects are stable in the sense that
previous evaluations are complete and subsequent evaluations have not
yet occurred.

Which means? (For starters, a sequence point is a compile time
characterization, and doesn't exist at runtime.)

That means volatile prevents static (compile-time) reordering of
accesses to volatile variables, and nothing more (we were dissecting
all the effects of volatile). But we have been saying for quite a while
now that volatile is useless here because it doesn't impose runtime
constraints as well (i.e. memory barriers). So we certainly don't
disagree here.

Alexandre Courpron.

peter koch · Sep 24, 2008

Is it possible to provide the necessary synchronization without locks from C
or C++?

The current version of C++ has no support for threads whatsoever, so
we are talking about platform-specific options. In that context there
are varying degrees of support. The platform I know best (Microsoft C++)
has lots of stuff to support threading without locking - I believe
documentation is available on the internet.
The new version of C++ has direct support for threading, including many
low-level primitives.

/Peter

James Kanze · Sep 25, 2008

It depends on the compiler, but as implemented by Sun CC and g++
on Sparcs, yes. (Or almost. It does provide certain guarantees
with respect to setjmp/longjmp. But that shouldn't be relevant
to any C++ code. There are also guarantees when it is used on a
sig_atomic_t, but those guarantees only apply to signal handlers
and the thread they interrupted.)

No, it can among other things be useful when writing
device-drivers where a certain memory-location is mapped to
registers on the device.

That's certainly the intent. It appears as an example in the C
standard, and is mentioned in the C rationale. On the other
hand, it's not true with Sun CC or g++ on a Sparc, where the
architecture specification is quite clear: membar instructions
are needed for memory mapped IO. (I don't know what actual
hardware does here. If the memory mapped IO is at specific
addresses, the hardware could recognize it, and implicitly
behave as if the membar instructions were there. Even though
the Sparc specification doesn't require it.)

The usage of volatile then assures that the value of the
memory-location is used and not some value in a register or
cache.

Again, that's the intent. It's not the case for g++ or Sun CC
on Sparc.

You should probably also use it for data shared between
threads (such as your pointer) but you need to provide
separate synchronisation.

There's no point in using it on data shared between threads
(except to slow things down). It's not sufficient, and when the
other mechanisms are used correctly, it's not necessary.
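
To illustrate, here is a minimal sketch of shared data protected by a lock alone, with no volatile anywhere. It uses `std::mutex` and `std::thread`, which arrived with the later C++11 standard rather than the Posix calls available in 2008; the counter and function names are made up for the example.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

int g_counter = 0;           // plain int: deliberately not volatile
std::mutex g_counter_mutex;  // the lock provides the synchronization

// Increments the shared counter n times, taking the lock each time.
void add_many(int n) {
    assert(n >= 0);
    for (int i = 0; i < n; ++i) {
        std::lock_guard<std::mutex> lock(g_counter_mutex);
        ++g_counter;
    }
}

// Runs two threads that each add n, and returns the final count.
// The joins make the final read safe without any further locking.
int run_two_threads(int n) {
    g_counter = 0;
    std::thread a(add_many, n);
    std::thread b(add_many, n);
    a.join();
    b.join();
    return g_counter;
}
```

Adding volatile to `g_counter` would not make the unlocked version correct, and with the lock in place it adds nothing except slower code.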

James Kanze · Sep 25, 2008

And what defines a sequence point in C or C++?

The language standard.

The definition is in §1.9/7: "At certain specified points in the
execution sequence called sequence points, all side effects of
previous evaluations are complete and no side effects of
subsequent evaluations shall have taken place." At various
points throughout the standard, it is specified that such and
such is a sequence point. Off the top of my head (so I may have
forgotten one or two): the end of a full expression, function
calls, function returns, and the operators ?:, &&, || and , (the
comma operator).

Note that they do not provide a complete ordering; the classical
case is in expressions with two or more functions, where each of
the function calls is a sequence point, but there are no
sequence points between the evaluation of the different
arguments, even the arguments to two different functions.
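
A small sketch of that partial ordering, with hypothetical function names: each call below is a sequence point, so the two calls never interleave, but the language does not say which argument is evaluated first.

```cpp
#include <cassert>
#include <string>

std::string g_trace;  // records the order in which the calls actually ran

int first()  { g_trace += 'a'; return 1; }
int second() { g_trace += 'b'; return 2; }

int add(int x, int y) { return x + y; }

// Evaluates add(first(), second()) and returns the observed call order.
// A conforming compiler may evaluate either argument first, so the
// trace can legitimately be "ab" or "ba".
std::string evaluation_order() {
    g_trace.clear();
    int sum = add(first(), second());
    assert(sum == 3);  // the result is fixed even though the order is not
    return g_trace;
}
```

Code whose correctness depends on one particular order here is relying on unspecified behavior.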

They don't work well with regard to describing multi-threaded
behavior, and the next version of the C++ standard (which treats
threading) will not use them.

James Kanze · Sep 25, 2008

Also, IIRC the standard prohibits only reordering of
accesses to volatile objects. Reads and writes to non
volatile objects can still be reordered even across a
volatile.
Not really. Almost all of the actual specification of
volatile is implementation defined, so an implementation
can pretty much do what it wants. The expressed intent in C++
is that it should work as in C, and the expressed intent in
C corresponds pretty much to what you say, but it remains
intent, and not a formal requirement.
It is implementation defined, but there is one formal requirement:
[1.9/11]:
The least requirements on a conforming implementation are:
- At sequence points, volatile objects are stable in the sense that
previous evaluations are complete and subsequent evaluations have not
yet occurred.

Which means? (For starters, a sequence point is a compile time
characterization, and doesn't exist at runtime.)

That means volatile prevents static reordering of volatile
variables accesses, and just that (we were dissecting all the
effects of volatile).

But only if there is a sequence point between them. And only
for whatever the implementation defines "access" to mean: as I
think I said, the Sparc compilers I have access to consider
accessing the value in the memory pipeline sufficient, even
though the memory pipeline is local to each CPU (core), not
visible except from that CPU, and not synchronized with anything
outside the CPU.
