Lock-free binary trees

Joe Seigh · Jul 14, 2006

David said:
The problem with this is that you're still relying on the assumption that your
compiler does not perform certain kinds of valid optimization. This is already
false for some compilers (e.g. llvm-gcc), and will probably be false for future
main-line versions of gcc that will do link-time optimization.

<http://llvm.org>
<http://gcc.gnu.org/ml/gcc/2005-11/msg00888.html>

(Although the specific proposal for integrating LLVM into gcc has not been agreed
and seems to be at least temporarily stalled, there was concensus on the general
idea of supporting link-time inter-procedural optimizations in gcc.)

I suppose you could change gcc so as to break any threading api, except Posix
of course. But then you'd have to go into the witless protection program to
protect you from the wrath of Linus and all the other Linux kernel developers.

David Hopwood · Jul 14, 2006

Joe said:
I suppose you could change gcc so as to break any threading api, except
Posix of course. But then you'd have to go into the witless protection
program to protect you from the wrath of Linus and all the other Linux
kernel developers.

<http://gcc.gnu.org/onlinedocs/gcc-3.2.3/gcc/i386-and-x86-64-Options.html>
# These -m switches are supported in addition to the above on AMD x86-64
# processors in 64-bit environments. [...]
# -mcmodel=kernel
# Generate code for the kernel code model. The kernel runs in the
# negative 2 GB of the address space. This model has to be used for
# Linux kernel code.

<http://gcc.gnu.org/onlinedocs/gcc/S_002f390-and-zSeries-Options.html>
# [...] In order to build a linux kernel use -msoft-float.

<http://gcc.gnu.org/ml/gcc-patches/2004-01/msg01619.html>
# > People run more and more into problems with kernel modules on x86-64
# > and the new gcc because they don't specify the -mno-red-zone parameter.
# > With the old gcc 3.2 they got away, but with the new one it breaks.

<http://www.faqs.org/docs/kernel/x204.html>
# Kernel modules need to be compiled with certain gcc options to make
# them work. [...]
# * -O2: The kernel makes extensive use of inline functions, so modules
# must be compiled with the optimization flag turned on. Without
# optimization, some of the assembler macros calls will be mistaken
# by the compiler for function calls. This will cause loading the
# module to fail, since insmod won't find those functions in the kernel.

<http://www.luv.asn.au/overheads/compile.html>
# From /usr/doc/gcc/README.Debian.gz in gcc-2.95.2-0pre2:

[an old version, but I found it amusing considering the previous quote:]

# As of gcc-2.95, optimisation at level 2 (-O2) and higher includes an
# optimisation which breaks C code that does not adhere to the C standard.
# Such code occurs in the Linux kernel, and probably other places.

Anyway, the point is that the Linux kernel is a law unto itself, which may
or may not work with any particular combination of gcc version and options
(it certainly won't work with other compilers, except maybe icc with a patch).

If you are writing a general purpose library or application, you want it to
be less compiler-version/option-dependent than that; preferably, you want
something that works for a documented reason, rather than just by coincidence
of the compiler not doing some optimizations. As I said earlier,

Using inline asm with a "clobbers memory" directive before and after the load
should solve this particular problem. However, I am skeptical of reliance on
dependent load ordering in general. I've never seen any rigorous definition of
it (as a property of a memory model, rather than in discussions of processor
optimizations), or any explicit statement in the architecture manuals of commonly
used processor lines saying that it is supported. In fact, I wonder if the idea
that there is an implicit commitment for future versions of these processors
to guarantee dependent load ordering is not just wishful thinking.

Joe Seigh · Jul 14, 2006

David Hopwood wrote:
....

Anyway, the point is that the Linux kernel is a law unto itself, which may
or may not work with any particular combination of gcc version and options
(it certainly won't work with other compilers, except maybe icc with a patch).

If you are writing a general purpose library or application, you want it to
be less compiler-version/option-dependent than that; preferably, you want
something that works for a documented reason, rather than just by coincidence
of the compiler not doing some optimizations. As I said earlier,

Using inline asm with a "clobbers memory" directive before and after the load
should solve this particular problem. However, I am skeptical of reliance on
dependent load ordering in general. I've never seen any rigorous definition of
it (as a property of a memory model, rather than in discussions of processor
optimizations), or any explicit statement in the architecture manuals of commonly
used processor lines saying that it is supported. In fact, I wonder if the idea
that there is an implicit commitment for future versions of these processors
to guarantee dependent load ordering is not just wishful thinking.

Hypothetically, anything is possible. So gcc could do something extremely stupid.
Especially since the current C standard does not recognise threads. However
C++0x is discussing threaded support so unless there is a complete break between
C and C++, something will have to be worked out.

Worst case is that gcc eliminates itself as a research tool for multi-threading and
that putting support for any new threading features in C will become as tedious
as the JSR process except the place where the new features came from will be a lot
smaller. Think about it. The prototypes for the stuff that was put in JSR-166
came from somewhere. Some of it may have been assembler but I doubt all of it was.

The commercial compiler vendors would be happy with gcc doing that. Intel provides
a library that uses lock-free algorithms. That you'd have to buy a compiler
from them as well would make their day.

David Hopwood · Jul 14, 2006

Joe said:
David Hopwood wrote:
...

Hypothetically, anything is possible. So gcc could do something
extremely stupid.

Link-time optimization is *not* extremely stupid. Relying on it not being
done is arguably stupid, when there are alternatives that rely only on
documented (if architecture-dependent) features.

Especially since the current C standard does not recognise threads.
However C++0x is discussing threaded support so unless there is a complete
break between C and C++, something will have to be worked out.

I have no confidence in the C++0x committee to produce anything usable.
I will be pleasantly surprised if they don't make things worse.

Joe Seigh · Jul 15, 2006

David said:
Link-time optimization is *not* extremely stupid. Relying on it not being
done is arguably stupid, when there are alternatives that rely only on
documented (if architecture-dependent) features.

Not that. I was refering to some gcc developer completely destroying
gcc's ability to tolerate multi-threading since the standards would
allow them to do that.

I have no confidence in the C++0x committee to produce anything usable.
I will be pleasantly surprised if they don't make things worse.

Hell freezes over happens-before cpp-threads figures out how to
define multi-threading semantics.

To slightly change the topic from my alledged claim to 100% accurately
predict the future with respect to what will or will not be in C/C++,
let's assume for the sake of argument that C/C++ will no longer be around.
The question becomes how would you develop and test new concurrent
programming facilities and constructs? With C/C++, you could
prototype and compare against alternative techniques. Without C/C++,
you'd have to add support for in in a language and modify the compiler
for that language. And you'd test it against what? Not having
the facility? You could test it against another language but that
would be an apples and oranges comparison due to the number of
overall differences between the languages. All you'd get would
be a flame war. There must be some great flame wars out there
on Java vs. Erlang vs. Occam. None of which conclusively proved
anything.

Chris Thomasson · Jul 15, 2006

David Hopwood said:
Chris said:

David Hopwood said:

Chris Thomasson wrote:

[...]

void* vzAtomicLoadPtr_mbDepends(void* volatile* d) {
return dependant_load(d);
}

DOH. I should of omitted volatile:

void* vzAtomicLoadPtr_mbDepends(void** d) {
return dependant_load(d);
}

In that case, how do you know that the compiler isn't optimizing away the
call (which it should, assuming we're not on an Alpha machine), and then
reordering the load relative to things that it shouldn't be reordered
relative to?

Answer: you don't.

Click to expand...

Well, the call would ideally be an external assembled function:

http://groups.google.com/group/comp.programming.threads/msg/423df394a0370fa6

Click to expand...

The problem with this is that you're still relying on the assumption that
your
compiler does not perform certain kinds of valid optimization. This is
already
false for some compilers (e.g. llvm-gcc), and will probably be false for
future
main-line versions of gcc that will do link-time optimization.

<http://llvm.org>
<http://gcc.gnu.org/ml/gcc/2005-11/msg00888.html>

(Although the specific proposal for integrating LLVM into gcc has not been
agreed
and seems to be at least temporarily stalled, there was concensus on the
general
idea of supporting link-time inter-procedural optimizations in gcc.)
http://groups.google.com/group/linux.kernel/msg/e10f88ada7cd04f4

Humm...

Using inline asm with a "clobbers memory" directive before and after the
load
should solve this particular problem. However, I am skeptical of reliance
on
dependent load ordering in general. I've never seen any rigorous
definition of
it (as a property of a memory model, rather than in discussions of
processor
optimizations), or any explicit statement in the architecture manuals of
commonly
used processor lines saying that it is supported.

https://coolthreads.dev.java.net/servlets/ProjectForumMessageView?forumID=1797&messageID=11068

- I received a request from SUN to talk to the "press" about the CoolThreads
contest, and basically vZOOM techniques'/coolthreads technology/"uses" in
general...

I actually "may have a chance" to bring all of this "stuff" up, and perhaps
get it published in a "respectable medium"...

In fact, I wonder if the idea
that there is an implicit commitment for future versions of these
processors
to guarantee dependent load ordering is not just wishful thinking.

I will address this "issue" in detail...

Chris Thomasson · Jul 16, 2006

Chris Thomasson said:
David Hopwood said:

Chris said:

Chris Thomasson wrote:

[...]

Click to expand...

[...]
In fact, I wonder if the idea
that there is an implicit commitment for future versions of these
processors
to guarantee dependent load ordering is not just wishful thinking.

Click to expand...

An atomic api abstraction should provide the means for loads with #LoadLoad
barrier semantics. So, if dependant loads are no longer supported, we switch
to the #LoadLoad version.... Simple...

Chris Thomasson · Jul 16, 2006

Chris said:
Chris said:

Dependant loads:

<psuedo>

#if ! defined (ArchAlpha)
#define dependant_load(x) (*x)
#else
dependant_load(x) {
l = *x;
rmb();
return l;
}
#endif

void* vzAtomicLoadPtr_mbDepends(void* volatile* d) {
return dependant_load(d);
}

Click to expand...

so for the typical case of one time init / DCLP:

if (!initted) // [1]
{
Lock lock;
init(foo);
}

// use foo:

foo->doSomething(); // [2]

you need (or get for free) a dependant load on the init flag at [1]?

No. You need #LoadLoad after loading the init flag, and a #StoreStore before
setting the flag. You have introduced a level of indirection via. the "init
flag", so you need #LoadLoad... Dependent ordering is not strong enough
here...

and/or when accessing foo, at [2]?

Dependent ordering applies to the load of foo, but not the load of the init
flag...

bench: binary trees	10	Jul 29, 2012
Binary search trees (AVL trees)	34	Jan 3, 2010
Left leaning red black trees	10	Dec 2, 2009
comparing binary trees in C	12	May 1, 2009
How can I structure the final array to meet the requirements of Bootstrap Tree View for building a tree in JavaScript?	1	Mar 29, 2024
Empty binary trees	7	Jun 3, 2008
trees, iterations and adding leaves	1	Dec 31, 2006
Behavior trees in ruby	2	Oct 24, 2010

Lock-free binary trees

Joe Seigh

David Hopwood

Joe Seigh

David Hopwood

Joe Seigh

Chris Thomasson

Chris Thomasson

Chris Thomasson

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads