Joe said:
I suppose you could change gcc so as to break any threading api, except
Posix of course. But then you'd have to go into the witless protection
program to protect you from the wrath of Linus and all the other Linux
kernel developers.
<
http://gcc.gnu.org/onlinedocs/gcc-3.2.3/gcc/i386-and-x86-64-Options.html>
# These -m switches are supported in addition to the above on AMD x86-64
# processors in 64-bit environments. [...]
# -mcmodel=kernel
# Generate code for the kernel code model. The kernel runs in the
# negative 2 GB of the address space. This model has to be used for
# Linux kernel code.
<
http://gcc.gnu.org/onlinedocs/gcc/S_002f390-and-zSeries-Options.html>
# [...] In order to build a linux kernel use -msoft-float.
<
http://gcc.gnu.org/ml/gcc-patches/2004-01/msg01619.html>
# > People run more and more into problems with kernel modules on x86-64
# > and the new gcc because they don't specify the -mno-red-zone parameter.
# > With the old gcc 3.2 they got away, but with the new one it breaks.
<
http://www.faqs.org/docs/kernel/x204.html>
# Kernel modules need to be compiled with certain gcc options to make
# them work. [...]
# * -O2: The kernel makes extensive use of inline functions, so modules
# must be compiled with the optimization flag turned on. Without
# optimization, some of the assembler macros calls will be mistaken
# by the compiler for function calls. This will cause loading the
# module to fail, since insmod won't find those functions in the kernel.
<
http://www.luv.asn.au/overheads/compile.html>
# From /usr/doc/gcc/README.Debian.gz in gcc-2.95.2-0pre2:
[an old version, but I found it amusing considering the previous quote:]
# As of gcc-2.95, optimisation at level 2 (-O2) and higher includes an
# optimisation which breaks C code that does not adhere to the C standard.
# Such code occurs in the Linux kernel, and probably other places.
Anyway, the point is that the Linux kernel is a law unto itself, which may
or may not work with any particular combination of gcc version and options
(it certainly won't work with other compilers, except maybe icc with a patch).
If you are writing a general purpose library or application, you want it to
be less compiler-version/option-dependent than that; preferably, you want
something that works for a documented reason, rather than just by coincidence
of the compiler not doing some optimizations. As I said earlier,
Using inline asm with a "clobbers memory" directive before and after the load
should solve this particular problem. However, I am skeptical of reliance on
dependent load ordering in general. I've never seen any rigorous definition of
it (as a property of a memory model, rather than in discussions of processor
optimizations), or any explicit statement in the architecture manuals of commonly
used processor lines saying that it is supported. In fact, I wonder if the idea
that there is an implicit commitment for future versions of these processors
to guarantee dependent load ordering is not just wishful thinking.