"Nils O. Selåsdal"
You're basically right, but not about trusting the compiler. Let's say that
the XOR hack saves a cycle over use of a temporary, due to the architecture
of the instruction set. Seeing that assignment to a temporary is only for
the purposes of swapping is the sort of thing compilers tend not to be good
at, because they don't see the source with a human eye.
No; they see it with one that makes no mistakes, instead.
This is actually a very simple optimization in any compiler that
does live-range analysis of variables. A "live range" of a
variable begins when it is assigned a value and ends at the
last reference to it before it is assigned another value (or at
the last reference, if it is never re-assigned). This means that,
for instance, in code like:
tmp = a; /* line 17 */
a = b;
b = tmp; /* line 19 */
tmp = c; /* line 20 */
c = d;
d = tmp; /* line 22 */
/* code that does not refer to "tmp" again */
the "tmp" variable has two separate live ranges: one covers lines
17 through 19, and the other lines 20 through 22.
Given this information, and assuming that a, b, c, and d are in
processor registers and the CPU has a "swap" instruction for swapping
a pair of registers, it is easy to generate code of the form:
swap rA, rB
swap rC, rD
for these six lines of code. The GNU C compiler does it, for
instance. (Of course, the person who writes the ".md" file for
the machine has to define the swap instruction.)
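
To make the live-range point concrete, here is a minimal sketch in C.
The function names (swap_pairs, swap_pairs_split, xor_swap) are made
up for illustration and come from no library. The first function is
the six-line sequence above. The second shows roughly how a compiler
doing live-range analysis can view it, with each live range of "tmp"
treated as its own variable. The third is the XOR hack for
comparison. What a given compiler actually emits depends on the
target and the optimization level, so take this as a picture of the
idea, not of any particular compiler's output.

```c
#include <assert.h>

/* The six lines from the example: two swaps through one temporary. */
static void swap_pairs(int *a, int *b, int *c, int *d)
{
    int tmp;

    tmp = *a;   /* first live range of tmp begins */
    *a = *b;
    *b = tmp;   /* first live range ends */
    tmp = *c;   /* second, independent live range begins */
    *c = *d;
    *d = tmp;   /* second live range ends */
}

/* The same code with each live range given its own name.  The two
   temporaries never overlap, so each swap can be treated on its own. */
static void swap_pairs_split(int *a, int *b, int *c, int *d)
{
    const int tmp1 = *a;    /* stands for the first live range */
    *a = *b;
    *b = tmp1;

    const int tmp2 = *c;    /* stands for the second live range */
    *c = *d;
    *d = tmp2;
}

/* The XOR hack, for comparison.  Unlike the temporary version, it
   breaks when both pointers refer to the same object: x ^ x is 0. */
static void xor_swap(unsigned *x, unsigned *y)
{
    assert(x != y);     /* swapping an object with itself would zero it */
    *x ^= *y;
    *y ^= *x;
    *x ^= *y;
}
```

Calling swap_pairs(&a, &b, &c, &d) and swap_pairs_split(&a, &b, &c, &d)
leaves the same values in a, b, c, and d. The only difference is how
the temporary is named, which is exactly the information the
live-range analysis recovers on its own.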
However, is saving a cycle really worth it? If you are so
time-critical, wouldn't it be better to resort to assembly?
This is a valid point -- but it turns out that many C programmers
can no longer write "fast assembly code" for modern (pipelined)
processors -- or at least, not as well as compilers with good
pipeline schedulers (gcc3's is far better than gcc2's, for instance).
To beat the compiler, you may need one of those special assembly
programmers stored in the glass cases (with the signs that read
"in case of emergency, break glass").