Tim Prince
In SSE/SSE2 code, there's no implicit promotion of float to double.
Stephen Sprunk said:
Chris Torek said:
It is probably worth noting that the hardware architecture with
which most people are familiar -- the Intel IA32 -- is one of
these. All of its calculations use "long double" internally
(assuming, of course, that your C compiler maps "long double"
onto this 80-bit internal format!).
The IA32 does have a "precision" field in its FPU control word,
but this does not work in quite the same way as actually converting
the final result down to the 32 or 64 bit "memory format" for float
and double. In particular, setting the precision to "float" causes
mantissa rounding, but leaves the exponent range +/- 16383 instead
of +/- 127. In other words, the internal format is as if you had
39 bits to work with, instead of the "expected" 32. This means
that infinities and NaNs do not occur when one might expect them.
These "problems" go away when switching to SSE[12] for FP, right?
My understanding is that the SSE unit only has 32-bit floats and 64-bit
doubles, and the x87 unit is still used for long doubles (but nothing else).
This is why I was curious about promotion of floats to doubles and back to
floats -- that seems like it would seriously mess up the SSE registers.
"Vector" parallel code requires going through memory to cast between float
and double. In Windows, there's no support for long double wider than
double. The Windows-64 ABI doesn't even support use of the x87
instructions, although the influence of Kahan may live on in the Linux ABI.