If I'm doing thousands of them, it starts to add up.
IIRC it's 6 cycles or so, and that hurts when I've managed
to finesse away just about every other expensive calculation
in an inner loop. It's all relative, I suppose.
However, in most cases FIST won't give you a C-style (truncating) float-to-int
conversion, as the processor is almost certainly in round-to-nearest mode.
Therefore you have two options:
1. Modify the FPU control word (fcw) to switch to truncation, do the FIST, then
put it back. This is EGREGIOUSLY SLOW.
2. Subtract .5 from the value before doing the FIST. (This is actually pretty fast,
but I don't know of ANY compiler that does this.)
Well, okay, you're right of course, but when I've had to hit
the assembler for performance reasons I've usually been quite happy to
accept this single-bit error.