Bug in floating-point addition: is anyone else seeing this?


Carl Banks

Dave said:
Are you running your simulations on a system that does or does not
support the "useless bell and whistle" of correct rounding? If not,
how do you prevent regression towards 0?

The "useless bell and whistle" is switching to multiprecision.

I'm not sure whether our hardware has a rounding bias or not but I
doubt it would matter if it did.

Dave said:
For example, one of the things that caused the PS3 to be in 3rd place
behind the Wii and XBox 360 is that to save a cycle or two, the PS3
cell core does not support rounding of single precision results -- it
truncates them towards 0. That led to horrible single-pixel errors in
the early demos I saw, which in turn helped contribute to game release
delays, which has turned into a major disappointment for Sony.

And you believe that automatically detecting rounding errors and
switching to multi-precision in software would have saved Sony all
this?


Carl Banks
 

Henrique Dante de Almeida

10000000000000000.0

Notice that 1e16-1 doesn't exist in IEEE double precision:
1e16-2 == 0x1.1c37937e07fffp+53
1e16 == 0x1.1c37937e08p+53

(that is, the hex representation ends with "7fff", then goes to
"8000").

So, it's just rounding. It could go up, to 1e16, or down, to 1e16-2.
This is not a bug, it's a feature.
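A quick C check makes the gap visible (a minimal sketch; it assumes a
C99 printf that understands %a):

#include <stdio.h>

int main (void)
{
        /* above 2^53 consecutive doubles are 2 apart, so the decimal
           literal 1e16-1 must round to one of its two neighbours */
        printf("%a\n", 1e16 - 2.0);           /* 0x1.1c37937e07fffp+53 */
        printf("%a\n", 9999999999999999.0);   /* rounds to 1e16 (tie to even) */
        printf("%a\n", 1e16);                 /* 0x1.1c37937e08p+53 */
        return 0;
}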
 

Henrique Dante de Almeida

 Notice that 1e16-1 doesn't exist in IEEE double precision:
 1e16-2 == 0x1.1c37937e07fffp+53
 1e16 == 0x1.1c37937e08p+53

 (that is, the hex representation ends with "7fff", then goes to
"8000").

 So, it's just rounding. It could go up, to 1e16, or down, to 1e16-2.
This is not a bug, it's a feature.

I didn't answer your question. :-/

Adding a number smaller than 1 to 1e16-2 should round to the nearest
double, which is 1e16-2, under the default rounding mode. So that's
strange.

The following code compiled with gcc 4.2 (without optimization) gives
the same result:

#include <stdio.h>

int main (void)
{
        double a;

        /* read doubles until EOF; print each in C99 hex-float format */
        while (scanf("%lg", &a) == 1) {
                printf("%a\n", a);
                printf("%a\n", a + 0.999);
                printf("%a\n", a + 0.9999);
        }
        return 0;
}
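(For input 9999999999999998, the a + 0.9999 line prints
0x1.1c37937e08p+53, i.e. 1e16, rather than the correctly rounded
0x1.1c37937e07fffp+53.)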
 

Henrique Dante de Almeida

 I didn't answer your question. :-/

 Adding a number smaller than 1 to 1e16-2 should round to the nearest
double, which is 1e16-2, under the default rounding mode. So that's
strange.

 The following code compiled with gcc 4.2 (without optimization) gives
the same result:

#include <stdio.h>

int main (void)
{
        double a;

        /* read doubles until EOF; print each in C99 hex-float format */
        while (scanf("%lg", &a) == 1) {
                printf("%a\n", a);
                printf("%a\n", a + 0.999);
                printf("%a\n", a + 0.9999);
        }
        return 0;
}

However, compiling it with "-mfpmath=sse -msse2" makes it work (using
-msse alone doesn't help).
 

Henrique Dante de Almeida

 However, compiling it with "-mfpmath=sse -msse2" makes it work (using
-msse alone doesn't help).

Finally (and the answer is obvious): the 387 breaks the standards and
doesn't use IEEE double precision when requested to do so.

It reads the 64-bit double and converts it to an 80-bit long double.
In this case, 1e16-2 + 0.9999 == 1e16-1 exactly in extended precision.
When requested by the printf call, this 80-bit number (1e16-1) is
converted to a double; it lies exactly halfway between 1e16-2 and
1e16, and the round-to-even tie-break picks 1e16.
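The double rounding can be reproduced explicitly. Here is a minimal
sketch; it assumes long double is the 80-bit x87 extended type, as it
is with gcc on IA-32:

#include <stdio.h>

int main (void)
{
        double a = 1e16 - 2.0;

        /* first rounding: in extended precision the sum snaps to exactly
           1e16-1, because 0.9999 is closer to 1.0 than to the next
           representable step below (0.9990234375 at this magnitude) */
        long double wide = (long double) a + 0.9999L;

        /* second rounding: 1e16-1 is exactly halfway between the doubles
           1e16-2 and 1e16, and the tie goes to the even mantissa, 1e16 */
        printf("%a\n", (double) wide);        /* 0x1.1c37937e08p+53 */

        return 0;
}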
 

Ross Ridge

Henrique Dante de Almeida said:
Finally (and the answer is obvious). 387 breaks the standards and
doesn't use IEEE double precision when requested to do so.

Actually, the 80387 and the x87 FPU in all other IA-32 processors
do use IEEE 754 double-precision arithmetic when requested to do so.
The problem is that GCC doesn't request that it do so. It's a long
standing problem with GCC that will probably never be fixed. You can
work around this problem the way the Microsoft C/C++ compiler does
by requesting that the FPU always use double-precision arithmetic.
That way your answers are only wrong when you use long double or float.
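On Linux/glibc the same workaround can be sketched with the
_FPU_GETCW/_FPU_SETCW macros from <fpu_control.h> (x86-specific, and
only illustrating the idea):

#include <stdio.h>
#include <fpu_control.h>        /* glibc, x86 only */

int main (void)
{
        fpu_control_t cw;

        _FPU_GETCW(cw);
        cw = (cw & ~_FPU_EXTENDED) | _FPU_DOUBLE;  /* 53-bit significand */
        _FPU_SETCW(cw);

        double a = 1e16 - 2.0;
        printf("%a\n", a + 0.9999);   /* now rounds once: 1e16-2 */
        return 0;
}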

Ross Ridge
 

Diez B. Roggisch

Dave said:
Are you running your simulations on a system that does or does not
support the "useless bell and whistle" of correct rounding? If not,
how do you prevent regression towards 0?

For example, one of the things that caused the PS3 to be in 3rd place
behind the Wii and XBox 360 is that to save a cycle or two, the PS3
cell core does not support rounding of single precision results -- it
truncates them towards 0. That led to horrible single-pixel errors in
the early demos I saw, which in turn helped contribute to game release
delays, which has turned into a major disappointment for Sony.

First of all, calling the PS3 technologically behind the Wii (which is
on par with the PS2 with respect to its computational power) is
preposterous.

And that put aside, I don't get what a discussion about single- or
double-precision floats that SHARE THE SAME ROUNDING BEHAVIOR, just at
different scales, has to do with automatically adapting calculations
to higher-precision number formats such as decimals or any other
arbitrary-precision format.

Diez
 

Diez B. Roggisch

Dave said:
This person who started this thread posted the calculations showing
that Python was doing the wrong thing, and filed a bug report on it.

If someone pointed out a similar problem in Flaming Thunder, I would
agree that Flaming Thunder was doing the wrong thing.

I would fix the problem a lot faster, though, within hours if
possible. Apparently this particular bug has been lurking on Bugzilla
since 2003: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=323

I wonder how you would accomplish that, given that there is no fix.

http://hal.archives-ouvertes.fr/hal-00128124

Diez
 

Mark Dickinson

Ross Ridge said:
Actually, the 80387 and the x87 FPU in all other IA-32 processors
do use IEEE 754 double-precision arithmetic when requested to do so.
The problem is that GCC doesn't request that it do so.  It's a long
standing problem with GCC that will probably never be fixed.  You can
work around this problem the way the Microsoft C/C++ compiler does
by requesting that the FPU always use double-precision arithmetic.

Even this isn't a perfect solution, though: for one thing, you can
only change the precision used for rounding, not the exponent range,
which remains the same as for extended precision. This means you
still don't get strict IEEE 754 compliance when working with very
large or very small numbers. In practice, I guess it's fairly
easy to avoid the extremes of the exponent range, so this seems like
a workable fix.
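A small sketch of the sort of thing that can go wrong near the
extremes (illustrative only, since whether gcc keeps the product in a
register depends on optimization):

#include <stdio.h>
#include <fpu_control.h>        /* glibc, x86 only */

int main (void)
{
        fpu_control_t cw;
        volatile double x = 1e-300, y = 1e-300;
        volatile double stored;

        /* 53-bit rounding, but the register exponent range is unchanged */
        _FPU_GETCW(cw);
        cw = (cw & ~_FPU_EXTENDED) | _FPU_DOUBLE;
        _FPU_SETCW(cw);

        stored = x * y;    /* forced out to memory: underflows to 0 */
        if (x * y != 0.0)  /* may stay in a register holding ~1e-600 */
                printf("extended exponent range leaked through\n");
        printf("%g\n", stored);
        return 0;
}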

More seriously, it looks as though libm (and hence the Python
math module) might need the extended precision: on my machine
there's a line in /usr/include/fpu_control.h that says

#define _FPU_EXTENDED 0x300  /* libm requires double extended precision. */

Mark
 

Henrique Dante de Almeida

Actually, the 80387 and the x87 FPU in all other IA-32 processors
do use IEEE 754 double-precision arithmetic when requested to do so.

True. :-/

It seems that it uses a flag to control the precision. So, a
conformant implementation would require saving and restoring the flag
between calls. No wonder gcc doesn't try to do this.

There are two possible options for python, in that case:

- Leave it as it is. The python language states that floating point
operations are based on the underlying C implementation. Also, the
relative error in this case is around 1e-16, which is smaller than the
expected error for IEEE doubles (~2e-16), so the result is non-
standard, but acceptable (in the general case, I believe the rounding
error could be marginally bigger than the expected error in extreme
cases, though).

- Use long doubles for architectures that don't support SSE2 and use
SSE2 IEEE doubles for architectures that do.

A third option would be for python to set the x87 precision to double
and switch it back to extended precision when calling C code (that
would be too much work for nothing).
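By the way, C99's FLT_EVAL_METHOD (from <float.h>) reports which of
these evaluation schemes a given build uses, which could help decide
between the options:

#include <float.h>
#include <stdio.h>

int main (void)
{
        /* 0: each operation rounds in its own type (e.g. SSE2 doubles)
           1: float/double evaluated in double
           2: everything evaluated in long double (classic x87)
          -1: indeterminable */
        printf("FLT_EVAL_METHOD = %d\n", (int) FLT_EVAL_METHOD);
        return 0;
}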
 
