jacob navia wrote On 06/29/07 13:43,:
Eric said:
jacob navia wrote On 06/29/07 10:56,:
Eric Sosman wrote:
Eric Sosman wrote On 06/29/07 07:31,:
jacob navia wrote:
[...]
If you add a number smaller than
this to 1.0, the result will be 1.0. For the different representations
we have in the standard header <float.h>:
Finally, we get to an understandable definition: x is the
FP epsilon if 1+x is the smallest representable number greater
than 1 (when evaluated in the appropriate type). [...]
Now that I think of it, the two descriptions are
not the same. Mine is correct, as far as I know, but
Jacob's is subtly wrong. (Hint: Rounding modes.)
The standard says:
5.2.4.2.2:
DBL_EPSILON
the difference between 1 and the least value greater than 1 that is
representable in the given floating point type, b^(1-p)
Yes, but what you said in your tutorial was "If you add
a number smaller than this [epsilon] to 1.0, the result will
be 1.0," and that is not necessarily true. For example, on
the machine in front of me at the moment,
1.0 + DBL_EPSILON * 3 / 4
is greater than one, even though `DBL_EPSILON * 3 / 4' is
25% smaller than DBL_EPSILON.
This is because your machine makes calculations in extended
precision since DBL_EPSILON must by definition be the smallest
quantity where 1+DBL_EPSILON != 1
Wrong again, Jacob. Twice.
First, the machine I'm using does not even have an
extended precision, much less use one. Its double is
a plain vanilla IEEE 64-bit double, and that's that.
What it does have, though, in common with other IEEE
implementations, is the ability to deliver a correctly
rounded answer. 1+DBL_EPSILON*3/4 is (mathematically)
closer to 1+DBL_EPSILON than it is to 1, so the sum
rounds up rather than down. (Did you not read the hint
I wrote earlier, and that you have now quoted twice?
"Not," I guess.)
Second, you have again mis-stated what the Standard
says (even after quoting the Standard's own text). The
Standard does *not* say that epsilon is the smallest
value which when added to 1 yields a sum greater than 1;
it says that epsilon is the difference between 1 and the
next larger representable number. If you think the two
statements are equivalent, you have absolutely no call
to be writing a tutorial on floating-point arithmetic!
For IEEE double, there are about four and a half
thousand million million distinct values strictly less
than DBL_EPSILON which when added to 1 will produce the
sum 1+DBL_EPSILON in "round to nearest" mode, which is
the mode in effect when a C program starts.