converting float to double

Ernie Wright

christian.bau said:
If the situation is as described, then I would be rather sure that your
compiler is broken, and I wouldn't go anywhere near it. If you rely on
that strange behavior of the compiler, you are skating on very thin
ice. 59.889999389648437 looks like a bona fide 32 bit floating point
number. The C Standard requires one hundred percent that f is rounded
to single precision, the multiplication by 1000.0 and division by
1000.0 are one hundred percent required to be done in double precision,
which means that getting a result of 59.890000000000001 is absolutely
impossible unless the compiler is broken.

Are you sure about that?

We start with

double d;
float f = 59.89f;

after which f is represented internally as

(1 + 7311196 / 8388608) * 32 = 59.8899993896484375

This is as close as a float can get to 59.89. Next we do

f *= 1000.0;

after which f is represented internally as

(1 + 6943232 / 8388608) * 32768 = 59890.0 exactly

You don't have to take my word for it, but I won't show how the multiply
is done in binary. Finally,

d = f / 1000.0;

after which d is represented as

(1 + 3925168550230098 / 4503599627370496) * 32
= 59.89000000000000056843418860808

And this is as close as a *double* can be to 59.89. Not only is this
not broken, it's exactly what's required by the IEEE 754 standard. Nor
do you have to go through all of the above steps.

f = 59.89f;
d = 59.89;

produces the same bit patterns and the same slight difference in value.
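
A minimal sketch for checking the three steps above on your own machine. It assumes IEEE 754 float and double and a compiler that rounds to float on the store; the function names are made up for illustration.

```c
#include <stdio.h>

/* Repeat the steps from the post: store 59.89 in a float, scale by
   1000.0 with the result stored back into a float (the store rounds
   to single precision), then divide in double. */
double round_trip(void)
{
    float f = 59.89f;      /* nearest float to 59.89 */
    f *= 1000.0;           /* product formed in double, rounded to float on store */
    return f / 1000.0;     /* f is promoted to double, division done in double */
}

/* Returns 1 when the round trip lands on the same double as the literal 59.89. */
int round_trip_matches_literal(void)
{
    return round_trip() == 59.89;
}

/* Print the float value and the round-trip value with 17 significant digits. */
void show_round_trip(void)
{
    float f = 59.89f;
    printf("%.17g %.17g\n", (double)f, round_trip());
}
```

Comparing with == is deliberate here: the claim is that the bit patterns are identical, not merely close.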

- Ernie http://home.comcast.net/~erniew
 
CBFalconer

Mark said:
Old vs new Lira, both still in use.
Anyway, Vietnamese dong are a better example...


What makes it interesting is that the "predefined unit" is
variable by currency, market and calculation.

Listen to Mark. He is one of those well beloved creatures, a
banker. Usually considered infinitely superior to a lawyer, or a
politician. :_)
 
Dik T. Winter

> Dik T. Winter wrote: ....
>
> ??. You cannot do non-trivial calculations to a precision
> greater than 1/2 of your smallest currency unit?

Eh? I can not follow this.
>
> No, you need a large enough integer type as well (scaling
> will not help if you are using char). Also, how do you
> do compound interest calculations "using integer arithmetic"
> and how does the fact that you are using integer arithmetic
> help you to determine a "correct" answer based on a non-mathematical
> standard of correctness?

See below.
>
> Indeed. But why can't this "something larger" be a floating
> point type with sufficient precision? If there is sufficient
> precision you can still determine the correct answer.

As long as you use only integer representations in your floating point
type there is no problem. However, be aware that at each step you must
make sure that your floating point variable still represents an integer.
>
> This can be done as conveniently using floating point arithmetic
> as using fixed point arithmetic. In either case you have to
> do rounding.

If you need rounding at each step of your floating point calculations, I
do not see any advantage using that type over an integral type. On the
other hand, when you are using floating-point it is likely that you will
use the exp function to calculate compound interest. That will, almost
certainly, give the wrong result.
>
> If this is a requirement, such a system can be written
> as easily using high precision floating point as large
> integer. However, note that such a law would
> regularly require inputs to be specified with an error
> of less than 10^-9. It is unlikely to be honoured or
> enforced.

Why? When the euro was introduced, the associated countries decided on
precise values that must be used when converting old currency to the
new currency. They were stated (in decimal) to a particular figure
after the decimal point. It required quite some effort to show that
doing it in floating-point would not be wrong *if* properly rounded
at the appropriate stages. But even there, there was still some
uncertainty. The one cent difference is quite important in financial
institutions. Now, Dutch tax laws are quite lax with respect to
rounding. Any figure you state on your tax return form you can round
to one of the two nearest plain euro values any way you see appropriate.
I know that there are countries that are more strict.
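
As an illustration of a fixed conversion handled purely in integers, here is a minimal sketch for the guilder, whose fixed rate was 1 EUR = 2.20371 NLG. The function name and the choice of rounding halves up to the nearest cent are assumptions made for the example, not a statement of the legally prescribed procedure.

```c
/* Convert a non-negative amount in guilder cents to euro cents using
   the fixed rate 1 EUR = 2.20371 NLG, entirely in integer arithmetic.
   Assumption: round to the nearest cent, halves rounded up. */
long long nlg_cents_to_eur_cents(long long nlg_cents)
{
    const long long rate_x100000 = 220371;    /* 2.20371 scaled by 10^5 */
    long long num = nlg_cents * 100000;       /* scale the amount to match */

    /* floor(num / rate + 1/2), written without any fractions */
    return (2 * num + rate_x100000) / (2 * rate_x100000);
}
```

Because the rate is scaled to an integer, no binary approximation of 2.20371 ever enters the calculation.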
 
Dik T. Winter

> If the situation is as described, then I would be rather sure that your
> compiler is broken, and I wouldn't go anywhere near it. If you rely on
> that strange behavior of the compiler, you are skating on very thin
> ice. 59.889999389648437 looks like a bona fide 32 bit floating point
> number. The C Standard requires one hundred percent that f is rounded
> to single precision, the multiplication by 1000.0 and division by
> 1000.0 are one hundred percent required to be done in double precision,
> which means that getting a result of 59.890000000000001 is absolutely
> impossible unless the compiler is broken.

Christian. I thought you knew better than this. With the multiplication
by 1000 we get another rounding. (The multiplication is done in single
precision.) But apparently you should avoid gcc on both Linux and
Solaris. Try the following program with your favourite compiler:

#include <stdio.h>

int main(void)
{
    float f = 59.89F;
    double d = f;
    double dd;

    f *= 1000.0;
    dd = (double)f / 1000.0;
    printf("%30.17g %30.17g\n", d, dd);
    return 0;
}

and look at the result. On the systems I use it looks like:
59.889999389648438 59.890000000000001

> Switch compilers. Maybe switch to Java, using a Sun
> certified implementation, and turn "strict floating point arithmetic"
> on; that way you can rely on your results (that means if anything is
> wrong, it is your code that is wrong and not the compiler).

Right. The results are exactly conforming to IEEE arithmetic.
And you would have found that also if you had considered the actual
program and analyzed it.
 
Dik T. Winter

> It will give the right result for positive values, but is overly
> complicated. On the other hand, according to the information that you
> have given us, your compiler cannot be trusted, so both your code and
> my code below cannot be guaranteed to work.

Can you tell me a compiler that can be trusted? Gcc is not in that class
as it shows the "bug".

You are ridiculously harsh here on people that do not understand the
working of floating point. I think you did not even try the initial
program. The behaviour quoted is exactly according to the IEEE standard
of floating point arithmetic. Pray, calm down a bit.
 
William Hughes

Dik said:
Eh? I can not follow this.

The question is: can we determine the correct answer from
the floating point value?

Assume possible precise answers are spaced at intervals of the smallest
currency unit.

We convert the floating point value to a precise answer by
finding the closest precise answer.

We now see that as long as the error is less than 1/2 of the smallest
currency spacing, the closest precise answer will also be the
correct answer.
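
A minimal sketch of that decoding step, assuming non-negative amounts held in dollars and a smallest unit of one cent. The function name is hypothetical.

```c
/* Decode a non-negative floating point amount in dollars to a whole
   number of cents by picking the nearest cent. This is only the
   decoding of the representation; it is not an accounting rule. */
long long nearest_cents(double dollars)
{
    return (long long)(dollars * 100.0 + 0.5);
}
```

As long as the stored value is within half a cent of the intended amount, the decoded integer is the intended amount, whatever small error the representation carries.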

See below.


As long as you use only integer representations in your floating point
type there is no problem.

No, as long as the floating point representation remains within
half of your smallest currency unit there is no problem.
However, be aware that each time you should
be sure that your floating point variable represents an integer.

Why? All I need is to be sure that I can recover the correct value.
If you need rounding at each step of your floating point calculations, I
do not see any advantage using that type over an integral type.

No, you round where required by accountancy rules or when
needed for precision. This may be
much less than every step. And there is a big advantage
in being able to use the existing floating point system rather
than having to obtain or roll your own fixed point system.
On the
other hand, when you are using floating-point it is likely that you will
use the exp function to calculate compound interest. That will, almost
certainly, give the wrong result.

Why? There is no reason to turn your brain off when you use floating
point. If the exp function does not give the accounting answer, and
you want the accounting answer, then don't use the exp function (Duh!).

- William Hughes
 
av

You are out of date. 1 US dollar = 1.4283 Turkish Lira.


You still fail to see how the financial world is looking at it.
They require exactness to some predefined unit. And if some
system requires precision to the cent, do the calculations to
the cent.

I think that "exactness to some predefined unit" does not exist
if I have to do divisions.

To all of you who have the experience that I don't have:

What about treating money as a fixed point number of type
xIntegers.10Integers, which I then show in the form "x.xx" (rounded to
3 decimal digits after the point)?
(The number itself is stored in its full format, e.g.
in hex 123.12345678 12345678 12345678 12345678 12345678
12345678 12345678 12345678 12345678 12345678
)

How many divisions, sums and differences would I have to do before
an error of 1 cent shows up when printing, with the above numbers
(assuming the implementation is bug-free)?
Thank you.
 
Mark McIntyre

Assume possible precise answers are spaced at intervals of the smallest
currency unit.

We convert the floating point value to a precise answer by
finding the closest precise answer.

We now see that as long as the error is less than 1/2 of the smallest
currency spacing, the closest precise answer will also be the
correct answer.

This doesn't follow, unless the currency rounding rule is to select
the closest precise answer, and in that case it's tautological.

Several markets and currencies do not follow this rule, e.g. they
always round down, or always discard pennies, or round up if the
integer part is even, and down if it's odd. Or whatever.

If you want "correct" values, you need to implement special handling
routines for rounding.
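
A minimal sketch of what such special handling might look like. The three rules are one reading of the examples above and the names are hypothetical; real market rules would have to be taken from the relevant regulations.

```c
/* Hypothetical rounding rules, applied to a non-negative amount given
   in tenths of a cent. The result is in whole cents. */
enum rounding_rule {
    ROUND_DOWN,         /* always drop the fraction of a cent */
    DISCARD_PENNIES,    /* keep whole dollars only */
    UP_IF_EVEN          /* round up if the cent count is even, down if odd */
};

long long to_cents(long long tenths_of_cent, enum rounding_rule rule)
{
    long long q = tenths_of_cent / 10;    /* whole cents, truncated */
    long long r = tenths_of_cent % 10;    /* discarded tenths of a cent */

    switch (rule) {
    case ROUND_DOWN:
        return q;
    case DISCARD_PENNIES:
        return (q / 100) * 100;
    case UP_IF_EVEN:
        return (r != 0 && q % 2 == 0) ? q + 1 : q;
    }
    return q;
}
```

Keeping the rule as an explicit parameter makes it visible at each call site which convention a calculation follows.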
--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan
 
William Hughes

Mark said:
This doesn't follow, unless the currency rounding rule is to select
the closest precise answer, and in that case it's tautological.

You are confusing levels.

Calculate the precise answer.
(This may include arbitrary rules, such as
"every second Thursday of a month that doesn't
contain r, add 1".) It is at this level
that the rounding rules you are talking about
are relevant.

Represent the precise answer. (assumed to
be an integral number of currency units)
One method is to store the precise answer
as an integer. However, any method of
coding that can be unambiguously decoded
will work. An obvious alternative is to
code the precise answer as any floating
point number with the property that it is
closer to the desired value than to any
non-desired value. The "rounding" here is
just a decoding mechanism. It has nothing
to do with the accounting rules of rounding.

Several markets and currencies do not follow this rule, e.g. they
always round down, or always discard pennies, or round up if the
integer part is even, and down if it's odd. Or whatever.

If you want "correct" values, you need to implement special handling
routines for rounding.

Duh! If you want the correct answer you have to calculate it.
I am not saying that floating point arithmetic mirrors financial
rules, only that financial rules can be expressed as conveniently
in floating point as in integer.

- William Hughes
 
Random832

2006-12-22 said:
Is that a float precise to one cent?

Not as such. But generally when doing such a thing we would want it to
properly translate 19.9999980926513671875 (i.e. a single value off in
either direction) to 20.00
 
Dik T. Winter

>
> Not as such. But generally when doing such a thing we would want it to
> properly translate 19.9999980926513671875 (i.e. a single value off in
> either direction) to 20.00

Indeed. The statement for dollars should have been:
dollars = (int)(f * 100 + 0.5) / 100;
 
christian.bau

Dik said:
Christian. I thought you knew better than this. With the multiplication
by 1000 we get another rounding. (The multiplication is done in single
precision.) But apparently you should avoid gcc on both Linux and
Solaris. Try the following program with your favourite compiler:

Yes, I got this wrong. I missed that f * 1000.0 was again assigned to a
float. Which makes the situation more interesting. So here is my
analysis:

Given is a number f, which is equal to an integer x, divided by 100,
rounded to the nearest float value - which obviously gives a rather
large rounding error. An attempt to recover a more precise value is
done by calculating t = (float) (f * 1000.0), then calculating d =
(double) t / 1000.0.

Assume 2^k <= f < 2*2^k. f is equal to the correct x/100 + eps, where
|eps| is less than 1/2 of the least significant bit. f * 1000 is equal
to 10x + 1000eps. t will often be in the range 2^(k+10) <= t <
2*2^(k+10). The difference between f * 1000 and 10x is 1000eps, the
absolute value of this is less than 500 times the least significant
bit.

If 10x can be exactly represented as a float, and 10x >= 2^(k+10),
then f * 1000 is guaranteed to be so close to 10x that it will be
rounded to 10x, so t = 10x, and therefore d will be a much more precise
result. However, this is not the case if 10x < 2^(k+10) = 1024 * 2^k
or 1000f < 1024 * 2^k or f < 1.024 * 2^k. So any dollar values between
2^k and 1.024 * 2^k are suspicious, that is values 1.01 and 1.02, 2.01
to 2.04, 4.01 to 4.09 and so on.

The first number where this algorithm fails is 4.01, and it fails for
many values slightly above 4, 8, 16, 32 and so on. It also fails when
10x cannot be represented in a float value; since 10x is even this will
be the case when 10x > 2^25 (using 32 bit float) or when the dollar
value is more than 1.024 * $32768.

There is a twist if this is used for share prices: While most share
prices are a whole number of cents, prices for low valued shares can be
in tenths of cents or hundredths of cents. This algorithm also recovers
most, but not all, prices that are in tenths of cents correctly (and
will usually fail for hundredths of cents). Replacing the code with
something that rounds to the nearest cent will make this stop working.
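
The analysis above can be checked by brute force. The following is a minimal sketch, assuming IEEE 754 float and double; the function names are hypothetical, and "failure" is taken to mean that the recovered double differs from the double nearest to the decimal amount.

```c
/* The recovery trick under discussion: scale a float by 1000.0, store
   the product in a float, then divide in double. */
double recover(float f)
{
    float t = (float)(f * 1000.0);
    return (double)t / 1000.0;
}

/* Returns the smallest amount in cents, from 1 up to max_cents, for
   which the trick does not reproduce the double nearest to cents/100,
   or -1 if every amount in that range is recovered. */
long first_failing_cents(long max_cents)
{
    long c;

    for (c = 1; c <= max_cents; c++) {
        double exact = c / 100.0;    /* double nearest to the decimal value */
        float f = (float)exact;      /* what a float field would hold */

        if (recover(f) != exact)
            return c;
    }
    return -1;
}
```

Calling first_failing_cents with a limit above 401 should stop at 401, matching the 4.01 figure given above.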
 
CBFalconer

Dik T. Winter said:
.... snip ...

Indeed. The statement for dollars should have been:
dollars = (int)(f * 100 + 0.5) / 100;

Assuming dollars is an int (or even a double or float) the usual
rule about casts applies. They are almost always an error, or
useless. Simply omit them unless you have a very good and clear
reason for them. In this particular case the compiler will happily
adapt to declaring dollars as a long, or even a long long (for
C99). In fact, the code shown will happily discard the cents field
when destined for a float or double. Maybe that is what you
intended.

Assuming that is the intention (i.e. exact float representations by
making them integral) you can improve efficiency by:

int incents;
float dollars, cents, f;

incents = f * 100 + 0.5;
dollars = incents / 100;
cents = incents % 100;

avoiding the run time expense of multiple float to int conversions
(which the optimizer might have handled) and avoiding all casts.
The code will remain correct if the floats are changed to doubles,
simplifying maintenance.
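
For anyone who wants to try this, the fragment can be wrapped in a function along the following lines. This is only a sketch: the function name and the pointer parameters are additions for illustration, and it assumes a non-negative amount.

```c
/* Split a non-negative amount in dollars into whole dollars and cents,
   converting to an integer number of cents exactly once. */
void split_amount(float f, int *dollars, int *cents)
{
    int incents = f * 100.0 + 0.5;    /* nearest cent; conversion truncates */

    *dollars = incents / 100;
    *cents = incents % 100;
}
```

With the float value 19.9999980926513671875 mentioned earlier in the thread, this yields 20 dollars and 0 cents.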
 
Dik T. Winter

....
I:
>
> The question is: can we determine the correct answer from
> the floating point value?

And I state: no.
> Assume possible precise answers are spaced at intervals of the smallest
> currency unit.
>
> We convert the floating point value to a precise answer by
> finding the closest precise answer.
>
> We now see that as long as the error is less than 1/2 of the smallest
> currency spacing, the closest precise answer will also be the
> correct answer.

Care to explain? I find that with an initial amount of $100 and with
an interest rate over some period of 2.5%, where the rules state that
the interest has been truncated to the nearest cent, the following
statements (assuming amount to be a long, damount and eamount doubles):
amount += amount * 5 / 200;
damount += damount * 5 / 200;
eamount = rint(eamount * 205 / 200);
there are already differences after the third calculation. You really
need "floor" rather than "rint" in the eamount statement to get the
correct results. And this is only a simple problem of compound
interest.
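
The three statements can be turned into a small experiment. The sketch below keeps amounts in cents and writes 2.5% as 5/200 as in the statements above; the function names are hypothetical, and plain conversions to an integer type stand in for rint and floor so that the example needs no math library (adequate for the non-negative, non-tie values used here).

```c
/* Integer arithmetic: the division truncates the interest to whole cents. */
long compound_truncating(long amount, int periods)
{
    while (periods-- > 0)
        amount += amount * 5 / 200;
    return amount;
}

/* Floating point with no rounding at all: fractions of a cent pile up. */
double compound_unrounded(double amount, int periods)
{
    while (periods-- > 0)
        amount += amount * 5 / 200;
    return amount;
}

/* Floating point, rounded to the nearest cent each period (rint stand-in). */
double compound_nearest(double amount, int periods)
{
    while (periods-- > 0)
        amount = (double)(long long)(amount * 205 / 200 + 0.5);
    return amount;
}

/* Floating point, truncated to a whole cent each period (floor stand-in). */
double compound_floor(double amount, int periods)
{
    while (periods-- > 0)
        amount = (double)(long long)(amount * 205 / 200);
    return amount;
}
```

Starting from 10000 cents, three periods give 10768 for the truncating integer version and for the floor version, 10769 for the round-to-nearest version, and 10768.90625 for the unrounded one.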

Determining whether the final result after a number of f-p
calculations is within 1/2 of the correct result is not doable.
>
> No, as long as the floating point representation remains within
> half of your smallest currency unit there is no problem.

Depends on the kind of rounding needed for the particular problem.
And when the calculations are only slightly difficult you get problems.
>
> Why? All I need is to be sure that I can recover the correct value.

And how can you be sure of that?
>
> No, you round where required by accountancy rules or when
> needed for precision. This may be much less than every step.

How do you keep track of when you should round? In the above compound
interest example I already see that if I do not round at every step,
I get a wrong result with f-p after three iterations.
> And there is a big advantage
> in being able to use the existing floating point system rather
> than having to obtain or roll your own fixed point system.

I do not think so. 64-bit integers are becoming quite common, and
they give better precision than double precision f-p.
>
> Why? There is no reason to turn your brain off when you use floating
> point. If the exp function does not give the accounting answer, and
> you want the accounting answer, then don't use the exp function (Duh!).

And if floating point does not give the accounting answer, and you want
the accounting answer, then don't use floating point.

But try it out with the above rules and an interest rate of 0.5 % per
period (which means that if that is a monthly period, the yearly
interest rate is about 6.17 %). After about 2 to 4 periods there is
a difference between integer and f-p when you do not do intermediate
rounding. (And if the initial capital is $101 the difference shows up
earlier...)
 
William Hughes

Dik said:
And I state: no.


Care to explain? I find that with an initial amount of $100 and with
an interest rate over some period of 2.5%, where the rules state that
the interest has been truncated to the nearest cent, the following
statements (assuming amount to be a long, damount and eamount doubles):
amount += amount * 5 / 200;
damount += damount * 5 / 200;
eamount = rint(eamount * 205 / 200);
there are already differences after the third calculation. You really
need "floor" rather than "rint" in the eamount statement to get the
correct results. And this is only a simple problem of compound
interest.

You are confusing levels. If the rule says truncate, then
truncate; do not round. The "rounding" required to
convert a double representation to an integer
representation has little to do with the algorithm used
to compute the double representation.

There are two questions:

What algorithm should be implemented?

What data type should be used to implement the algorithm?

The two questions are different, and the first dominates. Changing the
data type should not change the algorithm!

Consider an algorithm for compound interest with slightly
modified rounding, principal
p, interest rate per period i, number of periods n, currency dollar.

For period 1 to n

p = round to nearest cent ( p + i p)

We can calculate this using a sufficiently large integer type
or a sufficiently large floating point type. Assume p = 100.00
i= .05/year, the period is 6 months and there are 10 periods.


An integer implementation might look like

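long p = 10000;          /* principal in cents */
int i_x1000 = 50;        /* yearly interest rate times 1000 */
int n = 10;              /* number of periods */
int i_per_n_x1000;       /* interest rate per period times 1000 */
int i;

i_per_n_x1000 = i_x1000 / 2;

for(i = 1;i<=n;i++){

    p += p*i_per_n_x1000/1000;

    if( 2*( p%i_per_n_x1000) >= i_per_n_x1000 ){
        p++;
    }

}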

At the end we convert to dollars and cents, dollars = p/100,
cents = p%100.

A floating point implementation might look like

double p = 10000;
double i = .05;
double i_per_n;
int n = 10; /* number of periods */

i_per_n = i/2.0;


for(i = 1;i<=10;i++){

p += p*i_per_n;

if( fmod(100*p,1.0) > 0.495) {
p += .01;
}


}

At the end you determine dollars = floor(p), cents = floor(100*p + 0.5)

I see no clear winner here. Both give exactly the
same answer (even for unrealistically large values
of n). The rounding rule is a bit clearer
in the integer form (and if we specify truncation it can be
omitted altogether in the integer form, not so in the floating
point form) but the interest rate calculation is a bit more
natural in the floating point form. The integer form only works with
an interest rate per period in multiples of .001. This may need to be made
more precise. If accounting rules so specify, the interest rate
per period of the floating point form might have to be made less
precise.
Determining whether the final result after a number of f-p
calculations is within 1/2 of the correct result is not doable.


Depends on the kind of rounding needed for the particular problem.

No. The rounding needed for the particular problem
is performed. The "rounding" needed to convert from the floating
point representation to an integer representation is something
different.
And when the calculations are only slightly difficult you get problems.

Not different than the problems associated with integer
calculations (you have the advantage that a truncation is
a nop with integer math. Given that you need to implement
general rounding rules in any case, I do not see this as
a big advantage)
And how can you be sure of that?


How do you keep track when you should round? In the above compound
interest example I see already that if I do not round at every step
that I get already a wrong result with f-p after three iterations.

If you do not implement the correct algorithm you get
the wrong answer (Duh!).

You have to round after every step with integer arithmetic.
That you can omit this step when the rounding rule is truncation
does not hold in general.
I do not think so. 64-bit integers are becoming quite common, and
they give better precision than double precision f-p.

And both are too small for some real world problems.
The class of problems for which double is insufficient, but
64 bit integer is sufficient, is not very large.

You still need to deal with fractions. Whether
you do so by using a fixed point system or
approximate rational arithmetic, native
support for 64 bits will only take you so far.
And if floating point does not give the accounting answer, and you want
the accounting answer, then don't use floating point.

A sufficiently precise floating point can be used to implement
any accounting algorithm. The question is whether
this is less convenient than using (sufficiently large)
integer arithmetic.

- William Hughes
 
William Hughes

William said:
A floating point implementation might look like

double p = 10000;
double i = .05;
double i_per_n;
int n = 10; /* number of periods */

i_per_n = i/2.0;


for(i = 1;i<=10;i++){

p += p*i_per_n;

Correction

if( fmod(100*p,1.0) > 0.495) {
p += .01;
}

Should be

if( fmod(100*p,1.0) > 0.495) {
p += .01 - fmod(p,0.01);

}
else {
p -= fmod(p,0.01);
}

- William Hughes
 
Dik T. Winter

> Dik T. Winter wrote: ....
> Consider an algorithm for compound interest with slightly
> modified rounding, principal
> p, interest rate per period i, number of periods n, currency dollar.
>
> For period 1 to n
> p = round to nearest cent ( p + i p)
> We can calculate this using a sufficiently large integer type
> or a sufficiently large floating point type. Assume p = 100.00
> i= .05/year, the period is 6 months and there are 10 periods.

If the period is 1/2 year, the interest per period is not one half of
the yearly interest. Try to do it over a five year period. A yearly
interest rate of .05 gives after five years $ 127.63 and a half-yearly
interest rate of .025 gives after five years $ 128.01. A difference
that some accountants worry about. There are specific rules for how to
calculate interest over a period smaller than the base period (although
the rules depend on the situation).

But let me assume that the interest is 0.025 / 6 months.
> An integer implementation might look like
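> long p = 10000;          /* principal in cents */
> int i_x1000 = 50;        /* yearly interest rate times 1000 */
> int n = 10;              /* number of periods */
> int i_per_n_x1000;       /* interest rate per period times 1000 */
> int i;
>
> i_per_n_x1000 = i_x1000 / 2;
> for(i = 1;i<=n;i++){
>     p += p*i_per_n_x1000/1000;
>     if( 2*( p%i_per_n_x1000) >= i_per_n_x1000 ){
>         p++;
>     }
> }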

I would do it as:
long amount = 20000; /* 2 times the amount in cents */
...
for(i = 1; i <= n; i++) {
    amount += amount * i_per_n_x1000 / 1000; /* rate per period times 1000 */
    amount += (amount & 1);
}
amount /= 2; /* the real amount in cents after the calculations */

Your formulation is wrong. When the amount is 10769 cents (3rd
iteration), the interest calculation gives an interest of 269.225 cents,
which is rounded to 269 cents. The next result is 11038 cents. Your
formula gives 11039.
What is wrong is that your formulation uses cents as units for p and
you try to find out what has been rounded off from that value. Also, your
formulation is about twice as slow as mine. (For 1,000,000 iterations
52 vs. 32 seconds.) Improving the condition when a cent should be added
would increase the calculation time.
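
A minimal, self-contained sketch of the half-cent scaling described above, set next to a direct round-half-up formulation for comparison. The function names and the per-mille rate parameter are assumptions made for the example; amounts are non-negative and in cents.

```c
/* Carry twice the amount, so that the low bit records whether at least
   half a cent of interest was left over after truncation. */
long long compound_half_cents(long long cents, int rate_per_mille, int periods)
{
    long long amount = 2 * cents;    /* units of half a cent */

    while (periods-- > 0) {
        amount += amount * rate_per_mille / 1000;  /* truncated to half-cents */
        amount += (amount & 1);                    /* odd: round up to a cent */
    }
    return amount / 2;
}

/* Reference: add interest rounded to the nearest cent, halves rounded up. */
long long compound_round_half_up(long long cents, int rate_per_mille, int periods)
{
    while (periods-- > 0)
        cents += (cents * rate_per_mille + 500) / 1000;
    return cents;
}
```

With 10000 cents and a rate of 25 per mille, both functions give 10769 after three periods and 11038 after four, in line with the figures quoted above.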
> A floating point implementation might look like
> double p = 10000;
> double i = .05;
> double i_per_n;
> int n = 10; /* number of periods */
>
> i_per_n = i/2.0;
> for(i = 1;i<=10;i++){
> p += p*i_per_n;
> if( fmod(100*p,1.0) > 0.495) {
> p += .01 - fmod(p, 0.01);
> } else {
> p -= fmod(p, 0.01);
> }
> }

I incorporated your later improvement. But the result is still wrong.
You already get a difference after the 18th iteration with the
formulation I gave. And when I introduce a "p = floor(p + 0.5);"
in each iteration it goes wrong at the 63rd step (strangely enough
the first case where the error is in favour of the bank). And without
the ftrunc the calculation (with 1,000,000 iterations) takes 0.73
seconds, with it it takes 0.90 seconds. You would be better off if
you had eliminated the complete conditional statement and replaced it
by a simple "p = floor(p + 0.5)". In that case the formulation would
be correct (and it would win in terms of speed on the machine I
tried it on: 0.22 seconds for 1,000,000 iterations). But now you simply
do an emulated version of integer arithmetic with floating point.
That may or may not be faster than a true integer formulation, depending
on machine architecture.
> I see no clear winner here. Both give exactly the
> same answer (even for unrealistically large values
> of n)

I have indicated above where your calculations give wrong results.
> The rounding rule is a bit clearer
> in the integer form (and if we specify truncation it can be
> omitted altogether in the integer form, not so in the floating
> point form) but the interest rate calculation is a bit more
> natural in the floating point form.

But interest calculations for part periods are *far* from natural.
In my opinion, my formulation is clear, concise, and fast. *And*
you need proper scaling.
>
> Not different than the problems associated with integer
> calculations (you have the advantage that a truncation is
> a nop with integer math. Given that you need to implement
> general rounding rules in any case, I do not see this as
> a big advantage)

See above.
>
> If you do not implement the correct algorithm you get
> the wrong answer (Duh!).

Well, you did not do that either.
> You have to round after every step with integer arithmetic.
> That you can omit this step when the rounding rule is truncation
> does not hold in general.

You also have to round after every step with f-p arithmetic. And it
does not matter whether the rounding rule is round to nearest, truncate
or the banker's rule. You have to do it. And I admit that the banker's
rule is not easy in integer arithmetic, but it is also not easy in f-p
arithmetic. Let me make an attempt. a is the amount in cents, interest
is the interest per 1000 per period.
i = a * interest / 500;             /* interest in half-cents, truncated */
if(i & 1) {                         /* the interest is at least x.5 cents */
    if(i * 500 < a * interest) {    /* something was truncated: more than half */
        i++;                        /* round up */
    } else if(i & 2) {              /* exactly half, truncated cents are odd */
        i++;                        /* round up to the even cent */
    }
}
a += (i >> 1);
This, indeed, can be done better in IEEE f-p:
i = rint(a * interest / 1000);
a += i;

But you are still emulating integer arithmetic in f-p.
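
For comparison, a minimal sketch of the banker's rule written with an explicit remainder instead of the half-cent bit tests. The function name is hypothetical; a is the amount in cents and interest is the rate per 1000 per period, both non-negative.

```c
/* Interest on a cents at `interest` per mille, rounded to whole cents
   with exact halves going to the even cent (banker's rule). */
long long interest_half_even(long long a, long long interest)
{
    long long num = a * interest;
    long long q = num / 1000;    /* whole cents, truncated */
    long long r = num % 1000;    /* remainder in thousandths of a cent */

    if (r > 500 || (r == 500 && (q & 1)))
        q++;                     /* above half, or exactly half with q odd */
    return q;
}
```

The new amount is then a + interest_half_even(a, interest). Exact ties are detected with an integer comparison, so no floating point rounding mode is involved.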
 
