Bizarre Floating point errors in Ruby? Serious bug?

  • Thread starter space.ship.traveller

space.ship.traveller

Hi,

I've come across a strange bug in ruby (running 1.8.6 on Linux, but
confirmed also in 1.8.5 on an older Mac).
=> -14013

The desired result is, obviously, -14014. Strangely enough:
=> -14014.0

And:
=> -14013

HOWEVER......
=> -14014

Is this a strange behaviour or what?

Workaround is like this:
=> -14014

Can someone please confirm this strange behaviour?... I know it is
also happening for some other numbers too:

(0..1000).each do |n|
  n = n.to_f + (n.to_f / 100)
  k = (n * 100)
  if k != k.to_s.to_f
    puts "K does not equal itself!? #{k} != #{k.to_s.to_f}"
  end

  if k.to_i != k.to_s.to_i
    puts "Error with k = #{k}!? #{k.to_i} != #{k.to_s.to_i}"
  end
end

I know a lot about floating point numbers, but this is really bizarre
behaviour.

Expected behaviour would be for no errors in the above test example. I
don't expect floating point to be accurate (this is obvious), but I do
expect floating point to be consistent (whole number floating point is
guaranteed to be accurate with IEEE floating point standard right??).

Thanks to anyone who can test this out.

Kind Regards,
Samuel Williams
 
Cameron McBride

Samuel,

Nothing is amiss. You're just not interpreting the floating point correctly.

Hi,

I've come across a strange bug in ruby (running 1.8.6 on Linux, but
confirmed also in 1.8.5 on an older Mac).

=> -14013

Not strange at all. Look at the value printed with "%.16f" % (-140.14 * 100):
-14013.9999999999981810

So, when you truncate with the #to_i, you get -14013 -- perfectly
consistent and logical.
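
A minimal sketch of that point, assuming a Ruby whose Floats are IEEE 754 doubles (the variable names are only illustrative): the format string reveals the digits the default output hides, and #to_i simply drops the fractional part.

```ruby
# Illustrative names only; assumes IEEE 754 double-precision Floats.
product = -140.14 * 100

# Sixteen decimal places expose the digits the default output rounds away.
full_digits = "%.16f" % product

# Float#to_i discards the fractional part (truncates toward zero).
truncated = product.to_i

puts full_digits  # -14013.9999999999981810
puts truncated    # -14013
```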

Cameron
 
Alex Young

Hi,

I've come across a strange bug in ruby (running 1.8.6 on Linux, but
confirmed also in 1.8.5 on an older Mac).

=> -14013

The desired result is, obviously, -14014. Strangely enough:

=> -14014.0

This might explain it:

irb(main):007:0> sprintf("%0.12f" % (-140.14 * 100))
=> "-14013.999999999998"

And:

=> -14013

HOWEVER......

=> -14014

Is this a strange behaviour or what?

Workaround is like this:

=> -14014

Or:

irb(main):014:0> (-140.14 * 100).round
=> -14014

Can someone please confirm this strange behaviour?... I know it is
also happening for some other numbers too:

(0..1000).each do |n|
  n = n.to_f + (n.to_f / 100)
  k = (n * 100)
  if k != k.to_s.to_f
    puts "K does not equal itself!? #{k} != #{k.to_s.to_f}"
  end

  if k.to_i != k.to_s.to_i
    puts "Error with k = #{k}!? #{k.to_i} != #{k.to_s.to_i}"
  end
end

I know a lot about floating point numbers, but this is really bizarre
behaviour.

Expected behaviour would be for no errors in the above test example. I
don't expect floating point to be accurate (this is obvious), but I do
expect floating point to be consistent (whole number floating point is
guaranteed to be accurate with IEEE floating point standard right??).

The product isn't, though: -140.14 is not a whole number and has no
exact binary representation, so multiplying it by 100 need not give
exactly -14014. It's been a while since I knew lots about floating
point representation, but that's what this test case is telling me.
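
To make the whole-number point concrete, here is a small sketch, assuming IEEE 754 doubles (the variable names are made up for illustration). Whole numbers of this size and their product are stored exactly; the decimal literal -140.14 is not.

```ruby
# Whole numbers of this size are stored exactly, and so is their product.
whole_product_exact = (140.0 * 100.0 == 14000.0)

# Float#to_r returns the exact value the Float actually stores.
stored = (-140.14).to_r
# The value the decimal literal was meant to denote.
intended = Rational(-14014, 100)
literal_exact = (stored == intended)

puts whole_product_exact  # true
puts literal_exact        # false
```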
 
space.ship.traveller

Ah, that makes sense. I kind of suspected it may be something like
this. However, it seems awfully strange that this is a good default
behaviour (from IRB, or Ruby in general)?

For example, if I do the same in python:
-14013.999999999998

vs Ruby:

irb(main):005:0> (-140.14*100)
=> -14014.0

I think that Python is much better in this situation - -14014.0 is
obviously not the correct value of the floating point number.

Even PHP (!) has more "reliable" behaviour in this case (Possibly -
looks as if the return type is getting formatted as an int, don't
quote me on PHP being reliable... ^_^):

<? echo (-140.14*100.0) ?>
-14014

To be honest, I am not sure what the desirable behaviour is. But I can
say that the Python behaviour in this case is much clearer. To say "=>
-14014.0" is not clear at all, at the very least. A number written as
xyz.0 indicates in mathematics that it is accurate to one d.p., no?
If the number is actually xyz.abc, then we should be aware of that,
especially in something such as IRB, where people debug algorithms,
or with puts, which is also often used for debugging algorithms...

This also causes special case behaviour... when you use .to_s we get a
different value completely - i.e. if we write k to a file, then read
it again, to_i will behave differently.. is this a good semantic to
have in place?

Regards,
Samuel
 
Cameron McBride

Ah, that makes sense. I kind of suspected it may be something like
this. However, it seems awfully strange that this is a good default
behaviour (from IRB, or Ruby in general)?

For example, if I do the same in python:

-14013.999999999998

vs Ruby:

irb(main):005:0> (-140.14*100)
=> -14014.0


This has nothing to do with the languages themselves. Both python and
ruby will agree. The difference is *completely* with how it's output
via the prompt. Try some of the following:

irb(main):001:0> "%g" % (-140.14*100)
=> "-14014"
irb(main):002:0> "%f" % (-140.14*100)
=> "-14014.000000"
irb(main):003:0> "%.10f" % (-140.14*100)
=> "-14014.0000000000"
irb(main):004:0> "%.16f" % (-140.14*100)
=> "-14013.9999999999981810"

Printed format is NOT the same as internal format.

I think that Python is much better in this situation - -14014.0 is
obviously not the correct value of the floating point number.

Actually, Python and Ruby hold the same double-precision value here;
only the printed form differs.

Even PHP (!) has more "reliable" behaviour in this case (Possibly -
looks as if the return type is getting formatted as an int, don't
quote me on PHP being reliable... ^_^):

<? echo (-140.14*100.0) ?>
-14014

To be honest, I am not sure what the desirable behaviour is. But I can
say that the Python behaviour in this case is much clearer. To say "=>
-14014.0" is not clear at all, at the very least. A number written as
xyz.0 indicates in mathematics that it is accurate to one d.p., no?
If the number is actually xyz.abc, then we should be aware of that,
especially in something such as IRB, where people debug algorithms,
or with puts, which is also often used for debugging algorithms...

This also causes special case behaviour... when you use .to_s we get a
different value completely - i.e. if we write k to a file, then read
it again, to_i will behave differently.. is this a good semantic to
have in place?

This is known, and well discussed and misunderstood in many, many,
many places (I'm talking well outside of ruby here). You can also
create this problem in C. An ascii formatted number is *not* going to
be the same as the machine representation. It's sometimes a reason
people use a binary format.
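
A rough sketch of the text-versus-binary round trip, assuming IEEE 754 doubles. The 15-digit format is only an assumption chosen to mimic the rounded output seen in this thread, and the variable names are illustrative. With too few digits the text form reads back as a different number; 17 significant digits or a packed binary form read back unchanged.

```ruby
value = -140.14 * 100

# Fifteen significant digits lose information for this value ...
short_text = "%.15g" % value
# ... while seventeen are enough to recover a double exactly.
long_text = "%.17g" % value

# A binary format: "E" packs a double as 8 little-endian bytes.
bytes = [value].pack("E")

puts short_text                         # -14014
puts short_text.to_f == value           # false
puts long_text.to_f == value            # true
puts bytes.unpack("E").first == value   # true
```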

In any case, your confusion is understandable - but there isn't
anything inconsistent or problematic here. It's just a rehashing of
the standard floating point representation.

Cameron
 
space.ship.traveller

This is known, and well discussed and misunderstood in many, many,
many places (I'm talking well outside of ruby here). You can also
create this problem in C. An ascii formatted number is *not* going to
be the same as the machine representation. It's sometimes a reason
people use a binary format.

In any case, your confusion is understandable - but there isn't
anything inconsistent or problematic here. It's just a rehashing of
the standard floating point representation.

Cameron

I understand this is not a problem for just Ruby, but in this case the
Ruby behaviour is not as obvious as Python. I am going to guess that
it is IRB doing the formatting, so it may be something that can be
changed in IRB.

When you are debugging something, it is important that you see the
actual data. For example, I can forgive puts for writing "-14014.0",
but I can't forgive IRB for printing that as the value of a variable.
Even in GDB, we will not get this kind of result:

--- test.cpp ---
#include <iostream>

int main (int argc, char ** argv) {
    double v = (-140.14 * 100.0);

    std::cout << v << std::endl;
}

(gdb) break test.cpp:7
Breakpoint 1 at 0x1da8: file test.cpp, line 7.
(gdb) run
Starting program: /private/tmp/test
Reading symbols for shared libraries +++. done

Breakpoint 1, main (argc=1, argv=0xbffff9dc) at test.cpp:7
7 std::cout << v << std::endl;
(gdb) p v
$1 = -14013.999999999998 <=== This line is like "=> v"
(gdb) step
-14014 <=== This line is like "=> puts v"
8 }
(gdb)

So does this make clear what my issue is with this kind of formatting?
For example, in this situation, the debugger correctly shows me the
value, this is what I would expect:
=> -14013.999999999998 <=== Great!

We know the number isn't -14014.0 - which is mathematically incorrect
- this is the kind of information we need to know in a debugger.
=> -14014.0 <=== Acceptable

Having .0 on the output is not really a good default. The output
should be -14014 in this case, without any trailing .0 - as it stands,
this indicates that that number is accurate to 1dp. If numbers are
going to be formatted and rounded by default, best to do it correctly,
right?

Obviously, string representation is not accurate, I'm not stupid
enough to dispute that! However, I think it is important these things
are done consistently for the benefit of the programmer. Both GDB and
Python, and many other languages are consistently different from Ruby
in this respect.

I understand that you are trying to tell me that the number is the
same - I'm not arguing that, what I am arguing is that the way this is
revealed to the programmer is a problem.

Regards,
Samuel
 
MonkeeSage

I understand this is not a problem for just Ruby, but in this case the
Ruby behaviour is not as obvious as Python. I am going to guess that
it is IRB doing the formatting, so it may be something that can be
changed in IRB.

Float#to_s is doing the formatting. If you want it like python, do it
like this:

class Float
  def to_s
    "%.12f" % self
  end
end

(-140.14 * 100)
# => -14013.999999999998

Regards,
Jordan
 
space.ship.traveller

Float#to_s is doing the formatting. If you want it like python, do it
like this:

class Float
  def to_s
    "%.12f" % self
  end
end

(-140.14 * 100)
# => -14013.999999999998

Regards,
Jordan

Thanks for this useful information. After further investigation,
Python also has the same problem with its pretty printing:
-14014.0

Well, actually this is the correct answer, i.e. the result we are
looking for if we were using real math. But it is not the correct
way to round -14013.999999, which is the actual value that we get with
floating point math... so I guess it is hard to pick what is the ideal
behaviour... both have their pros and cons.

Rounding 9.9 to 10.0 is like rounding 99 to 100, but we consider that
only 1 significant figure is important in 100, ie 1xx. So, we are left
with 10.0 with 1SF, but .0 conveys the idea of 1DP correctness, which
is definitely not correct.

I'd need to consult a mathematician, but I still don't think the Ruby
behaviour is correct, mathematically. I'll post here again once I have
more information about the mathematical correctness of this issue.

Thanks for your great comments,
Samuel
 
Robert Klemme

2007/11/27 said:
Thanks for this useful information. After further investigation,
Python also has the same problem with its pretty printing:

-14014.0

Well, actually this is the correct answer, i.e. the result we are
looking for if we were using real math. But it is not the correct
way to round -14013.999999, which is the actual value that we get with
floating point math... so I guess it is hard to pick what is the ideal
behaviour... both have their pros and cons.

Rounding 9.9 to 10.0 is like rounding 99 to 100, but we consider that
only 1 significant figure is important in 100, ie 1xx. So, we are left
with 10.0 with 1SF, but .0 conveys the idea of 1DP correctness, which
is definitely not correct.

I'd need to consult a mathematician, but I still don't think the Ruby
behaviour is correct, mathematically. I'll post here again once I have
more information about the mathematical correctness of this issue.

You can save yourself the effort. No computational machine that uses
float math can guarantee to be "mathematically correct" in all
situations. The reason is fairly simple: machines have limited
resources to represent numbers, while in math there are a lot of real
numbers that cannot be represented with finite resources, with
pi and e being only the most famous ones. q.e.d.

In your case however BigDecimal is sufficient:

$ irb -r bigdecimal
irb(main):001:0> a=BigDecimal.new '-140.14'
=> #<BigDecimal:7ff96e2c,'-0.14014E3',8(12)>
irb(main):002:0> a*100
=> #<BigDecimal:7ff92034,'-0.14014E5',8(20)>
irb(main):003:0> (a*100).to_i
=> -14014
irb(main):004:0> (a*100).to_f
=> -14014.0
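
For readers on a newer Ruby, a hedged sketch of the same idea: as far as I know, later bigdecimal releases dropped BigDecimal.new in favour of the BigDecimal(...) conversion method, so that is what is assumed here. The variable names are illustrative.

```ruby
require "bigdecimal"

# BigDecimal(...) parses the decimal string exactly; no binary rounding occurs.
amount = BigDecimal("-140.14")
scaled = amount * 100

puts scaled.to_i                      # -14014
puts scaled == BigDecimal("-14014")   # true
```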

Kind regards

robert
 
Alex Young

Thanks for this useful information. After further investigation,
Python also has the same problem with its pretty printing:

-14014.0

Well, actually this is the correct answer, i.e. the result we are
looking for if we were using real math. But it is not the correct
way to round -14013.999999, which is the actual value that we get with
floating point math... so I guess it is hard to pick what is the ideal
behaviour... both have their pros and cons.

Rounding 9.9 to 10.0 is like rounding 99 to 100, but we consider that
only 1 significant figure is important in 100, ie 1xx. So, we are left
with 10.0 with 1SF, but .0 conveys the idea of 1DP correctness, which
is definitely not correct.

I'd need to consult a mathematician, but I still don't think the Ruby
behaviour is correct, mathematically. I'll post here again once I have
more information about the mathematical correctness of this issue.

#to_i doesn't round, it truncates. If you want to round, use #round.
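
A quick sketch of how the integer conversions differ on this value, assuming IEEE 754 doubles. Note that for a negative number, truncating moves toward zero, which is why #to_i and #floor disagree here.

```ruby
value = -140.14 * 100   # stored as roughly -14013.999999999998

puts value.to_i      # -14013 (truncates toward zero)
puts value.truncate  # -14013 (same as to_i)
puts value.ceil      # -14013 (next integer up)
puts value.floor     # -14014 (next integer down)
puts value.round     # -14014 (nearest integer)
```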
 
