extend methods of decimal module

Mark H. Harris

. . . Calling Decimal on a float performs an exact binary to
decimal conversion. Your reasoning essentially assumes that every
float should be interpreted as an approximate representation for a
nearby decimal value.

That is the whole point exactly. Yes, the exact binary to decimal conversion for float is the problem precisely. But my solution does not assume any such thing... because the decimal module is "advertised" to support what I'm doing. In other words, decimal correctly builds from a string literal in any case; even intermediary values. Well, the digits past 16 (for a double) aren't valid anyway... and the ones before that (when passed to decimal as a string) get correctly created as a decimal object.

But here is the other point... I am not planning on passing *any* of these functions a float... my system that uses dmath uses strings only, or decimals. str(Decimal) works, as does Decimal(str()). So, I'm not really interested in floats at all... but, and here's the big BUT, I'm expecting folks to use dmath.py from the console (as I plan to) and they are going to punch in *floats*. Why? Because it's natural.

It's just easier to type D(2.78) than Decimal('2.78').

Neither here nor there.... but the frustration is the fact that floats are so 1982. Know what I mean? This is the 21st century, and you know what, we have got to get past this:
In other words, we need to move to a numeric system of computation (decimal is a good start) that does not keep track of real values as IEEE floats. Back in the day, that was ok... today, it's unacceptable.

So, Krah has a really nice (and fast) module that will help at least the python community (and users) to move into the 21st century. My hat is off to Stefan, that's for sure.

I am going to keep playing with this, and making the changes you have suggested... I'll put the code back up there on code.google and see if I can make the repository work like it's supposed to.

Thanks again for your assistance, I really appreciate it, Oscar.
 
Steven D'Aprano

Decimal does not keep 0.1 as a floating point format (regardless of
size) which is why banking can use Decimal without having to worry about
the floating formatting issue... in other words, 0.0 is not stored in
Decimal as any kind of floating value... its not rounded.... it really
is Decimal('0.1').

I'm sorry, but that is incorrect. Decimal is a floating point format,
same as float. Decimal uses base 10, so it is a better fit for numbers we
write out in base 10 like "0.12345", but otherwise it suffers from the
same sort of floating point rounding issues as floats do.

py> a = Decimal("1.1e20")
py> b = Decimal("1.1e-20")
py> assert b != 0
py> a + b == a
True
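
The rounding in the session above comes from the context precision (28 significant digits by default) rather than from base 10 itself. A minimal sketch, assuming the stock decimal module; the 50-digit precision is picked only for illustration:

```python
from decimal import Decimal, localcontext

a = Decimal("1.1e20")
b = Decimal("1.1e-20")

# The exact sum needs 42 significant digits, so a 28-digit
# context rounds b away entirely.
with localcontext() as ctx:
    ctx.prec = 28
    lost = (a + b == a)

# With enough working precision the sum is exact and b survives.
with localcontext() as ctx:
    ctx.prec = 50
    kept = (a + b != a)

print(lost, kept)
```

So Decimal lets you move the rounding point, which binary floats do not, but it does not remove rounding.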


In the case of 0.1 (I assume your "0.0" above was a typo), it is a
floating point value. You can inspect the fields' values like this:

py> x = Decimal("0.1")
py> x.as_tuple()
DecimalTuple(sign=0, digits=(1,), exponent=-1)

There's a sequence of digits, and an exponent that tells you where the
decimal point goes. That's practically the definition of "floating
point". In Python 3.2 and older, you can even see those fields as non-
public attributes:

py> x._int
'1'
py> x._exp
-1

(In Python 3.3, the C implementation does not allow access to those
attributes from Python.)

This is perhaps better illustrated with a couple of other examples:


py> Decimal('1.2345').as_tuple()
DecimalTuple(sign=0, digits=(1, 2, 3, 4, 5), exponent=-4)

py> Decimal('1234.5').as_tuple()
DecimalTuple(sign=0, digits=(1, 2, 3, 4, 5), exponent=-1)
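
The two tuples above differ only in the exponent, which can be checked directly. A small sketch using only public Decimal methods (the variable names are mine):

```python
from decimal import Decimal

x = Decimal("1.2345")
y = Decimal("1234.5")

# Same coefficient digits; only the exponent differs.
same_digits = x.as_tuple().digits == y.as_tuple().digits

# scaleb() adjusts the exponent without touching the digits.
shifted = x.scaleb(3)

# A Decimal can also be rebuilt from (sign, digits, exponent).
rebuilt = Decimal((0, (1, 2, 3, 4, 5), -1))

print(same_digits, shifted, rebuilt)
```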


[...]
The reason is that Decimal(.1) stores the erroneous float in the Decimal
object including the float error for .1 and D(.1) works correctly
because the D(.1) function in my dmath.py first converts the .1 to a
string value before handing it to Decimal's constructor(s)

That *assumes* that when the user types 0.1 as a float value, they
actually intend it to have the value of 1/10 rather than the exact value
of 3602879701896397/36028797018963968. That's probably a safe bet, with a
number like 0.1, typed as a literal.
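
The exact value quoted above can be recovered from the float itself. A short sketch, using only the standard fractions and decimal modules:

```python
from decimal import Decimal
from fractions import Fraction

exact = Fraction(0.1)       # exact rational value of the binary double
as_decimal = Decimal(0.1)   # the same value written out in base 10
intended = Decimal("0.1")   # what the user most likely meant

print(exact)
print(as_decimal)
print(intended)
```

Both Fraction(0.1) and Decimal(0.1) are exact conversions; only the string form yields one tenth.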

But how about this number?

py> x = 3832879701896397/36028797218963967
py> Decimal(x)
Decimal('0.10638378180104603176747701809290447272360324859619140625')
py> Decimal(str(x))
Decimal('0.10638378180104603')

Are you *absolutely* sure that the user intended x to have the second
value rather than the first? How do you know?

In other words, what you are doing is automatically truncating calculated
floats at whatever string display format Python happens to use,
regardless of the actual precision of the calculation. That happens to
work okay with some values that the user types in by hand, like 0.1. But
it is a disaster for *calculated* values.
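
To make that concrete, here is a sketch with a hypothetical stand-in for D() (I am assuming it does nothing more than Decimal(str(x)), as described earlier in the thread):

```python
from decimal import Decimal

def D(x):
    """Hypothetical stand-in for the dmath.py helper: Decimal via str()."""
    return Decimal(str(x))

typed = D(0.3)           # literal typed by hand: looks fine
computed = D(0.1 + 0.2)  # calculated float: the binary error leaks through

print(typed, computed)
```

The function sees only a float in both cases and cannot tell them apart.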

Unfortunately, there is no way for your D function to say "only call str
on the argument if it is a floating point literal typed by the user".

But what you can do is follow the lead of the decimal module, and leave
the decision up to the user. The only safe way to avoid *guessing* what
value the caller wanted is to leave the choice of truncating floats up to
them. That's what the decimal module does, and it is the right decision.
If the user passes a float directly, they should get *exact conversion*,
because you have no way of knowing whether they actually wanted the float
to be truncated or not. If they do want to truncate, they can pass it to
string themselves, or supply a string literal.
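
One way to leave the choice with the caller is an explicit opt-in. This is only a sketch; to_decimal and its round_via_str flag are hypothetical names, not part of decimal or dmath.py:

```python
from decimal import Decimal

def to_decimal(value, *, round_via_str=False):
    """Hypothetical helper: the caller decides how a float is converted."""
    if isinstance(value, float):
        # Exact conversion by default; rounding through str() only on request.
        return Decimal(str(value)) if round_via_str else Decimal(value)
    # Strings, ints and Decimals go straight to the constructor.
    return Decimal(value)

print(to_decimal(0.1))
print(to_decimal(0.1, round_via_str=True))
print(to_decimal("2.78"))
```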
 
Chris Angelico

It's just easier to type D(2.78) than Decimal('2.78').

It's easier to type 2.78 than 2.718281828, too, but one of them is
just plain wrong. Would you tolerate using 2.78 for e because it's
easier to type? I mean, it's gonna be close.

Create Decimal values from strings, not from the str() of a float,
which first rounds in binary and then rounds in decimal.
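
A sketch of the two roundings, assuming a literal with more digits than a double can hold (the 19-digit value is just an example):

```python
from decimal import Decimal

literal = "0.1234567890123456789"   # 19 significant digits as typed

# Straight from the string: every typed digit is kept.
from_string = Decimal(literal)

# Via float: rounded to the nearest double, then to the short repr.
from_float = Decimal(str(float(literal)))

print(from_string)
print(from_float)
```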

ChrisA
 
Mark H. Harris

Decimal uses base 10, so it is a better fit for numbers we
write out in base 10 like "0.12345", but otherwise it suffers from the
same sort of floating point rounding issues as floats do.
py> Decimal('1.2345').as_tuple()
DecimalTuple(sign=0, digits=(1, 2, 3, 4, 5), exponent=-4)

py> Decimal('1234.5').as_tuple()
DecimalTuple(sign=0, digits=(1, 2, 3, 4, 5), exponent=-1)

Steven, thank you, your explanation here is splendid, and has cleared up some of my confusion about Decimal, and given me a tool besides ('preciate it!). I had not investigated .as_tuple()... nice.

Take a look at this: From IDLE

Big difference, yes? You have hit the nail on the head, because as you say, it is very unfortunate that the function does not know whether it has been typed in by hand (a big problem) or whether it comes from an intermediate calculated result (maybe an even bigger problem). rats().

So, I am thinking I need two mods... maybe an idmath.py for interactive sessions, and then dmath.py for running within my scientific scripts... ??

Thanks for your input.

kind regards,
 
Chris Angelico

So, I am thinking I need two mods... maybe an idmath.py for interactive sessions, and then dmath.py for running within my scientific scripts... ??

No; the solution is to put quotes around your literals in interactive
mode, too. There's no difference between interactive and script mode,
and adding magic to interactive mode will only cause confusion.

Alternatively, there is another solution that's been posited from time
to time: Decimal literals. We currently have three forms of numeric
literal, which create three different types of object:
<class 'complex'>

If we had some other tag, like 'd', we could actually construct a
Decimal straight from the source code. Since source code is a string,
it'll be constructed from that string, and it'll never go via float.
Something like this:
<class 'decimal.Decimal'>

which currently is a SyntaxError, so it wouldn't collide with
anything. The question is how far Python wants to bless the Decimal
type with syntax - after all, if Decimal can get a literal notation,
why can't Fraction, and why can't all sorts of other types? And that's
a huge can of worms.
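
For reference, a small sketch of the current state of affairs: the three literal forms and their types, plus a check that a trailing 'd' is rejected by the compiler today (the 'd' suffix is the hypothetical syntax discussed above, not an existing feature):

```python
# The three existing numeric literal forms and the types they create.
literal_types = [type(1), type(1.0), type(1j)]
print(literal_types)

# A 'd' suffix is not valid syntax, so compiling it fails.
try:
    compile("x = 0.1d", "<example>", "exec")
    accepted = True
except SyntaxError:
    accepted = False
print(accepted)
```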

ChrisA
 
Mark H. Harris

Create Decimal values from strings, not from the str() of a float,
which first rounds in binary and then rounds in decimal.

Thanks Chris... another excellent point... ok, you guys have about convinced me (which is spooky) but, hey, I'm teachable... what is the best strategy then? Many of the functions of my dmath.py are algorithms which calculate infinite series to convergence out there at some number of digits of precision.... do I make the assumption that all functions will take a string as argument and then let interactive users bear the responsibility to enter a string or decimal... avoiding floats... or use try and toss an error if the rules are not followed, or what? I do see that I'm trying to solve a problem the wrong way... just not sure what is the better approach.

If you get a chance, take a look at the dmath.py code on:

https://code.google.com/p/pythondecimallibrary/

I got the repository working correctly for me, and the files can be viewed on-line... it's a long script, but not that hard to look through because all the functions pretty much work the same... when you've seen one converging series function in python you've seen them all!

Thanks again, Chris.
 
Chris Angelico

do I make the assumption that all functions will take a string as argument and then let interactive users bear the responsibility to enter a string or decimal... avoiding floats...

Just have your users pass in Decimal objects. They can construct them
however they wish.
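
If a library wants to enforce that rather than just document it, the decimal module (3.3 and later) has a FloatOperation signal that can be trapped. A sketch, where strict_decimal is a hypothetical wrapper name:

```python
from decimal import Decimal, FloatOperation, localcontext

def strict_decimal(value):
    """Hypothetical wrapper: refuse floats instead of guessing."""
    with localcontext() as ctx:
        # With the trap set, Decimal(float) raises instead of converting.
        ctx.traps[FloatOperation] = True
        return Decimal(value)

try:
    strict_decimal(2.78)
    rejected = False
except FloatOperation:
    rejected = True

print(rejected, strict_decimal("2.78"))
```

Explicit conversions through Decimal.from_float() stay available even with the trap set, so the caller can still ask for the exact value on purpose.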

ChrisA
 
Steven D'Aprano

If we had some other tag, like 'd', we could actually construct a
Decimal straight from the source code. Since source code is a string,
it'll be constructed from that string, and it'll never go via float.

Now that Python has a fast C implementation of Decimal, I would be happy
for Python 4000 to default to decimal floats, and require special syntax
for binary floats. Say, 0.1b if you want a binary float, and 0.1 for a
decimal.

But for now, backwards-compatibility requires that the default floating
point type remains binary float. But we could maybe agitate for a 1.234d
Decimal literal type. Care to write a PEP?

:)

The question is how far Python wants to bless the Decimal type with
syntax - after all, if Decimal can get a literal notation, why can't
Fraction, and why can't all sorts of other types? And that's a huge can
of worms.

I like Fractions, but I don't think they're important enough for the
average users to require literal notation.
 
Chris Angelico

Now that Python has a fast C implementation of Decimal, I would be happy
for Python 4000 to default to decimal floats, and require special syntax
for binary floats. Say, 0.1b if you want a binary float, and 0.1 for a
decimal.

Maybe, but I believe the cdecimal module is still slower than typical
floating point. There'd also be considerations regarding NumPy and how
you'd go about working with an array of non-integer values, and so on.
Certainly this will be an extremely reasonable topic of discussion
once there's any notion of a Py4K on the cards. There'll be arguments
on both sides.

But for now, backwards-compatibility requires that the default floating
point type remains binary float. But we could maybe agitate for a 1.234d
Decimal literal type. Care to write a PEP?

:)

Heh. Strong consideration here: it would mean importing the decimal
module on startup.
0.0

A dummy import (when it's already loaded) is so fast that it's
immeasurable, but four and a half seconds to load up decimal? This is
3.4.0b2 on Windows, btw. It was a lot quicker on my Linux box,
probably because the OS or disk cache had the file. So maybe it
wouldn't be too bad in practice; but it's still a cost to consider.
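
For anyone who wants to repeat that kind of measurement, here is one way to do it; this is a sketch and not the timing code used above. It starts a fresh interpreter each time so that nothing is already loaded in-process, and startup_seconds is a made-up helper name:

```python
import subprocess
import sys
import time

def startup_seconds(code):
    """Wall-clock time for a fresh interpreter to run `code`."""
    start = time.perf_counter()
    subprocess.run([sys.executable, "-c", code], check=True)
    return time.perf_counter() - start

# The difference between the two is roughly the cost of the import.
baseline = startup_seconds("pass")
with_decimal = startup_seconds("import decimal")
print(f"baseline {baseline:.3f}s, with decimal {with_decimal:.3f}s")
```

Results depend heavily on the OS file cache, so a first run and a repeat run can differ a lot.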

I like Fractions, but I don't think they're important enough for the
average users to require literal notation.

Yeah, but where do you draw the line? Either decimal.Decimal becomes a
built-in type, or there needs to be a system for constructing literals
of non-built-in types. And if Decimal becomes built-in, then why that
and not <<insert type name here>>?

Also, if Decimal becomes a built-in type, does that affect the numeric tower?

ChrisA
 
Wolfgang Maier

Mark H. Harris said:
Thanks Chris... another excellent point... ok, you guys have about
convinced me (which is spooky) but,
hey, I'm teachable... what is the best strategy then?

If you get a chance, take a look at the dmath.py code on:

https://code.google.com/p/pythondecimallibrary/

I got the repository working correctly for me, and the files can be viewed
on-line ... its a long script, but
not that hard to look through because all the functions pretty much work
the same... when you've seen one
converging series function in python you've seen them all!

Thanks again, Chris.

Hi Mark,
I quickly skimmed through your code and I don't think there is a need for
your D() function at all. I thought it was a class adding some extra
functionality to Decimal (that's because you used a capital letter for its
name), but now I realized that it's just a function returning a Decimal from
the string representation of its argument.
Since by now, I guess, we all agree that using the string representation is
the wrong approach, you can simply use Decimal instead of D() throughout
your code.
Best,
Wolfgang
 
casevh

No... was not aware of gmpy2... looks like a great project! I am wondering
why it would be sooo much faster?

For multiplication and division of ~1000 decimal digit numbers, gmpy2 is ~10x
faster. The numbers I gave were for ln() and sqrt().
I was hoping that Stefan Krah's C accelerator was using FFT fast fourier
transforms for multiplication at least...
.. snip ..
I have not looked at Krah's code, so not sure what he did to speed things
up... (something more than just writing it in C I would suppose).

IIRC, cdecimal uses a Number Theory Transform for multiplication of very large
numbers. It has been a while since I looked so I could be wrong.

casevh
 
Steven D'Aprano

Maybe, but I believe the cdecimal module is still slower than typical
floating point.

Yes, cdecimal is about 10 times slower than binary floats.

But the point is, for most applications, that will be plenty fast enough.
And for those where it isn't, there's always binary.

There'd also be considerations regarding NumPy and how
you'd go about working with an array of non-integer values, and so on.

I would expect that numpy 4000 would still use binary floats internally,
and simply do a one-off conversion of decimals to floats when you
initialise the array. Converting decimals to floats is no less accurate
than converting base-ten strings to floats, so there's no loss there.
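
A quick check of that claim on a handful of sample values (the samples are arbitrary, chosen from numbers mentioned in this thread):

```python
from decimal import Decimal

samples = ["0.1", "2.78", "1.1e-20", "0.10638378180104603"]

# float(Decimal(s)) should land on the same double as float(s).
agree = all(float(Decimal(s)) == float(s) for s in samples)
print(agree)
```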

[...]
Heh. Strong consideration here: it would mean importing the decimal
module on startup.

0.0

A dummy import (when it's already loaded) is so fast that it's
immeasurable, but four and a half seconds to load up decimal? This is
3.4.0b2 on Windows, btw. It was a lot quicker on my Linux box, probably
because the OS or disk cache had the file. So maybe it wouldn't be too
bad in practice; but it's still a cost to consider.

I would expect that by the time Python 4000 has a concrete
implementation, that figure will be a lot lower. Either due to software
optimisations, or just the general increase in speed in computer hardware.


Yeah, but where do you draw the line? Either decimal.Decimal becomes a
built-in type,

That's where you draw the line. Binary floats for speed, decimals for
compatibility with school arithmetic. (Well, mostly compatible.)

or there needs to be a system for constructing literals
of non-built-in types. And if Decimal becomes built-in, then why that
and not <<insert type name here>>?

'Cos we have ten fingers and we count in decimal :p

Also, if Decimal becomes a built-in type, does that affect the numeric
tower?

I don't see why whether the type is built-in or not should affect its
position in the numeric tower. (I would expect that by the time Python
4000 comes around, Decimal will be nicely integrated in the tower.)
 
Chris Angelico

'Cos we have ten fingers and we count in decimal :p

We talk in words and characters, so we have an inbuilt Unicode type.
We count in decimal using Arabic numerals, so we have an inbuilt
Decimal type. We also learn, in grade school, to manipulate vulgar
fractions, so should Fraction be inbuilt? And we use transcendental
numbers, too. And ordered mappings - most real-world interpretations
of "dictionary" include that it's sorted alphabetically. Not all of
them need to be inbuilt.

I don't see why whether the type is built-in or not should affect its
position in the numeric tower. (I would expect that by the time Python
4000 comes around, Decimal will be nicely integrated in the tower.)

Well, it's more important if it's the default (and asking for an
explicit float if you want it), but it would still be a bit odd for
just one of the built-in numeric types to not have a place in an
otherwise-tidy tower. But definitely, if it's the default, we have to
ask: what about complex numbers? Are they now two Decimals? Can we get
complex floats? And does all this mean there's a massive duplication
going on? What happens if you sum() a Decimal, a float, a Decimal
complex, and a float complex? What's the resulting type? All these
questions would have to be answered.
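
As things stand today, Decimal sits mostly outside the tower, which a short sketch can show (standard library only; variable names are mine):

```python
import numbers
from decimal import Decimal

d = Decimal("0.1")

# Decimal is registered as a Number, but not as a Real.
is_number = isinstance(d, numbers.Number)
is_real = isinstance(d, numbers.Real)

# Arithmetic that mixes Decimal and float is refused outright.
try:
    sum([d, 0.1])
    mixed_ok = True
except TypeError:
    mixed_ok = False

print(is_number, is_real, mixed_ok)
```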

That said, though, I would support the addition of a Decimal literal,
and start encouraging its use. Python startup performance is a cost,
but maybe the cost of Decimal could be either reduced or deferred till
first use, so that's not so obvious. Being able to tell people "Just
type 0.1d and it'll be more accurate at the expense of being slower"
would be a significant gain.

But someone else can champion that PEP :)

ChrisA
 
Mark H. Harris

Since by now, I guess, we all agree that using the string representation is
the wrong approach, you can simply use Decimal instead of D() throughout
your code.

Wolfgang


hi Wolfgang, ...right... I'm going to clean it up. Thanks, 'preciate it.
 
Mark H. Harris


Yes. ... and for clarification back to one of my previous comments, when I refer to 'float' I am speaking of the IEEE binary floating point representation built-in everywhere... including the processor! ... not the concept of tracking a floating point, say in decimal for instance.

Isn't it amazing... the IEEE binary float or double is so entrenched (including the processor) that even though it has inherent problems... we are kinda stuck with it for a while.

I have been thinking that we need extended instructions in the processor (similar to MMX) to accelerate decimal floating point arithmetic, as in the decimal module. Back in the day when memory was expensive binary floats made sense... but today there is no reason to continue to stick with that limitation.

And on the other hand, think of all the amazing things folks have done with floats and doubles... all these years.

marcus
 
Mark H. Harris

Are you aware that IEEE 754 includes specs for decimal floats? :)

Yes. I am from back in the day... way back... so IEEE 754-1985 is what I have been referring to.

IEEE 854-1987 and the generalized IEEE 754-2008 include the specs for decimal floating point. Everyone is thinking in the same general direction... it's kinda like the Y2K problem: a stupid limit that, because of entrenchment, takes forever to resolve as technology moves forward.

marcus
 
Mark H. Harris

Now that Python has a fast C implementation of Decimal, I would be happy
for Python 4000 to default to decimal floats, and require special syntax
for binary floats. Say, 0.1b if you want a binary float, and 0.1 for a
decimal.

Steven

Just a side note on how fast... Stefan Krah's performance specs state 120x improvement on many multiplication computations (like PI for instance)... well, it's not hype.

On my P7350 dual core 2 GHz Intel box (2009 Mac mini) running GNU/Linux, I used the piagm(n) AGM routine from my dmath.py library to benchmark against my own C routines, BC, and a couple of other math packages. The results were phenomenal... my processor is a low-end gem as compared to modern SOTA processors out there, and even yet:

1 million digits of PI --- 13 minutes
10 million digits of PI --- 3 hours 55 minutes

Those were new digit/time PRs for me, by-the-by... and the other methods I've used don't even come close... so, Stefan is doing some kind of transform in "decimal" over and above just compiling the extension in C that is really speeding things up quite a bit.

(that was just a random side note, sorry)
 
Mark H. Harris

Oh, rats(), I forgot the comparison....

Py3.2 [ 1 million digits : 21 hours 21 minutes ] --> Py3.3.4 [ 1 million digits : 13 minutes ]

... that is astounding. All I did was install 3.3.4, no change in the AGM routine.
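
For readers who have not seen an AGM routine, here is a minimal sketch of the Gauss-Legendre iteration on top of decimal. To be clear, this is not the piagm(n) code from dmath.py; pi_agm, the 10 guard digits and the iteration count are my own assumptions:

```python
from decimal import Decimal, localcontext

def pi_agm(digits):
    """Sketch: pi to `digits` significant digits via Gauss-Legendre."""
    with localcontext() as ctx:
        ctx.prec = digits + 10          # guard digits (assumed margin)
        a = Decimal(1)
        b = Decimal(1) / Decimal(2).sqrt()
        t = Decimal(1) / Decimal(4)
        p = Decimal(1)
        # Correct digits roughly double per pass, so about
        # log2(digits) passes are enough.
        for _ in range(max(1, digits.bit_length())):
            a_next = (a + b) / 2
            b = (a * b).sqrt()
            t -= p * (a - a_next) ** 2
            a = a_next
            p *= 2
        pi = (a + b) ** 2 / (4 * t)
        ctx.prec = digits
        return +pi                      # unary plus rounds to ctx.prec

print(pi_agm(30))
```

Every pass is dominated by full-precision multiplications, divisions and one square root, which would explain why a faster multiply in the C implementation shows up so dramatically in the timings.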

Cheers
 
Anssi Saari

Terry Reedy said:
As one of 'them', thank you for the feedback. There are still some
bugs, but I hit them seldom enough that I am now looking at
enhancement issues.

I recently watched a presentation by Jessica McKellar of PSF about what
Python needs to stay popular. Other than the obvious bits (difficulties
or limited support of Python on major platforms like Windows and mobile)
the slight lack of perfection in IDLE was mentioned. Specifically the
old blog post titled "The Things I Hate About IDLE That I Wish Someone
Would Fix" at
http://inventwithpython.com/blog/20...ate-about-idle-that-i-wish-someone-would-fix/

It lists 17 issues and some more in the comments. Are those things
something that could be considered? Or have maybe been done already? I'm
not an IDLE user so this is mostly an academic interest on my part.
 
