str(bigint) is slow

Bryan

does anyone know how to make this faster? it seems that str(x) is the slow part.

import time

t1 = time.time()
x = 19 ** 314159    # compute the power
t2 = time.time()
y = str(x)          # convert to decimal string -- the slow step
t3 = time.time()
print y
t4 = time.time()
print t2-t1, t3-t2, t4-t3

which prints:

<snip a print out of a billion numbers>
3.78499984741 230.490999937 0.0700001716614


thanks,

bryan
 
Tim Roberts

Bryan said:
does anyone know how to make this faster? it seems that str(x) is the slow part.

Do you understand how much arithmetic is required to do this? You're
talking about more than 400,000 division/modulo operations on a multi-byte
number that is 166,000 bytes long. I, for one, am not surprised that it
takes a long time.
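Those figures are easy to verify; a quick back-of-the-envelope check (a
sketch in modern Python 3 syntax, unlike the Python 2 of the thread):

import math

e = 314159
print(round(e * math.log10(19)))     # about 401,700 decimal digits
print(round(e * math.log2(19) / 8))  # about 166,800 bytes of magnitude

The naive conversion does on the order of one division per output digit,
which is where the 400,000+ divmod operations come from.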
 
David Eppstein

Tim Roberts said:
Do you understand how much arithmetic is required to do this? You're
talking about more than 400,000 division/modulo operations on a multi-byte
number that is 166,000 bytes long. I, for one, am not surprised that it
takes a long time.

It doesn't have to be that slow -- you could do a single div/mod by
10**(n/2) (n=#digits of input) to split it into two numbers of roughly
half the number of digits, and continue recursively in both halves. The
bottleneck would be the first div/mod -- Python currently uses Karatsuba
for that, right?
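A minimal sketch of that recursion (hypothetical helper name, not code
from the thread; Python 3 syntax):

def dec_str(x):
    # Divide-and-conquer decimal conversion of a non-negative int:
    # split around 10**(n//2), then recurse on both halves.
    if x.bit_length() < 64:
        return str(x)                   # small: the builtin is fine
    n = int(x.bit_length() * 0.30103)   # digit-count estimate (log10 2)
    half = n // 2
    hi, lo = divmod(x, 10 ** half)
    return dec_str(hi) + dec_str(lo).zfill(half)  # keep lo's leading zeros

As noted above, the top-level divmod dominates; with schoolbook division
it is still quadratic, hence the question about Karatsuba.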
 
Tim Peters

[David Eppstein]
It doesn't have to be that slow -- you could do a single div/mod by
10**(n/2) (n=#digits of input) to split it into two numbers of roughly
half the number of digits, and continue recursively in both halves. The
bottleneck would be the first div/mod -- Python currently uses Karatsuba
for that, right?

Computing the power in a decimal-friendly base to begin with runs much
faster than that (and I posted code the last time this came up, a
couple of weeks ago).

Python uses Karatsuba for multiplication, not division. Karatsuba may
or may not trigger during the computation of 10**(n/2) (depending on
how large n is), but never triggers during division.
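Tim's posted code isn't reproduced in this thread, but the idea he
describes can be sketched like so (hypothetical names, a sketch only;
limbs of 5000 decimal digits, matching the threshold he mentions below;
Python 3 syntax):

import sys

BASE_DIGITS = 5000            # chunks this size still convert fast via str()
BASE = 10 ** BASE_DIGITS      # each limb holds 5000 decimal digits

if hasattr(sys, "set_int_max_str_digits"):      # lift the 3.11+ guard
    sys.set_int_max_str_digits(BASE_DIGITS + 1)

def limb_mul(a, b):
    # Schoolbook product of little-endian limb lists; each limb product
    # uses CPython's builtin long multiplication.
    out = [0] * (len(a) + len(b))
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    carry = 0
    for k in range(len(out)):                   # normalize limbs to < BASE
        carry, out[k] = divmod(out[k] + carry, BASE)
    while len(out) > 1 and out[-1] == 0:
        out.pop()
    return out

def limb_pow(x, e):
    # x**e by binary exponentiation on limb lists (x an ordinary int, e >= 1)
    base, result = [], [1]
    while x:
        x, r = divmod(x, BASE)
        base.append(r)
    while e:
        if e & 1:
            result = limb_mul(result, base)
        e >>= 1
        if e:
            base = limb_mul(base, base)
    return result

def limb_str(limbs):
    # Linear in total digits: one bounded 5000-digit conversion per limb.
    return str(limbs[-1]) + ''.join(
        str(d).zfill(BASE_DIGITS) for d in reversed(limbs[:-1]))

print(limb_str(limb_pow(19, 314159)))

The limb arithmetic is slower than the builtin longs (as Tim notes later
in the thread), but the final conversion to a decimal string is linear in
the number of digits instead of quadratic.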
 
Bryan

tim,

your code is incredibly fast... thank you

bryan


 
David Eppstein

Tim Peters said:
Python uses Karatsuba for multiplication, not division. Karatsuba may
or may not trigger during the computation of 10**(n/2) (depending on
how large n is), but never triggers during division.

I wonder what the breakpoint would be for doing divisions by Newton
iteration with Karatsuba multiplication (asymptotic complexity = same as
multiplication) vs naive long division. Probably too many digits to
make it worthwhile most of the time.
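A rough sketch of the Newton idea (a demo of the iteration only; real
implementations such as GMP's double the working precision each step
rather than iterating at full precision, which is what makes the total
cost comparable to a multiplication):

def nr_div(a, b):
    # Floor division of positive ints via Newton iteration on 1/b.
    if a < b:
        return 0
    k = 2 * a.bit_length()            # fixed-point scale 2**k
    x = 1 << (k - b.bit_length())     # seed within a factor of 2 of 2**k/b
    for _ in range(k.bit_length() + 1):
        # x <- x*(2 - b*x/2**k) in 2**k fixed point; quadratic convergence
        x = x * ((2 << k) - b * x) >> k
    q = a * x >> k                    # tentative quotient (never too high)
    while (q + 1) * b <= a:           # a couple of correction steps at most
        q += 1
    return q

# e.g. nr_div(10**50 + 7, 12345) == (10**50 + 7) // 12345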
 
Tim Peters

[David Eppstein]
I wonder what the breakpoint would be for doing divisions by Newton
iteration with Karatsuba multiplication (asymptotic complexity = same as
multiplication) vs naive long division. Probably too many digits to
make it worthwhile most of the time.

GNU's GMP developers spend their lives finding these tradeoffs, so
read the GMP docs and mailing lists for more than you can likely bear
on this.

Python's bigint implementation is aimed at 100% portability and
correctness more than speed. GMP doesn't hesitate to drop into
hyper-optimized hand-coded assembler when it helps. Partly as a
consequence of the resulting speed difference in the lowest-level
primitives, the cutoff for plain Karatsuba multiplication in Python is
so high that Karatsuba rarely triggers in Python (and probably never
in 99.99% of Python applications).

Do look at the last, recent iteration of this topic. The asymptotics
of elementary arithmetic operations are independent of base, so
computing powers in a decimal-friendly base to begin with wins big by
allowing conversion to decimal string to be done in time linear in the
number of digits. This allowed a pure-Python program to run circles
around GMP in producing the decimal representation of a very large
power.
 
Bryan

Tim said:
Python's bigint implementation is aimed at 100% portability and
correctness more than speed.

tim,

your BigDec class is so fast it's like night and day compared to the
standard functions. are you implying that your BigDec isn't 100% portable
or correct or has some limitations? why can't python longs or bigints use
your implementation under the covers?

thanks,

bryan
 
Tim Peters

[Bryan]
your BigDec class is so fast it's like night and day compared to the standard
functions.

Not so: it does arithmetic much slower than the builtin arithmetic.
The only thing it's faster at is producing a decimal string at the
very end, and that only pays when the integer is so large that the
quadratic-time nature of the builtin binary-to-decimal conversion
becomes apparent. BigDec still uses the builtin binary-to-decimal
conversion for integers up to 5000 decimal digits, which is already
larger than virtually any real app ever uses. For ints much longer
than that, BigDec conversion to decimal string takes time linear in
the number of decimal digits, which has an unboundedly growing
advantage over the quadratic-time builtin conversion as the number of
digits increases. So the only thing it's good for is printing very
large powers in decimal. If you were willing to see your very large
powers in hex (base 16) instead, the builtin "print hex(large_power)"
is again much faster than BigDec.
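(A quick way to see that contrast, in Python 3 syntax; note that CPython
3.11+ caps huge int-to-str conversions by default, and that 3.12 gave the
builtin conversion a subquadratic algorithm, so the gap is far smaller
today than it was in 2004:)

import sys, time

if hasattr(sys, "set_int_max_str_digits"):
    sys.set_int_max_str_digits(500000)  # lift the 3.11+ guard on huge str()

x = 19 ** 314159
t0 = time.perf_counter()
h = hex(x)                  # linear: base 16 is just bit regrouping
t1 = time.perf_counter()
s = str(x)                  # base conversion proper
t2 = time.perf_counter()
print("hex: %.2fs  str: %.2fs" % (t1 - t0, t2 - t1))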

[Bryan]
are you implying that your BigDec isn't 100% portable or correct or has
some limitations?

Limitations? Heck, it doesn't even implement addition. The only
things it implements are multiplication and exponentiation of positive
integers (relying heavily on the builtin binary longs to do most of
that), and efficient conversion to decimal string. That's a tiny
subset of what builtin longs support.

[Bryan]
why can't python longs or bigints use your implementation under the covers?

Because it would be insane <wink>. After a few years of work (if
speed-obsessed volunteers appear to do it), the new-in-2.4 decimal
module should approach comparable speeds. For more reasons than I can
count now, trying to use the current version of the decimal module to
print large powers in decimal is slower than using builtin longs.
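(A present-day footnote: with the C-accelerated decimal module in modern
Python, the approach Tim alludes to is actually practical. A hedged
sketch; setting the context precision above the exact digit count makes
the power exact, and printing is then cheap because the value is already
stored in decimal:)

import decimal

decimal.getcontext().prec = 402000   # > the ~401,700 digits of 19**314159
d = decimal.Decimal(19) ** 314159    # exact: the result fits the precision
print(d)                             # already base 10, so printing is fast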
 
