Why is this loop heavy code so slow in Python? Possible Project Euler spoilers

  • Thread starter Bruno Desthuilliers
  • Start date
D

Diez B. Roggisch

Wildemar said:
OK, good. Naive question now comming to mind: Why doesn't Python do the
latter as well?

because of the dynamic nature of it. Java is statically typed, so the
JIT can heavily optimize.

OTH psyco IS a JIT-compiler to optimize certain calculations which are
mostly of a numerical nature. But this can't be done to the extend it is
possible in java.

Diez
 
C

Chris Mellon

because of the dynamic nature of it. Java is statically typed, so the
JIT can heavily optimize.

OTH psyco IS a JIT-compiler to optimize certain calculations which are
mostly of a numerical nature. But this can't be done to the extend it is
possible in java.

Original code: 3 min, 9 seconds
Original code with psyco: 30.28 seconds
Original code, compiled with Pyrex: 1min 39 seconds (moves the for loops into C)
Same, with a,b,c declared with cdef int: 20 seconds (uses c pow()
instead of going through Python).
Same, with the for..xrange loops rewritten to use C integer loops: 13
seconds (saves xrange use, but since the python loop was already gone,
not too much savings).

With a small amount of work, you should be able to implement the C
algorithm in Pyrex (or even just use the C algorithm, in a wrapper
that converts the return value to an int) and get the same speed as
the C version + a constant marshalling factor.

Adding pysco took all of 20 seconds (half that because I needed to
move the module scope code into a function), and re-writing with pyrex
took a minute or two. So, as a demonstration of numeric optmization,
both of them can give you quite good rewards with minimal effort.
 
N

Neil Cerutti

Python is not implemented like Java. In Java (at least in
HotSpot), the byte code is further compiled to machine code
before execution; in Python, the byte code is interpreted.

Whether this makes Python an interpreter or a compiler, I don't
know.

I'd call it an integrated compiler and virtual machine--a classic
combination.
 
N

Neil Cerutti

A big question mark in my mind is Lisp, which according to
aficionados is just as dynamic as Python, but has native
compilers that generate code running as fast as highly
optimized C. I'm not qualified to judge whether the lessons
learnt from Lisp can be applied to Python, but in any case Lisp
is an old, old language -- only Fortran is older. The amount of
development effort and money put into Lisp dwarfs that put into
Python by possibly a hundred or more.

Lisp, as far as I know, requires type declarations, discipline,
deep knowledge of Lisp, and more than passing knowledge of your
Lisp implementation in order to generate code that's competitive
with C. On the other hand, Lisp afficionados don't have a problem
meeting those requirements. ;)
So... if you'd like to see Python run as fast as C or Lisp, and
you have a few tens of millions of dollars spare to invest in
development, I think the Python Software Foundation would love
to hear from you.

Hmmm. I have four dollars... almost five...
 
P

Paul Rubin

Steven D'Aprano said:
question mark in my mind is Lisp, which according to aficionados is
just as dynamic as Python, but has native compilers that generate
code running as fast as highly optimized C. I'm not qualified to
judge whether the lessons learnt from Lisp can be applied to Python,
but in any case Lisp is an old, old language -- only Fortran is
older. The amount of development effort and money put into Lisp
dwarfs that put into Python by possibly a hundred or more.

Writing a simple Lisp compiler is not all that difficult. There's a
chapter in SICP about how to do it, so it's sometimes part of
introductory courses. To get C-like performance you end up having to
rely on user-supplied type declarations and/or type inference, but
even without that you can still do ok. I'd say Lisp is a less dynamic
language than Python. Lisp doesn't have duck typing and doesn't
represent class instance variables as dictionary elements that have to
be looked up at runtime all the time. This maybe even applies to
locals, e.g. in Python if you say

x = 3
print "hello"
y = x + 4

you're possibly not guaranteed that y=7, because you might have bound
sys.stdout to something with a .write method that reaches up the call
stack and messes with the caller's local variables, and if the langref
manual says this is supposed to work, then the compiler has to do it
like CPython. Maybe Python 4.0 will fix some of this stuff.
 
J

jwrweatherley

Thanks for all the answers to my question. I think what I need to take
away from this is that xrange is an object - I thought it was just
some loop construct, and that maths is slow in python - so avoid
pathological looping.I remember the first time I tried Objective-C on
OS X I used the NSNumber class for arithmetic - the results that time
were pretty awful too!
 
P

Paul Boddie

Thanks for all the answers to my question. I think what I need to take
away from this is that xrange is an object

Indeed, but using xrange can be faster than converting your "for"
loops to "while" loops plus a counter; I converted your code to use
the latter arrangement and it was noticeably slower. Perhaps the
repeated invocation of each xrange object's next method is less
expensive than repeatedly obtaining new incremented int objects and
testing them against other int objects.
- I thought it was just some loop construct, and that maths is slow in
python - so avoid pathological looping.

You'd be wise to avoid doing unnecessary work deep within nested loops
in any programming language. Sadly, it's not possible for the compiler
to work out that some calculations can be lifted out of the innermost
loops in Python, since the various guarantees that make such
operations possible in other languages are not offered by Python.
I remember the first time I tried Objective-C on OS X I used the
NSNumber class for arithmetic - the results that time were pretty
awful too!

Obviously, it'd be a fairer test if you used comparable numeric types
in both implementations, but a more capable numeric type would be
overkill for the C implementation of this program, and a less capable
numeric type doesn't exist for Python unless you're willing to use a
dialect such as Pyrex (as others have shown).

Paul
 
B

Bruno Desthuilliers

Martin v. Löwis a écrit :
Python is not implemented like Java. In Java (at least in HotSpot),
the byte code is further compiled to machine code before execution;

This is a VM-specific feature.
in Python, the byte code is interpreted.
Idem.

Whether this makes Python an interpreter or a compiler,
I don't know.

This is an old troll^Mdiscussion !-)

Now IANAL, but AFAIK, the byte-code compilation stage can make a great
difference performances-wide, and for a same language, a byte-compiled
implementation is likely to be faster than a pure-interpreted one, at
least because of the saving on parsing time (please someone correct me
if I'm wrong) ...
 
R

Robert Brown

Lisp, as far as I know, requires type declarations, discipline,
deep knowledge of Lisp, and more than passing knowledge of your
Lisp implementation in order to generate code that's competitive
with C.

On my Mac Mini, the original Python code runs in 6 minutes 37 seconds using
Python 2.3.5. The Common Lisp code below, a straightforward translation,
containing *no* type declarations, runs in 27 seconds on the same Mini using
SBCL.

When the commented out optimization declaration is included in the code, the
run time drops to 3.3 seconds. For comparison, run times with GCC on the C
version posted earlier are 3.5 seconds unoptimized and 0.58 seconds with
optimization flag "-O3".

So for this example, deep knowledge of the Lisp implementation and type
declarations are not required to get speed equivalent to unoptimized C code.
Approaching the speed of optimized C code does require more work.

====================

(defun doit ()
;; (declare (optimize (speed 3) (safety 0) (debug 0)))
(let ((solutions (make-array 1001 :initial-element 0)))
(loop for a upfrom 1 below 10000
do (loop for b upfrom 1 below (- 1000 a)
do (loop for c upfrom 1 below (- 1000 a b)
do (let ((p (+ a b c)))
(when (and (< p 1000)
(= (+ (* a a) (* b b)) (* c c)))
(incf (aref solutions p)))))))
(loop with max-index = 0
with max = 0
for solution across solutions
for index upfrom 0
do (when (> solution max)
(setf max solution)
(setf max-index index))
finally (print max-index))))
 
N

Neil Cerutti

On my Mac Mini, the original Python code runs in 6 minutes 37
seconds using Python 2.3.5. The Common Lisp code below, a
straightforward translation, containing *no* type declarations,
runs in 27 seconds on the same Mini using SBCL.

When the commented out optimization declaration is included in
the code, the run time drops to 3.3 seconds. For comparison,
run times with GCC on the C version posted earlier are 3.5
seconds unoptimized and 0.58 seconds with optimization flag
"-O3".

So for this example, deep knowledge of the Lisp implementation
and type declarations are not required to get speed equivalent
to unoptimized C code. Approaching the speed of optimized C
code does require more work.
====================

(defun doit ()
;; (declare (optimize (speed 3) (safety 0) (debug 0)))
(let ((solutions (make-array 1001 :initial-element 0)))
(loop for a upfrom 1 below 10000
do (loop for b upfrom 1 below (- 1000 a)
do (loop for c upfrom 1 below (- 1000 a b)
do (let ((p (+ a b c)))
(when (and (< p 1000)
(= (+ (* a a) (* b b)) (* c c)))
(incf (aref solutions p)))))))
(loop with max-index = 0
with max = 0
for solution across solutions
for index upfrom 0
do (when (> solution max)
(setf max solution)
(setf max-index index))
finally (print max-index))))

That's pretty cool. Thanks for the example.
 
P

paulhankin

... right-angled triangle puzzle solver

max = 0
maxIndex = 0
index = 0
for solution in solutions:
if solution > max:
max = solution
maxIndex = index
index += 1

print maxIndex

Not that it solves your slowness problem, or has anything to
do with the question you asked :), but you can replace this
last segment of your code with:

print solutions.index(max(solutions))
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,968
Messages
2,570,154
Members
46,702
Latest member
LukasConde

Latest Threads

Top