Magic Optimisation

S

simonwittber

Hello People.

I've have a very tight inner loop (in a game app, so every millisecond
counts) which I have optimised below:

def loop(self):
self_pool = self.pool
self_call_exit_funcs = self.call_exit_funcs
self_pool_popleft = self.pool.popleft
self_pool_append = self.pool.append
check = self.pool.__len__
while check() > 0:
task = self_pool_popleft()
try:
task.next()
except StopIteration:
self_call_exit_funcs(task)
return
self_pool_append(task)

This style of optimisation has shaved _seconds_ from my iteration
cycle, esp. when I have many registered tasks, so this style of
optimisation is very important to me.

However, it is very ugly. Does anyone have any tips on how I could get
this optimisation to occor magically, via a decorator perhaps?

Sw.
 
P

Paul McGuire

This isn't much prettier, but what if you extract the try-except
overhead out from the while loop? You only expect the exception to
fire one time, at the end of the list. You can also eliminate any
localization of variables for calls that are not called in the loop,
such as self_pool (which does not seem to be used at all), and
self_call_exit_funcs.

-- Paul

def loop(self):
self_pool_popleft = self.pool.popleft
self_pool_append = self.pool.append
check = self.pool.__len__
try:
while check() > 0:
task = self_pool_popleft()
task.next()
self_pool_append(task)
except StopIteration:
self.call_exit_funcs(task)
return
 
P

Paul Rubin

However, it is very ugly. Does anyone have any tips on how I could get
this optimisation to occor magically, via a decorator perhaps?

Have you tried psyco?
 
S

simonwittber

I guess it is hard to see what the code is doing without a complete
example.

The StopIteration is actually raised by task.next(), at which point
task is removed from the list of generators (self.pool). So the
StopIteration can be raised at any time.

The specific optimisation I am after, which will clean up the code a
lot, is a way to auto-magically create self_attribute local variables
from self.attribute instance variables.

Sw.
 
S

simonwittber

Yes. It slows down the loop when there are only a few iterators in the
pool, and speeds it up when there are > 2000.

My use case involves < 1000 iterators, so psyco is not much help. It
doesn't solve the magic creation of locals from instance vars either.

Sw.
 
P

Paul Rubin

My use case involves < 1000 iterators, so psyco is not much help. It
doesn't solve the magic creation of locals from instance vars either.

How about using __slots__ to put those instance vars at fixed offsets
in the pool object (self then needs to be a new-style class instance).
That might or might not avoid the dict lookups.
 
S

simonwittber

def loop(self):
self_pool = self.pool
self_call_exit_funcs = self.call_exit_funcs
self_pool_popleft = self.pool.popleft
self_pool_append = self.pool.append
check = self.pool.__len__
while check() > 0:
task = self_pool_popleft()
try:
task.next()
except StopIteration:
self_call_exit_funcs(task)
return
self_pool_append(task)

Stupid me. the 'return' statement above should be 'continue'. Sorry for
the confusion.
 
T

Tomasz Lisowski

(e-mail address removed) napisał(a):
Stupid me. the 'return' statement above should be 'continue'. Sorry for
the confusion.

Then you can avoid continue by writing:

while check() > 0:
task = self_pool_popleft()
try:
task.next()
except StopIteration:
self_call_exit_funcs(task)
else:
self_pool_append(task)

Tomasz Lisowski
 
P

Paul McGuire

I still think there are savings to be had by looping inside the
try-except block, which avoids many setup/teardown exception handling
steps. This is not so pretty in another way (repeated while on
check()), but I would be interested in your timings w.r.t. your current
code.

def loop(self):
self_pool_popleft = self.pool.popleft
self_pool_append = self.pool.append
self_call_exit_funcs = self.call_exit_funcs
check = self.pool.__len__
while check() > 0:
try:
while check() > 0:
task = self_pool_popleft()
task.next()
self_pool_append(task)
except StopIteration:
self_call_exit_funcs(task)

-- Paul
 
B

Bengt Richter

I still think there are savings to be had by looping inside the
try-except block, which avoids many setup/teardown exception handling
steps. This is not so pretty in another way (repeated while on
check()), but I would be interested in your timings w.r.t. your current
code.

def loop(self):
self_pool_popleft = self.pool.popleft
self_pool_append = self.pool.append
self_call_exit_funcs = self.call_exit_funcs
check = self.pool.__len__
while check() > 0:
try:
while check() > 0:
task = self_pool_popleft()
task.next()
self_pool_append(task)
except StopIteration:
self_call_exit_funcs(task)

Why not let popleft trigger an exception out of while True instead,
and prevent tasks from raising StopIteration, and let them yield a None
to indicate keep scheduling with no special action, and something else
for optional differentiation of various exit options, e.g., zero for die,
and nonzero for suspension waiting for event(s) E.g., a returned integer
could be an event mask or single index (+ vs -) for thing(s) to wait for.
If you work things right, event check in the loop can be an if like
if waitedfor&events: process_events(), which most of the time is a fast no-op).

Then (without event stuff, and untested ;-) maybe something like:

def loop(self):
self_pool_popleft = self.pool.popleft
self_pool_append = self.pool.append
self_call_exit_funcs = self.call_exit_funcs
try:
while True:
task = self_pool_popleft()
if task.next() is None:
self_call_exit_funcs(task)
else:
self_pool_append(task)
except Indexerror:
pass

You could even consider putting the bound task.next methods in
the deque instead of the task, and using deque rotation instead
of popping and appending. Then, if you put the task.next's in
reverse order in the deque to start with, self_pool[-1] will be
the first, and self_pool_rotate() will bring the next into position.
Which would make it look like (untested!):

def loop(self):
self_pool_pop = self.pool.pop
self_call_exit_funcs = self.call_exit_funcs
self_pool_rotate = self.pool.rotate
try:
while True:
if self.pool[-1]() is None:
self_call_exit_funcs(self_pool_pop())
else:
self_pool_rotate()
except Indexerror:
pass

IWT if the pool remains unchanged most of the time, pool_rotate() ought
to be faster than popping and appending. Note that exit_funcs will need
a mapping of task.next -> task most likely, unless communication is
entirely via a mutable task state object, and all that's needed is to
me map to that and mess with exit state there and trigger a final .next()
to wrap up the generator.

Regards,
Bengt Richter
 
B

Bengt Richter

Why not let popleft trigger an exception out of while True instead,
and prevent tasks from raising StopIteration, and let them yield a None
to indicate keep scheduling with no special action, and something else
for optional differentiation of various exit options, e.g., zero for die,
and nonzero for suspension waiting for event(s) E.g., a returned integer
could be an event mask or single index (+ vs -) for thing(s) to wait for.
If you work things right, event check in the loop can be an if like
if waitedfor&events: process_events(), which most of the time is a fast no-op).

Then (without event stuff, and untested ;-) maybe something like:

def loop(self):
self_pool_popleft = self.pool.popleft
self_pool_append = self.pool.append
self_call_exit_funcs = self.call_exit_funcs
try:
while True:
task = self_pool_popleft()
if task.next() is None:
self_call_exit_funcs(task)
^^^^^^^^^^^^^^^^^^^^^^ oops, need to switch if and else branches
else:
self_pool_append(task)
^^^^^^^^^^^^^^^^^^^^^^ oops, need to switch if and else branches
except Indexerror:
pass

You could even consider putting the bound task.next methods in
the deque instead of the task, and using deque rotation instead
of popping and appending. Then, if you put the task.next's in
reverse order in the deque to start with, self_pool[-1] will be
the first, and self_pool_rotate() will bring the next into position.
Which would make it look like (untested!):

def loop(self):
self_pool_pop = self.pool.pop
self_call_exit_funcs = self.call_exit_funcs
self_pool_rotate = self.pool.rotate
try:
while True:
if self.pool[-1]() is None:
self_call_exit_funcs(self_pool_pop())
^^^^^^^^^^^^^^^^^^^^^^ oops, need to switch if and else branches
else:
self_pool_rotate()
^^^^^^^^^^^^^^^^^^^^^^ oops, need to switch if and else branches
except Indexerror:
pass

IWT if the pool remains unchanged most of the time, pool_rotate() ought
to be faster than popping and appending. Note that exit_funcs will need
a mapping of task.next -> task most likely, unless communication is
entirely via a mutable task state object, and all that's needed is to
me map to that and mess with exit state there and trigger a final .next()
to wrap up the generator.
Sorry. I wonder what else I goofed up ;-)

Regards,
Bengt Richter
 
S

simonwittber

Paul said:
I still think there are savings to be had by looping inside the
try-except block, which avoids many setup/teardown exception handling
steps. This is not so pretty in another way (repeated while on
check()), but I would be interested in your timings w.r.t. your current
code.

Your suggested optimisation worked nicely. It shaved 0.02 seconds from
a loop over 10000 iterators, and about 0.002 seconds from a loop over
1000 iterators.
 
A

ABO

Bengt said:
I still think there are savings to be had by looping inside the
try-except block, which avoids many setup/teardown exception handling
steps. This is not so pretty in another way (repeated while on
check()), but I would be interested in your timings w.r.t. your current
[...]
Why not let popleft trigger an exception out of while True instead,
and prevent tasks from raising StopIteration, and let them yield a None
to indicate keep scheduling with no special action, and something else
[...]

The rule of thumb with exceptions is use them for infrequent events. If
you keep to this you end up with clean and fast code.

Provided task.next() raises StopIteration less than about 25% of the
time it is called, it is cleaner and more efficient to use an exception
than to return None. It is also more efficient to handle terminating by
allowing popleft trigger an exception. Try the following;

def loop(self):
self_call_exit_funcs = self.call_exit_funcs
self_pool_popleft = self.pool.popleft
self_pool_append = self.pool.append
while True:
try:
task = self_pool_popleft()
task.next()
self_pool_append(task)
except StopIteration:
self_call_exit_funcs(task)
except IndexError:
break

There are other "optimisations" that could be applied that make this
code faster but uglier. For example, putting another "while True: loop
inside the try block to avoid the try block setup each iteration. Also,
exception handling is slower when you specify the exception class (it
has to check if the exception matches), so you might be able to arrange
this with an anonymous accept: block around the task.next() to handle
the StopIteration exception.

Another thing that disturbs me is the popfirst/append every iteration.
Depending on how many loops through all the tasks before one
"finishes", you might find it more efficient to do this;

def loop(self):
self_pool = self.pool
self_pool_remove = self_pool.remove
self_call_exit_funcs = self.call_exit_funcs
while self_pool:
try:
for task in self_pool:
task.next()
except:
self_pool_remove(task)
self_call_exit_funcs(task)
 
T

Terry Reedy

ABO said:
There are other "optimisations" that could be applied that make this
code faster but uglier. For example, putting another "while True: loop
inside the try block to avoid the try block setup each iteration.

In CPython, 'setting up' try-blocks is (intentionally) very fast, perhaps
as fast or faster than one interation loop.

tjr
 
P

Paul McGuire

Terry -

If setting up a try-block is as fast (or "takes as long") as one
iteration loop, then wont putting a try-block inside a loop double the
execution time?

-- Paul
 
T

Terry Reedy

Paul McGuire said:
Terry -

If setting up a try-block is as fast (or "takes as long") as one
iteration loop, then wont putting a try-block inside a loop double the
execution time?

It will double (more or less) the loop overhead. The effect of that
depends on the overhead/payload ratio. And, I was responding to the idea
that adding a loop to avoid most try statements would be significantly
faster.

Terry J. Reedy
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,264
Messages
2,571,317
Members
48,003
Latest member
coldDuece

Latest Threads

Top