How do you implement a reactor without a select?

  • Thread starter Michele Simionato

Michele Simionato

I have always been curious about how people implement mainloops (or,
in Twisted terminology, reactors). So I sat down and wrote the
following simple implementation:

import itertools
import time

class SimpleReactor(object):

    DELAY = 0.001 # seconds

    def __init__(self):
        self._event = {} # action id -> (scheduled time, action, args)
        self._counter = itertools.count(1) # action id generator
        self.running = False

    def callLater(self, deltat, action, *args):
        """Schedule an action with arguments args in deltat seconds.
        Return the action id"""
        now = time.time()
        i = next(self._counter)
        self._event[i] = now + deltat, action, args
        return i

    def cancelCallLater(self, action_id):
        "Cancel the action identified by action_id"
        del self._event[action_id]

    def default_action(self): # to be overridden
        "Invoked at each lap in the mainloop"
        time.sleep(self.DELAY) # don't run too fast, rest a bit

    def cleanup_action(self): # to be overridden
        "Invoked at the end of the mainloop"

    def manage_exc(self, e):
        "Invoked when an action raises an exception"
        raise e

    def dooneevent(self):
        "Perform scheduled actions"
        now = time.time()
        # iterate over a copy, since actions are removed while looping
        for i, (start_time, action, args) in list(self._event.items()):
            if now >= start_time: # it's time to start the action
                self.cancelCallLater(i) # don't run it again
                try:
                    action(*args)
                except Exception as e:
                    self.manage_exc(e)

    def run(self):
        "Run the main loop"
        self.running = True
        try:
            while self.running:
                self.default_action()
                self.dooneevent()
        except KeyboardInterrupt:
            print('Stopped via CTRL-C')
        finally:
            self.cleanup_action()

    def stop(self):
        self.running = False

Notice that I copied the Twisted terminology, but
I did not look at the Twisted implementation because I did not want to
use a select (I assume that the GUI mainloops do not use it either).
The trick I use is to store the actions to perform (which are
callables identified by an integer) in an event dictionary and
to run them in the mainloop once the current time has reached
the scheduled time.
I had to add a time.sleep(.001) call in the default_action to avoid
consuming 100% of the CPU in the loop.
I wonder if real mainloops are done in this way and how good or bad
this implementation is compared to a serious one. Any suggestion, hint,
or advice is much appreciated. Thanks,

Michele Simionato
 

Diez B. Roggisch

Michele Simionato said:
Notice that I copied the Twisted terminology, but
I did not look at Twisted implementation because I did not want to
use a select (I assume that the GUI mainloops do not use it either).

Why do you assume that? It's a wrong assumption. Yielding a thread/process
until the OS wakes it up because there is IO to be performed is the proper
way to go. And at least on Unix, IO is _everything_, including mouse
movements and keyboard events. Most probably the OS will have specialized
APIs (or some wrapper lib has) that allow for reactor registration for
events of different kinds, including timers. But basically, it's select - I
mean you could easily offer a timer as a file-object as well. Not sure if
that's done though.

Michele Simionato said:
The trick I use is to store the actions to perform (which are
callables identified by an integer) in an event dictionary and
to run them in the mainloop if the current time is greater than
the scheduled time.
I had to add a time.sleep(.001) call in the default_action to avoid
consuming 100% of the CPU in the loop.
I wonder if real mainloops are done in this way and how bad/good is
this implementation compared to a serious one. Any suggestion/hint/
advice is well appreciated. Thanks,

It's ok, but of course more wasteful than it needs to be - better would be
full delegation to the OS.

Diez
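
One way to read the advice about not waking up every millisecond: keep the
pending actions in a heap ordered by deadline and block exactly until the
earliest one is due. The Python 3 sketch below is a timer-only loop under
that assumption. SleepingReactor, call_later and cancel are hypothetical
names (not Twisted's API and not part of the code posted above), and
time.sleep stands in for whatever blocking call (select, poll) a real
reactor would use.

```python
import heapq
import itertools
import time


class SleepingReactor:
    """Timer-only loop that sleeps until the next deadline (illustrative)."""

    def __init__(self, clock=time.monotonic, sleep=time.sleep):
        self._heap = []                  # entries: (deadline, id, action, args)
        self._ids = itertools.count(1)   # unique ids also break deadline ties
        self._cancelled = set()
        self._clock = clock
        self._sleep = sleep

    def call_later(self, delay, action, *args):
        """Schedule action(*args) in `delay` seconds; return an event id."""
        event_id = next(self._ids)
        heapq.heappush(self._heap,
                       (self._clock() + delay, event_id, action, args))
        return event_id

    def cancel(self, event_id):
        """Mark an event as cancelled; it is skipped when it reaches the top."""
        self._cancelled.add(event_id)

    def run(self):
        """Run until no scheduled events remain."""
        while self._heap:
            deadline, event_id, action, args = self._heap[0]
            remaining = deadline - self._clock()
            if remaining > 0:
                self._sleep(remaining)   # one sleep instead of many 1 ms polls
                continue
            heapq.heappop(self._heap)
            if event_id in self._cancelled:
                self._cancelled.discard(event_id)
                continue
            action(*args)


order = []
reactor = SleepingReactor()
reactor.call_later(0.02, order.append, "second")
reactor.call_later(0.01, order.append, "first")
skipped = reactor.call_later(0.015, order.append, "cancelled")
reactor.cancel(skipped)
reactor.run()
```

The clock and sleep callables are passed in so that the sleep can later be
swapped for a select or poll call with a timeout.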
 

Alex Martelli

Michele Simionato said:
I wonder if real mainloops are done in this way and how bad/good is
this implementation compared to a serious one. Any suggestion/hint/
advice is well appreciated. Thanks,

Module sched in Python's standard library may suggest one clearly-better
approach: when you know in advance when future events are scheduled for,
sleep accordingly (rather than polling every millisecond). sched's
sources are simple enough to study, and its architecture clean and
strong enough that it's easy to extend to other cases, e.g. where
previously-unscheduled events may be delivered from other threads,
without necessarily hacking the sources.

Specifically, sched implements the Dependency Injection design pattern: rather than
just calling time.time and time.sleep, it accepts those two callables
upon initialization. This makes it easy, among many other
customizations, to pass instead of time.sleep a user-coded callable
(typically a bound method) that "sleeps" by a wait-with-timeout on a
Queue (so that other threads, by putting an event on the Queue in
question, immediately wake up the scheduler, etc, etc).


Alex
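
A minimal Python 3 sketch of the customization Alex describes: the
scheduler is given a "sleep" callable that waits on a queue with a timeout,
so another thread can cut the wait short by putting a token on the queue.
The names wakeups, interruptible_sleep and producer are made up for this
example; sched.scheduler, queue.Queue.get and threading.Thread are the
standard library calls. (In Python 2 the queue module was spelled Queue.)

```python
import queue
import sched
import threading
import time

wakeups = queue.Queue()  # other threads put a token here to end the wait


def interruptible_sleep(delay):
    """Wait up to `delay` seconds, returning early if a token arrives."""
    try:
        wakeups.get(timeout=delay)
    except queue.Empty:
        pass  # timed out normally: the next scheduled event is due


# Inject our own "sleep"; the scheduler re-checks its queue after each wait.
scheduler = sched.scheduler(time.monotonic, interruptible_sleep)
fired = []


def producer(far_event):
    """Runs in another thread: add an urgent event and wake the scheduler."""
    time.sleep(0.05)
    scheduler.enter(0, 1, fired.append, ("early",))
    scheduler.cancel(far_event)  # drop the distant event so run() can end
    wakeups.put(None)


far_event = scheduler.enter(5.0, 1, fired.append, ("late",))
start = time.monotonic()
worker = threading.Thread(target=producer, args=(far_event,))
worker.start()
scheduler.run()  # returns once no events remain
worker.join()
elapsed = time.monotonic() - start
```

Without the token, the scheduler would stay in its five-second wait and only
notice the new event afterwards.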
 

Michele Simionato

Alex Martelli said:
Module sched in Python's standard library may suggest one clearly-better
approach: when you know in advance when future events are scheduled for,
sleep accordingly (rather than polling every millisecond). sched's
sources are simple enough to study, and its architecture clean and
strong enough that it's easy to extend to other cases, e.g. where
previously-unscheduled events may be delivered from other threads,
without necessarily hacking the sources.

Specifically, sched implements the Dependency Injection design pattern: rather than
just calling time.time and time.sleep, it accepts those two callables
upon initialization. This makes it easy, among many other
customizations, to pass instead of time.sleep a user-coded callable
(typically a bound method) that "sleeps" by a wait-with-timeout on a
Queue (so that other threads, by putting an event on the Queue in
question, immediately wake up the scheduler, etc, etc).

Alex

I know about sched (it was the first thing I looked at): the problem
is that sched adopts a blocking approach and basically requires
threads, whereas I wanted to avoid them. Diez B. Roggisch's reply is
closer to my expectations:

But what kind of specialized API do I have at my disposal for
timers on Linux?
It looks like this is the question I should have asked the first
time ;)


Michele Simionato
 

sjdevnull

Michele said:
Notice that I copied the Twisted terminology, but
I did not look at Twisted implementation because I did not want to
use a select (I assume that the GUI mainloops do not use it either).
The trick I use is to store the actions to perform (which are
callables identified by an integer) in an event dictionary and
to run them in the mainloop if the current time is greater than
the scheduled time.
I had to add a time.sleep(.001) call in the default_action to avoid
consuming 100% of the CPU in the loop.

Busy-looping like that is ugly and inefficient, even with the sleep
thrown in.

Most GUI main loops _do_ use either select() or poll(). When Xt/GTK/
Qt/etc. have a function like "gtk_add_input" which takes an fd that
you'll get notified about if it's written to while you're in the main
loop, that's just adding another fd to the select() loop.

There are other ways to wait for events on an fd, but they tend to be
less portable. Depending on your Unix flavor, epoll, /dev/poll,
kqueues, kevent, queued realtime signals, or something else might be
available from the OS (but probably not from Python without futzing
with ctypes or writing an extension). If you want details, check out
http://www.kegel.com/c10k.html

The alternatives usually aren't really faster unless you have hundreds
of connections, though--select/poll have major portability advantages,
so go with them unless you have a compelling reason.
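
As a rough Python 3 illustration of "adding another fd to the select()
loop", the sketch below registers one end of a socket pair together with a
callback, then lets the selector block until data arrives or a timeout
expires. on_readable and the 0.5 second deadline are assumptions made for
the example; selectors.DefaultSelector picks select, poll, epoll or kqueue
depending on what the platform offers.

```python
import selectors
import socket
import time

selector = selectors.DefaultSelector()
received = []


def on_readable(conn):
    """Callback stored with the registration; called when conn has data."""
    received.append(conn.recv(1024))


# A connected socket pair stands in for any event source backed by an fd.
reader, writer = socket.socketpair()
selector.register(reader, selectors.EVENT_READ, on_readable)

writer.send(b"ping")  # the "event" the loop is waiting for

deadline = time.monotonic() + 0.5
timed_out = False
while not received and not timed_out:
    timeout = max(0.0, deadline - time.monotonic())
    events = selector.select(timeout)  # blocks in the OS, no busy loop
    if not events:
        timed_out = True
    for key, _mask in events:
        key.data(key.fileobj)  # dispatch to the registered callback

selector.unregister(reader)
selector.close()
reader.close()
writer.close()
```

The timeout passed to select is where a timer deadline would go, which is
how one call can serve both I/O events and scheduled actions.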
 

Alex Martelli

Michele Simionato said:
I know about sched (it was the first thing I looked at): the problem
is that sched adopts a blocking approach and it basically requires
threads, whereas I wanted to avoid them.

As the "sleep for time N" callable, you can pass any callable you wish;
I suggested one based on Queue.get with timeout, which would indeed
require some other thread to wake it, but any kind of system call which
allows timeouts, such as select or poll, would work just as well.

Michele Simionato said:
Diez B. Roggisch's reply is closer to my expectations:

But what kind of specialized API do I have at my disposal for
timers on Linux?
It looks like this is the question I should have asked the first
time ;)

What do you expect from "timers on Linux" that you could not get with a
simple "sleep for the next N milliseconds"? A timer (on Linux or
elsewhere) can jog your process N milliseconds from now, e.g. with a
SIGALRM or SIGPROF, and you can set one with the setitimer syscall
(presumably accessible via ctypes, worst case -- I've never used it from
Python, yet), but how would that help you (compared to plain sleep,
select, poll, or whatever else best fits your need)?


Alex
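
For reference, current Python versions expose setitimer directly in the
standard signal module on Unix, so ctypes is not needed for it. The Python 3
sketch below arms a one-shot 50 ms timer and lets the SIGALRM handler record
that it fired; on_alarm and the two-second guard are assumptions for the
example. It only works on Unix and only when run in the main thread.

```python
import signal
import time

fired = []


def on_alarm(signum, frame):
    """Signal handler: runs in the main thread when SIGALRM is delivered."""
    fired.append(signum)


previous = signal.signal(signal.SIGALRM, on_alarm)
signal.setitimer(signal.ITIMER_REAL, 0.05)  # one-shot timer, 50 ms from now

# The process still needs to be doing (or waiting for) something meanwhile.
start = time.monotonic()
while not fired and time.monotonic() - start < 2.0:
    time.sleep(0.01)

signal.setitimer(signal.ITIMER_REAL, 0)     # disarm, in case it never fired
signal.signal(signal.SIGALRM, previous)     # restore the old handler
```

As Alex points out, the timer only jogs the process; the waiting and the
dispatching still have to be written by hand.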
 

Michele Simionato

Alex Martelli said:
What do you expect from "timers on Linux" that you could not get with a
simple "sleep for the next N milliseconds"? A timer (on Linux or
elsewhere) can jog your process N milliseconds from now, e.g. with a
SIGALRM or SIGPROF, and you can set one with the setitimer syscall
(presumably accessible via ctypes, worst case -- I've never used it from
Python, yet), but how would that help you (compared to plain sleep,
select, poll, or whatever else best fits your need)?

I hoped there was a library such that I could register a Python
callable (say a thunk) and have it called by the Linux timer at time t
without blocking my process. But if a Linux timer will just send my
process an alarm, I would need to code myself a mechanism waiting for
the alarm and doing the function call. In that case, as you say, I would
be better off with a select+timeout or even with a queue+timeout, which
already do most of the job.

Michele Simionato
 

Michele Simionato

sjdevnull said:
Busy-looping like that is ugly and inefficient, even with the sleep
thrown in.

Most GUI main loops _do_ use either select() or poll(). When Xt/GTK/
Qt/etc have function like "gtk_add_input" which takes an fd that
you'll get notified about if it's written to while you're in the main
loop, that's just adding another fd to the select() loop.

There are other ways to wait for events on an fd, but they tend to be
less portable. Depending on your Unix flavor, epoll, /dev/poll,
kqueues, kevent, queued realtime signals, or something else might be
available from the OS (but probably not from Python without futzing
with ctypes or writing an extension). If you want details, check out
http://www.kegel.com/c10k.html

The alternatives usually aren't really faster unless you have hundreds
of connections, though--select/poll have major portability advantages,
so go with them unless you have a compelling reason.

I see where you are coming from. In a GUI or in a Web server most
of the time is spent waiting for input, so a busy loop design would
be a terrible design indeed. But I had another use case in mind. The
code I posted is extracted from a batch script I have, called
'dbexplorer'. The script performs lots of queries in a database and
finds "bad" data according to some criterion.
So at each iteration there is a default action which is nontrivial; it
becomes a time.sleep only at the end, when the batch has finished its
job and is waiting for input. In normal conditions most of the time is
spent doing something, not waiting, so the busy loop design here is not
so bad.
Still I wanted to know what my alternatives were. And from this thread
I get the impression that the only portable alternative is using some
kind of select, unless I want to use threads, in which case the
scheduler approach could be viable.
Anyway, that C10K page is really an interesting resource, thanks for
pointing it out!

Michele Simionato
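
For a batch loop like the one described here, where each iteration does real
work, one option is to poll for input between work units with a zero
timeout, so the check never blocks. The Python 3 sketch below assumes a
control channel made from a socket pair and a b"stop" message; both are
invented for the example and are not part of dbexplorer.

```python
import selectors
import socket

selector = selectors.DefaultSelector()
# `commands` is read by the batch loop; `control` is the sending side.
commands, control = socket.socketpair()
selector.register(commands, selectors.EVENT_READ)

processed = []
stop_requested = False
for chunk in range(100):
    processed.append(chunk)          # stands in for one unit of real work
    if chunk == 3:
        control.send(b"stop")        # simulate input arriving mid-batch
    # timeout=0 reports what is ready right now and returns immediately
    if selector.select(timeout=0):
        stop_requested = commands.recv(1024) == b"stop"
        if stop_requested:
            break

selector.unregister(commands)
selector.close()
commands.close()
control.close()
```

Once the batch has nothing left to do, the same select call with a positive
timeout (or none) replaces the time.sleep in default_action.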
 

Alex Martelli

Michele Simionato said:
I hoped there was a library such that I could register a Python
callable (say a thunk) and have it called by the Linux timer at time t
without blocking my process. But if a Linux timer will just send my
process an alarm, I would need to code myself a mechanism waiting for
the alarm and doing the function call. In that case, as you say, I would
be better off with a select+timeout or even with a queue+timeout, which
already do most of the job.

Python or not, I don't know of a way to "register a callback" from the
OS without using threads. Considering that your callback WOULD be
executing on a different thread no matter what (your "only" thread being
blocked on some blocking syscall, or executing other code -- having
other code in your process suddenly start executing at that point is
pre-emptive threading, by whatever name you choose to call it), it's not
clear to me why the "avoiding threads" issue should matter to you.


Alex
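
The standard library's threading.Timer shows the point concretely: it does
let you register a callable to run after a delay without blocking the
caller, and it does so by running the callable on a separate thread. In this
Python 3 sketch, thunk and the 50 ms delay are made up for the example.

```python
import threading

calls = []


def thunk(label):
    """Record the label and whether we are running on the main thread."""
    on_main = threading.current_thread() is threading.main_thread()
    calls.append((label, on_main))


timer = threading.Timer(0.05, thunk, args=("tick",))
timer.start()                         # returns immediately

main_kept_working = sum(range(1000))  # the caller is not blocked meanwhile

timer.join()                          # wait here only so the result is visible
```

The callback reports that it did not run on the main thread, which is the
pre-emptive threading Alex refers to.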
 

Antoon Pardon

Michele Simionato said:
I hoped there was a library such that I could register a Python
callable (say a thunk) and have it called by the Linux timer at time t
without blocking

I once played with the following module to do something similar.
Maybe it is useful to you as is, or can give you an idea on how
to proceed. I only tested it on Linux.

---------------------------- alarm.py --------------------


from signal import signal, SIG_IGN, SIGALRM
from time import time
from threading import Lock
from heapq import heappush, heappop
from os import kill, getpid
import errno

from select import select, error as SelectException

from ctypes import *
libc = cdll.LoadLibrary("libc.so.6") # glibc on Linux, found by the loader

class _timeval(Structure):
    _fields_ = [("tv_sec" , c_long), ("tv_usec", c_long)]

def timeval(tm):
    sec = int(tm)
    usec = int(1000000 * (tm - sec))
    return _timeval(sec, usec)

class itimerval(Structure):
    _fields_ = [("it_interval", _timeval), ("it_value", _timeval)]


def alarm(tm):
    "Arm a one-shot timer for tm seconds; return the time left on the old one"
    tv = timeval(tm)
    ti = timeval(0.0)
    ntv = itimerval(ti, tv)
    otv = itimerval(timeval(0.0), timeval(0.0))
    rt = libc.setitimer(0, byref(ntv), byref(otv)) # 0 is ITIMER_REAL
    if rt:
        raise ValueError
    else:
        return otv.it_value.tv_sec + otv.it_value.tv_usec / 1000000.0

def sleep(tm):
    "Sleep for tm seconds, resuming when a signal interrupts the select"
    wakeup = time() + tm
    while tm >= 0:
        try:
            select([],[],[],tm)
        except SelectException as Err_Info:
            if Err_Info.args[0] != errno.EINTR:
                raise
        tm = wakeup - time()

alarms = [] # heap of pending Alarm objects, earliest first
alarm_lock = Lock()

def AlarmHandler(sgnr, frame):
    alarm_lock.acquire()
    try:
        now = time()
        while alarms and alarms[0].moment <= now:
            current = heappop(alarms)
            if not current.canceled:
                current.func(*current.args, **current.kwds)
                current.executed = True
            now = time()
    finally:
        alarm_lock.release()
    if alarms:
        alarm(alarms[0].moment - now) # re-arm for the next pending alarm

signal(SIGALRM, AlarmHandler)

class Alarm(object):
    def __init__(self, tp, func, *args, **kwds):
        alarm(0) # disarm the timer while the heap is modified
        alarm_lock.acquire()
        try:
            self.canceled = False
            self.executed = False
            self.func = func
            self.args = args
            self.kwds = kwds
            self.moment = tp
            heappush(alarms, self)
            now = time()
            delta = alarms[0].moment - now
        finally:
            alarm_lock.release()
        if delta <= 0:
            kill(getpid(), SIGALRM) # already due: trigger the handler now
        else:
            alarm(delta)

    def __lt__(self, other): # ordering used by heapq
        return self.moment < other.moment

    def __str__(self):
        return "<Alarm for %d>" % self.moment

    __repr__ = __str__

    def Cancel(self):
        alarm_lock.acquire()
        try:
            if self.executed:
                raise ValueError("Cancelation was too late")
            else:
                self.canceled = True
        finally:
            alarm_lock.release()

-----------------------------------------------------------------------------------------------

You use it as follows:

from alarm import Alarm

alert = Alarm(executemoment, function, *positionalarguments, **keywordarguments)

# unless alert.Cancel is called before the alert went off, the function
# with its arguments will be called at the specified time.

# If you are using threads, you are advised to do most of the work in a
# different thread and leave the main thread to only treat the alarms.
 
