Proposal for adding Shallow Threads and a Main Loop to Python

Rhamphoryncus

First, a bit about myself. I've been programming in Python for several
years now, and I had several more years with C before that. I've got a
lot of interest in the more theoretical stuff (language design, component
architectures, etc.). Of late my focus has been on concurrent operations
(and on how to design a GUI architecture, but that's not what this post
is about). I've looked at threads, and the inability to kill them easily
was a problem: to get them to quit at all you have to explicitly check a
variable that triggers the exit, at which point you might as well be
using an event-driven design. So I looked at event-driven. That doesn't
have the issue with ending operations, but it does make writing code much
harder: anything more than the most trivial loop has to be turned into a
state machine.

But recently I came up with a solution to that too. I call it Shallow
Threads.

A shallow thread is just a generator modified in the most obvious way
possible. The yield statement is replaced with a waitfor expression.
You give it the object you wish to "wait for". Then when it's ready
you get back a return value or an exception. These waitfor expressions
are the only points where your shallow thread may get suspended, so
it's always explicit. If you call another function it will be treated
exactly as a normal function call from a generator (this is why they're
'shallow'), so there are no issues with calling C functions.

On the other end of things, your shallow thread object would have
__resume__ and __resumeexc__ methods (or perhaps a single __resume__
method with a default raise_=False argument). They return a tuple of
(mode, value), where mode is one of the strings 'waitfor', 'exception', or
'return', and value corresponds to that mode. These methods will be used by
a main loop to resume the shallow thread, like next() is used with a
generator.
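
To make the calling side concrete, here is a toy illustration of how a main
loop might drive such an object through __resume__. The exact signature (in
particular, how the result of a completed waitfor gets passed back in) isn't
pinned down above, so the details here are invented:

class FakeShallowThread:
    """Stands in for a real shallow thread: waits once, then returns a value."""
    def __init__(self):
        self.started = False

    def __resume__(self, value=None):
        if not self.started:
            self.started = True
            return ('waitfor', 'some event notifier')
        return ('return', value * 2)

def run_to_completion(thread):
    """A toy driver loop: resume the thread until it returns or raises."""
    value = None
    while True:
        mode, result = thread.__resume__(value)
        if mode == 'return':
            return result
        if mode == 'exception':
            raise result
        # mode == 'waitfor': a real main loop would suspend the thread here
        # until the notifier in `result` fired; this toy driver just pretends
        # it finished immediately with a made-up value.
        value = 21

result = run_to_completion(FakeShallowThread())   # result == 42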

A main loop is the next part of my proposal. Where shallow threads
could be added with a fairly small patch, a main loop would require a
much more extensive one. And unfortunately you need a main loop to use
shallow threads. (You could use Twisted, but that has the problems of
being third-party and not being designed for inclusion in the Python core.)

The first part is the concept of an "event notifier object". This is
simply an object that is not "done" yet, and which you can "watch" to be
given a single value (or exception) when it is done. This could be
something internal to your program, like a shallow thread above, or
something external, like a file read completing. Supporting it would,
like most other protocols in Python, involve setting an attribute or a
method. I haven't yet figured out the best design, though.
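
Just to make the idea concrete, one possible shape for such an object might
look like the following; this is only an illustration with invented names,
not a settled part of the proposal:

class EventNotifier:
    """Holds a single result (or exception) and notifies watchers once done."""
    def __init__(self):
        self.done = False
        self._value = None
        self._is_exception = False
        self._watchers = []

    def _mode(self):
        if self._is_exception:
            return 'exception'
        return 'return'

    def watch(self, callback):
        """Call callback(mode, value) right away if done, otherwise once done."""
        if self.done:
            callback(self._mode(), self._value)
        else:
            self._watchers.append(callback)

    def set_result(self, value, is_exception=False):
        """Mark the notifier as done and wake up everything watching it."""
        self.done = True
        self._value = value
        self._is_exception = is_exception
        for callback in self._watchers:
            callback(self._mode(), value)
        self._watchers = []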

We need versions of many existing functions that produce event notifiers
instead. I suggest adding an async keyword argument to the existing
functions, defaulting to False, to indicate that an event notifier should
be produced. For example: waitfor (waitfor open("hello.txt",
async=True)).read(async=True). At some mythological point in the future,
perhaps the default for async could be switched to True, and that example
would get much shorter.
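
Under the hood, an async=True variant of a blocking call could amount to
something like the sketch below: run the call in a worker thread and complete
an event notifier when it finishes. This reuses the hypothetical EventNotifier
above, and a real implementation would more likely use non-blocking I/O; it's
only meant to illustrate the shape of the idea:

import threading

def call_async(func, *args):
    """Run func(*args) in a worker thread and return a notifier for the result."""
    notifier = EventNotifier()
    def worker():
        try:
            result = func(*args)
        except Exception as e:
            notifier.set_result(e, is_exception=True)
        else:
            notifier.set_result(result)
    threading.Thread(target=worker).start()
    return notifier

# The proposed open("hello.txt", async=True) might then behave roughly like
# call_async(open, "hello.txt").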

Now as for the main loop itself, I suggest a mainloop module. The
primary function used would be mainloop.runUntil(), which would take an
event notifier as its argument and return as soon as that event notifier
was triggered. An example of this and other features follows.

import mainloop, urllib

def get_and_save(path):
    infile = waitfor urllib.urlopen(path, async=True)
    outfile = waitfor open(path.split('/')[-1], async=True)
    waitfor outfile.write(waitfor infile.read(async=True), async=True)
    infile.close()
    outfile.close()

def main():
    a = get_and_save("http://python.org/pics/PyBanner021.gif")
    b = get_and_save("http://python.org/pics/pythonHi.gif")
    c = get_and_save("http://python.org/pics/PythonPoweredSmall.gif")

    waitfor allDone(a, b, c)

if __name__ == "__main__":
    mainloop.runUntil(main())

Well, there you have it. I've glossed over many details, but they can be
cleared up later. What I need to know now is what everybody else thinks
about it. Is this something you would use? Does it seem like the right
way to do it? And of course the all-important one: can I get it into the
Python core? <0.5 wink>
 
Gary D. Duzan

Rhamphoryncus said:

import mainloop, urllib

def get_and_save(path):
    infile = waitfor urllib.urlopen(path, async=True)
    outfile = waitfor open(path.split('/')[-1], async=True)
    waitfor outfile.write(waitfor infile.read(async=True), async=True)
    infile.close()
    outfile.close()

def main():
    a = get_and_save("http://python.org/pics/PyBanner021.gif")
    b = get_and_save("http://python.org/pics/pythonHi.gif")
    c = get_and_save("http://python.org/pics/PythonPoweredSmall.gif")

    waitfor allDone(a, b, c)

if __name__ == "__main__":
    mainloop.runUntil(main())


A while back I tossed something together to deal with the same issue
in terms of "futures" (or "promises"). Here is roughly what the above
code would look like with futures as I implemented them:

###########################################################################
import urllib
from future import future

def get_and_save(path):
    infile = future(urllib.urlopen, path)
    outfile = future(open, path.split('/')[-1])

    def save(infile, outfile):
        outfile().write(infile().read())
        infile().close()
        outfile().close()

    return future(save, infile, outfile)

def main():
    a = get_and_save("http://python.org/pics/PyBanner021.gif")
    b = get_and_save("http://python.org/pics/pythonHi.gif")
    c = get_and_save("http://python.org/pics/PythonPoweredSmall.gif")

    a(), b(), c()

if __name__ == "__main__":
    main()
###########################################################################

The future object initializer always returns immediately, and
the resulting "future" object can be passed around like any other
object. The __call__ method on the future object is used to get
the actual value. (I'm sure this could be avoided these days with
some sort of metaclass magic, but I haven't explored metaclasses
yet.) If the value is ready, it is returned immediately; otherwise,
the accessor's thread is blocked until it is made ready, and then
the value is returned (or the appropriate exception is raised.)
Specifying a callable (and optional parameters) in the initializer
causes resolver threads to be fired off, and the result of the
callable's evaluation in the thread becomes the future's value. If
you don't specify anything, you can arrange to resolve the value
some other way. (This is useful if the value is provided by some
asynchronous mechanism.)

This was all done using plain Python 1.5.2 in 80 lines of code,
including some blank lines and doc strings. Maybe I'll brush off
the code a bit and post it one of these days.
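
In the meantime, a minimal sketch along the lines described above might look
something like this (the real module isn't posted, so the names and details
here are guesses, and it uses the threading module rather than the 1.5.2-era
thread module):

import threading

class future:
    """Computes a value in a background thread; calling the object blocks
    until the value is ready."""

    def __init__(self, func=None, *args):
        self._ready = threading.Event()
        self._value = None
        self._exception = None
        if func is not None:
            # Fire off a resolver thread to evaluate the callable.
            t = threading.Thread(target=self._resolve, args=(func,) + args)
            t.start()

    def _resolve(self, func, *args):
        try:
            self.set(func(*args))
        except Exception as e:
            self.set_exception(e)

    def set(self, value):
        """Resolve the future by hand (used when no callable was given)."""
        self._value = value
        self._ready.set()

    def set_exception(self, exc):
        """Resolve the future with an exception to be raised on access."""
        self._exception = exc
        self._ready.set()

    def __call__(self):
        """Block until the value is ready, then return it (or raise)."""
        self._ready.wait()
        if self._exception is not None:
            raise self._exception
        return self._value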

Gary Duzan
BBN Technologies
 
Rhamphoryncus

Gary said:
A while back I tossed something together to deal with the same issue
in terms of "futures" (or "promises".) Here is roughly what the above
code would look like with futures as I implemented them:

This was all done using plain Python 1.5.2 in 80 lines of code,
including some blank lines and doc strings. Maybe I'll brush off
the code a bit and post it one of these days.

Gary Duzan
BBN Technologies

Hrm. Well to be honest the point is the shallow threads, *not* the
main loop and event notifiers. Otherwise I'd just use blocking calls
in real threads, or maybe just Twisted, and not bother creating
anything new, ya know?
 
Diez B. Roggisch

Hi,

a few questions:
A shallow thread is just a generator modified in the most obvious way
possible. The yield statement is replaced with a waitfor expression.
You give it the object you wish to "wait for". Then when it's ready
you get back a return value or an exception. These waitfor expressions
are the only points where your shallow thread may get suspended, so
it's always explicit. If you call another function it will be treated
exactly as a normal function call from a generator (this is why they're
'shallow'), so there are no issues with calling C functions.

On the other end of things, your shallow thread object would have
__resume__ and __resumeexc__ methods (or perhaps a single __resume__
method with a default raise_=False argument). They return a tuple of
(mode, value), where mode is one of the strings 'waitfor', 'exception', or
'return', and value corresponds to that mode. These methods will be used by
a main loop to resume the shallow thread, like next() is used with a
generator.

So basically a shallow thread could be written in today's Python like this?

class ShallowThread(object):
    def __init__(self, data=None):
        self.data = data

    def run(self, initial_data=None):
        while True:
            ....
            yield <some_state>
            new_data = self.data
            ....

st = ShallowThread().run(<some_initial_data>)
while True:
    result = st.next()
    st.data = <some_subsequent_data>

Basically <some_state> is your above-mentioned tuple, and waitfor is like
yield with a return value, which I modeled with explicit state called data.
Is that correct so far?

I'm having difficulty grasping where the actual work is done - the event
notifier thingies are sort of generators themselves, and the mainloop gets
them and calls some execute method on them?


And now the final $1,000,000.00 question - why all this? No offense
intended - it's a little bit more comfortable than the generator approach
sketched by others (e.g. David Mertz, if I recall correctly) - but in my
view, it _could_ be done in today's Python because we have generators. Or
not?
 
Rhamphoryncus

Diez said:
Hi,

a few questions:


So basically a shallow thread could be written in today's Python like this?

class ShallowThread(object):
    def __init__(self, data=None):
        self.data = data

    def run(self, initial_data=None):
        while True:
            ....
            yield <some_state>
            new_data = self.data
            ....

st = ShallowThread().run(<some_initial_data>)
while True:
    result = st.next()
    st.data = <some_subsequent_data>

Basically <some_state> is your above-mentioned tuple, and waitfor is like
yield with a return value, which I modeled with explicit state called data.
Is that correct so far?

Yes, that's the general idea. I would, however, give the following
example, using Twisted 2.0's deferredGenerator facility (and assuming all
the functions I depend on are still modified the same way, just for a
fair comparison).

from twisted.internet.defer import *
from twisted.internet import reactor
import urllib

def get_and_save(path):
    thingy = waitForDeferred(urllib.urlopen(path, async=True))
    yield thingy
    infile = thingy.getResult()
    thingy = waitForDeferred(open(path.split('/')[-1], async=True))
    yield thingy
    outfile = thingy.getResult()
    thingy = waitForDeferred(infile.read(async=True))
    yield thingy
    data = thingy.getResult()
    thingy = waitForDeferred(outfile.write(data, async=True))
    yield thingy
    thingy.getResult()  # Still needed to raise exceptions
    infile.close()
    outfile.close()
get_and_save = deferredGenerator(get_and_save)

def main():
    a = get_and_save("http://python.org/pics/PyBanner021.gif")
    b = get_and_save("http://python.org/pics/pythonHi.gif")
    c = get_and_save("http://python.org/pics/PythonPoweredSmall.gif")

    thingy = waitForDeferred(allDone(a, b, c))
    yield thingy
    thingy.getResult()
main = deferredGenerator(main)

if __name__ == "__main__":
    d = main()
    d.addBoth(lambda _: reactor.stop())
    reactor.run()

For comparison, here is the original shallow-thread version again:

import mainloop, urllib

def get_and_save(path):
    infile = waitfor urllib.urlopen(path, async=True)
    outfile = waitfor open(path.split('/')[-1], async=True)
    waitfor outfile.write(waitfor infile.read(async=True), async=True)
    infile.close()
    outfile.close()

def main():
    a = get_and_save("http://python.org/pics/PyBanner021.gif")
    b = get_and_save("http://python.org/pics/pythonHi.gif")
    c = get_and_save("http://python.org/pics/PythonPoweredSmall.gif")

    waitfor allDone(a, b, c)

if __name__ == "__main__":
    mainloop.runUntil(main())


Diez asked:
I'm having difficulty grasping where the actual work is done - the event
notifier thingies are sort of generators themselves, and the mainloop gets
them and calls some execute method on them?

It depends. For files, the mainloop would have something like a
select-based polling loop. Once something is ready, it would set the
appropriate event notifier to the "done" state and notify everything
waiting on it.
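
As a rough illustration of that polling step (reusing the hypothetical
set_result interface sketched earlier for event notifiers; none of this is a
real module), one iteration might look something like this:

import select

def poll_reads(read_waiters, timeout=0.1):
    """read_waiters maps file descriptors to (file, notifier) pairs."""
    if not read_waiters:
        return
    ready, _, _ = select.select(read_waiters.keys(), [], [], timeout)
    for fd in ready:
        f, notifier = read_waiters.pop(fd)
        try:
            data = f.read()   # a real loop would use non-blocking reads
        except Exception as e:
            notifier.set_result(e, is_exception=True)
        else:
            notifier.set_result(data)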

Diez also asked:
And now the final $1,000,000.00 question - why all this? No offense
intended - it's a little bit more comfortable than the generator approach
sketched by others (e.g. David Mertz, if I recall correctly) - but in my
view, it _could_ be done in today's Python because we have generators. Or
not?

Go and compare the two get_and_save functions. Would you call that a
"little bit more comfortable"? Now imagine the function was 20 lines to
begin with, involving a couple of loops and some list comprehensions.
Then, just for kicks, imagine it without generators, turned instead into a
state machine.

Yes, you can *technically* do it using generators (and they're what make
the implementation easy). But you could also do it using a state machine.
That doesn't mean it's practical, though.
 
