D
dustin
I've been hacking away on this PEP for a while, and there has been some
related discussion on python-dev that went into the PEP:
http://mail.python.org/pipermail/python-dev/2007-February/070921.html
http://mail.python.org/pipermail/python-dev/2007-February/071155.html
http://mail.python.org/pipermail/python-dev/2007-February/071181.html
I'd love to have feedback on this PEP:
- from a user's perspective (would you want to write microthreaded code?)
- from an implementer's perspective (and note that I have a reference
implementation that's just getting packaged up right now).
- from a technical perspective (efficiency, compatibility, etc.)
- regarding the two unresolved issues in the text
And now, without further ado:
PEP: XXX
Title: Standard Microthreading Pattern
Version: $Revision$
Last-Modified: $Date$
Author: Dustin J. Mitchell <[email protected]>
Status: Draft
Type: Informational
Content-Type: text/x-rst
Created: 17-Feb-2007
Post-History:
Abstract
========
The extended support for generators introduced in PEP 342 [2]_
facilitated a natural way of writing generators that cooperatively
yield control to other functions, allowing multitasking within
a single OS thread. Such multitasking requires a supporting
environment (usually in the form of a scheduler), and several such
implementations are in active use. Each has slightly different
structural requirements for the task-implementing code.
This PEP proposes a single, consistent pattern for such code.
This consistency will enable microthreaded code to run in a variety
of environments, just as threaded code can currently run atop a
variety of OS-level threading libraries. Compliant libraries,
e.g., a microthreaded urllib, could then be developed, providing
batteries-included functionality for microthreaded software.
Definitions
===========
Within this PEP, "microthreading" is defined as interleaving
the execution of multiple independent codepaths by cooperatively
yielding control periodically in each codepath; a "microthread",
then, is one such codepath. This technique and variations on it
have also been referred to as "weightless threads", "tasklets", and
"greenlets". It is a specilization of the more general event-based
multitasking model.
Microthreads are conceptually distinct from coroutines. A coroutine
can schedule a specific coroutine to be executed when it yields
control, even passing a value to that coroutine. Microthreads,
however, simply yield to facilitate multitasking, with no knowledge
or control of the next codepath to be executed.
This PEP addresses two audiences: authors of microthreaded code
(users) and designers of microthreaded environments (implementations).
Motivation
==========
Application developers usually adopt an event-based architecture in
order to exploit asynchronous IO facilities. Asynchronous IO is
ideal when the overhead of an OS-level thread for each concurrent
IO operation is too high.
An event-based architecture is also useful when an application must
perform lightweight multitasking -- for example, an email client
must both wait for user input and monitor mailboxes for new mail.
Implementing this sort of multitasking using threads requires careful
use of synchronization primitives throughout the application to
ensure correctness. Using common event-based techniques requires
storing state on the heap and implementing callbacks for all state
transitions, which can quickly become complex.
A microthreaded architecture brings the best of both worlds: it
allows users to store state in local varables, just as they would
in a threaded implementation, while avoiding the need for complex
synchronization primitives. An email client, for example, can
implement the complex state of an IMAP client in a natural fashion,
and reflect that state using natural Python data structures with no
need for locking.
Several well-developed implementations of cooperative multitasking
are currently available, some of which implement microthreading.
However, each implementation approaches the subject differently,
and it is virtually impossible to share code between implementations.
This PEP is intended to bring some measure of consistency to
microthreaded code, to facilitate development of libraries and
implementation-agnostic code repositories to support microthreaded
development.
Rationale
=========
Since the introduction of generators in PEP 255 [3]_ and their
extension to support coroutines in PEP 342 [2]_, generators have been
used to implement various forms of microthreading [1]_:
- Example 3 in PEP 342 implements a simple trampoline scheduler.
- `Kiwi tasklets`_ (formerly GTasklets) use pre-PEP 342 generators
along with an explicit post-yield function call to retrieve any
incoming value or exception.
- `Twisted Python`_ includes an ``inlineCallbacks`` decorator
[#inlineCallbacks]_ which allows a generator to yield ``Deferred``
objects; callback results are then re-injected with ``send()``,
while errbacks result in a ``throw()``.
- `Kamaelia`_ includes Axon, a library supporting microprocesses which
multitask by yielding frequently, and interact through a
well-developed IPC mechanism.
- David Mertz's "Charming Python: Implementing "weightless threads"
with Python generators" [#Mertz]_ includes a basic microthreading
scheduler and suggests directions for improvement.
- An ASPN recipe by Maciej Obarski [#Obarski]_ gives a simple
scheduler for task objects represnted by generators.
Each of these implementations have specific requirements of the
microthreaded code. The requirements differ enough that code written
for one environment will generally not function properly in another.
A common pattern for all microthreaded code will allow users to write
microthreaded code that will be portable to multiple microthreading
implementations. This will include libraries, which can then be
shared more broadly.
The specification in this PEP specifically avoids reference
to any symbols not included in the built-in namespace, so
environment-agnostic microthreaded code need not import any
environment-specific modules.
Implementation Specification
============================
An implementation is responsible for executing a body of code written
by its users according to the microthreading pattern.
Like a standard python thread, execution of a microthread begins with
a call to a single function, and ends when that function completes.
The function may call other functions, and so on; the entire control
flow constitutes a microthread. Completely unrelated control flows
may be occurring simultaneously in other microthreads.
Within a microthreading implementation, all microthreads operate
within a single OS-level thread, so at most one microthread is
executing at any point. The implementation must not switch between
microthreads except at times specified by the user.
A microthreaded function is written as a generator function, with
the points at which other microthreads may be executed indicated
by ``yield``. Normal (non-generator) functions thus cannot be
interrupted.
While a microthreaded function is executing, its execution is
represented by a generator. The implementation is responsible for
ensuring that this generator has its ``next``, ``send``, and ``throw``
methods called appropriately. Specifically, it must call ``next`` to
initiate processing, and ``step`` to execute each subsequent segment.
At the completion of each segment, if the return value of ``next``,
``send``, or ``throw`` is not one of the special values described
below, then that value must be supplied to the generator via ``send``
when it is next scheduled.
When the generator is complete, it will raise a ``StopIteration``
exception, which the implementation must catch and handle by either
resuming execution of the calling function (see `Nested Function
Calls`, below) or terminating the microthread. Other exceptions
must be handled as described in the `Exceptions` section, below.
Nested Function Calls
---------------------
When a microthreaded function yields a generator, then it is invoking
another microthreaded function, the execution of which is represented
by the yielded generator. Execution of the current function must be
suspended until the new function has been executed to completion.
Any return value from the new function must be supplied to the
calling function via its generator's ``send`` method.
Although the underlying implementation may vary, the effect is of
a stack of generators within each microthread, with the topmost
generator representing the state of the currently executing
microthreaded function, and the remaining generators suspended
awaiting its completion.
Return Values
-------------
Microthreaded functions return values to their callers by raising
a ``StopIteration`` exception containing the return value in
``args[0]``. Implementations must catch this exception and handle it
appropriately by supplying the return value to the next generator on
the stack via ``send``, or if there is no next generator, terminating
the microthread.
Exceptions
----------
When a generator raises an exception, that exception must
be propagated to the next generator on the stack via ``throw``.
If there is no next generator, then the implementation may display
a traceback to the user or take other appropriate action.
Implementations may adjust tracebacks to remove implementation-related
frames, but must not alter the exception itself.
Special Values
--------------
Implementations may attach special significance to other yielded
values or raised exceptions, but these values and exceptions must
be unique objects which could not be produced by code written
with no knowledge of the implementation. Specifically, no special
significance may be attached to any of the built-in Python types or
types used in the standard library. Rather, an implementation should
attach meaning to classes and objects local to the implementation's
namespace.
Thread Manipulation
-------------------
Implementations are responsible for providing any appropriate
functionality for manipulating microthreads. This PEP does not
place any restrictions on such functionality.
Pattern Specification
=====================
This section specifies the microthreading pattern from the perspective
of a user. In general, microthreaded code should closely resemble
ordinary threaded code, with the addition of ``yield`` keywords at
strategic locations.
- Microthreaded functions are written as generator functions. Their
execution will not be interrupted except at statements including
the ``yield`` keyword. The value of a ``yield`` expression is
the value of its argument, unless the argument is one of the
special values discussed in this section.
- A microthreaded function can call another microthreaded function by
yielding the generator resulting from calling the microthreaded
function. The value of the ``yield`` expression is the return
value of the called function.
- A microthreaded function can "return" a value by raising
``StopIteration`` with that value as an argument::
raise StopIteration(42)
- An exception raised in a microthreaded function will propagate
to its callers just as in a standard function. Uncaught exceptions
will be handled in an implementation-dependent fashion.
- Functions for thread manipulation are not specified in this PEP,
and are implementation-dependent.
Example
-------
This trivial example highlights each of the aspects of the
multithreaded pattern::
def fibonacci(n):
latest, i = (1, 1), 2
if n < 1:
raise ValueError # raise exception
while i < n:
latest = (latest[1], latest[0] + latest[1])
yield # cooperative yield
raise StopIteration(latest[1]) # return value
def fibsquared(n):
try:
fibn = (yield fibonacci(n)) ** 2 # function call
except ValueError: # catch exception
print "Sorry, cannot calculate fibsquared of", n
else:
print "fibsquared of", n, "is", fibn
Backwards Compatibility
=======================
As this is an informational PEP, no backward compatibility problems
will arise from its adoption. In most cases, this PEP specifies
a superset of the functionality of existing microthreading
enviroments, so existing microthreaded code will continue to run
without modification.
One exception to this rule is Kamaelia's Axon, which specifies
that generators which yield a false value will be terminated.
That situation is not compatible with this PEP, which directs that
such a value should be returned from the yield at the next execution
of the microthread.
The specification of Kiwi tasklets requires that generators
implementing tasklets call ``tasklet.get_event()`` after most yields.
This is not necessary under the specification in this PEP, and the
``get_event()`` function can simply be stubbed out to ensure backward
compatibility.
Unresolved Issues
=================
Return Values
-------------
Currently, a ``return`` statement with a value in a generator is
a syntax error. The Python community should consider supporting
the behavior described above in the Python compiler. Specifically,
the compiler would treat::
return foo
in a generator as equivalent to::
raise StopIteration(foo)
This change raises no backward-compatibility issues (as the former
expression is not currently legal), but may introduce some surprising
behavior as a function could then continue executing after a return
statement::
try:
return 10
except StopIteration:
print "I'm still here!"
Non-Microthreaded Generators
----------------------------
Functions which return "normal" generators may confuse a
microthreading implementation if those generators are accidentally
yielded. Consider::
def get_constants(dict):
return ( k for k in dict.iterkeys() if k[0] in string.uppercase )
def examine_module(mod):
d = mod.__dict__
constants = yield get_constants(d)
this example will fail, because the implementation will treat the
generator expresion in ``get_constants`` as a microthreaded function,
calling its ``send`` method until it is exhaustd, then assigning
``None`` to ``constants`` in ``examine_module``.
The problem only occurs when such a generator is yielded: the
snippet above will operate correctly if ``yield`` is removed from the
last line. Thus users can avoid the problem by exercising caution
in selecting the values that are yielded.
References
==========
... [1] Stackless is not included in this list because, while it does
implement a microthreaded environment, it does so without the use
of generators and (as of this writing) requires a modified Python
binary.
... [2] PEP 342, "Coroutines via Enhanced Generators", van Rossum, Eby
(http://www.python.org/peps/pep-0342)
... [3] PEP 355, "Simple Generators", Schemenauer, Peters, Hetland
(http://www.python.org/peps/pep-0342)
... _Kiwi tasklets:
http://www.async.com.br/projects/kiwi/api/kiwi.tasklet.html
... _Twisted Python: http://twistedmatrix.com/
... [#inlineCallbacks]
http://twistedmatrix.com/documents/current/api/twisted.internet.defer.html
... _Kamaelia: http://kamaelia.sourceforge.net/
... [#Mertz] "Charming Python: Implementing 'weightless threads' with
Python generators," Mertz,
(http://www-128.ibm.com/developerworks/library/l-pythrd.html)
... [#Obarski] "simple, cooperative multitasking using generators,"
Obarski,
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/466008
Copyright
=========
This document has been placed in the public domain.
...
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:
related discussion on python-dev that went into the PEP:
http://mail.python.org/pipermail/python-dev/2007-February/070921.html
http://mail.python.org/pipermail/python-dev/2007-February/071155.html
http://mail.python.org/pipermail/python-dev/2007-February/071181.html
I'd love to have feedback on this PEP:
- from a user's perspective (would you want to write microthreaded code?)
- from an implementer's perspective (and note that I have a reference
implementation that's just getting packaged up right now).
- from a technical perspective (efficiency, compatibility, etc.)
- regarding the two unresolved issues in the text
And now, without further ado:
PEP: XXX
Title: Standard Microthreading Pattern
Version: $Revision$
Last-Modified: $Date$
Author: Dustin J. Mitchell <[email protected]>
Status: Draft
Type: Informational
Content-Type: text/x-rst
Created: 17-Feb-2007
Post-History:
Abstract
========
The extended support for generators introduced in PEP 342 [2]_
facilitated a natural way of writing generators that cooperatively
yield control to other functions, allowing multitasking within
a single OS thread. Such multitasking requires a supporting
environment (usually in the form of a scheduler), and several such
implementations are in active use. Each has slightly different
structural requirements for the task-implementing code.
This PEP proposes a single, consistent pattern for such code.
This consistency will enable microthreaded code to run in a variety
of environments, just as threaded code can currently run atop a
variety of OS-level threading libraries. Compliant libraries,
e.g., a microthreaded urllib, could then be developed, providing
batteries-included functionality for microthreaded software.
Definitions
===========
Within this PEP, "microthreading" is defined as interleaving
the execution of multiple independent codepaths by cooperatively
yielding control periodically in each codepath; a "microthread",
then, is one such codepath. This technique and variations on it
have also been referred to as "weightless threads", "tasklets", and
"greenlets". It is a specilization of the more general event-based
multitasking model.
Microthreads are conceptually distinct from coroutines. A coroutine
can schedule a specific coroutine to be executed when it yields
control, even passing a value to that coroutine. Microthreads,
however, simply yield to facilitate multitasking, with no knowledge
or control of the next codepath to be executed.
This PEP addresses two audiences: authors of microthreaded code
(users) and designers of microthreaded environments (implementations).
Motivation
==========
Application developers usually adopt an event-based architecture in
order to exploit asynchronous IO facilities. Asynchronous IO is
ideal when the overhead of an OS-level thread for each concurrent
IO operation is too high.
An event-based architecture is also useful when an application must
perform lightweight multitasking -- for example, an email client
must both wait for user input and monitor mailboxes for new mail.
Implementing this sort of multitasking using threads requires careful
use of synchronization primitives throughout the application to
ensure correctness. Using common event-based techniques requires
storing state on the heap and implementing callbacks for all state
transitions, which can quickly become complex.
A microthreaded architecture brings the best of both worlds: it
allows users to store state in local varables, just as they would
in a threaded implementation, while avoiding the need for complex
synchronization primitives. An email client, for example, can
implement the complex state of an IMAP client in a natural fashion,
and reflect that state using natural Python data structures with no
need for locking.
Several well-developed implementations of cooperative multitasking
are currently available, some of which implement microthreading.
However, each implementation approaches the subject differently,
and it is virtually impossible to share code between implementations.
This PEP is intended to bring some measure of consistency to
microthreaded code, to facilitate development of libraries and
implementation-agnostic code repositories to support microthreaded
development.
Rationale
=========
Since the introduction of generators in PEP 255 [3]_ and their
extension to support coroutines in PEP 342 [2]_, generators have been
used to implement various forms of microthreading [1]_:
- Example 3 in PEP 342 implements a simple trampoline scheduler.
- `Kiwi tasklets`_ (formerly GTasklets) use pre-PEP 342 generators
along with an explicit post-yield function call to retrieve any
incoming value or exception.
- `Twisted Python`_ includes an ``inlineCallbacks`` decorator
[#inlineCallbacks]_ which allows a generator to yield ``Deferred``
objects; callback results are then re-injected with ``send()``,
while errbacks result in a ``throw()``.
- `Kamaelia`_ includes Axon, a library supporting microprocesses which
multitask by yielding frequently, and interact through a
well-developed IPC mechanism.
- David Mertz's "Charming Python: Implementing "weightless threads"
with Python generators" [#Mertz]_ includes a basic microthreading
scheduler and suggests directions for improvement.
- An ASPN recipe by Maciej Obarski [#Obarski]_ gives a simple
scheduler for task objects represnted by generators.
Each of these implementations have specific requirements of the
microthreaded code. The requirements differ enough that code written
for one environment will generally not function properly in another.
A common pattern for all microthreaded code will allow users to write
microthreaded code that will be portable to multiple microthreading
implementations. This will include libraries, which can then be
shared more broadly.
The specification in this PEP specifically avoids reference
to any symbols not included in the built-in namespace, so
environment-agnostic microthreaded code need not import any
environment-specific modules.
Implementation Specification
============================
An implementation is responsible for executing a body of code written
by its users according to the microthreading pattern.
Like a standard python thread, execution of a microthread begins with
a call to a single function, and ends when that function completes.
The function may call other functions, and so on; the entire control
flow constitutes a microthread. Completely unrelated control flows
may be occurring simultaneously in other microthreads.
Within a microthreading implementation, all microthreads operate
within a single OS-level thread, so at most one microthread is
executing at any point. The implementation must not switch between
microthreads except at times specified by the user.
A microthreaded function is written as a generator function, with
the points at which other microthreads may be executed indicated
by ``yield``. Normal (non-generator) functions thus cannot be
interrupted.
While a microthreaded function is executing, its execution is
represented by a generator. The implementation is responsible for
ensuring that this generator has its ``next``, ``send``, and ``throw``
methods called appropriately. Specifically, it must call ``next`` to
initiate processing, and ``step`` to execute each subsequent segment.
At the completion of each segment, if the return value of ``next``,
``send``, or ``throw`` is not one of the special values described
below, then that value must be supplied to the generator via ``send``
when it is next scheduled.
When the generator is complete, it will raise a ``StopIteration``
exception, which the implementation must catch and handle by either
resuming execution of the calling function (see `Nested Function
Calls`, below) or terminating the microthread. Other exceptions
must be handled as described in the `Exceptions` section, below.
Nested Function Calls
---------------------
When a microthreaded function yields a generator, then it is invoking
another microthreaded function, the execution of which is represented
by the yielded generator. Execution of the current function must be
suspended until the new function has been executed to completion.
Any return value from the new function must be supplied to the
calling function via its generator's ``send`` method.
Although the underlying implementation may vary, the effect is of
a stack of generators within each microthread, with the topmost
generator representing the state of the currently executing
microthreaded function, and the remaining generators suspended
awaiting its completion.
Return Values
-------------
Microthreaded functions return values to their callers by raising
a ``StopIteration`` exception containing the return value in
``args[0]``. Implementations must catch this exception and handle it
appropriately by supplying the return value to the next generator on
the stack via ``send``, or if there is no next generator, terminating
the microthread.
Exceptions
----------
When a generator raises an exception, that exception must
be propagated to the next generator on the stack via ``throw``.
If there is no next generator, then the implementation may display
a traceback to the user or take other appropriate action.
Implementations may adjust tracebacks to remove implementation-related
frames, but must not alter the exception itself.
Special Values
--------------
Implementations may attach special significance to other yielded
values or raised exceptions, but these values and exceptions must
be unique objects which could not be produced by code written
with no knowledge of the implementation. Specifically, no special
significance may be attached to any of the built-in Python types or
types used in the standard library. Rather, an implementation should
attach meaning to classes and objects local to the implementation's
namespace.
Thread Manipulation
-------------------
Implementations are responsible for providing any appropriate
functionality for manipulating microthreads. This PEP does not
place any restrictions on such functionality.
Pattern Specification
=====================
This section specifies the microthreading pattern from the perspective
of a user. In general, microthreaded code should closely resemble
ordinary threaded code, with the addition of ``yield`` keywords at
strategic locations.
- Microthreaded functions are written as generator functions. Their
execution will not be interrupted except at statements including
the ``yield`` keyword. The value of a ``yield`` expression is
the value of its argument, unless the argument is one of the
special values discussed in this section.
- A microthreaded function can call another microthreaded function by
yielding the generator resulting from calling the microthreaded
function. The value of the ``yield`` expression is the return
value of the called function.
- A microthreaded function can "return" a value by raising
``StopIteration`` with that value as an argument::
raise StopIteration(42)
- An exception raised in a microthreaded function will propagate
to its callers just as in a standard function. Uncaught exceptions
will be handled in an implementation-dependent fashion.
- Functions for thread manipulation are not specified in this PEP,
and are implementation-dependent.
Example
-------
This trivial example highlights each of the aspects of the
multithreaded pattern::
def fibonacci(n):
latest, i = (1, 1), 2
if n < 1:
raise ValueError # raise exception
while i < n:
latest = (latest[1], latest[0] + latest[1])
yield # cooperative yield
raise StopIteration(latest[1]) # return value
def fibsquared(n):
try:
fibn = (yield fibonacci(n)) ** 2 # function call
except ValueError: # catch exception
print "Sorry, cannot calculate fibsquared of", n
else:
print "fibsquared of", n, "is", fibn
Backwards Compatibility
=======================
As this is an informational PEP, no backward compatibility problems
will arise from its adoption. In most cases, this PEP specifies
a superset of the functionality of existing microthreading
enviroments, so existing microthreaded code will continue to run
without modification.
One exception to this rule is Kamaelia's Axon, which specifies
that generators which yield a false value will be terminated.
That situation is not compatible with this PEP, which directs that
such a value should be returned from the yield at the next execution
of the microthread.
The specification of Kiwi tasklets requires that generators
implementing tasklets call ``tasklet.get_event()`` after most yields.
This is not necessary under the specification in this PEP, and the
``get_event()`` function can simply be stubbed out to ensure backward
compatibility.
Unresolved Issues
=================
Return Values
-------------
Currently, a ``return`` statement with a value in a generator is
a syntax error. The Python community should consider supporting
the behavior described above in the Python compiler. Specifically,
the compiler would treat::
return foo
in a generator as equivalent to::
raise StopIteration(foo)
This change raises no backward-compatibility issues (as the former
expression is not currently legal), but may introduce some surprising
behavior as a function could then continue executing after a return
statement::
try:
return 10
except StopIteration:
print "I'm still here!"
Non-Microthreaded Generators
----------------------------
Functions which return "normal" generators may confuse a
microthreading implementation if those generators are accidentally
yielded. Consider::
def get_constants(dict):
return ( k for k in dict.iterkeys() if k[0] in string.uppercase )
def examine_module(mod):
d = mod.__dict__
constants = yield get_constants(d)
this example will fail, because the implementation will treat the
generator expresion in ``get_constants`` as a microthreaded function,
calling its ``send`` method until it is exhaustd, then assigning
``None`` to ``constants`` in ``examine_module``.
The problem only occurs when such a generator is yielded: the
snippet above will operate correctly if ``yield`` is removed from the
last line. Thus users can avoid the problem by exercising caution
in selecting the values that are yielded.
References
==========
... [1] Stackless is not included in this list because, while it does
implement a microthreaded environment, it does so without the use
of generators and (as of this writing) requires a modified Python
binary.
... [2] PEP 342, "Coroutines via Enhanced Generators", van Rossum, Eby
(http://www.python.org/peps/pep-0342)
... [3] PEP 355, "Simple Generators", Schemenauer, Peters, Hetland
(http://www.python.org/peps/pep-0342)
... _Kiwi tasklets:
http://www.async.com.br/projects/kiwi/api/kiwi.tasklet.html
... _Twisted Python: http://twistedmatrix.com/
... [#inlineCallbacks]
http://twistedmatrix.com/documents/current/api/twisted.internet.defer.html
... _Kamaelia: http://kamaelia.sourceforge.net/
... [#Mertz] "Charming Python: Implementing 'weightless threads' with
Python generators," Mertz,
(http://www-128.ibm.com/developerworks/library/l-pythrd.html)
... [#Obarski] "simple, cooperative multitasking using generators,"
Obarski,
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/466008
Copyright
=========
This document has been placed in the public domain.
...
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End: