Callable generators (PEP 288: Generator Attributes, again)

Francis Avila

A little annoyed one day that I couldn't use the statefulness of
generators as "resumable functions", I came across Hettinger's PEP 288
(http://www.python.org/peps/pep-0288.html, still listed as open, even
though it's at least a year old and Guido doesn't seem very hot on the
idea). I'm not too sure of its ideas on raising exceptions in
generators from outside (although it looks like it might be convenient
in some cases), but being able to pass names into generators is
definitely something that seems natural and worth having. There's
currently no easy way to statefully consume an iterator/generator.

I'm sorry if I'm raising a dead issue, but it *is* still "Draft", and
with generator expressions on the way, it might be worth bringing up
again.

I'm mostly developing the ideas found in this thread, started by Bengt
Richter nearly a year ago:
http://groups.google.com/groups?hl=...+group%3Acomp.lang.python&ie=ISO-8859-1&hl=en
(If that's too much, google 'generator group:comp.lang.python')

Also on python-dev:
http://www.python.org/dev/summary/2002-11-16_2002-11-30.html
(The thread fizzled out without any clear resolution.)

The basic thing I picked up from the threads above is that generator
attributes are too class-like for generators, and that value-passing
to generators should have more function-like semantics. Justification
and proposed syntax/semantics follow.

Beware: hand-waving generalizations follow.

In the various wranglings in this and other threads about the
semantics of passing values into generators, one thing that didn't get
mentioned much was to think of generators as "resumable functions"
instead of "simple class instances". The idea of using generator
attributes is the "generators as class instances" view, and as many
mentioned, the semantics don't quite fit. If we think of generators
as resumable functions, a lot of more natural value-passing semantics
suggest themselves.

Functions have a parameter list that assigns values to locals; the
function does something with those locals and then returns something.
The parameter list has some rich functionality, with default args and
keyword args, but basically that's all a function is. This is in contrast to
classes, which are not only stateful, but have multiple methods (which
can be called in any order) and make constant use of their own
attributes (which functions can't easily access at all). Classes have
much more free-form control flow. Functions model algorithms, whereas
classes model things.

Like class instances, generators are stateful, have an initialization,
and are produced by a factory (the function definition with a yield in
the body for generators, the class object for instances). In all
other respects, however, I think they're much more like functions.
Like functions, generators have (conceptually) no methods: they just
continuously return. Also like functions, they generally model an
algorithm, not a thing. Also, they can't access their own attributes.

Generator initialization is already modeled (the parameter list of the
function definition, and the body of the function up to the first
yield). Generator returning is, too (the next() method). It's the
function's callability that is *not* modeled.

Functions dump passed values into a local namespace according to a
parameter list. Let's make generators do the same:

def echo():
    while True:
        yield something

# Add some syntactic sugar to do the following if you like.
echo.parameter_list = 'something=1'

# Here's an idle suggestion:
#     def echo() (something=1): ...
# I.e.: def <name> ( <initial_parameter_list> ) ( <consumer_param_list> ):

# Although really I think generator definitions should have gotten their
# own keyword instead of overloading 'def'.  Oh well.

cecho = echo()
# cecho.next() raises NameError here.
cecho()        # yields 1
cecho(10)      # yields 10
# The argument default overwrites 'something' in the local namespace:
cecho()        # yields 1
cecho('abc')   # yields 'abc'
# next() bypasses the function-like locals() updating:
cecho.next()   # yields 'abc'

I.e.: If called as a function, update the local namespace, then yield.
If called as an iterator, just yield.

As things are now, this is easier said than done, of course: the local
namespace of generators (functions, too) is read-only. (Could anyone
go into the reasons for this? Is it an arbitrary restriction for
safety/optimization/sanity, a severe implementation limitation, or
what?)
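
For the curious, here's a quick demonstration of the restriction (my own
example, run under CPython 2.3; gi_frame.f_locals is only a snapshot of
the running frame's locals, so writing to it changes nothing):

def g():
    x = 1
    while True:
        yield x

gen = g()
print gen.next()                # -> 1
gen.gi_frame.f_locals['x'] = 5  # appears to succeed...
print gen.next()                # -> 1, not 5: the write never reaches the real local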

Here is a pie-in-the-sky implementation of the above semantics (which
looks an awful lot like currying):

import types

class consumer(object):
    """consumer(func, *args, **kargs) -> callable generator

    func must return a generator.
    func.parameter_list gives the calling semantics of the generator.
    """
    def __init__(self, func, *args, **kargs):
        self._gen = func(*args, **kargs)

        # The following check isn't very useful because generator-like
        # things don't have a common base class.  What we really need
        # to do is check co_flags of func's code object.
        if not isinstance(self._gen, types.GeneratorType):
            raise TypeError, "func must return a generator."

        try:
            params = func.parameter_list
        except AttributeError:
            params = ''

        exec 'def flocals(%s): return locals()' % params
        self.flocals = flocals

    def __call__(self, *args, **kargs):
        # This doesn't give very good error messages.  Ideally, they
        # would be identical to those given by calling functions with
        # badly formed arguments.
        newlocals = self.flocals(*args, **kargs)
        self._gen.gi_frame.f_locals.update(newlocals)  # doesn't work
        return self._gen.next()

    def next(self):
        return self._gen.next()

(Notice that there's nothing in the above that would *require* the
generator's parameter list remain static...but forget I said that.)

To get around the f_locals read-only problem, I tried recasting func
with new globals (using 'new.function(func.func_code,
self.my_globals, ...)'), where self.my_globals is a dict-like object
providing an "in-between" namespace between locals and globals. I
couldn't figure out how to get __builtins__ lookups to work, and anyway, this approach
ultimately fails if the generator ever assigns to these names in its
local namespace--a pretty serious restriction. Then generators would
require declaring all such argument names global, or never assigning
to them at all. And they'd never be able to access like-named
globals.

So, we're back to the __self__ mess, like before (i.e., all arguments
must be attributes of __self__). At least that can be implemented
fairly easily (there's a recipe at aspn).
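
For concreteness, here is a minimal sketch of that kind of workaround (not
the recipe itself; the explicit namespace object and the name 'ns' are my
own, standing in for the PEP's magic __self__):

class Namespace(object):
    pass

def echo(ns):
    # The generator reads its "arguments" off the shared namespace each pass.
    while True:
        yield ns.something

ns = Namespace()
ns.something = 1
cecho = echo(ns)
print cecho.next()   # yields 1
ns.something = 10
print cecho.next()   # yields 10

It works, but the caller and the generator have to agree on a shared mutable
object out-of-band, which is exactly the two-namespace bookkeeping I'd like
to avoid.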

Notice that these consumer-generators share something of classes and
something of functions, but fill in the "function" half that simple
generators didn't quite fill. (They only filled the 'class' half by
being stateful.)

As for raising exceptions in a generator from the outside, that I'm
not too sure of, but I can't think of any argument against it. And
here's one for it: A consumer-generator can be used as a simple state
machine, which has a start and end condition. The start condition is
set up by the initial generator-factory call, the run conditions by
the call/yield of the generator, but without being able to raise an
exception in the generator, we need to overload the calling semantics
to signal an end state. Being able to raise an exception within the
generator would make that cleaner. Still, I think it's a separate
consideration.

Personally, I'd like to one day be able to do stupid things like this:
(0, 1, 2, 3, 4)
 
Michele Simionato

(e-mail address removed) (Francis Avila) wrote in message
I looked at that PEP a few months ago and came up with an iterator class.
Here it is:

"""An object-oriented interface to iterators-generators"""

class Iterator(object):
"""__gen__ is automatically called by __init__, so must have signature
compatibile with __init__. Subclasses should not need to override __init__:
you can do it, but you must do it cooperatively or, at least, ensure that
__gen__ is called correctly and its value assigned to self.iterator.
"""
def __init__(self,*args,**kw):
super(Iterator,self).__init__(*args,**kw)
self.iterator=self.__gen__(*args,**kw)
def __gen__(self,*args,**kw):
"Trivial generator, to be overridden in subclasses"
yield None
def __iter__(self):
return self
def next(self):
return self.iterator.next()

class MyIterator(Iterator):
def __gen__(self):
self.x=1
yield self.x # will be changed outside the class
yield self.x

iterator=MyIterator()

print iterator.next()
iterator.x=5
print iterator.next()

Wrapping the generator in the class, I can pass parameters to it (in
this case x). IOW, here the generator has an explicit "self" rather
than an implicit "__self__" as in the PEP. I am not sure if I like the
PEP, wouldn't be easier to have a built-in iterator class?


Michele Simionato
 
Aahz

A little annoyed one day that I couldn't use the statefulness of
generators as "resumable functions", [...]

<raised eyebrow> But generators *are* resumable functions; they just
don't permit injection of new values into their state. As Michele
pointed out, it's easy enough to wrap a generator in a class if you want
to monitor the changing state of an attribute.

The problem with injecting values is that there's no way to pick them
up; it's a "push" solution rather than a "pull" solution.
 
Francis Avila

Michele Simionato wrote in message
(e-mail address removed) (Francis Avila) wrote in message
I looked at that PEP a few months ago and came up with an iterator class.
Here it is:

"""An object-oriented interface to iterators-generators"""

class Iterator(object):
"""__gen__ is automatically called by __init__, so must have signature
compatibile with __init__. Subclasses should not need to override __init__:
you can do it, but you must do it cooperatively or, at least, ensure that
__gen__ is called correctly and its value assigned to self.iterator.
"""
def __init__(self,*args,**kw):
super(Iterator,self).__init__(*args,**kw)
self.iterator=self.__gen__(*args,**kw)
def __gen__(self,*args,**kw):
"Trivial generator, to be overridden in subclasses"
yield None
def __iter__(self):
return self
def next(self):
return self.iterator.next()

class MyIterator(Iterator):
def __gen__(self):
self.x=1
yield self.x # will be changed outside the class
yield self.x

iterator=MyIterator()

print iterator.next()
iterator.x=5
print iterator.next()

Wrapping the generator in the class, I can pass parameters to it (in
this case x). IOW, here the generator has an explicit "self" rather
than an implicit "__self__" as in the PEP. I am not sure I like the
PEP; wouldn't it be easier to have a built-in iterator class?
Michele Simionato


I'm suggesting the PEP's functionality, not its syntax and semantics. My
contention is that the PEP regards generators as too class-like, when they
are more naturally considered as function-like.

For example, your iterator class/instance would look like this:

def iterator(x=1)(x):
    yield x
    yield x

print iterator.next() # -> 1
print iterator(5)     # -> 5

The local name "x" is updated (according to the second parameter list in the
function definition) right after the yield of the previous call when
iterator is called, behaving like a state-persistent callable function. If
it's just "nexted", it behaves like a plain old iterator.

Here's what the complete example in the PEP would look like (without
generator exceptions):

def filelike(packagename, appendOrOverwrite) (dat, flush=False):
    data = []
    if appendOrOverwrite == 'w+':
        data.extend(packages[packagename])
    while not flush:
        data.append(dat)
        yield None
    packages[packagename] = data

ostream = filelike('mydest', 'w')
ostream(firstdat)
ostream(firstdat)
ostream('', flush=True)

Note that without exceptions, we need to overload the calling interface and
call with dummy data, which will never go into the stream.

On the other hand, it has a consistent (no "magic attributes") and obvious
interface (just look at the first line of its definition). If its called
incorrectly, it fails like a function call would, without disturbing the
generator. Plus you get default args and keyword args and all that good
stuff.

Now, it's not as if you can't do any of this without classes--but it's much
shorter and less cumbersome, because there's no need to deal with two
different namespaces and make sure they're in sync.
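
For comparison, here is roughly what a class-wrapped version of the filelike
example might look like today (my sketch only; 'packages' and 'firstdat' are
assumed stand-ins for the objects in the PEP's example):

packages = {'mydest': []}   # stand-in for the PEP's package store
firstdat = 'first chunk'    # stand-in data

class FileLike(object):
    def __init__(self, packagename, appendOrOverwrite):
        self.dat = None
        self.flush = False
        self.iterator = self.__gen__(packagename, appendOrOverwrite)
    def __gen__(self, packagename, appendOrOverwrite):
        data = []
        if appendOrOverwrite == 'w+':
            data.extend(packages[packagename])
        while not self.flush:
            data.append(self.dat)
            yield None
        packages[packagename] = data
    def write(self, dat, flush=False):
        # Keep the instance attributes and the generator's view in sync by hand.
        self.dat, self.flush = dat, flush
        try:
            self.iterator.next()
        except StopIteration:
            pass

ostream = FileLike('mydest', 'w')
ostream.write(firstdat)
ostream.write(firstdat)
ostream.write('', flush=True)
print packages['mydest']    # -> ['first chunk', 'first chunk']

Note the two namespaces: the caller talks to self.dat and self.flush, the
generator body reads them back out, and write() is there only to keep the
two in step.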

Here's an efficient reversible generator:

def count(n=0)(step=1):
    while True:
        yield n
        n += step

c = count()
c.next()  # 0
c.next()  # 1
c(-1)     # 0
c.next()  # -1
# Now turn it into repeat():
c(0)      # -1
c.next()  # -1
# And of course n and step can be anything for which n+step makes sense.

I was thinking of writing an RPN calculator example with these, but for such
simple state machines there isn't an appreciable advantage over classes
wrapping a generator. Classes are more of a hassle, but not much more.
I'll have to find an example that's more complex; it may be a hard sell even
then, but I just thought I could do better than what the PEP and the
discussions about it suggested.
 
Francis Avila

Aahz wrote in message ...
A little annoyed one day that I couldn't use the statefulness of
generators as "resumable functions", [...]

<raised eyebrow> But generators *are* resumable functions; they just
don't permit injection of new values into their state.

I see then that I don't need to convince you. :) But it is because you
can't inject new values into their state that they are not resumable
functions. They're pure state, not functions-with-persisting-state. If
they were resumable functions, we could call them like functions and be
returned values based upon passed parameters, except that the algorithm used
would depend upon the generator's internal state.

Now, the above sounds like a class, doesn't it? But yet we think of
generators as functions. My post was an attempt to explain why this is so,
and argue for a set of semantics based on that perception (because the PEP
seems to want to regard them as classes).

Generators are funny because they share properties of classes and functions,
but I think they're more function-like than class-like, and that's why
class-like interfaces (i.e. generator attributes) are a strange fit.
As Michele
pointed out, it's easy enough to wrap a generator in a class if you want
to monitor the changing state of an attribute.


True--you simply have to use 'self' to access public values, as the PEP
suggests. The only advantage the PEP has is that we don't need to wrap
generators in a class to gain access to public attributes--the only
reason for the wrapper class anyway is to give a namespace between global
and local. Anyone can do that, so why do we need a magic __self__
attribute? So I agree, I don't like the PEP's specific proposal, but I
still like the functionality it's attempting to provide.
The problem with injecting values is that there's no way to pick them
up; it's a "push" solution rather than a "pull" solution.

You mean "pull" rather than "push"?

Well, I just suggested a way to pick them up which is no different than how
a function picks up parameters--they're pushed in rather than pulled in, by
overwriting the local namespace before advancing the generator's state.
Would you care to comment on my suggestion?
 
Greg Chapman

Personally, I'd like to one day be able to do stupid things like this:
(0, 1, 2, 3, 4)

Here's one way of getting a resumable function with parameters:

def echo(defvalue=1):

    def gen():
        args = [None]

        def callgen(arg=defvalue):
            args[0] = arg
            return gennext()
        yield callgen

        while True:
            value, = args
            yield value

    gennext = gen().next
    return gennext()
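
Used like this, for example (this part is my own illustration, not part of
the original recipe):

e = echo()      # echo() runs gen() up to 'yield callgen' and returns callgen
print e()       # -> 1      (the default)
print e(10)     # -> 10
print e()       # -> 1      (the default applies again on each call)
print e('abc')  # -> 'abc'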

You can elaborate on this by abstracting out the generator caller (callgen) with
something like the following (which also allows throwing an exception into a
generator, provided the generator adheres to the convention of unpacking its
arguments when resumed from a yield):

class GenArgs(object):
    def __init__(self):
        self.args = ()
        self.exc = None
    def __iter__(self):
        if self.exc:
            raise self.exc
        return iter(self.args)
    def __call__(self):
        if self.exc:
            raise self.exc
        return self.args

class CallGen(object):
    def __init__(self, gen, genargs, numargs, defaults):
        assert numargs >= 0 and len(defaults) <= numargs
        self.gen = gen
        self.gennext = gen.next
        self.genargs = genargs
        self.numargs = numargs
        self.defaults = defaults
    def __call__(self, *cargs):
        numargs = self.numargs
        numcargs = len(cargs)
        if numcargs < numargs:
            diff = numargs - numcargs
            defaults = self.defaults
            if diff <= len(defaults):
                cargs = cargs + defaults[-diff:]
                numcargs = numargs
        if numargs != numcargs:
            raise TypeError('%s takes %d arguments (%d given)' % (
                self.gen.gi_frame.f_code.co_name, numargs, numcargs
            ))
        self.genargs.args = cargs
        return self.gennext()
    def throw(self, exc):
        self.genargs.exc = exc
        try:
            return self.gennext()
        except StopIteration:
            return None

def makeCallGen(gen, numargs, defaults=()):
    genargs = GenArgs()
    return CallGen(gen, genargs, numargs, defaults), genargs

# Here's echo again with the above:

def echo(defvalue=1):
    def gen():
        while True:
            value, = args
            yield value

    callgen, args = makeCallGen(gen(), 1, (defvalue,))
    return callgen
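
And again a quick illustration of how the wrapped version behaves (my own
example; the throw() call relies on the generator hitting the 'value, = args'
unpacking when it resumes):

e = echo()
print e()                       # -> 1      (missing arg filled in from defvalue)
print e('abc')                  # -> 'abc'
print e.throw(StopIteration())  # raised inside gen at the unpack; returns None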
 
Michele Simionato

Francis Avila said:
I'm suggesting the PEP's functionality, not its syntax and semantics. My
contention is that the PEP regards generators as too class-like, when they
are more naturally considered as function-like.

For example, your iterator class/instance would look like this:

def iterator(x=1)(x):
    yield x
    yield x

print iterator.next() # -> 1
print iterator(5)     # -> 5

The local name "x" is updated (according to the second parameter list in the
function definition) right after the yield of the previous call when
iterator is called, behaving like a state-persistent callable function. If
it's just "nexted", it behaves like a plain old iterator.

I see what you mean, now. Still, the notation is confusing, since I
think of an iterator as something which is obtained by "instantiating"
a generator. On the other hand, your iterator(5) would not be a newly
"instantiated" iterator, but the same iterator with a different
parameter x. So, you would need a different syntax, as for instance
iterator[5]. Still, I do think this would be confusing. The class
solution would be more verbose but clearer.
Here's an efficient reversible generator:

def count(n=0)(step=1):
    while True:
        yield n
        n += step

Yeah, it is lightweight, but a class derived from "Iterator"
would be only one line longer, so I am not convinced. Also,
people would start thinking that you can define regular
functions with two sets of arguments, and this would generate
a mess ...

Michele
 
Aahz

Aahz wrote in message ...
A little annoyed one day that I couldn't use the statefulness of
generators as "resumable functions", [...]

<raised eyebrow> But generators *are* resumable functions; they just
don't permit injection of new values into their state.

I see then that I don't need to convince you. :) But it is because you
can't inject new values into their state that they are not resumable
functions. They're pure state, not functions-with-persisting-state.
If they were resumable functions, we could call them like functions
and be returned values based upon passed parameters, except that the
algorithm used would depend upon the generator's internal state.

Enh. Depends on how one looks at it. They *are* functions with
persisting state; the state stays around until the generator's iterator
exits. Sounds to me like you're trying to conflate two different
mechanisms for managing function state.
You mean "pull" rather than "push"?

No, I do mean push.
Well, I just suggested a way to pick them up which is no different
than how a function picks up parameters--they're pushed in rather than
pulled in, by overwriting the local namespace before advancing the
generator's state. Would you care to comment on my suggestion?

The problem is that a function declares what parameters are to be pushed
in, so that from the standpoint of the function, it really is a pull
solution. When you instantiate a generator iterator, the bytecodes in
the function do *not* get executed until the first call to the iterator's
next() method. What's supposed to happen when someone pushes additional
values in before the first call to next()? How is the function supposed
to declare what values may be pushed in? You're simply not going to
persuade Guido that arbitrary values should be pushed.
 
Bengt Richter

Francis Avila said:
I'm suggesting the PEP's functionality, not its syntax and semantics. My
contention is that the PEP regards generators as too class-like, when they
are more naturally considered as function-like.

For example, your iterator class/instance would look like this:

def iterator(x=1)(x):
    yield x
    yield x

print iterator.next() # -> 1
print iterator(5)     # -> 5

The local name "x" is updated (according to the second parameter list in the
function definition) right after the yield of the previous call when
iterator is called, behaving like a state-persistent callable function. If
it's just "nexted", it behaves like a plain old iterator.

I see what you mean, now. Still, the notation is confusing, since I
think of an iterator as something which is obtained by "instantiating"
a generator. On the other hand, your iterator(5) would not be a newly
"instantiated" iterator, but the same iterator with a different
parameter x. So, you would need a different syntax, as for instance
iterator[5]. Still, I do think this would be confusing. The class
solution would be more verbose but clearer.

<rant>
I really like generators, but I am not really happy with the magic transmogrification
of an ordinary function into a generator-instance factory as a magic side effect
of defining a state machine with yields in the function body. Sorry to complain
about a past decision, but one effect was also to exclude yield-ing in a subroutine
of the generator (which would in general require suspending a stack of frames, but would
open a lot of possibilities, ISTM).
</rant>

But, trying to live with, and extend, current realities, we can't call the generator
factory function name again to pass values into the generator, since that merely creates
another generator. But if we have

gen = factory(some_arg)

where factory could be e.g., (I'm just using 'factory' instead of 'iterator' for semantics)

def factory(x=1): yield x; yield x

we could potentially call something other than gen.next(), i.e., if gen had a gen.__call__ defined,

gen(123)

could be a way to pass data into the generator. But what would it mean w.r.t. backwards compatibility
and what would be different? Obviously the calls to gen.next() generated by

for x in factory(123): print x

would have to work as now. But consider, e.g.,

for x in itertools.imap(factory(), 'abc'): print x

presumably it would be equivalent to

gen = factory()
for x in itertools.imap(gen, 'abc'): print x

and what we'd like is to have some local variables (just x in this case) set to whatever
is passed through gen.__call__(...) arg list, in this case successively 'a', 'b', and 'c'.

Since gen is the product of a factory (or implicit metaclass operating during instantiation
of the function??) it should be possible to create a particular call signature for gen.__call__.

One syntactic possibility would be as Francis suggested, i.e.,

def iterator(x=1)(x):
    yield x
    yield x

but, BTW, I guess Francis didn't mean

print iterator.next() # -> 1
print iterator(5) # -> 5

since it is not iterator that has the .next method, but rather the returned
generator. So unless I misunderstand it would have to be

gen = iterator()
print gen.next() # -> 1
print gen(5) # -> 5

IIRC, this is what I was getting at originally (among other things;-)

If we live with the current parameter list definition mechanism, we are forced
to pass default or dummy values in the initial factory call, even if we intend
to override them with the very first call to gen('something'). We have the option
of using them through next(), but we don't have to. I.e.,

gen = iterator()
print gen('hoo') # -> hoo (default 1 overridden)
print gen('haw') # -> haw

also, incidentally,

gen = iterator()
print gen('hoo') # -> hoo (default 1 overridden)
print gen.next() # -> hoo (x binding undisturbed)

That doesn't seem so hard to understand, if you ask me. Of course,
getting gen.__call__ to rebind the local values requires real changes,
and there are some other things that might make usage better or worse
depending -- e.g., how to handle default parameters. On the call to the
factory, defaults would certainly act like now, but on subsequent gen() calls,
it might make sense to rebind only the explicitly passed parameters, so the default
would effectively be the existing state, and you could make short-arg-list calls. Hmmm...

This might be a good combination use of defaults in the factory call to set initial
defaults that then are optionally overridden in the gen() calls. I kind of like that ;-)

My way would be

def count(step=1, count=0):
    while True:
        yield count
        count += step

ctr = count()
ctr()    # -> 0
ctr()    # -> 1
ctr()    # -> 2
ctr(-1)  # -> 1
ctr()    # -> 0
ctr()    # -> -1
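
Something close to these semantics can be faked today by routing the
arguments through a shared dict and only updating the keys that were
actually passed (a sketch, not a proposal; the wrapper class and names
below are mine):

class CallableGen(object):
    def __init__(self, genfunc, **state):
        self.state = state                 # namespace the generator reads from
        self._gen = genfunc(self.state)
    def __call__(self, **updates):
        self.state.update(updates)         # rebind only what was explicitly passed
        return self._gen.next()
    def next(self):
        return self._gen.next()

def _count(ns):
    n = ns['count']
    while True:
        yield n
        n += ns['step']

ctr = CallableGen(_count, step=1, count=0)
print ctr()          # -> 0
print ctr()          # -> 1
print ctr()          # -> 2
print ctr(step=-1)   # -> 1
print ctr()          # -> 0
print ctr()          # -> -1

But of course that's keyword-only, the defaults live in the factory call
rather than in the def, and you don't get real function-call argument
checking -- which is the point of wanting gen.__call__ in the first place.
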
Yeah, it is lightweight, but a class derived from "Iterator"
would be only one line longer, so I am not convinced. Also,
people would start thinking that you can define regular
functions with two sets of arguments, and this would generate
a mess ...

Well, what do you think of the above?
BTW, thanks Francis for the citation ;-)

PS, I wonder if a co_yield would be interesting/useful? (Think PDP-11 jsr pc,@(sp)+ ;-)

Regards,
Bengt Richter
 
Francis Avila

Michele Simionato wrote in message
"Francis Avila" <[email protected]> wrote in message
I'm suggesting the PEP's functionality, not its syntax and semantics. My
contention is that the PEP regards generators as too class-like, when they
are more naturally considered as function-like.

For example, your iterator class/instance would look like this:

def iterator(x=1)(x):
    yield x
    yield x

print iterator.next() # -> 1
print iterator(5)     # -> 5

The local name "x" is updated (according to the second parameter list in the
function definition) right after the yield of the previous call when
iterator is called, behaving like a state-persistent callable function. If
it's just "nexted", it behaves like a plain old iterator.

I see what you mean, now. Still, the notation is confusing, since I
think of an iterator as something which is obtained by "instantiating"
a generator. On the other hand, your iterator(5) would not be a newly
"instantiated" iterator, but the same iterator with a different
parameter x. So, you would need a different syntax, as for instance
iterator[5].

I'm sorry, it was supposed to be 'def MyIterator', then 'iterator =
MyIterator()' as Bengt points out.

What I mean is this:

def generator_factory(<generator_factory's parameter list>)(<returned generator's param list>):
    # blah blah
    yield None

I'm not married to this syntax, but I do think it's important that it be
possible to give the function and the generator different param lists, and
that the generator be iterable *or* callable. Also, I *think* (but am not
absolutely sure--must look more deeply at code and function objects....)
that Python *must* know about *both* parameter lists at compile time, so
that LOAD_FASTs will work properly, and so that the generator can have
references to the names in co_varnames to update those names in *its* param
list (to update the locals).
 
