PEP 288 ponderings

Steven Bethard

PEP 288 was mentioned in one of the lambda threads and so I ended up
reading it for the first time recently. I definitely don't like the
idea of a magical __self__ variable that isn't declared anywhere. It
also seemed to me like generator attributes don't really solve the
problem very cleanly. An example from the PEP[1]:

def mygen():
    while True:
        print __self__.data
        yield None

g = mygen()
g.data = 1
g.next() # prints 1
g.data = 2
g.next() # prints 2

I looked in the archives but couldn't find a good discussion of why
setting an attribute on the generator is preferable to passing the
argument to next. Isn't this example basically equivalent to:

class mygen(object):
    def next(self, data):
        print data
        return None

g = mygen()
g.next(1) # prints 1
g.next(2) # prints 2

Note that I didn't even define an __iter__ method since it's never used
in the example.

Another example from the PEP:

def filelike(packagename, appendOrOverwrite):
    data = []
    if appendOrOverwrite == 'w+':
        data.extend(packages[packagename])
    try:
        while True:
            data.append(__self__.dat)
            yield None
    except FlushStream:
        packages[packagename] = data

ostream = filelike('mydest','w')
ostream.dat = firstdat; ostream.next()
ostream.dat = firstdat; ostream.next()
ostream.throw(FlushStream)

This could be rewritten as:

class filelike(object):
    def __init__(self, packagename, appendOrOverwrite):
        self.packagename = packagename  # kept so close() can store the data
        self.data = []
        if appendOrOverwrite == 'w+':
            self.data.extend(packages[packagename])
    def next(self, dat):
        self.data.append(dat)
        return None
    def close(self):
        packages[self.packagename] = self.data

ostream = filelike('mydest','w')
ostream.next(firstdat)
ostream.next(firstdat)
ostream.close()

So, I guess I have two questions:

(1) What's the benefit of the generator versions of these functions over
the class-based versions?

(2) Since in all the examples there's a one-to-one correlation between
setting a generator attribute and calling the generator's next function,
aren't these generator attribute assignments basically just trying to
define the 'next' parameter list?

If this is true, I would have expected that a more useful idiom would
look something like:

def mygen():
    while True:
        data, = nextargs()
        print data
        yield None

g = mygen()
g.next(1) # prints 1
g.next(2) # prints 2

where the nextargs function retrieves the arguments of the most recent
call to the generator's next function.

With a little sys._getframe hack, you can basically get this behavior now:

py> import sys
py> class gen(object):
....     def __init__(self, gen):
....         self.gen = gen
....     def __iter__(self):
....         return self
....     def next(self, *args):
....         return self.gen.next()
....     @staticmethod
....     def nextargs():
....         return sys._getframe(2).f_locals['args']
....
py> def mygen():
....     while True:
....         data, = gen.nextargs()
....         print data
....         yield None
....
py> g = gen(mygen())
py> g.next(1)
1
py> g.next(2)
2

Of course, it's still a little magical, but I think I like it a little
better because you can see, looking only at 'mygen', when 'data' is
likely to change value...
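
For readers on a current interpreter, here is a rough Python 3 port of the wrapper above. It is only a sketch: it assumes CPython (sys._getframe is an implementation detail), it keeps the wrapper's own next() method rather than defining __next__, and mygen yields the value back instead of printing it so the result can be inspected.

```python
import sys


class gen:
    """Wrap a generator-iterator so that next() can take arguments."""

    def __init__(self, gen):
        self.gen = gen

    def __iter__(self):
        return self

    def next(self, *args):
        # 'args' is a local of this frame; nextargs() reads it back.
        return next(self.gen)

    @staticmethod
    def nextargs():
        # Frame 0 is nextargs(), frame 1 is the generator body,
        # frame 2 is gen.next(), whose local 'args' we want.
        return sys._getframe(2).f_locals['args']


def mygen():
    while True:
        data, = gen.nextargs()
        yield data  # hand the value back so the caller can inspect it


g = gen(mygen())
print(g.next(1))
print(g.next(2))
```

The frame offset of 2 only holds when nextargs() is called directly from the generator body; calling it from a helper function inside the generator would need a different offset.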


Steve

[1] http://www.python.org/peps/pep-0288.html
 
Ian Bicking

Steven said:
> PEP 288 was mentioned in one of the lambda threads and so I ended up
> reading it for the first time recently. I definitely don't like the
> idea of a magical __self__ variable that isn't declared anywhere. It
> also seemed to me like generator attributes don't really solve the
> problem very cleanly. An example from the PEP[1]:
>
> def mygen():
>     while True:
>         print __self__.data
>         yield None
>
> g = mygen()
> g.data = 1
> g.next() # prints 1
> g.data = 2
> g.next() # prints 2

I don't get why this isn't good enough:

def mygen(data):
    while True:
        print data[0]
        yield None

data = [None]
g = mygen(data)
data[0] = 1
g.next()
data[0] = 2
g.next()

Using a one-element list is kind of annoying, because it isn't clear out
of context that it's just a way of creating shared state. But it's
okay, works right now, and provides the exact same functionality. The
exception part of PEP 288 still seems interesting.
 
Nick Coghlan

Ian said:
> Using a one-element list is kind of annoying, because it isn't clear out
> of context that it's just a way of creating shared state. But it's
> okay, works right now, and provides the exact same functionality.

Uh, isn't shared state what classes were invented for?

Py> class mygen(object):
....     def __init__(self, data):
....         self.data = data
....     def __iter__(self):
....         while 1:
....             print self.data
....             yield None
....
Py> g = mygen(0)
Py> giter = iter(g)
Py> giter.next()
0
Py> g.data = 1
Py> giter.next()
1

Cheers,
Nick.
 
Raymond Hettinger

[Steven Bethard]
> (1) What's the benefit of the generator versions of these functions over
> the class-based versions?

Generators are easier to write, are clearer, and run faster.

They automatically:
* create a distinct generator-iterator object upon each invocation
* create the next() and idempotent __iter__() methods
* raise StopIteration upon termination
* remain stopped if next() is called too many times
* save their own local variable state, avoiding the need for self.var references
* resume execution at the point of the last yield
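
To make the list above concrete, here is a small side-by-side sketch in Python 3 (where the method is spelled __next__ rather than next). The names countdown_gen and CountdownIter are made up for illustration and do not come from the PEP or the thread.

```python
def countdown_gen(n):
    # Local state (n) and the resume point are saved automatically.
    while n > 0:
        yield n
        n -= 1
    # Falling off the end raises StopIteration for us.


class CountdownIter:
    """Hand-written equivalent: every item on the list is manual."""

    def __init__(self, n):
        self.n = n  # state must live on self

    def __iter__(self):
        return self  # idempotent __iter__ written by hand

    def __next__(self):
        if self.n <= 0:
            raise StopIteration  # stays stopped on repeated calls
        value = self.n
        self.n -= 1
        return value


print(list(countdown_gen(3)))
print(list(CountdownIter(3)))
```

Both produce the same sequence; the class version simply has to spell out each behavior the generator gets for free.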


> (2) Since in all the examples there's a one-to-one correlation between
> setting a generator attribute and calling the generator's next function,
> aren't these generator attribute assignments basically just trying to
> define the 'next' parameter list?

They are not the same. The generator needs some way to receive the values. The
function arguments cannot be used because they are needed to create the
generator-iterator. The yield statements likewise won't work because the first
yield is not encountered until well after the first next() call.
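
The timing described here can be seen with a short Python 3 sketch. The traced function and the log list are hypothetical names used only for this illustration.

```python
log = []


def traced(label):
    # None of this body runs when traced() is called.
    log.append("started with " + label)
    yield 1
    log.append("resumed")
    yield 2


g = traced("x")   # the argument is consumed here, creating the generator-iterator
print(log)        # still empty: the body has not started
first = next(g)   # only now does the body run, up to the first yield
print(first, log)
```

So the call arguments are used up before any code runs, and the first yield is reached only during the first next() call.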

The given examples are minimal and are intended only to demonstrate the idea.


> I definitely don't like the
> idea of a magical __self__ variable that isn't declared anywhere.

It is no more magical than f.__name__ or f.__doc__ for functions. The concept
is almost identical to that for threading.local(). Also, the __self__ argument
is a non-issue because there are other alternate approaches such as providing a
function that retrieves the currently running generator. Which approach is
ultimately taken is a matter of aesthetics -- the PEP itself concentrates on the
core idea instead of debating syntax.

The real issue with PEP 288's idea for generator attributes is that the current
C implementation doesn't readily accommodate this change. Major surgery would
be required :-(

The more important part of the PEP is the idea for generator exceptions. The
need arises in the context of flushing/closing resources upon generator
termination.
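
As a point of reference, later Python versions did gain a generator throw() method (through PEP 342 rather than PEP 288). Below is a minimal Python 3 sketch of the flush idea under some assumptions: FlushStream and packages are stand-ins for the undefined names in the PEP example, and a shared one-element list (called inbox here) replaces the proposed generator attribute.

```python
class FlushStream(Exception):
    """Hypothetical signal telling the generator to store its data."""


packages = {}  # stand-in for the PEP example's package store


def filelike(packagename, inbox):
    data = []
    try:
        while True:
            data.append(inbox[0])  # read the value shared by the caller
            yield None
    except FlushStream:
        packages[packagename] = data  # flush on request, then finish


inbox = [None]
ostream = filelike('mydest', inbox)
inbox[0] = 'first'; next(ostream)
inbox[0] = 'second'; next(ostream)
try:
    ostream.throw(FlushStream())
except StopIteration:
    pass  # the generator finished after handling the exception

print(packages)
```

Because the generator returns after handling the exception, throw() itself raises StopIteration, which the caller has to catch.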



Raymond Hettinger
 
Steven Bethard

Raymond said:
> [Steven Bethard]
> > (2) Since in all the examples there's a one-to-one correlation between
> > setting a generator attribute and calling the generator's next function,
> > aren't these generator attribute assignments basically just trying to
> > define the 'next' parameter list?
>
> They are not the same. The generator needs some way to receive the values. The
> function arguments cannot be used because they are needed to create the
> generator-iterator. The yield statements likewise won't work because the first
> yield is not encountered until well after the first next() call.

Yeah, I wasn't trying to claim that passing the arguments to .next() is
equivalent to generator attributes, only that the points at which new
values for the generator state variables are provided correspond to
calls to .next(). So if there were a means within a generator of getting
access to the arguments passed to .next(), generator attributes would be
unnecessary for the examples provided.
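
For comparison, later Python versions added a send() method (PEP 342) that hands a value to the generator at each resumption. The Python 3 sketch below is not part of PEP 288; the seen list is a hypothetical stand-in for the print statement so the result can be checked.

```python
def mygen(seen):
    while True:
        data = yield None  # receives the argument of the next send() call
        seen.append(data)


seen = []
g = mygen(seen)
next(g)     # prime the generator: run up to the first yield
g.send(1)
g.send(2)
print(seen)
```

The priming next(g) call is needed because a value can only be received at a yield, and the first yield is not reached until the generator has been started.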

> The given examples are minimal and are intended only to demonstrate the idea.

Do you have an example where the generator state isn't updated in
lock-step with .next() calls? I'd be interested to look at an example
of this...

> It is no more magical than f.__name__ or f.__doc__ for functions.

I'm not sure this is quite a fair parallel. The difference here is that
f.__name__ and f.__doc__ are accessed as attributes of the f object,
and the __name__ and __doc__ attributes are created as a result of
function creation. The proposed __self__ is (1) not an attribute that
becomes available but rather a new binding local to the function, and (2)
not created as a result of generator object creation but created as a
result of calling .next() on the generator object.

> Also, the __self__ argument is a non-issue because there are other alternate
> approaches such as providing a function that retrieves the currently
> running generator.

Is there a discussion of some of these alternate suggested approaches
somewhere you could point me to?

> The more important part of the PEP is the idea for generator exceptions. The
> need arises in the context of flushing/closing resources upon generator
> termination.

I wonder if maybe it would be worth moving this part to a separate PEP.
It seems like it could probably stand on its own merit, and having it
in with the generator attributes PEP means it isn't likely to be
accepted separately.

Of course, I would probably declare a class and provide a .close()
method. =)

Steve
 
