PEP 289: Generator Expressions (please comment)

Holger Krekel

Ville said:
Why? Does it break backwards compatibility somehow?

OK ok. Probably only backwards readability :)
As far as readability goes, genexps seem to be pretty much as
readable/unreadable as list comprehensions.

Yes, i don't like those much, either. But then i am part of that strange
minority that likes map, filter, lambda and passing functions/callables all
around.

But actually generators i do appreciate a lot so i'll probably not only
get used to generator expressions but will enjoy them (more than list
comprehensions at least, i guess).

cheers,

holger
 
John Burton

Raymond Hettinger said:
Peter Norvig's creative thinking triggered renewed interest in PEP 289.
That led to a number of contributors helping to re-work the pep details
into a form that has been well received on the python-dev list:

http://www.python.org/peps/pep-0289.html


This seems a well thought out and useful enhancement to me.
Personally I like the syntax as proposed and would vote yes if this was the
kind of thing we got votes on!
 
Skip Montanaro

Rainer> I don't, because using 'yield' here would lead to greater
Rainer> confusion when using generator expressions in generator
Rainer> functions.

There's also the mental confusion about what the yield is doing. Is it
yielding a value back from the function containing the generator expression
or yielding values from the generator expression to the current function?
One of the earliest proposals on python-dev used the yield keyword. I fell
into this exact hole. It took a message or two from others participating in
the thread to extricate myself.

Skip
 
Skip Montanaro

Holger> Although most people on python-dev and here seem to like this
Holger> PEP I think it generally decreases readability. The above
Holger> constructs seem heavy and difficult to parse and thus i am
Holger> afraid that they interfere with the perceived simplicity of
Holger> Python's syntax.

Of course, you can do all of these things today by just turning the args
into list comprehensions:

sum([x*x for x in roots])
min([d.temperature()*9/5 + 32 for d in days])
Set([word.lower() for word in text.split() if len(word) < 5])
dict([(k, somefunc(k)) for k in keylist])
dotproduct = sum([x*y for x,y in itertools.izip(xvec, yvec)])
bestplayer, bestscore = max([(p.score, p.name) for p in players])

[regarding the proposed generator expressions, not my listcomp examples]
Building the full list in memory can be a sticking point for using list
comprehensions. With generator expressions in place, list comprehensions
become just syntactic sugar for

list(generator expression)

Alex Martelli demonstrated some performance improvements (about 2-to-1 I
think) of generator expressions over the equivalent list comprehensions.
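
A minimal sketch of that equivalence, written for a current Python 3 interpreter rather than the 2.3/2.4 under discussion. The `roots` data is made up for illustration, and nothing here measures the performance difference mentioned above.

```python
# Hypothetical sample data; any iterable of numbers would do.
roots = [1, 2, 3, 4]

# List comprehension: builds the whole list before anything consumes it.
squares_listcomp = [x * x for x in roots]

# Generator expression handed to list(): same resulting list.
squares_genexp = list(x * x for x in roots)

# Generator expression handed straight to sum(): no intermediate list.
total = sum(x * x for x in roots)
```

Both spellings produce the same list; only the last line avoids materialising it.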

Holger> At least I hope that generator expressions will only be
Holger> introduced via a __future__ statement in Python 2.4 much like it
Holger> happened with 'yield' and generators. Maybe the PEP should have
Holger> an "implementation plan" chapter mentioning such details?

I think the plan is to make them available in 2.4. I don't think there's
any plan for a __future__ import because they won't change the semantics of
any existing constructs.

Skip
 
Skip Montanaro

Ville> However, I'm not completely sure about allowing them to be
Ville> created w/o parens in function calls, seems a bit too magical for
Ville> my taste.

Think of parens and genex's the same way you'd think of parens and tuples.
Creating tuples often doesn't require parens either, but sometimes does (in
cases where the syntax would be ambiguous). As arguments to function calls,
parens would be required to separate generator expressions from other
arguments. In the one arg case they aren't required (though you're free to
add them if it warms the cockles of your heart ;-).
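
The rule can be sketched as follows, assuming a current Python 3 interpreter (the thread predates any implementation). The `roots` data is hypothetical, and the `compile` call is only there to show that the unparenthesized two-argument form is rejected.

```python
roots = [1, 2, 3]

# Sole argument: the call's own parentheses are enough.
total = sum(x * x for x in roots)

# Extra parentheses are allowed but change nothing.
total_parens = sum((x * x for x in roots))

# With a second argument, the generator expression needs its own parentheses.
total_with_start = sum((x * x for x in roots), 100)

# Without them, the source does not even compile.
try:
    compile("sum(x * x for x in roots, 100)", "<sketch>", "eval")
    unparenthesized_ok = True
except SyntaxError:
    unparenthesized_ok = False

# Tuples follow the same pattern: parentheses are optional here...
pair = 1, 2
# ...but needed inside a call so the tuple is not read as two arguments.
pair_len = len((1, 2))
```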

Skip
 
Emile van Sebille

Raymond Hettinger revives pep 289:
Peter Norvig's creative thinking triggered renewed interest in PEP 289.
That led to a number of contributors helping to re-work the pep details
into a form that has been well received on the python-dev list:

http://www.python.org/peps/pep-0289.html

In brief, the PEP proposes a list comprehension style syntax for
creating fast, memory efficient generator expressions on the fly:

What do you get from:
type(x*x for x in roots)

I assume you'd get something like genexp that behaves like an iterator, that
the consuming function accesses items until StopIteration (StopGenxing? ;)
is raised, and that a genexp is otherwise a first-class object.
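
For what it's worth, here is how that question can be checked on a current Python 3 interpreter (an assumption, since no implementation existed when this was posted). The `roots` list is made-up sample data.

```python
roots = [1, 2, 3]
squares = (x * x for x in roots)

# The object is an ordinary generator and can be bound to a name.
kind = type(squares).__name__

# It is its own iterator, as the iterator protocol requires.
is_own_iterator = iter(squares) is squares

# Items are handed out one at a time...
first = next(squares)
rest = list(squares)

# ...and StopIteration signals exhaustion.
try:
    next(squares)
    exhausted = False
except StopIteration:
    exhausted = True
```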

From the PEP:
"All free variable bindings are captured at the time this function is
defined, and passed into it using default argument values...[snip]...In
fact, to date, no examples have been found of code where it would be better
to use the execution-time instead of the definition-time value of a free
variable."

Does this imply then that instead of:
dumprec = "\n".join('%s : %s' % (fld.name, fld.val) for fld in rec)
for rec in file: print dumprec

one would need to write some variation of:
for rec in file:
    print "\n".join('%s : %s' % (fld.name, fld.val) for fld in rec)

or:
def dumprec(rec):
    return "\n".join('%s : %s' % (fld.name, fld.val) for fld in rec)
for rec in file: print dumprec(rec)

I find the first construct easier on the eyes, and it fulfills on the
promise of 'generator expression' as opposed to 'iterable generated
expression'. Although it does feel much more macro-ish...
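
One point worth spelling out: in the first construct, "\n".join(...) consumes the generator expression as soon as that line runs, so dumprec would hold a single string (or the line would fail if rec is not yet bound) rather than a reusable recipe. A minimal sketch of the function-based variant in Python 3 syntax follows; the `Field` type and the sample records are hypothetical stand-ins for whatever the file yields.

```python
from collections import namedtuple

# Hypothetical record layout: each record is a sequence of fields.
Field = namedtuple("Field", ["name", "val"])

def dumprec(rec):
    """Format one record as 'name : val' lines."""
    return "\n".join("%s : %s" % (fld.name, fld.val) for fld in rec)

# Made-up data standing in for the records read from a file.
records = [
    [Field("id", 1), Field("city", "Oslo")],
    [Field("id", 2), Field("city", "Rome")],
]

# The generator expression is rebuilt on every call, once per record.
dumps = list(map(dumprec, records))
```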

Overall, +1.


Emile van Sebille
 
sdd

John said:
...(about PEP 289)...

This seems a well thought out and useful enhancement to me.
Personally I like the syntax as proposed and would vote yes
if this was the kind of thing we got votes on!

Ahhh, you do get a vote, but more by reason than numbers.
The PSF is watching this discussion with bated breath.
This whole discussion in c.l.p is part of examining the
merits of the PEP.

Don't tell anyone that I mentioned the inner workings of
the PSF cabal, or ..................................~}~~}
 
Ian McMeans

I really like this. Not only is it less of a special case (than using
[]), but it makes it easy to use comprehensions for other data types.
I think that using the yield keyword makes it more obviously a
generator.

sum(x for x in lst) makes it clear what it will compute, but not that
it will be computed using a generator. sum(yield x for x in lst) makes
it more apparent that you're creating a generator and passing it to
sum.

Raymond Hettinger:

I like it. This made me realize that list comprehension syntax is a
special case that never really needed to be special-cased.

In fact, list comprehensions themselves become redundant syntactic
sugar under the new proposal:

[foo(x) for x in bar] ==> list(foo(x) for x in bar)

Chris Perkins
 
Alex Martelli

Ian said:
I think that using the yield keyword makes it more obviously a
generator.

....but makes it harder to rapidly eyeball if a function which
USES yield is or isn't a generator, which is much more relevant
because it changes all semantics.

I.e., if right now I see a def without further embedded defs and
at a glance I see...:

def f ...
...
... yield ...
...

I already know f is a generator. If the keyword 'yield' was
overloaded for other purposes, I would have to stop and check
out the details very very carefully.
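
A small sketch of that distinction, assuming a current Python 3 interpreter and the standard `inspect` module. The two function names are made up for illustration.

```python
import inspect

def uses_yield(lst):
    # A 'yield' anywhere in the body turns the whole def into a generator.
    for x in lst:
        yield x * x

def uses_genexp(lst):
    # A generator expression inside the body leaves this an ordinary function.
    return sum(x * x for x in lst)

yield_is_generator = inspect.isgeneratorfunction(uses_yield)
genexp_is_generator = inspect.isgeneratorfunction(uses_genexp)

plain_result = uses_genexp([1, 2, 3])
generator_result = list(uses_yield([1, 2, 3]))
```

With the proposed syntax, spotting 'yield' in a def stays a reliable signal that calling it returns a generator.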

sum(x for x in lst) makes it clear what it will compute, but not that
it will be computed using a generator. sum(yield x for x in lst) makes
it more apparent that you're creating a generator and passing it to
sum.

Yeah, but the fact that what I'm passing is (the result from calling)
a generator is a relatively irrelevant implementation detail (which is
why I preferred to call these "iterator expressions", but was overruled,
oh well). All that matters is that I'm passing an _iterator_ that will
behave in a specific way, without O(N) memory consumption, not how that
iterator is internally implemented.
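
The "no O(N) memory" behaviour can be observed without looking at the implementation at all. A minimal sketch, assuming Python 3 and a made-up `source` iterator, shows that items are pulled from the underlying iterable only on demand:

```python
# A shared iterator lets us see how far the genexp has read so far.
source = iter(range(1_000_000))

# Creating the generator expression consumes nothing from source.
lazy = (x * x for x in source)

first = next(lazy)    # pulls 0 from source
second = next(lazy)   # pulls 1 from source

# Only two items have been taken; the next untouched one is 2.
next_unconsumed = next(source)
```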

I'd much rather omit the 'yield' for two reasons, therefore: [a] keep
it easy to see if a function is actually a generator (which does matter
A LOT!), [b] avoid inappropriate focus on implementation details vs
semantical and performance indications.


Alex
 
Bjorn Pettersen

Ian McMeans wrote:

I really like this. Not only is it less of a special case (than using
[]), but it makes it easy to use comprehensions for other data types.
I think that using the yield keyword makes it more obviously a
generator.

sum(x for x in lst) makes it clear what it will compute, but not that
it will be computed using a generator. sum(yield x for x in lst) makes
it more apparent that you're creating a generator and passing it to
sum.

I can't think of a situation where that would be something I wanted to
know, i.e. that would change the way I programmed. Do you have an example?

If not, adding words without a practical purpose just seems... Cobol'ish
<wink>.

-- bjorn
 
Skip Montanaro

daniels> Don't tell anyone that I mentioned the inner workings of the
daniels> PSF cabal, or ..................................~}~~}

You're confusing the PSF (which does exist) with the PSU (whi
 
Bengt Richter

Peter Norvig's creative thinking triggered renewed interest in PEP 289.
That led to a number of contributors helping to re-work the pep details
into a form that has been well received on the python-dev list:

http://www.python.org/peps/pep-0289.html

In brief, the PEP proposes a list comprehension style syntax for
creating fast, memory efficient generator expressions on the fly:

sum(x*x for x in roots)
min(d.temperature()*9/5 + 32 for d in days)
Set(word.lower() for word in text.split() if len(word) < 5)
dict( (k, somefunc(k)) for k in keylist )
dotproduct = sum(x*y for x,y in itertools.izip(xvec, yvec))
bestplayer, bestscore = max( (p.score, p.name) for p in players )

Each of the above runs without creating a full list in memory,
which saves allocation time, conserves resources, and exploits
cache locality.
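
A few of these examples can be tried as written on a current interpreter with small substitutions, which are assumptions on my part: Python 3's built-in `set` and `zip` stand in for `Set` and `itertools.izip`, and the sample data and the `Player` type are made up. The max() result is unpacked score first, matching the tuple order.

```python
from collections import namedtuple

# Hypothetical sample data for the PEP's examples.
Player = namedtuple("Player", ["name", "score"])
players = [Player("ann", 7), Player("bob", 9)]
text = "The quick brown Fox jumps over the lazy dog"
xvec, yvec = [1, 2, 3], [4, 5, 6]

# Lower-cased words shorter than five letters.
short_words = set(word.lower() for word in text.split() if len(word) < 5)

# Dot product, pairing the two vectors lazily.
dotproduct = sum(x * y for x, y in zip(xvec, yvec))

# Tuples compare by first element, so max() picks the highest score.
bestscore, bestplayer = max((p.score, p.name) for p in players)
```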

The new form is highly expressive and greatly enhances the utility
of the many python functions and methods that accept iterable arguments.
+1

Regards,
Bengt Richter
 
Werner Schiendl

Hi,


Rainer said:
At the very least, it would require parentheses if you want to create a list
containing exactly one such generator.

[(x for x in y)] # List containing generator
[x for x in y] # List comprehension

Personally, the parentheses seem a little strange to me, or put another
way, I feel that simple parentheses around "everything" should not
change the meaning of the code.

For example, see current tuple syntax:
.... print type(x)
.... <type 'tuple'>


As the code shows, just putting parentheses around the value 5 doesn't
make it a tuple.
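
The interpreter session above appears to have lost its first lines in extraction. The following is not Werner's original code, just a minimal Python 3 sketch of the same point:

```python
# Parentheses alone do not make a tuple; the trailing comma does.
types = [type(x).__name__ for x in (5, (5), (5,))]

bare = 5
parenthesized = (5)
singleton = (5,)
```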


Another point is that I think it's one of Python's advantages not to
require extensive parentheses, e.g. for 'if', 'while', 'for' etc.

Unlike in C you can just write

if a > 7:
    print "Large enough."


So requiring parentheses around the only item for a list with one
generator in it seems a little inconsistent in my eyes.

Maybe an alternative would be to have something similar to tuples:

[x for x in y, ] # List containing generator
[x for x in y] # List comprehension


Just my 2c

Werner
 
Alex Martelli

Holger Krekel wrote (answering Raymond Hettinger):
...
Actually I think RH meant "bestscore, bestplayer = ..." here...
Although most people on python-dev and here seem to like this PEP I
think it generally decreases readability. The above constructs seem
heavy and difficult to parse and thus i am afraid that they interfere
with the perceived simplicity of Python's syntax.

Let's see: today, I could code:
sum([x*x for x in roots])
In 2.4, I will be able to code instead:
sum(x*x for x in roots)
with some performance benefits. Could you please explain how using
the lighter-weight simple parentheses rather than today's somewhat
goofy ([ ... ]) bracketing "generally decreases readability"...?

Similarly, today's:
bestscore, bestplayer = max([ (p.score, p.name) for p in players ])
I'll be able to code as a somewhat faster:
bestscore, bestplayer = max( (p.score, p.name) for p in players )
Again, I don't see how the proposed new construct is any more "heavy
and difficult to parse" than today's equivalent -- on the contrary,
it seems somewhat lighter and easier to parse, to me.

It may be a different issue (as you appear, from a later message, to
be particularly enamoured of Python's current 'fp' corner) to claim
that, say, yesteryear's (and still-usable):

reduce(operator.add, map(lambda x: x*x, roots))

"increases readability" compared to the proposed

sum(x*x for x in roots)

It boggles my mind to try thinking of the latter as "heavy and
difficult to parse" compared to the former, to be honest. And
when I'm thinking of "the sum of the squares of the roots", I
DO find the direct expression of this as sum(x*x for x in roots)
to be most immediate and obvious compared to spelling it out
as
total = 0
for x in roots:
    total = total + x*x
which feels more like a series of detailed instructions on
how to implement that "sum of squares", while the "sum(x*x"...
DOES feel like the simplest way to say "sum of squares".
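
For a side-by-side check, here are the three spellings in runnable form. This is a sketch for Python 3, where `reduce` has to be imported from `functools` (it was a builtin at the time of this thread); `roots` is made-up sample data.

```python
import operator
from functools import reduce  # a builtin in the Python 2 of this thread

roots = [1, 2, 3, 4]

# The map/lambda/reduce spelling.
fp_total = reduce(operator.add, map(lambda x: x * x, roots))

# The generator expression spelling.
genexp_total = sum(x * x for x in roots)

# The explicit loop.
loop_total = 0
for x in roots:
    loop_total = loop_total + x * x
```

All three compute the same value; they differ only in how directly they say "sum of squares".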

Sometimes it seems that introducing new syntax happens much easier
than improving or adding stdlib-modules.

What a weird optical illusion, hm? In Python 2.3 *NO* new syntax
was introduced (the EXISTING syntax for extended slices is now
usable on more types, but that's not "introducing" any NEW syntax
whatsoever), while HUNDREDS of changes were done that boil down
to "improving or adding standard library modules". So, it's
self-evidently obvious that the "happens much easier" perception
is pure illusion. In 2.4, _some_ syntax novelties are likely to
be allowed -- most PEPs under consideration are already in the
PEP repository or will be there shortly, of course (e.g., syntax
to support some variation of PEP 310, possibly closer to your
own interesting experiments that you reported back in February).
But once again there is absolutely no doubt that many more
changes will be to standard library modules than will "introduce
new syntax".

At least I hope that generator expressions will only be introduced
via a __future__ statement in Python 2.4 much like it happened with
'yield' and generators. Maybe the PEP should have an "implementation
plan" chapter mentioning such details?

I don't think __future__ has EVER been used for changes that do
not introduce backwards incompatibilities, nor do I see why that
excellent tradition should be broken for this PEP specifically.


Alex
 
Alex Martelli

Werner Schiendl wrote:
...
personally the parentheses seem a little strange to me, or put another
way, I feel that simple parentheses around "everything" should not
change the meaning of the code. ...
As the code shows, just putting parentheses around the value 5 doesn't
make it a tuple.

No, but, in all cases where the lack of parentheses would make the
syntax unclear, you do need parentheses even today to denote tuples.
Cf.:
[1,2,3]
versus
[(1,2,3)]
a list of three items, versus a list of one tuple.

Maybe an alternative would be to have something similar to tuples:

The PEP *IS* "similar to tuples": parentheses are mandatory around a
generator expression wherever their lack would make things unclear
(and that's basically everywhere, except where the genexp is the
only argument to a function call). Hard to imagine a simpler rule.

The idea of "adding a comma" like this:
[x for x in y, ] # List containing generator

feels totally arbitrary and ad-hoc to me, since there is NO other
natural connection between genexps and commas (quite differently
from tuples). Moreover, it would not help in the least in other
VERY similar and typical cases, e.g.:

[x for x in y, z,] # ok, now what...?

Today this means [y, z]. Would you break backwards compatibility
by having it mean a [<genexp>] instead? Or ask for a SECOND
trailing comma to indicate the latter? And what about a genexp
followed by z, and, ... ?

No, really, this whole "trailing comma to indicate a genexp in one
very special case" idea is sort of insane. Compare with:

[x for x in y, z,] # same as [y, z]

[(x for x in y, z,)] # the [<genexp>] case

[(x for x in y), z,] # the [<genexp>, z] case

what could possibly be clearer? Not to mention the parens rule
is simple, universally applicable, breaks no backwards compat,
AND is far closer to the general case for tuples ("parens when
needed for clarity") than the "singleton tuple" case you seem
to have focused on.
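
The parenthesized cases can be tried directly on a current Python 3 interpreter (an assumption; `y` and `z` are made-up data). The unparenthesized `[x for x in y, z,]` form is left out because Python 3 rejects it as a syntax error.

```python
y = [1, 2, 3]
z = 10

# Plain list comprehension.
comprehension = [x for x in y]

# A list whose only item is a generator.
one_genexp = [(x for x in y)]

# A generator followed by z.
genexp_then_z = [(x for x in y), z]

# The generator inside the list still yields the items of y.
inner = list(one_genexp[0])
```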


Alex
 
