Thoughts on PEP284

Tim Hochberg

M-a-S said:
I guess this should be a matter of optimization. Why doesn't the Python
compiler recognize 'for <var> in [x]range(<from>,<till>):'? It should be
pretty easy. Microsoft does marvels optimizing loops in C. Why can't an
open source project like Python do it?

FWIW, Psyco recognizes this structure and removes the overhead
associated with it.

-tim
 
Stephen Horne

Careful: in Stephen's proposal, slicing would NOT work on *INTEGERS* --
rather, it would work on the *INT TYPE ITSELF*, which is a very, very
different issue.

Exactly - thanks for that.

Actually, Sean Ross posted a prototype implementation which makes a
good point - there's no reason why the sliceable object needs to be the
standard 'int' type in Python as it is now. Maybe a library extended
int type, imported only if used, makes sense. Maybe this is a recipe
or library proposal rather than a language proposal.

Also, one extra feature is that the loop can be infinite (which range
and xrange cannot achieve)...

for i in int [0:] :

No way, Jose -- now THAT would break things (in admittedly rare
cases -- a method expecting a list and NOT providing an upper bound
in the slicing). I'd vote for this to be like int[0:sys.maxint+1]
(i.e., last item returned is sys.maxint).

True - but boundless generators are already in use.

An implicit sys.maxint(ish) upper bound seems wrong to me. I would
either go with allowing the infinite loop or requiring an explicit
upper bound.

What Stephen's proposal lacks is rigorous specs of what happens for all
possible slices -- e.g int[0:-3] isn't immediately intuitive, IMHO;-).

You're right in that this was not intended as a full formal proposal.
But let's see what I can do...

To me, it should be possible to create slices for negative ranges as
easily as positive ranges, so the convention of -1 giving the last
item wouldn't apply.

I already had to look at this issue when I wrapped some C++ container
classes for Python recently (unreleased at present due to the need for
further Pythonicising) - the whole point of doing so was that the
containers allow some more power and flexibility compared with the
Python ones. For instance, my set and (dictionary-like) map classes
are conveniently and efficiently sliceable. But how should the slicing
work?

BTW - there is a price for flexibility - I was very surprised at the
relative speed of a dictionary (at least an order of magnitude faster
than my map) though to be fair I haven't done even trivial
optimisation stuff yet. No, these aren't STL containers.

Anyway, back to the point...

I could have treated slices as specifying subscripts (while the data
structure is *not* a sorted array, it does support reasonably
efficient subscripting) but I felt the upper and lower bounds should
be key values rather than subscripts. But when basing the bounds on
keys, a bound of '-1' logically means a bound with the key value of -1
- not the highest subscript value.

The 'step' value is slightly inconsistent in that it had to be a
subscript-like step (stepping over a specific number of items rather
than a specific key range) as the keys don't always have a sensible
way to interpret these steps. But I'm drifting from the point again.

As a 'set of integers' would IMO logically be sliced by key too, '-1'
should just be a key like any other key, giving...
[-5, -4, -3, -2, -1]
[0, -1, -2, -3, -4]

So to me, the slice should be evaluated as follows...

if step is None : default step to 1
if step == 0 : raise IndexError

if start is None :
    either default to zero or raise IndexError, not sure

if stop is None :
    either raise IndexError or...

    if step > 0 :
        default stop to +infinity
    else :
        default stop to -infinity

Obviously that isn't a real implementation ;-)
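
For readers who want to try the rules above, here is a minimal sketch in present-day Python 3. It is not Stephen's or Sean Ross's code. The names xint and _SliceableMeta are hypothetical, and where the pseudocode says "not sure" the sketch simply picks one option: a missing start defaults to zero and a missing stop counts forever.

```python
import itertools


class _SliceableMeta(type):
    """Hypothetical metaclass: lets the class itself (not its instances) be sliced."""

    def __getitem__(cls, key):
        if not isinstance(key, slice):
            raise TypeError("only slice notation is supported")
        step = 1 if key.step is None else key.step
        if step == 0:
            raise IndexError("slice step cannot be zero")
        # Assumption: a missing start defaults to zero rather than raising.
        start = 0 if key.start is None else key.start
        if key.stop is None:
            # Assumption: a missing stop means an unbounded iterator.
            return itertools.count(start, step)
        # Bounds are treated as plain key values, so -1 is just the integer -1.
        return iter(range(start, key.stop, step))


class xint(int, metaclass=_SliceableMeta):
    """Library-style extended int; the built-in int is left untouched."""


print(list(xint[-5:0]))
print(list(xint[0:-5:-1]))
print(list(itertools.islice(xint[0:], 3)))
```

Under these assumptions xint[-5:0] and xint[0:-5:-1] produce the two lists quoted earlier in this post, and xint[0:] behaves like an unbounded generator.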
 
Stephen Horne

FWIW, Psyco recognizes this structure and removes the overhead
associated with it.

I went looking for Psyco yesterday, and all I could find was broken
links. Was SourceForge just having a bad day, or is there a new site
that hasn't made it into Google yet?
 
Stephen Horne

Here's a quick hack of an int class that supports iteration using the slice
notation, plus simple iteration on the class itself (for i in int: ...).
This can certainly be improved, and other issues need to be addressed, but I
just wanted to see what Stephen's idea would look like in practice. (Not
bad. Actually, pretty good.)

I like it ;-)

Also, it kind of suggests that maybe a recipe or a library 'xint'
class or C extension module could do the job as well as a language
change.
 
Stephen Horne

I'd rather have the missing upper bound be an error in this case (and
rely on itertools.count for very explicit building of infinitely
looping iterators) than "easily create infinities" in response to
typos;-).

I probably agree - convenience is nice, but easy-to-make errors are
somewhat less nice.

A compulsory upper bound is probably a good idea. Though maybe
explicitly recognising the strings "+inf" and "-inf" in slice.stop
would be reasonable?
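
One way the compulsory bound with explicit "+inf" and "-inf" markers might look, as a rough Python 3 sketch. The function name int_slice is hypothetical, and returning an empty iterator when the infinite direction disagrees with the step is an assumption chosen to mirror how range treats unreachable bounds.

```python
import itertools


def int_slice(start, stop, step=1):
    """Hypothetical helper: stop is compulsory but may be '+inf' or '-inf'."""
    if step == 0:
        raise IndexError("step cannot be zero")
    if stop is None:
        # A forgotten bound is an error rather than a silent infinity.
        raise IndexError("an explicit stop is required")
    if stop in ("+inf", "-inf"):
        # Assumption: an unreachable infinity yields nothing, like an empty range.
        if (stop == "+inf") != (step > 0):
            return iter(())
        return itertools.count(start, step)
    return iter(range(start, stop, step))


print(list(int_slice(0, 3)))
print(list(itertools.islice(int_slice(0, "+inf"), 3)))
print(list(itertools.islice(int_slice(0, "-inf", -1), 3)))
```

The infinite case still has to be asked for by name, so a typo that drops the bound raises instead of looping forever.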
 
Andrew Koenig

Stephen> I doubt the need for exclusive ranges in integer for loops. I
Stephen> also doubt the need for switching between different range
Stephen> systems (inclusive, exclusive, half-open). IMO there is more
Stephen> confusion than anything down those routes.

Really? I would expect a common usage to be:

for 0 <= index < len(list):
    do something with list[index]
 
David Eppstein

Andrew Koenig said:
Stephen> I doubt the need for exclusive ranges in integer for loops. I
Stephen> also doubt the need for switching between different range
Stephen> systems (inclusive, exclusive, half-open). IMO there is more
Stephen> confusion than anything down those routes.

Really? I would expect a common usage to be:

for 0 <= index < len(list):
    do something with list[index]

Isn't that what the new enumerate(list) is for?
 
Stephen Horne

Stephen> I doubt the need for exclusive ranges in integer for loops. I
Stephen> also doubt the need for switching between different range
Stephen> systems (inclusive, exclusive, half-open). IMO there is more
Stephen> confusion than anything down those routes.

Really? I would expect a common usage to be:

for 0 <= index < len(list):
    do something with list[index]

We have half-open already. I was commenting on the need for supporting
several *different* schemes. Basically...

I don't see the need to support this case...

for 0 < index < len(list) : # ie exclusive

And think that supporting one or the other of these two would be
sufficient...

for 0 <= index < len(list) : # ie half-open
for 0 <= index <= len(list) : # ie inclusive

And can I think of any languages that support a variety of cases?

C, C++, Java etc support both half-open and inclusive with a simple
change of the continuation condition operator.

Pascal, Modula-2, Ada etc. seem to stick with inclusive, IIRC.

These 'limitations' don't really seem to cause a problem, though.


That said, with all this reverse iteration stuff we've been discussing
recently, there is a point to make. If half-open ranges are common,
then the 'reverse' half-open case may be useful too...

for len(list) > index >= 0 :

It's basically a case of symmetry. It avoids the need for all those
'-1' corrections we've been stressing about just recently.
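
To make the symmetry concrete, here is how the two half-open cases look in present-day Python 3 without the PEP 284 syntax. This is only an illustration: the list items is made up, and reversed(range(...)) is today's spelling, not something from the PEP.

```python
items = ["a", "b", "c"]

# Forward half-open: 0 <= index < len(items)
forward = list(range(len(items)))

# Reverse half-open: len(items) > index >= 0, with the explicit -1 corrections
backward_manual = list(range(len(items) - 1, -1, -1))

# The same indices, letting reversed() take care of the corrections
backward = list(reversed(range(len(items))))

print(forward, backward_manual, backward)
```

Both reverse spellings visit the same indices; the second avoids the two hand-written -1 adjustments.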


Well, who says I can't have second thoughts.

Still not keen on the syntax, though. As exclusive ranges have no
apparent frequent use, and rewriting inclusive ranges as half-open
ranges is not really a problem, we really only need to support the
two half-open cases. And that is really what all the backward
iteration stuff is about.
 
Alex Martelli

David Eppstein wrote:
...
Really? I would expect a common usage to be:

for 0 <= index < len(list):
    do something with list[index]

Isn't that what the new enumerate(list) is for?

Not necessarily. enumerate is for when you need the values of both index
AND somelist[index], which is clearly a pretty common case. But sometimes
you don't care about the "previous value" of somelist[index]. E.g., say
that your specs are:

if issospecial(index) returns true, then, whatever the previous value
of somelist[index] might have been, it must be replaced with beeble.next().

Then, expressing this as:

for index in range(len(somelist)):
    if issospecial(index):
        somelist[index] = beeble.next()

looks rather better to me than:

for index, item in enumerate(somelist):
    if issospecial(index):
        somelist[index] = beeble.next()

where the fact that 'item' is being so deliberately REQUESTED... and
then utterly IGNORED... makes me wonder if the code's author may not
have committed some typo, or something.
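
For anyone who wants to run the index-only version, here is a self-contained Python 3 sketch. The bodies of issospecial and beeble are invented for illustration (the post does not define them), and next(beeble) is the Python 3 spelling of beeble.next().

```python
import itertools


def issospecial(index):
    # Hypothetical predicate: treat even indices as "special".
    return index % 2 == 0


# Hypothetical source of replacement values: 100, 101, 102, ...
beeble = itertools.count(100)

somelist = ["a", "b", "c", "d"]
for index in range(len(somelist)):
    if issospecial(index):
        # The old value at this position is never read, only replaced.
        somelist[index] = next(beeble)

print(somelist)
```

With these stand-ins the list ends up as [100, 'b', 101, 'd'], and no unused loop variable is needed.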


Admittedly, these cases where you don't CARE about the previous values
of items ARE rare enough that enumerate use-cases vastly outnumber them.


Alex
 
