What makes an iterator an iterator?


Steven D'Aprano

7stud said:
Can you explain some of the details of why this code fails: ...
    def next(self):
        for word in "Norwegian Blue's have beautiful plumage!".split():
            yield word

Sure, easily: a loop like "for x in y:" binds an unnamed temporary
variable (say _t) to iter(y) and then repeatedly calls _t.next() [or to
be pedantic type(_t).next(_t)] until that raises StopIteration.

Calling a generator, such as this next method, returns an iterator
object; calling it repeatedly returns many such iterator objects, and
never raises StopIteration, thus obviously producing an unending loop.
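
For anyone following along, here is a rough sketch of what that description amounts to. It is written with Python 3 spellings, where the method is named __next__ and is normally called through the builtin next(); the helper name manual_for is made up for illustration and is not something the interpreter exposes.

```python
def manual_for(iterable, body):
    """Hypothetical helper: run body(item) the way a for loop would."""
    _t = iter(iterable)          # done once, before the first pass
    while True:
        try:
            item = next(_t)      # Python 3 spelling of _t.next()
        except StopIteration:
            break                # the only normal way out of the loop
        body(item)


seen = []
manual_for("Norwegian Blue".split(), seen.append)
print(seen)
```

If next(_t) never raises StopIteration, the while loop above has no way to end, which is the unending loop described in the quoted answer.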

Thank you for that answer Alex, even though I didn't ask the question I
was wondering the same thing myself.
 

Steve Holden

7stud said:
Hi,

Thanks for the responses.
7stud said:
Can you explain some of the details of why this code fails:
---
class Parrot(object):
    def __iter__(self):
        return self
    def __init__(self):
        self.next = self.next().next
    def next(self):
        for word in "Norwegian Blue's have beautiful plumage!".split():
            yield word

P = Parrot()
for word in P:
    print word
------

...a loop like "for x in y:" binds an unnamed temporary
variable (say _t) to iter(y) and then repeatedly calls _t.next() [or to
be pedantic type(_t).next(_t)] until that raises StopIteration.


Aiiii. Isn't this the crux:
repeatedly calls....[type(_t).next(_t)]

As far as I can tell, if the call was actually _t.next(), the code I
asked about would work. However, the name look up for 'next' when you
call:
[snip wild goose chase that appears to miss the main point].

It's nothing to do with the name lookup. Alex mentioned that to remind
us that the magic "double-under" names are looked up on the type rather
than the instance, so messing around with instances won't change the
behavior. [This is not true of "old-style" or "classic" classes, which
we should be eschewing in preparation for their disappearance].

You have to understand the iterator protocol, which is how the language
interacts with objects whose contents it iterates over (for example in
for loops). When you iterate over an object X then the *interpreter*,
under the hood, initializes the loop by calling iter(X) and stashing the
result away as, let's say, _t. Every time a new value is needed in the
iteration _t.next() is called to produce it.

We can see this if we open a file:

Calling the file's .next() method produces the next line in the file.
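
A small sketch of the file case, assuming Python 3 spellings (next(f) rather than f.next()). To stay self-contained it uses io.StringIO as a stand-in for a file opened with open(); both kinds of object act as their own iterator and hand back one line per call.

```python
import io

# io.StringIO is a stand-in for a real file object (an assumption made
# here so the example needs no file on disk).
f = io.StringIO("first line\nsecond line\n")

same_object = iter(f) is f   # a file is its own iterator
line1 = next(f)              # Python 3 spelling of f.next()
line2 = next(f)

try:
    next(f)
    exhausted = False
except StopIteration:
    exhausted = True         # no lines left, so the protocol says stop

print(repr(line1), repr(line2), exhausted)
```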

The point is that a function with "yield" expressions in it, when
called, returns a generator object. So if an instance's next() method
contains yield statements then repeated calls to it give you an
(endless) sequence of generator objects.

Here's a simple object class that adheres to the iterator protocol:

...     def __init__(self, lim):
...         self.lim = lim
...         self.current = 0
...     def __iter__(self):
...         return self
...     def next(self):
...         self.current += 1
...         if self.current > self.lim:
...             raise StopIteration
...         return self.current # NOT yield!
...
...     print i
...
1
2
3
4
5

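
The class statement and the for line of that session did not survive in this copy, so here is a runnable sketch along the same lines. The class name CountTo is invented for illustration, and the method is defined as __next__ (the Python 3 name) with next kept as an alias for the Python 2 spelling used in this thread.

```python
class CountTo(object):
    """Hypothetical name; counts 1..lim and then stops."""

    def __init__(self, lim):
        self.lim = lim
        self.current = 0

    def __iter__(self):
        return self              # the object is its own iterator

    def __next__(self):
        self.current += 1
        if self.current > self.lim:
            raise StopIteration  # tells the for loop to finish
        return self.current      # a plain return, NOT yield

    next = __next__              # Python 2 spelling of the same method


result = list(CountTo(5))
print(result)
```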
I hope this helps. You appear to be forming a rather over-complex model
of the way Python behaves. Think "simple" - Python tries to be as simple
as it can to achieve its objectives.

regards
Steve
 

Terry Reedy

| >
| > One very good way to get an iterator from an iterable is for .__iter__ to
| > be a generator function.
|
| Ahhh. That eliminates having to deal with next().next constructs.

There never was any 'having' to deal with such a thing. I suggest
completely forgetting .next().next and variants.
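
A minimal sketch of the approach quoted above, with __iter__ written as a generator function. The class reuses the Parrot name and sentence from earlier in the thread; the sentence parameter is an addition made for illustration.

```python
class Parrot(object):
    def __init__(self, sentence="Norwegian Blue's have beautiful plumage!"):
        self.sentence = sentence   # parameter added for illustration

    def __iter__(self):
        # Calling this generator function returns a fresh iterator each
        # time, so no next method is needed on Parrot itself.
        for word in self.sentence.split():
            yield word


p = Parrot()
first_pass = list(p)
second_pass = list(p)   # a second loop starts over with a new generator
print(first_pass)
```

Because every call to iter(p) builds a new generator, the object can be looped over more than once, which an object that returns self from __iter__ cannot do.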

| Nice.
|
| > snip all examples of bad code that violate the iterator rule
| > by improperly writing .next as a generator function
|
| What iterator rule states that .next can't be a generator function?

How about the programming rule that a function should return what you want
it to return, or at least something 'close'?

I gave the iterator rule in brief at the top of my posting, but here it is
again with several more words: the next method of an iterator for an actual
or virtual collection has no input other than self (and internally stored
information). Each time it is called, its output is either 'another'
object in the collection, if there is at least one, or a StopIteration
exception. For sequences, 'another' must be the next item after the last
one returned (if any). Whether or not duplicates are allowed depends on
the collection type.

Each call of a generator function returns a new generator object. It never
raises StopIteration. So making .next a generator function defines the
collection as an infinite virtual collection (sequence or multiset) of
generators. If that is what is intended (which it is not in the examples
posted), fine. Otherwise, it is a mistake.

| My book says an iterator is any object with a .next method that is
| callable without arguments (Python in a Nutshell(p.65) says the same
| thing).

A complete interface specification specifies information flows in both
directions, as I did before and again here.

| I've read recommendations that an iterator should additionally contain
| an __iter__() method, but I'm not sure why that is. In particular PEP
| 234 says: [snip] should [snip]

In my view, the 'should' should be taken strongly, so that the iterator is
also an iterable. It is certainly idiomatic to follow the advice. Then one
can write code like

def f(iterable):
    iterator = iter(iterable)

instead of

def f(iterable):
    try: iterator = iter(iterable)
    except TypeError: iterator = iterable  # iter() raises TypeError, not AttributeError

Terry Jan Reedy
 

7stud

It's nothing to do with the name lookup. Alex mentioned that to remind
us that the magic "double-under" names are looked up on the type rather
than the instance...

P.next()

vs.

type(P).next()

Where is the "double-under" name?
 

7stud

P.next()

vs.

type(P).next()

Where is the "double-under" name?

The following example looks up next() in the instance not the class:

class Parrot(object):
    def __init__(self):
        self.next = lambda: "hello"
    def __iter__(self):
        return self
    def next(self):
        yield "goodbye"
        raise StopIteration

p = Parrot()
_t = iter(p)
print _t.next()

-----output: ----
hello
 

Steve Holden

7stud said:
P.next()

vs.

type(P).next()

Where is the "double-under" name?
Try and stick with the main party. Here is the exact exchange (between
Steven D'Aprano and Alex) to which I referred:
The special methods need to be on the type -- having attributes of those
names on the instance doesn't help (applies to all special methods in
the normal, aka newstyle, object model; legacy, aka classic, classes,
work by slightly different and not entirely self-consistent semantics).
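
A sketch of that distinction in Python 3, where the method really does carry a double-under name (__next__). The class mirrors the Parrot experiment posted above; nothing here is taken from Alex's own code. An explicit attribute access finds the instance attribute, while the builtin next(), which is what a for loop effectively uses, goes to the type.

```python
class Parrot(object):
    def __init__(self):
        # Instance attribute with the special name; only explicit
        # attribute access will ever see it.
        self.__next__ = lambda: "hello"

    def __iter__(self):
        return self

    def __next__(self):
        return "goodbye"


p = Parrot()
explicit = p.__next__()   # ordinary lookup: the instance attribute wins
implicit = next(p)        # special-method lookup: uses type(p).__next__
print(explicit, implicit)
```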

So if you have a beef, it would appear to be with Alex?

regards
Steve
 

Alex Martelli

7stud said:
repeatedly calls....[type(_t).next(_t)]

As far as I can tell, if the call was actually _t.next(), the code I
asked about would work. However, the name look up for 'next' when you

All of your following lamentations are predicated on this assumption, it
would seem. Very well, let's put it to the test, as Python makes it so
easy..:
....     yield 23
....
....     print _t.next()
....
<generator object at 0x7c080>
<generator object at 0x7c080>
<generator object at 0x7c080>
<generator object at 0x7c080>
<generator object at 0x7c080>
<generator object at 0x7c080>
<generator object at 0x7c080>
<generator object at 0x7c080>
<generator object at 0x7c080>
<generator object at 0x7c080>
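
Most of the session above was lost in this copy, so what follows is only a sketch of the same kind of test, not a reconstruction of the original code. The class name Broken is invented, and the loop is capped at three calls so that it ends. The method is called explicitly as _t.next(), so the example behaves the same under Python 2 and Python 3.

```python
import types


class Broken(object):
    """Hypothetical class whose next is (wrongly) a generator function."""

    def __iter__(self):
        return self

    def next(self):
        yield 23


_t = Broken()
# Three explicit calls; a real for loop would keep going forever.
results = [_t.next() for _ in range(3)]

all_generators = all(isinstance(r, types.GeneratorType) for r in results)
print(all_generators, next(results[0]))
```

Each call hands back a new generator object rather than 23, and none of the calls raises StopIteration. The repeated address in the output above is presumably because each generator was discarded before the next one was created; here they are kept in a list, so they are distinct objects.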

See? a loop that calls _t.next() and only terminates when that raises
StopIteration will never exit when _t.next is a generator function (as
opposed to *the result of calling* a generator function, which is an
*iterator object*). This applies when _t.next is looked up on the
instance, as we ensured we were in this toy example, just as much as
when it's coming from type(_t) -- which is why in this context the
distinction is pedantic, and why it was definitely not crucial, as you
appear (on the basis of this false idea you have) to think it was, in
the Nutshell (where I went for readability rather than pedantic
precision in this instance).

If you want to implement a class whose next method internally uses a
generator, that's pretty easy, e.g.:

class sane(object):
    def __init__(self, sentence='four scores and twenty years ago'):
        def agenerator():
            for word in sentence.split(): yield word
        self._thegen = agenerator()
    def __iter__(self): return self
    def next(self): return self._thegen.next()

there -- easy, innit? You just have to understand the difference
between calling a generator function, and calling the next method of the
object (iterator) which is returned by that function when called:).
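
For readers on Python 3, a sketch of the same class with the newer spellings (__next__ and the builtin next()), plus a short usage check. The capitalised name Sane and the next alias are choices made for this example only.

```python
class Sane(object):
    def __init__(self, sentence='four scores and twenty years ago'):
        def agenerator():
            for word in sentence.split():
                yield word
        # Call the generator function ONCE and keep the iterator it returns.
        self._thegen = agenerator()

    def __iter__(self):
        return self

    def __next__(self):
        # Delegate to the stored iterator; its StopIteration propagates.
        return next(self._thegen)

    next = __next__   # Python 2 spelling, as used in the thread


s = Sane()
words = list(s)
again = list(s)   # the stored generator is spent, so nothing is left
print(words, again)
```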


Alex
 

Alex Martelli

Steven D'Aprano said:
Thank you for that answer Alex, even though I didn't ask the question I
was wondering the same thing myself.

You're welcome -- refreshing to see somebody who can actually understand
and accept the answer, rather than going on unrelated tangents when one
tries to help them:).


Alex
 

Steven D'Aprano

class sane(object):
    def __init__(self, sentence='four scores and twenty years ago'):
        def agenerator():
            for word in sentence.split(): yield word
        self._thegen = agenerator()
    def __iter__(self): return self
    def next(self): return self._thegen.next()


Nice technique! I always forget about nesting functions like that. I
should use it more often.
 
