Question about exhausted iterators

Christophe

Is there a good reason why, when you try to take an element from an
already exhausted iterator, it raises StopIteration instead of some other
exception? I've already lost quite some time because I was using a lot
of iterators and forgot that a specific function parameter was one.

Example:

    >>> def f(i):
    ...     print list(i)
    ...     print list(i)
    ...
    >>> f(iter(range(2)))
    [0, 1]
    []

This is using Python 2.4.2.
 
Terry Reedy

Christophe said:
Is there a good reason why, when you try to take an element from an
already exhausted iterator, it raises StopIteration instead of some other
exception?

Yes.
..
..
To distinguish the control message "I am done yielding values, as per the
code specification (so don't bother calling me again)." from error messages
that say "Something is wrong, I cannot yield values and give up." In other
words, to distinguish expected correct behavior from unexpected incorrect
behavior. This is essential for the normal and correct use of iterators.
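
The distinction between "done" and "broken" can be sketched as follows. This is only an illustration, written in Python 3 spelling rather than the Python 2.4 used in this thread; manual_for and broken are hypothetical names. It shows roughly what a for loop does with StopIteration, and that any other exception is left to propagate as a real error.

```python
def manual_for(iterable, body):
    """Roughly what `for item in iterable: body(item)` does under the hood."""
    it = iter(iterable)
    while True:
        try:
            item = next(it)
        except StopIteration:
            # Control message: the iterator is done, leave the loop quietly.
            break
        body(item)


def broken():
    """A generator that fails with a real error after one value."""
    yield 1
    raise ValueError("something is wrong")


seen = []
manual_for([0, 1], seen.append)    # ends normally after two items

try:
    manual_for(broken(), seen.append)
except ValueError as exc:
    error = exc                    # the real error reaches the caller
```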

Christophe said:
I've already lost quite some time because I was using a lot
of iterators and forgot that a specific function parameter was one.

I think you mean 'specific function argument' for a parameter which could
be any iterable.

Christophe said:
Example:

    >>> def f(i):
    ...     print list(i)
    ...     print list(i)
    ...
    >>> f(iter(range(2)))
    [0, 1]
    []

As per specification.
I am guessing that you want the first list() call to terminate normally and
return a list, which requires exhausted i to raise StopIteration, while you
want the second list() to not terminate but raise an exception, which
requires exhausted i to raise something other than StopIteration. Tough.

One solution is to call list(i) exactly once:

    def f(i):
        li = list(i)
        print li
        print li

Another is to document f as requiring that i be a non-iterator reiterable
iterable and only pass correct arguments.

A third is to add a line like

    if iter(i) is i: raise TypeError("input appears to be iterator")

This is not quite exact since it will improperly exclude self-iterator
reiterables (which, I believe, no builtin is) and improperly pass
non-reiterable non-iterator iterables (at least some file objects). But it
might work for all your cases.
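
A minimal sketch of that third option, in Python 3 spelling (an assumption; the thread uses Python 2.4). The helper name require_reiterable is hypothetical, and the check has the same limitations described above: it relies only on the fact that an iterator returns itself from iter().

```python
def require_reiterable(i):
    """Reject arguments that are their own iterator (hypothetical helper)."""
    if iter(i) is i:
        raise TypeError("input appears to be an iterator")
    return i


def f(i):
    require_reiterable(i)
    # Safe to traverse twice: i produces a fresh iterator each time.
    return list(i), list(i)


first, second = f(range(2))        # a reiterable passes the check

try:
    f(iter(range(2)))              # an iterator is rejected up front
except TypeError as exc:
    message = str(exc)
```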

Terry Jan Reedy
 
George Sakkis

Christophe said:
Is there a good reason why, when you try to take an element from an
already exhausted iterator, it raises StopIteration instead of some other
exception? I've already lost quite some time because I was using a lot
of iterators and forgot that a specific function parameter was one.

Example:

    >>> def f(i):
    ...     print list(i)
    ...     print list(i)
    ...
    >>> f(iter(range(2)))
    [0, 1]
    []

Whether trying to iterate over an exhausted iterator should be treated
differently is application dependent. In most cases, you don't really
care to distinguish between an iterator that yields no elements and an
iterator that did yield some elements before but has since been exhausted.
If you do care, you can roll your own iterator wrapper:


    class ExhaustibleIterator(object):
        def __init__(self, iterable):
            self._next = getattr(iterable, 'next', iter(iterable).next)
            self._exhausted = False

        def next(self):
            if self._exhausted:
                raise ExhaustedIteratorException()
            try:
                return self._next()
            except StopIteration:
                self._exhausted = True
                raise

        def __iter__(self):
            return self

    class ExhaustedIteratorException(Exception):
        pass


And then in your function:

    def f(i):
        i = ExhaustibleIterator(i)
        print list(i)
        print list(i)
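
For readers on a current interpreter, the same wrapper idea can be sketched in Python 3, where the method is spelled __next__ and the built-in next() is used. This is an adaptation under that assumption, not the code posted above; the class names are kept for comparison.

```python
class ExhaustedIteratorException(Exception):
    """Raised when a wrapped iterator is used after it has finished."""


class ExhaustibleIterator:
    def __init__(self, iterable):
        self._it = iter(iterable)
        self._exhausted = False

    def __iter__(self):
        return self

    def __next__(self):
        if self._exhausted:
            # Any use after the first StopIteration is treated as a bug.
            raise ExhaustedIteratorException()
        try:
            return next(self._it)
        except StopIteration:
            self._exhausted = True
            raise


it = ExhaustibleIterator(range(2))
first = list(it)                   # the first pass ends normally
try:
    list(it)                       # the second pass hits the guard
except ExhaustedIteratorException:
    second_pass_failed = True
else:
    second_pass_failed = False
```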


HTH,
George
 
Christophe

Terry Reedy wrote:
Yes.
.
.
To distinguish the control message "I am done yielding values, as per the
code specification (so don't bother calling me again)." from error messages
that say "Something is wrong, I cannot yield values and give up." In other
words, to distinguish expected correct behavior from unexpected incorrect
behavior. This is essential for the normal and correct use of iterators.

You talk about expected behaviour and my expected behaviour is that an
iterator should not be usable once it has raised StopIteration once.

Terry Reedy wrote:

    >>> def f(i):
    ...     print list(i)
    ...     print list(i)
    ...
    >>> f(iter(range(2)))
    [0, 1]
    []

As per specification.

Specifications sometimes have "bugs" too.

Terry Reedy wrote:
I am guessing that you want the first list() call to terminate normally and
return a list, which requires exhausted i to raise StopIteration, while you
want the second list() to not terminate but raise an exception, which
requires exhausted i to raise something other than StopIteration. Tough.

Exactly. This would be a sane way to handle it.

Terry Reedy wrote:
One solution is to call list(i) exactly once:

    def f(i):
        li = list(i)
        print li
        print li

OK, call me stupid if you want, but I know perfectly well the "solution"
to that problem! Come on, I was showing example code of a horrible
gotcha when using iterators.

Instead of saying that everything works as intended, could you be a little
helpful and tell me why it was intended in such an obviously broken way?
 
looping

Christophe said:
OK, call me stupid if you want, but I know perfectly well the "solution"
to that problem! Come on, I was showing example code of a horrible
gotcha when using iterators.

OK, you are stupid ;-)
Why ask questions when you don't want to listen to the answers?

Christophe said:
Instead of saying that everything works as intended, could you be a little
helpful and tell me why it was intended in such an obviously broken way?

Why must an exhausted iterator raise an exception (other than
StopIteration, of course)?
Well, an exhausted iterator can be seen as something like an empty string
or an empty list (or tons of other things), so do you expect the code

    for car in "":
        print car

to raise an exception because it's empty???
It's your job to check the iterator when it needs to be checked.
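
The analogy can be checked with a small sketch (Python 3 spelling, which is an assumption; count_items is a hypothetical helper): a for loop treats an exhausted iterator exactly like an empty string or an empty list, running its body zero times.

```python
def count_items(iterable):
    """Count how many times a for loop body runs over `iterable`."""
    n = 0
    for _ in iterable:
        n += 1
    return n


it = iter("ab")
counts = [
    count_items(""),   # empty string
    count_items([]),   # empty list
    count_items(it),   # fresh iterator over two characters
    count_items(it),   # the same iterator again, now exhausted
]
```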

Regards.
Dom
 
Christophe

looping wrote:
OK, you are stupid ;-)
Why ask questions when you don't want to listen to the answers?

Because I'm still waiting for a valid answer to my question. The answer
"Because it has been coded like that" is not a valid one.

looping wrote:
Why must an exhausted iterator raise an exception (other than
StopIteration, of course)?

Because it's exhausted. Because it has been a frequent cause of bugs for
me, and because I have yet to see a valid use case for such behaviour.

looping wrote:
Well, an exhausted iterator can be seen as something like an empty string
or an empty list (or tons of other things), so do you expect the code

    for car in "":
        print car

to raise an exception because it's empty???

Of course not.

looping wrote:
It's your job to check the iterator when it needs to be checked.

It's my job to avoid coding bugs; it's the language's job to avoid placing
pitfalls everywhere I go.

I must confess I have a strong opinion on that point. Not long ago I
started working on some fresh code where I decided to use a lot of
iterators and sets instead of lists where possible. That behaviour has
caused me to lose quite some time tracking down bugs.
 
Fredrik Lundh

Christophe said:
I didn't think I had to mention that "Because the spec has been written
like that" wasn't a valid answer either.

so what is a valid answer?

</F>
 
Diez B. Roggisch

Christophe said:
I didn't think I had to mention that "Because the spec has been written
like that" wasn't a valid answer either.

The important thing is: it _is_ specified. And what about code like this:


    iterable = produce_some_iterable()

    for item in iterable:
        if some_condition(item):
            break
        do_something()

    for item in iterable:
        do_something_with_the_rest()


If it weren't for StopIteration being raised when the iterable is
exhausted, you'd have to clutter that code with something like

    try:
        for item in iterable:
            do_something_with_the_rest()
    except IteratorExhausted:
        pass

What makes you say that this is better than the above? Just because _you_
had some corner cases that others seem not to have (at least not that
frequently; I personally can't remember ever having been bitten by it)
isn't a valid reason to _not_ do it as Python does.
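
The break-and-resume pattern can be made concrete with a sketch like the one below. It is written in Python 3 and uses a hypothetical helper, split_at, in place of the placeholder functions; it assumes the object being looped over is an iterator, so the second loop picks up where the first stopped. When the first loop never breaks, the second loop simply sees an exhausted iterator and does nothing.

```python
def split_at(iterable, is_marker):
    """Collect items before the first marker, then the items after it."""
    it = iter(iterable)
    head, tail = [], []
    for item in it:
        if is_marker(item):
            break
        head.append(item)
    for item in it:            # resumes after the marker; no-op if exhausted
        tail.append(item)
    return head, tail


def is_zero(x):
    return x == 0


with_marker = split_at([1, 2, 0, 3, 4], is_zero)
without_marker = split_at([1, 2], is_zero)
```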

Besides that: it would be a major change in the semantics of iterators,
and I seriously doubt it would make it into anything before P3K. So it's
a somewhat moot point to discuss here, I'd say.

Diez
 
Christophe

Fredrik Lundh wrote:
so what is a valid answer?

Some valid use case for that behaviour, some example of why what I ask
could cause problems, some implementation difficulties etc ...

Saying it's like that because someone said so isn't exactly what I was
expecting as an answer :) People sometimes can be wrong you know.
 
Roel Schroeven

Fredrik Lundh wrote:
so what is a valid answer?

I think he wants to know why the spec has been written that way.

The rationale mentions exhausted iterators:

"Once a particular iterator object has raised StopIteration, will
it also raise StopIteration on all subsequent next() calls?
Some say that it would be useful to require this, others say
that it is useful to leave this open to individual iterators.
Note that this may require an additional state bit for some
iterator implementations (e.g. function-wrapping iterators).

Resolution: once StopIteration is raised, calling it.next()
continues to raise StopIteration."

This doesn't, however, completely answer the OP's question, I think. It
is about raising or not raising StopIteration on subsequent next() calls
but doesn't say anything about possible alternatives, such as raising
another exception (I believe that's what the OP would like).

Not that I know of use cases for other exceptions after StopIteration;
just clarifying what I think the OP means.
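
The resolution quoted above can be observed with a short sketch (Python 3, where it.next() is spelled next(it); the generator numbers is a made-up example). After the first StopIteration, further calls keep raising StopIteration, both for a built-in list iterator and for a generator.

```python
def numbers():
    yield 0
    yield 1


results = {}
for name, it in [("list iterator", iter([0, 1])), ("generator", numbers())]:
    list(it)                       # drain the iterator completely
    outcomes = []
    for _ in range(3):             # three more calls after exhaustion
        try:
            next(it)
            outcomes.append("value")
        except StopIteration:
            outcomes.append("StopIteration")
    results[name] = outcomes
```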
 
Christophe

Diez B. Roggisch wrote:
The important thing is: it _is_ specified. And what about code like this:

    iterable = produce_some_iterable()

    for item in iterable:
        if some_condition(item):
            break
        do_something()

    for item in iterable:
        do_something_with_the_rest()

If it weren't for StopIteration being raised when the iterable is
exhausted, you'd have to clutter that code with something like

    try:
        for item in iterable:
            do_something_with_the_rest()
    except IteratorExhausted:
        pass

It would be ugly, but you could do this instead:

    iterable = produce_some_iterable()

    for item in iterable:
        if some_condition(item):
            break
        do_something()
    else:
        iterable = []

    for item in iterable:
        do_something_with_the_rest()

I'll admit that the else clause in for/while loops isn't the most common
construct, so some people might be a little troubled by that.

There's also this:

    iterable = produce_some_iterable()

    for item in iterable:
        if some_condition(item):
            for item in iterable:
                do_something_with_the_rest()
            break
        do_something()

Diez B. Roggisch wrote:
What makes you say that this is better than the above? Just because _you_
had some corner cases that others seem not to have (at least not that
frequently; I personally can't remember ever having been bitten by it)
isn't a valid reason to _not_ do it as Python does.

Maybe I've used more iterables than most of you. Maybe I've been doing
that wrong. But I'd like to think that if I've made those mistakes,
others will make them too and would benefit from some help from the
interpreter in debugging them :)

Diez B. Roggisch wrote:
Besides that: it would be a major change in the semantics of iterators,
and I seriously doubt it would make it into anything before P3K. So it's
a somewhat moot point to discuss here, I'd say.

It wouldn't be such a big semantic change, I think. You could add it
easily[1] as a deprecation warning at first and later on switch to a
full-blown error.

[1] "Easily" provided you can easily code what I ask itself ;)
 
Christophe

Roel Schroeven wrote:
I think he wants to know why the spec has been written that way.

The rationale mentions exhausted iterators:

"Once a particular iterator object has raised StopIteration, will
it also raise StopIteration on all subsequent next() calls?
Some say that it would be useful to require this, others say
that it is useful to leave this open to individual iterators.
Note that this may require an additional state bit for some
iterator implementations (e.g. function-wrapping iterators).

Resolution: once StopIteration is raised, calling it.next()
continues to raise StopIteration."

This doesn't, however, completely answer the OP's question, I think. It
is about raising or not raising StopIteration on subsequent next() calls
but doesn't say anything about possible alternatives, such as raising
another exception (I believe that's what the OP would like).

Exactly!

Roel Schroeven wrote:
Not that I know of use cases for other exceptions after StopIteration;
just clarifying what I think the OP means.

There are no use cases yet for me. I want those exceptions as a hard
error for debugging purposes.
 
Fredrik Lundh

Christophe said:
Maybe I've used more iterables than most of you. Maybe I've been doing
that wrong.

your problem is that you're confusing iterables with sequences. they're
two different things.

</F>
 
Christophe

Fredrik Lundh wrote:
your problem is that you're confusing iterables with sequences. they're
two different things.

Yes, I know perfectly well that the bugs were my fault. But this doesn't
prevent me from asking for a feature that will have (in my opinion) a
negligible effect on current valid code and will help all of us catch
errors earlier.
 
Erik Max Francis

Christophe said:
Yes, I know perfectly well that the bugs were my fault. But this doesn't
prevent me from asking for a feature that will have (in my opinion) a
negligible effect on current valid code and will help all of us catch
errors earlier.

... and apparently choosing to ask in such a way that guarantees
practically no one will take your suggestion seriously.
 
Terry Reedy

Christophe said:
Instead of saying that all works as intended could you be a little
helpful and tell me why it was intended in such an obviously broken way
instead ?

I answered both your explicit and implied questions in good faith. But you
seem to be too attached to your pre-judgment to have benefited much, so I
won't waste my time and yours saying more. Instead I suggest that you try
this:

1. Write a specification for your alternate, more complicated, iterator
protocol.
2. Write a simple class with .next method that implements your
specification.
3. Test your class with your example.
4. Consider how you would persuade people to add the extra machinery
needed.
5. Consider what you would do when people don't.

If you want, post a report on your experiment, and I will read it if I see
it.

Terry Jan Reedy
 
Christophe

Terry Reedy wrote:
I answered both your explicit and implied questions in good faith. But you
seem to be too attached to your pre-judgment to have benefited much, so I
won't waste my time and yours saying more. Instead I suggest that you try
this:

1. Write a specification for your alternate, more complicated, iterator
protocol.

Specification: same as now, except that an iterator raises StopIteration
only once, and any subsequent call to next raises ExaustedIteratorError.

Terry Reedy wrote:
2. Write a simple class with .next method that implements your
specification.

    class ExaustedIteratorError(Exception):
        pass

    class safe_iter(object):
        def __init__(self, seq):
            self.it = iter(seq)
        def __iter__(self):
            return self
        def next(self):
            try:
                return self.it.next()
            except StopIteration:
                del self.it
                raise
            except AttributeError:
                raise ExaustedIteratorError

Terry Reedy wrote:
3. Test your class with your example.

    >>> it = safe_iter(range(10))
    >>> print list(it)
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    >>> print list(it)
    Traceback (most recent call last):
      File "safe_iter_test.py", line 20, in ?
        print list(it)
      File "safe_iter_test.py", line 13, in next
        raise ExaustedIteratorError
    __main__.ExaustedIteratorError

Terry Reedy wrote:
4. Consider how you would persuade people to add the extra machinery
needed.

Well, the main reason for such a change is and will always be to catch
bugs. The fact is, duck typing is very common in Python. And since a lot
of functions which take sequences as parameters work just as well with an
iterator instead, you can say that this is an application of duck typing.

The problem is of course the same as in other such cases. Even if those
two objects (iterator and container) look alike from a certain point of
view, they have some fundamental differences.

So, we have quite a few functions which freely take either a container
or an iterator, until someone changes that function a little. At that
point there are three kinds of errors which can happen:
- the function expected a sequence and tries to use its [] operator,
which fails. Standard duck typing behaviour.
- the function uses the iterator more than once, and so it sometimes
works without errors but produces an incorrect result.
- the function uses the iterator more than once but never exhausts its
values. Same result as above but much harder to catch.

For the sake of avoiding behaviour which lets obvious errors pass and
produces incorrect results, I propose to change the standard behaviour
of all the iterators in standard Python. The change would make them
refuse to be used any more once they have been exhausted. Thus it will
help catch the second kind of error. The other way to catch such bugs
would require explicit typing of the function parameters, but that is for
some other proposal.
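
The second and third kinds of error listed above can be reproduced with a small sketch. It is written in Python 3 as an illustration only; normalize and peek_twice are made-up functions, not code from the project being described. Note that the third case never exhausts the iterator, so a guard that fires only after exhaustion would not catch it.

```python
def normalize(values):
    """Scale values so they sum to 1. Silently wrong for an iterator."""
    total = sum(values)                  # first pass consumes an iterator
    return [v / total for v in values]   # second pass then sees nothing


def peek_twice(values):
    """Look at the first element twice. Also silently wrong for an iterator."""
    a = next(iter(values))
    b = next(iter(values))               # an iterator has already moved on
    return a, b


norm_from_list = normalize([1, 3])
norm_from_iter = normalize(iter([1, 3]))     # no error, just an empty result

peek_from_list = peek_twice([1, 3])
peek_from_iter = peek_twice(iter([1, 3]))    # no error, different answer
```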

Terry Reedy wrote:
5. Consider what you would do when people don't.

I'm already doing it: cleaning up the code, changing parameter names
around so that it is clear which parameter is an iterator and which is
not, making some functions explicitly refuse iterators, etc. It should be
noted that that last feature will be useful even without the exhausted
iterator guard I propose.

Terry Reedy wrote:
If you want, post a report on your experiment, and I will read it if I see
it.

I suppose I could add safe_iter to the builtins in my project and use it
around. It would be easy to catch all usages of that one once we settle
on something different then.
 
yairchu

Consider this example:
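
A minimal sketch of such an example, assuming X is a list and Y is an iterator over the same values. The concrete values are made up, and the spelling is Python 3 rather than the Python 2.4 used in this thread.

```python
X = [0, 1]           # a container (assumed)
Y = iter([0, 1])     # an iterator (assumed)

# iter(X) builds a fresh iterator each time, so both calls start over.
x_values = [next(iter(X)), next(iter(X))]

# iter(Y) returns Y itself, so the second call continues where the first stopped.
y_values = [next(iter(Y)), next(iter(Y))]
```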

As you can see, X is a container, and Y is an iterator.
They are similar in that "iter" works on them both.

Christophe claims that this causes confusion.
Why? Because "iter" doesn't have the same meaning for both of them.
For X it always returns an iterator that yields the same set of values.
For Y it returns an iterator yielding different values each time.

Most uses of iterators iterate over all values until exhaustion.
Given that, the first call to "iter" will mean the same for an iterator
and a container, but the second one won't.

Allow me to compare it to division in Python 2.4, where dividing 5 by 2
gives 2.5 for reals but 2 for integers.

Division sometimes works the same for integers and reals, and sometimes
doesn't.
This caused confusion and bugs, and that's why future Pythons will
change that.
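
For reference, a short sketch of how that change looks in Python 3, where the / operator always performs true division and floor division has its own operator (the variable names are only for illustration):

```python
true_quotient = 5 / 2      # true division, even for two integers
floor_quotient = 5 // 2    # floor division must be requested explicitly
float_quotient = 5.0 / 2   # same result as the integer case with /
```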

But changing "iter" to have the same meaning for containers and
iterators is impossible.
You cannot, conceptually, reiterate an iterator.
So what Christophe is suggesting is to add an exception for the cases
in which iterators and collections behave differently.
Somewhat similar to this:

    >>> 5 / 2
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    IntegerDivisionError: 5 does not divide by 2
 
