pushback iterator

M

Matus

Hallo pylist,

I searches web and python documentation for implementation of pushback
iterator but found none in stdlib.

problem:
========
when you parse a file, often you have to read a line from parsed file
before you can decide if you want that line it or not. if not, it would
be a nice feature to be able po push the line back into the iterator, so
nest time when you pull from iterator you get this 'unused' line.

solution:
=========
I found a nice and fast solution somewhere on the net:

---------------------------------------------------------------------
class Pushback_wrapper( object ):
def __init__( self, it ):
self.it = it
self.pushed_back = [ ]
self.nextfn = it.next

def __iter__( self ):
return self

def __nonzero__( self ):
if self.pushed_back:
return True

try:
self.pushback( self.nextfn( ) )
except StopIteration:
return False
else:
return True

def popfn( self ):
lst = self.pushed_back
res = lst.pop( )
if not lst:
self.nextfn = self.it.next
return res

def next( self ):
return self.nextfn( )

def pushback( self, item ):
self.pushed_back.append( item )
self.nextfn = self.popfn
---------------------------------------------------------------------

proposal:
=========
as this is (as I suppose) common problem, would it be possible to extend
the stdlib of python (ie itertools module) with a similar solution so
one do not have to reinvent the wheel every time pushback is needed?


thx, Matus
 
S

Simon Forman

Hallo pylist,

I searches web and python documentation for implementation of pushback
iterator but found none in stdlib.

problem:
========
when you parse a file, often you have to read a line from parsed file
before you can decide if you want that line it or not. if not, it would
be a nice feature to be able po push the line back into the iterator, so
nest time when you pull from iterator you get this 'unused' line.

solution:
=========
I found a nice and fast solution somewhere on the net:

---------------------------------------------------------------------
class Pushback_wrapper( object ):
        def __init__( self, it ):
                self.it = it
                self.pushed_back = [ ]
                self.nextfn = it.next

        def __iter__( self ):
                return self

        def __nonzero__( self ):
                if self.pushed_back:
                        return True

                try:
                        self.pushback( self.nextfn( ) )
                except StopIteration:
                        return False
                else:
                        return True

        def popfn( self ):
                lst = self.pushed_back
                res = lst.pop( )
                if not lst:
                        self.nextfn = self.it.next
                return res

        def next( self ):
                return self.nextfn( )

        def pushback( self, item ):
                self.pushed_back.append( item )
                self.nextfn = self.popfn
---------------------------------------------------------------------

proposal:
=========
as this is (as I suppose) common problem, would it be possible to extend
the stdlib of python (ie itertools module) with a similar solution so
one do not have to reinvent the wheel every time pushback is needed?

thx, Matus


By coincidence I had need of just such a pushback iterator just the
other day. I was playing with queues and generators after reading
David Beazley's "Generator Tricks For Systems Programmers" [1] and I
needed an iterator that could be used with this generator function:

def pushQueueNonblocking(queue, iterator):
for item in iterator:
try:
queue.put(item, False)
except Queue.Full:
raise Queue.Full(item)

If the queue is full I pass item taken from the iterator but not yet
pushed onto the queue up to the caller in the Full exception so it
doesn't get lost. I wrote a wrapper function that retried putting
items on the queue after sleeping for a certain time if the queue is
full.

def pushQueueRetry(queue, iterator, retry_delay=1):
iterator = IteratorWithPushback(iterator)
while True:
try:
pushQueueNonblocking(queue, iterator)
except Queue.Full, err:
iterator.pushback(err.args[0])
if retry_delay >= 0:
sleep(retry_delay)
else:
break

Originally I used an itertools.chain() solution, but when I tried the
function with sleep of 0 (the queue was being get() from once a second
in another thread) I hit a funny error.

Because the put() thread was running, and retrying, much faster than
the get() thread, a huge "linked list" of itertools.chain generator
objects was stored up in memory. When the get() thread finally
unblocked the queue and the call to put() succeeded, the
pushQueueNonblocking() would loop and call next() on the chain object,
which would call next() on the next chain object, which called next()
on the next next chain object, etc...

That linked list of chain iterators was longer than
sys.getrecursionlimit().

Anyway, FWIW here's the IteratorWithPushback class that I whanged up:


class IteratorWithPushback:

def __init__(self, initial_iterator):
self.iter = iter(initial_iterator)
self.last = None
self.has_last = False

def __iter__(self):
return self

def next(self):
if self.has_last:
item = self.last
self.has_last = False
self.last = None # Don't hang on to it.
else:
item = self.iter.next()
return item

def pushback(self, item):
if self.has_last:
raise Exception("I'm full!")
self.last = item
self.has_last = True

As for putting something like this in the standard library, I'm not
sure. I've never needed one before and it's easy enough to write.
*shrug*


Regards,
~Simon

[1] http://www.dabeaz.com/generators/Generators.pdf
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,995
Messages
2,570,236
Members
46,825
Latest member
VernonQuy6

Latest Threads

Top