Consume an iterable

Raymond Hettinger

     def consume2(iterator, n):  # the approved proposal (see #7764)
         if n is None:
             collections.deque(iterator, maxlen=0)
         else:
             next(islice(iterator, n, n), None)
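For anyone who wants to try this, here is a minimal self-contained sketch of the function above with the imports it needs, plus a quick check of both branches. The sample data and variable names are made up for illustration.

```python
import collections
from itertools import islice

def consume2(iterator, n):
    """Advance the iterator n steps ahead; if n is None, consume it entirely."""
    if n is None:
        # A zero-length deque discards every item it is fed.
        collections.deque(iterator, maxlen=0)
    else:
        # The empty slice [n:n] makes islice skip n items and yield none.
        next(islice(iterator, n, n), None)

it = iter(range(10))
consume2(it, 3)
first_after_skip = next(it)   # the three items 0, 1, 2 were skipped

consume2(it, None)
leftover = list(it)           # nothing is left after full consumption
```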

FWIW, the deque() approach becomes even faster in Py2.7 and Py3.1,
which have a high-speed path for the case where maxlen is zero.
Here's a snippet from Modules/_collectionsmodule.c:

/* Run an iterator to exhaustion. Shortcut for
   the extend/extendleft methods when maxlen == 0. */
static PyObject*
consume_iterator(PyObject *it)
{
    PyObject *item;

    while ((item = PyIter_Next(it)) != NULL) {
        Py_DECREF(item);
    }
    Py_DECREF(it);
    if (PyErr_Occurred())
        return NULL;
    Py_RETURN_NONE;
}


This code consumes an iterator to exhaustion.
It is short, sweet, and hard to beat.
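
For readers who don't read C, the loop above behaves roughly like the following pure-Python sketch. The name consume_iterator_py is made up for illustration, and this only approximates what the C function does; it is not how deque itself is implemented.

```python
from collections import deque

def consume_iterator_py(it):
    """Rough Python analogue of the C loop: fetch items and drop them."""
    for _ in it:
        pass  # each item is discarded as soon as it is fetched

# Both calls leave their iterator exhausted.
a = iter(range(5))
consume_iterator_py(a)

b = iter(range(5))
deque(b, maxlen=0)  # uses the C fast path on versions that have it

a_left = list(a)
b_left = list(b)
```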


Raymond
 
Paul Rubin

Raymond Hettinger said:
This code consumes an iterator to exhaustion.
It is short, sweet, and hard to beat.

I've always used sum(1 for x in iterator) or some such.
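
A small sketch of that idiom, for comparison. Unlike the deque approach, it also returns how many items were consumed, which may or may not be wanted. The sample iterator is made up.

```python
it = iter("abcde")

# Exhausts the iterator and, as a side effect, counts its items.
count = sum(1 for _ in it)

remaining = list(it)
```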
 
Peter Otten

Raymond said:
FWIW, the deque() approach becomes even faster in Py2.7 and Py3.1,
which have a high-speed path for the case where maxlen is zero.
Here's a snippet from Modules/_collectionsmodule.c:

/* Run an iterator to exhaustion. Shortcut for
   the extend/extendleft methods when maxlen == 0. */
static PyObject*
consume_iterator(PyObject *it)
{
    PyObject *item;

    while ((item = PyIter_Next(it)) != NULL) {
        Py_DECREF(item);
    }
    Py_DECREF(it);
    if (PyErr_Occurred())
        return NULL;
    Py_RETURN_NONE;
}


This code consumes an iterator to exhaustion.
It is short, sweet, and hard to beat.

islice() is still a tad faster. A possible optimization:

static PyObject*
consume_iterator(PyObject *it)
{
    PyObject *item;
    PyObject *(*iternext)(PyObject *);

    iternext = *Py_TYPE(it)->tp_iternext;

    while ((item = iternext(it)) != NULL) {
        Py_DECREF(item);
    }
    Py_DECREF(it);
    if (PyErr_Occurred()) {
        if (PyErr_ExceptionMatches(PyExc_StopIteration))
            PyErr_Clear();
        else
            return NULL;
    }
    Py_RETURN_NONE;
}

Before:

$ ./python -m timeit -s"from itertools import repeat, islice; from collections import deque; from sys import maxint" "next(islice(repeat(None, 1000), maxint, maxint), None)"
100000 loops, best of 3: 6.49 usec per loop

$ ./python -m timeit -s"from itertools import repeat, islice; from collections import deque; from sys import maxint" "deque(repeat(None, 1000), maxlen=0)"
100000 loops, best of 3: 9.93 usec per loop

After:

$ ./python -m timeit -s"from itertools import repeat, islice; from collections import deque; from sys import maxint" "deque(repeat(None, 1000), maxlen=0)"
100000 loops, best of 3: 6.31 usec per loop

Peter

PS: Two more, for Paul Rubin:

$ ./python -m timeit -s"from itertools import repeat" "sum(0 for _ in repeat(None, 1000))"
10000 loops, best of 3: 125 usec per loop
$ ./python -m timeit -s"from itertools import repeat" "sum(0 for _ in repeat(None, 1000) if 0)"
10000 loops, best of 3: 68.3 usec per loop
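
To repeat these measurements on Python 3, where sys.maxint no longer exists (sys.maxsize is the closest stand-in), something like the sketch below should work. The helper names and repetition counts are arbitrary choices for illustration, and absolute numbers will differ by machine and interpreter version.

```python
import timeit
from collections import deque
from itertools import islice, repeat
from sys import maxsize

def via_islice():
    next(islice(repeat(None, 1000), maxsize, maxsize), None)

def via_deque():
    deque(repeat(None, 1000), maxlen=0)

def via_sum():
    sum(0 for _ in repeat(None, 1000))

def via_sum_filtered():
    # "if 0" rejects every item, so the generator never yields to sum().
    sum(0 for _ in repeat(None, 1000) if 0)

results = {}
for func in (via_islice, via_deque, via_sum, via_sum_filtered):
    # Best of 3 runs of 1000 calls each, converted to usec per call.
    best = min(timeit.repeat(func, number=1000, repeat=3))
    results[func.__name__] = best / 1000 * 1e6

for name, usec in results.items():
    print("%-18s %8.2f usec per loop" % (name, usec))
```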
 
