generate tuples from sequence

Will McGugan · Jan 17, 2007

Hi,

I'd like a generator that takes a sequence and yields tuples containing
n items of the sequence, but ignoring the 'odd' items. For example

take_group(range(9), 3) -> (0,1,2) (3,4,5) (6,7,8)

This is what I came up with..

def take_group(gen, count):
    i = iter(gen)
    while True:
        yield tuple([i.next() for _ in xrange(count)])

Is this the most efficient solution?
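A sketch of the same generator for Python 3, where `i.next()` and `xrange` no longer exist and a StopIteration escaping inside a generator body is turned into a RuntimeError (PEP 479). It swaps in `itertools.islice`, which the post above does not use, and assumes `count` is at least 1:

```python
from itertools import islice


def take_group(iterable, count):
    """Yield tuples of `count` items, dropping a short trailing group.

    Assumes count >= 1 (an assumption of this sketch, not of the post).
    """
    it = iter(iterable)
    while True:
        group = tuple(islice(it, count))
        if len(group) < count:
            # Input exhausted, or only 'odd' leftover items remain.
            return
        yield group


print(list(take_group(range(9), 3)))
```

Whether this is faster than the original is not measured here; it only shows one way to keep the behaviour on a current interpreter.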

Regards,

Will McGugan

Will McGugan · Jan 17, 2007

Will said:
Hi,

I'd like a generator that takes a sequence and yields tuples containing
n items of the sequence, but ignoring the 'odd' items. For example

Forgot to add, for my purposes I will always have a sequence with a
multiple of n items.

Will

Peter Otten · Jan 17, 2007

Will said:
I'd like a generator that takes a sequence and yields tuples containing
n items of the sequence, but ignoring the 'odd' items. For example

take_group(range(9), 3) -> (0,1,2) (3,4,5) (6,7,8)

I like

items = range(9)
N = 3
zip(*[iter(items)]*N)

[(0, 1, 2), (3, 4, 5), (6, 7, 8)]
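The idiom works because `[iter(items)]*N` holds N references to one iterator rather than N separate iterators, so each tuple is filled by N successive pulls from the same source. A small sketch that makes this visible, written for Python 3 (where `zip` is lazy, hence the `list()` call); the names `shared`, `args`, and `same_object` are only illustrative:

```python
items = range(9)
N = 3

shared = iter(items)
# List repetition copies the reference, not the iterator itself.
args = [shared] * N
same_object = all(arg is shared for arg in args)

# Each output tuple consumes N consecutive items from the one iterator.
groups = list(zip(*args))

print(same_object)
print(groups)
```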

Peter

Tim Williams · Jan 17, 2007

Forgot to add, for my purposes I will always have a sequence with a
multiple of n items.

something along the lines of.......

[ (x,x+1,x+2) for x in xrange(0,9,3) ]

[(0, 1, 2), (3, 4, 5), (6, 7, 8)]
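The comprehension above generates consecutive integers directly. If the data is an arbitrary indexable sequence, the same stepping can be combined with slicing. A sketch for Python 3, with the hypothetical name `take_group_sliced`; it needs `len()` and slicing, so it does not work on plain iterators:

```python
def take_group_sliced(seq, n):
    """Return n-item tuples cut from an indexable sequence.

    Stops before any incomplete trailing group.
    """
    return [tuple(seq[i:i + n]) for i in range(0, len(seq) - n + 1, n)]


print(take_group_sliced(list(range(9)), 3))
```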

HTH

Neil Cerutti · Jan 17, 2007

Hi,

I'd like a generator that takes a sequence and yields tuples containing
n items of the sequence, but ignoring the 'odd' items. For example

take_group(range(9), 3) -> (0,1,2) (3,4,5) (6,7,8)

This is what I came up with..

def take_group(gen, count):
    i = iter(gen)
    while True:
        yield tuple([i.next() for _ in xrange(count)])

Is this the most efficient solution?

This is starting to seem like an FAQ.

The Python library documentation contains a recipe for this among the
itertools recipes (section 5.16.3).

from itertools import chain, izip, repeat

def grouper(n, iterable, padvalue=None):
    "grouper(3, 'abcdefg', 'x') --> ('a','b','c'), ('d','e','f'), ('g','x','x')"
    return izip(*[chain(iterable, repeat(padvalue, n-1))]*n)

It's more general and cryptic than what you asked for, though.
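For readers on Python 3, where `izip` no longer exists, the quoted recipe can be sketched with the built-in `zip`, which is lazy there. This is a port of the recipe as quoted above, not the version in current documentation:

```python
from itertools import chain, repeat


def grouper(n, iterable, padvalue=None):
    "grouper(3, 'abcdefg', 'x') --> ('a','b','c'), ('d','e','f'), ('g','x','x')"
    # Pad with n-1 fill values, then share one iterator across n zip slots.
    padded = chain(iterable, repeat(padvalue, n - 1))
    return zip(*[padded] * n)


print(list(grouper(3, 'abcdefg', 'x')))
```

When the input length is already a multiple of n, the padding is consumed and discarded by the final incomplete pull, so no extra tuple appears.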

Méta-MCI · Jan 17, 2007

Hi!

r=iter(range(9))
print zip(r,r,r)

But it's a bit like Peter's...

Alan Isaac · Jan 18, 2007

Peter Otten said:
I like

items = range(9)
N = 3
zip(*[iter(items)]*N)

[(0, 1, 2), (3, 4, 5), (6, 7, 8)]

Except that it is considered implementation dependent:
http://mail.python.org/pipermail/python-list/2005-February/307550.html

Alan Isaac

Peter Otten · Jan 18, 2007

Alan said:
Peter Otten said:

I like

items = range(9)
N = 3
zip(*[iter(items)]*N)

[(0, 1, 2), (3, 4, 5), (6, 7, 8)]

Except that it is considered implementation dependent:
http://mail.python.org/pipermail/python-list/2005-February/307550.html

Use itertools.izip() then instead of zip(). I think that the occurrence of
this idiom on the examples page counts as a blessing.

But still, to help my lack of fantasy -- what would a sane zip()
implementation look like that does not guarantee the above output?

Peter

Duncan Booth · Jan 18, 2007

Peter Otten said:
But still, to help my lack of fantasy -- what would a sane zip()
implementation look like that does not guarantee the above output?

Hypothetically?

The code in zip which builds the result tuples looks (ignoring error
handling) like:

// inside a loop building each result element...
PyObject *next = PyTuple_New(itemsize);

for (j = 0; j < itemsize; j++) {
    PyObject *it = PyTuple_GET_ITEM(itlist, j);
    PyObject *item = PyIter_Next(it);
    PyTuple_SET_ITEM(next, j, item);
}

For fixed size tuples you can create them using PyTuple_Pack. So imagine
some world where the following code works faster for small tuples:

// Outside the loop building the result list:
PyObject *a, *b, *c;
if (itemsize >= 1 && itemsize <= 3) a = PyTuple_GET_ITEM(...);
if (itemsize >= 2 && itemsize <= 3) b = PyTuple_GET_ITEM(...);
if (itemsize == 3) c = PyTuple_GET_ITEM(...);
...

// inside the result list loop:
PyObject *next;
if (itemsize==1) {
    next = PyTuple_Pack(1,
        PyIter_Next(a));
} else if (itemsize==2) {
    next = PyTuple_Pack(2,
        PyIter_Next(a),
        PyIter_Next(b));
} else if (itemsize==3) {
    next = PyTuple_Pack(3,
        PyIter_Next(a),
        PyIter_Next(b),
        PyIter_Next(c));
} else {
    next = PyTuple_New(itemsize);

    for (j = 0; j < itemsize; j++) {
        PyObject *it = PyTuple_GET_ITEM(itlist, j);
        PyObject *item = PyIter_Next(it);
        PyTuple_SET_ITEM(next, j, item);
    }
}

If compiled on a system where the stack grows downwards (as it often does)
the C compiler is very likely to evaluate function arguments in reverse
order.

(BTW, this also assumes that it's an implementation which uses exceptions
or something for error handling otherwise you probably can't get it right,
but maybe something like IronPython could end up with code like this.)

Or maybe if someone added PyTuple_Pack1, PyTuple_Pack2, PyTuple_Pack3
functions which grab their memory off a separate free list for each tuple
length. That might speed up the time to create the tuples as you might be
able to just reset the content not rebuild the object each time. Again that
could make code like the above run more quickly.
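To see what the reverse evaluation order described above would do to the shared-iterator idiom, here is a Python 3 sketch of a purely hypothetical zip that fills each tuple from the last slot to the first. The function `zip_reversed_pull` is invented for illustration and does not correspond to any real implementation:

```python
def zip_reversed_pull(*iterators):
    """Hypothetical zip that fetches each tuple's items right to left."""
    result = []
    while True:
        slots = [None] * len(iterators)
        try:
            # Pull for the last position first, mimicking reversed
            # argument evaluation in the imagined C code.
            for j in reversed(range(len(iterators))):
                slots[j] = next(iterators[j])
        except StopIteration:
            return result
        result.append(tuple(slots))


it = iter(range(9))
print(zip_reversed_pull(it, it, it))

# With independent iterators the pull order makes no visible difference.
print(zip_reversed_pull(iter("ab"), iter("xy")))
```

With one shared iterator, each group comes out reversed, which is exactly the kind of output the idiom would not survive.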

Peter Otten · Jan 18, 2007

Special-casing small tuples meets my sanity criterion.

Let's see if I understand the above: In C a call

f(g(), g())

may result in machine code equivalent to either

x = g()
y = g()
f(x, y)

or

y = g()
x = g()
f(x, y)

Is that it?

Peter

Duncan Booth · Jan 18, 2007

Peter Otten said:
Let's see if I understand the above: In C a call

f(g(), g())

may result in machine code equivalent to either

x = g()
y = g()
f(x, y)

or

y = g()
x = g()
f(x, y)

Is that it?

Yes, or changing one of the calls to h() and compiling with
"cl -Fat.asm -Ox -c t.c":

------ t.c --------
extern int f(int a, int b);
extern int g();
extern int h();

int main(int argc, char **argv) {
    return f(g(), h());
}
-------------------

The output file:
------- t.asm -----
; Listing generated by Microsoft (R) Optimizing Compiler Version 13.10.3077

TITLE t.c
.386P
include listing.inc
if @Version gt 510
.model FLAT
else
_TEXT SEGMENT PARA USE32 PUBLIC 'CODE'
_TEXT ENDS
_DATA SEGMENT DWORD USE32 PUBLIC 'DATA'
_DATA ENDS
CONST SEGMENT DWORD USE32 PUBLIC 'CONST'
CONST ENDS
_BSS SEGMENT DWORD USE32 PUBLIC 'BSS'
_BSS ENDS
$$SYMBOLS SEGMENT BYTE USE32 'DEBSYM'
$$SYMBOLS ENDS
_TLS SEGMENT DWORD USE32 PUBLIC 'TLS'
_TLS ENDS
FLAT GROUP _DATA, CONST, _BSS
ASSUME CS: FLAT, DS: FLAT, SS: FLAT
endif

INCLUDELIB LIBC
INCLUDELIB OLDNAMES

PUBLIC _main
EXTRN _f:NEAR
EXTRN _g:NEAR
EXTRN _h:NEAR
; Function compile flags: /Ogty
_TEXT SEGMENT
_argc$ = 8 ; size = 4
_argv$ = 12 ; size = 4
_main PROC NEAR
; File c:\temp\t.c
; Line 6
call _h
push eax
call _g
push eax
call _f
add esp, 8
; Line 7
ret 0
_main ENDP
_TEXT ENDS
END
-------------------------
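In the listing above, `_h` is called before `_g`, so this compiler evaluated the arguments right to left. For contrast, the Python language reference specifies left-to-right evaluation of expressions, including call arguments. A small Python 3 sketch that records the call order, reusing the names f, g, and h from the C example (the `calls` list is added here for illustration):

```python
calls = []


def g():
    calls.append("g")
    return 1


def h():
    calls.append("h")
    return 2


def f(a, b):
    return a + b


result = f(g(), h())
print(calls)
```

This concerns only how Python code evaluates its own arguments; it says nothing about the order in which a C implementation of zip() pulls from its iterators, which is the question in this thread.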
