Recursive list comprehension

Timothy Babytch · Dec 6, 2004

Hi all.

I have a list that looks like [['N', 'F'], ['E'], ['D']]
I am trying to flatten it into one list: ['N', 'F', 'E', 'D']

How can I achieve this with a list comprehension?
Two nested loops did the job, but that did not look pythonic..

I tried
print [x for x in y for y in c_vars]
and got NameError: name 'y' is not defined.

Peter Otten · Dec 6, 2004

Timothy said:
Hi all.

I have a list that looks like [['N', 'F'], ['E'], ['D']]
I am trying to flatten it into one list: ['N', 'F', 'E', 'D']

How can I achieve this with a list comprehension?
Two nested loops did the job, but that did not look pythonic..

I tried
print [x for x in y for y in c_vars]
and got NameError: name 'y' is not defined.

The order of the for expressions is as it would be for nested loops:

>>> items = [['N', 'F'], ['E'], ['D']]
>>> [y for x in items for y in x]
['N', 'F', 'E', 'D']

I would still prefer a for loop because it spares you from iterating over
the sublist items in python:

>>> data = []
>>> for sub in [['N', 'F'], ['E'], ['D']]:
...     data.extend(sub)
...
>>> data
['N', 'F', 'E', 'D']

Peter
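
A minimal sketch of the clause-order rule described above, written in modern Python 3 syntax rather than the Python 2 used in this thread. The `for` clauses of a comprehension read left to right in the same order as the equivalent nested loops. The names `flat_comprehension` and `flat_loops` are illustrative only.

```python
items = [['N', 'F'], ['E'], ['D']]

# Outer loop first, inner loop second, exactly as in nested for statements.
flat_comprehension = [y for x in items for y in x]

# The same thing spelled out as two nested loops.
flat_loops = []
for x in items:
    for y in x:
        flat_loops.append(y)

print(flat_comprehension)
print(flat_loops)
```

Writing the clauses the other way round asks for the inner name before the outer loop has bound it, which is why the original attempt typically ends in a NameError (unless a variable of that name happens to exist already).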

Timothy Babytch · Dec 6, 2004

Peter said:
The order of the for expressions is as it would be for nested loops:

items = [['N', 'F'], ['E'], ['D']]
[y for x in items for y in x]

I would still prefer a for loop because it spares you from iterating over
the sublist items in python:

>>> data = []
>>> for sub in [['N', 'F'], ['E'], ['D']]:
...     data.extend(sub)
...

Thanks. Both tips were helpful.

Peter Nuttall · Dec 6, 2004

Hi all.

I have a list that looks like [['N', 'F'], ['E'], ['D']]
I am trying to flatten it into one list: ['N', 'F', 'E', 'D']

How can I achieve this with a list comprehension?
Two nested loops did the job, but that did not look pythonic..

I tried
print [x for x in y for y in c_vars]
and got NameError: name 'y' is not defined.

Hi,

I think you do it with a generator like this:

def flatten(nested):
    for sublist in nested:
        for element in sublist:
            yield element

n = [['N', 'F'], ['E'], ['D']]
output = []

for value in flatten(n):
    output.append(value)

print output

Have a merry Christmas

Peter Nuttall

Nick Coghlan · Dec 6, 2004

Peter said:
I think you do it with a generator like this:

def flatten(nested):
    for sublist in nested:
        for element in sublist:
            yield element

n = [['N', 'F'], ['E'], ['D']]
output = []

for value in flatten(n):
    output.append(value)

print output

I highly recommend learning about the stdlib module "itertools" (I only really
looked into it recently). The above can be done in 3 lines (counting imports):

from itertools import chain
n = [['N', 'F'], ['E'], ['D']]
print [chain(*n)]

Documentation:
http://www.python.org/doc/2.3.4/lib/itertools-functions.html

Cheers,
Nick.

Peter Otten · Dec 6, 2004

Nick said:
from itertools import chain
n = [['N', 'F'], ['E'], ['D']]
print [chain(*n)]

However, [generator] is not the same as list(generator):

>>> from itertools import chain
>>> n = [['N', 'F'], ['E'], ['D']]
>>> print [chain(*n)]
[<itertools.chain object at 0x...>]
>>> print list(chain(*n))
['N', 'F', 'E', 'D']

And with the star operator you are forgoing some laziness, usually an
important selling point for the iterator approach. Therefore:

>>> n = [['N', 'F'], ['E'], ['D']]
>>> lazyItems = (x for y in n for x in y)
>>> lazyItems.next()
'N'
>>> list(lazyItems)
['F', 'E', 'D']

Of course this makes most sense when you want to keep the original n anyway
_and_ can be sure it will not be mutated while you are still drawing items
from the lazyItems generator.

Peter
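
As a hedged aside for readers on current Python: `itertools.chain.from_iterable` was added to the standard library after this thread was written (Python 2.6), and it offers the same laziness without the star operator. A small sketch in Python 3 syntax; the names `lazy_items`, `first` and `rest` are illustrative.

```python
from itertools import chain

n = [['N', 'F'], ['E'], ['D']]

# chain.from_iterable takes the outer iterable itself, so the sublists
# do not have to be unpacked up front the way chain(*n) requires.
lazy_items = chain.from_iterable(n)

first = next(lazy_items)   # consumes only the first element
rest = list(lazy_items)    # drains whatever is left

print(first)
print(rest)
```

The same caveat applies as for the generator expression: the result is only meaningful if `n` is not mutated while items are still being drawn.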

Nick Coghlan · Dec 6, 2004

Peter said:
Nick Coghlan wrote:

from itertools import chain
n = [['N', 'F'], ['E'], ['D']]
print [chain(*n)]

However, [generator] is not the same as list(generator):

Heh - good point. As you say, replacing with list() gives the intended answer.

With regard to laziness, my main point was that itertools is handy for
manipulating sequences, even if you aren't exploiting its capacity for lazy
evaluation.

Cheers,
Nick.

Serhiy Storchaka · Dec 6, 2004

Timothy said:
I have a list that looks like [['N', 'F'], ['E'], ['D']]
I am trying to flatten it into one list: ['N', 'F', 'E', 'D']

How can I achieve this with a list comprehension?
Two nested loops did the job, but that did not look pythonic..

I tried
print [x for x in y for y in c_vars]
and got NameError: name 'y' is not defined.

sum(c_vars, [])

Timothy Babytch · Dec 6, 2004

Serhiy said:
>>> sum([['N', 'F'], ['E'], ['D']], [])
['N', 'F', 'E', 'D']

THE BEST!

Peter Hansen · Dec 6, 2004

Timothy said:
Serhiy said:

>>> sum([['N', 'F'], ['E'], ['D']], [])
['N', 'F', 'E', 'D']

THE BEST!

Hmmm. Maybe, unless readability as in "self-documenting code"
is important to you...

Preceding the above with "flatten = sum" would perhaps be
an adequate improvement.

-Peter
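
One way to read the readability suggestion above, sketched in Python 3 syntax: since a bare `flatten = sum` would still need the `[]` start value at every call site, a small named wrapper may be clearer. The name `flatten_one_level` is hypothetical, not something proposed in the thread.

```python
def flatten_one_level(list_of_lists):
    """Concatenate the sublists of a list of lists into one new list."""
    # sum() starts from the empty list and adds each sublist in turn.
    return sum(list_of_lists, [])


print(flatten_one_level([['N', 'F'], ['E'], ['D']]))
```

This only works when every element really is a list; a bare string among the elements makes the list concatenation raise a TypeError.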

Adam DePrince · Dec 8, 2004

Timothy said:
Serhiy said:

>>> sum([['N', 'F'], ['E'], ['D']], [])
['N', 'F', 'E', 'D']

THE BEST!

Sum certainly takes the cake for hackish elegance, and would be my
choice if I was absolutely certain that my data structure was exactly a
list of lists. A more general solution with applicability to arbitrary
levels of nesting is below.

a = [['N','F'],['E'],['D']]
b = [[['A','B',['C','D']],'N','F'],'E', ['F'] ]
c = iter([[iter(['A','B',('C','D')]),'N','F'],'E', iter(['F']) ])

def flatten( i ):
    try:
        i = i.__iter__()
        while 1:
            j = flatten( i.next() )
            try:
                while 1:
                    yield j.next()
            except StopIteration:
                pass
    except AttributeError:
        yield i

if __name__ == "__main__":
    print list( flatten( a ) )
    print list( flatten( b ) )
    print list( flatten( c ) )

Which when run gives you ...

['N', 'F', 'E', 'D']
['A', 'B', 'C', 'D', 'N', 'F', 'E', 'F']
['A', 'B', 'C', 'D', 'N', 'F', 'E', 'F']

Adam DePrince

Steven Bethard · Dec 8, 2004

Adam said:
def flatten( i ):
    try:
        i = i.__iter__()
        while 1:
            j = flatten( i.next() )
            try:
                while 1:
                    yield j.next()
            except StopIteration:
                pass
    except AttributeError:
        yield i

Probably you want to catch a TypeError instead of an AttributeError;
objects may support the iterator protocol without defining an __iter__
method:
>>> class C:
...     def __getitem__(self, index):
...         if index > 3:
...             raise IndexError(index)
...         return index
...
>>> list(iter(C()))
[0, 1, 2, 3]

I would write your code as something like:
>>> def flatten(i):
...     try:
...         if isinstance(i, basestring):
...             raise TypeError('strings are atomic')
...         iterable = iter(i)
...     except TypeError:
...         yield i
...     else:
...         for item in iterable:
...             for sub_item in flatten(item):
...                 yield sub_item
...
>>> list(flatten([['N','F'],['E'],['D']]))
['N', 'F', 'E', 'D']
>>> list(flatten([C(), 'A', 'B', ['C', C()]]))
[0, 1, 2, 3, 'A', 'B', 'C', 0, 1, 2, 3]

Note that I special-case strings because, while strings support the
iterator protocol, in this case we want to consider them 'atomic'. By
catching the TypeError instead of an AttributeError, I can support
old-style iterators as well.

Steve
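
For readers on a current interpreter, here is a rough Python 3 rendering of the same idea: call `iter()`, catch TypeError, and treat strings as atomic. It is a sketch under stated assumptions, not the author's code: `basestring` no longer exists, so `(str, bytes)` stands in for it, and `yield from` replaces the inner loop.

```python
def flatten(obj):
    """Yield the leaves of an arbitrarily nested iterable, depth first."""
    # Strings and bytes are iterable, but are treated as atomic here.
    if isinstance(obj, (str, bytes)):
        yield obj
        return
    try:
        iterator = iter(obj)
    except TypeError:
        # Not iterable at all, so it is a leaf.
        yield obj
    else:
        for item in iterator:
            yield from flatten(item)


class C:
    # Supports only the sequence protocol (__getitem__), with no __iter__.
    def __getitem__(self, index):
        if index > 3:
            raise IndexError(index)
        return index


print(list(flatten([['N', 'F'], ['E'], ['D']])))
print(list(flatten([C(), 'A', 'B', ['C', C()]])))
```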

Peter Otten · Dec 8, 2004

Adam said:
def flatten( i ):
    try:
        i = i.__iter__()
        while 1:
            j = flatten( i.next() )
            try:
                while 1:
                    yield j.next()
            except StopIteration:
                pass
    except AttributeError:
        yield i

While trying to break your code with a len() > 1 string I noted that strings
don't feature an __iter__ attribute. Therefore obj.__iter__() is not
equivalent to iter(obj) for strings. Do you (plural) know whether this is a
CPython implementation accident or can be relied upon?

Peter

Nick Craig-Wood · Dec 8, 2004

Adam DePrince said:
def flatten( i ):
    try:
        i = i.__iter__()
        while 1:
            j = flatten( i.next() )
            try:
                while 1:
                    yield j.next()
            except StopIteration:
                pass
    except AttributeError:
        yield i

Hmm, there is more to that than meets the eye! I was expecting

print list(flatten("hello"))

to print

['h', 'e', 'l', 'l', 'o']

But it didn't, it printed

['hello']

With a little more investigation I see that str has no __iter__
method. However you can call iter() on a str
>>> for c in iter("hello"):
...     print c
...
h
e
l
l
o

Or even

>>> for c in "hello":
...     print c
...
h
e
l
l
o

...and this works because str supports __getitem__, according to the
docs.

So there is some magic going on here! Is str defined to never have an
__iter__ method? I see no reason why it couldn't one day have an
__iter__ method though.
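
For what it is worth, a short check under the assumption of a current Python 3 interpreter: there `str` does define `__iter__`, so code that relies on the attribute being absent behaves differently than it did in the Python 2 of this thread. The variable names are illustrative.

```python
# In Python 3, str and bytes both expose __iter__ directly.
str_has_iter = hasattr("hello", "__iter__")
bytes_has_iter = hasattr(b"hello", "__iter__")

# iter() works either way, which is why it is the more robust test.
letters = list(iter("hello"))

print(str_has_iter, bytes_has_iter)
print(letters)
```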

Steven Bethard · Dec 8, 2004

Peter said:
I noted that strings
don't feature an __iter__ attribute. Therefore obj.__iter__() is not
equivalent to iter(obj) for strings. Do you (plural) know whether this is a
CPython implementation accident or can be relied upon?

> With a little more investigation I see that str has no __iter__
> method. However you can call iter() on a str [snip]
> ...and this works because str supports __getitem__ according to the
> docs.
>
> So there is some magic going on here! Is str defined to never have an
> __iter__ method? I see no reason why it couldn't one day have an
> __iter__ method though.

The magic is the old-style iteration protocol (also called the 'sequence
protocol') which calls __getitem__ starting at 0 until an IndexError is
raised. From the docs:

http://www.python.org/doc/lib/built-in-funcs.html

iter( o[, sentinel])
Return an iterator object. The first argument is interpreted very
differently depending on the presence of the second argument. Without a
second argument, o must be a collection object which supports the
iteration protocol (the __iter__() method), or it must support the
sequence protocol (the __getitem__() method with integer arguments
starting at 0). If it does not support either of those protocols,
TypeError is raised...

I looked around to see if there was any talk specifically of str and
__iter__/__getitem__ and couldn't find any. (Though I wouldn't claim
that this means it's not out there.)

My guess is that there isn't
any guarantee that str objects might not one day grow an __iter__
method, so I wouldn't rely on it.

See my other post that uses iter() and TypeError instead of .__iter__()
and AttributeError -- it's relatively simple to avoid relying on
.__iter__, and doing so also allows you to support other objects that
support the sequence protocol but have no __iter__ method.

Steve
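
The three behaviours of `iter()` described in the documentation excerpt above can be sketched as follows, in Python 3 syntax. The class `Countdown` and the variable names are hypothetical, made up for this illustration.

```python
class Countdown:
    """Hypothetical class with only __getitem__ and no __iter__."""
    def __getitem__(self, index):
        if index > 3:
            raise IndexError(index)
        return 3 - index


# Sequence protocol: iter() calls __getitem__ with 0, 1, 2, ... until
# IndexError is raised.
from_sequence = list(iter(Countdown()))

# Neither protocol supported: iter() raises TypeError.
try:
    iter(42)
    int_is_iterable = True
except TypeError:
    int_is_iterable = False

# Two-argument form: call the function until it returns the sentinel.
values = iter([5, 6, 0, 7])
until_zero = list(iter(lambda: next(values), 0))

print(from_sequence)
print(int_is_iterable)
print(until_zero)
```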

Adam DePrince · Dec 8, 2004

Steven said:
Adam said:

def flatten( i ):
    try:
        i = i.__iter__()
        while 1:
            j = flatten( i.next() )
            try:
                while 1:
                    yield j.next()
            except StopIteration:
                pass
    except AttributeError:
        yield i

Probably you want to catch a TypeError instead of an AttributeError;
objects may support the iterator protocol without defining an __iter__
method:
>>> class C:
...     def __getitem__(self, index):
...         if index > 3:
...             raise IndexError(index)
...         return index
...
>>> list(iter(C()))
[0, 1, 2, 3]

I would write your code as something like:

>>> def flatten(i):
...     try:
...         if isinstance(i, basestring):
...             raise TypeError('strings are atomic')
...         iterable = iter(i)
...     except TypeError:
...         yield i
...     else:
...         for item in iterable:
...             for sub_item in flatten(item):
...                 yield sub_item
...
>>> list(flatten([['N','F'],['E'],['D']]))
['N', 'F', 'E', 'D']
>>> list(flatten([C(), 'A', 'B', ['C', C()]]))
[0, 1, 2, 3, 'A', 'B', 'C', 0, 1, 2, 3]

Note that I special-case strings because, while strings support the
iterator protocol, in this case we want to consider them 'atomic'. By
catching the TypeError instead of an AttributeError, I can support
old-style iterators as well.

Of course, iter( "a" ).next() is "a" -- if you don't look for the
special case of a string you will recurse until you blow your stack. The
problem with a special case is that it misses objects that have string-like
behavior but are not instances of basestring. This change deals with that
case, albeit at the expense of some performance (it adds a small
O(depth) factor).

def flatten(i, history=[]):
    try:
        # Treat i as atomic if it is a string, or if it is the very same
        # object as one of its ancestors (e.g. a one-character string).
        if isinstance(i, basestring) or reduce(lambda x, y: x or y,
                map(lambda x: i is x, history), False):
            raise TypeError('strings are atomic')
        iterable = iter(i)
    except TypeError:
        yield i
    else:
        history = history + [i]
        for item in iterable:
            for sub_item in flatten(item, history):
                yield sub_item

if __name__ == "__main__":
    print list( flatten( a ) )
    print list( flatten( b ) )
    print list( flatten( c ) )

Adam DePrince
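
A rough Python 3 sketch of the history check discussed above, with `any()` standing in for the `reduce`/`map` combination. It keeps the explicit string test so the result does not depend on whether the interpreter reuses one-character string objects, and it uses a self-containing list (my own example, not one from the thread) to show the identity check stopping the recursion.

```python
def flatten(obj, history=()):
    """Flatten nested iterables; objects that contain themselves are leaves."""
    # Atomic if it is a string, or if it is the very same object as one
    # of its ancestors, in which case descending would never terminate.
    if isinstance(obj, (str, bytes)) or any(obj is seen for seen in history):
        yield obj
        return
    try:
        iterator = iter(obj)
    except TypeError:
        yield obj
        return
    # Build a new tuple so sibling branches do not see each other's history.
    history = history + (obj,)
    for item in iterator:
        yield from flatten(item, history)


cyclic = [1, 2]
cyclic.append(cyclic)          # the list now contains itself

result = list(flatten(cyclic))
print(result[:2], result[2] is cyclic)
print(list(flatten([['N', 'F'], ['E'], ['D']])))
```

The identity scan costs one comparison per ancestor, which is the small O(depth) factor mentioned in the post.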

Steven Bethard · Dec 8, 2004

Adam said:
Of course, iter( "a" ).next() is "a" -- if you don't look for the
special case of a string you will spin until you blow your stack. The
problem with a special case is it misses objects that have string like
behavior but are not members of basestring.

Yup. Unfortunately, there's no "string protocol" like there's an
"iterator protocol" or we could check this kind of thing easier. Seems
like your history check might be the best option if you need to support
this kind of thing.

Steve

Terry Reedy · Dec 9, 2004

Steven Bethard said:
Probably you want to catch a TypeError instead of an AttributeError;
objects may support the iterator protocol without defining an __iter__
method:

No, having an __iter__ method that returns an iterator is an essential half
of the current iterator protocol just so that iter(iterator) (==
iterator.__iter__()) always works. This requirement conveniently makes
'iterator' a subcategory of 'iterable'. (I am ignoring the old and obsolete
getnext protocol, as does the itertools library module.)

Terry J. Reedy

Steven Bethard · Dec 9, 2004

Terry said:
No, having an __iter__ method that returns an iterator is an essential half
of the current iterator protocol just so that iter(iterator) (==
iterator.__iter__()) always works. This requirement conveniently makes
'iterator' a subcategory of 'iterable'.

Yeah, you're right, I probably should have referred to it as the
'iterable protocol' instead of the 'iterator protocol'.

> (I am ignoring the old and obsolete
> getnext protocol, as does the itertools library module.)

What is the getnext protocol? Is that the same thing that the iter()
docs call the sequence protocol? Because this definitely still works
with itertools:
>>> class C:
...     def __getitem__(self, index):
...         if index > 3:
...             raise IndexError(index)
...         return index
...
>>> import itertools
>>> list(itertools.imap(str, C()))
['0', '1', '2', '3']

Steve

Terry Reedy · Dec 9, 2004

Steven Bethard said:
What is the getnext protocol? Is that the same thing that the iter()
docs call the sequence protocol?

Yes. (I meant to write getitem rather than getnext.)

Because this definitely still works with itertools:

Yes, not because itertools are cognizant of sequence objects but because
itertools apply iter() to inputs and because iter() currently accommodates
sequence-protocol objects as well as iterable-protocol objects by wrapping
the former with builtin <iterator> objects. I expect that may change if
and when the builtin C-coded types are updated to have __iter__ methods.
This is a ways off, if ever, but I think the general advice for user code
is to use the newer protocol. So, for the purpose of writing new code, I
think it justified to forget about or at least ignore the older iteration
protocol.

Terry J. Reedy
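
To make the "newer protocol" concrete, here is a minimal sketch in Python 3 syntax, where the iterator method is spelled `__next__` rather than the `next` of the Python 2 used in this thread. The classes `UpTo` and `UpToIterator` are hypothetical names chosen for the example.

```python
class UpTo:
    """Hypothetical iterable that yields 0, 1, ..., limit."""
    def __init__(self, limit):
        self.limit = limit

    def __iter__(self):
        # Returning a fresh iterator each time keeps the object re-iterable.
        return UpToIterator(self.limit)


class UpToIterator:
    def __init__(self, limit):
        self.limit = limit
        self.current = 0

    def __iter__(self):
        # An iterator returns itself, so iter(iterator) always works.
        return self

    def __next__(self):
        if self.current > self.limit:
            raise StopIteration
        value = self.current
        self.current += 1
        return value


numbers = UpTo(3)
print(list(numbers))
print(list(numbers))   # a second pass works because __iter__ starts afresh
```

Splitting the iterable from its iterator is a design choice; a single class whose `__iter__` returns `self` also satisfies the protocol but can be consumed only once.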
