merge list of tuples with list

Daniel Wagner · Oct 20, 2010

Hello Everyone,

I'm new in this group and I hope it is ok to directly ask a question.

My short question: I'm searching for a nice way to merge a list of
tuples with another tuple or list. Short example:
a = [(1,2,3), (4,5,6)]
b = (7,8)

After the merging I would like to have an output like:
a = [(1,2,3,7), (4,5,6)]

It was possible for me to create this output using a "for i in a"
technique but I think this isn't a very nice way and there should
exist a solution using the map(), zip()-functions....

I appreciate any hints how to solve this problem efficiently.

Greetings,
Daniel Wagner

James Mills · Oct 20, 2010

My short question: I'm searching for a nice way to merge a list of
tuples with another tuple or list. Short example:
a = [(1,2,3), (4,5,6)]
b = (7,8)

After the merging I would like to have an output like:
a = [(1,2,3,7), (4,5,6)]

What happens with the 8 in the 2nd tuple b ?

cheers
James

Daniel Wagner · Oct 20, 2010

My short question: I'm searching for a nice way to merge a list of
tuples with another tuple or list. Short example:
a = [(1,2,3), (4,5,6)]
b = (7,8)

After the merging I would like to have an output like:
a = [(1,2,3,7), (4,5,6)]

What happens with the 8 in the 2nd tuple b ?

Ohhhh, I'm sorry! This was a bad typo:
the output should look like:
a = [(1,2,3,7), (4,5,6,8)]

Greetings,
Daniel

Paul Rubin · Oct 20, 2010

Daniel Wagner said:
My short question: I'm searching for a nice way to merge a list of
tuples with another tuple or list. Short example:
a = [(1,2,3), (4,5,6)]
b = (7,8) ...

the output should look like:
a = [(1,2,3,7), (4,5,6,8)]

That is not really in the spirit of tuples, which are basically supposed
to be of fixed size (like C structs). But you could write:

>>> [x+(y,) for x,y in zip(a,b)]

[(1, 2, 3, 7), (4, 5, 6, 8)]
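
For readers who want to run it, here is a minimal Python 3 sketch of the comprehension above. The helper name append_each is made up for illustration and is not from the thread. Note that zip() stops at the shorter input, so surplus items on either side are silently dropped.

```python
def append_each(rows, extras):
    """Return a new list where the i-th tuple gains the i-th extra value."""
    # x + (y,) builds a brand-new tuple; the original tuples are untouched.
    return [x + (y,) for x, y in zip(rows, extras)]


a = [(1, 2, 3), (4, 5, 6)]
b = (7, 8)
merged = append_each(a, b)
print(merged)
```

If the two inputs must have the same length, checking len(rows) == len(extras) before the comprehension is one way to avoid silent truncation.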

MRAB · Oct 20, 2010

Daniel Wagner said:

My short question: I'm searching for a nice way to merge a list of
tuples with another tuple or list. Short example:
a = [(1,2,3), (4,5,6)]
b = (7,8) ...

the output should look like:
a = [(1,2,3,7), (4,5,6,8)]

That is not really in the spirit of tuples, which are basically supposed
to be of fixed size (like C structs). But you could write:

[x+(y,) for x,y in zip(a,b)]

[(1, 2, 3, 7), (4, 5, 6, 8)]

In Python 2.x:

zip(*zip(*a) + [b])

In Python 3.x:

list(zip(*list(zip(*a)) + [b]))

Daniel Wagner · Oct 20, 2010

I used the following code to add a single fixed value to both tuples.
But this is still not what I want...

a = [(1,2,3), (4,5,6)]
b = 1
a = map(tuple, map(lambda x: x + [1], map(list, a)))
a

[(1, 2, 3, 1), (4, 5, 6, 1)]

What I need is:

a = [(1,2,3), (4,5,6)]
b = (7,8)
a = CODE
a

[(1,2,3,7), (4,5,6,8)]

Greetings,
Daniel

Daniel Wagner · Oct 20, 2010

SOLVED! I just found it out....

I'm searching for a nice way to merge a list of
tuples with another tuple or list. Short example:
a = [(1,2,3), (4,5,6)]
b = (7,8)

After the merging I would like to have an output like:
a = [(1,2,3,7), (4,5,6)]

The following code solves the problem:

a = [(1,2,3), (4,5,6)]
b = [7,8]
a = map(tuple, map(lambda x: x + [b.pop(0)] , map(list, a)))
a

[(1, 2, 3, 7), (4, 5, 6, 8)]

Any more efficient ways or suggestions are still welcome!
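
A hedged note on the b.pop(0) approach: the sketch below is a Python 3 adaptation (map() is lazy there, so list() is added to force evaluation). It shows a side effect that is easy to miss, namely that b is consumed while the result is built.

```python
a = [(1, 2, 3), (4, 5, 6)]
b = [7, 8]

# Each call to the lambda pops the first element off b, so b shrinks
# by one item per tuple processed.
merged = list(map(tuple, map(lambda x: x + [b.pop(0)], map(list, a))))

print(merged)
print(b)  # b is now empty
```

Running the same expression a second time would raise IndexError, because there is nothing left to pop.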

Greetings,
Daniel

James Mills · Oct 20, 2010

Any more efficient ways or suggestions are still welcome!

Did you not see Paul Rubin's solution:

[x+(y,) for x,y in zip(a,b)]

[(1, 2, 3, 7), (4, 5, 6, 8)]

I think this is much nicer and probably more efficient.

cheers
James

Chris Torek · Oct 20, 2010

Any more efficient ways or suggestions are still welcome!

Did you not see Paul Rubin's solution:

[x+(y,) for x,y in zip(a,b)]

[(1, 2, 3, 7), (4, 5, 6, 8)]

I think this is much nicer and probably more efficient.

For a slight boost in Python 2.x, use itertools.izip() to avoid
making an actual list out of zip(a,b). (In 3.x, "plain" zip() is
already an iterator rather than a list-result function.)

This method (Paul Rubin's) uses only a little extra storage, and
almost no extra when using itertools.izip() (or 3.x). I think it
is more straightforward than multi-zip-ing (e.g., zip(*zip(*a) + [b]))
as well. The two-zip method needs list()-s in 3.x as well, making
it clearer where the copies occur:

list(zip(*a))     makes the list [(1, 4), (2, 5), (3, 6)]
                  [input value is still referenced via "a" so
                  sticks around]
[b]               makes the tuple (7, 8) into the list [(7, 8)]
                  [input value is still referenced via "b" so
                  sticks around]
+                 adds those two lists producing the list
                  [(1, 4), (2, 5), (3, 6), (7, 8)]
                  [the two input values are no longer referenced
                  and are thus discarded]
list(zip(*that))  makes the list [(1, 2, 3, 7), (4, 5, 6, 8)]
                  [the input value -- the result of the addition
                  in the next to last step -- is no longer
                  referenced and thus discarded]

All these temporary results take up space and time. The list
comprehension simply builds the final result, once.
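
The intermediate values described above can be made visible with a small Python 3 sketch. The variable names columns and with_b are introduced here for illustration only.

```python
a = [(1, 2, 3), (4, 5, 6)]
b = (7, 8)

# Step 1: transpose the rows of a into columns.
columns = list(zip(*a))

# Step 2: wrap b in a list and append it as one more column.
with_b = columns + [b]

# Step 3: transpose back, giving rows that now end in the items of b.
two_zip = list(zip(*with_b))

# The comprehension builds the same result without the temporaries.
comprehension = [x + (y,) for x, y in zip(a, b)]

print(columns)
print(with_b)
print(two_zip)
```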

Of course, I have not used timeit to try this out. Let's do
that, just for fun (and to let me play with timeit from the command
line):

(I am not sure why I have to give the full path to the
timeit.py source here)

sh-3.2$ python /System/Library/Frameworks/Python.framework/\
Versions/2.5/lib/python2.5/timeit.py \
'a=[(1,2,3),(4,5,6)];b=(7,8);[x+(y,) for x,y in zip(a,b)]'
100000 loops, best of 3: 2.55 usec per loop

sh-3.2$ python [long path snipped] \
'a=[(1,2,3),(4,5,6)];b=(7,8);[x+(y,) for x,y in zip(a,b)]'
100000 loops, best of 3: 2.56 usec per loop

sh-3.2$ python [long path snipped] \
'a=[(1,2,3),(4,5,6)];b=(7,8);zip(*zip(*a) + [b])'
100000 loops, best of 3: 3.84 usec per loop

sh-3.2$ python [long path snipped] \
'a=[(1,2,3),(4,5,6)];b=(7,8);zip(*zip(*a) + [b])'
100000 loops, best of 3: 3.85 usec per loop

Hence, even in 2.5 where zip makes a temporary copy of the list,
the list comprehension version is faster. Adding an explicit use
of itertools.izip does help, but not much, with these short lists:

sh-3.2$ python ... -s 'import itertools' \
'a=[(1,2,3),(4,5,6)];b=(7,8);[x+(y,) for x,y in itertools.izip(a,b)]'
100000 loops, best of 3: 2.27 usec per loop

sh-3.2$ python ... -s 'import itertools' \
'a=[(1,2,3),(4,5,6)];b=(7,8);[x+(y,) for x,y in itertools.izip(a,b)]'
100000 loops, best of 3: 2.29 usec per loop

(It is easy enough to move the assignments to a and b into the -s
argument, but it makes relatively little difference since the list
comprehension and two-zip methods both have the same setup overhead.
The "import", however, is pretty slow, so it is not good to repeat
it on every trip through the 100000 loops -- on my machine it jumps
to 3.7 usec/loop, almost as slow as the two-zip method.)
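
The same comparison can be run from inside Python with the timeit module instead of the command line. This is a rough Python 3 sketch, not a reproduction of the measurements above: the loop count is an arbitrary choice, the assignments are moved into the setup string, and the figures will differ by machine and interpreter version.

```python
import timeit

# Setup runs once per timing run and is not included in the measurement.
setup = "a = [(1, 2, 3), (4, 5, 6)]; b = (7, 8)"

statements = {
    "comprehension": "[x + (y,) for x, y in zip(a, b)]",
    "two-zip": "list(zip(*list(zip(*a)) + [b]))",
}

loops = 10000  # arbitrary; the command-line tool picks its own count
timings = {}
for label, stmt in statements.items():
    # Take the best of three runs, as the command-line tool reports.
    best = min(timeit.repeat(stmt, setup=setup, number=loops, repeat=3))
    timings[label] = best / loops * 1e6  # microseconds per loop
    print("%s: %.3f usec per loop" % (label, timings[label]))
```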

Peter Otten · Oct 20, 2010

Daniel said:
Hello Everyone,

I'm new in this group and I hope it is ok to directly ask a question.

My short question: I'm searching for a nice way to merge a list of
tuples with another tuple or list. Short example:
a = [(1,2,3), (4,5,6)]
b = (7,8)

After the merging I would like to have an output like:
a = [(1,2,3,7), (4,5,6)]

It was possible for me to create this output using a "for i in a"
technique but I think this isn't a very nice way and there should
exist a solution using the map(), zip()-functions....

I appreciate any hints how to solve this problem efficiently.

from itertools import starmap, izip
from operator import add
a = [(1,2,3), (4,5,6)]
b = (7,8)
list(starmap(add, izip(a, izip(b))))

[(1, 2, 3, 7), (4, 5, 6, 8)]

This is likely slower than the straightforward

[x + (y,) for x, y in zip(a, b)]

for "short" lists, but should be faster for "long" lists. Of course you'd
have to time-it to be sure.
You should also take into consideration that the latter can be understood
immediately by any moderately experienced pythonista.
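
For anyone on Python 3, where itertools.izip no longer exists, a sketch of the same idea uses the built-in zip(), which is already lazy there. This adaptation is an assumption about how the code would be ported, not part of the original post.

```python
from itertools import starmap
from operator import add

a = [(1, 2, 3), (4, 5, 6)]
b = (7, 8)

# zip(b) wraps each item of b in a 1-tuple: (7,), (8,)
# zip(a, zip(b)) then yields pairs such as ((1, 2, 3), (7,))
# starmap(add, ...) calls add(x, y) on each pair, i.e. tuple concatenation
merged = list(starmap(add, zip(a, zip(b))))
print(merged)
```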

Peter

Daniel Wagner · Oct 20, 2010

Many thanks for all these suggestions! Here is a short proof that you
guys are absolutely right and my solution is pretty inefficient.

One of your ways:

$ python /[long_path]/timeit.py \
'a=[(1,2,3),(4,5,6)];b=(7,8);[x+(y,) for x,y in zip(a,b)]'
1000000 loops, best of 3: 1.44 usec per loop

And my way:

$ python /[long_path]/timeit.py \
'a=[(1,2,3), (4,5,6)];b=[7,8];map(tuple, map(lambda x: x + [b.pop(0)] , map(list, a)))'
100000 loops, best of 3: 5.33 usec per loop

I really appreciate your solutions but they bring me to a new
question: Why is my solution so inefficient? The same operation
without the list/tuple conversion

$ python /[long_path]/timeit.py \
'a=[[1,2,3], [4,5,6]];b=[7,8];map(lambda x: x + [b.pop(0)] , a)'
100000 loops, best of 3: 3.36 usec per loop

is still horribly slow. Could anybody explain to me what makes it so
slow? Is it the map() function or maybe the lambda construct?

Greetings,
Daniel

Steven D'Aprano · Oct 21, 2010

I really appreciate your solutions but they bring me to a new question:
Why is my solution so inefficient? The same operation without the
list/tuple conversion

$ python /[long_path]/timeit.py \
'a=[[1,2,3], [4,5,6]];b=[7,8];map(lambda x: x + [b.pop(0)] , a)'
100000 loops, best of 3: 3.36 usec per loop

is still horribly slow.

What makes you say that? 3 microseconds to create four lists, two
assignments, create a function object, then inside the map look up the
global b twice, the method 'pop' twice, call the method twice, resize the
list b twice, create an inner list twice, concatenate that list with
another list twice, and stuff those two new lists into a new list...
3usec for all that in Python code doesn't seem unreasonable to me.

On my PC, it's two orders of magnitude slower than a pass statement. That
sounds about right to me.

$ python -m timeit
10000000 loops, best of 3: 0.0325 usec per loop
$ python -m timeit \
'a=[[1,2,3], [4,5,6]];b=[7,8];map(lambda x: x + [b.pop(0)] , a)'
100000 loops, best of 3: 4.32 usec per loop

Can we do better?

$ python -m timeit -s 'a=[[1,2,3], [4,5,6]]; f = lambda x: x + [b.pop(0)]' \
'b=[7,8]; map(f, a)'
100000 loops, best of 3: 3.25 usec per loop

On my system, moving the creation of the list a and of the function f out
of the code being timed and into the setup code reduces the time by 25%.
Not too shabby.

Could anybody explain to me what makes it so slow?
Is it the map() function or maybe the lambda construct?

lambdas are just functions -- there is no speed difference between a
function

def add(a, b):
    return a+b

and lambda a, b: a+b
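
One way to check this claim yourself is to disassemble both forms. The sketch below assumes CPython and uses the made-up names add_def and add_lambda. The exact opcodes vary by version, but the two listings should show the same instructions.

```python
import dis


def add_def(a, b):
    return a + b


add_lambda = lambda a, b: a + b

# Both are plain function objects; only the __name__ differs.
print(type(add_def) is type(add_lambda))
print(add_def.__name__, add_lambda.__name__)

# Compare the bytecode of the two by eye.
dis.dis(add_def)
dis.dis(add_lambda)
```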

The looping overhead of map(f, data) is minimal. But in this case, the
function you're calling does a fair bit of work:

lambda x: x + [b.pop(0)]

This has to lookup the global b, resize it, create a new list,
concatenate it with the list x (which creates a new list, not an in-place
concatenation) and return that. The amount of work is non-trivial, and I
don't think that 3us is unreasonable.

But for large lists b, it will become slow, because resizing the list is
slow. Popping from the start on a regular list has to move every element
over, one by one. You may find using collections.deque will be *much*
faster for large lists. (But probably not for small lists.)
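
A minimal sketch of the deque idea, written for Python 3. It assumes b can be copied into a collections.deque first. popleft() removes the first item without shifting the remaining ones, which is what makes it cheaper than list.pop(0) on long inputs.

```python
from collections import deque

a = [[1, 2, 3], [4, 5, 6]]
b = deque([7, 8])

# popleft() takes items off the front in constant time.
merged = [x + [b.popleft()] for x in a]

print(merged)
print(len(b))  # the deque has been consumed
```

As with the list version, the deque is emptied as a side effect, so build a fresh one if b is needed again.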

Personally, the approach I'd take is:

a = [[1,2,3], [4,5,6]]
b = [7,8]
[x+[y] for x,y in zip(a,b)]

Speedwise:

$ python -m timeit -s 'a=[[1,2,3], [4,5,6]]; b=[7,8]' \
'[x+[y] for x,y in zip(a,b)]'
100000 loops, best of 3: 2.43 usec per loop

If anyone can do better than that (modulo hardware differences), I'd be
surprised.

Daniel Wagner · Oct 21, 2010

lambda x: x + [b.pop(0)]

This has to lookup the global b, resize it, create a new list,
concatenate it with the list x (which creates a new list, not an in-place
concatenation) and return that. The amount of work is non-trivial, and I
don't think that 3us is unreasonable.

I forgot to account for the resizing of the list b. Now it makes sense. Thanks!

Personally, the approach I'd take is:

a = [[1,2,3], [4,5,6]]
b = [7,8]
[x+[y] for x,y in zip(a,b)]

Speedwise:

$ python -m timeit -s 'a=[[1,2,3], [4,5,6]]; b=[7,8]' \
'[x+[y] for x,y in zip(a,b)]'
100000 loops, best of 3: 2.43 usec per loop

If anyone can do better than that (modulo hardware differences), I'd be
surprised.

Yeah, this seems to be a nice solution.

Greetings,
Daniel
