intersection of 2 list of pairs

les_ander · Feb 20, 2005

Hi,
I have 2 lists of tuples that look like:
E1=[('a','g'),('r','s')] and
E2=[('g','a'),('r','q'),('f','h')].
In this tuple, the ordering does not
matter, i.e. (u,v) is the same as (v,u).

What I want to do is the following:
given 2 list of tuples, E1 and E2, I want to create another list with
tuples that are common to both. So in the above example I would like
to return ('a','g') as being common.
So far I have the code below but it does not work out properly, I would
be grateful if someone can help me out .
thanks

def list_of_tuples_intersect(E1,E2):

S1=Set()
S2=Set()
for e in E1:
S1.add((e[0],e[1]))
S1.add((e[1],e[0]))
for e in E2:
S2.add((e[0],e[1]))
S2.add((e[1],e[0]))
S= S1 & S2
SS=Set()
done=Set()

for e in S:
if ((e[0],e[1]) in done) or ((e[1],e[0]) in done):
continue
else:
SS.add((e[0],e[1]))
done.add((e[0],e[1]))
done.add((e[1],e[0]))
return SS

Peter Otten · Feb 20, 2005

I have 2 lists of tuples that look like:
E1=[('a','g'),('r','s')] and
E2=[('g','a'),('r','q'),('f','h')].
In this tuple, the ordering does not
matter, i.e. (u,v) is the same as (v,u).

What I want to do is the following:
given 2 list of tuples, E1 and E2, I want to create another list with
tuples that are common to both. So in the above example I would like
to return ('a','g') as being common.

How about

e1 = [('a', 'g'), ('r', 's')]
e2 = [('g', 'a'), ('r', 'q'), ('f', 'h')]
s2 = set(e2)
s2.update((b, a) for (a, b) in e2)
list(set(e1) & s2)

Click to expand...

Click to expand...

[('a', 'g')]

If you are on 2.3, continue to use Set instead of set and modify the update
line to use a list comprehension:

s2.update([(b, a) for (a, b) in e2])

Peter

John Machin · Feb 20, 2005

[email protected] said:
Hi,
I have 2 lists of tuples that look like:
E1=[('a','g'),('r','s')] and
E2=[('g','a'),('r','q'),('f','h')].
In this tuple, the ordering does not
matter, i.e. (u,v) is the same as (v,u).

What I want to do is the following:
given 2 list of tuples, E1 and E2, I want to create another list with
tuples that are common to both. So in the above example I would like
to return ('a','g') as being common.
So far I have the code below but it does not work out properly, I would
be grateful if someone can help me out .

[snip]

This is likely to do what you want, provided the types of the objects
in your tuples define an ordering. The idea is to change the
representation of the tuples to a canonical form (a, b) where a <= b.
By the way, have you considered that there might be duplicates *within*
each list?

E1=[('a','g'),('r','s')]
E2=[('g','a'),('r','q'),('f','h')]
s1 = set((min(x,y),max(x,y)) for x, y in E1)
s1 set([('a', 'g'), ('r', 's')])
s2 = set((min(x,y),max(x,y)) for x, y in E2)
s2 set([('a', 'g'), ('q', 'r'), ('f', 'h')])
answer = s1 & s2
answer set([('a', 'g')])

Click to expand...

Click to expand...

Here's a hint: when you find yourself typing stuff like (x[0],x[1])
..... (x[1],x[0]) more than minimally, the 'wrong way, go back' sign
should flash up.

HTH,

John

James Stroud · Feb 21, 2005

Learned list comprehension today, its cool:

E1=[('a','g'),('r','s')]
E2=[('g','a'),('r','q'),('f','h')]
[x for x in E1 for y in E2 if x == y or (x[1],x[0])==y]

Click to expand...

Click to expand...

[('a', 'g')]

Hi,
I have 2 lists of tuples that look like:
E1=[('a','g'),('r','s')] and
E2=[('g','a'),('r','q'),('f','h')].
In this tuple, the ordering does not
matter, i.e. (u,v) is the same as (v,u).

What I want to do is the following:
given 2 list of tuples, E1 and E2, I want to create another list with
tuples that are common to both. So in the above example I would like
to return ('a','g') as being common.
So far I have the code below but it does not work out properly, I would
be grateful if someone can help me out .
thanks

def list_of_tuples_intersect(E1,E2):

S1=Set()
S2=Set()
for e in E1:
S1.add((e[0],e[1]))
S1.add((e[1],e[0]))
for e in E2:
S2.add((e[0],e[1]))
S2.add((e[1],e[0]))
S= S1 & S2
SS=Set()
done=Set()

for e in S:
if ((e[0],e[1]) in done) or ((e[1],e[0]) in done):
continue
else:
SS.add((e[0],e[1]))
done.add((e[0],e[1]))
done.add((e[1],e[0]))
return SS

Pierre Quentel · Feb 21, 2005

Another method is to build two sets of sets, one for E1 and one for E2,
then make the intersection of these sets

- with Python 2.3

>>> E1=[('a','g'),('r','s')]
>>> E2=[('g','a'),('r','q'),('f','h')]
>>> from sets import Set,ImmutableSet
>>> f=Set([ImmutableSet(s) for s in E1])& Set([ImmutableSet(s) for s in E2])
>>> [tuple(x) for x in f]

Click to expand...

Click to expand...

[('a', 'g')]

- with Python 2.4

>>> E1=[('a','g'),('r','s')]
>>> E2=[('g','a'),('r','q'),('f','h')]
>>> f=set([frozenset(s) for s in E1]) & set([frozenset(s) for s in E2])
>>> [tuple(x) for x in f]

Click to expand...

Click to expand...

[('a', 'g')]

You must use ImmutableSet or frozenset to be able to create a set of sets

Pierre

John Machin · Feb 21, 2005

Pierre said:
Another method is to build two sets of sets, one for E1 and one for E2,
then make the intersection of these sets

- with Python 2.3

E1=[('a','g'),('r','s')]
E2=[('g','a'),('r','q'),('f','h')]
from sets import Set,ImmutableSet
f=Set([ImmutableSet(s) for s in E1])& Set([ImmutableSet(s) for s

Click to expand...

Click to expand...

in

E2])

[tuple(x) for x in f]

Click to expand...

Click to expand...

[('a', 'g')]

- with Python 2.4

E1=[('a','g'),('r','s')]
E2=[('g','a'),('r','q'),('f','h')]
f=set([frozenset(s) for s in E1]) & set([frozenset(s) for s in E2])
[tuple(x) for x in f]

Click to expand...

Click to expand...

[('a', 'g')]

You must use ImmutableSet or frozenset to be able to create a set of sets

Pierre

The problem with using frozenset([x,y)) -- as compared to
(min(x,y),max(x,y))-- is that you don't get an *understandable*
canonical representation of the pair:

frozenset(('a','b')) frozenset(['a', 'b'])
frozenset(('b','c'))

Click to expand...

Click to expand...

frozenset(['c', 'b'])

i.e. it depends on the Python hash function for the type/class being
used. Try explaining that to the first person you meet at the bus stop.
Exceptions prove the rule: if the OP regularly caught the same bus as
the timbot he wouldn't be asking questions like this

In any case the OP still has a problem; when ('a','g') -- or ('g', 'a')
--pops out as the answer, he still doesn't know which list(s) the
duplicates are in, nor which way they are represented. To do something
useful, the output would have to be a list of lists, with the inner
list representing a cluster of duplicates. Each duplicate would be
represented by a unique identifier -- such as a tuple: (list_number,
index_in_that_list) -- so that it can be retrieved, inspected,
jettisoned, ...

bearophileHUGS · Feb 22, 2005

The use of frozenset can okay when sub-sequences are longer, but for
this problem it's slow, and anyway there are situations of repeated
data like this that have to be considered:

frozenset( ('a', 'a') )
==>
frozenset(['a'])

For Py2.4 the faster and better solution seems Peter Otten one.
James Stroud solution is O(n^2) and it can become slow for long
lists...

Bear hugs,
Bearophile

John Machin · Feb 22, 2005

[email protected] said:
The use of frozenset can okay when sub-sequences are longer, but for
this problem it's slow, and anyway there are situations of repeated
data like this that have to be considered:

Not just for frozenset; this has to be considered whatever the
representation.

frozenset( ('a', 'a') )
==>
frozenset(['a'])

For Py2.4 the faster and better solution seems Peter Otten one.
James Stroud solution is O(n^2) and it can become slow for long
lists...

"Better"? In what sense? My point is that the OP's real problem
requires (a) a workable canonical representation of a pair -- lack of
this is most probably the root cause of the problem that he is trying
to solve!! -- and (b) a result that identifies the duplicate pairs
unambiguously so that he can do something with them.

"Faster"? The solutions proposed by Peter and James do more-or-less
answer the OP's ill-formed question, but speed should be the last
consideration, after the real problem is defined carefully.

Don't understand SequenceMatcher from difflib	0	Jun 21, 2011
Average of MultiMode of a list of a list	1	Oct 28, 2022
ray/triangle intersection	0	Nov 5, 2008
Blue J Ciphertext Program	2	Nov 22, 2023
Code help please	4	May 19, 2023
Calculate rang and derang of ordering of subsets	1	Feb 6, 2024
Python3 string slicing	2	Dec 11, 2023
Getting value of instances of variable.	1	Mar 25, 2023

intersection of 2 list of pairs

les_ander

Peter Otten

John Machin

James Stroud

Pierre Quentel

John Machin

bearophileHUGS

John Machin

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads