Compare tuples of different lenght

J

Jurgens de Bruin

Hi,

I have a list of tuples:

[(2,),(12,13),(2,3,4),(8,),(5,6),(7,8,9),]

I would like to compare all the tuples to each other and if one
element if found two tuples the smallest tuples is removed from the
list.

example if tuple 1 and tuple 3 are compare it should find that a
single element in each are the same and tuple 1 should be removed
resulting in

[(12,13),(2,3,4),(8,),(5,6),(7,8,9),]

the same for tuple 4 and 6 resulting in

[(12,13),(2,3,4),(5,6),(7,8,9),]

is this possible as I am having no success.

Thanks
 
C

Chris Rebert

Hi,

I have a list of tuples:

[(2,),(12,13),(2,3,4),(8,),(5,6),(7,8,9),]

I would like to compare all the tuples to each other and if one
element if found two tuples the smallest tuples is removed from the
list.

So, would [(5,6), (6,7,8)] become [(6,7,8)] ?

If no, then I believe you're trying to solve the set covering problem:
http://en.wikipedia.org/wiki/Set_cover_problem

Cheers,
Chris
 
J

Jurgens de Bruin

I have a list of tuples:
[(2,),(12,13),(2,3,4),(8,),(5,6),(7,8,9),]

I would like to compare all the tuples to each other and if one
element if found two tuples the smallest tuples is removed from the
list.

So, would [(5,6), (6,7,8)] become [(6,7,8)] ?

If no, then I believe you're trying to solve the set covering problem:http://en.wikipedia.org/wiki/Set_cover_problem

Cheers,
Chris
--http://rebertia.com

[(5,6), (6,7,8)] would become [(6,7,8)].

Thanks for the response
 
J

Jurgens de Bruin

I have a list of tuples:
[(2,),(12,13),(2,3,4),(8,),(5,6),(7,8,9),]

I would like to compare all the tuples to each other and if one
element if found two tuples the smallest tuples is removed from the
list.

So, would [(5,6), (6,7,8)] become [(6,7,8)] ?

If no, then I believe you're trying to solve the set covering problem:http://en.wikipedia.org/wiki/Set_cover_problem

Cheers,
Chris
--http://rebertia.com

[(5,6), (6,7,8)] will indeed become [(6,7,8)]

Tanks!!
 
S

Steven D'Aprano

Jurgens said:
Hi,

I have a list of tuples:

[(2,),(12,13),(2,3,4),(8,),(5,6),(7,8,9),]

I would like to compare all the tuples to each other and if one
element if found two tuples the smallest tuples is removed from the
list.

It's not clear what you mean by "smallest" tuple. Is (8,) smaller than
(7,8,9)?

I'm going to guess you care only about the length of the tuple, and not the
items themselves.

Let's start with a couple of helper functions.

def compare(t1, t2):
'Return -1 if t1 is "smaller" than t2, 0 if equal, and +1 if "bigger".'
if len(t1) < len(t2): return -1
elif len(t1) > len(t2): return 1
else: return 0

def match_any_item(t1, t2):
try:
s1 = set(t1)
s2 = set(t2)
return bool(s1 & s2)
except TypeError:
# Can't convert to sets because at least one item is mutable.
# Let's do this the slow(?) way.
matched = [x for x in t1 if x in t2]
return bool(matched)



list_of_tuples = [(2,),(12,13),(2,3,4),(8,),(5,6),(7,8,9),]
flags = [True]*len(list_of_tuples)
for i,t1 in enumerate(list_of_tuples):
for j in range(i+1, len(list_of_tuples)):
t2 = list_of_tuples[j]
if match_any_item(t1, t2):
n = compare(t1, t2)
if n == -1:
# Flag t1 to be removed.
flags = False
elif n == 1:
# Flag t2 to be removed.
flags[j] = False

saved_tuples = []
for t,flag in zip(list_of_tuples, flags):
if flag: saved_tuples.append(t)



This gives:
[(12, 13), (2, 3, 4), (5, 6), (7, 8, 9)]


which matches what you wanted:
[(12,13),(2,3,4),(5,6),(7,8,9),]
 
J

Jurgens de Bruin

Jurgens said:
I have a list of tuples:
[(2,),(12,13),(2,3,4),(8,),(5,6),(7,8,9),]

I would like to compare all the tuples to each other and if one
element if found two tuples the smallest tuples is removed from the
list.

It's not clear what you mean by "smallest" tuple. Is (8,) smaller than
(7,8,9)?

I'm going to guess you care only about the length of the tuple, and not the
items themselves.

Let's start with a couple of helper functions.

def compare(t1, t2):
    'Return -1 if t1 is "smaller" than t2, 0 if equal, and +1 if "bigger".'
    if len(t1) < len(t2): return -1
    elif len(t1) > len(t2): return 1
    else: return 0

def match_any_item(t1, t2):
    try:
        s1 = set(t1)
        s2 = set(t2)
        return bool(s1 & s2)
    except TypeError:
        # Can't convert to sets because at least one item is mutable.
        # Let's do this the slow(?) way.
        matched = [x for x in t1 if x in t2]
        return bool(matched)

list_of_tuples = [(2,),(12,13),(2,3,4),(8,),(5,6),(7,8,9),]
flags = [True]*len(list_of_tuples)
for i,t1 in enumerate(list_of_tuples):
    for j in range(i+1, len(list_of_tuples)):
        t2 = list_of_tuples[j]
        if match_any_item(t1, t2):
            n = compare(t1, t2)
            if n == -1:
                # Flag t1 to be removed.
                flags = False
            elif n == 1:
                # Flag t2 to be removed.
                flags[j] = False

saved_tuples = []
for t,flag in zip(list_of_tuples, flags):
    if flag: saved_tuples.append(t)

This gives:

[(12, 13), (2, 3, 4), (5, 6), (7, 8, 9)]

which matches what you wanted:
[(12,13),(2,3,4),(5,6),(7,8,9),]


Thanks Steven. This works great!!!

Appreciated very much!!!
 
P

Peter Otten

Jurgens said:
Hi,

I have a list of tuples:

[(2,),(12,13),(2,3,4),(8,),(5,6),(7,8,9),]

I would like to compare all the tuples to each other and if one
element if found two tuples the smallest tuples is removed from the
list.

example if tuple 1 and tuple 3 are compare it should find that a
single element in each are the same and tuple 1 should be removed
resulting in

[(12,13),(2,3,4),(8,),(5,6),(7,8,9),]

the same for tuple 4 and 6 resulting in

[(12,13),(2,3,4),(5,6),(7,8,9),]

is this possible as I am having no success.

Thanks

from collections import Counter, defaultdict
from itertools import chain

def process_counter(sample):
c = Counter()
d = defaultdict(list)
for items in sample:
c.update(items)
d[len(items)].append(items)

result = []
for cluster in sorted(d.values(), key=len):
c.subtract(chain.from_iterable(cluster))
for items in cluster:
if not any(c[item] for item in items):
result.append(items)

result.sort(key=sample.index)
return result

if __name__ == "__main__":
for process in [process_counter]:
print process.__name__

sample = [(2,),(12,13),(2,3,4),(8,),(5,6),(7,8,9),]
wanted = [(12,13),(2,3,4),(5,6),(7,8,9),]
assert process(sample) == wanted


sample = [(5,6), (6,7,8)]
wanted = [(6,7,8)]
got = process(sample)
assert got == wanted

sample = wanted = [(5, 6), (6, 7)]
assert process(sample) == wanted

sample = [(1,), (1, 2), (2, 3, 4)]
wanted = [(2, 3, 4)]
assert process(sample) == wanted
 
J

John O'Hagan

Hi,

I have a list of tuples:

[(2,),(12,13),(2,3,4),(8,),(5,6),(7,8,9),]

I would like to compare all the tuples to each other and if one
element if found two tuples the smallest tuples is removed from the
list.
[...]

This should work:

def long_match(tuples):
sorted_tuples = sorted(tuples, key=len)
for n, t in enumerate(sorted_tuples):
for s in sorted_tuples[n + 1:]:
if len(s) > len(t) and any(i in s for i in t):
tuples.remove(t)
break
return tuples


Regards,

John
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,961
Messages
2,570,130
Members
46,689
Latest member
liammiller

Latest Threads

Top