testing if a list contains a sublist

J

Johannes

Am 16.08.2011 10:00, schrieb Laszlo Nagy:
Error free? Consider this stated requirement:
l1 = [1,2,2,], l2 = [1,2,3,4,5] -> l1 is not contained in l2
If you look it the strict way, "containment" relation for lists is meant
this way:


l1 = []
l2 = [1,l1,2] # l2 CONTAINS l1

But you are right, I was wrong. So let's clarify what the OP wants!

For example:

l1 = [1,2,2,], l2 = [2,1,2,3,4,5]
I dont care about this case, because all list are ordered for me.

I've chosen the following solution
def _list_contained_in_list(l1,l2):
d1 = {}
d2 = {}
for i in l1:
if i in d1:
d1 += 1
else:
d1 = 1
for i in l2:
if i in d2:
d2 += 1
else:
d2 = 1
if not all([k in d2.keys() for k in d1.keys()]):
return false
return all([d1 <= d2 for i in d1])



greatz Johannes
 
A

Alain Ketterlin

Laszlo Nagy said:
For numbers, it will always work.

I'm not even sure: if str() is locale-sensitive, it may introduce commas
in numbers (I don't know whether str() is locale-sensitive, the doc
doesn't mention anything.)
But what about

lst1 = [",",",,"]
lst1 = [",",",",","]

Yes.

It will also fail on nested lists, for fundamental reasons which are
impossible to handle with regexps. (Tough I'm not sure the OP had nested
lists in mind.)

The "brute-force" algorithm given somewhere else in this thread is
probably the way to go, unless the lists are really long, in which case
one of the "string searching" algorithm should be used (I would be
surprised noone has implemented Boyer-Moore or Karp-Rabin).

-- Alain.
 
N

Neil Cerutti

That can be easily fixed:
s1 = ','.join(map(str, lst1))
s2 = ','.join(map(str, lst2))
return False if s2.find(s1)==-1 else True
sublist([1,2,3],[1,2,3,4,5]) True
sublist([1,2,2],[1,2,3,4,5]) False
sublist([1,2,3],[1,3,5,7]) False
sublist([12],[1,2]) False

String conversion is risky:
sublist(['1,2', '3,4'], [1, 2, 3, 4])
True

Since we're bike-shedding, here's my entry. It's not clear to me
if accepting iterables rather than lists is a win, but I thought,
"Why not be general if the implementation is easy?"

def is_subseq_of(x, y):
r"""Return True if the elements in iterable x are found contiguously in
iterable y.
is_subseq_of([], []) True
is_subseq_of([], [1, 2, 3]) True
is_subseq_of([1], [1, 2, 3]) True
is_subseq_of([1], []) False
is_subseq_of([4, 5], [1, 2, 3, 4, 5]) True
is_subseq_of([1, 2], [1, 3, 2]) False
is_subseq_of([2, 3], [1, 2, 3, 4]) True
is_subseq_of([1, 2, 2], [1, 2, 3, 4, 5]) False
is_subseq_of([1, 2], [1, 2]) True
is_subseq_of([1, 2, 3], [1, 2]) False
is_subseq_of(['1,2', '3,4'], [1, 2, 3, 4])
False
"""
x = tuple(x)
ix = 0
lenx = len(x)
if lenx == 0:
return True
for elem in y:
if x[ix] == elem:
ix += 1
else:
ix = 0
if ix == lenx:
return True
return False

if __name__ == '__main__':
import doctest
doctest.testmod()
 
C

ChasBrown

hi list,
what is the best way to check if a given list (lets call it l1) is
totally contained in a second list (l2)?
for example:
l1 = [1,2], l2 = [1,2,3,4,5] -> l1 is contained in l2
l1 = [1,2,2,], l2 = [1,2,3,4,5] -> l1 is not contained in l2
l1 = [1,2,3], l2 = [1,3,5,7] -> l1 is not contained in l2
my problem is the second example, which makes it impossible to work with
sets insteads of lists. But something like set.issubset for lists would
be nice.
greatz Johannes
My best guess:
from collections import Counter

There's no reason to think that the Original Poster wants a multiset based
solution. He asked about lists and sublists. That's a standard term, like
substring:

"12" is a substring of "01234".
"21" and "13" are not.

[1, 2] is a sublist of [0, 1, 2, 3, 4].
[2, 1] and [1, 3] are not.

Since lists are ordered, so are sublists.

That's reasonable; although except in the subject, the OP never uses
the term 'sublist'; instead using more ambiguous terms like
'contains', 'is totally contained', etc., with definition by limited
example. So it was a bit of a guess on my part of what was wanted.
If the OP does want a solution that ignores order, then he needs to describe
his problem better.

As it turns out, in another response the OP says he wants [2,1,2] to
be 'contained' by [1,2,2]. But in any case he always has sorted lists,
in which case, interestingly, the multiset approach and your more
canonical sublist approach yield the same results.

Cheers - Chas
 
N

Nobody

How about using Python's core support for "==" on list objects:
for i in range(alist_sz - slist_sz + 1):
if slist == alist[i:i+slist_sz]:
return True

This is bound to be asymptotically O(alist_sz * slist_sz), even if the
constant factor is reduced by use of ==. Boyer-Moore and regexps are
asymptotically O(alist_sz). However, the setup costs are much higher, so
you might need alist_sz to be very large before they win out.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,159
Messages
2,570,879
Members
47,416
Latest member
LionelQ387

Latest Threads

Top