list comprehention

Mathijs · Jan 19, 2006

Hi,

Python beginner here and very much enjoying it. I'm looking for a
pythonic way to find how many listmembers are also present in a reference
list. Don't count duplicates (eg. if you already found a matching member
in the ref list, you can't use the ref member anymore).

Example1:
ref=[2, 2, 4, 1, 1]
list=[2, 3, 4, 5, 3]
solution: 2

Example2:
ref=[2, 2, 4, 1, 1]
list=[2, 2, 5, 2, 4]
solution: 3 (note that only the first two 2's count, the third 2 in the
list should not be counted)

Any suggestions or comments?

Thanks.
M.

#my failing effort:
sum([min(r.count(n)-l[:i].count(n),l.count(n)) for i,n in enumerate(l)])

#test lists
import random
#reference list
r=[random.randint(1,5) for n in range(5)]
#list
l=[random.randint(1,5) for n in range(5)]

Tim Chase · Jan 19, 2006

Python beginner here and very much enjoying it. I'm looking

> for a pythonic way to find how many listmembers are also
> present in a reference list. Don't count duplicates (eg. if
> you already found a matching member in the ref list, you can't
> use the ref member anymore).
>
> Example1:
> ref=[2, 2, 4, 1, 1]
> list=[2, 3, 4, 5, 3]
> solution: 2
>
> Example2:
> ref=[2, 2, 4, 1, 1]
> list=[2, 2, 5, 2, 4]
> solution: 3 (note that only the first two 2's count, the third
> 2 in the list should not be counted)

It sounds like you're looking for "set" operations: (using "ell"
for clarity)

>>> from sets import Set
>>> a = [2,2,4,1,1]
>>> b = [2,3,4,5,3]
>>> setA = Set(a)
>>> setB = Set(b)
>>> results = setA.intersection(setB)
>>> results Set([2,4])
>>> intersection = [x for x in results]
>>> intersection

Click to expand...

Click to expand...

[2,4]

I'm a tad confused by the help, as it sounds like sets are
supposed to be first-class citizens, but in ver2.3.5 that I'm
running here (or rather "there", on a friend's box), I have to
"import sets" which I didn't see mentioned in the reference manual.

-one of many tims on the list
tim = Set(["bald", "vegetarian", "loving husband"])

markscala · Jan 19, 2006

def reference(alist1,alist2):
counter = 0
for x in lis:
if x in ref:
ref.pop(ref.index(x))
counter += 1
return counter

this works I think for your examples, but you should check it against
them and other cases.
good luck

markscala · Jan 19, 2006

revision of previous:

def reference(reflist,alist2):
counter = 0
for x in alist2:
if x in reflist:
reflist.pop(reflist.index(x))
counter += 1
return counter

markscala · Jan 19, 2006

another approach:

ref = [2,2,4,1,1]
lis = [2,2,5,2,4]

len([ref.pop(ref.index(x)) for x in lis if x in ref])

bonono · Jan 19, 2006

Tim said:
Python beginner here and very much enjoying it. I'm looking
for a pythonic way to find how many listmembers are also
present in a reference list. Don't count duplicates (eg. if
you already found a matching member in the ref list, you can't
use the ref member anymore).

Example1:
ref=[2, 2, 4, 1, 1]
list=[2, 3, 4, 5, 3]
solution: 2

Example2:
ref=[2, 2, 4, 1, 1]
list=[2, 2, 5, 2, 4]
solution: 3 (note that only the first two 2's count, the third
2 in the list should not be counted)

Click to expand...

It sounds like you're looking for "set" operations: (using "ell"
for clarity)

from sets import Set
a = [2,2,4,1,1]
b = [2,3,4,5,3]
setA = Set(a)
setB = Set(b)
results = setA.intersection(setB)
results Set([2,4])
intersection = [x for x in results]
intersection

Click to expand...

Click to expand...

[2,4]

won't set remove duplicates which he wants to preserve ? He is not just
looking for the 'values' that is in common, but the occurence as well,
if I understand the requirement correctly.

I would just build a dict with the value as the key and occurence as
the value then loop the list and lookup.

Peter Otten · Jan 19, 2006

Mathijs said:
Python beginner here and very much enjoying it. I'm looking for a
pythonic way to find how many listmembers are also present in a reference
list. Don't count duplicates (eg. if you already found a matching member
in the ref list, you can't use the ref member anymore).

Example1:
ref=[2, 2, 4, 1, 1]
list=[2, 3, 4, 5, 3]
solution: 2

Example2:
ref=[2, 2, 4, 1, 1]
list=[2, 2, 5, 2, 4]
solution: 3 (note that only the first two 2's count, the third 2 in the
list should not be counted)

Any suggestions or comments?

sum(min(list.count(n), ref.count(n)) for n in set(ref))

Is that it?

Peter

Peter Otten · Jan 19, 2006

Tim said:
I'm a tad confused by the help, as it sounds like sets are
supposed to be first-class citizens, but in ver2.3.5 that I'm
running here (or rather "there", on a friend's box), I have to
"import sets" which I didn't see mentioned in the reference manual.

set and frozenset are a builtins starting with Python 2.4.

http://www.python.org/2.4/highlights.html

"""
New or upgraded built-ins

built-in sets - the sets module, introduced in 2.3, has now been implemented
in C, and the set and frozenset types are available as built-in types (PEP
218)
"""

Peter

Tim Chase · Jan 19, 2006

Python beginner here and very much enjoying it. I'm looking

[snipped]

won't set remove duplicates which he wants to preserve ?

My reading was that the solution shouldn't count duplicates
("Don't count duplicates"). However, once you mentioned it, and
I saw other folks' responses that looked very diff. from my own,
I re-read the OP's comments and found that I missed this key bit:

"note that only the first *two* 2's count, the third 2
in the list should not be counted"
(*emphasis* mine)

and that the desired result was a count (though a len(Set())
would return the right count if the OP wanted the true
intersection, but that's beside the point).

My error. Sorry, ladies and gentlemen

I'm partial to the elegance of markscala's suggestion of:

len([ref.pop(ref.index(x)) for x in lis if x in ref])

though it might need to come with a caveat that it doesn't leave
"ref" in the same state is it was originally, so it should be
copied and then manipulated thusly:

r = ref[:]
len([r.pop(r.index(x)) for x in (lis if x in r])

which would then leave ref undisturbed. As tested:

import random
ref = [random.randint(1,5) for n in range(5)]
for x in range(1,10):
lis = [random.randint(1,5) for n in range(5)]
r = ref[:]
print repr((r,lis))
print len([r.pop(r.index(x)) for x in lis if x in r])

seems to give the results the OP describes.

-tim

Matt · Jan 19, 2006

Tim Chase wrote:

I'm a tad confused by the help, as it sounds like sets are
supposed to be first-class citizens, but in ver2.3.5 that I'm
running here (or rather "there", on a friend's box), I have to
"import sets" which I didn't see mentioned in the reference manual.

-one of many tims on the list
tim = Set(["bald", "vegetarian", "loving husband"])

The builtin "set" was added in 2.4, I believe (note lowercase "s"):

http://docs.python.org/whatsnew/node2.html

print "vegetarian".replace("etari","")

Paddy · Jan 19, 2006

Hi,
I liked the twist at the end when you state that only the first two 2's
count. It reminded me
of my maths O'level revision where you always had to read the question
thoroughly.

Here is what I came up with:

ref [2, 2, 4, 1, 1]
lst [2, 2, 5, 2, 4]
tmp = [ [val]*min(lst.count(val), ref.count(val)) for val in set(ref)]
tmp [[], [2, 2], [4]]
answer = [x for y in tmp for x in y]
answer [2, 2, 4]

Click to expand...

Click to expand...

I took a lot from Peter Ottens reply to generate tmp then flattened the
inner lists.

After a bit more thought, the intermediate calculation of tmp can be
removed with a
little loss in clarity though, to give:

answer = [ val for val in set(ref) for x in range(min(lst.count(val), ref.count(val)))]
answer

Click to expand...

Click to expand...

[2, 2, 4]

- Cheers, Paddy.

Duncan Booth · Jan 20, 2006

Mathijs said:
Python beginner here and very much enjoying it. I'm looking for a
pythonic way to find how many listmembers are also present in a
reference list. Don't count duplicates (eg. if you already found a
matching member in the ref list, you can't use the ref member
anymore).

Example1:
ref=[2, 2, 4, 1, 1]
list=[2, 3, 4, 5, 3]
solution: 2

Example2:
ref=[2, 2, 4, 1, 1]
list=[2, 2, 5, 2, 4]
solution: 3 (note that only the first two 2's count, the third 2 in
the list should not be counted)

Here's the way I would do it:
res = {}
for item in it:
if item in res:
res[item] += 1
else:
res[item] = 1
return res

ref=[2, 2, 4, 1, 1]
lst=[2, 2, 5, 2, 4]
oref = occurrences(ref)
sum(min(v,oref.get(k,0)) for (k,v) in occurrences(lst).iteritems()) 3
lst=[2, 3, 4, 5, 3]
sum(min(v,oref.get(k,0)) for (k,v) in occurrences(lst).iteritems()) 2

Click to expand...

Click to expand...

Or in other words, define a function to return a dictionary containing
a count of the number of occurrences of each element in the list (this
assumes that the list elements are hashable). Then you just add up the
values in the test list making sure each count is limited to no higher than
the reference count.

Bryan Olson · Jan 20, 2006

Duncan said:
Here's the way I would do it:

res = {}
for item in it:
if item in res:
res[item] += 1
else:
res[item] = 1
return res

I slightly prefer:

def occurrences(it):
res = {}
res[item] = res.get(item, 0) + 1
return res

[...]

Or in other words, define a function to return a dictionary containing
a count of the number of occurrences of each element in the list (this
assumes that the list elements are hashable). Then you just add up the
values in the test list making sure each count is limited to no higher than
the reference count.

Resulting in a linear-time average case, where the posted
list-comprehension-based solutions are quadratic. The title
of the thread is unfortunate.

The generalized problem is multiset (AKA "bag") intersection:

http://en.wikipedia.org/wiki/Bag_(mathematics)

Mathijs · Jan 23, 2006

Op 19 jan 2006 vond "(e-mail address removed)" :

another approach:

ref = [2,2,4,1,1]
lis = [2,2,5,2,4]

len([ref.pop(ref.index(x)) for x in lis if x in ref])

This is the type of solution I was hoping to see: one-liners, with no use
of local variables. As Tim Chase already wrote, it has only one less
elegant side: it alters the original ref list.

Thanks for your suggestion.

Mathijs · Jan 23, 2006

Op 19 jan 2006 vond Peter Otten said:
sum(min(list.count(n), ref.count(n)) for n in set(ref))

Is that it?

Seems like this is it! Thanks.

Mathijs · Jan 23, 2006

Op 19 jan 2006 vond "Paddy said:
answer = [ val for val in set(ref) for x in
range(min(lst.count(val), ref.count(val)))] answer

Click to expand...

Click to expand...

[2, 2, 4]

I don't think it's correct. Your algoritm with the ref and lst below gives
3 as answer. The answer should have been 2 (1,3).

ref=[3, 3, 1, 1, 3]
lst=[5, 1, 4, 5, 3]

Mathijs · Jan 23, 2006

Op 20 jan 2006 vond Duncan Booth said:
Or in other words, define a function to return a dictionary containing
a count of the number of occurrences of each element in the list (this
assumes that the list elements are hashable). Then you just add up the
values in the test list making sure each count is limited to no higher
than the reference count.

Thanks. Though I don't know much about python (yet), this is more or less
the way I'de do it the language I'm more comfortable with (object pascal),
and I wouldn't label this as a pythonic solution. I could be wrong,
though

Duncan Booth · Jan 23, 2006

Mathijs said:
Thanks. Though I don't know much about python (yet), this is more or
less the way I'de do it the language I'm more comfortable with (object
pascal), and I wouldn't label this as a pythonic solution. I could be
wrong, though

I'm curious what you think isn't pythonic about this solution?

Peter Otten · Jan 23, 2006

Mathijs said:
Thanks. Though I don't know much about python (yet), this is more or less
the way I'de do it the language I'm more comfortable with (object pascal),
and I wouldn't label this as a pythonic solution. I could be wrong,
though

You *are* wrong. Also, as the lists grow, Duncan's approach scales *much*
better than, e. g., mine.

If picking a better algorithm were unpythonic there would not be much value
in striving for pythonic solutions.

Peter

Steven D'Aprano · Jan 23, 2006

Op 19 jan 2006 vond "(e-mail address removed)" :

another approach:

ref = [2,2,4,1,1]
lis = [2,2,5,2,4]

len([ref.pop(ref.index(x)) for x in lis if x in ref])

Click to expand...

This is the type of solution I was hoping to see: one-liners, with no use
of local variables.

Because you like unreadable, incomprehensible, unmaintainable code?

*wink*

Range / empty list issues??	1	Dec 11, 2023
Require code that starts within the middle of the for loop ranges.	3	Oct 31, 2022
Code help please	4	May 19, 2023
Why is Python telling me variable is local not global?	3	Sep 2, 2023
C program: memory leak/ segmentation fault/ memory limit exceeded	0	Nov 12, 2022
python adds an extra half space when reading froma string or list --back to the question	1	Jul 1, 2013
Translater + module + tkinter	1	Feb 16, 2023
I Need Fix In Code	1	Apr 12, 2023

list comprehention

Mathijs

Tim Chase

markscala

markscala

markscala

bonono

Peter Otten

Peter Otten

Tim Chase

Matt

Paddy

Duncan Booth

Bryan Olson

Mathijs

Mathijs

Mathijs

Mathijs

Duncan Booth

Peter Otten

Steven D'Aprano

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads