list comprehention

M

Mathijs

Hi,

Python beginner here and very much enjoying it. I'm looking for a
pythonic way to find how many listmembers are also present in a reference
list. Don't count duplicates (eg. if you already found a matching member
in the ref list, you can't use the ref member anymore).

Example1:
ref=[2, 2, 4, 1, 1]
list=[2, 3, 4, 5, 3]
solution: 2

Example2:
ref=[2, 2, 4, 1, 1]
list=[2, 2, 5, 2, 4]
solution: 3 (note that only the first two 2's count, the third 2 in the
list should not be counted)

Any suggestions or comments?

Thanks.
M.


#my failing effort:
sum([min(r.count(n)-l[:i].count(n),l.count(n)) for i,n in enumerate(l)])

#test lists
import random
#reference list
r=[random.randint(1,5) for n in range(5)]
#list
l=[random.randint(1,5) for n in range(5)]
 
T

Tim Chase

Python beginner here and very much enjoying it. I'm looking
> for a pythonic way to find how many listmembers are also
> present in a reference list. Don't count duplicates (eg. if
> you already found a matching member in the ref list, you can't
> use the ref member anymore).
>
> Example1:
> ref=[2, 2, 4, 1, 1]
> list=[2, 3, 4, 5, 3]
> solution: 2
>
> Example2:
> ref=[2, 2, 4, 1, 1]
> list=[2, 2, 5, 2, 4]
> solution: 3 (note that only the first two 2's count, the third
> 2 in the list should not be counted)

It sounds like you're looking for "set" operations: (using "ell"
for clarity)
>>> from sets import Set
>>> a = [2,2,4,1,1]
>>> b = [2,3,4,5,3]
>>> setA = Set(a)
>>> setB = Set(b)
>>> results = setA.intersection(setB)
>>> results Set([2,4])
>>> intersection = [x for x in results]
>>> intersection
[2,4]


I'm a tad confused by the help, as it sounds like sets are
supposed to be first-class citizens, but in ver2.3.5 that I'm
running here (or rather "there", on a friend's box), I have to
"import sets" which I didn't see mentioned in the reference manual.

-one of many tims on the list
tim = Set(["bald", "vegetarian", "loving husband"])

:)
 
M

markscala

def reference(alist1,alist2):
counter = 0
for x in lis:
if x in ref:
ref.pop(ref.index(x))
counter += 1
return counter

this works I think for your examples, but you should check it against
them and other cases.
good luck
 
M

markscala

revision of previous:

def reference(reflist,alist2):
counter = 0
for x in alist2:
if x in reflist:
reflist.pop(reflist.index(x))
counter += 1
return counter
 
M

markscala

another approach:

ref = [2,2,4,1,1]
lis = [2,2,5,2,4]

len([ref.pop(ref.index(x)) for x in lis if x in ref])
 
B

bonono

Tim said:
Python beginner here and very much enjoying it. I'm looking
for a pythonic way to find how many listmembers are also
present in a reference list. Don't count duplicates (eg. if
you already found a matching member in the ref list, you can't
use the ref member anymore).

Example1:
ref=[2, 2, 4, 1, 1]
list=[2, 3, 4, 5, 3]
solution: 2

Example2:
ref=[2, 2, 4, 1, 1]
list=[2, 2, 5, 2, 4]
solution: 3 (note that only the first two 2's count, the third
2 in the list should not be counted)

It sounds like you're looking for "set" operations: (using "ell"
for clarity)
from sets import Set
a = [2,2,4,1,1]
b = [2,3,4,5,3]
setA = Set(a)
setB = Set(b)
results = setA.intersection(setB)
results Set([2,4])
intersection = [x for x in results]
intersection
[2,4]
won't set remove duplicates which he wants to preserve ? He is not just
looking for the 'values' that is in common, but the occurence as well,
if I understand the requirement correctly.

I would just build a dict with the value as the key and occurence as
the value then loop the list and lookup.
 
P

Peter Otten

Mathijs said:
Python beginner here and very much enjoying it. I'm looking for a
pythonic way to find how many listmembers are also present in a reference
list. Don't count duplicates (eg. if you already found a matching member
in the ref list, you can't use the ref member anymore).

Example1:
ref=[2, 2, 4, 1, 1]
list=[2, 3, 4, 5, 3]
solution: 2

Example2:
ref=[2, 2, 4, 1, 1]
list=[2, 2, 5, 2, 4]
solution: 3 (note that only the first two 2's count, the third 2 in the
list should not be counted)

Any suggestions or comments?

sum(min(list.count(n), ref.count(n)) for n in set(ref))

Is that it?

Peter
 
P

Peter Otten

Tim said:
I'm a tad confused by the help, as it sounds like sets are
supposed to be first-class citizens, but in ver2.3.5 that I'm
running here (or rather "there", on a friend's box), I have to
"import sets" which I didn't see mentioned in the reference manual.

set and frozenset are a builtins starting with Python 2.4.

http://www.python.org/2.4/highlights.html

"""
New or upgraded built-ins

built-in sets - the sets module, introduced in 2.3, has now been implemented
in C, and the set and frozenset types are available as built-in types (PEP
218)
"""

Peter
 
T

Tim Chase

Python beginner here and very much enjoying it. I'm looking
[snipped]

won't set remove duplicates which he wants to preserve ?

My reading was that the solution shouldn't count duplicates
("Don't count duplicates"). However, once you mentioned it, and
I saw other folks' responses that looked very diff. from my own,
I re-read the OP's comments and found that I missed this key bit:

"note that only the first *two* 2's count, the third 2
in the list should not be counted"
(*emphasis* mine)

and that the desired result was a count (though a len(Set())
would return the right count if the OP wanted the true
intersection, but that's beside the point).

My error. Sorry, ladies and gentlemen :)

I'm partial to the elegance of markscala's suggestion of:

len([ref.pop(ref.index(x)) for x in lis if x in ref])

though it might need to come with a caveat that it doesn't leave
"ref" in the same state is it was originally, so it should be
copied and then manipulated thusly:

r = ref[:]
len([r.pop(r.index(x)) for x in (lis if x in r])

which would then leave ref undisturbed. As tested:

import random
ref = [random.randint(1,5) for n in range(5)]
for x in range(1,10):
lis = [random.randint(1,5) for n in range(5)]
r = ref[:]
print repr((r,lis))
print len([r.pop(r.index(x)) for x in lis if x in r])

seems to give the results the OP describes.

-tim
 
M

Matt

Tim Chase wrote:
I'm a tad confused by the help, as it sounds like sets are
supposed to be first-class citizens, but in ver2.3.5 that I'm
running here (or rather "there", on a friend's box), I have to
"import sets" which I didn't see mentioned in the reference manual.

-one of many tims on the list
tim = Set(["bald", "vegetarian", "loving husband"])

:)

The builtin "set" was added in 2.4, I believe (note lowercase "s"):

http://docs.python.org/whatsnew/node2.html

print "vegetarian".replace("etari","")
:)
 
P

Paddy

Hi,
I liked the twist at the end when you state that only the first two 2's
count. It reminded me
of my maths O'level revision where you always had to read the question
thoroughly.

Here is what I came up with:
ref [2, 2, 4, 1, 1]
lst [2, 2, 5, 2, 4]
tmp = [ [val]*min(lst.count(val), ref.count(val)) for val in set(ref)]
tmp [[], [2, 2], [4]]
answer = [x for y in tmp for x in y]
answer [2, 2, 4]

I took a lot from Peter Ottens reply to generate tmp then flattened the
inner lists.

After a bit more thought, the intermediate calculation of tmp can be
removed with a
little loss in clarity though, to give:
answer = [ val for val in set(ref) for x in range(min(lst.count(val), ref.count(val)))]
answer
[2, 2, 4]



- Cheers, Paddy.
 
D

Duncan Booth

Mathijs said:
Python beginner here and very much enjoying it. I'm looking for a
pythonic way to find how many listmembers are also present in a
reference list. Don't count duplicates (eg. if you already found a
matching member in the ref list, you can't use the ref member
anymore).

Example1:
ref=[2, 2, 4, 1, 1]
list=[2, 3, 4, 5, 3]
solution: 2

Example2:
ref=[2, 2, 4, 1, 1]
list=[2, 2, 5, 2, 4]
solution: 3 (note that only the first two 2's count, the third 2 in
the list should not be counted)

Here's the way I would do it:
res = {}
for item in it:
if item in res:
res[item] += 1
else:
res[item] = 1
return res
ref=[2, 2, 4, 1, 1]
lst=[2, 2, 5, 2, 4]
oref = occurrences(ref)
sum(min(v,oref.get(k,0)) for (k,v) in occurrences(lst).iteritems()) 3
lst=[2, 3, 4, 5, 3]
sum(min(v,oref.get(k,0)) for (k,v) in occurrences(lst).iteritems()) 2

Or in other words, define a function to return a dictionary containing
a count of the number of occurrences of each element in the list (this
assumes that the list elements are hashable). Then you just add up the
values in the test list making sure each count is limited to no higher than
the reference count.
 
B

Bryan Olson

Duncan said:
Here's the way I would do it:

res = {}
for item in it:
if item in res:
res[item] += 1
else:
res[item] = 1
return res

I slightly prefer:

def occurrences(it):
res = {}
res[item] = res.get(item, 0) + 1
return res


[...]
Or in other words, define a function to return a dictionary containing
a count of the number of occurrences of each element in the list (this
assumes that the list elements are hashable). Then you just add up the
values in the test list making sure each count is limited to no higher than
the reference count.

Resulting in a linear-time average case, where the posted
list-comprehension-based solutions are quadratic. The title
of the thread is unfortunate.

The generalized problem is multiset (AKA "bag") intersection:

http://en.wikipedia.org/wiki/Bag_(mathematics)
 
M

Mathijs

Op 19 jan 2006 vond "(e-mail address removed)" :
another approach:

ref = [2,2,4,1,1]
lis = [2,2,5,2,4]

len([ref.pop(ref.index(x)) for x in lis if x in ref])

This is the type of solution I was hoping to see: one-liners, with no use
of local variables. As Tim Chase already wrote, it has only one less
elegant side: it alters the original ref list.

Thanks for your suggestion.
 
M

Mathijs

Op 19 jan 2006 vond "Paddy said:
answer = [ val for val in set(ref) for x in
range(min(lst.count(val), ref.count(val)))] answer
[2, 2, 4]

I don't think it's correct. Your algoritm with the ref and lst below gives
3 as answer. The answer should have been 2 (1,3).

ref=[3, 3, 1, 1, 3]
lst=[5, 1, 4, 5, 3]
 
M

Mathijs

Op 20 jan 2006 vond Duncan Booth said:
Or in other words, define a function to return a dictionary containing
a count of the number of occurrences of each element in the list (this
assumes that the list elements are hashable). Then you just add up the
values in the test list making sure each count is limited to no higher
than the reference count.

Thanks. Though I don't know much about python (yet), this is more or less
the way I'de do it the language I'm more comfortable with (object pascal),
and I wouldn't label this as a pythonic solution. I could be wrong,
though:)
 
D

Duncan Booth

Mathijs said:
Thanks. Though I don't know much about python (yet), this is more or
less the way I'de do it the language I'm more comfortable with (object
pascal), and I wouldn't label this as a pythonic solution. I could be
wrong, though:)
I'm curious what you think isn't pythonic about this solution?
 
P

Peter Otten

Mathijs said:
Thanks. Though I don't know much about python (yet), this is more or less
the way I'de do it the language I'm more comfortable with (object pascal),
and I wouldn't label this as a pythonic solution. I could be wrong,
though:)

You *are* wrong. Also, as the lists grow, Duncan's approach scales *much*
better than, e. g., mine.

If picking a better algorithm were unpythonic there would not be much value
in striving for pythonic solutions.

Peter
 
S

Steven D'Aprano

Op 19 jan 2006 vond "(e-mail address removed)" :
another approach:

ref = [2,2,4,1,1]
lis = [2,2,5,2,4]

len([ref.pop(ref.index(x)) for x in lis if x in ref])

This is the type of solution I was hoping to see: one-liners, with no use
of local variables.

Because you like unreadable, incomprehensible, unmaintainable code?

*wink*
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,282
Messages
2,571,404
Members
48,095
Latest member
WalkerBore

Latest Threads

Top