Pythonic way to count sequences

CM

I have to count the number of various two-digit sequences in a list
such as this:

mylist = [(2,4), (2,4), (3,4), (4,5), (2,1)]  # (Here the (2,4) sequence appears 2 times.)

and tally up the results, assigning each to a variable. The inelegant
first pass at this was something like...

# Create names and set them all to 0
alpha = 0
beta = 0
delta = 0
gamma = 0
# etc...

# loop over all the tuple sequences and increment appropriately
for sequence_tuple in list_of_tuples:
    if sequence_tuple == (1,2):
        alpha += 1
    if sequence_tuple == (2,4):
        beta += 1
    if sequence_tuple == (2,5):
        delta += 1
    # etc... But I actually have more than 10 sequence types.

# Finally, I need a list created like this:
result_list = [alpha, beta, delta, gamma] #etc...in that order

I can sense there is very likely an elegant/Pythonic way to do this,
and probably with a dict, or possibly with some Python structure I
don't typically use. Suggestions sought. Thanks.
 
Chris Angelico

> I have to count the number of various two-digit sequences in a list
> such as this:
>
> mylist = [(2,4), (2,4), (3,4), (4,5), (2,1)]  # (Here the (2,4) sequence appears 2 times.)
>
> and tally up the results, assigning each to a variable.

You can use a tuple as a dictionary key, just like you would a string.
So you can count them up directly with a dictionary:

count = {}
for sequence_tuple in list_of_tuples:
    count[sequence_tuple] = count.get(sequence_tuple, 0) + 1

Also, since this is such a common thing to do, there's a standard
library way of doing it:

import collections
count = collections.Counter(list_of_tuples)

This doesn't depend on knowing ahead of time what your elements will
be. At the end of it, you can simply iterate over 'count' and get all
your counts:

for sequence, number in count.items():
    print("%d of %r" % (number, sequence))
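One detail that may help when tallying: a Counter answers 0 for a key it has never seen, where a plain dict would raise KeyError. A minimal sketch, reusing the sample list from the question; the (9, 9) key is made up purely for illustration:

```python
from collections import Counter

list_of_tuples = [(2, 4), (2, 4), (3, 4), (4, 5), (2, 1)]
count = Counter(list_of_tuples)

print(count[(2, 4)])  # 2
# (9, 9) is a made-up sequence that never occurs in the list.
print(count[(9, 9)])  # 0, where a plain dict would raise KeyError
print((9, 9) in count)  # False: the lookup above did not add the key
```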

ChrisA
 
Steven D'Aprano

> I have to count the number of various two-digit sequences in a list such
> as this:
>
> mylist = [(2,4), (2,4), (3,4), (4,5), (2,1)]  # (Here the (2,4) sequence appears 2 times.)
>
> and tally up the results, assigning each to a variable. The inelegant
> first pass at this was something like...
>
> # Create names and set them all to 0
> alpha = 0
> beta = 0
> delta = 0
> gamma = 0
> # etc...

Do they absolutely have to be global variables like that? Seems like a
bad design, especially if you don't know in advance exactly how many
there are.

> # loop over all the tuple sequences and increment appropriately
> for sequence_tuple in list_of_tuples:
>     if sequence_tuple == (1,2):
>         alpha += 1
>     if sequence_tuple == (2,4):
>         beta += 1
>     if sequence_tuple == (2,5):
>         delta += 1
>     # etc... But I actually have more than 10 sequence types.

counts = {}
for t in list_of_tuples:
    counts[t] = counts.get(t, 0) + 1


Or, use collections.Counter:

from collections import Counter
counts = Counter(list_of_tuples)

> # Finally, I need a list created like this:
> result_list = [alpha, beta, delta, gamma] #etc...in that order

Dicts are unordered, so getting the results in a specific order will be a
bit tricky. You could do this:

results = sorted(counts.items(), key=lambda t: t[0])
results = [t[1] for t in results]

if you are lucky enough to have the desired order match the natural order
of the tuples. Otherwise:

desired_order = [(2, 3), (3, 1), (1, 2), ...]
results = [counts.get(t, 0) for t in desired_order]
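Putting the pieces together, here is a rough end-to-end sketch of the "desired order" approach. The pairing of names to sequences is taken from the question (alpha, beta, delta); the `named_sequences` and `by_name` names are only illustrative, and gamma is left out because the question never says which sequence it stands for.

```python
from collections import Counter

list_of_tuples = [(2, 4), (2, 4), (3, 4), (4, 5), (2, 1)]

# Name/sequence pairs, listed in the order the results are wanted.
# The pairings come from the question; the variable name is made up.
named_sequences = [("alpha", (1, 2)), ("beta", (2, 4)), ("delta", (2, 5))]

counts = Counter(list_of_tuples)

# Counts in the requested order; sequences that never occur count as 0.
result_list = [counts.get(seq, 0) for _, seq in named_sequences]

# The same counts keyed by name, instead of one variable per sequence.
by_name = {name: counts.get(seq, 0) for name, seq in named_sequences}

print(result_list)  # [0, 2, 0]
print(by_name)      # {'alpha': 0, 'beta': 2, 'delta': 0}
```

Keeping the names in a data structure means a new sequence type is one more entry in the list rather than another variable and another if statement.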
 
Serhiy Storchaka

25.04.13 08:26, Chris Angelico wrote:
> So you can count them up directly with a dictionary:
>
> count = {}
> for sequence_tuple in list_of_tuples:
>     count[sequence_tuple] = count.get(sequence_tuple, 0) + 1

Or alternatives:

count = {}
for sequence_tuple in list_of_tuples:
    if sequence_tuple in count:
        count[sequence_tuple] += 1
    else:
        count[sequence_tuple] = 1

count = {}
for sequence_tuple in list_of_tuples:
    try:
        count[sequence_tuple] += 1
    except KeyError:
        count[sequence_tuple] = 1

import collections
count = collections.defaultdict(int)
for sequence_tuple in list_of_tuples:
    count[sequence_tuple] += 1

But of course collections.Counter is a preferable way now.
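As a sanity check, the variants above can be wrapped in small functions and compared against collections.Counter. This is only a sketch; the function names are invented for the comparison and the sample list is the one from the question.

```python
from collections import Counter, defaultdict

list_of_tuples = [(2, 4), (2, 4), (3, 4), (4, 5), (2, 1)]


def count_with_get(items):
    """dict.get() with a default of 0."""
    count = {}
    for item in items:
        count[item] = count.get(item, 0) + 1
    return count


def count_with_membership(items):
    """Explicit membership test before incrementing."""
    count = {}
    for item in items:
        if item in count:
            count[item] += 1
        else:
            count[item] = 1
    return count


def count_with_try(items):
    """Increment first, fall back to 1 on KeyError."""
    count = {}
    for item in items:
        try:
            count[item] += 1
        except KeyError:
            count[item] = 1
    return count


def count_with_defaultdict(items):
    """defaultdict(int) supplies the missing 0 automatically."""
    count = defaultdict(int)
    for item in items:
        count[item] += 1
    return dict(count)


# Convert to a plain dict so every comparison is dict against dict.
expected = dict(Counter(list_of_tuples))
variants = [count_with_get, count_with_membership,
            count_with_try, count_with_defaultdict]
all_agree = all(variant(list_of_tuples) == expected for variant in variants)
print(all_agree)  # True
```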
 
Denis McMahon

> I have to count the number of various two-digit sequences in a list such
> as this:
>
> mylist = [(2,4), (2,4), (3,4), (4,5), (2,1)]  # (Here the (2,4) sequence appears 2 times.)
>
> and tally up the results, assigning each to a variable. The inelegant
> first pass at this was something like...
>
> # Create names and set them all to 0
> alpha = 0
> beta = 0
> delta = 0
> gamma = 0
> # etc...
>
> # loop over all the tuple sequences and increment appropriately
> for sequence_tuple in list_of_tuples:
>     if sequence_tuple == (1,2):
>         alpha += 1
>     if sequence_tuple == (2,4):
>         beta += 1
>     if sequence_tuple == (2,5):
>         delta += 1
>     # etc... But I actually have more than 10 sequence types.
>
> # Finally, I need a list created like this:
> result_list = [alpha, beta, delta, gamma] #etc...in that order
>
> I can sense there is very likely an elegant/Pythonic way to do this, and
> probably with a dict, or possibly with some Python structure I don't
> typically use. Suggestions sought. Thanks.

mylist = [ (3,3), (1,2), "fred", ("peter",1,7), 1, 19, 37, 28.312,
           ("monkey"), "fred", "fred", (1,2) ]
# note: ("monkey") is just the string "monkey", not a one-element tuple

bits = {}

for thing in mylist:
    if thing in bits:
        bits[thing] += 1
    else:
        bits[thing] = 1

for thing in bits:
    print(thing, "occurs", bits[thing], "times")

outputs (the order of the lines may differ):

(1, 2) occurs 2 times
1 occurs 1 times
('peter', 1, 7) occurs 1 times
(3, 3) occurs 1 times
28.312 occurs 1 times
fred occurs 3 times
19 occurs 1 times
monkey occurs 1 times
37 occurs 1 times

If you want to check that thing is a tuple of two ints, use something like:

for thing in mylist:
    # On Python 2, test against (int, long) instead of int.
    if (isinstance(thing, tuple) and len(thing) == 2
            and isinstance(thing[0], int) and isinstance(thing[1], int)):
        if thing in bits:
            bits[thing] += 1
        else:
            bits[thing] = 1
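The same filter can also be combined with collections.Counter by feeding it a generator expression. A minimal sketch, assuming Python 3; the helper name `is_int_pair` is made up for this example:

```python
from collections import Counter

mylist = [(3, 3), (1, 2), "fred", ("peter", 1, 7), 1, 19, 37, 28.312,
          "fred", (1, 2)]


def is_int_pair(thing):
    """Return True for a tuple of exactly two ints.

    Note that bool is a subclass of int, so (True, False) would pass too.
    """
    return (isinstance(thing, tuple) and len(thing) == 2
            and all(isinstance(x, int) for x in thing))


# Only the items that pass the check are counted.
bits = Counter(thing for thing in mylist if is_int_pair(thing))
print(bits)  # Counter({(1, 2): 2, (3, 3): 1})
```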
 
Modulok

> I have to count the number of various two-digit sequences in a list such
> as this:
>
> mylist = [(2,4), (2,4), (3,4), (4,5), (2,1)]  # (Here the (2,4) sequence appears 2 times.)
>
> and tally up the results, assigning each to a variable.
....

Consider using the ``collections`` module::


from collections import Counter

mylist = [(2,4), (2,4), (3,4), (4,5), (2,1)]
count = Counter()
for k in mylist:
    count[k] += 1

print(count)

# Output looks like this:
# Counter({(2, 4): 2, (4, 5): 1, (3, 4): 1, (2, 1): 1})


You then have access to methods to return the most common items, etc. See more
examples here:

http://docs.python.org/3.3/library/collections.html#collections.Counter
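As a quick illustration of those methods, here is a small sketch using most_common() on the sample data. The variable names `top` and `total` are arbitrary.

```python
from collections import Counter

mylist = [(2, 4), (2, 4), (3, 4), (4, 5), (2, 1)]
count = Counter(mylist)

# most_common(n) returns the n most frequent items as (item, count) pairs.
top = count.most_common(1)
print(top)  # [((2, 4), 2)]

# The total number of items tallied is the sum of the individual counts.
total = sum(count.values())
print(total)  # 5
```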


Good luck!
-Modulok-
 
Matthew Gilson

A Counter is definitely the way to go about this. As a little more
information, the example below can be simplified:

from collections import Counter
count = Counter(mylist)

With the loop from that example, you could have achieved the same thing (and
stayed backward compatible to Python 2.5) with

from collections import defaultdict
count = defaultdict(int)
for k in mylist:
    count[k] += 1



> > I have to count the number of various two-digit sequences in a list such
> > as this:
> >
> > mylist = [(2,4), (2,4), (3,4), (4,5), (2,1)]  # (Here the (2,4) sequence appears 2 times.)
> >
> > and tally up the results, assigning each to a variable.
> > ...
>
> Consider using the ``collections`` module::
>
> from collections import Counter
>
> mylist = [(2,4), (2,4), (3,4), (4,5), (2,1)]
> count = Counter()
> for k in mylist:
>     count[k] += 1
>
> print(count)
>
> # Output looks like this:
> # Counter({(2, 4): 2, (4, 5): 1, (3, 4): 1, (2, 1): 1})
>
> You then have access to methods to return the most common items, etc. See more
> examples here:
>
> http://docs.python.org/3.3/library/collections.html#collections.Counter
>
> Good luck!
> -Modulok-
 
