Pythonic way to count sequences

CM · Apr 25, 2013

I have to count the number of various two-digit sequences in a list
such as this:

mylist = [(2,4), (2,4), (3,4), (4,5), (2,1)]  # Here the (2,4) sequence appears 2 times.

and tally up the results, assigning each to a variable. The inelegant
first pass at this was something like...

# Create names and set them all to 0
alpha = 0
beta = 0
delta = 0
gamma = 0
# etc...

# loop over all the tuple sequences and increment appropriately
for sequence_tuple in list_of_tuples:
    if sequence_tuple == (1,2):
        alpha += 1
    if sequence_tuple == (2,4):
        beta += 1
    if sequence_tuple == (2,5):
        delta += 1
    # etc... But I actually have more than 10 sequence types.

# Finally, I need a list created like this:
result_list = [alpha, beta, delta, gamma] #etc...in that order

I can sense there is very likely an elegant/Pythonic way to do this,
and probably with a dict, or possibly with some Python structure I
don't typically use. Suggestions sought. Thanks.

Chris Angelico · Apr 25, 2013

I have to count the number of various two-digit sequences in a list
such as this:

mylist = [(2,4), (2,4), (3,4), (4,5), (2,1)]  # Here the (2,4) sequence appears 2 times.

and tally up the results, assigning each to a variable.

You can use a tuple as a dictionary key, just like you would a string.
So you can count them up directly with a dictionary:

count = {}
for sequence_tuple in list_of_tuples:
    count[sequence_tuple] = count.get(sequence_tuple, 0) + 1

Also, since this is such a common thing to do, there's a standard
library way of doing it:

import collections
count = collections.Counter(list_of_tuples)

This doesn't depend on knowing ahead of time what your elements will
be. At the end of it, you can simply iterate over 'count' and get all
your counts:

for sequence, number in count.items():
    print("%d of %r" % (number, sequence))
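
As a minimal sketch of the Counter approach applied to the sample list from the question (the name `mylist` comes from the original post; the two lookups are only illustrative):

```python
from collections import Counter

# Sample data from the original question.
mylist = [(2, 4), (2, 4), (3, 4), (4, 5), (2, 1)]

count = Counter(mylist)

# Tuples are hashable, so they work directly as keys.
print(count[(2, 4)])  # 2

# A Counter reports 0 for a key it has never seen instead of raising KeyError.
print(count[(1, 2)])  # 0
```

Because missing keys read as 0, there is no need to pre-create a zeroed variable for every sequence type.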

ChrisA

Steven D'Aprano · Apr 25, 2013

I have to count the number of various two-digit sequences in a list such
as this:

mylist = [(2,4), (2,4), (3,4), (4,5), (2,1)]  # Here the (2,4) sequence appears 2 times.

and tally up the results, assigning each to a variable. The inelegant
first pass at this was something like...

# Create names and set them all to 0
alpha = 0
beta = 0
delta = 0
gamma = 0
# etc...

Do they absolutely have to be global variables like that? Seems like a
bad design, especially if you don't know in advance exactly how many
there are.

# loop over all the tuple sequences and increment appropriately
for sequence_tuple in list_of_tuples:
    if sequence_tuple == (1,2):
        alpha += 1
    if sequence_tuple == (2,4):
        beta += 1
    if sequence_tuple == (2,5):
        delta += 1
    # etc... But I actually have more than 10 sequence types.

counts = {}
for t in list_of_tuples:
    counts[t] = counts.get(t, 0) + 1

Or, use collections.Counter:

from collections import Counter
counts = Counter(list_of_tuples)

# Finally, I need a list created like this:
result_list = [alpha, beta, delta, gamma] #etc...in that order

Dicts are unordered, so getting the results in a specific order will be a
bit tricky. You could do this:

results = sorted(counts.items(), key=lambda t: t[0])
results = [t[1] for t in results]

if you are lucky enough to have the desired order match the natural order
of the tuples. Otherwise:

desired_order = [(2, 3), (3, 1), (1, 2), ...]
results = [counts.get(t, 0) for t in desired_order]
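
A small worked sketch of the explicit-order approach, using the sample data from the question. The tuples for alpha, beta and delta are taken from the original post; the tuple chosen for gamma is an assumption made purely for illustration. Note that since Python 3.7 a plain dict preserves insertion order, but that is the order in which keys were first seen, which is not necessarily the order you want, so an explicit list of keys remains the dependable way to fix the output order.

```python
from collections import Counter

list_of_tuples = [(2, 4), (2, 4), (3, 4), (4, 5), (2, 1)]
counts = Counter(list_of_tuples)

# alpha=(1, 2), beta=(2, 4), delta=(2, 5) as in the question;
# gamma=(3, 4) is a hypothetical choice for this example.
desired_order = [(1, 2), (2, 4), (2, 5), (3, 4)]

# Missing sequences come out as 0 rather than raising KeyError.
result_list = [counts[t] for t in desired_order]
print(result_list)  # [0, 2, 0, 1]
```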

Serhiy Storchaka · Apr 25, 2013

25.04.13 08:26, Chris Angelico wrote:

So you can count them up directly with a dictionary:

count = {}
for sequence_tuple in list_of_tuples:
    count[sequence_tuple] = count.get(sequence_tuple, 0) + 1

Or alternatives:

count = {}
for sequence_tuple in list_of_tuples:
    if sequence_tuple in count:
        count[sequence_tuple] += 1
    else:
        count[sequence_tuple] = 1

count = {}
for sequence_tuple in list_of_tuples:
    try:
        count[sequence_tuple] += 1
    except KeyError:
        count[sequence_tuple] = 1

import collections
count = collections.defaultdict(int)
for sequence_tuple in list_of_tuples:
    count[sequence_tuple] += 1

But of course collections.Counter is the preferable way now.
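
To check that these variants really are interchangeable, here is a rough sketch that wraps three of them in functions and compares each result with collections.Counter. The function names are made up for this example and are not part of any library.

```python
from collections import Counter, defaultdict

list_of_tuples = [(2, 4), (2, 4), (3, 4), (4, 5), (2, 1)]

def count_with_get(items):
    """Tally using dict.get with a default of 0."""
    count = {}
    for item in items:
        count[item] = count.get(item, 0) + 1
    return count

def count_with_try(items):
    """Tally by catching KeyError on the first occurrence."""
    count = {}
    for item in items:
        try:
            count[item] += 1
        except KeyError:
            count[item] = 1
    return count

def count_with_defaultdict(items):
    """Tally using defaultdict(int), which supplies 0 for new keys."""
    count = defaultdict(int)
    for item in items:
        count[item] += 1
    return count

# Convert everything to plain dicts so only keys and values are compared.
expected = dict(Counter(list_of_tuples))
for func in (count_with_get, count_with_try, count_with_defaultdict):
    print(func.__name__, dict(func(list_of_tuples)) == expected)
```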

Denis McMahon · Apr 26, 2013

I have to count the number of various two-digit sequences in a list such
as this:

mylist = [(2,4), (2,4), (3,4), (4,5), (2,1)]  # Here the (2,4) sequence appears 2 times.

and tally up the results, assigning each to a variable. The inelegant
first pass at this was something like...

# Create names and set them all to 0
alpha = 0
beta = 0
delta = 0
gamma = 0
# etc...

# loop over all the tuple sequences and increment appropriately
for sequence_tuple in list_of_tuples:
    if sequence_tuple == (1,2):
        alpha += 1
    if sequence_tuple == (2,4):
        beta += 1
    if sequence_tuple == (2,5):
        delta += 1
    # etc... But I actually have more than 10 sequence types.

# Finally, I need a list created like this:
result_list = [alpha, beta, delta, gamma] #etc...in that order

I can sense there is very likely an elegant/Pythonic way to do this, and
probably with a dict, or possibly with some Python structure I don't
typically use. Suggestions sought. Thanks.

mylist = [ (3,3), (1,2), "fred", ("peter",1,7), 1, 19, 37, 28.312,
("monkey"), "fred", "fred", (1,2) ]

bits = {}

for thing in mylist:
    if thing in bits:
        bits[thing] += 1
    else:
        bits[thing] = 1

for thing in bits:
    print thing, " occurs ", bits[thing], " times"

outputs:

(1, 2) occurs 2 times
1 occurs 1 times
('peter', 1, 7) occurs 1 times
(3, 3) occurs 1 times
28.312 occurs 1 times
fred occurs 3 times
19 occurs 1 times
monkey occurs 1 times
37 occurs 1 times

If you want to check that thing is a tuple of two ints, then use something like:

for thing in mylist:
    # long exists only in Python 2; on Python 3, test against int alone
    if (isinstance(thing, tuple) and len(thing) == 2
            and isinstance(thing[0], (int, long))
            and isinstance(thing[1], (int, long))):
        if thing in bits:
            bits[thing] += 1
        else:
            bits[thing] = 1
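
For Python 3, where `long` no longer exists, the same filter might look roughly like the sketch below. The helper name `is_int_pair` is invented for this example. Two caveats: `("monkey")` in the list is just the string "monkey", not a one-element tuple, and `bool` is a subclass of `int`, so a pair such as `(True, 0)` would also pass this check.

```python
from collections import Counter

mylist = [(3, 3), (1, 2), "fred", ("peter", 1, 7), 1, 19, 37, 28.312,
          ("monkey"), "fred", "fred", (1, 2)]

def is_int_pair(thing):
    """Return True if thing is a tuple of exactly two ints."""
    return (isinstance(thing, tuple)
            and len(thing) == 2
            and all(isinstance(x, int) for x in thing))

# Count only the items that pass the check.
bits = Counter(thing for thing in mylist if is_int_pair(thing))
print(bits)  # Counter({(1, 2): 2, (3, 3): 1})
```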

Modulok · Apr 26, 2013

I have to count the number of various two-digit sequences in a list such
as this:

mylist = [(2,4), (2,4), (3,4), (4,5), (2,1)]  # Here the (2,4) sequence appears 2 times.

and tally up the results, assigning each to a variable.

Consider using the ``collections`` module::

from collections import Counter

mylist = [(2,4), (2,4), (3,4), (4,5), (2,1)]
count = Counter()
for k in mylist:
    count[k] += 1

print(count)

# Output looks like this:
# Counter({(2, 4): 2, (4, 5): 1, (3, 4): 1, (2, 1): 1})

You then have access to methods to return the most common items, etc. See more
examples here:

http://docs.python.org/3.3/library/collections.html#collections.Counter
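
As a brief illustration of those methods, here is a sketch using `Counter.most_common` on the sample list from the question:

```python
from collections import Counter

mylist = [(2, 4), (2, 4), (3, 4), (4, 5), (2, 1)]
count = Counter(mylist)

# most_common(n) returns the n highest counts as (item, count) pairs.
top = count.most_common(1)
print(top)  # [((2, 4), 2)]

# The values are ordinary ints, so the usual dict tools apply.
total = sum(count.values())
print(total)  # 5
```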

Good luck!
-Modulok-

CM · Apr 26, 2013

Thank you, everyone, for the answers. Very helpful and knowledge-
expanding.

Matthew Gilson · Apr 26, 2013

A Counter is definitely the way to go about this. Just a little more
information: Modulok's example can be simplified to:

from collections import Counter
count = Counter(mylist)

With the other example, you could have achieved the same thing (and been
backward compatible to python2.5) with

from collections import defaultdict
count = defaultdict(int)
for k in mylist:
    count[k] += 1
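
One behavioural difference between the two is worth knowing, sketched here with a made-up missing key `(9, 9)`: both return 0 when a missing key is read, but a defaultdict stores the key as a side effect of the lookup, while a Counter does not.

```python
from collections import Counter, defaultdict

mylist = [(2, 4), (2, 4), (3, 4), (4, 5), (2, 1)]

c = Counter(mylist)
d = defaultdict(int)
for k in mylist:
    d[k] += 1

missing = (9, 9)  # hypothetical key that never occurs in mylist

print(c[missing], d[missing])  # 0 0
print(missing in c)  # False: reading a Counter does not add the key
print(missing in d)  # True: reading a defaultdict inserted it with value 0
```

This matters if you later iterate over the counts, since the defaultdict may then contain zero-count entries created by earlier lookups.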

