Grouping items by a key?

Michael Fogleman · Mar 22, 2013

I feel like Python ought to have a built-in to do this. Take a list of items and turn them into a dictionary mapping keys to a list of items with that key in common.

It's easy enough to do:

# using defaultdict
lookup = collections.defaultdict(list)
for item in items:
    lookup[key(item)].append(item)

# or, using plain dict
lookup = {}
for item in items:
    lookup.setdefault(key(item), []).append(item)

But this is frequent enough of a use case that a built-in function would be nice. I could implement it myself, as such:

def grouped(iterable, key):
    result = {}
    for item in iterable:
        result.setdefault(key(item), []).append(item)
    return result

lookup = grouped(items, key)

This is different than `itertools.groupby` in a few important ways. To get the same result from `groupby`, you'd have to do this, which is a little ugly:

lookup = dict((k, list(v)) for k, v in groupby(sorted(items, key=key), key))
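
As a quick sanity check, here is a small sketch comparing the two approaches on one input. The sample `items` and the `first_letter` key are made up for illustration. The `groupby` route also needs the keys to be sortable, and it only keeps the original order within each group because `sorted` is stable.

```python
from itertools import groupby


def grouped(iterable, key):
    result = {}
    for item in iterable:
        result.setdefault(key(item), []).append(item)
    return result


def first_letter(word):
    return word[0]


# Hypothetical sample data, not from the original post.
items = ["apple", "avocado", "banana", "blueberry", "cherry", "apricot"]

via_dict = grouped(items, first_letter)

# groupby only merges adjacent items, so the input has to be sorted by key first.
via_groupby = {
    k: list(v)
    for k, v in groupby(sorted(items, key=first_letter), first_letter)
}

print(via_dict == via_groupby)
```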

Some examples:

{0: [0, 2, 4, 6, 8], 1: [1, 3, 5, 7, 9]}
{8: ['overflow'], 3: ['how', 'are', 'you'], 5: ['hello', 'stack']}
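
The calls that produced these two results are not shown above. A plausible reconstruction, assuming the key functions were `x % 2` and `len` and that the second input was the word list below (all of that is a guess from the output):

```python
def grouped(iterable, key):
    result = {}
    for item in iterable:
        result.setdefault(key(item), []).append(item)
    return result


# Assumed inputs and key functions, inferred from the outputs above.
evens_odds = grouped(range(10), lambda x: x % 2)

words = "hello stack overflow how are you".split()
by_length = grouped(words, len)

print(evens_odds)
print(by_length)
```

On a Python version where dicts keep insertion order, the keys of the second result may print in a different order than shown above; the contents are the same.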

Is there a better way?

Steven D'Aprano · Mar 23, 2013

> I feel like Python ought to have a built-in to do this. Take a list of
> items and turn them into a dictionary mapping keys to a list of items
> with that key in common.
>
> It's easy enough to do:
>
> # using defaultdict
> lookup = collections.defaultdict(list)
> for item in items:
>     lookup[key(item)].append(item)
>
> # or, using plain dict
> lookup = {}
> for item in items:
>     lookup.setdefault(key(item), []).append(item)

That's pretty much the reason setdefault was invented. So, in a sense,
there is a built-in for this.
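
For anyone who hasn't used it: `dict.setdefault(k, default)` returns the value stored under `k` if there is one, and otherwise stores `default` under `k` and returns that. A tiny sketch, with made-up keys and values:

```python
lookup = {}

# First call: "a" is missing, so the new empty list is stored and returned.
first = lookup.setdefault("a", [])
first.append(1)

# Second call: "a" exists, so the stored list comes back and the new [] is discarded.
second = lookup.setdefault("a", [])
second.append(2)

print(lookup)
print(first is second)
```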

> But this is frequent enough of a use case that a built-in function would
> be nice.

I'm not so sure I agree it's a frequent use-case. I don't think I've ever
needed to do it, or if I did, it was so rare and so long ago that I've
forgotten it.

> I could implement it myself, as such:
>
> def grouped(iterable, key):
>     result = {}
>     for item in iterable:
>         result.setdefault(key(item), []).append(item)
>     return result
>
> lookup = grouped(items, key)
>
> This is different than `itertools.groupby` in a few important ways.

Why do you care about itertools.groupby? That does something completely
different. It groups items that occur in *contiguous* groups, e.g.

[1, 2, 3, 2, 2, 2, 3, 3, 4, 5, 5, 2, 2, 5]

will be split into nine runs, with the 2s ending up in three separate groups:

[1], [2], [3], [2, 2, 2], [3, 3], [4], [5, 5], [2, 2], [5]

This is a feature of groupby. If you want to accumulate items regardless
of where they occur, e.g. for the above:

[1], [2, 2, 2, 2, 2, 2], [3, 3, 3], [4], [5, 5, 5]

then there's no need to use groupby.
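
To make the difference concrete, here is a short sketch using the list above. It runs `itertools.groupby` on the unsorted data, then accumulates the same data with a plain dict. The variable names are only for illustration.

```python
from itertools import groupby

data = [1, 2, 3, 2, 2, 2, 3, 3, 4, 5, 5, 2, 2, 5]

# groupby starts a new group every time the value changes,
# so equal items that are not adjacent land in separate groups.
runs = [list(group) for _, group in groupby(data)]

# Accumulating into a dict ignores position entirely.
accumulated = {}
for item in data:
    accumulated.setdefault(item, []).append(item)

print(runs)
print(accumulated)
```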

> Some examples:
>
> {0: [0, 2, 4, 6, 8], 1: [1, 3, 5, 7, 9]}
> {8: ['overflow'], 3: ['how', 'are', 'you'], 5: ['hello', 'stack']}
>
> Is there a better way?

Looks perfectly fine to me. It's a five line helper function, it's
readable and simple and clear. The only improvements I would make would
be to give it a doc string describing what it does and showing some
examples:

def grouped(items, key):
    """Return a dict with items accumulated by key.

    {0: [0, 2, 4, 6, 8], 1: [1, 3, 5, 7, 9]}
    {8: ['overflow'], 3: ['how', 'are', 'you'], 5: ['hello', 'stack']}

    """
    result = {}
    for item in items:
        result.setdefault(key(item), []).append(item)
    return result

Now you have a nice, descriptive help string for when you call
help(grouped).
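
If the docstring examples are written as interactive sessions, `doctest` can check them as well. A sketch under some assumptions: the calls below are a guess at what produced those outputs, and the results are written as `==` comparisons so they don't depend on the order in which a dict prints its keys.

```python
import doctest


def grouped(items, key):
    """Return a dict with items accumulated by key.

    >>> grouped(range(10), lambda x: x % 2) == {0: [0, 2, 4, 6, 8], 1: [1, 3, 5, 7, 9]}
    True
    >>> words = 'hello stack overflow how are you'.split()
    >>> grouped(words, len) == {8: ['overflow'], 3: ['how', 'are', 'you'], 5: ['hello', 'stack']}
    True

    """
    result = {}
    for item in items:
        result.setdefault(key(item), []).append(item)
    return result


if __name__ == "__main__":
    # Runs every example found in this module's docstrings.
    doctest.testmod()
```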
