groupby


George Sakkis

Bryan said:
can someone explain why, in the 2nd example, m doesn't print the list
[1, 1, 1], which i had expected?

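>>> from itertools import groupby
>>> for k, g in groupby([1, 1, 1, 2, 2, 3]):
...     print(k, list(g))
...
1 [1, 1, 1]
2 [2, 2]
3 [3]

>>> m = list(groupby([1, 1, 1, 2, 2, 3]))
>>> m
[(1, <itertools._grouper object at 0x...>), (2, <itertools._grouper object at 0x...>), (3, <itertools._grouper object at 0x...>)]
>>> list(m[0][1])
[]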


thanks,

bryan

I've tripped on this more than once, but it's in the docs
(http://docs.python.org/lib/itertools-functions.html):

"The returned group is itself an iterator that shares the underlying
iterable with groupby(). Because the source is shared, when the groupby
object is advanced, the previous group is no longer visible. So, if
that data is needed later, it should be stored as a list"

George
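
The behaviour described in that docs passage can be seen directly. What follows is only a minimal sketch, written for Python 3; the names stale and fresh are made up for illustration and are not from the thread:

```python
from itertools import groupby

data = [1, 1, 1, 2, 2, 3]

# list() runs the groupby object to the end before any group is read,
# so the stored group iterators have already been left behind.
stale = list(groupby(data))
stale_keys = [k for k, g in stale]
stale_first = list(stale[0][1])

# Reading each group while it is still the current one keeps its items.
fresh = []
for k, g in groupby(data):
    fresh.append((k, list(g)))

print(stale_keys)   # [1, 2, 3]
print(stale_first)  # []
print(fresh)        # [(1, [1, 1, 1]), (2, [2, 2]), (3, [3])]
```

The keys survive because they are plain values; only the group iterators depend on the shared source.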
 

Bryan


i read that description in the docs so many times before i posted here. now that
i read it about 10 more times, i finally get it. there's just something about
the wording that kept tripping me up, but i can't explain why :)

thanks,

bryan
 

Paul McGuire


So here's how to save the values from the iterators while iterating over the
groupby:
>>> from itertools import groupby
>>> m = [(x, list(y)) for x, y in groupby([1, 1, 1, 2, 2, 3])]
>>> m
[(1, [1, 1, 1]), (2, [2, 2]), (3, [3])]

-- Paul
 

Paul McGuire


Playing some more with groupby. Here's a one-liner to tally a list of
integers into a histogram:

import itertools
import random

# create data set, random selection of numbers from 1-10
dataValueRange = range(1, 11)
data = [random.choice(dataValueRange) for i in range(10)]
print(data)

# tally values into histogram
# (from the inside out):
# - sort data into ascending order, so groupby will see all like values
#   together
# - call groupby, return iterator of (value, valueItemIterator) tuples
# - tally groupby results into a dict of value -> valueFrequency entries
# - expand dict into histogram list, filling in zeroes for any keys that
#   didn't get a value
hist = [(k1, dict((k, len(list(g))) for k, g in
                  itertools.groupby(sorted(data))).get(k1, 0))
        for k1 in dataValueRange]

print(hist)

Gives:
[9, 6, 8, 3, 2, 3, 10, 7, 6, 2]
[(1, 0), (2, 2), (3, 2), (4, 0), (5, 0), (6, 2), (7, 1), (8, 1), (9, 1),
(10, 1)]

Change the generation of the original data list to 10,000 values, and you
get something like:
[(1, 995), (2, 986), (3, 941), (4, 998), (5, 978), (6, 1007), (7, 997), (8,
1033), (9, 1038), (10, 1027)]
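
The one-liner above rebuilds the tally dict once for every key in dataValueRange. The same steps can be sketched as separate statements that build the dict only once. This is just one way to unroll it, written for Python 3; tally is a name introduced here for illustration, and the data is fixed to the ten values from the sample run so the result is reproducible:

```python
import itertools

dataValueRange = range(1, 11)
# the ten values printed in the sample run above
data = [9, 6, 8, 3, 2, 3, 10, 7, 6, 2]

# sort, group, and count each run of equal values, once
tally = {k: len(list(g)) for k, g in itertools.groupby(sorted(data))}

# fill in zeroes for values that never occurred
hist = [(k, tally.get(k, 0)) for k in dataValueRange]

print(hist)
```

This prints the same histogram as the sample run.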

If you know there won't be any zero-frequency values (or don't care about
them), you can skip the fill-in-the-zeros step with one of these
expressions:
histAsList = [(k, len(list(g))) for k, g in itertools.groupby(sorted(data))]
histAsDict = dict((k, len(list(g))) for k, g in
                  itertools.groupby(sorted(data)))

-- Paul
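
The sorted() call in these expressions matters because groupby only groups consecutive equal values. A small sketch of the difference on unsorted input, written for Python 3 and using made-up sample data:

```python
from itertools import groupby

data = [2, 1, 2, 1, 1]

# without sorting, each run of consecutive equal values is its own group
unsorted_counts = [(k, len(list(g))) for k, g in groupby(data)]

# with sorting, equal values are adjacent and form one group per value
sorted_counts = [(k, len(list(g))) for k, g in groupby(sorted(data))]

print(unsorted_counts)  # [(2, 1), (1, 1), (2, 1), (1, 2)]
print(sorted_counts)    # [(1, 3), (2, 2)]
```

Feeding the unsorted result to dict() would silently keep only the last count seen for each repeated key.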
 
