list conversion question

Kent Tenney · Sep 4, 2004

Howdy,

I am using PIL, and would like to
make a list of the pixel values
in an image, sorted by quantity of
pixels with that value.

im.histogram() returns a list which
is a pixel count for each brightness value.

So, I want to sort the histogram, and have
the resulting list contain
the indexes instead of the counts.

This seems like it'd be a fairly common
task, but I don't know what to call
it to look for guidance.

Any help is appreciated.

Thanks,
Kent

Paul McGuire · Sep 4, 2004

Kent Tenney said:
Howdy,

I am using PIL, and would like to
make a list of the pixel values
in an image, sorted by quantity of
pixels with that value.

im.histogram() returns a list which
is a pixel count for each brightness value.

So, I want to sort the histogram, and have
the resulting list contain
the indexes instead of the counts.

This seems like it'd be a fairly common
task, but I don't know what to call
it to look for guidance.

Any help is appreciated.

Thanks,
Kent

Assuming im.histogram() returns a list like [ 0, 1, 0, 5, 43, etc. ] how
about:

hist = [ 0, 1, 0, 5, 43 ]
values = [ i for i in enumerate(hist)]
values.sort(lambda a,b: cmp(b[1],a[1]))
indexes = [ a for a,b in values ]

-- Paul

Andrew Dalke · Sep 4, 2004

Paul said:
Assuming im.histogram() returns a list like [ 0, 1, 0, 5, 43, etc. ] how
about:

hist = [ 0, 1, 0, 5, 43 ]
values = [ i for i in enumerate(hist)]
values.sort(lambda a,b: cmp(b[1],a[1]))
indexes = [ a for a,b in values ]

or tweaked a bit for speed (a sort with a lambda is expensive)
and for clarity, IMO,

pairs = [(value, offset) for (offset, value) in enumerate(hist)]
pairs.sort()
indexes = [offset for (value, offset) in pairs]

In Python2.4 this is allowed

>>> hist = [ 0, 1, 0, 5, 43 ]
>>> [pair[0] for pair in sorted(enumerate(hist),

Click to expand...

Click to expand...

.... key=lambda pair: pair[1])]
[0, 2, 1, 3, 4]
Not yet sure that that's a good thing.

Andrew
(e-mail address removed)

Paul McGuire · Sep 4, 2004

Andrew Dalke said:
Paul said:

Assuming im.histogram() returns a list like [ 0, 1, 0, 5, 43, etc. ] how
about:

hist = [ 0, 1, 0, 5, 43 ]
values = [ i for i in enumerate(hist)]
values.sort(lambda a,b: cmp(b[1],a[1]))
indexes = [ a for a,b in values ]

Click to expand...

or tweaked a bit for speed (a sort with a lambda is expensive)
and for clarity, IMO,

pairs = [(value, offset) for (offset, value) in enumerate(hist)]
pairs.sort()
indexes = [offset for (value, offset) in pairs]

In Python2.4 this is allowed

hist = [ 0, 1, 0, 5, 43 ]
[pair[0] for pair in sorted(enumerate(hist),

Click to expand...

Click to expand...

... key=lambda pair: pair[1])]
[0, 2, 1, 3, 4]
Not yet sure that that's a good thing.

Andrew
(e-mail address removed)

I assumed that the resulting list should be sorted in descending frequency
order. Probably the fastest way to do this (since we are tweaking for
speed) would be to just change the pairs list comp to "pairs = [(-value,
offset) for (offset, value) in enumerate(hist)]", or your Py 2.4 key clause
from "key=lambda pair: pair[1]" to "key=lambda pair: -pair[1]".

-- Paul

Andrew Dalke · Sep 4, 2004

Paul said:
Probably the fastest way to do this (since we are tweaking for
speed) would be to just change the pairs list comp to "pairs = [(-value,
offset) for (offset, value) in enumerate(hist)]", or your Py 2.4 key clause
from "key=lambda pair: pair[1]" to "key=lambda pair: -pair[1]".

Ahh, right. The new Python2.4 'sorted' also has a
"reversed" flag, so

>>> hist = [ 0, 1, 0, 5, 43 ]
>>> [pair[0] for pair in sorted(enumerate(hist),

Click to expand...

Click to expand...

.... key=lambda pair: pair[1],
.... reverse=True)]
[4, 3, 1, 0, 2]
This isn't the same as as reverse() after the sort(),

pairs = [(value, offset) for (offset, value) in enumerate(hist)]
pairs.sort()
indexes = [offset for (value, offset) in pairs]
indexes.reverse()

or

list(reversed([pair[0] for pair in sorted(enumerate(hist),
key=lambda pair: pair[1])]))

Both of these give

[4, 3, 1, 2, 0]

The difference is that sort is now a guaranteed
stable sort. Using 'reverse=True' means the index
to the first 0 (at index 0) appears before the index
to the seond 0 (at index 2). Using reverse() or
reversed() flips that direction.

Andrew
(e-mail address removed)

which would give

[0, 2, 1, 3, 4]

Andrew

Jp Calderone · Sep 4, 2004

Andrew said:
Paul said:

Assuming im.histogram() returns a list like [ 0, 1, 0, 5, 43, etc. ] how
about:

hist = [ 0, 1, 0, 5, 43 ]
values = [ i for i in enumerate(hist)]
values.sort(lambda a,b: cmp(b[1],a[1]))
indexes = [ a for a,b in values ]

Click to expand...

or tweaked a bit for speed (a sort with a lambda is expensive)
and for clarity, IMO,

pairs = [(value, offset) for (offset, value) in enumerate(hist)]
pairs.sort()
indexes = [offset for (value, offset) in pairs]

In Python2.4 this is allowed

hist = [ 0, 1, 0, 5, 43 ]
[pair[0] for pair in sorted(enumerate(hist),

Click to expand...

Click to expand...

... key=lambda pair: pair[1])]
[0, 2, 1, 3, 4]

Even faster, though perhaps not as clear:

>>> import operator
>>> hist = [ 0, 1, 0, 5, 43 ]
>>> map(operator.itemgetter(0), sorted(enumerate(hist),

Click to expand...

Click to expand...

... key=operator.itemgetter(1)))
[0, 2, 1, 3, 4]

Jp

Peter Otten · Sep 4, 2004

Jp said:
Even faster, though perhaps not as clear:

import operator
hist = [ 0, 1, 0, 5, 43 ]
map(operator.itemgetter(0), sorted(enumerate(hist),

Click to expand...

Click to expand...

... key=operator.itemgetter(1)))
[0, 2, 1, 3, 4]

Yet another option, not benchmarked, perhaps clearer:

hist = [0, 1, 0, 5, 43]
indices = range(len(hist))
indices.sort(key=hist.__getitem__)
indices

Click to expand...

Click to expand...

[0, 2, 1, 3, 4]

Peter

Andrew Dalke · Sep 4, 2004

Peter said:
Yet another option, not benchmarked, perhaps clearer:

hist = [0, 1, 0, 5, 43]
indices = range(len(hist))
indices.sort(key=hist.__getitem__)
indices

Click to expand...

Click to expand...

[0, 2, 1, 3, 4]

Looks the fastest to me. Here's my results, with
the benchmark below. In this list:
sort0 = Paul McGuire's solution with a lambda
sort1 = my solution with the pair flipped so the
default sort works
sort2 = my solution using Python 2.4's "keys"
sort3 = Jp Calderone's variant of that with itemgetter
sort4 = your (Petter Otten's) solution

list of size 0
sort0 0.0932559967041
sort1 0.0773079395294
sort2 0.176142930984
sort3 0.23094701767
sort4 0.089989900589
list of size 10
sort0 0.866734981537
sort1 0.524667978287
sort2 0.553129911423
sort3 0.529242992401
sort4 0.265748023987
list of size 100
sort0 18.9233298302
sort1 5.04968500137
sort2 5.11950612068
sort3 4.75884985924
sort4 3.0580329895
list of size 1000
sort0 262.217221022
sort1 70.5779349804
sort2 69.9501719475
sort3 54.6528921127
sort4 39.5281660557

Here's my benchmark code

import random, operator

def make_random_list(n):
# Ensure there will be duplicates
choose_from = range(n//3+1)
return [random.choice(choose_from) for i in range(n)]

def do_sort0(hist):
values = [ i for i in enumerate(hist)]
values.sort(lambda a,b: cmp(b[1],a[1]))
return [ a for a,b in values ]

def do_sort1(hist):
pairs = [(value, offset) for (offset, value) in enumerate(hist)]
pairs.sort()
return [offset for (value, offset) in pairs]

def do_sort2(hist):
return [pair[0] for pair in sorted(enumerate(hist),
key=lambda pair: pair[1])]

def do_sort3(hist):
return map(operator.itemgetter(0), sorted(enumerate(hist),
key=operator.itemgetter(1)))

def do_sort4(hist):
indices = range(len(hist))
indices.sort(key=hist.__getitem__)
return indices

data = make_random_list(100)
assert (do_sort0(data) == do_sort1(data) == do_sort2(data) ==
do_sort3(data) == do_sort4(data))

import timeit

for size in [0, 10, 100, 1000]:
print " list of size", size
for i in range(5):
name = "sort%d" % i
t = timeit.Timer(
setup= ("import __main__\nhist=__main__.make_random_list(%d)" %
size),
stmt = "__main__.do_%s(hist)" % name)
print name, t.timeit(10000)

Andrew
(e-mail address removed)

Raymond Hettinger · Sep 5, 2004

[Peter Otten]

Even faster, though perhaps not as clear:

import operator
hist = [ 0, 1, 0, 5, 43 ]
map(operator.itemgetter(0), sorted(enumerate(hist),

Click to expand...

... key=operator.itemgetter(1)))
[0, 2, 1, 3, 4]

Click to expand...

Yet another option, not benchmarked, perhaps clearer:

hist = [0, 1, 0, 5, 43]
indices = range(len(hist))
indices.sort(key=hist.__getitem__)
indices

Click to expand...

Click to expand...

[0, 2, 1, 3, 4]

These are both fast and clean. Consider posting them to the ASPN
cookbook so they won't get lost.

Raymond Hettinger

Raymond Hettinger · Sep 5, 2004

[Peter Otten]

Yet another option, not benchmarked, perhaps clearer:

hist = [0, 1, 0, 5, 43]
indices = range(len(hist))
indices.sort(key=hist.__getitem__)
indices

Click to expand...

[0, 2, 1, 3, 4]

Click to expand...

[Andrew Dalke]

Looks the fastest to me. Here's my results, with
the benchmark below.

. . .

These results are good. Consider posting them in the ASPN Cookbook so they
won't get lost.

Raymond Hettinger

Alex Martelli · Sep 5, 2004

Paul McGuire said:
values = [ i for i in enumerate(hist)]

values = list(enumerate(hist))

is IMHO a better way to make a list in such cases.

Alex

Alex Martelli · Sep 5, 2004

Andrew Dalke said:
In Python2.4 this is allowed

hist = [ 0, 1, 0, 5, 43 ]
[pair[0] for pair in sorted(enumerate(hist),

Click to expand...

Click to expand...

... key=lambda pair: pair[1])]
[0, 2, 1, 3, 4]
Not yet sure that that's a good thing.

it makes the fast (DSU) way of sorting easier, and can be sped up
further as:

[ p[0] for p in sorted(enumerate(hist), operator.itemgetter(1)) ]

The new itemgetter and attrgetter HOFs in module operator are quite good
for this kind of use (which is exactly why they got introduced).

Alex

Average of MultiMode of a list of a list	1	Oct 28, 2022
How to use Densenet121 in monai	0	Feb 16, 2024
ImageEnhance.Contrast - is this fishy or what?	1	Jun 15, 2009
ANN: Phatch = PHoto bATCH processor and renamer based on PIL	26	Feb 18, 2008
Newbie question about tuples and list comprehensions	7	Jun 25, 2008
Request for help with Image color space conversion	4	Jan 4, 2008
Remove pointer from list	2	Mar 24, 2005
python 2.6: how to modify a PIL image from C without copying forthand back	4	Mar 3, 2010

list conversion question

Kent Tenney

Paul McGuire

Andrew Dalke

Paul McGuire

Andrew Dalke

Jp Calderone

Peter Otten

Andrew Dalke

Raymond Hettinger

Raymond Hettinger

Alex Martelli

Alex Martelli

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads