Handling lists

superprad

I have a question about Python lists.
Suppose I have a 2D list
list = [[10,11,12,13,14,78,79,80,81,300,301,308]]
How do I convert it so that the numbers are arranged into bins?
If I have a set of consecutive numbers, I would like to represent
them as a range in the list, keeping only the min and max values of the range.
I should get something like
list = [[10,14],[78,81],[300,308]]
 
Mage

Maybe:

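# named "numbers" rather than "list" to avoid shadowing the built-in type
numbers = [10,11,12,13,14,78,79,80,81,300,301,308]

new_list = []
start = 0  # index where the current run begins
for i in range(1, len(numbers) + 1):
    # close the run at the end of the list, or when the step is not exactly 1
    if i == len(numbers) or numbers[i] - numbers[i-1] != 1:
        new_list.append([numbers[start], numbers[i-1]])
        start = i

print(new_list)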
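
For comparison, the same strict "consecutive" rule can be sketched with itertools.groupby. This is only an illustration, not part of the original reply: the function name consecutive_ranges is made up here, and it assumes the input holds distinct integers.

```python
from itertools import groupby


def consecutive_ranges(numbers):
    """Return [first, last] for each run of values that increase by exactly 1.

    Hypothetical helper; assumes distinct integers.
    """
    ranges = []
    # Within a run of consecutive integers, value - index stays constant,
    # so that difference can serve as the grouping key.
    indexed = enumerate(sorted(numbers))
    for _, run in groupby(indexed, key=lambda pair: pair[1] - pair[0]):
        values = [value for _, value in run]
        ranges.append([values[0], values[-1]])
    return ranges


print(consecutive_ranges([10, 11, 12, 13, 14, 78, 79, 80, 81, 300, 301, 308]))
# [[10, 14], [78, 81], [300, 301], [308, 308]]
```

Note that under a strict step-of-1 rule, 300, 301 and 308 do not end up in a single [300, 308] range, because 308 is not consecutive with 301.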
 
superprad

Yes, that makes sense. But the problem I am facing is that if list =
[300,301,303,305], I want to consider it as one cluster and include the
range as [300,305]; this is where I am missing the ranges.
So if the list is l = [300,301,302,308,401,402,403,408], I want to
get [[300,308],[401,408]].
 
James Stroud

Here is an interesting way:
>>> a = iter([1,2,3,4])
>>> [(b, next(a)) for b in a]
[(1, 2), (3, 4)]
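
The trick works because the loop and the explicit call both pull from the same iterator, so every pass through the loop consumes two items. A minimal sketch of the same idea using zip follows; the helper name pairs is hypothetical and only for illustration.

```python
def pairs(items):
    """Group a flat sequence into (first, second) tuples.

    Hypothetical helper; zip reads twice from the same iterator per tuple.
    """
    it = iter(items)
    return list(zip(it, it))


print(pairs([1, 2, 3, 4]))                  # [(1, 2), (3, 4)]
print(pairs([10, 14, 78, 81, 300, 308]))    # [(10, 14), (78, 81), (300, 308)]
```

With an odd number of items, zip silently drops the last one, so this form is best suited to lists known to hold an even number of values.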

James

--
James Stroud
UCLA-DOE Institute for Genomics and Proteomics
Box 951570
Los Angeles, CA 90095

http://www.jamesstroud.com/
 
Michael Spencer

Mage's solution meets the requirement you initially stated: treating
*consecutive* numbers as a group. Now you also want to consider
[300,301,303,305] as a cluster.

You need to specify your desired clustering rule, or alternatively specify how
many bins you want to create. As an example, here is a naive approach that
could be adapted easily to other clustering rules and (a bit less easily) to
target a certain number of bins:

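def lstcluster(lst):
    # Separate neighbors that differ by more than the mean difference.
    # Needs at least two values, otherwise there is no difference to average.
    lst = sorted(lst)  # sorted copy, so the caller's list is left untouched
    # (gap, (left, right)) for every pair of adjacent values
    diffs = [(b - a, (a, b)) for a, b in zip(lst, lst[1:])]
    mean_diff = sum(diff[0] for diff in diffs) / len(diffs)
    # keep the neighbor pairs whose gap is larger than the mean gap
    breaks = [pair for diff, pair in diffs if diff > mean_diff]
    # flatten to [start, end, start, end, ...]
    groups = [lst[0]] + [i for x in breaks for i in x] + [lst[-1]]
    igroups = iter(groups)  # Pairing mechanism due to James Stroud
    return [[i, next(igroups)] for i in igroups]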

Note this is quite inefficient due to creating several intermediate lists. But
it's not worth optimizing yet, since I'm only guessing at your actual requirement.

>>> lst0 = [10,11,12,13,14,78,79,80,81,300,301,308]
>>> lst1 = [10,12,16,24,26,27,54,55,80,100, 105]
>>> lst2 = [1,5,100,1000,1005,1009,10000, 10010,10019]
>>> lstcluster(lst0)
[[10, 14], [78, 81], [300, 308]]
>>> lstcluster(lst1)
[[10, 27], [54, 55], [80, 80], [100, 105]]
>>> lstcluster(lst2)
[[1, 1009], [10000, 10019]]
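
As a rough sketch of how the rule might be swapped out, here are two variants: one that splits wherever neighbors differ by more than a fixed gap, and one that targets a given number of bins by splitting at the largest gaps. The names cluster_by_gap, cluster_into_bins and ranges_from_breaks, and the max_gap value of 10, are assumptions made for illustration; the actual requirement has not been stated.

```python
def ranges_from_breaks(values, break_after):
    """Build [min, max] pairs from sorted values and the indices to split after."""
    ranges = []
    start = 0
    for i in sorted(break_after):
        ranges.append([values[start], values[i]])
        start = i + 1
    ranges.append([values[start], values[-1]])
    return ranges


def cluster_by_gap(values, max_gap):
    """Split wherever two sorted neighbors differ by more than max_gap."""
    values = sorted(values)
    if not values:
        return []
    break_after = [i for i in range(len(values) - 1)
                   if values[i + 1] - values[i] > max_gap]
    return ranges_from_breaks(values, break_after)


def cluster_into_bins(values, n_bins):
    """Split at the n_bins - 1 largest gaps between sorted neighbors."""
    values = sorted(values)
    if not values:
        return []
    # indices of neighbor pairs, largest gap first
    gap_indices = sorted(range(len(values) - 1),
                         key=lambda i: values[i + 1] - values[i],
                         reverse=True)
    return ranges_from_breaks(values, gap_indices[:max(n_bins - 1, 0)])


# max_gap=10 is an arbitrary choice that happens to fit the examples in this thread
print(cluster_by_gap([300, 301, 303, 305], 10))
# [[300, 305]]
print(cluster_by_gap([300, 301, 302, 308, 401, 402, 403, 408], 10))
# [[300, 308], [401, 408]]
print(cluster_into_bins([10, 11, 12, 13, 14, 78, 79, 80, 81, 300, 301, 308], 3))
# [[10, 14], [78, 81], [300, 308]]
```

A fixed gap is easy to reason about but has to be tuned to the data; asking for a number of bins avoids choosing a threshold, at the cost of always producing that many groups (when there are enough values) even if the data has fewer natural clusters.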


Michael
 
