Overlap in python

Bearophile · Aug 5, 2009

Albert van der Horst:

That is an algorithmic question and has little to do with Python.

Yes, but comp.lang.python isn't comp.lang.c; questions of that kind
are perfectly fine here. They help keep this place from becoming
boring.

Bye,
bearophile

Marcus Wanner · Aug 5, 2009

>>> parts = [(5, 9, "a"), (7, 10, "b"), (3, 6, "c"), (15, 20, "d"), (18, 23, "e")]
>>> parts.sort()
>>> parts
[(3, 6, 'c'), (5, 9, 'a'), (7, 10, 'b'), (15, 20, 'd'), (18, 23, 'e')]
>>> # Merge overlapping intervals.
>>> pos = 1
>>> while pos < len(parts):
...     # Merge the pair in parts[pos - 1 : pos + 1] if they overlap.
...     p, q = parts[pos - 1 : pos + 1]
...     if p[1] >= q[0]:
...         parts[pos - 1 : pos + 1] = [(p[0], max(p[1], q[1]), p[2] + "." + q[2])]
...     else:
...         # They don't overlap, so try the next pair.
...         pos += 1
...
>>> parts
[(3, 10, 'c.a.b'), (15, 23, 'd.e')]

That's the best solution I've seen so far. It even has input/output
formatted as close as is reasonably possible to the format specified.

As we would say in googlecode, +1.

Marcus
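
For readers who want the quoted session as a reusable function, here is a minimal sketch of the same idea. The name merge_intervals is an assumption made for illustration and does not appear in the thread. Instead of rewriting slices of the list in place, it builds a new list in a single pass over the sorted input. Like the quoted code, it treats intervals that merely touch (p[1] == q[0]) as overlapping.

```python
def merge_intervals(parts):
    """Merge overlapping (start, end, label) tuples.

    Hypothetical helper, not from the thread. Labels of merged
    intervals are joined with "." in sorted order.
    """
    merged = []
    for start, end, label in sorted(parts):
        if merged and merged[-1][1] >= start:
            # Overlaps (or touches) the last merged interval: extend it.
            prev_start, prev_end, prev_label = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end), prev_label + "." + label)
        else:
            # No overlap with the last merged interval: start a new one.
            merged.append((start, end, label))
    return merged


parts = [(5, 9, "a"), (7, 10, "b"), (3, 6, "c"), (15, 20, "d"), (18, 23, "e")]
print(merge_intervals(parts))
```

With the sample data from the post this should print the same result as the session above. Appending to a new list avoids the repeated slice assignment, which may matter on long inputs, and sorted() leaves the caller's list untouched.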

nn · Aug 5, 2009

How does it compare to this one?

http://groups.google.com/group/comp...ed9d05d11d0/56684b795fc527cc#56684b795fc527cc

Mark Lawrence · Aug 5, 2009

Jay said:
Hi everyone,

I wanted to thank you all for your help and *excellent* discussion. I
was able to utilize and embed the script by Grigor Lingl in the 6th
post of this discussion to get my program to work very quickly (I had
to do about 20 comparisons per data bin, with over 40K bins in
total). I am involved in genomic analysis research and this problem
comes up a lot and I was surprised to not have been able to find a
clear way to solve it. I will also look through all the tips in this
thread, I have a feeling they may come in handy for future use!

Thank you again,
Jay

I don't know if this is relevant, but http://planet.python.org/ has an
entry dated this morning that points here:
http://www.logarithmic.net/pfh/blog/01249470842

HTH.
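
The script Jay refers to is not reproduced in this part of the thread, so the following is only a hypothetical sketch of the kind of test his description suggests: checking a handful of intervals against each data bin. The function name overlap, the sample bins, and the sample features are all assumptions made for illustration. Intervals are treated as closed, so ranges that share an endpoint count as overlapping, which matches the p[1] >= q[0] test in the merge code quoted earlier.

```python
def overlap(a, b):
    """Return the shared part of two closed intervals, or None.

    Hypothetical helper, not from the thread. Each interval is a
    (start, end) pair with start <= end.
    """
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return (start, end) if start <= end else None


# Made-up sample data: three bins and two features to compare against them.
bins = [(0, 9), (10, 19), (20, 29)]
features = [(5, 12), (18, 23)]

for b in bins:
    # Collect every feature that shares at least one point with this bin.
    hits = [f for f in features if overlap(b, f) is not None]
    print(b, hits)
```

Each comparison is constant time, so about 20 comparisons per bin over 40K bins comes to roughly 800,000 cheap tests.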

Marcus Wanner · Aug 6, 2009

nn said:
How does it compare to this one?

http://groups.google.com/group/comp...ed9d05d11d0/56684b795fc527cc#56684b795fc527cc

That is a different problem, and the solution is more complex.
I am not going to try to judge which is better.

Marcus

--
print ''.join([chr(((ord(z)+(ord("I'M/THE"[3])+sum(
[ord(x)for x in 'CRYPTOR'])))%(4*ord('8')+ord(
' ')))) for z in ''.join(([(('\xca\x10\x03\t'+
'\x01\xff\xe6\xbe\x0c\r\x06\x12\x17\xee\xbe'+
'\x10\x03\x06\x12\r\x0c\xdf\xbe\x12\x11\x13'+
'\xe8')[13*2-y]) for y in range(int(6.5*4)+1)]
))])

John Ladasky · Aug 6, 2009

Hi Jay,

I know this is a bit off-topic, but how does this pertain to genomic
analysis? Are you counting the lengths of microsatellite repeats or
something?
