Overlapping region resolution

psaffrey · May 21, 2009

This may be an algorithmic question, but I'm trying to code it in
Python, so...

I have a list of pairwise regions, each with an integer start and end
and a float data point. There may be overlaps between the regions. I
want to resolve this into an ordered list with no overlapping
regions.

My initial solution was to sort the list by the start point, and then
compare each adjacent region, clipping any overlapping section in
half. I've attached code at the bottom. Unfortunately, this does not
work well if you have sections that have three or more overlapping
regions.

A more general solution is to work out where all the overlaps are
before I start. Then I can break up the region space based on what
regions overlap each section and take averages of all the datapoints
that are present in a particular section. Devising an algorithm to do
this is making my brain hurt. Any ideas?

Peter

# also validates the data
def clipRanges(regions):
for i in range(len(regions) - 1):
thispoint = regions
nextpoint = regions[i+1]
assert thispoint[1] > thispoint[0] and nextpoint[1] > nextpoint[0],
"point read not valid"
thisend = thispoint[1]
nextstart = nextpoint[0]
diff = thisend - nextstart
# a difference of zero is too close together
if diff > -1:
if diff % 2 == 1:
diff += 1
correction = diff / 2
newend = thisend - correction
newstart = newend + 1
assert newend > thispoint[0] and nextpoint[1] > newstart, "new
range not valid!"
newthispoint = (thispoint[0], newend, thispoint[2])
newnextpoint = (newstart, nextpoint[1], nextpoint[2])
regions = newthispoint
regions[i+1] = newnextpoint
return regions

regions = [ (0,10,2.5), (12,22,3.5), (15,25,1.2), (23, 30,0.01), (27,
37,1.23), (30, 35, 1.45) ]
regions2 = [ (0,10,2.5), (1,11,1.1), (2,12,1.2) ]

# works fine, produces [(0, 10, 2.5), (12, 18, 3.5), (19, 24, 1.2),
(25, 28, 0.01), (29, 33, 1.23), (34, 35, 1.45)]
print clipRanges(regions)
# violates "new range not valid" assertion
print clipRanges(regions2)

MRAB · May 21, 2009

This may be an algorithmic question, but I'm trying to code it in
Python, so...

I have a list of pairwise regions, each with an integer start and end
and a float data point. There may be overlaps between the regions. I
want to resolve this into an ordered list with no overlapping
regions.

My initial solution was to sort the list by the start point, and then
compare each adjacent region, clipping any overlapping section in
half. I've attached code at the bottom. Unfortunately, this does not
work well if you have sections that have three or more overlapping
regions.

A more general solution is to work out where all the overlaps are
before I start. Then I can break up the region space based on what
regions overlap each section and take averages of all the datapoints
that are present in a particular section. Devising an algorithm to do
this is making my brain hurt. Any ideas?

Peter

# also validates the data
def clipRanges(regions):
for i in range(len(regions) - 1):
thispoint = regions
nextpoint = regions[i+1]
assert thispoint[1] > thispoint[0] and nextpoint[1] > nextpoint[0],
"point read not valid"
thisend = thispoint[1]
nextstart = nextpoint[0]
diff = thisend - nextstart
# a difference of zero is too close together
if diff > -1:
if diff % 2 == 1:
diff += 1
correction = diff / 2
newend = thisend - correction
newstart = newend + 1
assert newend > thispoint[0] and nextpoint[1] > newstart, "new
range not valid!"
newthispoint = (thispoint[0], newend, thispoint[2])
newnextpoint = (newstart, nextpoint[1], nextpoint[2])
regions = newthispoint
regions[i+1] = newnextpoint
return regions

regions = [ (0,10,2.5), (12,22,3.5), (15,25,1.2), (23, 30,0.01), (27,
37,1.23), (30, 35, 1.45) ]
regions2 = [ (0,10,2.5), (1,11,1.1), (2,12,1.2) ]

# works fine, produces [(0, 10, 2.5), (12, 18, 3.5), (19, 24, 1.2),
(25, 28, 0.01), (29, 33, 1.23), (34, 35, 1.45)]
print clipRanges(regions)
# violates "new range not valid" assertion
print clipRanges(regions2)

Is the upper bound of a range inclusive or exclusive? If it's exclusive
(like in Python) then it's empty only if lower == upper, and an overlap
occurs only if first.upper > second.lower (and you can have newstart ==
newend in your code).

find overlapping lines & output times observed	1	May 6, 2013
More than one element of list changing when only one should be	1	Jun 7, 2008
Trying to build a SARIMAX model to forecast the S&P500 trend	0	Nov 5, 2023
Merging overlapping spans/ranges	11	May 10, 2005
I made a blockchain and want to make a cryptocurrency, but my code doesn't verify hash of each block	2	Jun 2, 2024
Python point location of intersect between two lines	0	Feb 28, 2018
Taskcproblem calendar	4	Aug 31, 2023
Working on mobile css menu with plenty of frustration!	2	Dec 29, 2022

Overlapping region resolution

psaffrey

MRAB

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads