R
Robert Bossy
Hi,
I stumbled into a sorted list intersection algorithm by Baeza-Yates
which I found quite elegant. For the lucky enough to have a springerlink
access, here's the citation:
http://dblp.uni-trier.de/rec/bibtex/conf/cpm/Baeza-Yates04
I implemented this algorithm in python and I thought I could share it.
I've done some tests and, of course, it can't compete against dict/set
intersection, but it will perform pretty well. Computing union and
differences are left as an exercise...
from bisect import bisect_left
def bisect_intersect(L1, L2):
inter = []
def rec(lo1, hi1, lo2, hi2):
if hi1 <= lo1: return
if hi2 <= lo2: return
mid1 = (lo1 + hi1) // 2
x1 = L1[mid1]
mid2 = bisect_left(L2, x1, lo=lo2, hi=hi2)
rec(lo1, mid1, lo2, mid2)
if mid2 < hi2 and x1 == L2[mid2]:
inter.append(x1)
rec(mid1+1, hi1, mid2+1, hi2)
else:
rec(mid1+1, hi1, mid2, hi2)
rec(0, len(L1), 0, len(L2))
inter.sort()
return inter
Cheers
RB
I stumbled into a sorted list intersection algorithm by Baeza-Yates
which I found quite elegant. For the lucky enough to have a springerlink
access, here's the citation:
http://dblp.uni-trier.de/rec/bibtex/conf/cpm/Baeza-Yates04
I implemented this algorithm in python and I thought I could share it.
I've done some tests and, of course, it can't compete against dict/set
intersection, but it will perform pretty well. Computing union and
differences are left as an exercise...
from bisect import bisect_left
def bisect_intersect(L1, L2):
inter = []
def rec(lo1, hi1, lo2, hi2):
if hi1 <= lo1: return
if hi2 <= lo2: return
mid1 = (lo1 + hi1) // 2
x1 = L1[mid1]
mid2 = bisect_left(L2, x1, lo=lo2, hi=hi2)
rec(lo1, mid1, lo2, mid2)
if mid2 < hi2 and x1 == L2[mid2]:
inter.append(x1)
rec(mid1+1, hi1, mid2+1, hi2)
else:
rec(mid1+1, hi1, mid2, hi2)
rec(0, len(L1), 0, len(L2))
inter.sort()
return inter
Cheers
RB