nearest neighbor in 2D

Miki Tebeka · Nov 5, 2003

Hello John,

Can anyone point me to some code or module that provides the
appropriate data structures and algorithms to handle this task
efficiently? The size of the list will likely be in the range of
10-1000 elements.

boost (http://www.boost.org) has a graph library and a good connection
to Python using Boost.Python.

HTH.
Miki

Peter Otten · Nov 5, 2003

Jim said:
I am new to timeit.py, but this is odd.

jim@grendel:~$ /usr/lib/python2.3/timeit.py -c ' p=1.6+2.5j;np=2.4+1.3j;
d=abs(p-np)' 100000 loops, best of 3: 3.1 usec per loop

The above is actually timed.

jim@grendel:~$ /usr/lib/python2.3/timeit.py -c -s'import math;
p=1.6+2.5j;np=2.4+1.3j; d=abs(p-np)' 10000000 loops, best of 3: 0.141 usec
per loop

Here you put everything in the setup parameter. As the timed statement
defaults to pass, you are essentially timing an empty loop :-(

Is it because the builtin math functions are much slower?

You should have used math.abs() or from math import abs to really get the
abs() function in the math module. But
False

Not there, so we cannot compare.

Peter

WEINHANDL Herbert · Nov 5, 2003

John said:
I have a list of two tuples containing x and y coord

(x0, y0)
(x1, y1)
...
(xn, yn)

Given a new point x,y, I would like to find the point in the list
closest to x,y. I have to do this a lot, in an inner loop, and then I
add each new point x,y to the list. I know the range of x and y in
advance.

Can anyone point me to some code or module that provides the
appropriate data structures and algorithms to handle this task
efficiently? The size of the list will likely be in the range of
10-1000 elements.

The plotting-library dislin (http://www.dislin.com)
contains a really fast triangulation subroutine
(~1 hour for 700000 points on an 1.5 GHz Athlon).

For an example triangulation/nearest-neighbor-search see
the attached python-script.

Dislin is available for many platforms (Linux, Winxxx, ...)
and it can be used for free if you are using free languages like
Python, Perl, etc.

Happy pythoning

Herbert

#!/usr/bin/python

# (c) by H. Weinhandl 04.Nov.2003

import math
import dislin

def dist( ia,ib ) :
return math.sqrt( (X1[ia]-X1[ib])**2 + (Y1[ia]-Y1[ib])**2 )

def find_nearest_neighb() :

print 'extracting neighbours ... ',

for i in range( nr_dat+3 ) :
# initualize list wit the point i itself
neighb.append( )

for i in range( nr_tri ) :
# get a indes-triple, dislin seems to start indices with 1,
# so I'm subtracting 1 to get a zero-based index
U,V,W = I1-1, I2-1, I3-1

# add indices from all triangles, which contain the point itself
if ( U in neighb ) :
if not (V in neighb ) : neighb.append( V ) # do not append V if already in the list
if not (W in neighb ) : neighb.append( W ) # do not append W if already in the list

if ( V in neighb[V] ) :
if not (U in neighb[V] ) : neighb[V].append( U ) # do not append U if already in the list
if not (W in neighb[V] ) : neighb[V].append( W ) # do not append W if already in the list

if ( W in neighb[W] ) :
if not (U in neighb[W] ) : neighb[W].append( U ) # do not append U if already in the list
if not (V in neighb[W] ) : neighb[W].append( V ) # do not append V if already in the list

print ' OK'
for i in range( nr_dat ) :
neighb.remove( i ) # remove the point i itself
neighb.sort()
r_mi = 9.9e9
i_mi = i

# search for the nearest neighbor of point i
for j in neighb :
r = dist( i, j )
if r < r_mi :
r_mi = r
i_mi = j
print 'NB %2d : r_mi=%5.3f, i_mi=%2d '% (i, r_mi, i_mi), neighb

if __name__ == '__main__' :

X1 = [ 1.0, 2.0, 3.0, 4.0, 5.0, 4.0, 3.0 ]
Y1 = [ 1.0, 6.0, 3.0, 3.0, 2.0, 6.0, 0.0 ]
nr_dat = len( X1 )
nr_max = 2 * nr_dat + 1
I1 = [ 0 ] * nr_max
I2 = [ 0 ] * nr_max
I3 = [ 0 ] * nr_max
neighb = [ ]

# padding the 2 input-lists with 3 additional elements is required by dislin
X1.append( 0 )
X1.append( 0 )
X1.append( 0 )
Y1.append( 0 )
Y1.append( 0 )
Y1.append( 0 )

# delaunay triangulation of the input lists
# I1,I2,I3 are the indices of the triangular edge-points
nr_tri = dislin.triang( X1,Y1, nr_dat, I1,I2,I3, nr_max )

print X1
print Y1
print nr_dat, nr_tri, nr_max
print I1
print I2
print I3

find_nearest_neighb()

#---- end ----

Jim Richardson · Nov 6, 2003

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

The above is actually timed.

Here you put everything in the setup parameter. As the timed statement
defaults to pass, you are essentially timing an empty loop :-(

Ah! that explains it, thanks. breaking the import off, gets the loop to
3.2 usec per loop, which makes more sense.

You should have used math.abs() or from math import abs to really get the
abs() function in the math module. But

False

Not there, so we cannot compare.

Actually, I just cut'n'pasted, with a tiny alteration (timeit.py isn't
in my path) I like the idea of using the complex number there, it's a
tad easier that the other method for me to visualize.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (GNU/Linux)

iD8DBQE/qbFpd90bcYOAWPYRArgzAKDiNBXw3tIpooWqglmxoqn2GebyxgCcDEKV
u60SPPopUrkJ1Kak0t4eias=
=+Guh
-----END PGP SIGNATURE-----

Colin J. Williams · Nov 6, 2003

This enquiry has generated a lot of interest. I thought it might be
useful to try numarray on the problem. numarray has a compress
function, which is used to discard points which are not of interest.

The code is below.

As would be expected, each scan over the points in the neigbourhood
discards, on average, a little more than half the points.

I have not tried it, but it should be possible, with numarray, to
generalize this from 2D to multimimensional space.

Colin W.

# Neighbour.py
''' To find the closest neighbour, in a neghbourhood P,
to a point p.
'''
import math
import numarray as N
import numarray.numerictypes as _nt
import numarray.random_array as R

trial= 0
lengthRemaining= []
def find(P, p):
''' In the 2D neighbourhood P, find the point closest to p.
The approach is based on the selection of a trial value Pt, from
P, and
discarding all values further than Pt from p.
To avoid repeated sqrt calculations the discard is based on an
enclosing square.
'''
global lengthRemaining, trial
lengthRemaining+= [[P.shape[0]]]
Pz= P - p # zero based neighbourhood
while len(Pz):
Pt= Pz[0] # trial value
Pta= N.abs(Pt)
Pz= Pz[1:]
pd= math.sqrt(N.dot(Pta, Pta)) # distance of p from the trial
value
goodCases= N.logical_and((Pz < pd),
(Pz > -pd))# use the enclosing square
goodCases= N.logical_and.reduce(goodCases, 1)
Pz= N.compress(goodCases, Pz) # discard points outside the square
if len(Pz) == 1:
Pt= Pz[0] # We have found the closest
Pz= []
lengthRemaining[trial]+= [len(Pz)]
z= 100
trial+= 1
return Pt + p

if __name__ == '__main__':
for sampleSize in range(100, 5000, 100):
P= R.random(shape= (sampleSize, 2))
for i in range(20):
p= R.random((1, 2)) # point
a= find(P, p)
## print 'Closest neighbour:', a[0]
## print 'Check - Point(p):', p[0]
## check= []
## for i in range(len(P)):
## check+= [(math.sqrt((P[i, 0]-p[0, 0])**2 + (P[i, 1]-p[0,
1])**2), P[i, 0], P[i, 1])]
## print P, math.sqrt((P[i, 0]-p[0, 0])**2 + (P[i, 1]-p[0, 1])**2)
## check.sort()
## print check[0]
## print check
print 'Number of scans:', sum([len(lst) for lst in lengthRemaining])
print 'Sample size:', P.shape[0], ' Average numner of scans:',
sum([len(lst) for lst in lengthRemaining])/float(len(lengthRemaining))

WEINHANDL said:

John said:

I have a list of two tuples containing x and y coord
(x0, y0)
(x1, y1)
...
(xn, yn)

Given a new point x,y, I would like to find the point in the list
closest to x,y. I have to do this a lot, in an inner loop, and then I
add each new point x,y to the list. I know the range of x and y in
advance.

Can anyone point me to some code or module that provides the
appropriate data structures and algorithms to handle this task
efficiently? The size of the list will likely be in the range of
10-1000 elements.

Click to expand...

The plotting-library dislin (http://www.dislin.com)
contains a really fast triangulation subroutine
(~1 hour for 700000 points on an 1.5 GHz Athlon).

For an example triangulation/nearest-neighbor-search see
the attached python-script.

Dislin is available for many platforms (Linux, Winxxx, ...)
and it can be used for free if you are using free languages like
Python, Perl, etc.

Happy pythoning

Herbert

------------------------------------------------------------------------

#!/usr/bin/python

# (c) by H. Weinhandl 04.Nov.2003

import math
import dislin

def dist( ia,ib ) :
return math.sqrt( (X1[ia]-X1[ib])**2 + (Y1[ia]-Y1[ib])**2 )

def find_nearest_neighb() :

print 'extracting neighbours ... ',

for i in range( nr_dat+3 ) :
# initualize list wit the point i itself
neighb.append( )

for i in range( nr_tri ) :
# get a indes-triple, dislin seems to start indices with 1,
# so I'm subtracting 1 to get a zero-based index
U,V,W = I1-1, I2-1, I3-1

# add indices from all triangles, which contain the point itself
if ( U in neighb ) :
if not (V in neighb ) : neighb.append( V ) # do not append V if already in the list
if not (W in neighb ) : neighb.append( W ) # do not append W if already in the list

if ( V in neighb[V] ) :
if not (U in neighb[V] ) : neighb[V].append( U ) # do not append U if already in the list
if not (W in neighb[V] ) : neighb[V].append( W ) # do not append W if already in the list

if ( W in neighb[W] ) :
if not (U in neighb[W] ) : neighb[W].append( U ) # do not append U if already in the list
if not (V in neighb[W] ) : neighb[W].append( V ) # do not append V if already in the list

print ' OK'
for i in range( nr_dat ) :
neighb.remove( i ) # remove the point i itself
neighb.sort()
r_mi = 9.9e9
i_mi = i

# search for the nearest neighbor of point i
for j in neighb :
r = dist( i, j )
if r < r_mi :
r_mi = r
i_mi = j
print 'NB %2d : r_mi=%5.3f, i_mi=%2d '% (i, r_mi, i_mi), neighb

if __name__ == '__main__' :

X1 = [ 1.0, 2.0, 3.0, 4.0, 5.0, 4.0, 3.0 ]
Y1 = [ 1.0, 6.0, 3.0, 3.0, 2.0, 6.0, 0.0 ]
nr_dat = len( X1 )
nr_max = 2 * nr_dat + 1
I1 = [ 0 ] * nr_max
I2 = [ 0 ] * nr_max
I3 = [ 0 ] * nr_max
neighb = [ ]

# padding the 2 input-lists with 3 additional elements is required by dislin
X1.append( 0 )
X1.append( 0 )
X1.append( 0 )
Y1.append( 0 )
Y1.append( 0 )
Y1.append( 0 )

# delaunay triangulation of the input lists
# I1,I2,I3 are the indices of the triangular edge-points
nr_tri = dislin.triang( X1,Y1, nr_dat, I1,I2,I3, nr_max )

print X1
print Y1
print nr_dat, nr_tri, nr_max
print I1
print I2
print I3

find_nearest_neighb()

#---- end ----

Click to expand...

[C language] Issue in the Lotka-Volterra model.	0	Jun 28, 2023
Copy string from 2D array to a 1D array in C	1	Nov 1, 2023
Help for my project in the last minute	0	Apr 23, 2022
Beginner. 2d rotation gives unexpected results.	4	Jul 23, 2013
Cannot convert (double) to (double*)	1	Sep 5, 2022
Weird Behavior with Rays in C and OpenGL	4	Feb 13, 2024
Numerical Integration of 2D data arrays	0	Nov 7, 2008
Problem with splice in a 2D ARRAY	6	Jun 17, 2013

nearest neighbor in 2D

Miki Tebeka

Peter Otten

WEINHANDL Herbert

Jim Richardson

Colin J. Williams

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads