Find the Nearest Neighbors of a Point

DESCRIPTION:

Find the k nearest neighbors of a vector x in a matrix of data contained in an object of class quad.tree.

USAGE:

find.neighbor(x, quadtree=quad.tree(x), k=1, metric="euclidean",  
              max.dist=NULL, drop.self=F) 

REQUIRED ARGUMENTS:

x
a vector (or matrix) containing the multidimensional point(s) at which the nearest neighbors are desired. The vector must have the same number of elements as the number of columns in the numeric matrix used to construct quadtree. If a matrix is used, the matrix must have the same number of columns as the numeric matrix used to construct quadtree, and nearest neighbors are found for each row in the matrix.

OPTIONAL ARGUMENTS:

quadtree
an object of class "quad.tree" containing the sorted matrix of data for which a nearest neighbor search is desired. Defaults to quad.tree(x) if x is a matrix but it is required when x is a vector.
k
the number of nearest neighbors to be found. If the data x is the same data that was used to construct the " quad.tree" object, then k = 1 results in each element having itself as its own nearest neighbor.
metric
a character string giving the metric to be used when finding "nearest" neighbors. Partial matching is allowed. Possible values are: "euclidean", "city block", and "maximum absolute value" for the l_2, l_1, and l-infinity norm, respectively. For two vectors x and y, these are defined as



l_1 = sum | x[i] - y[i] |,



L_2 = sqrt(sum ( x[i] - y[i] )^2), and



L_infinity = max | x[i] - y[i] |

max.dist
if max.dist is given, argument k is ignored, and all of the neighbors within distance max.dist of each row in x are found.
drop.self
a logical value, if TRUE then rows with distances equal to 0 and index1 == index2 (self neighbors) are dropped from the returned object. This definition retains coincident points as neighbors although their distance apart is zero. If quadtree is not supplied, k=1, and drop.self=T, a warning is printed (since this results in nothing being returned) and the value of k is set to 2.

VALUE:

a matrix with three named columns:
index1
if x is a matrix, the row in x for this nearest neighbor. If x is not a matrix, the value 1.
index2
the row in the matrix from which the quad tree was formed for this nearest neighbor. If the quad tree was formed from a matrix y, then x[index1[i],] and y[index2[i],] are neighbors.
distances
the corresponding nearest neighbor distances.

DETAILS:

An efficient recursive algorithm is used to find all nearest neighbors. First the quad tree is traversed to find the leaf with medians nearest the point for which neighbors are desired. Then all observations in the leaf are searched to find nearest neighbors. Finally, if necessary, adjoining leaves are searched for nearest neighbors.

REFERENCES:

Friedman, J., Bentley, J. L., and Finkel, R. A. (1977). An algorithm for finding best matches in logarithmic expected time. ACM Transactions on Mathematical Software 3, 209-226.

SEE ALSO:

.

EXAMPLES:

x <- cbind(sids$easting, sids$northing) 
sids.nhbr <- find.neighbor(x, max.dist = 30) 
# Find the nearest neighbors for the Lansing hickories 
hickory <- lansing[lansing[,3] == "hickory", 1:2] 
hickory.nhbr1 <- find.neighbor(hickory, k=2, drop.self=T) 
# Now find the closest maple for each hickory 
maple <- lansing[lansing[,3] == "maple", 1:2] 
hmn <- find.neighbor(hickory, quad.tree(maple)) 
# and plot the tree locations with lines joing the neighbors 
par(pty='s') 
plot(maple[,1], maple[,2], pch=16)  
points(hickory[,1], hickory[,2], pch=1, col=2) 
segments(hickory[hmn[,1],1], hickory[hmn[,1],2], 
     maple[hmn[,2],1], maple[hmn[,2],2])