k-nearest neighbour classification for test set from training set. For
each row of the test set, the k nearest (in Euclidean distance)
training set vectors are found, and the classification is decided by
majority vote, with ties broken at random. If there are ties for the
kth nearest vector, all candidates are included in the vote.
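
The voting rule described above can be sketched in a few lines of base R
for a single test row. This is only an illustration under stated
assumptions, not the code behind knn: the helper name knn_one and the toy
data are hypothetical, and the sketch covers neither l nor use.all=F.

```r
# Hypothetical helper (not part of the documented interface): classify one
# test row x by majority vote among its k nearest training rows.
knn_one <- function(train, x, cl, k = 1) {
  train <- as.matrix(train)
  cl <- as.factor(cl)
  # Euclidean distance from x to every training row
  d <- sqrt(rowSums(sweep(train, 2, x)^2))
  # Distance of the kth nearest neighbour; every row tied at this
  # distance is included in the vote
  kth <- sort(d)[k]
  votes <- table(cl[d <= kth])
  winners <- names(votes)[votes == max(votes)]
  # A tied vote is broken at random
  if (length(winners) > 1) winners <- sample(winners, 1)
  factor(winners, levels = levels(cl))
}

# Made-up toy data: two well-separated groups of three points each
train <- rbind(c(0, 0), c(0, 1), c(1, 0), c(5, 5), c(5, 6), c(6, 5))
cl <- factor(c("a", "a", "a", "b", "b", "b"))
pred_near_a <- knn_one(train, c(0.2, 0.1), cl, k = 3)
pred_near_b <- knn_one(train, c(5.5, 5.2), cl, k = 3)
```

Each query point here has all three of its nearest neighbours in one
group, so the vote is unanimous and no random tie-breaking occurs.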
USAGE:
knn(train, test, cl, k=1, l=0, prob=F, use.all=T)
REQUIRED ARGUMENTS:
train
matrix or data frame of training set cases.
test
matrix or data frame of test set cases. A vector will be interpreted
as a row vector for a single case.
cl
factor of true classifications of training set.
OPTIONAL ARGUMENTS:
k
number of neighbours considered.
l
minimum vote for definite decision, otherwise doubt. (More precisely,
fewer than k-l dissenting votes are allowed, even if k is increased
by ties.)
prob
If this is true, the proportion of the votes for the winning class
is returned as attribute prob.
use.all
controls handling of ties. If true, all training vectors whose
distance equals that of the kth nearest are included. If false, a
random selection of the vectors tied at the kth distance is chosen so
that exactly k neighbours are used.
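
As an illustration of the optional arguments, the sketch below calls knn
with prob and l set. It assumes the R port of this function in the class
package, which takes the same argument list (with TRUE/FALSE in place of
T/F). The toy data are made up for the example.

```r
library(class)  # assumption: R's 'class' package, whose knn() mirrors this usage

# Made-up toy data: two well-separated groups of three points each
train <- rbind(c(0, 0), c(0, 1), c(1, 0), c(5, 5), c(5, 6), c(6, 5))
cl <- factor(c("a", "a", "a", "b", "b", "b"))

# prob = TRUE: the winning vote proportion is attached as attribute "prob"
clear <- knn(train, rbind(c(0.2, 0.1), c(5.5, 5.2)), cl, k = 3, prob = TRUE)
clear_prob <- attr(clear, "prob")

# A single case between the groups: its 3 nearest neighbours vote 2 to 1
between <- c(2.6, 2.6)
lenient <- knn(train, between, cl, k = 3, l = 0)  # a 2-1 vote is accepted
strict  <- knn(train, between, cl, k = 3, l = 3)  # needs 3 votes, so doubt
```

With l=3 and k=3 the vote must be unanimous, so the in-between case is
left in doubt, while the default l=0 accepts the simple majority.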
VALUE:
factor of classifications of test set.
doubt will be returned as NA.
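
Because doubt comes back as NA, it usually needs to be counted
separately from the definite classifications. A small base R sketch,
using a made-up result vector in place of a real knn return value:

```r
# Made-up stand-in for a knn result; NA marks a case left in doubt
pred <- factor(c("a", NA, "b", "a"), levels = c("a", "b"))

n_doubt <- sum(is.na(pred))             # number of doubtful cases
counts <- table(pred, useNA = "ifany")  # per-class counts plus an NA cell
decided <- pred[!is.na(pred)]           # definite classifications only
```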