knnFit

Purpose

Builds a K-D tree model from training data for efficient k-nearest neighbors (KNN) predictions.

Format

mdl = knnFit(y, X, k)
Parameters:
  • y (Nx1 vector or string array) – The dependent, or target, variable.
  • X (NxP matrix) – The independent variables.
  • k (scalar) – The number of nearest neighbors to search for each prediction.
Returns:

mdl (struct) – An instance of a knnModel structure. For an instance named mdl, the members will be:

mdl.opaqueModel Column vector, containing the K-D tree in opaque form.
mdl.classIndices Cx1 matrix, where C is the number of classes in the target vector y.
mdl.classNames Cx1 string array, where C is the number of classes in the target vector y, containing the class names if the target vector was a string array.
mdl.k Scalar, the number of neighbors to search.
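
As a minimal sketch of reading these members, assuming mdl has already been returned by knnFit with a string-array target (as in the Examples section below):

```gauss
// Assumes 'mdl' is a knnModel structure returned by knnFit
print "neighbors (k) = " mdl.k;
print "number of classes = " rows(mdl.classNames);

// Class names are populated when the target vector was a string array
print mdl.classNames;
```

The opaqueModel member is not meant to be inspected directly; it is passed along inside mdl to knnClassify.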

Examples

new;
library gml;

// Get file name with full path
fname = getGAUSSHome() $+ "pkgs/gml/examples/iris.csv";

// Load numeric predictors
X = loadd(fname, ". -Species");

// Load string labels
species = loaddSA(fname, "Species");

// Set seed for repeatable train/test sampling
rndseed 423432;

// Split data into (70%) train and (30%) test sets
{ y_train, y_test, X_train, X_test } = trainTestSplit(species, X, 0.7);

/*
** Train the model
*/

k = 3;

struct knnModel mdl;
mdl = knnFit(y_train, X_train, k);

/*
** Predictions on the test set
*/

y_hat = knnClassify(mdl, X_test);

print "prediction accuracy = " meanc(y_hat .$== y_test);

The above code will print the following output:

prediction accuracy = 0.956
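
Because k is chosen by the user, it can be useful to compare several values. The following is a sketch only: it reuses y_train, X_train, y_test and X_test from the example above, and the names k_vals, mdl_i and y_hat_i as well as the candidate values are illustrative assumptions, not part of the library.

```gauss
// Candidate neighbor counts (arbitrary illustrative values)
k_vals = { 1, 3, 5, 7 };

// Declare the model structure once, before the loop
struct knnModel mdl_i;

for i(1, rows(k_vals), 1);
    // Fit a model with the current number of neighbors
    mdl_i = knnFit(y_train, X_train, k_vals[i]);

    // Classify the held-out test observations
    y_hat_i = knnClassify(mdl_i, X_test);

    // Share of test labels predicted correctly
    print "k = " k_vals[i] " accuracy = " meanc(y_hat_i .$== y_test);
endfor;
```

Accuracy measured this way depends on the particular train/test split, so the comparison is only indicative.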

See also

knnClassify()