knnFit

Purpose

Builds a K-D tree model from training data so that k-nearest neighbors (KNN) predictions can be computed efficiently.

Format

mdl = knnFit(y, X, k)

Parameters

  • y (Nx1 vector or string array) – The dependent, or target, variable.

  • X (NxP matrix) – The independent variables.

  • k (Scalar) – The number of nearest neighbors to use for each prediction.

Returns

mdl (struct) – An instance of a knnModel structure. For an instance named mdl, the members will be:

mdl.opaqueModel

Column vector, containing the K-D tree in opaque form.

mdl.classIndices

Px1 matrix, where P is the number of classes in the target vector y (here P is not the number of columns of X).

mdl.classNames

Px1 string array, where P is the number of classes in the target vector y, containing the class names if the target vector was a string array.

mdl.k

Scalar, the number of neighbors to search.
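The call signature and the returned structure can be illustrated with a small numeric target. This is a minimal sketch, not taken from the original documentation: the data values and the names X_new and y_hat are made up for illustration, and it assumes the gml library is installed.

```gauss
new;
library gml;

// Hypothetical toy data: two well-separated groups of points
X = { 1 1,
      1 2,
      2 1,
      8 8,
      8 9,
      9 8 };

// Numeric class labels, one per row of X
y = { 0, 0, 0, 1, 1, 1 };

// Declare the structure, then fit with k = 3 neighbors
struct knnModel mdl;
mdl = knnFit(y, X, 3);

// The number of neighbors is stored in the model
print "neighbors stored in model = " mdl.k;

// Classify two new (made-up) observations
X_new = { 1.5 1.5,
          8.5 8.5 };
y_hat = knnClassify(mdl, X_new);
print y_hat;
```

Because y is numeric in this sketch, mdl.classNames is not expected to carry class names; per the description above, it is filled when the target is a string array.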

Examples

new;
library gml;

// Get file name with full path
fname = getGAUSSHome() $+ "pkgs/gml/examples/iris.csv";

// Load numeric predictors
X = loadd(fname, ". -Species");

// Load string labels
species = loaddSA(fname, "Species");

// Set seed for repeatable train/test sampling
rndseed 423432;

// Split data into (70%) train and (30%) test sets
{ y_train, y_test, X_train, X_test } = trainTestSplit(species, X, 0.7);

/*
** Train the model
*/

k = 3;

struct knnModel mdl;
mdl = knnFit(y_train, X_train, k);

/*
** Predictions on the test set
*/

y_hat = knnClassify(mdl, X_test);

print "prediction accuracy = " meanc(y_hat .$== y_test);

The above code will print the following output:

prediction accuracy = 0.956
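Since k is chosen by the caller, it can be useful to refit the model for several values and compare test accuracy. The following is a sketch under the same data, seed, and split as the example above; the range of k values (odd numbers from 1 to 9) is an arbitrary choice for illustration, and no particular accuracy figures are implied.

```gauss
new;
library gml;

// Same data and split as in the example above
fname = getGAUSSHome() $+ "pkgs/gml/examples/iris.csv";
X = loadd(fname, ". -Species");
species = loaddSA(fname, "Species");

rndseed 423432;
{ y_train, y_test, X_train, X_test } = trainTestSplit(species, X, 0.7);

struct knnModel mdl;

// Refit and score the model for k = 1, 3, 5, 7, 9
for k(1, 9, 2);
    mdl = knnFit(y_train, X_train, k);
    y_hat = knnClassify(mdl, X_test);
    print "k = " k " accuracy = " meanc(y_hat .$== y_test);
endfor;
```

Scoring on a single test split gives only a rough comparison; the value of k that looks best can change with a different seed or split.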

See also

knnClassify()