# knnFit

## Purpose

Creates a K-D tree model from training data for efficient KNN predictions.

## Format

mdl = knnFit(y, X, k)

Parameters

• y (Nx1 vector or string array) – The dependent, or target, variable.

• X (NxP matrix) – The independent variables.

• k (Scalar) – The number of neighbors.

Returns

mdl (struct) –

An instance of a knnModel structure. For an instance named mdl, the members will be:

• mdl.opaqueModel – Column vector, containing the K-D tree in opaque form.

• mdl.classIndices – Px1 matrix, where P is the number of classes in the target vector y.

• mdl.classNames – Px1 string array, where P is the number of classes in the target vector y, containing the class names if the target vector was a string array.

• mdl.k – Scalar, the number of neighbors to search.

## Examples

new;
library gml;

// Get file name with full path
fname = getGAUSSHome() $+ "pkgs/gml/examples/iris.csv";

// Load numeric predictors
X = loadd(fname, ". -Species");

// Load string labels
species = loaddSA(fname, "Species");

// Set seed for repeatable train/test sampling
rndseed 423432;

// Split data into (70%) train and (30%) test sets
{ y_train, y_test, X_train, X_test } = trainTestSplit(species, X, 0.7);

/*
** Train the model
*/
k = 3;

struct knnModel mdl;
mdl = knnFit(y_train, X_train, k);

/*
** Predictions on the test set
*/
y_hat = knnClassify(mdl, X_test);

print "prediction accuracy = " meanc(y_hat .$== y_test);


The above code will print the following output:

prediction accuracy = 0.956
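
As a further sketch, the members of the returned knnModel structure can be inspected directly after fitting. The toy data below (X_toy and y_toy) is invented for illustration and is not part of the gml examples. It assumes, per the parameter description above, that knnFit accepts a string array target, in which case mdl.classNames should hold the class names.

```
new;
library gml;

// Hypothetical toy data: two well-separated groups of three points each
X_toy = { 1.0 1.1,
          0.9 1.0,
          1.2 0.8,
          5.0 5.1,
          4.9 5.2,
          5.1 4.8 };

// Hypothetical string labels, one per row of X_toy
y_toy = "a" $| "a" $| "a" $| "b" $| "b" $| "b";

// Fit the model with k = 3 neighbors
struct knnModel mdl;
mdl = knnFit(y_toy, X_toy, 3);

// Number of neighbors stored in the model
print "k = " mdl.k;

// Class names recovered from the string array target
print "class names:";
print mdl.classNames;
```

The mdl.opaqueModel member is not printed here, since it stores the K-D tree in opaque form and is meant to be passed to knnClassify rather than read directly.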