kmeansPredict
Purpose
Partitions data into k clusters by assigning each observation to the nearest of k user-supplied centroids.
Format
assignments = kmeansPredict(mdl, X)
assignments = kmeansPredict(centroids, X)
Parameters:
 mdl (struct) – Instance of a kmeansModel structure.
 centroids (kxP matrix) – Cluster centers.
 X (NxP matrix) – The data to partition.
Returns: assignments (Nx1 matrix) – The cluster to which each corresponding row of X has been assigned. Range = 1 to k.
Examples
Example 1: Basic example with a matrix of centroids.
library gml;
centroids = { 2 0,
              2 3 };
X = { 1 1,
      0 2,
      2 0 };
// Assign each row of 'X' to either cluster 1 or cluster 2
assignments = kmeansPredict(centroids, X);
The above code will set assignments equal to:
1
2
1
because the points (1,1) and (2,0) are closer (by Euclidean distance) to the first centroid at (2,0), while the second row of X, (0,2), is closer to the second centroid at (2,3).
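The nearest-centroid rule described above can also be worked through by hand. The following is a minimal sketch of that rule using only core GAUSS functions. It is not how kmeansPredict is implemented internally, and the data values and the variable names my_centroids, my_X and dist are made up for illustration.

```gauss
// Hypothetical illustration data (not taken from the example above)
my_centroids = { 0 0,
                 10 10 };
my_X = { 1 1,
         9 8,
         0 2 };

k = rows(my_centroids);
n = rows(my_X);

// dist[i, j] will hold the squared Euclidean distance
// from row i of my_X to centroid j
dist = zeros(n, k);

for j(1, k, 1);
    dist[., j] = sumr((my_X - my_centroids[j, .])^2);
endfor;

// minindc returns the index of the smallest element in each column,
// so transpose 'dist' to get one index per observation
assignments = minindc(dist');

print assignments;
```

Squared distances are enough here because taking the square root does not change which centroid is nearest. With the values above, rows 1 and 3 fall in cluster 1 and row 2 falls in cluster 2.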
Example 2: Use centroids from a kmeansModel structure
new;
library gml;
// Get dataset with full name
fname = getGAUSSHome() $+ "pkgs/gml/examples/iris.csv";
// Load all variables except 'species'
X = loadd(fname, ". -species");
// For repeatable sample
rndseed 234234;
// Split data into x_train and x_test
{ x_train, x_test } = splitData(X, 0.70);
// Number of clusters
n_clusters = 3;
// Declare kmeansModel struct
struct kmeansModel mdl;
// Fit kmeans model
mdl = kmeansFit(x_train , n_clusters);
// Assign test data to clusters
test_clusters = kmeansPredict(mdl, x_test);
print "The first three cluster assignments are: " test_clusters[1:3];
The above code will print the following:
The first three cluster assignments are:
2
2
1
See also
kmeansFit(), kmeansControlCreate()