# splitData

## Purpose

Returns test and training splits for a single matrix of variables.

## Format

{ X_train, X_test } = splitData(X, train_pct)

Parameters:

- X (Nx1 vector or NxP matrix) – The matrix to split.
- train_pct (Scalar) – The share of observations to include in the training set, given as a fraction (for example, 0.67 for roughly two-thirds).

Returns:

- X_train ((train_pct * N) x P matrix) – The observations selected for the training set.
- X_test (matrix) – The remaining observations from the original X that were not selected for the training set.

## Examples

library gml;

// Set seed for repeatable sampling
rndseed 23324;

X = { 1   3,
      9   6,
      6   1,
      8   4,
      9   5,
      1   8 };

// Shuffle data and create training set with 2/3 of
// the observations and 1/3 for the test set
{ X_train, X_test } = splitData(X, 0.67);


After the above code:

X_train = 9    5
          1    3
          8    4
          1    8

X_test =  9    6
          6    1

## Remarks

The observations (rows) of X are kept together. For repeatable shuffling, use the rndseed keyword before calling splitData().
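
Because rows are kept together, one way to split a target variable alongside its predictors is to concatenate them into a single matrix first and separate the columns afterwards. The sketch below assumes a hypothetical 6x1 target vector `y` (not part of the original example) and uses only the horizontal concatenation operator `~` and standard indexing; it is an illustration, not the library's prescribed workflow.

```gauss
library gml;

// Set seed for repeatable sampling
rndseed 23324;

// Hypothetical target values, one per observation (illustration only)
y = { 0, 1, 0, 1, 1, 0 };

X = { 1 3,
      9 6,
      6 1,
      8 4,
      9 5,
      1 8 };

// Put y in the first column so each row carries its own target value
data = y ~ X;

// Split the combined matrix; each row moves as a unit
{ data_train, data_test } = splitData(data, 0.67);

// Separate the target (column 1) from the predictors (remaining columns)
y_train = data_train[., 1];
X_train = data_train[., 2:cols(data_train)];

y_test = data_test[., 1];
X_test = data_test[., 2:cols(data_test)];
```

Splitting `y` and `X` in two separate calls would rely on both calls drawing the same shuffle, so combining them first keeps each target value paired with its own row.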