setcoltypes#
Purpose#
Set the types of columns in a matrix or dataframe. This is an alias for dftype().
Format#
- x_meta = setColTypes(X, types[, columns])#
- Parameters:
X (NxK matrix) – data.
types (Mx1 vector) – Specifies the types to assign to the columns specified in columns. Valid options include: "string", "date", "number", and "category".
columns (Mx1 vector) – Optional argument, indices of columns in X to assign types to. Default = all columns.
- Returns:
x_meta (NxK dataframe) – Data with the types specified in types assigned to the columns specified in columns.
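As a minimal sketch of the optional columns argument, the following assigns types to two columns of a small made-up matrix. It assumes that the elements of types and columns are matched one to one, as the parameter descriptions suggest; the matrix values and the name X_meta are illustrative only.

```gauss
// Hypothetical 2x3 matrix: POSIX seconds, integer codes, plain numbers
X = { 0     1 10.5,
      86400 2 20.5 };

// Assign "date" to column 1 and "category" to column 2.
// Column 3 is not listed in 'columns', so no type is requested for it.
X_meta = setcoltypes(X, "date" $| "category", 1 | 2);

print X_meta;
```

Passing the column indices explicitly limits the type assignment to those columns, whereas omitting columns applies types to all columns by default.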
Remarks#
Date Variables#
When a numeric column is set to type date with setcoltypes():
The data from the column is interpreted as POSIX time (seconds since Jan 1, 1970).
The default date format will be used. This can be changed with asdate().
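The two points above can be sketched as follows. The call form asdate(x, fmt) with a strftime-style format string is an assumption made for illustration and should be checked against the asdate() reference; the value 86400 and the variable names are hypothetical.

```gauss
// 86400 seconds after the epoch, i.e. Jan 2, 1970
x = 86400;

// Interpret the number as a date using the default date format
x_date = setcoltypes(x, "date", 1);

// Assumed call form: request a different display format with asdate()
x_fmt = asdate(x, "%d/%m/%Y");

print x_date;
print x_fmt;
```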
Categorical and String Variables#
When a numeric column is set to type category or string with setcoltypes():
Each value will be converted to an integer to create the keys. The labels will be the string version of the number.
When a categorical or string variable is converted to a numeric column:
The updated column will contain the numeric keys associated with the string or category labels.
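A short round-trip sketch of the behavior described above, using only setcoltypes(). The input values and variable names are made up for illustration.

```gauss
// Hypothetical integer codes
x = { 3,
      1,
      3,
      2 };

// Numeric -> category: per the remarks above, the keys come from the
// integer values and the labels are their string versions
x_cat = setcoltypes(x, "category", 1);

// Category -> numeric: the column now holds the numeric keys
x_num = setcoltypes(x_cat, "number", 1);

print x_cat;
print x_num;
```

Because the keys are created from the original integer values, converting back to a number is expected to recover those values rather than a re-indexed sequence.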
Examples#
Example 1: POSIX time numeric column to date column#
secs_per_day = 24 * 60 * 60;
// Create a 2x1 vector
x = 0 | secs_per_day;
// Set the numeric vector to be a date
x = setcoltypes(x, "date", 1);
print x;
Since the date vector is interpreted as seconds since Jan 1, 1970, the code above will print:
X1
1970-01-01
1970-01-02
Example 2: Category to number#
// Load 'cycles' and load 'amplitude' as a categorical variable
fname = getGAUSSHome("examples/yarn.xlsx");
yarn = loadd(fname, "cat(amplitude) + cycles");
// Set the first column to be a numeric column
yarn_n = setcoltypes(yarn, "number", 1);
After the above code, the first few rows look like this:
yarn = amplitude  cycles    yarn_n = amplitude  cycles
             low     674                     1     674
             low     370                     1     370
             low     292                     1     292
             med     338                     2     338
Example 3: Integer column to category#
x = { 1,
0,
1,
2 };
// Make 'x' a dataframe and set its
// only column to be a category
x = setcoltypes(x, "category", 1);
After the above code, x will be a dataframe as shown below:
X1
1
0
1
2
We can get the categorical labels and key values like this:
{ labels, keys } = getcollabels(x, 1);
They will equal:
labels = "0"    keys = 0
         "1"           1
         "2"           2
We can set new labels with recodecatlabels() like this:
// Set the labels for 0, 1, and 2 to be
// alpha, beta and gamma
x = recodecatlabels(x, labels, "alpha"$|"beta"$|"gamma", 1);
Now x will be the following dataframe:
X1
beta
alpha
beta
gamma
See also
Functions dfname(), setColLabels(), setColMetadata(), asdate()