normalizecollabels
Purpose
Finds identical labels that are assigned different keys in a dataframe and merges them so that every label is unique.
When identical labels are merged, all references to the key of the duplicate label are updated to point to the key that is kept.
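The merge described above can be illustrated with a small, language-neutral sketch in Python. It is not the GAUSS implementation: the function name `normalize_labels`, the list-based representation of labels, keys, and row codes, and the choice to keep the first key seen for each label are all assumptions made for illustration.

```python
def normalize_labels(labels, keys, codes):
    """Merge duplicate labels and remap row codes (illustrative only).

    labels[i] is the label attached to keys[i]; codes holds one key per row.
    Assumption: the first key seen for a label is the one that is kept.
    """
    kept_key = {}  # label -> key that survives the merge
    remap = {}     # every original key -> surviving key
    for label, key in zip(labels, keys):
        remap[key] = kept_key.setdefault(label, key)

    unique_labels = list(kept_key)                 # insertion order preserved
    unique_keys = [kept_key[lab] for lab in unique_labels]
    new_codes = [remap[c] for c in codes]          # update references to merged keys
    return unique_labels, unique_keys, new_codes


# "low" appears twice, under keys 0 and 2
labels = ["low", "high", "low"]
keys = [0, 1, 2]
codes = [0, 1, 2, 2, 1]

print(normalize_labels(labels, keys, codes))
# (['low', 'high'], [0, 1], [0, 1, 0, 0, 1])
```

After the merge, rows that referred to key 2 refer to key 0, so both groups of "low" observations share a single category.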
Format
- x_norm = normalizecollabels(x[, columns])
- Parameters:
x (NxK dataframe) – Data.
columns (scalar, Mx1 vector, string, or string array) – Optional. The names or indices of the string/category columns in x to normalize. If omitted, all string/category columns are processed.
- Returns:
x_norm (NxK dataframe) – Data with normalized string/categorical variables.
Remarks
The normalizecollabels() procedure is useful when cleaning and merging categorical variables that may come from different sources. It is primarily a convenience function used by multiple string-related functions, and in general end users should not need to call it explicitly.
See also