normalizecollabels

Purpose

Finds identical labels that are assigned different keys in the string/category columns of a dataframe and merges them so that every label is unique.

When identical labels are merged, every reference to the key of a duplicate label is updated to point to the key that is kept.
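The merge described above can be pictured with a small conceptual sketch. It is written in Python purely for illustration and is not the GAUSS implementation. It assumes a category column is stored as integer keys plus a key-to-label table, and that the lowest key is the one kept for each label; both are assumptions, and `normalize_labels`, `keys`, and `labels` are hypothetical names.

```python
def normalize_labels(keys, labels):
    """Merge keys that share the same label text.

    keys   -- list of integer keys, one per row (hypothetical storage layout)
    labels -- dict mapping key -> label text
    Returns (new_keys, new_labels) in which each label appears exactly once.
    """
    canonical = {}  # label text -> key kept for that label
    remap = {}      # every original key -> key that replaces it

    # Visit keys in ascending order so the lowest key for each label is kept
    # (an assumption made for this sketch only).
    for key in sorted(labels):
        kept = canonical.setdefault(labels[key], key)
        remap[key] = kept

    # Update every row that referenced a duplicate key.
    new_keys = [remap[k] for k in keys]
    new_labels = {key: label for label, key in canonical.items()}
    return new_keys, new_labels


# Keys 0 and 2 both carry the label "red", so key 2 is folded into key 0.
keys = [0, 1, 2, 1, 0]
labels = {0: "red", 1: "blue", 2: "red"}

new_keys, new_labels = normalize_labels(keys, labels)
print(new_keys)    # [0, 1, 0, 1, 0]
print(new_labels)  # {0: 'red', 1: 'blue'}
```

After the merge, the column still displays the same label on every row; only the underlying keys have changed.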

Format

x_norm = normalizecollabels(x[, columns])
Parameters:
  • x (NxK dataframe) – Data containing the string/category columns to normalize.

  • columns (Mx1 scalar or string/string array) – Optional. The names or indices of the string/category columns in x to normalize. If omitted, all string/category columns are processed.

Returns:

x_norm (NxK dataframe) – Data with normalized string/categorical variables.

Remarks

The normalizecollabels() procedure is useful when cleaning and merging categorical variables that may come from different sources. It is primarily a convenience function used internally by several string-related functions, so end users generally do not need to call it directly.

See also

dfappend()