Data cleaning
Size
| Returns the number of columns in a matrix, string array or dataframe. | |
| Returns the number of dimensions in a matrix, string array or n-dimensional array. | |
| Returns the dimensions of a matrix, string array or n-dimensional array. | |
| Returns the number of rows in a matrix, string array or dataframe. | 
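As a minimal sketch of the row and column counts described above, the snippet below assumes the GAUSS functions `rows` and `cols`. The function names do not appear in this listing, so treat them as assumptions; the matrix `x` is purely illustrative.

```gauss
// Illustrative 3x2 matrix (hypothetical data)
x = { 1 2,
      3 4,
      5 6 };

// Assumed function names: rows() and cols()
r = rows(x);    // r is 3
c = cols(x);    // c is 2

print "rows:" r;
print "cols:" c;
```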
Selection
| Removes variables from a dataframe specified by index or name. | |
| Removes rows of data based on a logical expression. | |
| Removes observations (rows) from a dataframe by index. | |
| Extracts the diagonal of a matrix. | |
| Gets a contiguous matrix from an N-dimensional array. | |
| Returns the first  | |
| Keeps rows of data based on a logical expression. | |
| Extracts a submatrix from a matrix. | |
| Extracts an Nx1 vector of elements from an NxK matrix. | |
| Returns the last  | |
| Trims rows from the top or bottom. | 
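The two logical-expression entries above (keeping rows and removing rows) can be sketched as follows. This assumes the GAUSS functions `selif` and `delif`, whose names are not shown in the listing; the data and variable names are hypothetical.

```gauss
// Illustrative 4x2 matrix (hypothetical data)
x = { 1 10,
      2 20,
      3 30,
      4 40 };

// Logical expression: 1 where the first column exceeds 2, else 0
e = x[., 1] .> 2;

// Assumed function name: selif() keeps rows where e is 1
kept = selif(x, e);       // rows { 3 30, 4 40 }

// Assumed function name: delif() removes rows where e is 1
dropped = delif(x, e);    // rows { 1 10, 2 20 }

print kept;
print dropped;
```

The same 0/1 vector drives both calls, so the two results partition the original rows.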
Merging
| Vertically concatenates (or stacks) two dataframes. | |
| Performs a left or full outer join on two matrices based on user-specified key columns. | |
| Inserts one or more new columns into a matrix or dataframe at a specified location. | |
| Joins two matrices or dataframes based on user-specified key columns, removing non-matching rows. | |
| Returns elements from  | 
Duplicate observations
| Drops duplicate observations from data. | |
| Identifies duplicate observations and prints a report. | |
| Checks if all observations in the matrix or dataframe are unique. | |
| Returns a binary vector with a one for every row that is unique, otherwise a zero. | 
Missing values
| Replaces missing values in the columns of a matrix by a specified imputation method. | |
| Returns true if the argument contains an infinity, NaN, or missing value. | |
| Returns 1 if the matrix has any missing values, 0 otherwise. | |
| Creates a scalar missing value, or converts (or replaces) specified elements in a matrix to GAUSS’s missing value code. | |
| Converts numeric values to the missing value code according to the values given in a logical expression. | |
| Controls the symbol printed to represent missing values. | |
| Deletes the rows of a matrix that contain any missing values. | |
| Returns 1 if the input is a scalar missing value. | 
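A short sketch of a typical missing-value workflow follows: flag a sentinel value as missing, check for missing values, then drop the incomplete rows. It assumes the GAUSS functions `miss`, `ismiss` and `packr`, none of which are named in this listing; the sentinel `-999` and the data are hypothetical.

```gauss
// Illustrative data where -999 marks an unrecorded value (hypothetical)
x = { 1    2,
      -999 4,
      5    6 };

// Assumed function name: miss() converts elements equal to -999
// to the GAUSS missing value code
x_miss = miss(x, -999);

// Assumed function name: ismiss() returns 1 if any element is missing
has_missing = ismiss(x_miss);    // has_missing is 1

// Assumed function name: packr() deletes rows containing missing values
x_clean = packr(x_miss);         // rows { 1 2, 5 6 }

print has_missing;
print x_clean;
```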
Searching
| Indicates whether elements in a matrix fall between a specified lower and upper bound. | |
| Indicates whether one matrix, multidimensional array or string array contains any elements from another symbol. | |
| Returns the number of elements of a vector falling in specified ranges. | |
| Returns the weighted count of elements of a vector falling in specified ranges. | |
| Returns indices of elements falling within a specified range. | |
| Checks one numeric vector against another and returns the indices of the elements of the first vector in the second vector. | |
| Checks whether a symbol is an empty matrix. | |
| Checks whether each element of a matrix or string array matches any element from a separate symbol. | |
| Returns the row number of the largest element in each column of a matrix. | |
| Returns the row number of the smallest element in each column of a matrix. | |
| Checks whether any element in each row of a matrix or string array matches any element from a separate symbol. | 
Sorting and set functions
| Returns the intersection of two vectors. | |
| Returns the unique elements in one vector that are not present in a second vector. | |
| Sorts a numeric matrix, character matrix or string array. | |
| Returns the sorted index of x. | |
| Sorts a matrix on multiple columns. | |
| Sorts the columns of a matrix of numeric or character data, with respect to a specified row. | |
| Returns the union of two vectors. | |
| Sorts and removes duplicate elements from a vector. | |
| Computes the sorted index of x, leaving out duplicate elements. | 
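The sorting and set operations above can be sketched on two small numeric vectors. The snippet assumes the GAUSS functions `sortc`, `intrsect`, `union` and `setdif`, and that a final argument of `1` selects numeric comparison. Neither the names nor that flag convention appear in this listing, so both are assumptions.

```gauss
// Illustrative numeric column vectors (hypothetical data)
v1 = { 3, 1, 2, 5 };
v2 = { 2, 5, 7 };

// Assumed function name: sortc() sorts on the given column, ascending
sorted = sortc(v1, 1);            // 1, 2, 3, 5

// Assumed function names for the set operations; the trailing 1
// is assumed to mean "numeric data"
common = intrsect(v1, v2, 1);     // 2, 5
either = union(v1, v2, 1);        // 1, 2, 3, 5, 7
only_v1 = setdif(v1, v2, 1);      // 1, 3

print sorted;
print common;
print either;
print only_v1;
```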
String and categorical variables
| Returns the unique set of column labels and corresponding key values for a categorical variable. | |
| Replaces the labels in a categorical variable of a dataframe. | |
| Changes the order of the labels in a categorical variable of a dataframe. | |
| Sets a specified category to be the base case for a categorical variable. | 
These functions can be used to fix errors in categorical labels.
| Replaces a substring within a categorical label or string element. | |
| Converts a string or categorical variable of a dataframe to a numeric variable. | |
| Strips all white space characters from the left and right side of each element in a categorical variable or string array. | |
| Strips all white space characters from the left side of each element in a categorical variable or string array. | |
| Strips all white space characters from the right side of each element in a categorical variable or string array. | 
Transform
| Allows a new variable to be created (coded) with different values depending upon which one of a set of logical expressions is true. | |
| Converts a GAUSS dataframe in wide panel format to long panel format. | |
| Converts a GAUSS dataframe in long panel format to wide panel format. | |
| Inserts a vector into the diagonal of a matrix. | |
| Creates a set of dummy (0/1) variables by breaking up a variable into specified categories. The highest (rightmost) category is unbounded on the right. | |
| Creates a set of dummy (0/1) variables. The highest (rightmost) category is bounded on the right. | |
| Creates a set of dummy (0/1) variables by breaking up a variable into specified categories. The highest (rightmost) category is unbounded on the right, and a specified column of dummies is dropped. | |
| Lags (or leads) a matrix a specified number of time periods for time series analysis. | |
| Lags (or leads) a vector a specified number of time periods and removes the incomplete rows. | |
| Performs an element by element comparison of two matrices and returns the maximum value for each element. | |
| Performs an element by element comparison of two matrices and returns the minimum value for each element. | |
| Reorders a matrix based on a user-specified ordering. Relocates columns to the beginning of the dataset in the order in which the variables are specified. | |
| Computes time series differences of panel data. | |
| Computes time series lags of panel data. | |
| Replaces specified values of a matrix, array or string array. | |
| Replaces values of a matrix or array within specified ranges. | |
| Reverses the order of rows of a matrix. | |
| Reshapes a dataframe, matrix or string array to new dimensions. | |
| Rotates the rows of a matrix, wrapping elements as necessary. | |
| Shifts (lags or leads) columns of a matrix, filling in holes with a specified value. | |
| Shifts rows of a matrix, filling in holes with a specified value. | |
| Changes the values in a vector depending on the category a particular element falls in. | |
| Substitutes new values for old values in a matrix, depending on the outcome of a logical expression. | |
| Stacks columns or rows of a matrix to form a single column. | |
| Reshapes the lower triangular portion of a symmetric matrix into a column vector. | |
| Expands a column vector into a symmetric matrix. | 
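A few of the transformations above can be sketched together: lagging a series, reversing row order, and converting a symmetric matrix to and from its lower-triangular vector form. The snippet assumes the GAUSS functions `lagn`, `rev`, `vech` and `xpnd`, whose names are not given in this listing; all data are hypothetical.

```gauss
// Illustrative time series (hypothetical data)
y = { 1, 2, 3, 4 };

// Assumed function name: lagn() lags by the given number of periods;
// the first element of the result is a missing value
y_lag = lagn(y, 1);    // ., 1, 2, 3

// Assumed function name: rev() reverses the order of the rows
y_rev = rev(y);        // 4, 3, 2, 1

// Illustrative symmetric matrix
s = { 1 2,
      2 3 };

// Assumed function name: vech() stacks the lower triangle into a column
v = vech(s);           // 1, 2, 3

// Assumed function name: xpnd() expands the column back into
// a symmetric matrix
s2 = xpnd(v);          // equals s

print y_lag;
print y_rev;
print v;
print s2;
```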
Scaling and normalization
| Scales the columns of a matrix using a specified centering and scaling method. | 
