Data cleaning
Size
| cols | Returns number of columns in a matrix, string array or dataframe. |
| getdims | Returns the number of dimensions in a matrix, string array or n-dimensional array. |
| getorders | Returns the size of each dimension of a matrix, string array or n-dimensional array. |
| rows | Returns number of rows in a matrix, string array or dataframe. |
Selection
| delcols | Removes variables from a dataframe specified by index or name. |
| delif | Removes rows of data based on a logical expression. |
| delrows | Removes observations (rows) from a dataframe by index. |
| diag | Extracts the diagonal of a matrix. |
| getmatrix | Gets a contiguous matrix from an N-dimensional array. |
| head | Returns the first n rows of a matrix, dataframe or string array. |
| selif | Keeps rows of data based on a logical expression. |
| submat | Extracts a submatrix from a matrix. |
| subvec | Extracts an Nx1 vector of elements from an NxK matrix. |
| tail | Returns the last n rows of a matrix, dataframe or string array. |
| trimr | Trims rows from the top or bottom. |
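The row-filtering entries above come in complementary pairs: selif keeps the rows where a logical expression is true, and delif removes them. A minimal sketch of that relationship follows. It is written in Python for illustration only and is not GAUSS syntax; `select_if` and `delete_if` are hypothetical names, and the logical expression is assumed to be supplied as one true/false value per row.

```python
# Illustrative Python only, not GAUSS syntax. select_if and delete_if are
# hypothetical names that mirror the behavior described for selif and delif.

def select_if(rows, mask):
    """Keep the rows whose mask entry is true."""
    return [row for row, keep in zip(rows, mask) if keep]


def delete_if(rows, mask):
    """Remove the rows whose mask entry is true."""
    return [row for row, drop in zip(rows, mask) if not drop]


data = [[1, 10], [2, 20], [3, 30], [4, 40]]

# Logical expression, one entry per row: "second column is greater than 20".
mask = [row[1] > 20 for row in data]

kept = select_if(data, mask)      # rows where the expression is true
removed = delete_if(data, mask)   # rows where the expression is false
```

Applied with the same expression, the two results partition the original rows, so choosing between them is a matter of which condition is easier to state.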
Merging
| dfappend | Vertically concatenates (or stacks) two dataframes. |
| innerJoin | Joins two matrices or dataframes based upon user-specified key columns, with non-matching rows removed. |
| insertcols | Inserts one or more new columns into a matrix or dataframe at a specified location. |
| outerJoin | Performs a left or full outer join on two matrices based upon user-specified key columns. |
| where | Returns elements from a or b, depending on condition. |
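The practical difference between an inner join and a left or full outer join is what happens to rows whose key has no match. The sketch below illustrates this in Python. It is not GAUSS code and does not reproduce the innerJoin or outerJoin signatures; the `join` helper, the dictionary row layout, and the use of `None` as a stand-in for a missing value are all assumptions made for the example.

```python
# Illustrative Python only, not GAUSS syntax. join() is a hypothetical helper;
# rows are dicts and None stands in for a missing value.

def join(left, right, key, how="inner"):
    """Join two lists of dict rows on `key`; how is 'inner', 'left' or 'full'."""
    left_cols = [c for c in left[0] if c != key]
    right_cols = [c for c in right[0] if c != key]

    # Index the right-hand rows by key for quick lookup.
    right_by_key = {}
    for r in right:
        right_by_key.setdefault(r[key], []).append(r)

    out = []
    matched = set()
    for l in left:
        matches = right_by_key.get(l[key], [])
        if matches:
            matched.add(l[key])
            for r in matches:
                out.append({**l, **r})
        elif how in ("left", "full"):
            # Unmatched left row: keep it, fill right-hand columns with missing.
            out.append({**l, **{c: None for c in right_cols}})

    if how == "full":
        # Unmatched right rows are kept only in a full outer join.
        for r in right:
            if r[key] not in matched:
                out.append({**{c: None for c in left_cols}, **r})
    return out


left = [{"id": 1, "x": 10}, {"id": 2, "x": 20}, {"id": 3, "x": 30}]
right = [{"id": 2, "y": 200}, {"id": 3, "y": 300}, {"id": 4, "y": 400}]

inner = join(left, right, "id", how="inner")
left_outer = join(left, right, "id", how="left")
full_outer = join(left, right, "id", how="full")
```

With these inputs the inner join keeps only ids 2 and 3, the left outer join also keeps id 1 with a missing `y`, and the full outer join additionally keeps id 4 with a missing `x`.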
Duplicate observations
| dropduplicates | Drops duplicate observations from data. |
| getduplicates | Identifies duplicate observations and prints report. |
| isunique | Checks if all observations in the matrix or dataframe are unique. |
| isrowunique | Returns a binary vector with a one for every row that is unique, otherwise a zero. |
Missing values
| impute | Replaces missing values in the columns of a matrix by a specified imputation method. |
| isinfnanmiss | Returns true if the argument contains an infinity, NaN, or missing value. |
| ismiss | Returns 1 if matrix has any missing values, 0 otherwise. |
| miss, missrv | Creates a scalar missing value, or converts (or replaces) specified elements in a matrix to GAUSS’s missing value code. |
| missex | Converts numeric values to the missing value code according to the values given in a logical expression. |
| msym | Controls the symbol printed to represent missing values. |
| packr | Deletes the rows of a matrix that contain any missing values. |
| scalmiss | Returns 1 if the input is a scalar missing value. |
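Three of the operations listed above cover most routine handling of missing data: testing whether any value is missing (ismiss), deleting rows that contain a missing value (packr), and replacing missing values with a chosen value (missrv). The following Python sketch illustrates those three behaviors. It is not GAUSS syntax; the function names are hypothetical, and `math.nan` is used as a stand-in for the GAUSS missing value code.

```python
# Illustrative Python only, not GAUSS syntax. math.nan stands in for the
# GAUSS missing value code; all function names here are hypothetical.
import math

MISSING = math.nan


def has_missing(rows):
    """Return True if any element is missing (compare: ismiss)."""
    return any(math.isnan(v) for row in rows for v in row)


def drop_missing_rows(rows):
    """Delete every row that contains a missing value (compare: packr)."""
    return [row for row in rows if not any(math.isnan(v) for v in row)]


def replace_missing(rows, value):
    """Replace each missing element with `value` (compare: missrv)."""
    return [[value if math.isnan(v) else v for v in row] for row in rows]


data = [[1.0, 2.0], [MISSING, 4.0], [5.0, 6.0]]

any_missing = has_missing(data)
complete_rows = drop_missing_rows(data)
filled = replace_missing(data, 0.0)
```

Dropping rows shrinks the sample, while replacement keeps every row but changes the data, so the choice depends on the analysis that follows.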
Searching
| between | Indicates whether elements in a matrix fall between a specified lower and upper bound. |
| contains | Indicates whether one matrix, multidimensional array or string array contains any elements from another symbol. |
| counts | Returns number of elements of a vector falling in specified ranges. |
| countwts | Returns weighted count of elements of a vector falling in specified ranges. |
| indexcat | Returns indices of elements falling within a specified range. |
| indnv | Checks one numeric vector against another and returns the indices of the elements of the first vector in the second vector. |
| isempty | Checks whether a symbol is an empty matrix. |
| ismember | Checks whether each element of a matrix or string array matches any element from a separate symbol. |
| maxindc | Returns row number of largest element in each column of a matrix. |
| minindc | Returns row number of smallest element in each column of a matrix. |
| rowcontains | Checks whether any element in the row of a matrix or string array matches any element from a separate symbol. |
Sorting and set functions
| intrsect | Returns the intersection of two vectors. |
| setdif | Returns the unique elements in one vector that are not present in a second vector. |
| sortc | Sorts a numeric matrix, character matrix or string array. |
| sortind, sortindc | Returns the sorted index of x. |
| sortmc | Sorts a matrix on multiple columns. |
| sortr, sortrc | Sorts the columns of a matrix of numeric or character data, with respect to a specified row. |
| union | Returns the union of two vectors. |
| unique | Sorts and removes duplicate elements from a vector. |
| uniqindx | Computes the sorted index of x, leaving out duplicate elements. |
String and categorical variables
| getcollabels | Returns the unique set of column labels and corresponding key values for a categorical variable. |
| recodeCatLabels | Replaces the labels in a categorical variable of a dataframe. |
| reorderCatLabels | Changes the order of the labels in a categorical variable of a dataframe. |
| setBaseCat | Sets a specified category to be the base case for a categorical variable. |
These functions can be used to fix errors in categorical labels.
| strreplace | Replaces a substring within a categorical label or string element. |
| strtof | Converts a string or categorical variable of a dataframe to a numeric variable. |
| strtrim | Strips all white space characters from the left and right side of each element in a categorical variable or string array. |
| strtriml | Strips all white space characters from the left side of each element in a categorical variable or string array. |
| strtrimr | Strips all white space characters from the right side of each element in a categorical variable or string array. |
Transform
| code | Allows a new variable to be created (coded) with different values depending upon which one of a set of logical expressions is true. |
| dfLonger | Converts a GAUSS dataframe in wide panel format to long panel format. |
| dfWider | Converts a GAUSS dataframe in long panel format to wide panel format. |
| diagrv | Inserts a vector into the diagonal of a matrix. |
| dummy | Creates a set of dummy (0/1) variables by breaking up a variable into specified categories. The highest (rightmost) category is unbounded on the right. |
| dummybr | Creates a set of dummy (0/1) variables. The highest (rightmost) category is bounded on the right. |
| dummydn | Creates a set of dummy (0/1) variables by breaking up a variable into specified categories. The highest (rightmost) category is unbounded on the right, and a specified column of dummies is dropped. |
| lagn | Lags (or leads) a matrix a specified number of time periods for time series analysis. |
| lagTrim | Lags (or leads) a vector a specified number of time periods and removes the incomplete rows. |
| maxv | Performs an element by element comparison of two matrices and returns the maximum value for each element. |
| minv | Performs an element by element comparison of two matrices and returns the minimum value for each element. |
| order | Reorders a matrix based on user-specified ordering. Relocates columns to the beginning of the dataset in the order in which the variables are specified. |
| pdDiff | Computes time series differences of panel data. |
| pdLag | Computes time series lags of panel data. |
| reclassify | Replaces specified values of a matrix, array or string array. |
| reclassifyCuts | Replaces values of a matrix or array within specified ranges. |
| rev | Reverses the order of rows of a matrix. |
| reshape | Reshapes a dataframe, matrix or string array to new dimensions. |
| rotater | Rotates the rows of a matrix, wrapping elements as necessary. |
| shiftc | Shifts, lags or leads, columns of a matrix, filling in holes with a specified value. |
| shiftr | Shifts rows of a matrix, filling in holes with a specified value. |
| subscat | Changes the values in a vector depending on the category a particular element falls in. |
| substute | Substitutes new values for old values in a matrix, depending on the outcome of a logical expression. |
| vec, vecr | Stacks columns or rows of a matrix to form a single column. |
| vech | Reshapes the lower triangular portion of a symmetric matrix into a column vector. |
| xpnd | Expands a column vector into a symmetric matrix. |
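The three dummy-variable entries in the table above (dummy, dummybr, dummydn) differ only in how the highest category is bounded and whether one column is dropped. The Python sketch below illustrates those differences. It is not GAUSS syntax and the function names are hypothetical. It also assumes that break points are given in ascending order and that each category includes its upper break point, which the descriptions above do not state.

```python
# Illustrative Python only, not GAUSS syntax; all names are hypothetical.
# Assumption: categories are right-closed, i.e. v[i-1] < x <= v[i].

def dummy_unbounded(x, breaks):
    """K-1 break points give K columns; the last category has no upper bound."""
    out = []
    for value in x:
        row = [0] * (len(breaks) + 1)
        # The category index is the number of break points strictly below value.
        row[sum(1 for b in breaks if value > b)] = 1
        out.append(row)
    return out


def dummy_bounded(x, breaks):
    """K break points give K columns; values above the last break get all zeros."""
    out = []
    for value in x:
        row = [0] * len(breaks)
        idx = sum(1 for b in breaks if value > b)
        if idx < len(breaks):
            row[idx] = 1
        out.append(row)
    return out


def dummy_drop(x, breaks, drop_col):
    """Like dummy_unbounded, with one column (1-based index) removed."""
    return [row[:drop_col - 1] + row[drop_col:] for row in dummy_unbounded(x, breaks)]


x = [0, 2, 4, 6, 9]

unbounded = dummy_unbounded(x, [1, 5])    # categories: <=1, (1,5], >5
bounded = dummy_bounded(x, [1, 5, 7])     # categories: <=1, (1,5], (5,7]
dropped = dummy_drop(x, [1, 5], 1)        # first category omitted
```

In the bounded case the value 9 lies above the last break point and therefore belongs to no category. Dropping a column, as in the last call, is the usual way to avoid perfect collinearity with an intercept in a regression.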