pdIsBalanced#
Purpose#
Checks if each group in a panel dataset covers the maximum time span.
Format#
- groupIsBalanced = pdIsBalanced(df[, groupvar, datevar])#
- Parameters:
df (Dataframe) – Contains long-form (stacked) panel data with (N_i * T_i) rows, where (N_i * T_i) is the total number of observations across all groups, and K columns representing variables. Must contain at least one categorical or string variable for identifying group membership and at least one date variable.
groupvar (String) – Optional, specifies the name of the variable used to identify group membership for panel observations. Defaults to the first categorical or string variable in the dataframe.
datevar (String) – Optional, specifies the name of the variable used to identify dates for panel observations. Defaults to the first date variable in the dataframe.
- Returns:
groupIsBalanced (Dataframe) – Indicates whether each group in the panel dataset spans the full time range of the dataset. Each group is assigned a value of 1 if it covers the full time span, 0 otherwise.
Examples#
// Example dataframe
df = asDF("Group Date Variable",
{ "A" 1 10,
"A" 2 20,
"B" 1 30,
"B" 3 40 });
// Check if each group covers the maximum time span
groupIsBalanced = pdIsBalanced(df);
The code above will return:
Group IsBalanced
-------------------
A 1
B 0
Remarks#
This function takes long-form panel data. To transform wide data to long-form data see dfLonger()
.
This function assumes panel is sorted by group and date. Note that panel data can be sorted using pdSort()
.
This function evaluates whether each group in a panel dataset spans the maximum time range observed across all groups.
If
groupvar
is not provided, the function defaults to the first categorical or string variable in the dataframe.If
datevar
is not provided, the function defaults to the first date variable in the dataframe.
The resulting dataframe contains each group and a corresponding indicator (1
or 0
) to represent whether the group covers the full time span.
See also