pdAllBalanced
Purpose
Checks if a panel dataset is strongly balanced and returns 1 if balanced, 0 otherwise.
Format
isBalanced = pdAllBalanced(df[, groupvar, datevar])
Parameters:
- df (Dataframe) – Contains long-form panel data with \(N_i \times T_i\) rows and K columns.
- groupvar (String) – Optional, specifies the name of the variable used to identify group membership for panel observations. Defaults to the first categorical or string variable in the dataframe.
- datevar (String) – Optional, specifies the name of the variable used to identify dates for panel observations. Defaults to the first date variable in the dataframe.
Returns: isBalanced (Scalar) – Indicates if the panel dataset is balanced. Returns 1 if balanced, 0 otherwise.
Examples
If the group variable is the first categorical or string variable in the dataframe and the date variable is stored as a GAUSS date variable (not a plain numeric column), you can pass only the panel dataframe, and GAUSS will locate the group and date variables for you.
// Import data
fname = getGAUSSHome("examples/pd_ab.gdat");
pd_ab = loadd(fname);
// Take a small sample for the example
pd_smpl = pd_ab[1:4 8:11,.];
// Print our sample
print pd_smpl;
id year emp wage
1 1977-01-01 5.0410 13.1516
1 1978-01-01 5.6000 12.3018
1 1979-01-01 5.0150 12.8395
1 1980-01-01 4.7150 13.8039
2 1977-01-01 71.3190 14.7909
2 1978-01-01 70.6430 14.1036
2 1979-01-01 70.9180 14.9534
2 1980-01-01 72.0310 15.4910
// Check to see if the panel is balanced
is_balanced = pdAllBalanced(pd_smpl);
print is_balanced;
The above code will return:
1.000
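The optional groupvar and datevar arguments can also be supplied by name, which is useful when the group or date variable is not the first of its type in the dataframe. The following is a minimal sketch, assuming the column names id and year shown in the printed sample above. If those are also the variables GAUSS selects by default, the result should match the one-argument call.

```gauss
// Import data and take the same small sample as above
fname = getGAUSSHome("examples/pd_ab.gdat");
pd_ab = loadd(fname);
pd_smpl = pd_ab[1:4 8:11,.];

// Name the group and date variables explicitly
// ("id" and "year" are taken from the printed sample)
is_balanced = pdAllBalanced(pd_smpl, "id", "year");
print is_balanced;
```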
Remarks
This function takes long-form panel data. To transform wide data to long-form data see dfLonger().
This function assumes that the panel is sorted by group and date. Panel data can be sorted using pdSort().
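A minimal sketch of sorting before the check follows. It assumes that pdSort() accepts the dataframe alone and detects the group and date variables in the same way pdAllBalanced() does. Consult the pdSort() reference for its exact signature.

```gauss
// Import data and take a small sample
fname = getGAUSSHome("examples/pd_ab.gdat");
pd_ab = loadd(fname);
pd_smpl = pd_ab[1:4 8:11,.];

// Sort by group and date first
// (the one-argument call to pdSort is an assumption)
pd_sorted = pdSort(pd_smpl);

// Then test the sorted panel for strong balance
is_balanced = pdAllBalanced(pd_sorted);
print is_balanced;
```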
A strongly balanced panel dataset contains the same time points for each group. pdAllBalanced() examines the provided dataset to determine if it meets this condition.
- If groupvar is not provided, the function defaults to the first categorical or string variable in the dataframe.
- If datevar is not provided, the function defaults to the first date variable in the dataframe.
For datasets that are not strongly balanced, pdAllBalanced() returns 0.
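As a rough illustration of the unbalanced case, the sketch below drops the last observation of the second group from the sample used in the Examples section, leaving group 1 with four time points and group 2 with three. It assumes the row layout of pd_ab.gdat implied by the Examples output, where rows 8 to 11 hold the 1977 to 1980 observations for id 2.

```gauss
// Import data
fname = getGAUSSHome("examples/pd_ab.gdat");
pd_ab = loadd(fname);

// Keep 1977-1980 for id 1 but only 1977-1979 for id 2
pd_unbal = pd_ab[1:4 8:10,.];

// Expected to be 0, because the groups no longer
// share the same set of time points
is_balanced = pdAllBalanced(pd_unbal);
print is_balanced;
```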
See also
pdBalance(), pdSummary(), pdSize()