pdIsBalanced#
Purpose#
Checks whether each group in a panel dataset spans the full time range observed in the dataset.
Format#
- groupIsBalanced = pdIsBalanced(df[, groupvar, datevar])
- Parameters:
df (Dataframe) – Contains long-form panel data with \(N_i \times T_i\) rows and K columns.
groupvar (String) – Optional, specifies the name of the variable used to identify group membership for panel observations. Defaults to the first categorical or string variable in the dataframe.
datevar (String) – Optional, specifies the name of the variable used to identify dates for panel observations. Defaults to the first date variable in the dataframe.
- Returns:
groupIsBalanced (Dataframe) – Indicates whether each group in the panel dataset spans the full time range of the dataset. Each group is assigned a value of 1 if it covers the full time span, 0 otherwise.
Examples#
// Load panel data and take the first 10 rows
pd = loadd(getGAUSSHome("examples/pd_ab.gdat"));
pd = pd[1:10,.];
print pd;
id year emp wage
1 1977-01-01 5.0409999 13.151600
1 1978-01-01 5.5999999 12.301800
1 1979-01-01 5.0149999 12.839500
1 1980-01-01 4.7150002 13.803900
1 1981-01-01 4.0929999 14.289700
1 1982-01-01 3.1659999 14.868100
1 1983-01-01 2.9360001 13.778400
2 1977-01-01 71.319000 14.790900
2 1978-01-01 70.642998 14.103600
2 1979-01-01 70.917999 14.953400
// Check to see if each group is balanced
is_balanced = pdIsBalanced(pd);
print is_balanced;
The code above will return:
id balanced
1 1.0000000
2 0.0000000
Remarks#
This function takes long-form panel data. To transform wide data to long-form data, see dfLonger().
This function assumes the panel is sorted by group and date. Panel data can be sorted using pdSort().
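A minimal sketch of sorting before checking balance, reusing the example file from above. It assumes that pdSort() can be called with only the dataframe and returns the sorted dataframe; that call shape is not confirmed on this page, so check the pdSort() reference before relying on it.

```gauss
// Load the example panel used in the Examples section
pd = loadd(getGAUSSHome("examples/pd_ab.gdat"));

// Sort by group and date
// ASSUMPTION: pdSort() accepts just the dataframe and returns it sorted
pd = pdSort(pd);

// Check balance on the sorted panel
is_balanced = pdIsBalanced(pd);
print is_balanced;
```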
This function evaluates whether each group in a panel dataset spans the maximum time range observed across all groups.
If groupvar is not provided, the function defaults to the first categorical or string variable in the dataframe. If datevar is not provided, the function defaults to the first date variable in the dataframe.
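When the defaults might select the wrong columns, the optional arguments can be passed explicitly. The sketch below follows the call format given above and uses the column names id and year shown in the printed example data. It assumes those are the group and date variables, in which case it should match the result of the default call.

```gauss
// Load the example panel and keep the first 10 rows, as in the example above
pd = loadd(getGAUSSHome("examples/pd_ab.gdat"));
pd = pd[1:10, .];

// Name the group and date variables explicitly instead of relying on defaults
// ASSUMPTION: "id" identifies groups and "year" holds the dates
is_balanced = pdIsBalanced(pd, "id", "year");
print is_balanced;
```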
The resulting dataframe contains each group and a corresponding indicator (1 or 0) to represent whether the group covers the full time span.
See also