varFit#
Purpose#
Fit a VAR(p) model by ordinary least squares.
Format#
- result = varFit(y)#
- result = varFit(y, p)
- result = varFit(y, ctl)
- Parameters:
y (TxM matrix or dataframe) – endogenous variables. If a dataframe, column names are used as variable labels in output. If a matrix, variables are labeled “Y1”, “Y2”, etc.
p (scalar) – Optional input, lag order. Default = 1.
ctl (struct) –
Optional input, an instance of a
varControlstructure. Overrides p if provided. Set ctl.xreg for exogenous regressors. An instance is initialized by callingvarControlCreate()and the following members can be set:ctl.p
Scalar, lag order. Default = 1.
ctl.include_const
Scalar, 1 to include a constant (intercept), 0 to exclude. Default = 1.
ctl.quiet
Scalar, set to 1 to suppress printed output. Default = 0.
- Returns:
result (struct) –
An instance of a
varResultstructure containing:result.m
Scalar, number of endogenous variables.
result.p
Scalar, lag order.
result.n_obs
Scalar, effective number of observations (T - p).
result.n_total
Scalar, total number of observations (T).
result.include_const
Scalar, 1 if a constant was included.
result.var_names
Mx1 string array, variable names.
result.b
Kxm matrix, OLS coefficient estimates. Row layout: lag 1 coefficients (m rows), lag 2 (m rows), …, lag p (m rows), exogenous (if any), constant (last row if included). Column j = equation j.
result.se
Kxm matrix, standard errors.
result.tstat
Kxm matrix, t-statistics.
result.pval
Kxm matrix, two-sided p-values.
result.sigma
mxm matrix, residual covariance (ML estimate).
result.vcov
(Km)x(Km) matrix, full variance-covariance of vec(B).
result.loglik
Scalar, log-likelihood.
result.aic
Scalar, Akaike information criterion.
result.bic
Scalar, Bayesian information criterion (Schwarz).
result.hq
Scalar, Hannan-Quinn information criterion.
result.companion
(mp)x(mp) matrix, companion form.
result.is_stationary
Scalar, 1 if all companion eigenvalues are inside the unit circle.
result.max_eigenvalue
Scalar, modulus of the largest companion eigenvalue.
result.residuals
(T-p)xm matrix, residuals.
result.fitted
(T-p)xm matrix, fitted values.
result.y
Txm matrix, original data.
result.xreg
TxK matrix, exogenous regressors. Empty matrix if none.
Model#
The reduced-form VAR(p) is:
where \(y_t\) is \(m \times 1\), each \(B_\ell\) is \(m \times m\), \(x_t\) are optional exogenous regressors, \(u\) is the intercept, and \(\Sigma\) is the \(m \times m\) error covariance.
Stacking the regressors into \(X = [Y_{-1} \; Y_{-2} \; \cdots \; Y_{-p} \; X_{\text{exo}} \; \mathbf{1}]\) (a \(T_{\text{eff}} \times K\) matrix where \(K = mp + n_{\text{exo}} + 1\)), the system is estimated equation-by-equation by OLS:
Standard errors, t-statistics, and information criteria (AIC, BIC, HQ) are computed from the OLS residuals.
Algorithm#
Construct lag matrices: Build \(Y\) (dependent) and \(X\) (regressors with lags, exogenous, constant) from the raw data, consuming the first \(p\) rows as initial conditions.
OLS estimation: Solve the normal equations via QR decomposition for numerical stability. Complexity: \(O(T K^2 m)\).
Residual covariance: ML estimate \(\hat\Sigma = (Y - X\hat{B})'(Y - X\hat{B}) / T_{\text{eff}}\).
Companion form: Construct the \(mp \times mp\) companion matrix and compute its eigenvalues to assess stability.
Information criteria:
\[\begin{split}\text{AIC} &= \log|\hat\Sigma| + \frac{2 K m}{T_{\text{eff}}} \\ \text{BIC} &= \log|\hat\Sigma| + \frac{K m \log T_{\text{eff}}}{T_{\text{eff}}} \\ \text{HQ} &= \log|\hat\Sigma| + \frac{2 K m \log \log T_{\text{eff}}}{T_{\text{eff}}}\end{split}\]
Complexity: Sub-millisecond for typical macro systems (m < 10, T < 500).
Examples#
Monetary Policy VAR#
new;
library timeseries;
// Load US macro quarterly data
data = loadd(getGAUSSHome("pkgs/timeseries/examples/macro.dat"));
// Fit VAR(4)
result = varFit(data, 4);
Output:
================================================================================
VAR(4) Variables: 3
Method: OLS Observations: 200
Constant: Yes Effective obs: 196
================================================================================
AIC: -12.384 BIC: -11.927 HQ: -12.198
Log-Lik: 1232.86 |Sigma|: 2.41e-08
================================================================================
Companion eigenvalues: 0.981 0.854 0.723 (all inside unit circle)
================================================================================
Equation 1: GDP
Coef Std.Err. t-stat p-value
--------------------------------------------------------------------------------
GDP(-1) 0.8731 0.0543 16.076 0.000
CPI(-1) -0.0218 0.0437 -0.499 0.618
FFR(-1) 0.0012 0.0089 0.135 0.893
Constant 0.0080 0.0012 6.588 0.000
================================================================================
...
Lag Order Selection#
Compare AIC across lag orders to choose p:
new;
library timeseries;
data = loadd(getGAUSSHome("pkgs/timeseries/examples/macro.dat"));
// Automatic selection
struct varResult best;
best = varLagSelect(data, 8); // Test p = 1..8
print "Selected lag order:" best.p;
// Or compare manually
for p (1, 8, 1);
result = varFit(data, p, quiet=1);
print "p=" p " AIC=" result.aic " BIC=" result.bic;
endfor;
VAR with Exogenous Regressors#
Include oil price as an exogenous variable:
new;
library timeseries;
y = loadd(getGAUSSHome("pkgs/timeseries/examples/macro.dat"), "gdp + cpi + ffr");
X = loadd(getGAUSSHome("pkgs/timeseries/examples/macro.dat"), "oil");
ctl = varControlCreate();
ctl.p = 2;
ctl.xreg = X;
result = varFit(y, ctl);
Troubleshooting#
Non-stationary VAR: If the companion matrix has eigenvalues outside the unit circle, the model implies explosive dynamics. This does not necessarily indicate an error — it may reflect unit roots in the data. Consider:
Differencing the data or using growth rates.
Using a BVAR (
bvarFit()) where the prior regularizes toward stationarity.If the data is cointegrated, a VECM may be more appropriate.
Singular X’X matrix:
This occurs when the regressor matrix is rank-deficient, typically from collinear
variables or too many lags relative to the sample size. Reduce p, remove collinear
variables, or use bvarFit() where the prior regularizes the problem.
Choosing p:
Use varLagSelect() to compare information criteria. BIC tends to select
parsimonious models (smaller p), AIC selects larger p. In quarterly macro data,
p = 4 (one year of lags) is a common starting point (Lutkepohl 2005, Section 4.3).
Remarks#
Coefficient layout in result.b:
The coefficient matrix result.b is Kxm where K = m*p + n_exo + include_const and m is the number of endogenous variables:
Rows |
Content |
|---|---|
1 to m |
Lag 1 coefficients (mxm block) |
m+1 to 2m |
Lag 2 coefficients |
… |
… |
(p-1)*m+1 to pm |
Lag p coefficients |
pm+1 to pm+n_exo |
Exogenous coefficients (if any) |
K |
Constant (if include_const = 1) |
Column j corresponds to equation j (variable j as dependent variable). This layout matches the standard convention in Lutkepohl (2005, Section 3.2.1).
Stability:
The companion form eigenvalues are computed automatically and printed in the summary. A VAR is stable (stationary) if all eigenvalues of the companion matrix have modulus strictly less than 1. Non-stationary models produce a warning but are not rejected — they may be appropriate for cointegrated systems.
When to use VAR vs BVAR:
For estimation and hypothesis testing with standard inference (t-stats, p-values,
Granger causality), use varFit. For forecasting, especially with m > 3 variables,
bvarFit() dominates in out-of-sample accuracy because the Minnesota prior
regularizes noisy cross-variable coefficients (Banbura, Giannone & Reichlin 2010).
Verification#
Verified against R vars 1.6-1 (R 4.5.2) with 22 tests at \(10^{-6}\) tolerance,
covering coefficients, \(\Sigma\), residuals, log-likelihood, companion eigenvalues,
IRF, FEVD, Granger causality, and forecasts on identical data (2-variable VAR(1),
300 observations, known DGP).
Additionally, verified against ECB BEAR Toolbox OLS output at \(10^{-8}\) tolerance on the 3-variable ECB default dataset (YER, HICSA, STN), confirming all 13 coefficients, 6 \(\Sigma\) elements, and companion eigenvalues match.
See gausslib-var/tests/r_benchmark.rs and the Verification and Cross-Validation page.
References#
Banbura, M., D. Giannone, and L. Reichlin (2010). “Large Bayesian vector auto regressions.” Journal of Applied Econometrics, 25(1), 71-92.
Lutkepohl, H. (2005). New Introduction to Multiple Time Series Analysis. Springer.
Sims, C.A. (1980). “Macroeconomics and reality.” Econometrica, 48(1), 1-48.
Library#
timeseries
Source#
var.src
See also
Functions varLagSelect(), bvarFit(), irfCompute(), grangerTest(), varCompanion(), varCoefTable()
See also
Guides Choosing a VAR Model, Verification and Cross-Validation