varFit#

Purpose#

Fit a VAR(p) model by ordinary least squares.

Format#

result = varFit(y)#
result = varFit(y, p)
result = varFit(y, ctl)
Parameters:
  • y (TxM matrix or dataframe) – endogenous variables. If a dataframe, column names are used as variable labels in output. If a matrix, variables are labeled “Y1”, “Y2”, etc.

  • p (scalar) – Optional input, lag order. Default = 1.

  • ctl (struct) –

    Optional input, an instance of a varControl structure. Overrides p if provided. Set ctl.xreg for exogenous regressors. An instance is initialized by calling varControlCreate() and the following members can be set:

    ctl.p

    Scalar, lag order. Default = 1.

    ctl.include_const

    Scalar, 1 to include a constant (intercept), 0 to exclude. Default = 1.

    ctl.quiet

    Scalar, set to 1 to suppress printed output. Default = 0.

Returns:

result (struct) –

An instance of a varResult structure containing:

result.m

Scalar, number of endogenous variables.

result.p

Scalar, lag order.

result.n_obs

Scalar, effective number of observations (T - p).

result.n_total

Scalar, total number of observations (T).

result.include_const

Scalar, 1 if a constant was included.

result.var_names

Mx1 string array, variable names.

result.b

Kxm matrix, OLS coefficient estimates. Row layout: lag 1 coefficients (m rows), lag 2 (m rows), …, lag p (m rows), exogenous (if any), constant (last row if included). Column j = equation j.

result.se

Kxm matrix, standard errors.

result.tstat

Kxm matrix, t-statistics.

result.pval

Kxm matrix, two-sided p-values.

result.sigma

mxm matrix, residual covariance (ML estimate).

result.vcov

(Km)x(Km) matrix, full variance-covariance of vec(B).

result.loglik

Scalar, log-likelihood.

result.aic

Scalar, Akaike information criterion.

result.bic

Scalar, Bayesian information criterion (Schwarz).

result.hq

Scalar, Hannan-Quinn information criterion.

result.companion

(mp)x(mp) matrix, companion form.

result.is_stationary

Scalar, 1 if all companion eigenvalues are inside the unit circle.

result.max_eigenvalue

Scalar, modulus of the largest companion eigenvalue.

result.residuals

(T-p)xm matrix, residuals.

result.fitted

(T-p)xm matrix, fitted values.

result.y

Txm matrix, original data.

result.xreg

TxK matrix, exogenous regressors. Empty matrix if none.

Model#

The reduced-form VAR(p) is:

\[y_t = B_1 y_{t-1} + B_2 y_{t-2} + \cdots + B_p y_{t-p} + \Phi x_t + u + \varepsilon_t, \quad \varepsilon_t \sim N(0, \Sigma)\]

where \(y_t\) is \(m \times 1\), each \(B_\ell\) is \(m \times m\), \(x_t\) are optional exogenous regressors, \(u\) is the intercept, and \(\Sigma\) is the \(m \times m\) error covariance.

Stacking the regressors into \(X = [Y_{-1} \; Y_{-2} \; \cdots \; Y_{-p} \; X_{\text{exo}} \; \mathbf{1}]\) (a \(T_{\text{eff}} \times K\) matrix where \(K = mp + n_{\text{exo}} + 1\)), the system is estimated equation-by-equation by OLS:

\[\hat{B} = (X'X)^{-1} X'Y, \qquad \hat{\Sigma} = \frac{1}{T_{\text{eff}}} (Y - X\hat{B})'(Y - X\hat{B})\]

Standard errors, t-statistics, and information criteria (AIC, BIC, HQ) are computed from the OLS residuals.

Algorithm#

  1. Construct lag matrices: Build \(Y\) (dependent) and \(X\) (regressors with lags, exogenous, constant) from the raw data, consuming the first \(p\) rows as initial conditions.

  2. OLS estimation: Solve the normal equations via QR decomposition for numerical stability. Complexity: \(O(T K^2 m)\).

  3. Residual covariance: ML estimate \(\hat\Sigma = (Y - X\hat{B})'(Y - X\hat{B}) / T_{\text{eff}}\).

  4. Companion form: Construct the \(mp \times mp\) companion matrix and compute its eigenvalues to assess stability.

  5. Information criteria:

    \[\begin{split}\text{AIC} &= \log|\hat\Sigma| + \frac{2 K m}{T_{\text{eff}}} \\ \text{BIC} &= \log|\hat\Sigma| + \frac{K m \log T_{\text{eff}}}{T_{\text{eff}}} \\ \text{HQ} &= \log|\hat\Sigma| + \frac{2 K m \log \log T_{\text{eff}}}{T_{\text{eff}}}\end{split}\]

Complexity: Sub-millisecond for typical macro systems (m < 10, T < 500).

Examples#

Monetary Policy VAR#

new;
library timeseries;

// Load US macro quarterly data
data = loadd(getGAUSSHome("pkgs/timeseries/examples/macro.dat"));

// Fit VAR(4)
result = varFit(data, 4);

Output:

================================================================================
VAR(4)                                  Variables:             3
Method: OLS                             Observations:        200
Constant: Yes                           Effective obs:       196
================================================================================
AIC:       -12.384          BIC:       -11.927          HQ:        -12.198
Log-Lik:    1232.86         |Sigma|:    2.41e-08
================================================================================
Companion eigenvalues: 0.981  0.854  0.723  (all inside unit circle)
================================================================================

Equation 1: GDP
                    Coef      Std.Err.     t-stat    p-value
--------------------------------------------------------------------------------
GDP(-1)            0.8731      0.0543     16.076      0.000
CPI(-1)           -0.0218      0.0437     -0.499      0.618
FFR(-1)            0.0012      0.0089      0.135      0.893
Constant           0.0080      0.0012      6.588      0.000
================================================================================
...

Lag Order Selection#

Compare AIC across lag orders to choose p:

new;
library timeseries;

data = loadd(getGAUSSHome("pkgs/timeseries/examples/macro.dat"));

// Automatic selection
struct varResult best;
best = varLagSelect(data, 8);   // Test p = 1..8

print "Selected lag order:" best.p;

// Or compare manually
for p (1, 8, 1);
    result = varFit(data, p, quiet=1);
    print "p=" p "  AIC=" result.aic "  BIC=" result.bic;
endfor;

VAR with Exogenous Regressors#

Include oil price as an exogenous variable:

new;
library timeseries;

y = loadd(getGAUSSHome("pkgs/timeseries/examples/macro.dat"), "gdp + cpi + ffr");
X = loadd(getGAUSSHome("pkgs/timeseries/examples/macro.dat"), "oil");

ctl = varControlCreate();
ctl.p = 2;
ctl.xreg = X;

result = varFit(y, ctl);

Troubleshooting#

Non-stationary VAR: If the companion matrix has eigenvalues outside the unit circle, the model implies explosive dynamics. This does not necessarily indicate an error — it may reflect unit roots in the data. Consider:

  • Differencing the data or using growth rates.

  • Using a BVAR (bvarFit()) where the prior regularizes toward stationarity.

  • If the data is cointegrated, a VECM may be more appropriate.

Singular X’X matrix: This occurs when the regressor matrix is rank-deficient, typically from collinear variables or too many lags relative to the sample size. Reduce p, remove collinear variables, or use bvarFit() where the prior regularizes the problem.

Choosing p: Use varLagSelect() to compare information criteria. BIC tends to select parsimonious models (smaller p), AIC selects larger p. In quarterly macro data, p = 4 (one year of lags) is a common starting point (Lutkepohl 2005, Section 4.3).

Remarks#

Coefficient layout in result.b:

The coefficient matrix result.b is Kxm where K = m*p + n_exo + include_const and m is the number of endogenous variables:

Rows

Content

1 to m

Lag 1 coefficients (mxm block)

m+1 to 2m

Lag 2 coefficients

(p-1)*m+1 to pm

Lag p coefficients

pm+1 to pm+n_exo

Exogenous coefficients (if any)

K

Constant (if include_const = 1)

Column j corresponds to equation j (variable j as dependent variable). This layout matches the standard convention in Lutkepohl (2005, Section 3.2.1).

Stability:

The companion form eigenvalues are computed automatically and printed in the summary. A VAR is stable (stationary) if all eigenvalues of the companion matrix have modulus strictly less than 1. Non-stationary models produce a warning but are not rejected — they may be appropriate for cointegrated systems.

When to use VAR vs BVAR: For estimation and hypothesis testing with standard inference (t-stats, p-values, Granger causality), use varFit. For forecasting, especially with m > 3 variables, bvarFit() dominates in out-of-sample accuracy because the Minnesota prior regularizes noisy cross-variable coefficients (Banbura, Giannone & Reichlin 2010).

Verification#

Verified against R vars 1.6-1 (R 4.5.2) with 22 tests at \(10^{-6}\) tolerance, covering coefficients, \(\Sigma\), residuals, log-likelihood, companion eigenvalues, IRF, FEVD, Granger causality, and forecasts on identical data (2-variable VAR(1), 300 observations, known DGP).

Additionally, verified against ECB BEAR Toolbox OLS output at \(10^{-8}\) tolerance on the 3-variable ECB default dataset (YER, HICSA, STN), confirming all 13 coefficients, 6 \(\Sigma\) elements, and companion eigenvalues match.

See gausslib-var/tests/r_benchmark.rs and the Verification and Cross-Validation page.

References#

  • Banbura, M., D. Giannone, and L. Reichlin (2010). “Large Bayesian vector auto regressions.” Journal of Applied Econometrics, 25(1), 71-92.

  • Lutkepohl, H. (2005). New Introduction to Multiple Time Series Analysis. Springer.

  • Sims, C.A. (1980). “Macroeconomics and reality.” Econometrica, 48(1), 1-48.

Library#

timeseries

Source#

var.src