bvarFit#

Purpose#

Fit a Bayesian VAR with conjugate Minnesota prior.

Format#

result = bvarFit(y)#

result = bvarFit(y, p=1, prior="minnesota", n_draws=5000, seed=42, overall_tightness=0.2, xreg={}, quiet=0, ctl={})

Parameters:

y (TxM matrix or dataframe) – endogenous variables. If a dataframe, column names are used as variable labels in output. If a matrix, variables are labeled “Y1”, “Y2”, etc.
p (scalar) – Optional keyword, lag order. Default = 1.
prior (string) – Optional keyword, prior type. "minnesota" for conjugate Normal-Inverse-Wishart (default), "flat" for diffuse prior estimated via Gibbs sampling.
n_draws (scalar) – Optional keyword, number of posterior draws. Default = 5000.
seed (scalar) – Optional keyword, RNG seed for reproducible posterior draws. Default = 42.
overall_tightness (scalar) – Optional keyword, overall tightness. Controls how much data vs prior matters. Default = 0.2.
xreg (TxK matrix or dataframe) – Optional keyword, exogenous regressors included in every equation. Pass {} (default) for no exogenous variables.
quiet (scalar) – Optional keyword, set to 1 to suppress printed output. Default = 0.

ctl (struct) –

Optional keyword, an instance of a bvarControl structure. An instance is initialized by calling bvarControlCreate() and the following members can be set:

ctl.p

Scalar, lag order. Default = 1.

ctl.const

Scalar, 1 to include a constant, 0 to exclude. Default = 1.

ctl.prior

String, prior type.

`"minnesota"`	Conjugate Normal-Inverse-Wishart Minnesota prior. (Default)
`"flat"`	Diffuse prior with Gibbs sampling.

ctl.overall_tightness

Scalar, overall tightness. Controls how much data vs prior matters. Smaller values = tighter prior. Default = 0.2.

ctl.cross_shrinkage

Scalar, cross-variable shrinkage. Other variables’ lags are shrunk by this factor relative to own lags. Default = 0.5.

ctl.lag_decay

Scalar, lag decay. Higher lags are shrunk by \(\ell^{-\lambda_3}\). Default = 1.0.

ctl.constant_tightness

Scalar, constant tightness. Default = 1e5 (effectively uninformative).

ctl.exogenous_tightness

Scalar, exogenous variable tightness. Default = 1.0.

ctl.soc_tightness

Scalar, sum-of-coefficients tightness (Doan, Litterman & Sims 1984). Set to 0 to disable. Typical range: 1-10. Default = 0 (disabled).

ctl.sur_tightness

Scalar, single-unit-root tightness (Sims 1993). Set to 0 to disable. Typical range: 1-10. Default = 0 (disabled).

ctl.exogenous_scale

Scalar, exogenous regressor prior tightness. Default = 1.0.

ctl.ar

Scalar, AR(1) prior mean for own-lag coefficients.

1.0	Random walk prior (for levels data). (Default)
0.0	White noise prior (for stationary/growth rate data).

ctl.alpha0

Scalar, Inverse-Wishart degrees of freedom. Default = 0, which uses m+2 (least informative proper prior).

ctl.n_draws

Scalar, number of posterior draws. Default = 5000.

ctl.seed

Scalar, random number generator seed for reproducibility. Default = 42.

ctl.quiet

Scalar, set to 1 to suppress printed output. Default = 0.

Returns:

result (struct) –

An instance of a bvarResult structure containing:

result.m	Scalar, number of endogenous variables.
result.p	Scalar, lag order.
result.n_obs	Scalar, effective number of observations (T - p).
result.n_total	Scalar, total number of observations (T).
result.const	Scalar, 1 if a constant was included.
result.var_names	Mx1 string array, variable names.
result.prior_type	String, `"minnesota"` or `"flat"`.
result.b_mean	Kxm matrix, posterior mean of B.
result.b_median	Kxm matrix, posterior median of B.
result.b_sd	Kxm matrix, posterior standard deviation of B.
result.b_lower	Kxm matrix, 16th percentile of posterior (lower 68% credible band).
result.b_upper	Kxm matrix, 84th percentile of posterior (upper 68% credible band).
result.sigma_mean	mxm matrix, posterior mean of the error covariance \(\Sigma\).
result.log_ml	Scalar, log marginal likelihood. Only available for conjugate Minnesota prior; missing otherwise.
result.aic	Scalar, Akaike information criterion (evaluated at posterior mean).
result.bic	Scalar, Bayesian information criterion.
result.hq	Scalar, Hannan-Quinn information criterion.
result.companion_mean	(mp)x(mp) matrix, companion matrix at posterior mean.
result.is_stationary	Scalar, 1 if stationary at posterior mean.
result.max_eigenvalue	Scalar, largest eigenvalue modulus at posterior mean.
result.residuals	(T-p)xm matrix, residuals at posterior mean.
result.b_draws	(n_drawsK)xm matrix, stacked raw posterior draws of B. Draw d is rows [(d-1)K+1 : d*K, .].
result.sigma_draws	(n_drawsm)xm matrix, stacked raw posterior draws of \(\Sigma\). Draw d is rows [(d-1)m+1 : d*m, .].
result.y	Txm matrix, original data.
result.xreg	TxK matrix, exogenous regressors. Empty matrix if none.
result.n_draws	Scalar, number of retained draws.
result.seed	Scalar, RNG seed used.

Examples#

Monetary Policy VAR on US Macro Data#

Estimate a 3-variable BVAR(4) on GDP growth, CPI inflation, and the federal funds rate:

new;
library timeseries;

// Load US macro quarterly data
fname = getGAUSSHome("pkgs/timeseries/examples/data/us_macro_quarterly.csv");
data = loadd(fname);

result = bvarFit(data, p=4, ar=0);

Output:

================================================================================
BVAR(4) with Conjugate Minnesota Prior    Variables:             3
Draws: 5000                               Observations:        200
Prior: minnesota (conjugate NIW)          Effective obs:       196
================================================================================

Posterior Mean of B (68% Credible Intervals)
Equation: GDP
                                Mean    Std.Dev      [16%      84%]
-----------------------------------------------------------------------
GDP(-1)                       0.2414     0.0724    0.1695    0.3126
CPI(-1)                       0.0312     0.0485   -0.0170    0.0798
FFR(-1)                      -0.0031     0.0074   -0.0105    0.0043
    ⋮
GDP(-4)                       0.0189     0.0612   -0.0418    0.0801
CPI(-4)                      -0.0104     0.0473   -0.0574    0.0369
FFR(-4)                       0.0008     0.0071   -0.0063    0.0079
Constant                      0.0023     0.0018   -0.0013    0.0059

Log marginal likelihood: -812.34
================================================================================

Compare Lag Orders with Bayes Factors#

new;
library timeseries;

fname = getGAUSSHome("pkgs/timeseries/examples/data/us_macro_quarterly.csv");
data = loadd(fname);

r1 = bvarFit(data, p=1, quiet=1);
r2 = bvarFit(data, p=2, quiet=1);
r4 = bvarFit(data, p=4, quiet=1);

print "Log ML(p=1):" r1.log_ml;
print "Log ML(p=2):" r2.log_ml;
print "Log ML(p=4):" r4.log_ml;

// Bayes factor: p=4 vs p=2
print "BF(4 vs 2):" exp(r4.log_ml - r2.log_ml);

A Bayes factor above 3 is “substantial evidence” (Kass & Raftery 1995); above 20 is “strong.”

Forecasting GDP with SOC/SUR Priors#

Sum-of-coefficients and single-unit-root priors stabilize long-horizon forecasts for levels data:

new;
library timeseries;

fname = getGAUSSHome("pkgs/timeseries/examples/data/us_macro_quarterly.csv");
data = loadd(fname);

ctl = bvarControlCreate();
ctl.soc_tightness = 5;          // Sum-of-coefficients
ctl.sur_tightness = 5;          // Single-unit-root

result = bvarFit(data, p=4, ctl=ctl);

// 8-step-ahead forecast
fc = bvarForecast(result, 8);

Data-Driven Hyperparameters (GLP 2015)#

Let the marginal likelihood choose all \(\lambda\) values:

new;
library timeseries;

fname = getGAUSSHome("pkgs/timeseries/examples/data/us_macro_quarterly.csv");
data = loadd(fname);

// Optimize overall_tightness, soc_tightness, sur_tightness jointly
ho = bvarHyperopt(data);

print "Optimal overall_tightness:" ho.overall_tightness;
print "Optimal soc_tightness:" ho.soc_tightness;
print "Optimal sur_tightness:" ho.sur_tightness;

// Fit with optimized hyperparameters
result = bvarFit(data, ctl=ho.ctl);

This implements Algorithm 1 of Giannone, Lenza & Primiceri (2015), which maximizes the log marginal likelihood over a grid of hyperparameter values.

Model#

The BVAR(p) model is:

\[y_t = B_1 y_{t-1} + B_2 y_{t-2} + \cdots + B_p y_{t-p} + u + \varepsilon_t, \quad \varepsilon_t \sim N(0, \Sigma)\]

where \(y_t\) is an \(m \times 1\) vector, each \(B_\ell\) is \(m \times m\), \(u\) is an \(m \times 1\) intercept, and \(\Sigma\) is the \(m \times m\) error covariance matrix.

Stacking all coefficients, \(B = [B_1 \; B_2 \; \cdots \; B_p \; u]'\) is \(K \times m\) where \(K = mp + 1\).

Prior: The default Minnesota prior (Kadiyala & Karlsson 1997) places a conjugate Normal-Inverse-Wishart prior on \((B, \Sigma)\):

\[\begin{split}\text{vec}(B) | \Sigma &\sim N\bigl(\text{vec}(B_0),\; \Sigma \otimes \Omega\bigr) \\ \Sigma &\sim IW(S_0, \alpha_0)\end{split}\]

The prior mean \(B_0\) encodes the belief that each variable follows a random walk (when ar = 1) or white noise (when ar = 0). Cross-variable coefficients are centered at zero.

The diagonal prior covariance \(\Omega\) is governed by the \(\lambda\) hyperparameters:

\[\begin{split}\Omega_{j,\ell} = \begin{cases} (\lambda_1 / \ell^{\lambda_3})^2 & \text{own lag } \ell \\ (\lambda_1 \lambda_2 / \ell^{\lambda_3})^2 \cdot (\hat\sigma_j^2 / \hat\sigma_i^2) & \text{cross lag from variable } j \text{ to equation } i \\ (\lambda_1 \lambda_4)^2 & \text{constant} \end{cases}\end{split}\]

where \(\hat\sigma_i^2\) are residual variances from univariate AR(p) regressions.

The prior scale \(S_0 = (\alpha_0 - m - 1) \cdot \text{diag}(\hat\sigma_1^2, \ldots, \hat\sigma_m^2)\) centers the prior on the univariate residual variances.

Posterior: The conjugate prior yields a closed-form posterior:

\[\begin{split}\Sigma | Y &\sim IW(\bar{S}, \bar{\alpha}) \\ \text{vec}(B) | \Sigma, Y &\sim N\bigl(\text{vec}(\bar{B}),\; \Sigma \otimes \bar{\Phi}\bigr)\end{split}\]

Draws are exact — no MCMC iteration, no burn-in, no convergence diagnostics needed. The log marginal likelihood is available in closed form for formal Bayesian model comparison.

Algorithm#

OLS pre-estimation: Fit univariate AR(p) models to each variable to obtain \(\hat\sigma_i^2\), used to scale the prior.
Prior construction: Build \(B_0\), \(\Omega\), \(S_0\), \(\alpha_0\) from the hyperparameters and AR residual variances.
Posterior update: Apply the Normal-Inverse-Wishart conjugate update (Kadiyala & Karlsson 1997, Eqs. 12-14):

\[\begin{split}\bar{\Phi} &= (X'X + \Omega^{-1})^{-1} \\ \bar{B} &= \bar{\Phi}(X'Y + \Omega^{-1} B_0) \\ \bar{S} &= S_0 + \hat{S} + (B_0 - \hat{B})' (\Omega + (X'X)^{-1})^{-1} (B_0 - \hat{B}) \\ \bar{\alpha} &= \alpha_0 + T\end{split}\]
Draw from posterior: Sample \(\Sigma \sim IW(\bar{S}, \bar{\alpha})\) then \(B | \Sigma \sim N(\bar{B}, \Sigma \otimes \bar{\Phi})\). Each draw is independent (no Markov chain).
Sum-of-coefficients and single-unit-root priors (when soc_tightness > 0 or sur_tightness > 0): Implemented via dummy observations appended to the data before the posterior update (Doan, Litterman & Sims 1984; Sims 1993).

Complexity: \(O(K^2 m)\) for the posterior update, plus \(O(K^3)\) per draw for the Cholesky factorization. With 5,000 draws on a 3-variable VAR(4), typical wall-clock time is 0.05–0.10 seconds.

Hyperparameter Guide#

Parameter	Default	Guidance
overall_tightness	0.2	Overall tightness. Smaller = prior dominates. For a small system (m=3), 0.1–0.2 works well. For large systems (m > 10), tighter values (0.01–0.05) prevent overfitting. Use `bvarHyperopt()` to optimize automatically (Giannone, Lenza & Primiceri 2015).
cross_shrinkage	0.5	Cross-variable shrinkage. A value of 0.5 means other variables’ lags are shrunk twice as much as own lags. Range: 0.1–1.0.
lag_decay	1.0	Lag decay exponent. Higher lags are shrunk by \(\ell^{-\lambda_3}\). Default of 1.0 is standard. Values above 2 aggressively penalize distant lags.
constant_tightness	1e5	Constant tightness. Default is effectively uninformative. Set to 100 if you want the prior to also regularize the intercept (as in BEAR Toolbox).
soc_tightness	0 (off)	Sum-of-coefficients prior (Doan, Litterman & Sims 1984). Pulls lag coefficient sums toward the identity, preventing explosive long-horizon forecasts. Typical values: 1–10. Essential for levels data when forecasting beyond 4 steps.
sur_tightness	0 (off)	Single-unit-root prior (Sims 1993). Pulls all variables toward a common stochastic trend, stabilizing cointegrated systems. Typical values: 1–10.
ar	1.0	Prior mean for own first lag. Set to 1 for levels data (random walk prior). Set to 0 for growth rates or stationary data (white noise prior). Set to 0.8 for “mostly persistent” data. See the Choosing a VAR Model guide.
alpha0	0 (= m+2)	Inverse-Wishart degrees of freedom. Default of m+2 is the least informative proper prior. Increase for stronger prior on \(\Sigma\).

Troubleshooting#

Non-stationary posterior mean: The largest eigenvalue of the companion matrix exceeds 1. This means the posterior mean coefficients imply explosive dynamics. Common fixes:

Set ar = 0 if your data is in growth rates (you may be using the wrong prior).
Increase soc_tightness (sum-of-coefficients) to pull lag sums toward unity.
Tighten the prior (reduce overall_tightness).

Prior too tight / too loose: If all coefficients are near zero, the prior is too tight — increase overall_tightness or use bvarHyperopt(). If the posterior equals OLS (credible bands match frequentist confidence intervals), the prior is too loose — decrease overall_tightness.

“Log ML is missing”: The log marginal likelihood is only available for the conjugate Minnesota prior (prior = "minnesota"). For flat priors, consider using the DIC or WAIC instead.

Levels vs growth rates: This is the single most common specification error. If your data is in levels (GDP, not GDP growth), set ar = 1 (random walk prior). If in growth rates, set ar = 0. Using the wrong setting will produce either explosive forecasts (ar=0 on levels) or excessive shrinkage (ar=1 on growth rates). See the Choosing a VAR Model guide.

Verification#

bvarFit has been verified against two independent reference implementations:

R vars package (OLS component): 22 tests at \(10^{-6}\) tolerance against R 4.5.2 vars::VAR(), covering coefficients, \(\Sigma\), IRF, FEVD, Granger causality, and forecasts on identical data.

R BVAR package (Bayesian posterior): 7 structural validation tests against the R BVAR package (Kuschnig & Vashold 2021) using 200,000-draw ground truth. Validates:

Conjugate posterior RMSE < Gibbs RMSE < 1.0 vs R reference
\(\Sigma\) elements within 50% relative error across three prior forms
Shrinkage toward \(B_0\) exceeds 60% for all methods

ECB BEAR Toolbox: 45 matched-prior coefficient tests (overall_tightness=0.1, ar=0.8, constant_tightness=100) and 17 IRF tests at horizons 0, 10, and 20 against BEAR v5.0. OLS components match to \(10^{-8}\). BVAR posterior means agree within 0.06 (prior-form difference between conjugate and independent Normal-Wishart).

Remarks#

Conjugate draws are exact: With prior = "minnesota" (default), the posterior is available in closed form. All draws are independent (no MCMC chain, no burn-in, no thinning needed). For stochastic volatility or non-conjugate priors, use bvarSvFit().

Log marginal likelihood: result.log_ml is only available for the conjugate Minnesota prior (closed-form computation). It can be used for formal Bayesian model comparison — the model with the highest log ML is preferred. The Bayes factor between models A and B is \(\exp(\log ML_A - \log ML_B)\). See Kass & Raftery (1995) for interpretation guidelines.

When to use BVAR instead of OLS VAR: A BVAR with Minnesota prior always weakly dominates an OLS VAR in forecast accuracy (Banbura, Giannone & Reichlin 2010). The prior acts as regularization, reducing out-of-sample forecast error by shrinking small, noisy coefficients toward zero. This benefit grows with the number of variables. For m > 5, BVAR is strongly preferred.

References#

Banbura, M., D. Giannone, and L. Reichlin (2010). “Large Bayesian vector auto regressions.” Journal of Applied Econometrics, 25(1), 71-92.
Doan, T., R. Litterman, and C. Sims (1984). “Forecasting and conditional projection using realistic prior distributions.” Econometric Reviews, 3, 1-100.
Giannone, D., M. Lenza, and G. E. Primiceri (2015). “Prior selection for vector autoregressions.” Review of Economics and Statistics, 97(2), 436-451.
Kadiyala, K.R. and S. Karlsson (1997). “Numerical methods for estimation and inference in Bayesian VAR-models.” Journal of Applied Econometrics, 12(2), 99-132.
Kass, R.E. and A.E. Raftery (1995). “Bayes factors.” Journal of the American Statistical Association, 90(430), 773-795.
Sims, C. (1993). “A nine-variable probabilistic macroeconomic forecasting model.” In Business Cycles, Indicators, and Forecasting, 179-212. NBER.

Library#

timeseries

Source#

bvar.src