mvnTest
Purpose
Tests multivariate normality using one or more methods: Henze-Zirkler (default), Mardia’s skewness/kurtosis, Doornik-Hansen, or Royston.
Format
- out = mvnTest(X[, ctl])
- out = mvnTest(data, formula[, ctl])
- out = mvnTest(filename, formula[, ctl])
- Parameters:
X (NxK matrix) – data matrix with K >= 2 variables.
data (dataframe) – dataframe containing variables.
filename (string) – name of dataset.
formula (string) – formula string specifying variables, e.g., "x1 + x2 + x3".
ctl (struct) – Optional argument, instance of an mvnTestControl structure containing the following members:
ctl.output
scalar, print results. Default = 1.
- 1:
Print results.
- 0:
Suppress output.
ctl.miss
scalar, missing value handling. Default = 0.
- 0:
Error if missing values present.
- 1:
Listwise deletion of rows with missing values.
ctl.method
string, test method to use. Default = "hz".
"hz": Henze-Zirkler test (recommended omnibus test).
"mardia": Mardia skewness and kurtosis tests.
"dh": Doornik-Hansen test.
"royston": Royston test (based on Shapiro-Wilk).
"all": Run all four methods.
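The control members above can be combined in a single call. The following is a minimal sketch, assuming simulated data with one artificially introduced missing value; the use of miss(0, 0) to create that missing value and the variable names are illustrative choices, not part of mvnTest itself.

```gauss
rndseed 42;

// Simulated data with one missing value in row 5 (illustrative only)
X = rndn(100, 3);
X[5, 2] = miss(0, 0);

// Control structure: listwise deletion, no printed report
struct mvnTestControl ctl;
ctl = mvnTestControlCreate();
ctl.miss = 1;      // drop rows containing missing values
ctl.output = 0;    // suppress printed output

out = mvnTest(X, ctl);

// out.n is documented as the sample size after listwise deletion,
// so it is expected to be 99 here if only row 5 is dropped
print "Rows used: " out.n;
```

With the default ctl.miss = 0, the same call would be expected to raise an error because of the missing value.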
- Returns:
out (struct) – instance of an mvnTestOut structure:
out.n
scalar, sample size after any listwise deletion.
out.k
scalar, number of variables.
out.skewStat
scalar, Mardia normalized skewness statistic (approx N(0,1)).
out.skewP
scalar, p-value for skewness test.
out.kurtStat
scalar, Mardia normalized kurtosis statistic (approx N(0,1)).
out.kurtP
scalar, p-value for kurtosis test.
out.combStat
scalar, Mardia combined chi-squared statistic.
out.combP
scalar, p-value for combined test (chi-sq df=2).
out.hzStat
scalar, Henze-Zirkler test statistic.
out.hzP
scalar, Henze-Zirkler p-value (lognormal approximation).
out.hzBeta
scalar, Henze-Zirkler smoothing parameter.
out.dhStat
scalar, Doornik-Hansen chi-squared statistic.
out.dhP
scalar, Doornik-Hansen p-value.
out.dhDf
scalar, Doornik-Hansen degrees of freedom (2*k).
out.royStat
scalar, Royston H statistic.
out.royP
scalar, Royston p-value.
out.royDf
scalar, Royston equivalent degrees of freedom.
Examples
Example 1: Basic usage with matrix input
rndseed 42;
// Generate multivariate normal data
X = rndn(100, 3);
// Test normality using default Henze-Zirkler method
out = mvnTest(X);
Multivariate Normality Test
Observations: 100    Variables: 3

Henze-Zirkler Test (Henze & Zirkler 1990)
Test     Statistic      p-value
HZ          0.6210   7.4283e-01
Beta        1.4788
The large p-value (0.74) indicates a failure to reject normality, as expected for data generated from a multivariate normal distribution.
Example 2: Run all tests
rndseed 42;
X = rndn(100, 3);
// Create control structure
struct mvnTestControl ctl;
ctl = mvnTestControlCreate();
ctl.method = "all";
// Run all four tests
out = mvnTest(X, ctl);
Multivariate Normality Tests
Observations: 100    Variables: 3

Mardia's Test (Mardia & Foster 1983)
Component   Statistic      p-value
Skewness      -0.6600   7.4538e-01
Kurtosis      -0.7006   7.5822e-01
Combined       0.7283   6.9480e-01

Henze-Zirkler Test (Henze & Zirkler 1990)
Test        Statistic      p-value
HZ             0.6210   7.4283e-01
Beta           1.4788

Doornik-Hansen Test (Doornik & Hansen 2008)
Test        Statistic     df      p-value
DH             3.3778      6   7.6015e-01

Royston Test (Royston 1992)
Test        Statistic  eq.df      p-value
H              1.4209   3.00   7.0081e-01
All four tests fail to reject the null hypothesis of multivariate normality.
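The chi-squared p-values in this report can be cross-checked by hand from the printed statistics and their degrees of freedom. The sketch below assumes the standard GAUSS function cdfChic (upper-tail chi-squared probability) and uses statistics copied from the output above; it is a sanity check, not a description of how mvnTest computes its p-values internally.

```gauss
// Statistics copied from the Example 2 report
combStat = 0.7283;   // Mardia combined statistic, df = 2
dhStat   = 3.3778;   // Doornik-Hansen statistic, df = 2*k = 6

// Upper-tail chi-squared probabilities
pComb = cdfChic(combStat, 2);
pDh   = cdfChic(dhStat, 6);

print "Mardia combined p: " pComb;
print "Doornik-Hansen p:  " pDh;
```

For df = 2 the upper-tail probability reduces to exp(-x/2), so the first value should come out near 0.695, and the second near 0.760, in line with the printed 6.9480e-01 and 7.6015e-01.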
Example 3: Checking individual results
// Continuing from Example 2: check individual results programmatically
if out.hzP < 0.05;
print "Henze-Zirkler rejects normality";
else;
print "Henze-Zirkler: fail to reject normality (p = " out.hzP ")";
endif;
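The same idea extends to all four methods at once. The following sketch assumes the tests were run with ctl.method = "all", so that every p-value field is populated; the 5% threshold and the variable names pvals and nReject are illustrative.

```gauss
rndseed 42;
X = rndn(100, 3);

// Run all four methods without printing the report
struct mvnTestControl ctl;
ctl = mvnTestControlCreate();
ctl.method = "all";
ctl.output = 0;

out = mvnTest(X, ctl);

// Stack the omnibus p-values: HZ, Mardia combined, Doornik-Hansen, Royston
pvals = out.hzP | out.combP | out.dhP | out.royP;

// Count how many tests reject normality at the 5% level
nReject = sumc(pvals .< 0.05);
print "Tests rejecting at 5%: " nReject;
```

If only a subset of methods is run, the fields for the other methods are missing, so those entries should be filtered out before comparing against the threshold.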
Remarks
The Henze-Zirkler test is recommended as the default because it has good power against a wide range of alternatives and is affine invariant.
The Henze-Zirkler test accepts at most 5000 observations because its memory requirements grow as O(N^2).
The Royston test requires 4 <= N <= 2000 observations.
The Doornik-Hansen test requires N >= 8 observations.
All tests require at least 2 variables (K >= 2).
Fields in the output structure are set to missing (.) for methods not run.
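The size limits listed above can be checked before choosing a method. This is a minimal sketch that simply encodes the documented limits as scalar flags; the flag names (okHz, okRoy, okDh) are hypothetical, and mvnTest may report violations in its own way.

```gauss
rndseed 42;
X = rndn(100, 3);

n = rows(X);   // number of observations
k = cols(X);   // number of variables

// Documented requirements for each method
okHz  = (k >= 2) and (n <= 5000);
okRoy = (k >= 2) and (n >= 4) and (n <= 2000);
okDh  = (k >= 2) and (n >= 8);

print "Henze-Zirkler allowed:  " okHz;
print "Royston allowed:        " okRoy;
print "Doornik-Hansen allowed: " okDh;
```

For the 100 x 3 matrix used here, all three flags evaluate to 1.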
References
Henze, N. & Zirkler, B. (1990). “A Class of Invariant Consistent Tests for Multivariate Normality.” Communications in Statistics - Theory and Methods, 19(10), 3595-3617.
Mardia, K.V. & Foster, K. (1983). “Omnibus Tests of Multinormality Based on Skewness and Kurtosis.” Communications in Statistics - Theory and Methods, 12(2), 207-221.
Doornik, J.A. & Hansen, H. (2008). “An Omnibus Test for Univariate and Multivariate Normality.” Oxford Bulletin of Economics and Statistics, 70, 927-939.
Royston, J.P. (1992). “Approximating the Shapiro-Wilk W-Test for Non-Normality.” Statistics and Computing, 2, 117-119.
See also
Functions mvnTestControlCreate(), shapiroWilk()