mvnTest
Purpose
Tests multivariate normality using one or more methods: Henze-Zirkler (default), Mardia’s skewness/kurtosis, Doornik-Hansen, or Royston.
Format
- out = mvnTest(X[, ctl])
- out = mvnTest(data, formula[, ctl])
- out = mvnTest(filename, formula[, ctl])
- Parameters:
X (NxK matrix) – data matrix with K >= 2 variables.
data (dataframe) – dataframe containing variables.
filename (string) – name of dataset.
formula (string) – formula string specifying variables, e.g., "x1 + x2 + x3".
ctl (struct) –
Optional argument, instance of an mvnTestControl structure containing the following members:
ctl.output
scalar, print results. Default = 1.
- 1:
Print results.
- 0:
Suppress output.
ctl.miss
scalar, missing value handling. Default = 0.
- 0:
Error if missing values present.
- 1:
Listwise deletion of rows with missing values.
ctl.method
string, test method to use. Default = "hz".
- "hz":
Henze-Zirkler test (recommended omnibus test).
- "mardia":
Mardia skewness and kurtosis tests.
- "dh":
Doornik-Hansen test.
- "royston":
Royston test (based on Shapiro-Wilk).
- "all":
Run all four methods.
- Returns:
out (struct) –
instance of an mvnTestOut structure containing the following members:
out.n
scalar, sample size after any listwise deletion.
out.k
scalar, number of variables.
out.skewStat
scalar, Mardia normalized skewness statistic (approx N(0,1)).
out.skewP
scalar, p-value for skewness test.
out.kurtStat
scalar, Mardia normalized kurtosis statistic (approx N(0,1)).
out.kurtP
scalar, p-value for kurtosis test.
out.combStat
scalar, Mardia combined chi-squared statistic.
out.combP
scalar, p-value for combined test (chi-squared, df = 2).
out.hzStat
scalar, Henze-Zirkler test statistic.
out.hzP
scalar, Henze-Zirkler p-value (lognormal approximation).
out.hzBeta
scalar, Henze-Zirkler smoothing parameter.
out.dhStat
scalar, Doornik-Hansen chi-squared statistic.
out.dhP
scalar, Doornik-Hansen p-value.
out.dhDf
scalar, Doornik-Hansen degrees of freedom (2*k).
out.royStat
scalar, Royston H statistic.
out.royP
scalar, Royston p-value.
out.royDf
scalar, Royston equivalent degrees of freedom.
Examples
Example 1: Basic usage with matrix input
// Generate multivariate normal data
X = rndn(100, 3);
// Test normality using default Henze-Zirkler method
out = mvnTest(X);
Example 2: Using dataframe with formula
// Load data
data = loadd("mydata.csv");
// Test specific variables
out = mvnTest(data, "income + age + education");
Example 3: Run all tests with control structure
// Create control structure
struct mvnTestControl ctl;
ctl = mvnTestControlCreate();
ctl.method = "all";
// Run all four tests on the matrix X from Example 1
out = mvnTest(X, ctl);
// Check individual results
if out.hzP < 0.05;
print "Henze-Zirkler rejects normality";
endif;
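Example 4: Listwise deletion of missing values

The following is a minimal sketch of the ctl.miss option described under Parameters. The seed, the simulated data, and the position of the missing value are illustrative assumptions. The explicit declaration of out uses the mvnTestOut structure name given under Returns; the examples above omit that declaration, so it may not be required.

```gauss
// Simulated data with one missing value (illustrative only)
rndseed 4235;
X = rndn(100, 3);
X[5, 2] = miss(0, 0);    // miss(0, 0) returns a scalar missing value

// Request listwise deletion instead of an error on missing values
struct mvnTestControl ctl;
ctl = mvnTestControlCreate();
ctl.miss = 1;

// Assumption: explicit declaration of the output structure
struct mvnTestOut out;
out = mvnTest(X, ctl);

// out.n is the sample size after rows with missing values are dropped
print "Rows used: " out.n;
```

With the default ctl.miss = 0, the same call is documented to raise an error because X contains a missing value.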
Remarks
The Henze-Zirkler test is recommended as the default because it has good power against a wide range of alternatives and is affine invariant.
The Henze-Zirkler test is limited to 5000 observations because its memory requirements grow as O(N^2).
The Royston test requires 4 <= N <= 2000 observations.
The Doornik-Hansen test requires N >= 8 observations.
All tests require at least 2 variables (K >= 2).
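The limits above can be checked before calling mvnTest. The sketch below restates them as flags and runs the Royston test only when its limits are met. The variable names hzOk, royOk, and dhOk are hypothetical helpers, not part of the library, and the simulated data is for illustration only.

```gauss
// Illustrative data; replace with your own matrix
rndseed 777;
X = rndn(50, 3);

n = rows(X);
k = cols(X);

// Limits restated from the remarks above (hypothetical helper flags)
hzOk  = (k >= 2) and (n <= 5000);
royOk = (k >= 2) and (n >= 4) and (n <= 2000);
dhOk  = (k >= 2) and (n >= 8);

struct mvnTestControl ctl;
ctl = mvnTestControlCreate();

// Run the Royston test only when its sample-size limits are met
if royOk;
    ctl.method = "royston";
    out = mvnTest(X, ctl);
else;
    print "Royston test skipped: requires 4 <= N <= 2000 and K >= 2";
endif;
```

The same pattern applies to the other flags, for example setting ctl.method = "all" only when hzOk, royOk, and dhOk are all true.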
Fields in the output structure are set to missing (.) for methods not run.
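As a rough sketch of how to rely on this behavior, the code below runs only the Mardia tests and uses ismiss to detect which fields were filled in. The data and the choice of method are illustrative assumptions.

```gauss
// Illustrative data
rndseed 2024;
X = rndn(100, 3);

// Run only the Mardia tests and suppress printed output
struct mvnTestControl ctl;
ctl = mvnTestControlCreate();
ctl.method = "mardia";
ctl.output = 0;

out = mvnTest(X, ctl);

// Fields for methods that were not run are documented to be missing
if ismiss(out.hzP);
    print "Henze-Zirkler test was not run";
endif;

// Fields for the requested method should hold numeric results
if not ismiss(out.skewP);
    print "Mardia skewness p-value: " out.skewP;
endif;
```

Checking for missing values before comparing a p-value to a threshold avoids misreading a test that was never run.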
References
Henze, N. & Zirkler, B. (1990). “A Class of Invariant Consistent Tests for Multivariate Normality.” Communications in Statistics - Theory and Methods, 19(10), 3595-3617.
Mardia, K.V. & Foster, K. (1983). “Omnibus Tests of Multinormality Based on Skewness and Kurtosis.” Communications in Statistics - Theory and Methods, 12(2), 207-221.
Doornik, J.A. & Hansen, H. (2008). “An Omnibus Test for Univariate and Multivariate Normality.” Oxford Bulletin of Economics and Statistics, 70, 927-939.
Royston, J.P. (1992). “Approximating the Shapiro-Wilk W-Test for Non-Normality.” Statistics and Computing, 2, 117-119.
See also
Functions mvnTestControlCreate(), shapiroWilk()