Journal of Mathematics and Statistics
Year: 2009  |  Volume: 5  |  Issue: 1  |  Page No.: 54 - 62

A Robust Rescaled Moment Test for Normality in Regression

Md. Sohel Rana, Habshah Midi and A.H.M. Rahmatullah Imon

Abstract: Problem statement: Most statistical procedures depend heavily on the normality assumption of the observations. In regression, we assume that the random disturbances are normally distributed. Since the disturbances are unobserved, normality tests are done on the regression residuals. But it is now evident that normality tests on residuals suffer from superimposed normality and often possess very poor power. Approach: This study showed that normality tests suffer a huge setback in the presence of outliers. We proposed a new robust omnibus test, based on rescaled moments and coefficients of skewness and kurtosis of residuals, that we call the robust rescaled moment test. Results: Numerical examples and Monte Carlo simulations showed that the proposed test performs better than the existing tests for normality in the presence of outliers. Conclusion/Recommendation: We recommend using our proposed omnibus test instead of the existing tests for checking the normality of regression residuals.



Consider the linear regression model:

yi = xiT β + ∈i,  i = 1, 2, …, n

where:

yi = The i-th observed response
xi = A p×1 vector of predictors
β = A p×1 vector of unknown finite parameters
∈i = Uncorrelated random errors with mean 0 and variance σ2

Writing Y = (y1, y2, …, yn)T, X = (x1, x2, …, xn)T and ∈ = (∈1, ∈2, …, ∈n)T, the model is:

Y = X β+∈

where, E(∈) = 0, V(∈) = σ2I and I is an identity matrix of order n.
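As a concrete illustration, the model can be simulated in a few lines of Python (the sample size, β and the design are invented for this sketch):

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulate n observations from Y = X*beta + e with p = 2 parameters
# (an intercept and one slope); the values are illustrative only.
n = 30
beta = np.array([1.0, 2.0])
X = np.column_stack([np.ones(n), rng.uniform(0.0, 10.0, n)])
e = rng.standard_normal(n)        # disturbances with mean 0, variance 1
Y = X @ beta + e
```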

Now we establish general expressions for the moments and the coefficients of skewness and kurtosis of the true errors, assuming only that the errors are uncorrelated, identically distributed with zero mean and that their first four moments exist. Let us define the k-th moment about the origin of the i-th error by:

μk = E(∈i^k),  k = 1, 2, …

As E(∈) = 0 on the null model, we need not distinguish here between moments about the origin and those about the mean. Simple forms of the coefficients of skewness and kurtosis are given by:

√β1 = μ3/μ2^(3/2) and β2 = μ4/μ2^2
In practice, population moments are estimated by sample moments. We define the k-th sample moment by:

mk = (1/n) Σi ∈i^k

and hence the raw coefficients of skewness and kurtosis can be defined as:

S = m3/m2^(3/2) and K = m4/m2^2
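In code, the sample moments and the raw coefficients S and K are straightforward (a minimal sketch; the function names are ours):

```python
import numpy as np

def sample_moment(e, k):
    """k-th sample moment m_k = (1/n) * sum(e_i^k)."""
    return np.mean(np.asarray(e, dtype=float) ** k)

def skew_kurt(e):
    """Raw coefficients of skewness S = m3/m2^(3/2) and kurtosis K = m4/m2^2."""
    m2, m3, m4 = (sample_moment(e, k) for k in (2, 3, 4))
    return m3 / m2 ** 1.5, m4 / m2 ** 2
```

For a large standard normal sample, S is close to 0 and K is close to 3.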
Much work has been done on producing omnibus tests for normality, combining S and K in one test statistic, so that a large deviation either of S from 0 or of K from 3 gives a significant result, regardless of which one deviates from its normal value. D'Agostino and Pearson[6] first suggested this kind of test. However, it requires an assumption of independence between S and K which is only asymptotically correct. Bowman and Shenton[2] suggested a normality test, popularly known as the Jarque-Bera test, with the test statistic:

JB = n [S^2/6 + (K − 3)^2/24]

where σ2(S) = 6/n and σ2(K) = 24/n are the asymptotic variances of S and K respectively. Under normality, JB is asymptotically distributed as a chi-square distribution with 2 degrees of freedom.
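The statistic is easily computed from the sample moments (a sketch; the function name is ours):

```python
import numpy as np

def jarque_bera(e):
    """Jarque-Bera omnibus statistic JB = n*(S^2/6 + (K-3)^2/24)."""
    e = np.asarray(e, dtype=float)
    n = e.size
    m2, m3, m4 = (np.mean(e ** k) for k in (2, 3, 4))
    S = m3 / m2 ** 1.5
    K = m4 / m2 ** 2
    return n * (S ** 2 / 6.0 + (K - 3.0) ** 2 / 24.0)
```

Normality is rejected at the 5% level when JB exceeds the chi-square(2) critical value, about 5.99.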

Gel and Gastwirth[8] give a robust version of the Jarque-Bera test, using a robust estimate of spread, which is less influenced by outliers, in the denominators of the sample estimates of skewness and kurtosis. They consider the average absolute deviation from the sample median (MAAD) proposed by Gastwirth[7], defined by:

Jn = √(π/2) (1/n) Σi |∈i − M|

where M is the sample median. The robust sample estimates of skewness and kurtosis are m3/Jn^3 and m4/Jn^4, where m3 and m4 are the 3rd and 4th order estimated sample moments respectively, which lead to the development of the Gel and Gastwirth Robust Jarque-Bera (RJB) test statistic:

RJB = (n/B1)(m3/Jn^3)^2 + (n/B2)(m4/Jn^4 − 3)^2
To obtain the constants B1 and B2, we need the finite-sample expressions for the variances of these robust estimates for a sample of size n, which can be calculated as suggested by Geary[10]. However, such calculations are quite tedious and are not of practical use, since the convergence of the kurtosis estimator to its asymptotic normal distribution is very slow. We obtain B1 and B2 from the Monte Carlo simulation results given by Gel and Gastwirth[8]. In particular, if one desires to preserve the nominal level of 0.05, they recommend B1 = 6 and B2 = 64.
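The RJB statistic with the recommended constants can be sketched as follows (function name is ours):

```python
import numpy as np

def rjb(e, B1=6.0, B2=64.0):
    """Robust Jarque-Bera statistic of Gel and Gastwirth: the spread in the
    denominators is the MAAD estimate Jn = sqrt(pi/2)*mean(|e_i - median(e)|)
    instead of a power of the sample standard deviation."""
    e = np.asarray(e, dtype=float)
    n = e.size
    Jn = np.sqrt(np.pi / 2.0) * np.mean(np.abs(e - np.median(e)))
    m3, m4 = np.mean(e ** 3), np.mean(e ** 4)
    return (n / B1) * (m3 / Jn ** 3) ** 2 + (n / B2) * (m4 / Jn ** 4 - 3.0) ** 2
```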

The above test procedures are designed on the assumption that the value of each observation is known, so that we can easily compute the value of the test statistic and come to a conclusion. It has been common practice over the years to use the Ordinary Least Squares (OLS) residuals as substitutes for the true errors, but there is evidence that residuals have very different skewness and kurtosis than their corresponding true errors, and hence normality tests on OLS residuals may perform poorly unless the test statistics are modified.

Let us assume that X is an n×p matrix of full column rank p. The OLS estimator of β is:

β̂ = (XT X)−1 XT Y

The OLS residuals are given by:

∈̂i = yi − xiT β̂,  i = 1, 2, …, n

In matrix notation, the residual vector is:

∈̂ = Y − X β̂ = (I − H) Y, where H = X (XT X)−1 XT

The residual vector can also be expressed in terms of the unobservable errors as:

∈̂ = (I − H) ∈
We shall follow Chatterjee and Hadi[3] by referring to I − H as the residual hat matrix. The elements hij of the hat matrix H will be termed residual hat elements, which play a very important role in linear regression. The diagonal quantities hii are often referred to as leverage values, which measure how far the input vector xi lies from the rest of the data. Let the k-th moment of the i-th OLS residual be defined as:

μk(∈̂i) = E(∈̂i^k)

for i = 1, …, n, where we note that E(∈̂i) = 0. Using Eq. 12, the i-th residual can be expressed in terms of the true errors as:

∈̂i = ∈i − Σj hij ∈j
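The residuals and leverages can be obtained together from the hat matrix (a sketch; the function name is ours):

```python
import numpy as np

def ols_residuals(X, Y):
    """OLS fit: beta_hat = (X'X)^(-1) X'Y, hat matrix H = X(X'X)^(-1)X',
    residuals e_hat = (I - H)Y, leverages h_ii = diag(H)."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    H = X @ np.linalg.inv(X.T @ X) @ X.T
    e_hat = Y - H @ Y
    return e_hat, np.diag(H)
```

Since trace(H) = p, the leverages always sum to the number of estimated parameters.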
Hence the second, third and fourth order moments of OLS residuals can be expressed[11] as:




The coefficients of skewness and kurtosis of ∈̂i are:


In the case of OLS residuals, their sample mean is zero and, therefore, the k-th sample moment is defined as:

m̂k = (1/n) Σi ∈̂i^k

The sample coefficients of skewness and kurtosis of the OLS residuals are then obtained directly from (5) by substituting ∈̂ for ∈, and thus the coefficients of skewness and kurtosis based on OLS residuals can be defined as:

Ŝ = m̂3/m̂2^(3/2) and K̂ = m̂4/m̂2^2
Imon[13] shows that the sample moments as defined in (19) are biased, and he also suggests unbiased estimates of the second, third and fourth order moments, defined as:




Imon[13] also suggests some simple approximations for the scaling factors, which avoid the necessity of additional computation with the hij's after the regression. He shows that an expression such as Σj hij^k, where k > 1, is often dominated by the term hii^k. When also summing over i, the replacement of Σi(1 − hii)^k by n times the k-th power of the average value of 1 − hii may be considered as a possible approximation. Thus we might approximate:




Substituting the values (24) and (25) in (21-23), Imon[13] suggests Rescaled Moments (RM) of OLS residuals as:




where, c = n/(n-p). Using the above approximations, the rescaled coefficients of skewness and kurtosis become:


and the Rescaled Moment (RM) normality test statistic is defined as:

RM = n [SR^2/6 + (KR − 3)^2/24]

where SR and KR denote the rescaled coefficients of skewness and kurtosis.
Like the JB statistic, the RM statistic follows a chi-square distribution with 2 degrees of freedom. Now, using the average absolute deviation from the sample median (MAAD) as a robust estimate of spread, we can define the Robust Rescaled Moment (RRM) test statistic as:

RRM = (n/B1)(m3*/Jn^3)^2 + (n/B2)(m4*/Jn^4 − 3)^2

where m3* and m4* are the rescaled third and fourth order moments of the OLS residuals. Under the null hypothesis of normality, the RRM test statistic asymptotically follows a chi-square distribution with 2 degrees of freedom. B1 and B2 are computed similarly to the RJB test statistic in Eq. 9, as suggested by Geary[10].
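All of the statistics above (JB, RM, RJB, RRM) are referred to a chi-square distribution with 2 degrees of freedom, whose survival function has the closed form exp(−x/2), so p-values and critical values need no special tables (function name is ours):

```python
import math

def chi2_df2_pvalue(stat):
    """P-value for a chi-square reference distribution with 2 degrees of
    freedom, whose survival function is exactly exp(-x/2)."""
    return math.exp(-stat / 2.0)

# The 5% critical value solves exp(-x/2) = 0.05, i.e. x = -2*ln(0.05) ≈ 5.99.
```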


Numerical examples: We consider a few real-life data sets for testing the normality assumption in the presence of outliers.

Belgian road accident data: Our first example presents the number of road accidents in Belgium recorded between 1975 and 1981, taken from Rousseeuw and Leroy[19] and shown in Table 1. It has been reported by many authors that the Belgian road accident data contain a single outlier (the record for 1979), which should cause non-normality of the residuals. Table 2 shows the power of the normality tests for this data.

Shelf-stocking data: Next we consider the shelf-stocking data given by Montgomery et al.[17].

Table 1: Belgian road accident data

Table 2: Power of normality tests for Belgian road accident data

Table 3: Shelf-stocking data (original and modified)

Table 4: Power of normality tests for the original and modified shelf-stocking data

This data presents the time required for a merchandiser to stock a grocery store shelf with a soft drink product, as well as the number of cases of product stocked. We deliberately change one data point (shown in parentheses) to create an outlier; both the original and modified data are shown in Table 3. Table 4 shows the power of the normality tests for the original and modified data.

Power simulations: We carry out a simulation experiment to compare the performance of the newly proposed RRM test with the existing tests of normality, in particular the Jarque-Bera, the Rescaled Moment (RM) and the Robust Jarque-Bera (RJB) tests. The estimated powers of these tests under various distributions for different sample sizes are shown in Tables 5-7. We consider simulated power for ten different distributions: the normal, the exponential, the t-distribution with 3 and 5 degrees of freedom, the logistic, the Cauchy, the log-normal and the contaminated normal distributions with shifts in location, shifts in scale and outliers.

Table 5: Simulated power of different tests for normality for n = 20

Table 6: Simulated power of different tests for normality for n = 50

Table 7: Simulated power of different tests for normality for n = 100

For the contaminated normal distributions, 90% of the observations are generated from the standard normal distribution and the remaining 10% come from a normal distribution with either a shifted mean, an inflated variance or outliers. All our results are given at the 5% level of significance and are based on 10,000 simulations.
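The design above can be sketched as follows (the sampler, contamination parameters and replication count are illustrative; any of the test statistics can be plugged in as `test_stat`):

```python
import numpy as np

rng = np.random.default_rng(0)

def simulated_power(test_stat, sampler, n=50, reps=1000, crit=5.991):
    """Fraction of replications in which the statistic exceeds the 5%
    chi-square(2) critical value -- the estimated power of the test."""
    return sum(test_stat(sampler(n)) > crit for _ in range(reps)) / reps

def contaminated_normal(n, frac=0.10, scale=5.0):
    """90% standard normal, 10% from a normal with inflated variance."""
    z = rng.standard_normal(n)
    mask = rng.random(n) < frac
    z[mask] = rng.normal(0.0, scale, mask.sum())
    return z
```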


Here we discuss the results obtained in the previous section using the real data sets and the Monte Carlo simulation experiments.

Figure 1 gives a clear picture that the residuals for this data do not follow a normal pattern. We compute the coefficients of skewness (1.764) and kurtosis (4.601) for this data, which are also far from their normal values.

Fig. 1: Normal probability plot of the residuals of the Belgian road accident data

But it is interesting to observe from the results shown in Table 2 that the classical Jarque-Bera test fails to detect the non-normality of the residuals for this data, even at the 10% level of significance. The rescaled moment test suggested by Imon[13] successfully detects non-normality at the 5% level. Both of the robust tests, RJB and RRM, reject normality in a highly significant way, but the RRM possesses higher power than the RJB for this data.

The normal probability plots of the original and modified shelf-stocking data are shown in Fig. 2. It is not clear from these plots whether the residuals are normally distributed or not. We apply the classical and robust methods to check the normality and the results are shown in Table 4.

For the original data, all the methods show that the residuals are normally distributed. Since we have inserted an outlier into the modified data, it is expected that the normality of the residuals will be affected. It is worth mentioning that commonly used outlier detection techniques, such as the Least Median of Squares (LMS), the Least Trimmed Squares (LTS) and the Blocked Adaptive Computationally-Efficient Outlier Nominators (BACON)[4], easily identify one observation as an outlier. Standard theory tells us that normality should break down in the presence of outliers, but it is interesting to observe that both the Jarque-Bera and the RM tests fail to detect non-normality here. The RJB test also fails at the 5% level of significance. The performance of the RRM test, however, is quite satisfactory on this occasion: it detects non-normality even at the 1.6% level of significance.

Fig. 2: Normal probability plot of the residuals of (a): The original and (b): The modified shelf-stocking data

The simulation results in Tables 5-7 show that, when the data come from a normal distribution, the classical normality tests perform well. Both the RJB and the RRM show a slightly inflated size, but the values are not large and tend to decrease as the sample size increases. The classical tests perform poorly, however, when the errors come from heavier-tailed or contaminated normal distributions. The newly proposed RRM test performs best throughout and hence may be considered the most powerful of these tests for normality.

Figure 3 shows the merit of using the RRM test for normality. We observe that this test possesses much higher power than the Jarque-Bera, the RM and the robust Jarque-Bera tests under a variety of error distributions.

Fig. 3: Power of the normality tests for different distributions


In this study we develop a new test for normality in the presence of outliers, called the RRM test, which is a robust modification of the rescaled moment test for normality. The real data sets and Monte Carlo simulations show that the modified rescaled moment test offers substantial improvements over the existing tests and performs superbly in testing the normality assumption of regression residuals.
