Research Article
Significance Tests of Multiple Regression Coefficients Using Permutation Methods
Department of Statistics, University of Payame Noor, 19395-4697, Islamic Republic of Iran
The first descriptions of permutation tests for linear statistical models, including analysis of variance and regression, can be traced back to the work of Fisher (1935).
Permutation tests were introduced by Fisher (1935); see also Bizhannia et al. (2010). At first they were not widely used because they required time-consuming calculations, but with today's fast and powerful computers a permutation p-value can often be computed more quickly than the critical value of a parametric test can be looked up in a table. Permutation tests belong to the class of distribution-free methods. As we know, to carry out a test of hypothesis parametrically we need, in addition to assumptions such as independence of the error terms, constancy of their variance and random sampling, the assumption that the error terms are normally distributed, while permutation tests do not require this assumption. In fact, to carry out a nonparametric test we rely on a smaller set of simple assumptions, which leaves the researcher more freedom. One of these assumptions, which is in fact the basis of permutation tests, is the exchangeability of the observations, defined as follows.
Definition 1: Let X = (X1, X2, …, Xn) be an n-dimensional random vector with joint density f(x1, x2, …, xn). X is called exchangeable if the joint density of the observations is the same for every permutation of its components; that is, for every permutation (π(1), …, π(n)) of (1, …, n) we have f(xπ(1), xπ(2), …, xπ(n)) = f(x1, x2, …, xn).
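For i.i.d. observations the joint density is a product of identical marginal factors, so reordering the observations leaves it unchanged; this is the simplest case of exchangeability. A minimal numerical illustration (the sample values and the standard normal density are arbitrary choices, not part of the paper's model):

```python
import numpy as np

def std_normal_pdf(x):
    """Standard normal density."""
    return np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)

rng = np.random.default_rng(0)
x = rng.normal(size=5)        # an i.i.d. N(0, 1) sample
perm = rng.permutation(5)     # a random permutation of the indices

# Joint density of an i.i.d. sample = product of the marginal densities.
f_original = np.prod(std_normal_pdf(x))
f_permuted = np.prod(std_normal_pdf(x[perm]))

# Exchangeability: the joint density is invariant under permutation.
print(np.isclose(f_original, f_permuted))  # True
```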
Various permutation strategies have been proposed for testing the nullity of a partial regression coefficient in a multiple regression model (AL-Salihi et al., 2010; Bughio et al., 2002; Edriss et al., 2008; Kandhro et al., 2002; Laghari et al., 2003; Rashid et al., 2002; Serhat Odabas et al., 2007; Tariq et al., 2003; Alam, 2004). The proposed permutation methods for such tests rest on different philosophical bases and have been proposed in different contexts.
In Anderson and Legendre's (1999) view, only four of these strategies are suitable: the Manly method (Manly, 1991), the Kennedy method (Kennedy and Cade, 1996), the Freedman-Lane method (Freedman and Lane, 1983) and the ter Braak method (ter Braak, 1992). In this article, however, we compare only Kennedy's method and Freedman and Lane's method.
KENNEDY'S METHOD
Suppose we have a response variable Y, explanatory variables x1, x2, …, xp, and n observations for the regression. Then we have the equation:

yi = β0 + β1xi1 + β2xi2 + … + βpxip + εi,   i = 1, …, n   (1)

where ε is the error term, with an unspecified distribution F with mean zero and variance σ². We can write Eq. 1 in the following vector-matrix form:

Y = Xβ + ε   (2)

where Y = (y1, …, yn)ᵗ, X is the n × (p+1) design matrix whose i-th row is (1, xi1, …, xip), β = (β0, β1, …, βp)ᵗ and ε = (ε1, …, εn)ᵗ.
We want to test the hypothesis H0: βp = 0 against H1: βp ≠ 0. To perform this test by the permutation method we follow Kennedy's algorithm. We define the hat matrix of the reduced model,

Hi = Xi(XiᵗXi)⁻¹Xiᵗ,

where Xi is the design matrix without the column xp, and then multiply both sides of Eq. 2 by (In − Hi) to obtain the reduced equation:

Ỹ = x̃pβp + ε̃   (3)

where Ỹ = (In − Hi)Y, x̃p = (In − Hi)xp and ε̃ = (In − Hi)ε; the remaining term vanishes because (In − Hi)Xi = 0.
Step 1: | Using least squares on Eq. 3, estimate βp as β̂p = (x̃pᵗx̃p)⁻¹x̃pᵗỸ |
Having estimated βp, we compute the value of the test statistic and call it the reference t.
Step 2: | Permute the values of Ỹ and denote the permuted vector by Ỹ* |
Step 3: | Regress the permuted vector Ỹ* on x̃p according to the model of Eq. 3, estimate βp by least squares as in Step 1, and compute the permutation statistic t* |
Step 4: | Repeat the second and third steps to obtain the t*'s, which form the permutation distribution of t. The p-value is the proportion of permutation statistics whose absolute value is at least as large as the absolute value of the reference t, and on this basis we accept or reject the null hypothesis |
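The four steps above can be sketched as follows. This is a minimal illustration on simulated data, not the authors' implementation; the design, sample size, number of permutations and the degrees-of-freedom convention in the t statistic are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
n, B = 30, 999

# Simulated data under H0: beta_p = 0 (x2 has no effect on y).
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)      # correlated covariates
y = 1.0 + 1.5 * x1 + rng.normal(size=n)

# Covariate matrix X_i (intercept and x1) and its hat matrix H_i.
Xi = np.column_stack([np.ones(n), x1])
Hi = Xi @ np.linalg.inv(Xi.T @ Xi) @ Xi.T
M = np.eye(n) - Hi                      # (I_n - H_i)

def t_stat(ry, rx):
    """t statistic for the slope in a no-intercept regression of ry on rx."""
    b = (rx @ ry) / (rx @ rx)
    resid = ry - b * rx
    s2 = (resid @ resid) / (len(ry) - 1)
    se = np.sqrt(s2 / (rx @ rx))
    return b / se

ry, rx = M @ y, M @ x2                  # residualized response and predictor
t_ref = t_stat(ry, rx)                  # Step 1: reference statistic

t_perm = np.empty(B)
for b in range(B):                      # Steps 2-3: permute ry and refit
    t_perm[b] = t_stat(rng.permutation(ry), rx)

# Step 4: permutation p-value.
p_value = (np.sum(np.abs(t_perm) >= abs(t_ref)) + 1) / (B + 1)
print(0.0 < p_value <= 1.0)             # True
```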
PROBLEM WITH KENNEDY'S METHOD
In permutation methods, permuting the values of y implicitly permutes the values of ε. A question that comes to mind is whether this induced permutation changes the parametric distribution of the errors or not. We first give a definition.
Definition 2 (permutation matrix): A permutation matrix is a square matrix in which each row contains a single one and zeros elsewhere, the position of the one differing from row to row. Such a matrix is denoted by P and has the special property PᵗP = PPᵗ = In.
Now, with respect to the above definition, we analyze the mentioned problem. We know that Var(ε) = σ²In, so, writing the permuted vector of ε as ε* = Pε, we have:

Var(ε*) = P Var(ε) Pᵗ = σ² P In Pᵗ = σ² In.

It is clear that permuting the values of y does not change the distribution of ε. Now we study the same question for the reduced equation, Eq. 3, of which Kennedy made use.
It can easily be proved that the reduced-model error ε̃ = (In − Hi)ε has

Var(ε̃) = σ²(In − Hi),

since (In − Hi) is symmetric and idempotent. Regarding Definition 2, for the permuted vector ε̃* = Pε̃ we can easily prove that:

Var(ε̃*) = P Var(ε̃) Pᵗ = σ² P(In − Hi)Pᵗ.

This shows that the variance of ε̃* depends on P: each permutation multiplies (In − Hi) by a new matrix and thus changes the variance. Therefore, each permutation changes the parameters of the distribution of ε̃, and so, under the null hypothesis H0: βp = 0, the distribution of Ỹ = (In − Hi)Y also changes with each permutation.
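The contrast between the two cases can be checked numerically: for a generic permutation matrix P, PInPᵗ = In, whereas P(In − Hi)Pᵗ differs from (In − Hi). A small sketch, with an arbitrary design matrix chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6

# An arbitrary covariate matrix (intercept plus one covariate) and its hat matrix.
Xi = np.column_stack([np.ones(n), rng.normal(size=n)])
Hi = Xi @ np.linalg.inv(Xi.T @ Xi) @ Xi.T
M = np.eye(n) - Hi                       # Var of the reduced-model error is sigma^2 * M

# A (non-identity) permutation matrix P: rows of I_n reordered cyclically.
perm = np.array([1, 2, 3, 4, 5, 0])
P = np.eye(n)[perm]

# For i.i.d. errors: P I P^t = I, so the distribution is unchanged.
print(np.allclose(P @ np.eye(n) @ P.T, np.eye(n)))   # True

# For the reduced-model errors: P M P^t != M in general.
print(np.allclose(P @ M @ P.T, M))                   # False
```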
MODIFIED KENNEDY'S METHOD
Huh and Jhun (2001) remedied this problem. Noting that the matrix (In − Hi) is idempotent with rank n − p, they added the following steps to Kennedy's algorithm:
Step 1: | Compute the eigenvalues and eigenvectors of the matrix (In − Hi) |
Step 2: | Keep the eigenvectors corresponding to the eigenvalue one; their number equals the rank of (In − Hi) |
Step 3: | Orthogonalize these vectors through the Gram-Schmidt process |
Step 4: | Divide each column of the resulting matrix by its norm so that the columns become orthonormal; denote the new matrix by V1, of dimension n × (n − p) |
Step 5: | Using the spectral decomposition and the fact that the retained eigenvalues equal one, write (In − Hi) = V1V1ᵗ |
Step 6: | After this reduction, multiply both sides of Eq. 3 by the matrix V1ᵗ, so that a new equation results: W = Zβp + η   (4) |
where W = V1ᵗY, Z = V1ᵗxp and η = V1ᵗε, with Var(η) = σ² V1ᵗV1 = σ² In−p.
Now we turn to the distribution of the permuted vector η* = Pη, where P is now an (n − p) × (n − p) permutation matrix. To do so, we again refer to Definition 2:

Var(η*) = P Var(η) Pᵗ = σ² P In−p Pᵗ = σ² In−p = Var(η).

So it is established that the distribution of η is unchanged by permutation, which shows that this new method keeps the distribution of the error terms constant even after the permutation.
After modifying Eq. 3 into the form of Eq. 4, we apply the steps of Kennedy's algorithm to the new resulting equation.
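The construction in Steps 1-6 can be sketched numerically: from the eigendecomposition of (In − Hi) we keep the eigenvectors with eigenvalue one to form V1, then verify that V1ᵗV1 = In−p and V1V1ᵗ = (In − Hi), so the transformed errors η = V1ᵗε have spherical covariance that a permutation leaves unchanged. A small sketch with an arbitrary design (note that NumPy's `eigh` already returns orthonormal eigenvectors, so the Gram-Schmidt and normalization steps are implicit here):

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 8, 2

# Covariate matrix X_i (intercept + one covariate) and (I_n - H_i).
Xi = np.column_stack([np.ones(n), rng.normal(size=n)])
Hi = Xi @ np.linalg.inv(Xi.T @ Xi) @ Xi.T
M = np.eye(n) - Hi                     # idempotent, rank n - p

# Steps 1-2: eigendecomposition; keep the eigenvectors with eigenvalue 1.
vals, vecs = np.linalg.eigh(M)
V1 = vecs[:, np.isclose(vals, 1.0)]    # eigh returns orthonormal columns

# Steps 3-5: orthonormality and the spectral decomposition (I_n - H_i) = V1 V1^t.
print(V1.shape == (n, n - p))                      # True
print(np.allclose(V1.T @ V1, np.eye(n - p)))       # True
print(np.allclose(V1 @ V1.T, M))                   # True

# Var(eta) = sigma^2 I_{n-p}: a permutation matrix leaves it unchanged.
P = np.eye(n - p)[rng.permutation(n - p)]
print(np.allclose(P @ np.eye(n - p) @ P.T, np.eye(n - p)))   # True
```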
SIMULATION
Anderson and Legendre (1999), in an extensive simulation study (which ignored the error in Kennedy's method), showed that the Type I error probability of the Freedman-Lane method is smaller than that of Kennedy's method. Shadrokh and d'Aubigny (2010) and Shadrokh (2011) showed analytically that the Type I error of the Freedman and Lane method is lower than that of Kennedy's approach.
Here, with the aid of a simulation, we intend to check whether the claims made by Anderson and Legendre (1999) and Shadrokh (2011) remain valid after correcting the error in Kennedy's method. To do so, we compute the empirical probability of Type I error for all three permutation methods under the two-predictor regression model y = β0+β1x1+β2x2+ε, varying the four following factors:
• | The sample size: n ∈ {10, 20, 30} |
• | The correlation between the two predictors x1 and x2: ρ ∈ {0.1, 0.9} |
• | The value of the regression coefficient that is not tested: β1 ∈ {0.5, 1.5} |
• | The distribution of the error terms: exponential with parameter 2 |
The simulation was carried out in the software S-PLUS and the results are shown in the following charts.
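A compact version of one cell of such a simulation can be sketched in Python rather than S-PLUS. This is not the authors' code: the sample size, number of permutations and replication count are scaled down for speed, and only the Freedman-Lane method is shown; the error distribution (centred exponential with parameter 2), nominal level and design follow the factors listed above:

```python
import numpy as np

rng = np.random.default_rng(4)
n, B, reps, alpha = 20, 99, 200, 0.05
rho, beta1 = 0.1, 0.5                          # one cell of the simulation design

def t_slope(y, X):
    """t statistic of the last coefficient in an OLS fit of y on X."""
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y
    resid = y - X @ b
    s2 = (resid @ resid) / (len(y) - X.shape[1])
    return b[-1] / np.sqrt(s2 * XtX_inv[-1, -1])

def pval_freedman_lane(y, x1, x2):
    """Freedman-Lane permutation p-value for H0: beta_2 = 0."""
    Xi = np.column_stack([np.ones(n), x1])      # reduced model
    Hi = Xi @ np.linalg.inv(Xi.T @ Xi) @ Xi.T
    fitted, resid = Hi @ y, y - Hi @ y
    Xfull = np.column_stack([np.ones(n), x1, x2])
    t_ref = t_slope(y, Xfull)
    # Permute reduced-model residuals, rebuild y*, refit the full model.
    t_perm = [t_slope(fitted + rng.permutation(resid), Xfull) for _ in range(B)]
    return (np.sum(np.abs(t_perm) >= abs(t_ref)) + 1) / (B + 1)

rejections = 0
for _ in range(reps):
    x1 = rng.normal(size=n)
    x2 = rho * x1 + np.sqrt(1 - rho**2) * rng.normal(size=n)
    eps = rng.exponential(scale=1 / 2, size=n) - 1 / 2   # centred exp(2) errors
    y = beta1 * x1 + eps                                 # H0: beta_2 = 0 is true
    if pval_freedman_lane(y, x1, x2) <= alpha:
        rejections += 1

print("empirical Type I error:", rejections / reps)
```

With the null hypothesis true, the printed rejection rate should fall near the nominal level 0.05.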
In all four cases (shown in the four charts), the probability of Type I error of the Freedman and Lane method is lower than that of Kennedy's method and of the modified Kennedy method (the Huh and Jhun method).
A point worth noting, considering Fig. 1a-b, is that when β1 is small, regardless of whether the correlation is high or low, and once the sample size reaches 30, the Freedman-Lane method and the modified Kennedy method (depicted in the charts as the Huh and Jhun method) both show a convex pattern.
Fig. 1: | Empirical Type I error probabilities for the four simulation settings; (a) β1 = 0.5, ρ = 0.1, (b) β1 = 1.5, ρ = 0.1, (c) β1 = 0.5, ρ = 0.9 and (d) β1 = 1.5, ρ = 0.9 |
The objective of this article was to select the best test of significance for a single partial regression coefficient in a multiple regression model. To this end, Kennedy's method, the modified Kennedy method and the Freedman and Lane method were compared by simulation, and the best method was selected according to the results. The simulation shows that Shadrokh's analytical results and Anderson's simulation results remain valid after the correction of the problem with Kennedy's method, and it can be asserted that when selecting a method for testing one of the coefficients of a multiple regression, the Freedman and Lane method should be the first choice.