
Asian Journal of Mathematics & Statistics

Year: 2012 | Volume: 5 | Issue: 2 | Page No.: 39-49
DOI: 10.3923/ajms.2012.39.49
On Performance of Simultaneous Equation Model Estimators Using Average Parameter Estimates in the Presence of Correlated Random Deviates
S. O. Oyamakin

Abstract: This study examined how six estimation methods for a simultaneous equation model cope with varying degrees of correlation between pairs of random deviates, using the average of parameter estimates as the criterion. A two-equation simultaneous system with an assumed covariance matrix was considered. The model was structured to have mutual correlation between pairs of random deviates, which violates the assumption of mutual independence between such deviates. The pairs of normal deviates were generated under three correlation scenarios: r = 0.0, 0.3 and 0.5. The performance of the estimators was examined at sample sizes N = 20, 25 and 30, each replicated 50 times, and at each correlation level. Under the average of parameter estimates criterion, 2SLS, 3SLS and LIML (jointly denoted 2-3SLIML) were the best estimators, followed by Full Information Maximum Likelihood and then Ordinary Least Squares, in all three cases studied. As the sample size increased from 20 to 25 and 30, 2-3SLIML still performed best (i.e., 2-3SLIML is consistent).


How to cite this article
S. O. Oyamakin, 2012. On Performance of Simultaneous Equation Model Estimators Using Average Parameter Estimates in the Presence of Correlated Random Deviates. Asian Journal of Mathematics & Statistics, 5: 39-49.

Keywords: upper and lower triangular matrix, simulations, estimators, consistency, Monte Carlo and covariance matrix

INTRODUCTION

In a simultaneous equation system, the design of Monte Carlo experiments requires the generation of orthogonal normal deviates, or mutually independent sequences distributed as N (0, 1). These normal deviates are usually transformed so that the error terms are normally distributed as N (0, Σ) without being serially correlated, where Σ is the assumed variance-covariance matrix of the disturbances (Adepoju, 2009b). Monte Carlo simulation is a method of analysis based on repeatedly recreating a chance process on a computer over many replications and directly observing the results. This involves generating data sets and stochastic terms based on assumptions, which are free of the problems of multicollinearity, non-spherical disturbances, measurement error and even specification error. In real-life situations, however, the errors are not completely free of correlation (Johnston and DiNardo, 1984; Anderson and Sawa, 1973).

Simulation has been applied widely. Mendes and Pala (2004) used Monte Carlo methods to study and compare the Type I error rates of four tests under non-normality and heterogeneity of variance. The Cochran test, Brown-Forsythe test, modified Brown-Forsythe test and approximate ANOVA F-test were evaluated for three and six different groups and, after 50,000 simulation trials, the Type I error rates of the four tests were found to be affected by the sample size, the variance ratio, the number of groups and the relationship between sample sizes and group variances (Adepoju, 2009b). Mobasheri et al. (2009) used the AVL FIRE code to simulate a Spark Ignition (SI) engine, compared the simulations with experimental data and reported that Computational Fluid Dynamics (CFD) was able to significantly reduce the number of experimental tests and measurements and to lower development time and costs. Memeledje et al. (2011) applied three simulation configurations, C12, C13 and C23, to show the difficulty of ventilating a room. These difficulties arose from the inability to guide air at will, and they concluded that the velocity level was feeble in the central regions, which are often the occupied parts of a room. To further establish the importance of simulation, Masaka and Khumbula (2007) used simulation to determine the effect of different compaction levels in the nursery on the emergence and biometric characteristics of coffee. El-Messoussi et al. (2007) affirmed that simulation modeling is an important tool for identifying the size of insect pest populations and can help to determine the urgency of action and to evaluate options for management. Numerical simulations of a transient impinging jet issued from a biolistic device have been performed to study momentum and heat transfer characteristics, with an emphasis on the gas properties immediately above a skin target; the impingement heat transfer was found to be very unsteady during operation (Liu, 2006).

Adepoju (2008, 2009a) mentioned a few areas where Monte Carlo methods have been used and applied in solving complex problems; these include operations research, nuclear physics and econometrics, to mention but a few. Several investigations have concerned themselves with Monte Carlo methods, notable among them Wagner (1958), Nagar (1960), Johnston (1984), Anderson and Sawa (1979), Basmann (1963), Cragg (1966), Anderson (1980), Metropolis (1987), Fomby et al. (1988) and Smith (1973).

A Monte Carlo simulation study by Midi et al. (2010) indicated that high leverage points are a source of multicollinearity. Also, Jamjoom (2006) developed some useful formulae relating the estimated and the actual values of the parameters of the Burr type II and Burr type XII probability distribution functions. These formulae simplify the calculations for Monte Carlo simulations executed for the estimation of some reference statistics.

This study therefore examined how the estimators of a two-equation simultaneous model perform under varying degrees of correlation between pairs of normal deviates.

Hence, the objectives are to:

Determine the performance of the parameter estimates across upper and lower triangular matrices
Rank simultaneous estimators to know the best estimator for serious consideration as an estimation procedure for structural parameters
Determine the performance of simultaneous estimation methods as the sample size varies
Know the effect of the varying correlation coefficients among the random normal deviates in simultaneous equation techniques

MATERIALS AND METHODS

In a simultaneous equation model, it is important to study the small-sample properties of the various estimators, since one has to work with finite samples. It is impossible to obtain real-world samples in which the exogenous variables are held constant. Hence, to judge the small-sample performance of alternative estimators of structural parameters, we must abstract from all influences not directly related to such estimators. Real-world data are therefore inappropriate, and artificial models are used to generate artificial data.

The basic theory of random number generation with computers provides a simple example of Monte Carlo simulation, which is used here to study the average parameter estimates computed from sample data. In other words, the estimators are test driven to see how different recipes perform under different circumstances. The procedure is simple: in each case, an artificial environment is set up in which the values of the important parameters and the nature of the chance process are specified; the computer then runs the chance process repeatedly; finally, the computer displays the results of the experiment.

Monte Carlo methods comprise that branch of experimental mathematics which is concerned with the use of random normal deviates. The random deviates are generated to be independent and normally distributed with zero mean and unit standard deviation. The random numbers and deviates thus generated are subsequently transformed to have the prescribed characteristics that are of interest to the investigator.

The general framework of the study: Consider a two-equation model in which both equations are exactly identified, with pairs of random deviates:

(1)

(2)

The pairs of random deviates have the following levels of correlation:

Case I: No correlation between the random deviates, r(ε1, ε2) = 0
Case II: 0.3 correlation level between the random deviates, r(ε1, ε2) = 0.3
Case III: 0.5 correlation level between the random deviates, r(ε1, ε2) = 0.5

Simultaneous Equation Models (SEM): as the name makes clear, the heart of this class of models lies in a data generation process in which more than one equation interact to produce the observed data.

Unlike the single-equation model in which a dependent (y) variable is a function of independent (x) variables, other y variables are among the independent variables in each SEM equation. The y variables in the system are jointly (or simultaneously) determined by the equations in the system.

Assume the following two structural equations:

These equations can be rewritten as follows:

The two equations above are exactly identified.

The reduced form model is derived as:

where π = β⁻¹Γ.
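As a hedged sketch in generic matrix notation, the relation between π, β and Γ given above can be traced as follows. The stacked vectors y_t (endogenous variables), x_t (predetermined variables), u_t (structural disturbances) and v_t (reduced-form disturbances) are illustrative symbols assumed here, not taken from the source equations:

```latex
\begin{aligned}
\beta\, y_t &= \Gamma\, x_t + u_t, \\
y_t &= \beta^{-1}\Gamma\, x_t + \beta^{-1} u_t \;=\; \pi\, x_t + v_t,
\qquad \pi = \beta^{-1}\Gamma, \quad v_t = \beta^{-1} u_t .
\end{aligned}
```

The second line requires β to be non-singular, which is what allows the endogenous variables to be solved for in terms of the predetermined variables and the disturbances.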

And by extension the following endogenous equations were obtained:

Generation of Monte Carlo data: Monte Carlo simulation was adopted to understand the properties of different statistics computed from sample data. In other words, the estimators were test driven to see how different recipes perform under different circumstances. The procedure is simple: in each case, an artificial environment was set up in which the values of the important parameters and the nature of the chance process were specified; the computer then ran the chance process repeatedly until the final results of the experiment were displayed.

The main task was the generation of stochastic dependent (endogenous) variables Yit (i = 1, 2; t = 1, ..., T), which were subsequently used in estimating the parameters of the model.

In achieving this, the following had to be assumed:

Values of the predetermined variables X1t, X2t and X3t (t = 1, ..., T)
Values of the parameters β12, β21, γ11, γ12 and γ32
Values of the elements of Ω

The simulation of the error term Uit (i = 1, 2; t = 1, ..., T) was another step in generating the stochastic dependent variables. In setting up the Monte Carlo experiment, the following steps were followed:

The sample size N was specified as N = 20, 25, 30
Numerical values were assigned arbitrarily to each of the structural parameters as follows:

β12 = 1.5, β21 = 1.8, γ11 = 1.5, γ11 = 1.5, γ12 = 0.5, γ32 = 2.0 for all cases.

The covariance matrix of the disturbances was specified arbitrarily as follows:

Values of the exogenous variables Xit (i = 1, 2, 3; t = 1, ..., T) were generated with the standard random number generator, with values obtained from a uniform distribution with mean 0 and standard deviation 1 (Kmenta, 1971).

Generation of Random Disturbance Term, U: A three-stage process was employed to generate the random disturbance terms. In the first stage, independent series of normal deviates of the required length (N = 20, 25, 30) were generated. In the second stage, these series were standardized to have mean zero and variance 1. Lastly, the random disturbance terms were generated assuming three degrees of correlation between the pairs of random deviates.

Case I: No correlation between the random deviates, r(ε1, ε2) = 0
Case II: 0.3 correlation level between the random deviates, r(ε1, ε2) = 0.3
Case III: 0.5 correlation level between the random deviates, r(ε1, ε2) = 0.5
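The three-stage process and the three correlation cases can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually coded for the study: the mixing rule in stage 3, the seed and the function names `standardize`, `correlated_deviates` and `sample_corr` are all hypothetical choices made here.

```python
import math
import random


def standardize(series):
    """Rescale a series to sample mean 0 and sample variance 1."""
    n = len(series)
    mean = sum(series) / n
    var = sum((v - mean) ** 2 for v in series) / (n - 1)
    sd = math.sqrt(var)
    return [(v - mean) / sd for v in series]


def correlated_deviates(n, r, rng):
    """Return two series of length n with correlation approximately r."""
    # Stage 1: two independent series of normal deviates.
    z1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    z2 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Stage 2: standardize each series.
    e1 = standardize(z1)
    w = standardize(z2)
    # Stage 3 (assumed mixing rule): e2 = r*e1 + sqrt(1 - r^2)*w.
    e2 = [r * a + math.sqrt(1.0 - r * r) * b for a, b in zip(e1, w)]
    return e1, e2


def sample_corr(x, y):
    """Pearson correlation of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(12345)  # arbitrary seed, for reproducibility only
for r in (0.0, 0.3, 0.5):   # Cases I, II and III
    e1, e2 = correlated_deviates(30, r, rng)
    print(f"target r = {r:.1f}, sample r = {sample_corr(e1, e2):.3f}")
```

With samples as small as N = 20 to 30, the realized sample correlation can differ noticeably from the target value, which is one reason each design point is replicated many times.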

The sample sizes considered for each scenario are N = 20, 25 and 30. The pairs of random normal deviates based on these sample sizes were generated, each replicated 50 times. The deviates were then standardized and appropriately transformed to have the specific variance-covariance matrix Σ assumed in the model. Numerical values were generated for the exogenous variables of the model as described above.

The selected pairs (ε1t, ε2t) are then transformed to be distributed as N (0, Σ), where Σ = Cov (Ut U’t) = Ω ⊗ IT and the elements of Ω are decomposed by a non-singular matrix ρ such that:

Recall that V = β⁻¹U:

According to Nagar (1959, 1960), M independent series of standard normal deviates of length N can be transformed into M series of random normal variables with mean 0 and a predetermined covariance matrix. In this model, M = 2, i.e., U1t and U2t, if the covariance matrix is:

where, var (U1) = σ11, var (U2) = σ22 and cov (U1 U2) = σ12 considering both upper and lower triangular matrices. Let upper triangular matrix be given by:

and lower triangular matrix as:

Then:

The pair of standard deviates can be transformed into a pair of random normal variables with mean zero, variances σ11 and σ22 and covariance σ12 by using:

to obtain a pair of random disturbances for the upper triangular matrix:

where, t = 1, 2, ..., T

Similarly, an alternative solution can be obtained for the lower triangular matrix:
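As a hedged illustration of the transformation described above, the following sketch computes an upper and a lower triangular factor P of a 2 × 2 covariance matrix with entries σ11, σ22 and σ12, such that P P' reproduces the matrix, and then applies U = P ε to a pair of standard deviates. The closed-form factors are standard results for the 2 × 2 case; the numerical values and function names are hypothetical and are not the values used in the study.

```python
import math


def upper_factor(s11, s22, s12):
    """Upper triangular P with P P' = [[s11, s12], [s12, s22]]."""
    c = math.sqrt(s22)
    b = s12 / c
    a = math.sqrt(s11 - b * b)
    return [[a, b], [0.0, c]]


def lower_factor(s11, s22, s12):
    """Lower triangular P (the Cholesky factor) with the same product P P'."""
    a = math.sqrt(s11)
    b = s12 / a
    c = math.sqrt(s22 - b * b)
    return [[a, 0.0], [b, c]]


def times_transpose(p):
    """Return the 2x2 product P P'."""
    return [[sum(p[i][k] * p[j][k] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transform(p, e1, e2):
    """Map standard deviates (e1, e2) to disturbances U = P e, term by term."""
    u1 = [p[0][0] * a + p[0][1] * b for a, b in zip(e1, e2)]
    u2 = [p[1][0] * a + p[1][1] * b for a, b in zip(e1, e2)]
    return u1, u2


# Hypothetical values: variances 4 and 9 with correlation 0.5, so s12 = 0.5 * 2 * 3.
s11, s22, s12 = 4.0, 9.0, 3.0
for name, factor in (("upper", upper_factor), ("lower", lower_factor)):
    p = factor(s11, s22, s12)
    print(name, p, times_transpose(p))
```

Both factors reproduce the same covariance matrix, but they weight the two standard deviates differently, so a given pair (ε1t, ε2t) yields different disturbance series under the two factorizations.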

Generation of endogenous variables: With the numerical values already assigned to the structural parameters, we have all the values required for the generation of the endogenous variables. Considering the upper and lower triangular matrix Ut1, Ut2 defined as:

And lower triangular matrix U’1t, U’2t:

Solving for Yt1 and Yt2 using the upper triangular matrix yielded the following:



Solving for Yt1 and Yt2 using the lower triangular matrix yielded the following:

where, Y1t and Y2t are Equation 1 and 2, respectively.
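The step of solving the two structural equations jointly for the endogenous variables can be sketched as below. The equation layout is an assumption made only for illustration: y1 = b12·y2 + g11·x1 + g12·x2 + u1 and y2 = b21·y1 + g21·x1 + g23·x3 + u2. The names `b12`, `b21`, `g11`, `g12`, `g21`, `g23`, the coefficient values, the uniform range for the exogenous variables and the independent disturbances are all hypothetical placeholders.

```python
import random


def solve_endogenous(b12, b21, a1, a2):
    """Solve y1 = b12*y2 + a1 and y2 = b21*y1 + a2 for (y1, y2)."""
    det = 1.0 - b12 * b21
    if det == 0.0:
        raise ValueError("no unique solution when b12 * b21 == 1")
    y1 = (a1 + b12 * a2) / det
    y2 = (b21 * a1 + a2) / det
    return y1, y2


def generate_sample(n, params, rng):
    """Generate n observations of (y1, y2) from the assumed structural form."""
    rows = []
    for _ in range(n):
        # Placeholder exogenous values and independent disturbances.
        x1, x2, x3 = (rng.uniform(-1.0, 1.0) for _ in range(3))
        u1, u2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        # a1 and a2 collect everything in each equation except the other y.
        a1 = params["g11"] * x1 + params["g12"] * x2 + u1
        a2 = params["g21"] * x1 + params["g23"] * x3 + u2
        y1, y2 = solve_endogenous(params["b12"], params["b21"], a1, a2)
        rows.append({"y1": y1, "y2": y2, "a1": a1, "a2": a2})
    return rows


# Hypothetical coefficient values, chosen only so that b12 * b21 != 1.
PARAMS = {"b12": 1.5, "b21": 1.8, "g11": 1.5, "g12": 0.5, "g21": 1.0, "g23": 2.0}
sample = generate_sample(20, PARAMS, random.Random(2012))
print(sample[0])
```

In an actual experiment the disturbances passed in would be the transformed pairs obtained from the upper or lower triangular factor rather than the independent draws used in this placeholder.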

SIMULATION RESULTS

In theory, and as confirmed by Johnston (1984), when an equation is just identified, the parameter estimates obtained by 2SLS, 3SLS and LIML should be identical. The results obtained in this study show that the 2SLS, 3SLS and LIML estimators yielded virtually identical results, while OLS, ILS and FIML yielded results that are clearly different from those estimators.

Since 2SLS, 3SLS and LIML gave the same results, they are jointly denoted 2-3SLIML.

A careful study reveals that in Case I, 2-3SLIML performed best, having the closest values to the assumed values in most cases (22 cases, to be precise), followed by FIML in 8 cases and OLS in 5 cases; ILS was not best in any case. Also, as the sample size increases from 20 to 25 and to 30, the estimates get closer to the true values of the parameters in about 72% of the cases across the upper and lower triangular matrices. For Eq. 1, the estimates get better from the lower triangular matrices to the upper triangular matrices.

Case II revealed that as the sample size increases, the estimates obtained by 2-3SLIML are better in most cases than those of the remaining estimators, which did not show any clear pattern. For both P1 and P2, comparing Case I, Case II and Case III across the lower and upper triangular matrices, the performance of the estimators under Case I was better than under Case II and Case III.

Case I and Case III revealed that as the sample size increases from 20 to 25 and to 30, the estimates get closer to the true values of the parameters across the upper and lower triangular matrices (i.e., consistency). For Eq. 1, the estimates get better from the lower triangular matrices to the upper triangular matrices.
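The average of parameter estimates criterion used in these comparisons can be sketched as follows: for each estimator, average each parameter's estimates over the replications, then credit the estimator whose average lies closest to the assumed value. The function names and every number below are hypothetical and serve only to show the mechanics; they are not results from the study.

```python
def average_estimates(replications):
    """Average each parameter's estimates over the replications.

    replications: list of dicts mapping parameter name -> estimate.
    """
    names = replications[0].keys()
    return {k: sum(rep[k] for rep in replications) / len(replications)
            for k in names}


def best_by_parameter(results, true_values):
    """For each parameter, name the estimator whose average is closest to the truth."""
    averages = {est: average_estimates(reps) for est, reps in results.items()}
    best = {}
    for param, truth in true_values.items():
        best[param] = min(averages,
                          key=lambda est: abs(averages[est][param] - truth))
    return best


# Hypothetical estimates from two replications of two estimators.
results = {
    "OLS": [{"b21": 0.90, "g11": 1.40}, {"b21": 1.00, "g11": 1.50}],
    "2SLS": [{"b21": 1.70, "g11": 1.90}, {"b21": 2.00, "g11": 1.70}],
}
true_values = {"b21": 1.8, "g11": 1.5}  # assumed "true" parameter values
print(best_by_parameter(results, true_values))
```

Counting how often each estimator appears as the best across parameters, sample sizes and correlation cases gives tallies of the kind reported in the text.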

As an illustration, for OLS over the three magnitudes of the correlation coefficient the estimates of β21 fell consistently for sample sizes N = 20, 25 and 30, i.e., column wise comparison for the six estimates.

A comparison of the three entries in each row of Table 1 shows that the estimates rose and fell in CASE 2 and rose consistently in both CASE 1 and CASE 3. Also, along the columns the estimates fell consistently at the three cases of the correlation coefficient at the sample sizes N = 20, 25 and 30.

The best OLS estimates of β21, γ11 and γ21 in Eq. 1 are, respectively, 0.92455 (CASE 1), 0.9256 (CASE 1) and 0.9286 (CASE 1) for β21; 0.0077 (CASE 2), 0.0487 (CASE 2) and 0.0323 (CASE 1) for γ11; and 0.0065 (CASE 2), 0.0594 (CASE 3) and 0.0022 (CASE 3) for γ21. This gives the entries 3 (r = 0.0), 0 (r = 0.3) and 0 (r = 0.5) under β21; 1 (r = 0.0), 2 (r = 0.3) and 0 (r = 0.5) under γ11; and 0 (r = 0.0), 1 (r = 0.3) and 2 (r = 0.5) under γ21 in Table 1.

Similarly, for Eq. 2, the best OLS estimates of γ12 are all in case 1, giving the entries 3 (r = 0.0), 0 (r = 0.3) and 0 (r = 0.5). For β12 the entries are 0 (r = 0.0), 1 (r = 0.3, i.e., 1.0757) and 2 (r = 0.5, i.e., 1.0944 and 1.0914) and, finally, for γ32 they are 1 (r = 0.0, i.e., 0.06858), 1 (r = 0.3, i.e., 0.0272) and 1 (r = 0.5, i.e., 0.0955).

Table 1: Performance of parameter estimates at different sample sizes

Table 2: Sensitivity of estimators using average n = 20, 25, 30, r = 50 (P1)

Table 3: Performance of estimators using average of parameter estimate n = 30, r = 50 (P2)

This is repeated for the other three estimators. The results are displayed in Tables 2 and 3 for P1 and P2, respectively. Hence Tables 2 and 3 reflect the sensitivity of the distribution of best estimates to varying correlation coefficients.

Tables 4 and 5 are derived from Tables 2 and 3. Each table contains the correlation-based distribution of estimators that yielded ‘best’ estimates of not less than 50% of the parameters of each equation. Tables 4 and 5 show that CASE 2, where the error term has a 0.3 level of correlation, has the least proportion of ‘best’ estimates and hence few ‘best’ estimators. The most frequent estimators in this case are ILS and 2-3SLS.

Table 4: Correlation-based sample size-free distribution of best estimators n = 20, 25, 30, r = 50 (P1)
Source: Table 2

Table 5: Correlation-based sample size-free distribution of best estimators n = 20, 25, 30, r = 50 (P2)
Source: Table 3

Table 6: Sample and replication-free distribution of best estimates of P1

Table 7: Rank of estimators using level of correlation (P1) for Eq.1 and 2

As shown in Table 6 under P1, when the error terms are not correlated (i.e., r = 0.0), OLS, 2-3SLS and FIML are best for estimating Eq. 1, while OLS and ILS are good in CASE 2 (r = 0.3) and 2-3SLS is best in CASE 3 (r = 0.5). For Eq. 2, 2-3SLS is best in CASE 1, ILS is best in CASE 2 and FIML performed best in CASE 3.

Under P2, the parameters of the first equation are poorly estimated in CASE 2 (r = 0.3), while ILS is best in CASE 1, followed by OLS in CASE 3. It can be said that 2-3SLS performed equally well for this equation when the error terms are positively correlated, i.e., in CASE 3.

For Eq. 2, OLS and ILS are best in CASE 1, 2-3SLS is best in CASE 2 and FIML is best in CASE 3. There is greater scope for estimating Eq. 2 by several estimators in the three cases of the correlation coefficient.

The scope for estimating the parameters of the first equation is more sensitive to the varying correlation between the error terms than that for Eq. 2, and this observation is more obvious for P2 than for P1.

The ranking of the estimators displayed in Tables 7 to 10 shows that the estimators rank differently depending on whether the upper (P1) or lower (P2) triangular matrix is used.

Table 8: Sample and replication-free distribution of best estimates of P2

Table 9: Rank of estimators using level of correlation (P2) for Eq. 1 and Eq. 2

Table 10: Sample and replication-free distribution of best estimates of P1 and P2

Table 11: Rank of estimators using level of correlation (P1 and P2 combined)

The ranking also shows that while ILS ranks high as the best estimator when the error terms have r = 0.0, OLS is best when r = 0.3 and FIML is best when r = 0.5. The ranking of estimators in Table 11, in which P1 and P2 are combined, is partly dominated by the ranking obtained under P2. In that table, ILS ranks high in case 1, 2-3SLS in case 2 and FIML in case 3, where the error terms are positively correlated.

CONCLUSION

The finite-sample property of the estimators used in this work is the average of parameter estimates. Under this criterion, 2-3SLIML are the best estimators, followed by FIML and then OLS, for the three cases studied. Also, as the sample size increases from 20 to 25 and 30, 2-3SLIML still performed best (i.e., 2-3SLIML is consistent). As the sample size increased, the estimates moved closer to the true parameter values in most cases.

REFERENCES

  • Jamjoom, A.A., 2006. Some useful formulae for Monte Carlo simulations relating to Burr type II, XII distributions. J. Applied Sci., 6: 553-558.
  • Adepoju, A.A., 2009. Comparative assessment of simultaneous equation techniques to correlated random deviates. Eur. J. Sci. Res., 28: 253-265.
  • Adepoju, A.A., 2009. Performances of the full information estimators in a two-equation structural model with correlated disturbances. Global J. Pure Applied Sci., 15: 101-107.
  • Adepoju, A.A., 2008. Comparative performance of the limited information technique in a two-equation structural model. Eur. J. Sci. Res., 20: 197-205.
  • Anderson, G.J., 1980. The structure of simultaneous estimation: A comment. J. Economet., 14: 271-276.
  • Anderson, T.W. and T. Sawa, 1973. Distributions of estimates of coefficients of a single equation in a simultaneous system and their asymptotic expansions. Econometrica, 41: 683-714.
  • Anderson, T.W. and T. Sawa, 1979. Evaluation of the distribution function of the two-stage least squares estimate. Econometrica, 47: 163-182.
  • Basmann, R.L., 1963. A note on the exact finite sample frequency functions of generalized classical linear estimators in a leading three-equation case. J. Am. Stat. Assoc., 58: 161-171.
  • Cragg, J.G., 1966. On the sensitivity of simultaneous-equations estimators to the stochastic assumptions of the models. J. Am. Stat. Assoc., 61: 136-151.
  • Fomby, T.B., R.C. Hill and S.R. Johnson, 1988. Advanced Econometric Methods. Springer-Verlag, New York, USA., ISBN-10: 0071422803, pp: 472-528.
  • Midi, H., A. Bagheri and A.H.M.R. Imon, 2010. The application of robust multicollinearity diagnostic method based on robust coefficient determination to a non-collinear data. J. Applied Sci., 10: 611-619.
  • Johnston, J., 1984. Econometric Methods. 1st Edn., McGraw-Hill, New York.
  • Johnston, J. and J. DiNardo, 1984. Econometric Methods. 4th Edn., McGraw-Hill International, New York.
  • Kmenta, J., 1971. Elements of Econometrics. Macmillan, New York, ISBN: 0023650605.
  • Nagar, A.L., 1959. The bias and moment matrix of the general k-class estimators of the parameters in simultaneous equations. Econometrica, 27: 575-595.
  • Nagar, A.L., 1960. A Monte Carlo study of alternative simultaneous equation estimators. Econometrica, 28: 573-590.
  • Mendes, M. and A. Pala, 2004. Evaluation of four tests when normality and homogeneity of variance assumptions are violated. J. Applied Sci., 4: 38-42.
  • Metropolis, N., 1987. The beginning of the Monte Carlo method. Los Alamos Sci., 15: 125-130.
  • Smith, V.K., 1973. Monte Carlo Methods: Their Role for Econometrics. DC Heath, Lexington, Mass.
  • Wagner, H.M., 1958. A Monte Carlo study of estimates of simultaneous linear structural equations. Econometrica, 26: 117-133.
  • Mobasheri, R., Y. Fotrosy and S. Jalalifar, 2009. Modeling of a spark ignition engine combustion: A computational and experimental study of combustion process effects on NOx emissions. Asian J. Applied Sci., 2: 318-330.
  • Memeledje, A., M. Djoman, A. Fofana, A. Gbane and A. Sako, 2011. Numerical study of the natural ventilation in house: Cases of three rectangular scale models. Asian J. Applied Sci., 4: 137-154.
  • Masaka, J. and N. Khumbula, 2007. The effect of soil compaction levels on germination and biometric characteristics of coffee (Coffea arabica) seedlings in the nursery. Int. J. Agric. Res., 2: 581-589.
  • El-Messoussi, S., H. Hafid, A. Lahrouni and M. Afif, 2007. Simulation of temperature effect on the population dynamic of the Mediterranean fruit fly Ceratitis capitata (Diptera: Tephritidae). J. Agron., 6: 374-377.
  • Liu, Y., 2006. Quantitative evaluation of the skin heat transfer characteristics subjected to a transient high-speed helium gas impingement. J. Biological Sci., 6: 231-237.
