
Asian Journal of Mathematics & Statistics

Year: 2011 | Volume: 4 | Issue: 4 | Page No.: 174-180
DOI: 10.3923/ajms.2011.174.180
Effect of Transformation on the Parameter Estimates of a Simple Linear Regression Model: A Case Study of Division of Variables by Constants
O. E. Okereke

Abstract: The ability to forecast future values of a variable accurately and quickly gives organizations, governments and business enterprises the opportunity for appropriate decision and policy making. Accurate predictions require a correctly specified model. Because an estimated model is defined by its parameter estimates, different sets of estimates may give rise to different forecasts. Researchers and experimenters often report large values in standard units such as thousands, millions and billions to save the time required for compilation and computation. In this study, estimates of the parameters of the models obtained by dividing the variables by constants were provided, with emphasis on the effect of the transformation on the parameter estimates. The relationships between the estimates of the parameters of the original model and those involving the transformed variables were derived. It was shown that dividing the independent variable by a constant affected only the estimate of the slope. On the other hand, the estimate of the slope of the original model remained unaffected when both variables were divided by the same constant, whereas the other parameter estimates obtained differed from those of the original model. The theoretically derived estimates were substantiated by empirical data analysis. Moreover, the regression models fitted to the various transformed variables differed from that of the original model. As a result, transformation by dividing the variables by constants affects the parameter estimates as well as the predictability of the model.


How to cite this article
O. E. Okereke, 2011. Effect of Transformation on the Parameter Estimates of a Simple Linear Regression Model: A Case Study of Division of Variables by Constants. Asian Journal of Mathematics & Statistics, 4: 174-180.

Keywords: empirical data, slope, parameters, estimates, Simple linear regression, analysis and transformed variables

INTRODUCTION

Regression models are considered to be veritable tools for describing the functional form of the relationship between variables (Ding, 2006). They also play a key role in the implementation of multivariate tools like principal component analysis (Igwenagu, 2011) and factor analysis (Abdullah and Asngari, 2011). It is customary to estimate the model. With an estimated model, one can predict the value of the dependent variable corresponding to a given value of the independent variable (Sarkar and Midi, 2010).

Regression models are classified into two broad categories, namely linear and non-linear models (Rajarathinam and Parmar, 2011; El-Shhawy, 2008). Linear regression models are those that are linear in the parameters. These include simple linear, multiple linear and polynomial regression models. A simple linear regression model involves one dependent variable and one independent variable. It is specified as:

$$Y_i = \beta_0 + \beta_1 X_i + e_i \tag{1}$$

where Yi is the ith value of the dependent variable, Xi is the ith value of the independent variable, β0 and β1 are the intercept and slope of the regression line and ei is the ith value of the error associated with the prediction of Yi.

The least squares method of estimating the parameters of the model in Eq. 1 is usually preferred to other methods because it yields unbiased estimators (El-Salam, 2011; Ramirez et al., 2002). Olaomi and Ifederu (2008) pointed out that the assumption of no autocorrelation between the error terms is required for parameter estimation and inference in ordinary least squares regression. The least squares estimates b1 and b0 of β1 and β0, respectively, are given by:

$$b_1 = \frac{\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{\sum_{i=1}^{n}(X_i-\bar{X})^2} \tag{2}$$

and

$$b_0 = \bar{Y} - b_1\bar{X} \tag{3}$$
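The least squares computation referred to as Eq. 2 and 3 can be sketched in a few lines of Python. This is a minimal illustration rather than the author's own procedure; the function name `least_squares` and the four data points (chosen to lie exactly on the line y = 1 + 2x) are assumptions made for the example.

```python
def least_squares(x, y):
    """Return (b0, b1) for the line y = b0 + b1 * x fitted by least squares."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Slope (Eq. 2): sum of cross-deviations over sum of squared x-deviations
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    if s_xx == 0:
        raise ValueError("x must not be constant")
    b1 = s_xy / s_xx
    # Intercept (Eq. 3): computed from the two sample means and the slope
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical data lying exactly on y = 1 + 2x
b0, b1 = least_squares([1, 2, 3, 4], [3, 5, 7, 9])
print(b0, b1)  # 1.0 2.0
```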

In practice, fitting a regression model to data involving large values can be difficult, and estimating the parameters of the model can be tedious and time consuming. Subtraction of constants from the variables is said to facilitate parameter estimation in regression analysis (Obioma, 2005). There are situations where subtraction of constants may not reduce the time required for the necessary computation. These include cases where the values are multiples of a given constant, in which case division outperforms subtraction. The effect of such a transformation on the parameter estimates of the original model is the main focus of this study.

ESTIMATION OF PARAMETERS OF SIMPLE LINEAR REGRESSION MODELS INVOLVING SOME FUNCTIONS OF THE DEPENDENT AND INDEPENDENT VARIABLES

In this section, the estimates of the parameters of regression models obtained when either one or both of the variables are divided by constants are considered. Emphasis is also laid on the relationships between the estimates of the parameters of the original model and their counterparts obtained when the variables are divided by constants.

Estimation of the parameters of the regression model when the independent variable is divided by a constant: Let α be a non-zero constant and let χ = X/α. Suppose we wish to regress Y on χ. Then the associated regression model is of the form:

$$Y = \beta_{01} + \beta_{11}\chi + \varepsilon \tag{4}$$

The symbols Y, χ, β01, β11 and ε in Eq. 4 stand for the dependent variable, the independent variable, the intercept of the line, the slope of the line and the associated error term, respectively.

The estimates b01 and b11 of β01 and β11, respectively are obtained using Eq. 2 and 3 as follows:

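$$b_{11} = \frac{\sum_{i=1}^{n}(\chi_i-\bar{\chi})(Y_i-\bar{Y})}{\sum_{i=1}^{n}(\chi_i-\bar{\chi})^2} = \frac{\frac{1}{\alpha}\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{\frac{1}{\alpha^2}\sum_{i=1}^{n}(X_i-\bar{X})^2} = \alpha b_1 \tag{5}$$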

and

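$$b_{01} = \bar{Y} - b_{11}\bar{\chi} = \bar{Y} - \alpha b_1\frac{\bar{X}}{\alpha} = \bar{Y} - b_1\bar{X} = b_0 \tag{6}$$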
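The effect of dividing only the independent variable can be checked numerically. The sketch below uses made-up data and a hypothetical helper `least_squares`; it refits the line after dividing X by α = 10 and compares the result with the original estimates, expecting the slope to be multiplied by α and the intercept to stay the same.

```python
import math


def least_squares(x, y):
    """Least squares intercept and slope (b0, b1) of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    return y_bar - b1 * x_bar, b1


# Hypothetical data, used only to illustrate the relationship
x = [120.0, 150.0, 180.0, 210.0, 260.0]
y = [200.0, 240.0, 270.0, 330.0, 390.0]
alpha = 10.0

chi = [xi / alpha for xi in x]       # transformed independent variable
b0, b1 = least_squares(x, y)         # original model
b01, b11 = least_squares(chi, y)     # model with X divided by alpha

slope_scaled = math.isclose(b11, alpha * b1, rel_tol=1e-9)
intercept_same = math.isclose(b01, b0, rel_tol=1e-9)
print(slope_scaled, intercept_same)
```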

Estimates of the model parameters when the dependent variable is divided by a constant: Consider the regression model:

$$\gamma = \beta_{02} + \beta_{12}X + \varepsilon \tag{7}$$

where the symbols γ, X, β02, β12 and ε in Eq. 7 stand for the dependent variable, the independent variable, the intercept of the line, the slope of the line and the associated error term, respectively.

Also, γ = Y/d, where d is a non-zero constant.

Using Eq. 2:

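$$b_{12} = \frac{\sum_{i=1}^{n}(X_i-\bar{X})(\gamma_i-\bar{\gamma})}{\sum_{i=1}^{n}(X_i-\bar{X})^2} = \frac{\frac{1}{d}\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{\sum_{i=1}^{n}(X_i-\bar{X})^2} = \frac{b_1}{d} \tag{8}$$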

Using Eq. 3:

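$$b_{02} = \bar{\gamma} - b_{12}\bar{X} = \frac{\bar{Y}}{d} - \frac{b_1}{d}\bar{X} = \frac{b_0}{d} \tag{9}$$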

Estimates of the parameters of the model when the dependent variable and independent variable are divided by different constants: Let the two variables be divided by different constants, denoted e and f, both non-zero.

Consider the regression model:

(10)

Let b03 and b13 denote the required estimates of the parameters β03 and β13, respectively. Using Eq. 2, we obtain:

(11)

Using Eq. 3:

(12)
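The relationships for this case follow from Eq. 2 and 3 in the same way as in the earlier cases. As a sketch, write c_Y and c_X for the non-zero constants that divide the dependent and independent variables, respectively. These two labels, the primed symbols and the shorthand S_XY and S_XX are introduced here for illustration only, so that nothing is assumed about which of e and f divides which variable:

$$
\begin{aligned}
S_{XY} &= \sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y}), \qquad S_{XX} = \sum_{i=1}^{n}(X_i-\bar{X})^2, \qquad b_1 = \frac{S_{XY}}{S_{XX}},\\
Y'_i &= \frac{Y_i}{c_Y}, \qquad X'_i = \frac{X_i}{c_X},\\
b_{13} &= \frac{\sum_{i=1}^{n}(X'_i-\bar{X}')(Y'_i-\bar{Y}')}{\sum_{i=1}^{n}(X'_i-\bar{X}')^2} = \frac{S_{XY}/(c_X c_Y)}{S_{XX}/c_X^2} = \frac{c_X}{c_Y}\,b_1,\\
b_{03} &= \bar{Y}' - b_{13}\bar{X}' = \frac{\bar{Y}}{c_Y} - \frac{c_X}{c_Y}\,b_1\,\frac{\bar{X}}{c_X} = \frac{\bar{Y}-b_1\bar{X}}{c_Y} = \frac{b_0}{c_Y}.
\end{aligned}
$$

Setting c_Y = 1 or c_X = 1 recovers the cases in which only one variable is divided, and setting c_X = c_Y gives the case in which both variables are divided by the same constant.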

Estimates of the parameters of the model when the dependent and independent variables are divided by the same constant: Consider the regression model:

$$\gamma = \beta_{04} + \beta_{14}\chi + \varepsilon \tag{13}$$

where γ = Y/g, χ = X/g and g is a non-zero constant.

The least squares estimate of the slope β14 is obtained with the help of Eq. 2 as follows:

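$$b_{14} = \frac{\sum_{i=1}^{n}\left(\frac{X_i}{g}-\frac{\bar{X}}{g}\right)\left(\frac{Y_i}{g}-\frac{\bar{Y}}{g}\right)}{\sum_{i=1}^{n}\left(\frac{X_i}{g}-\frac{\bar{X}}{g}\right)^2} = \frac{\frac{1}{g^2}\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{\frac{1}{g^2}\sum_{i=1}^{n}(X_i-\bar{X})^2} = b_1 \tag{14}$$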

The corresponding estimate of the intercept β04 is obtained from Eq. 3 as:

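$$b_{04} = \frac{\bar{Y}}{g} - b_{14}\frac{\bar{X}}{g} = \frac{\bar{Y}-b_1\bar{X}}{g} = \frac{b_0}{g} \tag{15}$$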
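Read in reverse, the relationships in this section allow the original-scale estimates to be recovered from a model fitted to divided variables. The sketch below is one possible way to do this, not a procedure given in the study; `to_original_scale`, its parameters `c_y` and `c_x` (the constants dividing Y and X) and the data are hypothetical.

```python
import math


def least_squares(x, y):
    """Least squares intercept and slope (b0, b1) of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    return y_bar - b1 * x_bar, b1


def to_original_scale(b0_t, b1_t, c_y=1.0, c_x=1.0):
    """Convert estimates fitted on (X / c_x, Y / c_y) to original-scale estimates."""
    if c_y == 0 or c_x == 0:
        raise ValueError("the dividing constants must be non-zero")
    # Transformed intercept is b0 / c_y; transformed slope is b1 * c_x / c_y
    return b0_t * c_y, b1_t * c_y / c_x


# Hypothetical data with large values
x = [12000.0, 15000.0, 18000.0, 21000.0, 26000.0]
y = [20000.0, 24000.0, 27000.0, 33000.0, 39000.0]
b0, b1 = least_squares(x, y)

# (c_y, c_x): X only, Y only, different constants, same constant
recovered = []
for c_y, c_x in [(1.0, 10.0), (100.0, 1.0), (100.0, 10.0), (1000.0, 1000.0)]:
    b0_t, b1_t = least_squares([xi / c_x for xi in x], [yi / c_y for yi in y])
    r0, r1 = to_original_scale(b0_t, b1_t, c_y, c_x)
    recovered.append((r0, r1))
    print(c_y, c_x, math.isclose(r0, b0, rel_tol=1e-9), math.isclose(r1, b1, rel_tol=1e-9))
```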

NUMERICAL ILLUSTRATION

The relationships between the parameter estimates are verified in this section using data on exports (X) and imports (Y), both in billions of dollars, provided by the Bureau of Economic Analysis of the U.S. Department of Commerce for the period 2006-01-01 to 2010-10-01.

The nature of the relationship between X and Y is determined with the help of the scatter diagram in Fig. 1.

It can be deduced from Fig. 1 that there is a linear relationship between imports and exports of goods and services in the U.S. within the given period.

Also, the analysis of variance in Table 1 shows that the simple linear regression of import on export is significant at α = 0.05.

A summary of the regression analysis of the various functions of the dependent variable on their corresponding independent variables is given in Table 2.


Fig. 1: The scatter diagram for determining the nature of the relationship between X and Y

Table 1: ANOVA table for regression of import on export

Table 2: Regression analysis of the transformed dependent variables on their associated independent variables

For simplicity and clarity, the constants 10, 100 and 1000 are used in this numerical illustration. As Table 2 shows, the estimate of the slope is unaffected only when the variables involved in the regression are divided by the same constant. On the other hand, the estimate of the intercept of the original model remains unaffected when only the independent variable is divided by a constant. All the estimates in Table 2 agree with the results obtained in section 2.
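A comparison of this kind can be laid out programmatically. The following sketch uses synthetic figures, not the Bureau of Economic Analysis series, and the pairing of the constants 10, 100 and 1000 with particular cases is an assumption. For each case it prints the estimates obtained by refitting next to those implied by the relationships of section 2.

```python
def least_squares(x, y):
    """Least squares intercept and slope (b0, b1) of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    return y_bar - b1 * x_bar, b1


# Synthetic export (x) and import (y) style values; NOT the data used in the study
x = [1200.0, 1350.0, 1500.0, 1640.0, 1810.0, 1580.0, 1720.0, 1900.0]
y = [1900.0, 2050.0, 2300.0, 2420.0, 2650.0, 2010.0, 2300.0, 2600.0]
b0, b1 = least_squares(x, y)

# (label, constant dividing Y, constant dividing X); the pairing is assumed
cases = [
    ("Y on X (original)", 1.0, 1.0),
    ("Y on X/10", 1.0, 10.0),
    ("Y/100 on X", 100.0, 1.0),
    ("Y/100 on X/10", 100.0, 10.0),
    ("Y/1000 on X/1000", 1000.0, 1000.0),
]

rows = []
print(f"{'model':<20}{'fit b0':>12}{'fit b1':>12}{'derived b0':>12}{'derived b1':>12}")
for label, c_y, c_x in cases:
    fit_b0, fit_b1 = least_squares([xi / c_x for xi in x], [yi / c_y for yi in y])
    # Relationships from section 2: intercept b0 / c_y, slope b1 * c_x / c_y
    derived_b0, derived_b1 = b0 / c_y, b1 * c_x / c_y
    rows.append((label, fit_b0, fit_b1, derived_b0, derived_b1))
    print(f"{label:<20}{fit_b0:>12.5f}{fit_b1:>12.5f}{derived_b0:>12.5f}{derived_b1:>12.5f}")
```

In each row the fitted and derived columns should coincide up to floating point rounding.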

DISCUSSION

A statistical technique for reducing the time required for estimation of parameters in simple linear regression has been examined in this study. Estimates of the parameters of four regression models resulting from the division of the variables by constants were compared with those of the original model. On dividing the independent variable by a constant, it was observed that the estimate of the slope of the resulting model was equal to that of the original model multiplied by the given constant. The estimated intercept remained unaffected by the transformation. Dividing the dependent variable by a constant and regressing the quotient on the independent variable yielded parameter estimates equal to the corresponding estimates in the original model divided by the constant.

The regression model involving the variables divided by different constants was also fitted. The resulting estimate of the slope can be obtained by multiplying that of the original model by the constant by which the independent variable was divided and dividing the product by the constant by which the dependent variable was divided. The associated estimate of the intercept is obtained by dividing that of the original model by the constant by which the dependent variable was divided.

Furthermore, the division of the variables by the same constant did not affect the estimate of the slope of the original model. This agrees with the results in the literature with regard to addition and subtraction of constants from the variables (Steel and Torrie, 1981; Spiegel et al., 2000). The result obtained for this transformation showed that the estimate of the intercept is found by dividing that of the original model by the chosen constant. Since an estimated regression model is a function of its parameter estimates, prediction in simple linear regression can be affected by division of variables by constants.

CONCLUSION

Based on the results obtained in this research, the division of either or both of the variables by different constants in simple linear regression could affect the estimates of the parameters of the original model. Division of the variables by the same constant could also affect the estimate of the intercept of the original model, while the estimate of the intercept would be unaffected by division of the independent variable alone by a constant. Hence, division of either or both of the variables by constants in simple linear regression affects the predictability of the fitted model. It is recommended that analysts using the proposed transformations in regression analysis take the derived relationships into account so as to ensure proper prediction.
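The practical consequence for prediction can be illustrated with a short sketch. The data, the constants `c_y` and `c_x` and the new value `x_new` below are hypothetical. The example contrasts a naive prediction, in which an original-scale value is plugged into the transformed model, with a prediction that transforms the input and rescales the output in line with the derived relationships.

```python
def least_squares(x, y):
    """Least squares intercept and slope (b0, b1) of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    return y_bar - b1 * x_bar, b1


# Hypothetical data and constants
x = [120.0, 150.0, 180.0, 210.0, 260.0]
y = [200.0, 240.0, 270.0, 330.0, 390.0]
c_y, c_x = 100.0, 10.0   # constants dividing Y and X

b0, b1 = least_squares(x, y)
b0_t, b1_t = least_squares([xi / c_x for xi in x], [yi / c_y for yi in y])

x_new = 300.0
y_hat = b0 + b1 * x_new                       # prediction from the original model

# Naive use: original-scale input in the transformed model, output read as original scale
y_naive = b0_t + b1_t * x_new

# Consistent use: divide the input by c_x, predict, then multiply the result by c_y
y_back = c_y * (b0_t + b1_t * (x_new / c_x))

print(y_hat, y_naive, y_back)
```

Under these assumptions the consistent prediction matches the original model, while the naive one does not.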

REFERENCES

  • Ding, C.S., 2006. Using regression mixture analysis in educational research. Pract. Assess. Res. Eval., 11: 1-11.


  • El-Shhawy, S.A., 2008. Selection of a NRL-model by re-sampling technique. Asian J. Math. Stat., 1: 109-117.


  • Igwenagu, C.M., 2011. Principal component analysis of global warming with respect to CO2 emission in Nigeria: An exploratory study. Asian J. Math. Stat., 4: 71-80.


  • Abd El-Salam, M.E.F., 2011. An efficient estimation procedure for determining ridge regression parameter. Asian J. Math. Stat., 4: 90-97.


  • Abdullah, L. and H. Asngari, 2011. Factor analysis evidence in describing consumer preferences for a soft drink product in Malaysia. J. Applied Sci., 11: 139-144.


  • Obioma, N.V., 2005. Principles of Statistical Inferences. Peace Publishers Ltd., Owerri, Nigeria, pp: 179


  • Olaomi, J.O. and A. Ifederu, 2008. Understanding estimators of linear regression model with AR(1) error which are correlated with exponential regressor. Asian J. Math. Stat., 1: 14-23.


  • Rajarathinam, A. and R.S. Parmar, 2011. Application of parametric and nonparametric regression models for area, production and productivity trends of castor (Ricinus communis L.) crop. Asian J. Applied Sci., 4: 42-52.


  • Ramirez, O.A., S.K. Misra and J. Nelson, 2002. Estimation of efficient regression models for applied agricultural economics research. Proceedings of the Annual Meeting of Agricultural Economics Association, July 28-31, Long Beach, CA., USA., pp: 1-32.


  • Sarkar, S.K. and H. Midi, 2010. Importance of assessing the model adequacy of binary logistic regression. J. Applied Sci., 10: 479-486.


  • Spiegel, M.R., J.J. Schiller and R.A. Srinivasan, 2000. Schaum's Outline of Theory and Problems of Probability and Statistics. 2nd Edn., Tata McGraw-Hill Publishing Company Limited, New Delhi, India, ISBN-13: 9780071350044, Pages: 408


  • Steel, R.G.D. and H.J. Torrie, 1981. Principles and Procedures of Statistics: A Biometrical Approach. McGraw-Hill International Book Company, London, ISBN-0070609268, Pages: 245
