Regression models are valuable tools for describing the functional
form of the relationship between variables (Ding, 2006).
They also play a key role in the implementation of multivariate tools like principal
component analysis (Igwenagu, 2011) and factor analysis
(Abdullah and Asngari, 2011). It is customary to estimate
the parameters of such a model from sample data. With an estimated model, one can predict the value of the dependent
variable corresponding to a given value of the independent variable (Sarkar
and Midi, 2010).
Regression models are classified into two broad categories, namely linear and
non-linear models (Rajarathinam and Parmar, 2011; El-Shhawy,
2008). Linear regression models are those that are linear in the parameters;
these include the simple linear, multiple linear and polynomial regression models.
A simple linear regression model involves one dependent variable
and one independent variable. It is specified as:

Yi = β0 + β1Xi + ei    (1)

where Yi, β0, β1 and ei denote the ith value of the dependent variable, the intercept and slope of the regression line, and the ith value of the error associated with the prediction of Yi, respectively.
The least squares method of estimating the parameters of the model in Eq.
1 is usually preferred to other methods because it yields unbiased estimators
(El-Salam, 2011; Ramirez et al.,
2002). Olaomi and Ifederu (2008) pointed out that
the assumption of lack of autocorrelation between the error terms is required
for parameter estimation and inference in ordinary least squares regression.
The least squares estimates b1 and b0 of β1 and β0,
respectively, are given by:

b1 = Σ(Xi − X̄)(Yi − Ȳ) / Σ(Xi − X̄)²    (2)

b0 = Ȳ − b1X̄    (3)
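As a quick numerical check, the closed-form least squares estimates can be computed directly. The following minimal Python sketch uses made-up data, and the helper name `ols` is ours, not from the paper:

```python
# Closed-form least squares estimates for simple linear regression.
# Illustrative sketch; the data below are made up, not the paper's.

def ols(x, y):
    """Return (b0, b1): intercept and slope per Eq. 3 and Eq. 2."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    # Eq. 2: b1 = sum (Xi - Xbar)(Yi - Ybar) / sum (Xi - Xbar)^2
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sxy / sxx
    # Eq. 3: b0 = Ybar - b1 * Xbar
    b0 = ybar - b1 * xbar
    return b0, b1

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 4.0, 6.2, 7.9, 10.1]
b0, b1 = ols(x, y)
print(b0, b1)  # slope near 2, intercept near 0
```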
In practice, difficulties often arise when fitting a regression
model to data involving large values, as the estimation of the model parameters
can become tedious and time consuming. Subtraction of constants from the variables
is said to facilitate parameter estimation in regression analysis (Obioma,
2005). There are, however, situations where subtraction may not reduce
the time required for the necessary computation, for example when the values
are multiples of a given constant. In such cases, division outperforms subtraction.
The effect of such transformation on the parameter estimates of the original
model is the main focus of this study.
ESTIMATION OF PARAMETERS OF SIMPLE LINEAR REGRESSION MODELS INVOLVING SOME FUNCTIONS OF THE DEPENDENT AND INDEPENDENT VARIABLES
In this section, the estimates of the parameters of regression models obtained when either one or both of the variables are divided by constants are considered. Emphasis is also laid on the relationships between the estimates of the parameters of the original model and their counterparts obtained when the variables are divided by constants.
Estimation of the parameters of the regression model when the independent variable is divided by a constant: Let α be a constant such that χ = X/α. Suppose we wish to regress Y on χ. Then the associated regression model is of the form:

Y = β01 + β11χ + ε    (4)

The symbols Y, χ, β01, β11 and ε in Eq. 4 stand for the dependent variable, the independent variable, the intercept and slope of the line and the associated error term, respectively.
The estimates b01 and b11 of β01 and β11, respectively, are obtained using Eq. 2 and 3 as follows. Since χi − χ̄ = (Xi − X̄)/α,

b11 = Σ(χi − χ̄)(Yi − Ȳ) / Σ(χi − χ̄)² = αb1

b01 = Ȳ − b11χ̄ = Ȳ − b1X̄ = b0
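This relationship (the slope is multiplied by α while the intercept is unchanged) can be verified numerically. A minimal Python sketch with made-up data; the helper `ols` implements the closed-form estimates of Eq. 2 and 3:

```python
# Check: regressing Y on X/a multiplies the slope by a and
# leaves the intercept unchanged. Data and names are illustrative.

def ols(x, y):
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    b1 = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
          / sum((xi - xbar) ** 2 for xi in x))
    return ybar - b1 * xbar, b1  # (intercept, slope) per Eq. 3 and 2

x = [2.0, 4.0, 6.0, 8.0, 10.0]
y = [5.1, 9.0, 13.2, 16.9, 21.1]
a = 10.0
b0, b1 = ols(x, y)                       # original model
b01, b11 = ols([xi / a for xi in x], y)  # Y regressed on X/a
print(b11 / b1)   # ratio close to a: slope scaled by the constant
print(b01 - b0)   # close to 0: intercept unchanged
```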
Estimates of the model parameters when the dependent variable is divided by a constant: Consider the regression model:

γ = β02 + β12X + ε

where the symbols γ, X, β02, β12 and ε stand for the dependent variable, the independent variable, the intercept and slope of the line and the associated error term, respectively.
Also, γ = Y/d, where d is a non-zero constant.
Using Eq. 2, and noting that γi − γ̄ = (Yi − Ȳ)/d:

b12 = Σ(Xi − X̄)(γi − γ̄) / Σ(Xi − X̄)² = b1/d

Using Eq. 3:

b02 = γ̄ − b12X̄ = (Ȳ − b1X̄)/d = b0/d
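Here both estimates of the original model are divided by the constant. A short Python sketch checks this with made-up data; `ols` is our own helper implementing Eq. 2 and 3:

```python
# Check: regressing Y/d on X divides both the slope and the
# intercept of the original model by d. Illustrative data only.

def ols(x, y):
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    b1 = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
          / sum((xi - xbar) ** 2 for xi in x))
    return ybar - b1 * xbar, b1  # (intercept, slope)

x = [2.0, 4.0, 6.0, 8.0, 10.0]
y = [5.1, 9.0, 13.2, 16.9, 21.1]
d = 100.0
b0, b1 = ols(x, y)                       # original model
b02, b12 = ols(x, [yi / d for yi in y])  # Y/d regressed on X
print(b12 * d - b1)  # close to 0: slope divided by d
print(b02 * d - b0)  # close to 0: intercept divided by d
```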
Estimates of the parameters of the model when the dependent variable and
independent variable are divided by different constants: Let γ = Y/e and χ = X/f, where e and f are non-zero constants. Consider the regression model:

γ = β03 + β13χ + ε

If b03 and b13 denote the required estimates of the parameters β03 and β13, respectively, then using Eq. 2 we obtain:

b13 = Σ(χi − χ̄)(γi − γ̄) / Σ(χi − χ̄)² = (f/e)b1

Using Eq. 3:

b03 = γ̄ − b13χ̄ = (Ȳ − b1X̄)/e = b0/e
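The combined effect can again be checked numerically. In this sketch we assume the dependent variable is divided by e and the independent variable by f (the data and the helper `ols` are illustrative):

```python
# Check: regressing Y/e on X/f multiplies the original slope by f/e
# and divides the original intercept by e. Illustrative data only.

def ols(x, y):
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    b1 = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
          / sum((xi - xbar) ** 2 for xi in x))
    return ybar - b1 * xbar, b1  # (intercept, slope)

x = [2.0, 4.0, 6.0, 8.0, 10.0]
y = [5.1, 9.0, 13.2, 16.9, 21.1]
e, f = 10.0, 100.0
b0, b1 = ols(x, y)  # original model
b03, b13 = ols([xi / f for xi in x], [yi / e for yi in y])
print(b13 - (f / e) * b1)  # close to 0: slope scaled by f/e
print(b03 - b0 / e)        # close to 0: intercept divided by e
```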
Estimates of the parameters of the model when the dependent and independent variables are divided by the same constant: Let γ = Y/g and χ = X/g, where g is a non-zero constant. Consider the regression model:

γ = β04 + β14χ + ε

The least squares estimate b14 of the slope β14 is obtained with the help of Eq. 2 as follows:

b14 = Σ(χi − χ̄)(γi − γ̄) / Σ(χi − χ̄)² = b1

The corresponding estimate of the intercept β04 is obtained from Eq. 3 as:

b04 = γ̄ − b14χ̄ = (Ȳ − b1X̄)/g = b0/g
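Dividing both variables by the same constant thus leaves the slope estimate unchanged while the intercept estimate is divided by the constant. A minimal Python sketch with made-up data confirms this:

```python
# Check: regressing Y/g on X/g leaves the slope unchanged and
# divides the intercept by g. Illustrative data only.

def ols(x, y):
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    b1 = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
          / sum((xi - xbar) ** 2 for xi in x))
    return ybar - b1 * xbar, b1  # (intercept, slope)

x = [2.0, 4.0, 6.0, 8.0, 10.0]
y = [5.1, 9.0, 13.2, 16.9, 21.1]
g = 1000.0
b0, b1 = ols(x, y)  # original model
b04, b14 = ols([xi / g for xi in x], [yi / g for yi in y])
print(b14 - b1)      # close to 0: slope unchanged
print(b04 - b0 / g)  # close to 0: intercept divided by g
```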
The relationships between the parameter estimates are verified in this section using data on exports (in billions of dollars) (X) and imports (in billions of dollars) (Y) provided by the Bureau of Economic Analysis of the U.S. Department of Commerce for the period 2006-01-01 to 2010-10-01.
The nature of the relationship between X and Y is determined with the help of the scatter diagram in Fig. 1.
It can be deduced from Fig. 1 that there is a linear relationship between imports and exports of goods and services in the U.S. within the given period.
Also, the analysis of variance in Table 1 shows that the simple linear regression of import on export is significant at α = 0.05.
A summary of the regression analyses of the various functions of the dependent variable on their corresponding independent variables is given in Table 2.
Fig. 1: The scatter diagram for determining the nature of the relationship between X and Y

Table 1: ANOVA table for regression of import on export

Table 2: Regression analysis of the transformed dependent variables on their associated independent variables
For simplicity and clarity, the constants 10, 100 and 1000 are used in this numerical illustration. As can be seen from Table 2, the estimate of the slope is unaffected only when the variables involved in the regression are divided by the same constant. On the other hand, the estimate of the intercept of the original model remains unaffected only when the independent variable alone is divided by a constant. All the estimates in Table 2 agree with the results obtained above.
A statistical technique for reducing the time required for the estimation of parameters in simple linear regression has been examined in this study. Estimates of the parameters of four regression models resulting from the division of the variables by constants were compared with those of the original model. On dividing the independent variable by a constant, the estimate of the slope of the resulting model was equal to that of the original model multiplied by the given constant, while the estimated intercept remained unaffected by the transformation. Dividing the dependent variable by a constant and regressing the quotient on the independent variable yielded parameter estimates equal to their counterparts in the original model divided by the constant.
The regression model in which the variables are divided by different constants was also fitted. The resulting estimate of the slope can be obtained by multiplying the slope of the original model by the constant by which the independent variable was divided and dividing the product by the constant by which the dependent variable was divided. The associated estimate of the intercept is obtained by dividing that of the original model by the constant by which the dependent variable was divided.
Furthermore, the division of the variables by the same constant does not
affect the estimate of the slope of the original model. This agrees with
results in the literature on the addition and subtraction of constants
from the variables (Steel and Torrie, 1981; Spiegel
et al., 2000). Under this transformation, the estimate of the intercept
is found by dividing that of the original model by the chosen constant.
Since the estimated regression model is a function of the parameter estimates
concerned, prediction in simple linear regression can be affected by the
division of variables by constants.
Based on the results obtained in this study, the division of either or both of the variables by constants in simple linear regression can affect the estimates of the parameters of the original model. Division of the variables by the same constant affects the estimate of the intercept of the original model, while the estimate of the intercept is unaffected when only the independent variable is divided by a constant. Hence, the division of either or both of the variables in simple linear regression generally affects prediction with the fitted model. It is therefore recommended that analysts who use these transformations in regression analysis take the derived relationships into account so as to ensure proper prediction.