
Asian Journal of Mathematics & Statistics

Year: 2008 | Volume: 1 | Issue: 2 | Page No.: 100-108
DOI: 10.3923/ajms.2008.100.108
On Simulation and Approximation in the Circular Regression Model
Ibrahim Mohamed, Abdul Ghapor Hussin and Ahmad Hazwan Abdul Wahab

Abstract: In this study, simulation activities are suggested as a tool for understanding the properties of estimators of the coefficients of a linear regression model when the variables are circular. These activities are suitable for undergraduate students who have learned simple linear regression theory and would like to extend the idea of regression to the case where the measurements are directions, and to see how various approximations are used in parameter estimation.


How to cite this article
Ibrahim Mohamed, Abdul Ghapor Hussin and Ahmad Hazwan Abdul Wahab, 2008. On Simulation and Approximation in the Circular Regression Model. Asian Journal of Mathematics & Statistics, 1: 100-108.

Keywords: simulation, maximum likelihood, regression, circular random variables

INTRODUCTION

Simulation activities have been suggested as a complement to the teaching of statistical theory. The main objective is to show the probabilistic properties of certain estimators using the empirical results of simulations. Armero and Ferrandiz (2002) proposed a simulation activity to show empirically the probabilistic properties of the least squares estimators of the slope, intercept and residual variance of a simple linear regression model. In this research, we extend the activity to the circular regression model, which is the analogue of simple linear regression for directional data. The topic is chosen because of its similarity to simple linear regression theory and to motivate students to think about theory beyond their present knowledge. Appropriate programs in SPlus are provided as simulation tools.

CIRCULAR RANDOM VARIABLE

A circular random variable takes values on the circumference of a circle, i.e., it is an angle in the range [0, 2π) radians or [0°, 360°). Regression involving a circular response variable is common in a number of areas of application, particularly in the biological, geological, astronomical, meteorological and economic sciences. Examples include the relationship between the direction an animal moves and the distance moved; the dependence of the strike of a fault plane on displacement; and the dependence of wind direction on wind speed.

THE VON MISES DISTRIBUTION

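A circular random variable θ is said to have a Circular Normal (CN) or von Mises distribution, denoted by VM(μ, κ), if it has the density function:

$$f(\theta; \mu, \kappa) = \frac{1}{2\pi I_0(\kappa)} \exp\{\kappa \cos(\theta - \mu)\}, \quad 0 \le \theta < 2\pi \qquad (1)$$

where 0 ≤ μ < 2π and κ ≥ 0 are parameters. Here I0(κ) in the normalizing constant is the modified Bessel function of the first kind and order zero, given by:

$$I_0(\kappa) = \frac{1}{2\pi} \int_0^{2\pi} \exp\{\kappa \cos\theta\}\, d\theta \qquad (2)$$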

This distribution, also known as the Circular Normal distribution to emphasize its importance and its similarity to the Normal distribution on the real line, was first introduced by von Mises. Mardia (1972) provides a thorough discussion of the distribution and some of its properties. An SPlus program for generating samples from this distribution is given in Appendix A.
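The density (1) and the sampling step can be sketched in Python as follows. This is an illustrative stand-in, not the SPlus program of Appendix A: the function names `bessel_i0`, `vm_density` and `vm_sample` are hypothetical, I0(κ) is evaluated from its power series, and sampling relies on the standard-library `random.vonmisesvariate`.

```python
import math
import random


def bessel_i0(kappa, terms=60):
    """Modified Bessel function I0 via its power series sum_k (kappa/2)^(2k) / (k!)^2."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (kappa / 2.0) ** 2 / ((k + 1) ** 2)
    return total


def vm_density(theta, mu, kappa):
    """von Mises density VM(mu, kappa) evaluated at the angle theta (radians)."""
    return math.exp(kappa * math.cos(theta - mu)) / (2.0 * math.pi * bessel_i0(kappa))


def vm_sample(mu, kappa, n, seed=None):
    """Draw n angles in [0, 2*pi) from VM(mu, kappa) using the standard-library sampler."""
    rng = random.Random(seed)
    return [rng.vonmisesvariate(mu, kappa) % (2.0 * math.pi) for _ in range(n)]


# Example: 30 errors from VM(0, 3). The seed fixes Python's generator only,
# so the draws will not match those of an SPlus session with the same seed.
eps = vm_sample(0.0, 3.0, 30, seed=100)
```

A quick sanity check is to integrate `vm_density` numerically over [0, 2π) and confirm that the area is close to one.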

CIRCULAR REGRESSION MODEL

Let X and Y be a circular explanatory variable and a circular response variable, respectively. The circular variable X is usually assumed to be controllable by the experimenter, so the experiment is designed by choosing the values of X and observing the corresponding values of Y. Suppose the true relationship between Y and X is linear and that the observation Y at each fixed value x of X is a circular random variable. For each observation Y, the model is given by

$$y = \alpha + \beta x + \varepsilon \pmod{2\pi} \qquad (3)$$

where ε is a circular random error having a von Mises distribution with circular mean 0 and concentration parameter κ, i.e., ε~VM(0, κ). The errors ε are also assumed to be uncorrelated with each other. For practical purposes, we will consider β≈1; an example is the measurement of wind direction by two different techniques. Given ε and fixed values of x, the values of y can be generated using (3). The SPlus program CirDat given in Appendix B can be used to generate y for fixed values of x.
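As a rough illustration of how responses can be generated from model (3), the sketch below uses Python in place of the SPlus program CirDat. The function name `cir_dat` and its argument order are hypothetical, all angles are assumed to be in radians, and the design points x_i = 12i° and the values α = 4, β = 1, κ = 3 are borrowed from Step 1 of the simulation activities.

```python
import math
import random

TWO_PI = 2.0 * math.pi


def cir_dat(x, alpha, beta, kappa, seed=None):
    """Generate y_i = alpha + beta * x_i + eps_i (mod 2*pi) with eps_i ~ VM(0, kappa)."""
    rng = random.Random(seed)
    y = []
    for xi in x:
        eps = rng.vonmisesvariate(0.0, kappa)  # circular error with mean direction 0
        y.append((alpha + beta * xi + eps) % TWO_PI)
    return y


# Design points x_i = 12 i degrees, i = 1..30, converted to radians (an assumption).
x = [math.radians(12 * i) for i in range(1, 31)]
y = cir_dat(x, alpha=4.0, beta=1.0, kappa=3.0, seed=100)
```

The seed makes the Python run repeatable, but the resulting values are not expected to coincide with the data set reported in Table 1.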

MAXIMUM LIKELIHOOD ESTIMATES OF α, β AND κ

The maximum likelihood estimators of α and β, denoted by α̂ and β̂, respectively, are obtained iteratively using

(4)

Where:

(5)

and

(6)

Since both x and y are measurements of the same quantity (for example, wind direction), unity is a logical initial estimate of β, so a possible starting value for the iteration is β0 = 1.0. We then update α and β in turn and repeat the procedure until the convergence criterion is satisfied.
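The updating scheme can be motivated directly from the log-likelihood implied by the von Mises error distribution and model (3). The following is a sketch derived from those two ingredients only; its layout and notation (S, C, β̂0, β̂1) are assumptions and may differ from the form printed in (4)-(6).

```latex
\begin{align*}
% Log-likelihood for n independent errors from VM(0, kappa)
\ell(\alpha,\beta,\kappa) &= -n\log\{2\pi I_0(\kappa)\}
   + \kappa\sum_{i=1}^{n}\cos(y_i-\alpha-\beta x_i) \\[4pt]
% Score equation in alpha for a given beta
\frac{\partial\ell}{\partial\alpha} &= \kappa\sum_{i=1}^{n}\sin(y_i-\alpha-\beta x_i) = 0
   \;\Longrightarrow\; \tan\hat{\alpha} = \frac{S}{C}, \quad
   S=\sum_{i=1}^{n}\sin(y_i-\hat{\beta}x_i),\;
   C=\sum_{i=1}^{n}\cos(y_i-\hat{\beta}x_i) \\[4pt]
% Score equation in beta has no closed-form solution
\frac{\partial\ell}{\partial\beta} &= \kappa\sum_{i=1}^{n}x_i\sin(y_i-\alpha-\beta x_i) = 0 \\[4pt]
% One Newton-Raphson step from the current value beta_0
\hat{\beta}_1 &\approx \hat{\beta}_0 +
   \frac{\sum_{i=1}^{n}x_i\sin(y_i-\hat{\alpha}-\hat{\beta}_0x_i)}
        {\sum_{i=1}^{n}x_i^{2}\cos(y_i-\hat{\alpha}-\hat{\beta}_0x_i)}
\end{align*}
```

The quadrant of α̂ is chosen so that sin α̂ and cos α̂ have the signs of S and C. The score equation in β has no explicit solution, which is why the estimates have to be found by iteration.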

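Using the final maximum likelihood estimates α̂ and β̂ obtained above, the maximum likelihood estimate of κ is given by

$$\hat{\kappa} = A^{-1}\!\left(\frac{1}{n}\sum_{i=1}^{n}\cos(y_i - \hat{\alpha} - \hat{\beta} x_i)\right) \qquad (7)$$

where A(κ) = I1(κ)/I0(κ) is the ratio of the modified Bessel functions of the first kind of orders one and zero. A simple and reasonably accurate approximation to A^{-1}(w) was given by Best and Fisher (1981):

$$A^{-1}(w) \approx \begin{cases} 2w + w^{3} + \tfrac{5}{6}w^{5}, & w < 0.53 \\ -0.4 + 1.39w + \dfrac{0.43}{1-w}, & 0.53 \le w < 0.85 \\ \dfrac{1}{w^{3} - 4w^{2} + 3w}, & w \ge 0.85 \end{cases}$$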
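To get a feel for how accurate this approximation is, one can compute A(κ) = I1(κ)/I0(κ) from the Bessel power series and check that the approximate inverse returns a value close to κ. The Python sketch below uses the widely quoted three-piece form of the Best and Fisher approximation; the function names `bessel_ratio` and `a_inverse` are hypothetical.

```python
import math


def bessel_ratio(kappa, terms=80):
    """A(kappa) = I1(kappa) / I0(kappa), both computed from their power series."""
    half = kappa / 2.0
    i0 = i1 = 0.0
    t0, t1 = 1.0, half  # k = 0 terms of the I0 and I1 series
    for k in range(terms):
        i0 += t0
        i1 += t1
        t0 *= half * half / ((k + 1) ** 2)
        t1 *= half * half / ((k + 1) * (k + 2))
    return i1 / i0


def a_inverse(w):
    """Piecewise approximation to A^{-1}(w) for 0 <= w < 1."""
    if not 0.0 <= w < 1.0:
        raise ValueError("w must lie in [0, 1)")
    if w < 0.53:
        return 2.0 * w + w ** 3 + 5.0 * w ** 5 / 6.0
    if w < 0.85:
        return -0.4 + 1.39 * w + 0.43 / (1.0 - w)
    return 1.0 / (w ** 3 - 4.0 * w ** 2 + 3.0 * w)


# Round trip: kappa -> A(kappa) -> approximate inverse, for a few concentrations.
for kappa in (0.5, 1.0, 2.0, 3.0, 5.0, 10.0):
    print(kappa, round(a_inverse(bessel_ratio(kappa)), 4))
```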

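Further, the asymptotic properties of α̂, β̂ and κ̂ can be obtained from the inverse of the Fisher information matrix (Hussin et al., 2004). The asymptotic variances are given by:

$$\operatorname{Var}(\hat{\alpha}) = \frac{\sum x_i^{2}}{\hat{\kappa} A(\hat{\kappa}) \left\{ n \sum x_i^{2} - \left(\sum x_i\right)^{2} \right\}} \qquad (8)$$

$$\operatorname{Var}(\hat{\beta}) = \frac{n}{\hat{\kappa} A(\hat{\kappa}) \left\{ n \sum x_i^{2} - \left(\sum x_i\right)^{2} \right\}} \qquad (9)$$

$$\operatorname{Var}(\hat{\kappa}) = \frac{1}{n A'(\hat{\kappa})} \qquad (10)$$

Where:

$$A(\kappa) = \frac{I_1(\kappa)}{I_0(\kappa)}, \qquad A'(\kappa) = 1 - A(\kappa)^{2} - \frac{A(\kappa)}{\kappa}$$

For large n, the estimators α̂, β̂ and κ̂ are normally distributed with means α, β and κ and variances (8)-(10), respectively. These estimates and their variances can be obtained by using the SPlus program given in Appendix C.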
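The whole estimation procedure can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the SPlus program of Appendix C: the name `cir_reg`, its arguments and the returned dictionary are hypothetical, angles are in radians, α and β are updated alternately from the score equations of the von Mises likelihood, κ̂ uses the piecewise approximation to A^{-1}, and the variances come from the inverse Fisher information with A(κ) = I1(κ)/I0(κ) and A'(κ) = 1 - A(κ)² - A(κ)/κ.

```python
import math
import random

TWO_PI = 2.0 * math.pi


def bessel_ratio(kappa, terms=80):
    """A(kappa) = I1(kappa) / I0(kappa) from the power series of I0 and I1."""
    half = kappa / 2.0
    i0 = i1 = 0.0
    t0, t1 = 1.0, half
    for k in range(terms):
        i0 += t0
        i1 += t1
        t0 *= half * half / ((k + 1) ** 2)
        t1 *= half * half / ((k + 1) * (k + 2))
    return i1 / i0


def a_inverse(w):
    """Piecewise approximation to the inverse of A (assumes 0 < w < 1)."""
    if w < 0.53:
        return 2.0 * w + w ** 3 + 5.0 * w ** 5 / 6.0
    if w < 0.85:
        return -0.4 + 1.39 * w + 0.43 / (1.0 - w)
    return 1.0 / (w ** 3 - 4.0 * w ** 2 + 3.0 * w)


def alpha_given_beta(x, y, beta):
    """Solve the alpha score equation; atan2 picks the quadrant from the signs of S and C."""
    s = sum(math.sin(yi - beta * xi) for xi, yi in zip(x, y))
    c = sum(math.cos(yi - beta * xi) for xi, yi in zip(x, y))
    return math.atan2(s, c) % TWO_PI


def cir_reg(x, y, beta0=1.0, max_iter=200, tol=1e-8):
    """Alternate alpha and beta updates, then estimate kappa and the asymptotic variances."""
    n = len(x)
    beta = beta0
    for _ in range(max_iter):
        alpha = alpha_given_beta(x, y, beta)
        res = [yi - alpha - beta * xi for xi, yi in zip(x, y)]
        # Newton-type step for beta with alpha held fixed
        step = (sum(xi * math.sin(r) for xi, r in zip(x, res))
                / sum(xi * xi * math.cos(r) for xi, r in zip(x, res)))
        beta += step
        if abs(step) < tol:
            break
    alpha = alpha_given_beta(x, y, beta)
    w = sum(math.cos(yi - alpha - beta * xi) for xi, yi in zip(x, y)) / n
    kappa = a_inverse(w)
    a = bessel_ratio(kappa)
    sx, sxx = sum(x), sum(xi * xi for xi in x)
    denom = kappa * a * (n * sxx - sx * sx)
    return {"alpha": alpha, "beta": beta, "kappa": kappa,
            "var_alpha": sxx / denom, "var_beta": n / denom,
            "var_kappa": 1.0 / (n * (1.0 - a * a - a / kappa))}


# Usage on one simulated data set (alpha = 4, beta = 1, kappa = 3, n = 30).
rng = random.Random(100)
x = [math.radians(12 * i) for i in range(1, 31)]
y = [(4.0 + xi + rng.vonmisesvariate(0.0, 3.0)) % TWO_PI for xi in x]
print(cir_reg(x, y))
```

Because α̂ and β̂ are strongly correlated when x is not centred, the alternating scheme can need several dozen passes to settle, which is one reason a generous iteration limit is used here.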

OBJECTIVES OF ACTIVITIES

To show that, unlike in the simple linear regression model, no closed form is available for the maximum likelihood estimators in the circular regression model, so the estimates of α, β and κ must be obtained iteratively.
To show that, for large sample sizes, the estimators α̂, β̂ and κ̂ are normally distributed with means α, β and κ and variances given by (8)-(10), respectively.

SIMULATION ACTIVITIES

The activities can be arranged in four steps as follows.

Step 1: Simulating the Circular Data
The simulation will be based on the model given by (3):

$$y_i = \alpha + \beta x_i + \varepsilon_i \pmod{2\pi}, \qquad \varepsilon_i \sim VM(0, \kappa)$$

Let the student choose values of α, β and κ, say α = 4, β = 1 and κ = 3, with sample size n = 30. Then we can generate εi from VM(0, κ = 3) using the VM SPlus procedure given in Appendix A by typing VM (0, 3, 30) in the command window. Consequently, a data set {(xi, yi), i = 1, 2, ..., 30} is generated, where the xi are fixed by the instructor as xi = 12i° while yi | xi = 4 + xi + εi (mod 2π). For example, with seed number 100, the data set given in Table 1 is obtained using the CirDat program given in Appendix B by typing CirDat (0, 3, 30, 4, 1) in the command window.

Step 2: Finding the Estimates of α, β and κ
Using the program CirReg given in Appendix C, the estimates α̂, β̂ and κ̂ can be obtained by typing CirReg (x, y, 1, 100), giving the estimates and variances as follows:

α̂ = 4.1324, β̂ = 1.0003 and κ̂ = 2.5562

and

Var(α̂) = 0.06432, Var(β̂) = 1.5663e-006 and Var(κ̂) = 0.3250

The estimates are close to the true values of the parameters of α = 4, β = 1 and κ = 3, respectively.

Step 3: Replicating for m Times
We let the students repeat Step 2, say for m = 40 repetitions, giving a list of 40 values of α̂, β̂ and κ̂. Using the SPlus program given in Appendix D with seed number 100, by typing simuCirReg (0, 3, 30, 4, 1, 1, 100, 40), the estimates are obtained as shown in Table 2.

Step 4: Analysing the Parameter Estimates
Students can now evaluate the parameter estimates obtained from Step 3. Firstly, students look at the accuracy of the estimates as given in Table 3. The mean and median of each of α̂, β̂ and κ̂ are close, suggesting that the distributions are nearly symmetrical. Students should be encouraged to plot the histograms as shown in Fig. 1 using hist in SPlus. Further, the bias is small for α̂ and β̂ but quite large for κ̂. Similarly, the standard deviation of κ̂ is larger than those of the other two estimators.
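Steps 3 and 4 can be mimicked with a short Python sketch that simulates m data sets, fits each one and summarises the estimates. It is an assumption-laden stand-in for the SPlus program simuCirReg: the names `fit`, `replicate` and `summarise` are hypothetical, angles are in radians, the fit is a compact alternating scheme with a fixed number of passes, and the numbers produced will not match Tables 2 and 3.

```python
import math
import random
import statistics

TWO_PI = 2.0 * math.pi


def a_inverse(w):
    """Piecewise approximation to the inverse of A (assumes 0 < w < 1)."""
    if w < 0.53:
        return 2.0 * w + w ** 3 + 5.0 * w ** 5 / 6.0
    if w < 0.85:
        return -0.4 + 1.39 * w + 0.43 / (1.0 - w)
    return 1.0 / (w ** 3 - 4.0 * w ** 2 + 3.0 * w)


def alpha_given_beta(x, y, beta):
    """Closed-form alpha for a given beta, with the quadrant chosen by atan2."""
    s = sum(math.sin(yi - beta * xi) for xi, yi in zip(x, y))
    c = sum(math.cos(yi - beta * xi) for xi, yi in zip(x, y))
    return math.atan2(s, c) % TWO_PI


def fit(x, y, beta=1.0, iters=200):
    """Alternate alpha and beta updates for a fixed number of passes; return (alpha, beta, kappa)."""
    for _ in range(iters):
        alpha = alpha_given_beta(x, y, beta)
        res = [yi - alpha - beta * xi for xi, yi in zip(x, y)]
        beta += (sum(xi * math.sin(r) for xi, r in zip(x, res))
                 / sum(xi * xi * math.cos(r) for xi, r in zip(x, res)))
    alpha = alpha_given_beta(x, y, beta)
    w = sum(math.cos(yi - alpha - beta * xi) for xi, yi in zip(x, y)) / len(x)
    return alpha, beta, a_inverse(w)


def replicate(x, alpha, beta, kappa, m, seed):
    """Simulate m data sets from the model and fit each one."""
    rng = random.Random(seed)
    out = []
    for _ in range(m):
        y = [(alpha + beta * xi + rng.vonmisesvariate(0.0, kappa)) % TWO_PI for xi in x]
        out.append(fit(x, y))
    return out


def summarise(values, true_value):
    """Mean, median, standard deviation and bias of a list of estimates.

    Plain (linear) averages are used, which is adequate for alpha only while
    its estimates stay away from the 0 / 2*pi wrap-around point.
    """
    mean = statistics.mean(values)
    return {"mean": mean, "median": statistics.median(values),
            "sd": statistics.stdev(values), "bias": mean - true_value}


x = [math.radians(12 * i) for i in range(1, 31)]
estimates = replicate(x, alpha=4.0, beta=1.0, kappa=3.0, m=40, seed=100)
for name, idx, truth in (("alpha", 0, 4.0), ("beta", 1, 1.0), ("kappa", 2, 3.0)):
    print(name, summarise([e[idx] for e in estimates], truth))
```

Comparing the "mean" and "median" entries gives a quick symmetry check, and the "bias" and "sd" entries can be set against the pattern described for Table 3.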

Table 1: Single simulated data set

Table 2: Values of estimates for 40 replications

Table 3: Statistics of estimates


Fig. 1: Histogram of estimates

Table 4: Results by Kolmogorov-Smirnov method

Fig. 2: The quantile-quantile plots of the estimator

Next, we want to show that the estimators follow a normal distribution. This can be shown using graphical tools such as the normal quantile-quantile plot or a hypothesis test such as the Kolmogorov-Smirnov test (Montgomery and Peck, 1992). Figure 2 gives the quantile-quantile plots of the estimators α̂, β̂ and κ̂. It can be observed that the triangular points all lie close to the straight lines. Furthermore, the Kolmogorov-Smirnov test gives non-significant results, as shown in Table 4. That is, the null hypothesis that the estimates follow a normal distribution is not rejected. Thus, we conclude that α̂, β̂ and κ̂ follow a normal distribution.
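For readers without access to SPlus, the normality check can be approximated with the Python sketch below. It computes the Kolmogorov-Smirnov statistic against a normal distribution whose mean and standard deviation are taken from the sample, and an asymptotic p-value from the Kolmogorov series with a common small-sample adjustment. The function name `ks_normal` is hypothetical, the inputs shown are stand-in random draws rather than the estimates of Table 2, and because the parameters are estimated from the same sample the p-value should be read as a rough, conservative guide.

```python
import math
import random
import statistics


def normal_cdf(z):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ks_normal(values):
    """Return (D, p): KS distance to the fitted normal and an asymptotic p-value."""
    n = len(values)
    mu, sd = statistics.mean(values), statistics.stdev(values)
    z = sorted((v - mu) / sd for v in values)
    # Largest gap between the empirical and the normal distribution functions
    d = max(max((i + 1) / n - normal_cdf(zi), normal_cdf(zi) - i / n)
            for i, zi in enumerate(z))
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return d, min(max(p, 0.0), 1.0)


# Stand-in inputs: in the activity, pass the 40 values of one estimator instead.
rng = random.Random(1)
print(ks_normal([rng.gauss(0.0, 1.0) for _ in range(40)]))
```

A large p-value means that the normal hypothesis is not rejected, which is the outcome described for Table 4; it does not by itself prove normality.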

CONCLUSION

The simulation activities suggested here are an example of statistical exercises that can be used to encourage students to investigate aspects of statistical theory using simulation and approximation. These activities can motivate students' interest in extending statistical theory to other research areas that have practical applications.

Appendix A

Appendix B

Appendix C


Appendix D

REFERENCES

  • Armero, C. and J. Ferrandiz, 2002. Simulation in the simple linear regression model. Teach. Stat., 24: 12-16.


  • Best, D.J. and N.I. Fisher, 1981. The bias of the maximum likelihood estimators of the von Mises-Fisher concentration parameters. Commun. Statist. Simul. Comput., B10: 493-502.


  • Hussin, A.G., N.R.J. Fieller and E.C. Stillman, 2004. Linear regression model for circular variables with application to directional data. J. Applied Sci. Technol., 9: 1-6.


  • Mardia, K.V., 1972. Statistics of Directional Data. 1st Edn., Academic Press, London


  • Montgomery, D.C. and E.A. Peck, 1992. Introduction to Linear Regression Analysis. 2nd Edn., John Wiley and Sons Inc., New York

© Science Alert. All Rights Reserved