
Asian Journal of Mathematics & Statistics

Year: 2008 | Volume: 1 | Issue: 3 | Page No.: 150-158
DOI: 10.3923/ajms.2008.150.158
On the Performance and Estimation of Spectral and Bispectral Analysis of Time Series Data
J.F. Ojo

Abstract: In this study, discrete spectral and bi-spectral analyses of time series data were considered to determine which of them performs better. The parameters of the spectral and bi-spectral models were estimated using the Modified Newton-Raphson iterative method. Since the order of a model cannot be increased indiscriminately because of the closeness of some parameters to zero, discrete spectral and bi-spectral models of orders one to five were fitted to the real series. The Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) were used to determine the best order of each model. To compare the models, the residual variance attached to the spectral and bi-spectral models was used. Order one and order four gave the best order for the spectral and bi-spectral models, respectively. The residual variances of the spectral and bi-spectral models compared favourably with each other, but the residual variance of the bi-spectral model was smaller than that of the spectral model, which led us to conclude that bi-spectral analysis of time series data performs better than spectral analysis.


How to cite this article
J.F. Ojo, 2008. On the Performance and Estimation of Spectral and Bispectral Analysis of Time Series Data. Asian Journal of Mathematics & Statistics, 1: 150-158.

Keywords: Bayesian information criterion, residual variance, modified Newton-Raphson, Akaike information criterion and time series data

INTRODUCTION

Spectral analysis, which is essentially a modification of Fourier series, has been used quite successfully for analyzing time series data in Physics, Engineering and Medicine, since many phenomena in these areas exhibit patterns of more or less cyclical variation of the continuous type (Shittu and Shangodoyin, 2008). Given the enormous growth of spectral analysis, introducing a bispectral term into the spectral expression is expected to give a better fit.

Fourier introduced Fourier analysis in the early nineteenth century to solve the differential equations arising in problems of heat conduction. A Fourier series is concerned with approximating a function by a sum of sine and cosine terms. The primary objective of spectral analysis is to decompose a time-varying quantity into sums (or integrals) of sine and cosine functions.

The spectral density function is a commonly used tool in the analysis of time series. It is one of the oldest and most widely used analysis techniques in the physical sciences, especially in signal processing, in geophysics and in the analysis of heart rate variability in medicine. The basic idea behind spectral analysis is to decompose the variance of a time series into a number of components, each of which can be associated with a particular Fourier frequency. It is a method of analysis that describes the fluctuation of a time series in terms of sinusoidal behavior at various frequencies.

The theory of spectral estimation can be classified into two categories: the parametric approach and the non-parametric approach. The former estimates the parameters contained in the spectral density function by assuming a parametric model for the underlying series, while the non-parametric approach introduces a statistic, say the periodogram, for estimating the spectral density directly without assuming any parametric model.
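
As an illustration of the non-parametric approach, the following sketch (in Python with NumPy; not part of the original study) computes the periodogram of a series at the Fourier frequencies. The test series and the normalisation chosen are illustrative assumptions.

import numpy as np

def periodogram(x):
    """Periodogram of a real series at the Fourier frequencies 2*pi*k/N."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    x = x - x.mean()                      # remove the mean (zero-frequency component)
    dft = np.fft.rfft(x)                  # discrete Fourier transform of the series
    freqs = 2.0 * np.pi * np.arange(len(dft)) / n
    pgram = (np.abs(dft) ** 2) / (2.0 * np.pi * n)   # one common normalisation
    return freqs, pgram

# illustrative use: a noisy sinusoid whose spectral peak should appear near 2*pi*10/250
rng = np.random.default_rng(0)
t = np.arange(250)
x = 3.0 * np.cos(2.0 * np.pi * 10.0 * t / 250.0) + rng.normal(size=250)
freqs, pgram = periodogram(x)
print(freqs[np.argmax(pgram)])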

The wide and diverse range of ideas involved in spectral analysis and the enormous growth in its use over the years, with applications in many areas of human endeavour, have amply demonstrated its considerable importance as a scientific tool. Spectral methods of analysis are particularly useful where the data are not merely a string of independent observations fluctuating from one side of the mean to the other, but fluctuate strongly with periodic components present.

An important use of the bispectrum is the detection of nonlinearities. Because of its sensitivity to nonlinearities, it is believed that the bispectrum has practical uses in the area of structural health monitoring and damage detection (Debra et al., 2000). The bispectrum has been shown to be an indicator of fatigue cracks in cantilever beams (Rivola and White, 1998) and has been applied to damage detection in rotating machinery (Li et al., 1991). The bispectrum has been applied as an analysis tool for quasi-periodic signals in such diverse areas as diagnosis of heart conditions (Shen and Sun, 1997), machine monitoring (Barker and Hinich, 1993) and tidal wave motion analysis (Beard et al., 1999).
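
For concreteness, the sketch below (not from the paper) forms a direct estimate of the bispectrum at a single pair of Fourier frequency indices by averaging over non-overlapping segments; the segment length, the frequency indices and the simulated series are illustrative assumptions.

import numpy as np

def bispectrum_at(x, j, k, seg_len=64):
    """Direct estimate of B(w_j, w_k) = E[X(w_j) X(w_k) conj(X(w_j + w_k))],
    averaged over non-overlapping segments of length seg_len."""
    x = np.asarray(x, dtype=float)
    n_seg = len(x) // seg_len
    acc = 0.0 + 0.0j
    for m in range(n_seg):
        seg = x[m * seg_len:(m + 1) * seg_len]
        seg = seg - seg.mean()
        X = np.fft.fft(seg)
        acc += X[j] * X[k] * np.conj(X[j + k])
    return acc / n_seg

# for a linear Gaussian series the theoretical bispectrum is zero; a quadratic
# nonlinearity introduces phase coupling and a non-zero bispectrum
rng = np.random.default_rng(1)
e = rng.normal(size=4096)
linear = e
nonlinear = e + 0.5 * np.roll(e, 1) ** 2
print(abs(bispectrum_at(linear, 3, 5)), abs(bispectrum_at(nonlinear, 3, 5)))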

In the present study, given the usefulness of spectral and bispectral analysis in various fields, the estimation and performance of bispectral analysis of discrete time series are considered vis-a-vis spectral analysis of time series data.

MATERIALS AND METHODS

Discrete Spectral and Bi-Spectral Models
A finite periodic sequence can be transformed into a finite set of Fourier coefficients by the use of orthogonality conditions. Though the spectral method of analysis is meant for continuous data sets, it can be used to analyze discrete data provided the continuous sample record is properly transformed into discrete-parameter form (with frequencies ω called Fourier frequencies) by reading off its values at discrete time intervals t. No matter what the recording apparatus, in the end we can only make measurements at discrete times if there is to be any meaningful analysis. Thus the continuous record must be sampled to give a discrete set of numbers for analysis. In this study, we shall consider estimation of the parameters of a series using discrete spectral and bi-spectral methods.

Let Xt be a stochastic process that is periodic with period 2π and can be expressed as an infinite linear combination of sine and cosine functions, defined as:

i.e.,

Xt = Σj (aj cos ωjt + bj sin ωjt) + εt,   j = 1, 2, ...          (1)

(for the spectral model), where aj and bj are Fourier coefficients and εt is the random error with mean zero and constant variance. For the discrete bi-spectral model, Xt is defined by the relation:

(2)

The periodic variations in (1) and (2) can be partitioned into N periods with frequencies:

ωk = 2πk/N,   k = 1, 2, ..., [N/2]

The components at frequency ωk = 2πk/N are the kth harmonic frequencies. Thus, to estimate the parameters by the Modified Newton-Raphson iterative method, ω is restricted to one of the values ωk = 2πk/N.
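
As a sketch (not part of the original study), the following Python code simulates a series of the form (1) restricted to a few harmonic frequencies ωk = 2πk/N; the harmonics and coefficients chosen are purely illustrative.

import numpy as np

rng = np.random.default_rng(2)
N = 250
t = np.arange(1, N + 1)
harmonics = [1, 4, 10]                     # illustrative values of the integers p
a = {1: 2.0, 4: -1.0, 10: 0.5}             # illustrative cosine coefficients
b = {1: 0.5, 4: 1.5, 10: -0.3}             # illustrative sine coefficients

x = np.zeros(N)
for p in harmonics:
    w = 2.0 * np.pi * p / N                # harmonic (Fourier) frequency
    x += a[p] * np.cos(w * t) + b[p] * np.sin(w * t)
x += rng.normal(size=N)                    # random error with mean zero
print(x[:3])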

Least Squares Estimation of Discrete Spectral Model
The spectral model is represented by the relation:

Xt = Σi=1..k (αi cos ωit + βi sin ωit) + εt          (3)

The parameters αi and βi can be estimated by minimizing:

Q = Σt=1..N [Xt - Σi=1..k (αi cos ωit + βi sin ωit)]²          (4)

with respect to each of the parameters. Therefore, differentiating Q with respect to αi and βi and setting the derivatives equal to zero, we obtain the normal equations:

(5)

and

(6)

Where:

(7)

The solution to Eq. 5 and 6 is considerably simplified if the ωi's are such that N is an integral multiple of the periods of the sine and cosine terms in Eq. 7. This would arise if the trigonometric terms in (3) represent a seasonal component with known periods. In this case, we may write:

ωi = 2πpi/N          (8)

where, p1, p2,...,pk are all integers satisfying 0 < pi ≤ [N/2], with [N/2] being N/2 when N is even or (N-1)/2 when N is odd. In this study we shall assume that N is an even integer.

Using the well-known orthogonality relations for the sine and cosine terms at these frequencies, namely:

(9)

Thus, when the ωi's are of the form (8) and assuming that no pi is either 0 or N/2 when N is even, we have:

Σt=1..N cos² ωit = Σt=1..N sin² ωit = N/2, with all cross-product sums equal to zero          (10)

Using Eq. 10 in Eq. 5 and 6 we obtain, for i = 1, 2, ..., k:

α̂i = (2/N) Σt=1..N Xt cos ωit          (11)

β̂i = (2/N) Σt=1..N Xt sin ωit          (12)

However, when pi = 0 and Ci = N:

α̂0 = (1/N) Σt=1..N Xt          (13)

which is the sample mean.
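
A minimal sketch (not from the paper) of the closed-form least-squares estimates in Eq. 11-13, assuming harmonic frequencies ωi = 2πpi/N:

import numpy as np

def harmonic_estimates(x, ps):
    """Least-squares Fourier coefficients at harmonic frequencies 2*pi*p/N.
    Returns the cosine estimates (Eq. 11), the sine estimates (Eq. 12) and
    the sample mean, which is the p = 0 case (Eq. 13)."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    t = np.arange(1, N + 1)
    alpha = np.empty(len(ps))
    beta = np.empty(len(ps))
    for i, p in enumerate(ps):
        w = 2.0 * np.pi * p / N
        alpha[i] = (2.0 / N) * np.sum(x * np.cos(w * t))
        beta[i] = (2.0 / N) * np.sum(x * np.sin(w * t))
    return alpha, beta, x.mean()

# example: recover the coefficient of a pure harmonic of amplitude 2
x = 2.0 * np.cos(2.0 * np.pi * 4.0 * np.arange(1, 251) / 250.0)
print(harmonic_estimates(x, [4]))          # alpha close to 2, beta close to 0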

Modified Newton Raphson Iterative Method of Estimation of Discrete Bi-Spectral Model

Maximizing the likelihood function is, under Gaussian errors, equivalent to minimizing the function Q(G),

where Q(G) = Σt et²,

with respect to the parameters. For convenience, we shall write G1 = α1, G2 = α2,..., GR = Uvl where, R = k+vl. Then the partial derivatives of Q(G) are given by:

(14)

(15)

where, these partial derivatives of et satisfy the recursive equations:

(16)

(17)

(18)

(19)

(20)

(21)

(22)

(23)

(24)

(25)

(26)

(27)

and let H(G) = [∂²Q(G)/∂Gi∂Gj] be the matrix of second partial derivatives, with V(G) = [∂Q(G)/∂Gi] the corresponding vector of first partial derivatives. Expanding V(Ĝ) near Ĝ = G in a Taylor series, we obtain:

V(Ĝ) = 0 ≈ V(G) + H(G)(Ĝ - G)

Rewriting this equation, as in Wojtek (1998), we get Ĝ - G = -H⁻¹(G)V(G) and thus obtain an iterative equation given by:

G(k+1) = G(k) - H⁻¹(G(k))V(G(k))          (28)

where, G(k) is the set of estimates obtained at the kth stage of iteration. The estimates obtained by the above iterative equations usually converge. For starting the iteration, we need to have good sets of initial values of the parameters.
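
A schematic sketch (not the paper's implementation) of the iterative update in Eq. 28 for a generic least-squares criterion Q(G), using numerically approximated first and second derivatives; the objective function shown is only a placeholder.

import numpy as np

def gradient(Q, G, h=1e-5):
    """Central-difference approximation to V(G) = dQ/dG."""
    G = np.asarray(G, dtype=float)
    V = np.zeros_like(G)
    for i in range(len(G)):
        step = np.zeros_like(G)
        step[i] = h
        V[i] = (Q(G + step) - Q(G - step)) / (2.0 * h)
    return V

def hessian(Q, G, h=1e-4):
    """Central-difference approximation to H(G) = d2Q/dGi dGj."""
    G = np.asarray(G, dtype=float)
    n = len(G)
    H = np.zeros((n, n))
    for i in range(n):
        ei = np.zeros(n); ei[i] = h
        for j in range(n):
            ej = np.zeros(n); ej[j] = h
            H[i, j] = (Q(G + ei + ej) - Q(G + ei - ej)
                       - Q(G - ei + ej) + Q(G - ei - ej)) / (4.0 * h * h)
    return H

def newton_raphson(Q, G0, n_iter=20):
    """Iterate G(k+1) = G(k) - H^{-1}(G(k)) V(G(k)), as in Eq. 28."""
    G = np.asarray(G0, dtype=float)
    for _ in range(n_iter):
        G = G - np.linalg.solve(hessian(Q, G), gradient(Q, G))
    return G

# placeholder objective with minimum at (1, -2); the iteration converges there
Q = lambda G: (G[0] - 1.0) ** 2 + 3.0 * (G[1] + 2.0) ** 2
print(newton_raphson(Q, [0.0, 0.0]))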

Distributional Properties
It can be shown from Eq. 11 that:

(29)

recall that εt ~ N(0,1).

It follows that E(Xt) = 0 and that

(30)

Using the orthogonality conditions we have:

E(α̂i) = αi          (31)

which implies that α̂i is an unbiased estimate of αi.

Similarly,

E(β̂i) = βi          (32)

which implies that β̂i is an unbiased estimate of βi.

Also, from Eq. 11:

(33)

Similarly,

(34)

also it could be seen that the

However, an estimate of the error variance σε² may be obtained by using the usual expression for the unbiased estimate of the residual variance in a regression model. If the model in (1) involves k parameters, we obtain:

σ̂ε² = Σt=1..N ε̂t² / (N - k)          (35)
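
A one-function sketch (illustrative, not from the paper) of the unbiased residual-variance estimate in Eq. 35:

import numpy as np

def residual_variance(residuals, k):
    """Unbiased estimate of the error variance for a fitted model with k
    parameters: the residual sum of squares divided by (N - k), as in Eq. 35."""
    residuals = np.asarray(residuals, dtype=float)
    return np.sum(residuals ** 2) / (len(residuals) - k)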

Residual Variance
Residual variance, or unexplained variance, in analysis of variance and regression analysis is that part of the variance which cannot be attributed to specific causes. The unexplained variance can be divided into two parts. The first part arises from ordinary random differences among members of a population or sample; over any large aggregation of data these differences average out. The second part comes from some condition that has not been identified but that is systematic; that part introduces a bias and, if not identified, can lead to a false conclusion.

Akaike Information Criteria (AIC)
The Akaike information criterion (AIC) (pronounced ah-kah-ee-keh), proposed by Akaike (1974), is a measure of the goodness of fit of an estimated statistical model. The AIC is an operational way of trading off the complexity of an estimated model against how well the model fits the data.

In the general case, the AIC is AIC = 2k - 2ln(L), where k is the number of parameters and L is the maximized likelihood. It will be assumed that the model errors are normally and independently distributed. Let n be the number of observations and RSS be the residual sum of squares. Then the AIC becomes AIC = 2k + nln(RSS/n). Increasing the number of free parameters to be estimated improves the goodness of fit, regardless of the number of free parameters in the data generating process. Hence AIC not only rewards goodness of fit but also includes a penalty that is an increasing function of the number of estimated parameters. This penalty discourages overfitting. The preferred model is the one with the lowest AIC value. The AIC methodology attempts to find the model that best explains the data with a minimum of free parameters. The AIC penalizes free parameters less strongly than does the Schwarz (Bayesian) information criterion.
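
Under the stated normality assumption, the AIC can be computed directly from the residual sum of squares; the short sketch below (not from the paper) does exactly that, with purely illustrative numbers.

import numpy as np

def aic(rss, n, k):
    """AIC = 2k + n*ln(RSS/n) for a model with k parameters fitted to n
    observations, assuming normally and independently distributed errors."""
    return 2.0 * k + n * np.log(rss / n)

# hypothetical numbers: the larger model must reduce RSS enough to justify
# its extra parameters
print(aic(rss=120.0, n=250, k=3), aic(rss=118.5, n=250, k=11))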

Bayesian Information Criterion
In statistics, the Bayesian information criterion (BIC) is a criterion for model selection. The BIC is sometimes also named the Schwarz criterion or Schwarz information criterion (SIC); Schwarz gave a Bayesian argument for adopting it.

Let

n = The number of observations, equivalently, the sample size
k = The number of free parameters to be estimated
RSS = The residual sum of squares from the estimated model
L = The maximized value of the likelihood function for the estimated model

The formula for the BIC is:

BIC = k ln(n) - 2 ln(L)

Under the assumption that the model errors or disturbances are normally distributed, this becomes:

BIC = n ln(RSS/n) + k ln(n)
Given any two estimated models, the model with the lower value of BIC is the one to be preferred. The BIC is an increasing function of RSS (that is, a decreasing function of the goodness of fit) and an increasing function of k. The BIC penalizes free parameters more strongly than does the Akaike information criterion.
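
A companion sketch (illustrative only; the residual sums of squares and parameter counts are hypothetical) computing the BIC for two candidate models and selecting the one with the smaller value:

import numpy as np

def bic(rss, n, k):
    """BIC = n*ln(RSS/n) + k*ln(n), assuming normally distributed errors."""
    return n * np.log(rss / n) + k * np.log(n)

# hypothetical residual sums of squares and parameter counts for two candidates
n = 250
candidates = {"order 1": (120.0, 3), "order 4": (95.0, 11)}
scores = {name: bic(rss, n, k) for name, (rss, k) in candidates.items()}
print(min(scores, key=scores.get), scores)   # the lower BIC is preferred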

RESULTS AND DISCUSSION

The real series used in the demonstration of our models is the annual sunspot series of length 250, that is, from 1734 to 1983.

Fitting of Discrete Spectral Model
Spectral models of orders 1 to 5 were fitted to the real series. The choice of the best order was made on the basis of AIC and BIC, the model with the minimum AIC and BIC being the best, and order 1 gave the best model. The fitted model is:

Fitting of Discrete Bispectral Model
Bispectral models of orders 1 to 5 were fitted to the real series. The choice of the best order was made on the basis of AIC and BIC, the model with the minimum AIC and BIC being the best, and order 4 gave the best model. The fitted model is:

Xt = 0.254306 + 0.435671 cos ω1t - 0.223265 sin ω1t + 0.053312 cos ω1t et-1 - 0.124346 sin ω1t et-1 + 0.115462 cos ω2t et-1 - 0.026833 sin ω2t et-1 - 0.375790 cos ω3t et-1 - 0.024802 sin ω3t et-1 - 0.057056 cos ω4t et-1 - 0.198304 sin ω4t et-1 + et
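
As an illustration only, the fitted bi-spectral model quoted above can be evaluated recursively once the harmonic frequencies ω1, ..., ω4 are fixed; the sketch below assumes ωj = 2πj/N with N = 250, which is an assumption on our part and not stated in the paper.

import numpy as np

# coefficients of the fitted order-4 bi-spectral model quoted above
const = 0.254306
linear_terms = {1: (0.435671, -0.223265)}                  # (cos, sin)
interaction_terms = {1: (0.053312, -0.124346), 2: (0.115462, -0.026833),
                     3: (-0.375790, -0.024802), 4: (-0.057056, -0.198304)}

def fitted_values(x, N=250):
    """One-step fitted values of the quoted model; the residuals e_t are
    built up recursively as the series is scanned."""
    x = np.asarray(x, dtype=float)
    w = {j: 2.0 * np.pi * j / N for j in range(1, 5)}      # assumed frequencies
    e_prev, fits = 0.0, []
    for t, xt in enumerate(x, start=1):
        fit = const
        for j, (a, b) in linear_terms.items():
            fit += a * np.cos(w[j] * t) + b * np.sin(w[j] * t)
        for j, (c, d) in interaction_terms.items():
            fit += (c * np.cos(w[j] * t) + d * np.sin(w[j] * t)) * e_prev
        fits.append(fit)
        e_prev = xt - fit                                   # residual e_t
    return np.array(fits)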

From Table 1, we can see the performance of the bi-spectral model over the spectral model. The residual variance attached to the bi-spectral model is smaller than the residual variance of the spectral model. The Akaike Information Criterion as well as the Bayesian Information Criterion also revealed the better performance of the bi-spectral model over the spectral model.

Table 1: Performance of spectral and Bi-spectral models
RV: Residual Variance; AIC: Akaike Information Criterion; BIC: Bayesian Information Criterion; F: F-Statistic; P: Probability level

CONCLUSION

This study has examined the performance of discrete bi-spectral analysis of time series data relative to discrete spectral analysis and found that the bi-spectral model performs better. Since most series encountered in practice are non-linear in nature, fitting a bi-spectral model is appropriate for such series.

REFERENCES

  • Beard, A.G., N.J. Mitchell, P.J.S. Williams and M. Kunitake, 1999. Non-linear interactions between tides and planetary waves resulting in periodic tidal variability. J. Atmosph. Solar Terrest. Phys., 61: 363-376.
    CrossRef    Direct Link    


  • Akaike, H., 1974. A new look at the statistical model identification. IEEE Trans. Autom. Control, 19: 716-723.
    CrossRef    Direct Link    


  • Anderson, T.W., 1971. The Statistical Analysis of Time Series. 1st Edn., John Wiley and Sons, New York, ISBN: 0471029009


  • Dobson, A.J., 1990. An Introduction to Generalized Linear Models. 1st Edn., Chapman and Hall, London, UK., ISBN: 0412311100


  • Box, G.E.P. and G.M. Jenkins, 1976. Time Series Analysis: Forecasting and Control. Holden-Day, San Francisco, CA., ISBN: 0-8162-1104-3


  • Li, C.J., J. Ma, B. Hwang and G.W. Nickerson, 1991. Pattern recognition based bicoherence analysis of vibrations for bearing condition monitoring. Sensors, Controls and Quality Issues in Manufacturing, 1991, American Society of Mechanical Engineers, pp: 1-11.


  • Debra, G., N. Hunter, C. Farrar and R. Deen, 2000. Identifying damage sensitive features using nonlinear time series and bispectral analysis. Proceedings of the IMAC 18, February 7-10, 2000, San Antonio, Texas, pp: 1-7.


  • Shen, M. and L. Sun, 1997. The analysis and classification of phonocardiogram based on higher order spectra. Proceeding of the Workshop on Higher Order Statistics, July 21-23, 1997, Banff, Alta, Canada, pp: 29-33.


  • Shittu, O.I. and D.K. Shangodoyin, 2008. Detection of outliers in time series data: A frequency domain approach. Asian J. Scientific Res., 1: 130-137.
    CrossRef    Direct Link    


  • Barker, R.W. and M.J. Hinich, 1993. Statistical monitoring of rotating machinery by cumulant spectral analysis. Proceeding of the Workshop on Higher Order Statistics, June 7-9, 1993, South Lake Tahoe, CA, USA., pp: 187-197.


  • Rivola, A. and P.R. White, 1998. Bispectral analysis of the bilinear oscillator with application to the detection of fatigue cracks. J. Sound Vibration, 216: 889-910.
    CrossRef    Direct Link    


  • Robert, V.F. and H. Lee, 2000. Adaptive Fourier series and the analysis of periodicities in time series data. J. Time Series Anal., 21: 649-662.
    CrossRef    Direct Link    


  • Wojtek, J.K., 1998. An Introduction to Statistical Modelling. 1st Edn., London, UK., ISBN: 0340691859
