Research Article
 

Estimations of the Central Tendency Measures of the Random-sum Poisson-Weibull Distribution using Saddlepoint Approximation



O. Al Mutairi Alya and Heng Chin Low
 
ABSTRACT

The random-sum Poisson-Weibull variable is the sum of a random sample from a Weibull distribution whose sample size is an independent Poisson random variable. It has a wide range of applications, but its distribution is complex and difficult to analyze. Saddlepoint approximations are powerful tools for obtaining accurate expressions for distribution functions that are not available in closed form. They often outperform other methods with respect to computational cost, though not necessarily with respect to accuracy. This study introduces saddlepoint approximations to the cumulative distribution function of the Poisson-Weibull model, from which important measures of central tendency can be obtained. We discuss approximations of a random-sum variable with independent components, assuming the existence of a moment-generating function. Numerical examples of Poisson-Weibull random sums are presented.


 
  How to cite this article:

O. Al Mutairi Alya and Heng Chin Low, 2014. Estimations of the Central Tendency Measures of the Random-sum Poisson-Weibull Distribution using Saddlepoint Approximation. Journal of Applied Sciences, 14: 1889-1893.

DOI: 10.3923/jas.2014.1889.1893

URL: https://scialert.net/abstract/?doi=jas.2014.1889.1893
 
Received: November 01, 2013; Accepted: February 08, 2014; Published: April 18, 2014



INTRODUCTION

Saddlepoint approximations are powerful tools for obtaining accurate expressions for distribution functions that are difficult to obtain in closed form. They often surpass other techniques with respect to computational cost, although they do not necessarily surpass them with respect to accuracy.

RANDOM-SUM DISTRIBUTIONS

Random-sum distributions arise frequently in probability theory and its applications. For example, they are used in branching processes (Neyman, 1939), renewal processes, damage processes (Rao et al., 1980), stopped random walks (Malinovskii, 1994) and risk theory (Esscher, 1932; Gurland, 1957; Jensen, 1995). One of the most important families of random sums is the family of Poisson random sums, in which $N$ is a Poisson($\lambda$) random variable and the $X_i$ are independent and identically distributed. The random variable $Y_N$ is said to have a random-sum distribution if it is of the following form (Johnson et al., 2005) (Eq. 1):

$$Y_N = \sum_{i=1}^{N} X_i = X_1 + X_2 + \cdots + X_N \tag{1}$$

where the number of terms $N$ is random, the random variables $X_i$ are independent and identically distributed (with a common distribution function $f$) and each $X_i$ is independent of $N$. If $N = 0$, then $Y_N = 0$; this is implicit in the definition, but we state it explicitly for clarity. The distribution function of $Y_N$ is given by Eq. 2:

$$F_{Y_N}(y) = P[Y_N \le y] = \sum_{n=0}^{\infty} G_n(y)\, P[N = n] \tag{2}$$

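where, for $n \ge 1$, $G_n(y)$ is the distribution function of the independent sum $X_1 + \cdots + X_n$, and $G_0(y)$ equals 1 for $y \ge 0$ and 0 otherwise, because $Y_N = 0$ when $N = 0$. We can also express the distribution function of $Y_N$ in terms of convolutions (Eq. 3):

$$F_{Y_N}(y) = \sum_{n=0}^{\infty} f^{*n}(y)\, P[N = n] \tag{3}$$

where $f$ is the common distribution function of the $X_i$ and $f^{*n}$ is the $n$-fold convolution of $f$. If the common distribution of the $X_i$ is discrete, then the random sum $Y_N$ is discrete. On the other hand, if the $X_i$ are continuous and $P[N = 0] > 0$, then $Y_N$ has a mixed distribution, with a point mass at zero together with a continuous part. The mean of the random sum $Y_N$ is as follows (Eq. 4):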

$$E[Y_N] = E[N]\, E[X] \tag{4}$$

The mean has a natural interpretation: it is the product of the expected number of terms $E[N]$ and the expected value $E[X]$ of an individual term. The variance of the random sum is as follows (Eq. 5):

$$\mathrm{Var}[Y_N] = E[N]\, \mathrm{Var}[X] + \mathrm{Var}[N]\, (E[X])^2 \tag{5}$$
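As a quick numerical check of the mean and variance formulas for a random sum (Eq. 4 and Eq. 5), the following Python sketch evaluates them for a Poisson number of Weibull terms. The function names are illustrative, and the Weibull moments use the standard gamma-function formulas for a shape $k$ and scale $\eta$ parameterization, which is assumed here to be the one the article intends. The parameter values $\lambda = 5$, $k = 1$ and $\eta = 4$ are those of the numerical example later in the article; for a Poisson $N$, $E[N] = \mathrm{Var}[N] = \lambda$.

```python
import math


def weibull_mean_var(k, eta):
    """Mean and variance of a Weibull variable (shape k, scale eta; assumed parameterization)."""
    g1 = math.gamma(1.0 + 1.0 / k)
    g2 = math.gamma(1.0 + 2.0 / k)
    return eta * g1, eta**2 * (g2 - g1**2)


def random_sum_mean_var(mean_n, var_n, mean_x, var_x):
    """Eq. 4 and Eq. 5: E[Y] = E[N]E[X], Var[Y] = E[N]Var[X] + Var[N]E[X]^2."""
    return mean_n * mean_x, mean_n * var_x + var_n * mean_x**2


lam, k, eta = 5.0, 1.0, 4.0
mean_x, var_x = weibull_mean_var(k, eta)
# For Poisson N, both the mean and the variance equal lam.
mean_y, var_y = random_sum_mean_var(lam, lam, mean_x, var_x)
print(mean_y, var_y)
```

With these values the computed mean agrees with the value of 20 that the article obtains from Eq. 4.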

The moment-generating function of the random sum $Y_N$ is given by Eq. 6 (Hogg and Tanis, 1983):

$$M_{Y_N}(s) = E\left[e^{sY_N}\right] = M_N(\ln M_X(s)) \tag{6}$$

where $\ln$ is the natural logarithm. The cumulant-generating function is given by Eq. 7:

$$K_{Y_N}(s) = \ln M_{Y_N}(s) = K_N(K_X(s)) \tag{7}$$

SADDLEPOINT APPROXIMATION TO DENSITIES AND MASS FUNCTIONS

The most basic saddlepoint approximation was introduced by Daniels (1954) and is essentially a formula for approximating a density or mass function from its associated moment-generating function. Saddlepoint approximations are constructed by assuming the existence of the moment-generating function (MGF) or, equivalently, the cumulant-generating function (CGF) of a random variable. For developments of the saddlepoint methodology and associated techniques, the reader is referred to Daniels (1954, 1987) for density and mass approximations, Skovgaard (1987) for a conditional version of this approximation, Reid (1988) for applications to inference, Borowiak (1999) for a tail-area approximation with a uniform relative error and Terrell (2003) for a stabilized Lugannani-Rice formula.

Suppose a continuous random variable $X$ has density function $f(x)$ defined for all real values of $x$. The MGF is defined as follows (Hogg and Craig, 1978) (Eq. 8):

$$M_X(s) = E\left[e^{sX}\right] = \int_{-\infty}^{\infty} e^{sx} f(x)\, dx \tag{8}$$

for all values of $s$ at which this expectation exists. $M_X(0)$ always exists and is equal to 1. We assume that $M_X(s)$ converges on an open neighborhood $(a, b)$ of zero and that $(a, b)$ is the largest such neighborhood. The CGF is given by Eq. 9:

$$K_X(s) = \ln M_X(s), \quad s \in (a, b) \tag{9}$$

For a continuous random variable $X$ with CGF $K_X(s)$ and unknown density $f(x)$, the saddlepoint density approximation to $f(x)$ is given by Eq. 10:

$$\hat{f}(x) = \frac{1}{\sqrt{2\pi K_X''(\hat{s})}} \exp\left\{K_X(\hat{s}) - \hat{s}x\right\} \tag{10}$$

where $\hat{s} = \hat{s}(x)$ denotes the unique solution to the saddlepoint equation $K_X'(\hat{s}) = x$ and $K_X''(\hat{s})$ is the second derivative of the CGF evaluated at $\hat{s}$ (Daniels, 1954). This approximation is meaningful for values of $x$ in the interior of the support $\chi = \{x: f(x) > 0\}$.
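To illustrate the saddlepoint density formula on a case where the exact density is known, the sketch below applies it to a Gamma($\alpha$, 1) variable. This test case is not taken from the article; it is chosen as an assumption because its CGF, $K(s) = -\alpha \ln(1 - s)$ for $s < 1$, gives the saddlepoint in closed form, $\hat{s} = 1 - \alpha/x$. The function names are illustrative.

```python
import math


def saddlepoint_density_gamma(x, alpha):
    """Saddlepoint density for Gamma(shape alpha, scale 1), using K(s) = -alpha*ln(1 - s)."""
    s_hat = 1.0 - alpha / x                  # solves K'(s) = alpha / (1 - s) = x
    K = -alpha * math.log(1.0 - s_hat)       # CGF at the saddlepoint
    K2 = alpha / (1.0 - s_hat) ** 2          # second derivative of the CGF
    return math.exp(K - s_hat * x) / math.sqrt(2.0 * math.pi * K2)


def gamma_density(x, alpha):
    """Exact Gamma(shape alpha, scale 1) density."""
    return x ** (alpha - 1.0) * math.exp(-x) / math.gamma(alpha)


alpha = 2.0
ratios = [saddlepoint_density_gamma(x, alpha) / gamma_density(x, alpha)
          for x in (0.5, 1.0, 5.0)]
print(ratios)
```

In this particular case the ratio of the approximate to the exact density does not depend on $x$, so the relative error is uniform and the approximation would become exact after renormalization.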

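The saddlepoint approximation to a univariate cumulative distribution function $F(x)$ is given by Eq. 11:

$$\hat{F}(x) = \begin{cases} \Phi(\hat{w}) + \phi(\hat{w})\left(\dfrac{1}{\hat{w}} - \dfrac{1}{\hat{u}}\right), & x \neq \mu \\[2ex] \dfrac{1}{2} + \dfrac{K_X'''(0)}{6\sqrt{2\pi}\, K_X''(0)^{3/2}}, & x = \mu \end{cases} \tag{11}$$

where the continuous random variable $X$ has CDF $F(x)$ and CGF $K_X(s)$ with mean $\mu = E(X)$, and $\hat{w}$ and $\hat{u}$ are defined as follows (Eq. 12):

$$\hat{w} = \mathrm{sgn}(\hat{s})\sqrt{2\{\hat{s}x - K_X(\hat{s})\}}, \qquad \hat{u} = \hat{s}\sqrt{K_X''(\hat{s})} \tag{12}$$

Both $\hat{w}$ and $\hat{u}$ are functions of $x$ and of the saddlepoint $\hat{s}$, where $\hat{s}$ is the implicitly defined function of $x$ given by the unique solution to $K_X'(\hat{s}) = x$. The symbols $\phi$ and $\Phi$ denote the standard normal density and CDF, respectively, and $\mathrm{sgn}(\hat{s})$ captures the sign ($\pm$) of $\hat{s}$ (Butler, 2007).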
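The CDF approximation described above (the Lugannani-Rice form given in Butler, 2007) can be sketched in Python as follows. The function name `lugannani_rice` and its argument list are illustrative assumptions, and only the branch for $x \neq \mu$ is implemented. The standard normal distribution, with $K(s) = s^2/2$, is used as a test case because the saddlepoint is then $\hat{s} = x$ and the approximation reproduces $\Phi(x)$.

```python
import math


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal CDF via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def lugannani_rice(x, s_hat, K, K2):
    """Saddlepoint CDF for x != mean, given s_hat, K(s_hat) and K''(s_hat)."""
    w = math.copysign(math.sqrt(2.0 * (s_hat * x - K)), s_hat)
    u = s_hat * math.sqrt(K2)
    return norm_cdf(w) + norm_pdf(w) * (1.0 / w - 1.0 / u)


# Standard normal test case: K(s) = s^2 / 2, so s_hat = x and K''(s) = 1.
x = 1.0
approx = lugannani_rice(x, s_hat=x, K=0.5 * x * x, K2=1.0)
print(approx, norm_cdf(x))
```

Passing the saddlepoint and the CGF values as arguments keeps the function independent of any particular distribution, so the same routine can be reused once $\hat{s}$ has been found for another CGF.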

APPLICATIONS OF THE RANDOM-SUM POISSON-WEIBULL DISTRIBUTION

The Weibull distribution is a continuous probability distribution with the following probability density function (Eq. 13):

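$$f(x; k, \eta) = \frac{k}{\eta}\left(\frac{x}{\eta}\right)^{k-1} e^{-(x/\eta)^k}, \quad x \ge 0 \tag{13}$$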

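where $k > 0$ is the shape parameter and $\eta > 0$ is the scale parameter of the distribution. The mean and the variance of a Weibull random variable can be expressed as follows:

$$E[X] = \eta\, \Gamma\left(1 + \frac{1}{k}\right), \qquad \mathrm{Var}[X] = \eta^2\left[\Gamma\left(1 + \frac{2}{k}\right) - \Gamma\left(1 + \frac{1}{k}\right)^2\right]$$

The MGF can be written as the following series, which converges for all $s$ when $k > 1$ and for $|s| < 1/\eta$ when $k = 1$:

$$M_X(s) = \sum_{n=0}^{\infty} \frac{(\eta s)^n}{n!}\, \Gamma\left(1 + \frac{n}{k}\right)$$

For $k = 1$ the Weibull distribution reduces to the exponential distribution and $M_X(s) = 1/(1 - \eta s)$ for $s < 1/\eta$.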

The random-sum Poisson-Weibull variable has the following form (Eq. 14):

$$Y_N = \sum_{i=1}^{N} X_i \tag{14}$$

where the sample size $N$ follows a Poisson($\lambda$) distribution and the $X_i$ are i.i.d. random variables that follow a Weibull distribution. The sum $Y_N$ is said to have a random-sum Poisson-Weibull distribution. Exact calculation of this distribution is very difficult. The saddlepoint approximation method overcomes this problem because it requires only the moment-generating function of the random sum. The cumulant-generating function of $N$ is given by Eq. 15:

$$K_N(s) = \lambda\left(e^{s} - 1\right) \tag{15}$$

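For the $X_i$, which are i.i.d. random variables following a Weibull distribution, the cumulant-generating function is given by Eq. 16:

$$K_X(s) = \ln M_X(s) = \ln\left[\sum_{n=0}^{\infty} \frac{(\eta s)^n}{n!}\, \Gamma\left(1 + \frac{n}{k}\right)\right] \tag{16}$$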

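Composing the cumulant-generating function of $N$ with that of the $X_i$, as prescribed by Eq. 7, gives the cumulant-generating function of the random-sum Poisson-Weibull distribution (Eq. 17):

$$K_{Y_N}(s) = K_N(K_X(s)) = \lambda\left(M_X(s) - 1\right) \tag{17}$$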

The saddlepoint equation is as follows (Eq. 18):

$$K_{Y_N}'(\hat{s}) = \lambda M_X'(\hat{s}) = x \tag{18}$$

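Saddlepoint computation involves the first and second derivatives of the cumulant-generating function. The saddlepoint is the root of the saddlepoint equation (Eq. 18), that is, the value of $s$ at which the first derivative equals $x$. The unique real root $\hat{s} = \hat{s}(x)$ can be determined numerically. The second derivative of the cumulant-generating function is given by Eq. 19:

$$K_{Y_N}''(s) = \lambda M_X''(s) \tag{19}$$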
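One way to determine the saddlepoint numerically is bisection on the saddlepoint equation, sketched below in Python. The solver name and its bracketing interval are illustrative assumptions, not the procedure used in the article. The sketch is restricted to the shape $k = 1$, where the Weibull distribution is exponential and $M_X(s) = 1/(1 - \eta s)$ for $s < 1/\eta$, so that $K_{Y_N}'(s) = \lambda\eta/(1 - \eta s)^2$ and $K_{Y_N}''(s) = 2\lambda\eta^2/(1 - \eta s)^3$. The values $\lambda = 5$, $\eta = 4$ and $x = 0.02$ are those of the article's numerical example.

```python
def solve_saddlepoint(K1, x, lo, hi, tol=1e-12, max_iter=200):
    """Bisection for K1(s) = x on [lo, hi]; assumes K1 is increasing and the root is bracketed."""
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if K1(mid) < x:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


lam, eta = 5.0, 4.0


def K1(s):
    """First derivative of the CGF of the random sum for k = 1."""
    return lam * eta / (1.0 - eta * s) ** 2


def K2(s):
    """Second derivative of the CGF of the random sum for k = 1."""
    return 2.0 * lam * eta**2 / (1.0 - eta * s) ** 3


# The CGF is defined only for s < 1/eta, so the upper bracket stays just below that bound.
s_hat = solve_saddlepoint(K1, x=0.02, lo=-100.0, hi=1.0 / eta - 1e-9)
print(s_hat, K2(s_hat))
```

Bisection is chosen here for robustness: the first derivative of a CGF is increasing, so any bracket containing the root converges without needing a good starting value.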

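Using Eq. 10, the saddlepoint density function for the random-sum Poisson-Weibull distribution can be expressed in the following form (Eq. 20):

$$\hat{f}_{Y_N}(x) = \frac{1}{\sqrt{2\pi K_{Y_N}''(\hat{s})}} \exp\left\{K_{Y_N}(\hat{s}) - \hat{s}x\right\} \tag{20}$$

where $K_{Y_N}''(\hat{s})$ is given by Eq. 19 and $K_{Y_N}(\hat{s})$ is the cumulant-generating function evaluated at the saddlepoint $\hat{s}$.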

Because of its flexible shape and its ability to fit a wide range of data, the Weibull distribution has been used successfully in many applications, such as the modeling of wind speeds.

Wind speeds in most places in the world can be modeled using a Weibull distribution. This statistical tool tells us how often winds of different speeds will be observed at a location with a certain average (mean) wind speed. Knowing this helps us to choose a wind turbine with the optimal cut-in speed (the wind speed at which the turbine starts to generate usable power) and the cut-out speed (the speed at which the turbine hits the limit of its alternator and can no longer put out power with further increases in wind speed).

The shape of the Weibull distribution depends on a parameter called (helpfully) the shape parameter. In Northern Europe and most other locations around the world, its value is approximately 2; it typically ranges from 1 to 3.

Let $N$ be the number of wind events that occur in a country or region during a given time period and let $X_i$ be the amount of energy a wind turbine produces during the $i$-th event. Then $Y_N$ gives the total amount of energy the wind turbine produces over that time period.

Now, let $\hat{F}(x)$ be as given in Eq. 11, with $\hat{w}$ and $\hat{u}$ as given in Eq. 12 and $\hat{s}$ the solution of the saddlepoint equation (Eq. 18). Let the shape parameter be $k = 1$, with $\lambda = 5$ and $\eta = 4$, and take $x = 0.02$. Then the value of the saddlepoint is:

$$\hat{s} = -7.65569415$$

The cumulant-generating function value is:

$$K_{Y_N}(\hat{s}) = -4.841886117$$

The second derivative of the cumulant-generating function is:

$$K_{Y_N}''(\hat{s}) = 0.005059644256$$

and:

$$\hat{w} = -3.06227766, \qquad \hat{u} = -0.544558531$$

Table 1: Comparison of the exact probabilities with the saddlepoint approximations for the Poisson-Weibull distribution

Fig. 1: Comparison of the exact CDF for a Poisson-Weibull distribution (solid line) vs. its saddlepoint approximation (dashed line)

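Then, the saddlepoint cumulative distribution function value for the random-sum Poisson-Weibull distribution at $x = 0.02$ is obtained by substituting these values into Eq. 11:

$$\hat{F}(0.02) = \Phi(-3.06227766) + \phi(-3.06227766)\left(\frac{1}{-3.06227766} - \frac{1}{-0.544558531}\right)$$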
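The worked values for $x = 0.02$ can be reproduced with a short Python sketch. It assumes the closed forms available for $k = 1$, where the Weibull distribution is exponential with $M_X(s) = 1/(1 - \eta s)$, so that $K_{Y_N}(s) = \lambda\{1/(1 - \eta s) - 1\}$ and the saddlepoint equation solves to $\hat{s} = (1 - \sqrt{\lambda\eta/x})/\eta$. The variable names are illustrative.

```python
import math

lam, eta, x = 5.0, 4.0, 0.02   # Poisson rate, Weibull scale (shape k = 1), evaluation point

# Saddlepoint: lam*eta / (1 - eta*s)^2 = x, taking the root with s < 1/eta.
s_hat = (1.0 - math.sqrt(lam * eta / x)) / eta

# CGF of the random sum and its second derivative at the saddlepoint.
K = lam * (1.0 / (1.0 - eta * s_hat) - 1.0)
K2 = 2.0 * lam * eta**2 / (1.0 - eta * s_hat) ** 3

# Quantities entering the saddlepoint CDF approximation.
w = math.copysign(math.sqrt(2.0 * (s_hat * x - K)), s_hat)
u = s_hat * math.sqrt(K2)

phi_w = math.exp(-0.5 * w * w) / math.sqrt(2.0 * math.pi)   # standard normal density
Phi_w = 0.5 * math.erfc(-w / math.sqrt(2.0))                # standard normal CDF
F_hat = Phi_w + phi_w * (1.0 / w - 1.0 / u)

print(s_hat, K, K2, w, u, F_hat)
```

The intermediate quantities agree with the values reported in the text, which supports reading the example as the exponential special case of the Weibull model.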

We use the empirical distribution function as the exact cumulative distribution function of the Poisson-Weibull distribution: a MATLAB program simulates $10^6$ independent values of $Y_N$, where $N$ is Poisson(5) and the $X_i$ are generated from a Weibull(1, 4) distribution. Table 1 compares the exact probabilities with the saddlepoint approximations for the Poisson-Weibull distribution. For each value of $x$, the first entry in the corresponding cell of Table 1 is the exact Poisson-Weibull probability, the second is the saddlepoint approximation and the third is the relative error.
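The simulation step can be sketched in Python using only the standard library; this is an illustrative stand-in, not the MATLAB program used in the study. The helper names are assumptions, the Poisson sampler uses Knuth's multiplication method (adequate for a small rate such as 5) and the sample size is reduced from $10^6$ to 20,000 to keep the run short.

```python
import math
import random


def poisson_sample(lam, rng):
    """Draw one Poisson(lam) value with Knuth's multiplication method."""
    limit = math.exp(-lam)
    n, p = 0, rng.random()
    while p > limit:
        n += 1
        p *= rng.random()
    return n


def simulate_random_sum(lam, k, eta, size, seed=0):
    """Simulate `size` values of Y_N with N ~ Poisson(lam) and X_i ~ Weibull(shape k, scale eta)."""
    rng = random.Random(seed)
    # random.weibullvariate(alpha, beta): alpha is the scale, beta is the shape.
    return [sum(rng.weibullvariate(eta, k) for _ in range(poisson_sample(lam, rng)))
            for _ in range(size)]


def empirical_cdf(samples, x):
    """Fraction of simulated values that are less than or equal to x."""
    return sum(1 for y in samples if y <= x) / len(samples)


samples = simulate_random_sum(lam=5.0, k=1.0, eta=4.0, size=20000)
sample_mean = sum(samples) / len(samples)
print(sample_mean, empirical_cdf(samples, 0.02), empirical_cdf(samples, 18.0))
```

Because $Y_N = 0$ whenever $N = 0$, the empirical CDF has a jump of about $e^{-5}$ at zero, which is the mixed-distribution behaviour noted earlier in the article.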

Figure 1 shows that the saddlepoint approximation of the CDF closely follows the exact CDF. The mean squared error of the saddlepoint approximation is MSE = 0.0783707, which indicates that the saddlepoint approximation is close to the exact CDF. From the CDF, we can derive various statistical measures of central tendency, such as the mean, the median and the mode. The median is the value of the inverse CDF at 0.5. Using the MATLAB program, we find that $\hat{F}^{-1}(0.5) = 18$. From Eq. 4, we find that the mean is 20. Because the distribution is asymmetric, we use the empirical relation below to derive the mode:

Mode = 3(median) - 2(mean) = 3(18) - 2(20) = 54 - 40 = 14
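The median can be obtained by numerically inverting the saddlepoint CDF, as in the following Python sketch. It is an illustration under assumptions rather than the article's MATLAB code: it uses the closed forms available for $k = 1$ (exponential terms with $M_X(s) = 1/(1 - \eta s)$), evaluates only the $x \neq \mu$ branch of the approximation and searches by bisection on an interval that stays below the mean of 20. The function names are hypothetical.

```python
import math

LAM, ETA = 5.0, 4.0   # Poisson rate and Weibull scale; shape k = 1 is assumed


def saddlepoint_cdf(x):
    """Saddlepoint CDF of the random sum for k = 1, valid for x > 0 and x != LAM*ETA."""
    s_hat = (1.0 - math.sqrt(LAM * ETA / x)) / ETA
    K = LAM * (1.0 / (1.0 - ETA * s_hat) - 1.0)
    K2 = 2.0 * LAM * ETA**2 / (1.0 - ETA * s_hat) ** 3
    w = math.copysign(math.sqrt(2.0 * (s_hat * x - K)), s_hat)
    u = s_hat * math.sqrt(K2)
    phi_w = math.exp(-0.5 * w * w) / math.sqrt(2.0 * math.pi)
    Phi_w = 0.5 * math.erfc(-w / math.sqrt(2.0))
    return Phi_w + phi_w * (1.0 / w - 1.0 / u)


def invert_cdf(p, lo, hi, tol=1e-10):
    """Bisection for saddlepoint_cdf(x) = p, assuming the CDF is increasing on [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if saddlepoint_cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


median = invert_cdf(0.5, lo=1.0, hi=19.9)
mean = LAM * ETA
mode = 3.0 * median - 2.0 * mean   # empirical relation used in the text
print(median, mode)
```

The search interval excludes the mean because the first branch of the approximation is numerically delicate as $\hat{s}$ approaches zero.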

CONCLUSION

This study introduced saddlepoint approximations to the cumulative distribution function of random-sum Poisson-Weibull distributions in continuous settings. We discussed approximations to random-sum random variables with independent components, assuming the existence of a moment-generating function. Measures of central tendency were derived. We used an empirical distribution function, based on a simulation of one million independent values of $Y_N$, as the exact CDF. A numerical example of a continuous distribution function for a Poisson-Weibull distribution was presented. We found that the saddlepoint approximation to the CDF closely matches the exact CDF and that the mean squared error of the saddlepoint approximation was close to zero.

ACKNOWLEDGMENTS

This study was supported financially by Taibah University, Al Madinah-M., Kingdom of Saudi Arabia.

REFERENCES
1:  Borowiak, D.S., 1999. A saddlepoint approximation for tail probabilities in collective risk models. J. Actuarial Pract., 7: 1-11.

2:  Butler, R.W., 2007. Saddlepoint Approximations with Applications. Cambridge University Press, USA.

3:  Daniels, H.E., 1954. Saddlepoint approximations in statistics. Ann. Math. Stat., 25: 631-650.

4:  Daniels, H.E., 1987. Tail probability approximations. Int. Stat. Rev., 55: 37-48.

5:  Johnson, N.L., A.W. Kemp and S. Kotz, 2005. Univariate Discrete Distributions. 3rd Edn., John Wiley and Sons, New York, ISBN-13: 9780471715801, pp: 386-388.

6:  Hogg, R.V. and A.T. Craig, 1978. Introduction to Mathematical Statistics. 4th Edn., Collier Macmillan Publisher, USA.

7:  Hogg, R.V. and E.A. Tanis, 1983. Probability and Statistical Inference. 2nd Edn., Macmillan Publishing Co., New York, USA., ISBN-13: 978-0023557309.

8:  Reid, N., 1988. Saddlepoint methods and statistical inference. Stat. Sci., 3: 213-227.

9:  Skovgaard, I.M., 1987. Saddlepoint expansions for conditional distributions. J. Applied Probab., 24: 875-887.

10:  Terrell, G.R., 2003. A stabilized Lugannani-Rice formula. Proceedings of the Symposium on the Interface: Computing Science and Statistics, March 12-15, 2003, Salt Lake, USA.

11:  Neyman, J., 1939. On a new class of contagious distributions applicable in entomology and bacteriology. Ann. Math. Statist., 10: 35-57.

12:  Rao, C.R., R.C. Srivastava, S. Talwalker and G.A. Edgar, 1980. Characterization of probability distributions based on a generalized Rao-Rubin condition. Sankhya A, 42: 161-169.

13:  Malinovskii, V.K., 1994. Limit theorems for stopped random sequences, I: Rates of convergence and asymptotic expansions. Th. Prob. Appl., 38: 673-693.

14:  Esscher, F., 1932. On the probability function in the collective theory of risk. Scand. Actuarial J., 15: 175-195.

15:  Jensen, J.L., 1995. Saddlepoint Approximations. Clarendon Press, Oxford.

16:  Gurland, J., 1957. Some interrelations among compound and generalized distributions. Biometrika, 44: 265-268.
