ABSTRACT
This study introduces the idea of using the ranked set sampling scheme for the Monte Carlo integral estimation problem. We propose and discuss the method for the unidimensional integral problem. It is demonstrated that this approach provides unbiased and more efficient estimators than the traditional estimators based on simple random sampling. The method is illustrated by examples for estimating π and the integral of f(x) = e^{-x^2}, 0 ≤ x ≤ 1. An application to estimating the Gini index is proposed.
How to cite this article
DOI: 10.3923/ajms.2010.130.138
URL: https://scialert.net/abstract/?doi=ajms.2010.130.138
INTRODUCTION
A definite integral, I, that cannot be evaluated explicitly can be approximated by a variety of numerical methods. The importance of a good Monte Carlo integration scheme is therefore evident. Some of these methods were given by Rubinstein (1981) and Morgan (1984) for the univariate integration problem. In this study, our concern is the sample-mean Monte Carlo method for integral estimation. Consider the one-dimensional integral:
$$I = \int_a^b g(x)\,dx \qquad (1)$$
This integral can be represented as the expected value of some random variable. Indeed, let us rewrite the integral as:
$$I = \int_a^b \frac{g(x)}{f(x)}\, f(x)\,dx$$
Assuming that f(x) is any pdf such that f(x) > 0 for a < x < b whenever g(x) ≠ 0, then:
$$I = E\left[\frac{g(X)}{f(X)}\right]$$
For simplicity, suppose that X is uniformly distributed over [a, b], i.e., X ~ U(a, b); then:
$$I = (b-a)\,E[g(X)]$$
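Therefore, an unbiased estimator of I based on a simple random sample (SRS) X_1, ..., X_n from U(a, b) is given by:
$$\hat{I}_{SRS} = \frac{b-a}{n}\sum_{i=1}^{n} g(X_i)$$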
The variance of this estimator is:
$$\mathrm{Var}(\hat{I}_{SRS}) = \frac{1}{n}\left[(b-a)\int_a^b g^2(x)\,dx - I^2\right]$$
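The sample-mean estimator based on SRS can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function name `srs_integral`, the test integrand g(x) = x² (whose integral over [0, 1] is 1/3) and the seed are all assumptions chosen for the example.

```python
import random

def srs_integral(g, a, b, n, rng):
    """Sample-mean Monte Carlo estimate of the integral of g over [a, b]."""
    total = 0.0
    for _ in range(n):
        x = a + (b - a) * rng.random()  # X ~ U(a, b)
        total += g(x)
    return (b - a) * total / n          # (b - a) times the sample mean of g(X)

# Illustrative integrand (an assumption): g(x) = x^2 on [0, 1], exact value 1/3.
rng = random.Random(2010)
estimate = srs_integral(lambda x: x * x, 0.0, 1.0, 20000, rng)
print(estimate)
```

The estimate fluctuates around 1/3 from seed to seed, with a spread that shrinks at the usual 1/√n Monte Carlo rate.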
It is in any case interesting to see how simple random numbers may be used to evaluate a deterministic integral. However, one can also utilize the idea of Ranked Set Sampling (RSS) of McIntyre (1952) for integral approximation. The majority of research on RSS has been concerned with estimating the population mean. Few works in the literature have considered RSS in Monte Carlo methods; Samawi (1999) used the random Beta sampler to evaluate non-stochastic integrals. In a similar fashion, Al-Saleh and Samawi (2000) investigated the use of Steady State RSS for integral approximation. It turned out that this procedure improves the efficiency of Monte Carlo methods much further. Later, Samawi and Al-Saleh (2007) used the importance sampling technique with RSS for the approximation of multiple integrals.
In this study, we use simulated RSS for univariate integral estimation based on the sample-mean Monte Carlo method.
RANKED SET SAMPLING
The balanced RSS scheme involves drawing m simple random samples (sets), each of size m, from a population and ranking each set with respect to the variable of interest. Then, from the first set, the element with the smallest rank is chosen for actual measurement. From the second set, the element with the second smallest rank is chosen. The process continues by selecting the ith order statistic of the ith random sample until the element with the largest rank from the mth set is chosen. The scheme yields the following data:
![]() |
Hence, the selected RSS will be denoted by:
$$X_{[1:m]}, X_{[2:m]}, \ldots, X_{[m:m]}$$
where X[i:m] is the ith order statistic of the ith random sample of size m and is called the ith judgment order statistic. Note that the selected elements are independent order statistics but are not identically distributed.
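In practice, the set size m is kept small to ease the visual ranking; the RSS literature suggests m = 2, 3, 4, 5 or 6. Therefore, if a larger sample is needed, the entire cycle may be repeated several times, say r times, to produce an RSS of size n = rm. The elements of the desired sample will then be in the form:
$$\begin{matrix}
X_{[1:m]1} & X_{[1:m]2} & \cdots & X_{[1:m]r} \\
X_{[2:m]1} & X_{[2:m]2} & \cdots & X_{[2:m]r} \\
\vdots & \vdots & \ddots & \vdots \\
X_{[m:m]1} & X_{[m:m]2} & \cdots & X_{[m:m]r}
\end{matrix} \qquad (2)$$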
which can be represented as:
$$\{X_{[i:m]j} : i = 1, \ldots, m;\ j = 1, \ldots, r\}$$
where X[i:m]j is the ith judgment order statistic in the jth cycle, that is, the ith order statistic of the ith random sample of size m in the jth cycle. Note that all of the X[i:m]j are mutually independent; in addition, the X[i:m]j in the same row of (2) are identically distributed. More details can be found in Al-Nasser and Al-Rawwash (2007).
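The balanced RSS scheme with r cycles can be sketched as follows. This is an illustrative sketch under the assumption of perfect ranking, which holds when the values are simulated and can simply be sorted; the function name `ranked_set_sample` and the `draw` callback are hypothetical names introduced here.

```python
import random

def ranked_set_sample(draw, m, r, rng):
    """Balanced RSS: sample[j][i] is the (i+1)-th judgment order statistic of cycle j."""
    sample = []
    for _ in range(r):                       # repeat the whole cycle r times
        cycle = []
        for i in range(m):                   # one fresh SRS set of size m per rank
            ranked = sorted(draw(rng) for _ in range(m))
            cycle.append(ranked[i])          # keep only the (i+1)-th smallest element
        sample.append(cycle)
    return sample

# Example: set size m = 3, r = 4 cycles, drawing from U(0, 1).
rng = random.Random(1)
rss = ranked_set_sample(lambda g: g.random(), m=3, r=4, rng=rng)
for cycle in rss:
    print(cycle)
```

Each retained value costs m draws, so an RSS of size n = rm uses rm² simulated values in total. That is cheap when the units are simulated, which is the setting considered here.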
ONE DIMENSIONAL INTEGRAL USING RSS
To apply the sample-mean Monte Carlo method with an RSS design to the integral in Eq. 1, an RSS of size n must be selected. The estimation then proceeds in the following steps:
Step 1: Generate an RSS of size n = m × r from U(0, 1), denoted U[i:m]j
![]() |
Step 2: Compute X(i)j = a + (b − a) U[i:m]j
Step 3: Compute g(X(i)j)
Step 4: Find the ranked sample-mean estimator
$$\hat{I}_{RSS} = \frac{b-a}{rm}\sum_{j=1}^{r}\sum_{i=1}^{m} g(X_{(i)j}) \qquad (3)$$
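Steps 1 to 4 can be sketched in Python as below. This is a minimal sketch, not the authors' implementation: it assumes the uniform variates are ranked on (0, 1) and then mapped to [a, b], and the function name `rss_integral` and the test integrand g(x) = x² are illustrative choices.

```python
import random

def rss_integral(g, a, b, m, r, rng):
    """Ranked sample-mean Monte Carlo estimate of the integral of g over [a, b]."""
    total = 0.0
    for _ in range(r):                                   # cycles j = 1..r
        for i in range(m):                               # ranks i = 1..m
            u_set = sorted(rng.random() for _ in range(m))
            u = u_set[i]                                 # Step 1: i-th order statistic of a fresh set
            x = a + (b - a) * u                          # Step 2: map to [a, b]
            total += g(x)                                # Step 3: evaluate g
    return (b - a) * total / (m * r)                     # Step 4: ranked sample mean

# Illustrative integrand (an assumption): g(x) = x^2 on [0, 1], exact value 1/3.
rng = random.Random(42)
estimate = rss_integral(lambda x: x * x, 0.0, 1.0, m=5, r=2000, rng=rng)
print(estimate)
```

Because the mapping from (0, 1) to [a, b] is increasing, ranking the uniform variates before or after the transformation gives the same sample.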
Lemma 1
The ranked sample-mean estimator in Eq. 3 is an unbiased estimator of I given in Eq. 1.
Proof
To prove this lemma, take the expected value of both sides of Eq. 3:
![]() |
Since the RSS elements are independent order statistics, this expectation can be rewritten as:
![]() |
But f(x) is the U(a, b) density, which means:
![]() |
which completes the proof of the lemma.
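For reference, the argument can be written out in one chain. This is a reconstruction under stated assumptions rather than the authors' exact display: it writes the ranked sample-mean estimator as \hat{I}_{RSS}, writes f_{(i:m)} for the density of the ith order statistic of a sample of size m from f, and uses the standard identity that the m order-statistic densities average to the parent density.

```latex
\begin{align*}
E\big[\hat{I}_{RSS}\big]
  &= \frac{b-a}{rm}\sum_{j=1}^{r}\sum_{i=1}^{m} E\big[g(X_{(i)j})\big]
   = \frac{b-a}{m}\sum_{i=1}^{m}\int_a^b g(x)\, f_{(i:m)}(x)\,dx \\
  &= (b-a)\int_a^b g(x)\left[\frac{1}{m}\sum_{i=1}^{m} f_{(i:m)}(x)\right]dx
   = (b-a)\int_a^b g(x)\, f(x)\,dx \\
  &= (b-a)\int_a^b g(x)\,\frac{1}{b-a}\,dx = I .
\end{align*}
```

The second equality uses the fact that, for a fixed rank i, the r cycle replicates are identically distributed; the last line substitutes the U(a, b) density f(x) = 1/(b − a).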
EMPIRICAL STUDY
In this section, we carry out experiments to compare the efficiency of SRS and RSS in three different areas (mathematics, statistics and economics) by considering the problems of estimating π, the Gaussian integral and the Gini index. For these comparisons, we generate 50,000 random samples, each of size n = m × r, where m takes the values 2, 6, 20 and 500 and r takes the values 3, 4, 5 and 6. The simulated mean squared error (MSE), bias and efficiency (EFF) of the parameter estimates are used as the comparison criteria, as follows:
![]() |
and the efficiency:
![]() |
where θ is the exact value, θ̂ is the estimate of the parameter and NOI is the number of generated samples (50,000).
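The comparison criteria can be sketched as a small simulation. Since the displayed formulas are not reproduced here, the definitions in the code are assumptions: MSE is taken as the average squared deviation from θ, bias as the average signed deviation, and EFF as the ratio MSE(SRS)/MSE(RSS), so that values above 1 favour RSS. The function names, the test integrand g(x) = x and the reduced number of replications (2,000 instead of 50,000) are illustrative choices.

```python
import random

def srs_estimate(g, a, b, n, rng):
    """Sample-mean estimate from a simple random sample of size n."""
    return (b - a) * sum(g(a + (b - a) * rng.random()) for _ in range(n)) / n

def rss_estimate(g, a, b, m, r, rng):
    """Ranked sample-mean estimate from an RSS with set size m and r cycles."""
    total = 0.0
    for _ in range(r):
        for i in range(m):
            u = sorted(rng.random() for _ in range(m))[i]
            total += g(a + (b - a) * u)
    return (b - a) * total / (m * r)

def mse_and_bias(estimates, theta):
    """Assumed definitions: mean squared and mean signed deviation from theta."""
    noi = len(estimates)
    mse = sum((e - theta) ** 2 for e in estimates) / noi
    bias = sum(e - theta for e in estimates) / noi
    return mse, bias

def compare(g, a, b, theta, m, r, noi, seed=0):
    rng = random.Random(seed)
    srs = [srs_estimate(g, a, b, m * r, rng) for _ in range(noi)]
    rss = [rss_estimate(g, a, b, m, r, rng) for _ in range(noi)]
    mse_srs, bias_srs = mse_and_bias(srs, theta)
    mse_rss, bias_rss = mse_and_bias(rss, theta)
    return {"mse_srs": mse_srs, "bias_srs": bias_srs,
            "mse_rss": mse_rss, "bias_rss": bias_rss,
            "eff": mse_srs / mse_rss}   # assumed: EFF > 1 means RSS is more efficient

# Illustrative case: integral of g(x) = x over [0, 1], exact value 0.5.
result = compare(lambda x: x, 0.0, 1.0, theta=0.5, m=3, r=4, noi=2000)
print(result)
```

For this linear integrand the theoretical efficiency is (m + 1)/2, so the simulated EFF for m = 3 should be close to 2.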
The Number π
The constant π is an irrational number; that is, it cannot be written as the ratio of two integers. By using the equivalent of 96-sided polygons, Archimedes (287-212 BC) proved that 223/71 < π < 22/7. Taking the average of these bounds yields 3.1419. However, π can also be estimated empirically by drawing a large circle, measuring its diameter and circumference and dividing the circumference by the diameter.
For any circle with radius r and diameter d = 2r, the circumference is πd and the area is πr². Further, π appears in formulas for areas and volumes of many other geometrical shapes based on circles, such as ellipses, spheres, cones and tori. Accordingly, π appears in definite integrals that describe the circumference, area or volume of shapes generated by circles. In the basic case, the area of a quadrant of a circle of unit radius is given by:
$$\frac{\pi}{4} = \int_0^1 \sqrt{1-x^2}\,dx \qquad (4)$$
Table 1: Comparison between SRS and RSS in estimating π
![]() | |
The results for estimating the integral in Eq. 4 are shown in Table 1, which reports the MSE, bias and EFF of the parameter estimates using SRS and RSS.
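The π experiment can be sketched as follows, taking the quadrant integrand g(x) = √(1 − x²) on [0, 1] and multiplying the estimated area by 4. The function names, seed and sample sizes are illustrative assumptions and do not reproduce the settings behind Table 1.

```python
import math
import random

def quarter_circle(x):
    """Integrand for the unit-circle quadrant: sqrt(1 - x^2) on [0, 1]."""
    return math.sqrt(1.0 - x * x)

def pi_srs(n, rng):
    """Estimate pi as 4 times the SRS sample mean of the quadrant integrand."""
    return 4.0 * sum(quarter_circle(rng.random()) for _ in range(n)) / n

def pi_rss(m, r, rng):
    """Estimate pi as 4 times the ranked sample mean (set size m, r cycles)."""
    total = 0.0
    for _ in range(r):
        for i in range(m):
            u = sorted(rng.random() for _ in range(m))[i]  # i-th order statistic of a fresh set
            total += quarter_circle(u)
    return 4.0 * total / (m * r)

rng = random.Random(314)
pi_hat_srs = pi_srs(6000, rng)        # n = 6000
pi_hat_rss = pi_rss(6, 1000, rng)     # n = m * r = 6000
print(pi_hat_srs, pi_hat_rss, math.pi)
```

Both estimators use the same number of evaluations of g, so any difference in accuracy across repeated runs comes from the sampling design alone.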
The Gaussian Integral
The Gaussian integral, or probability integral, is the improper integral of the Gaussian function over the entire real line. It is named after the German mathematician and physicist Carl Friedrich Gauss and the equation is:
$$\int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi} \qquad (5)$$
This integral has wide applications, including normalization in probability theory and the continuous Fourier transform. It also appears in the definition of the error function. The Gaussian integral can be solved analytically through the tools of calculus. More precisely, there is no elementary indefinite integral for
$$\int e^{-x^2}\,dx$$
but the definite integral given in Eq. 5 can be evaluated. The Gaussian integral can also be used to evaluate the exact value of π.
Table 2: Comparison between SRS and RSS in estimating normal probabilities
![]() | |
Therefore, we restrict the integral to finite limits, say [a, b]; this formulation allows us to evaluate the area under the normal curve.
Without loss of generality, we take the integration limits to be [0, 1]. Under the simulation assumptions, the results for evaluating the integral
$$\int_0^1 e^{-x^2}\,dx$$
using both sampling schemes are shown in Table 2, which reports the MSE, bias and EFF of the parameter estimates using SRS and RSS.
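A sketch of the [0, 1] experiment is given below. The exact value used for checking, (√π/2)·erf(1), follows from the definition of the error function and is computed with `math.erf`; the function name `rss_integral`, the seed and the sample sizes are illustrative assumptions.

```python
import math
import random

def rss_integral(g, a, b, m, r, rng):
    """Ranked sample-mean estimate of the integral of g over [a, b]."""
    total = 0.0
    for _ in range(r):
        for i in range(m):
            u = sorted(rng.random() for _ in range(m))[i]  # i-th order statistic of a fresh set
            total += g(a + (b - a) * u)
    return (b - a) * total / (m * r)

def gaussian(x):
    return math.exp(-x * x)

# Closed form of the integral of exp(-x^2) over [0, 1] via the error function.
exact = 0.5 * math.sqrt(math.pi) * math.erf(1.0)

rng = random.Random(55)
rss_hat = rss_integral(gaussian, 0.0, 1.0, m=5, r=1000, rng=rng)
print(rss_hat, exact)
```

Other finite limits [a, b] can be handled by changing the arguments, with the reference value (√π/2)·(erf(b) − erf(a)).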
The Gini-Index
Economists use a cumulative distribution called the Lorenz curve to measure the distribution of income among households in a given country. Typically, a Lorenz curve (Fig. 1) is defined on [0, 1], is continuous, increasing and concave up and passes through (0, 0) and (1, 1).
For example, the point (a, b) on the curve represents the fact that the bottom a% of the households receive less than or equal to b% of the total income. The Gini index (coefficient of inequality) is the ratio of the area of the region between y = x and the Lorenz curve to the area under y = x.
![]() | |
Fig. 1: The Lorenz curve
Table 3: Comparison between SRS and RSS in estimating the Gini index: p = 0.3
![]() | |
As an example, consider the Gini index G for the income distribution of a certain country represented by the Lorenz curve:
![]() |
using the fact that the Gini index is
![]() |
The performance of the estimators of the coefficient of inequality over the interval [0, 0.3] is presented in Table 3. The results indicate that the estimator based on RSS is superior to the estimator based on SRS in estimating the Gini index.
Table 4: Comparison between SRS and RSS in estimating the Gini index: p = 0.5
![]() | |
Table 5: Comparison between SRS and RSS in estimating the Gini index: p = 1.0
![]() | |
Extending the area for the coefficient of inequality to the interval [0, 0.5], we observed the same behaviour of the proposed technique, as shown in Table 4. Similar results were obtained in Table 5 when the area was extended to the interval [0, 1].
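The Gini experiment can be sketched under two clearly labelled assumptions. First, the Lorenz curve below, L(x) = x², is hypothetical and is not the curve used in the study; it is chosen only because it satisfies the stated properties. Second, p is read as the upper limit of integration. Since the area under y = x on [0, 1] is 1/2, the ratio definition gives G = 2∫(x − L(x))dx, and for the hypothetical curve the integral over [0, p] has the closed form p² − 2p³/3.

```python
import random

def lorenz(x):
    """Hypothetical Lorenz curve (an assumption, not the curve from the study)."""
    return x * x

def gini_rss(p, m, r, rng):
    """RSS estimate of 2 * integral over [0, p] of (x - L(x)) dx."""
    total = 0.0
    for _ in range(r):
        for i in range(m):
            u = sorted(rng.random() for _ in range(m))[i]  # i-th order statistic of a fresh set
            x = p * u                                      # map to [0, p]
            total += 2.0 * (x - lorenz(x))
    return p * total / (m * r)

rng = random.Random(5)
for p in (0.3, 0.5, 1.0):
    exact = p * p - 2.0 * p ** 3 / 3.0   # closed form for the hypothetical curve
    print(p, gini_rss(p, 4, 1500, rng), exact)
```

With p = 1.0 the target is the full Gini index of the hypothetical curve, which equals 1/3.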
CONCLUSION
In this study, we considered using simulated ranked set sampling for estimating the unidimensional integral. The method was illustrated by Monte Carlo experiments for estimating π, normal probabilities and the Gini index. All the simulation experiments indicated that estimators based on ranked set samples are superior to estimators based on simple random samples of the same size.
REFERENCES
- Al-Nasser, A.D. and M. Al-Rawwash, 2007. A control chart based on ranked data. J. Applied Sci., 7: 1936-1941.
- Al-Saleh, M.F. and H.M. Samawi, 2000. On the efficiency of Monte Carlo methods using steady state ranked simulated samples. Commun. Stat. Simulat. Comput., 29: 941-954.
- McIntyre, G.A., 1952. A method of unbiased selective sampling using ranked sets. Aust. J. Agric. Res., 3: 385-390.
- Samawi, H.M., 1999. More efficient Monte Carlo methods obtained by using ranked set simulated samples. Commun. Stat. Simulat. Comput., 28: 699-713.
- Samawi, H.M. and M.F. Al-Saleh, 2007. On the approximation of multiple integrals using multivariate ranked simulated sampling. Applied Math. Comput., 188: 345-352.