Research Article

# Optimal Estimator for Sample Size Using Monte-Carlo Method

H. Bevrani, M. Ghorbani and M.K. Sadaghiani

ABSTRACT

In this study, we construct an optimal estimator of the sample size that is sufficient to maintain a required accuracy and reliability. The paper presents three estimators. The first, the traditional approach, is based on Chebyshev's inequality and is rather rough. The second is based on the central limit theorem but does not take into account the accuracy of the normal approximation. The third is based on the Berry-Esseen inequality; it takes into account the accuracy of the normal approximation and is therefore guaranteed.


 How to cite this article: H. Bevrani, M. Ghorbani and M.K. Sadaghiani, 2008. Optimal Estimator for Sample Size Using Monte-Carlo Method. Journal of Applied Sciences, 8: 1122-1124. DOI: 10.3923/jas.2008.1122.1124 URL: https://scialert.net/abstract/?doi=jas.2008.1122.1124

INTRODUCTION

The Monte Carlo method provides approximate solutions to a variety of mathematical problems (Bauer, 1958). As is well known, the method consists of three parts: first, the simulation of random variables with known distributions; second, the construction of probability models for real processes; and third, problems of statistical estimation theory (Rubinstein, 1981). The basic ideas of the method rest on the law of large numbers and the central limit theorem (Ermakov, 1971; Bevrani, 2003; Gentle, 2004). In both cases the required sample size is unknown. A question frequently arises: are the available statistical data sufficient for the inferences based on them to be accurate and reliable, in other words, is the available sample representative? This is a rather general problem. The purpose of this article is therefore to estimate the sample size for the Monte Carlo method.

THE MONTE-CARLO METHOD

Suppose we need to approximate a model I by the Monte Carlo method. To do so, we must find a random variable U whose mathematical expectation equals I: $EU = I$.

Consider (n+1) independent identically distributed random variables $U, U_1, U_2, \ldots, U_n$ with finite second moments. Then it follows from the central limit theorem that

$$\lim_{n\to\infty} P\left(\frac{\sum_{i=1}^{n} U_i - nI}{\sqrt{n\,DU}} < x\right) = \Phi(x), \qquad (1)$$

where $\Phi(x)$ is the standard normal distribution function and $DU$ is the variance of U.

This relation means that, given a sufficiently large number of observations $U_1, U_2, \ldots, U_n$, the required model can be approximated as

$$I \approx I_n^* = \frac{1}{n}\sum_{i=1}^{n} U_i. \qquad (2)$$

Thus, with probability close to 0.998, the error does not exceed roughly $3\sqrt{DU/n}$. It is easy to see that $EI_n^* = I$.
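The estimator in (2) can be illustrated with a minimal Python sketch. The integrand, the names `monte_carlo_estimate` and `sample_u`, and the fixed seed are illustrative assumptions, not part of the original study. The example takes I to be the integral of x² over [0, 1], so that U = X² with X uniform on [0, 1] and I = 1/3.

```python
import math
import random


def monte_carlo_estimate(sample_u, n, seed=0):
    """Return I*_n = (1/n) * sum(U_i) and the sample standard deviation of U."""
    rng = random.Random(seed)  # fixed seed only for reproducibility
    draws = [sample_u(rng) for _ in range(n)]
    mean = sum(draws) / n
    variance = sum((u - mean) ** 2 for u in draws) / (n - 1)
    return mean, math.sqrt(variance)


def sample_u(rng):
    """Hypothetical U = X**2 with X ~ Uniform(0, 1), so that EU = 1/3."""
    return rng.random() ** 2


n = 100_000
estimate, std = monte_carlo_estimate(sample_u, n)
# Rough error bar suggested by the CLT: about three standard errors.
error_bar = 3.0 * std / math.sqrt(n)
print(f"I*_n = {estimate:.5f}, error bar = {error_bar:.5f}, true I = {1 / 3:.5f}")
```

The error bar shrinks like $1/\sqrt{n}$, which is why the choice of n is the central question of the sections that follow.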

ESTIMATION OF SAMPLE SIZE

Let us consider the accuracy of the approximation $I_n^* \approx I$. Unlike deterministic (nonrandom) schemes, the analysis of random data requires more than one parameter to describe the accuracy, because the event $\{|I_n^* - I| < \varepsilon\}$ is random for any $\varepsilon \in (0, 1)$: it may occur for one sample and fail to occur for another. Therefore, alongside the parameter ε describing the accuracy, we introduce one more parameter, $\gamma \in (0, 1)$, the confidence of the statistical inference. We require that the probability of the indicated event be not less than γ, that is,

$$P\left(|I_n^* - I| < \varepsilon\right) \ge \gamma. \qquad (3)$$

Clearly, ε should be close to zero, while γ, which characterizes our confidence in the correctness of the inference, should be close to one. We now turn to the estimation of the sample size. We start with the traditional approaches, which use Chebyshev's inequality and the central limit theorem. We then consider more accurate estimates that take into account the error of the normal approximation. These estimates are based on the Berry-Esseen inequality and its sharper analogue for the case of smooth distributions.

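Solution based on Chebyshev's inequality: By Chebyshev's inequality,

$$P\left(|I_n^* - I| \ge \varepsilon\right) \le \frac{DI_n^*}{\varepsilon^2}. \qquad (4)$$

Hence, condition (3) is satisfied if $DI_n^*/\varepsilon^2 \le 1 - \gamma$. Denote $DU = \sigma^2$; then $DI_n^* = \sigma^2/n$, and the lower bound for the number of observations is

$$n \ge \frac{\sigma^2}{(1-\gamma)\,\varepsilon^2}. \qquad (5)$$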
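Bound (5) translates directly into a small helper. The sketch below is illustrative only; the function name `chebyshev_sample_size` and the example parameters are assumptions chosen for demonstration.

```python
import math


def chebyshev_sample_size(sigma2, eps, gamma):
    """Smallest integer n with n >= sigma2 / ((1 - gamma) * eps**2), as in (5)."""
    bound = sigma2 / ((1.0 - gamma) * eps ** 2)
    # round() guards against floating-point noise just above or below an integer
    return math.ceil(round(bound, 6))


# Example (assumed values): unit variance, accuracy 0.01, confidence 0.95.
n_chebyshev = chebyshev_sample_size(sigma2=1.0, eps=0.01, gamma=0.95)
print(f"Chebyshev sample size: {n_chebyshev}")
```

The bound needs only the variance of U, which is what makes it guaranteed but conservative.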

Solution based on the central limit theorem: As is well known, Chebyshev's inequality is rather rough, so using the central limit theorem (CLT) instead gives hope that the estimates of the necessary sample size and of the corresponding accuracy will be more optimistic. The CLT implies that, for sufficiently large n,

$$P\left(|I_n^* - I| \ge \varepsilon\right) \approx 2\left[1 - \Phi\left(\frac{\varepsilon\sqrt{n}}{\sigma}\right)\right]. \qquad (6)$$

Taking into account the required confidence of the inference, probability (6) should be no more than $1-\gamma$, that is, $2\left[1 - \Phi(\varepsilon\sqrt{n}/\sigma)\right] \le 1 - \gamma$, whence, by the definition of the α-quantile $z_\alpha$ of the standard normal law, we obtain

$$n \ge \frac{z_{(1+\gamma)/2}^2\,\sigma^2}{\varepsilon^2}. \qquad (7)$$

It is easy to see that estimates (5) and (7) differ only in the factors $(1-\gamma)^{-1}$ and $z_{(1+\gamma)/2}^2$. For example, let γ = 0.95. Then condition (5) requires the ratio $n\varepsilon^2/\sigma^2$ to be not less than 20, while (7) requires only about 3.84 ($z_{0.975} = 1.96$), which is more than five times better. Thus the CLT yields more optimistic estimates; however, the apparent advantage of the CLT-based solution should not make us less vigilant. The point is that Chebyshev's inequality gives rough but absolutely correct, guaranteed estimates of the sample size and of the accuracy. By contrast, when invoking the CLT we use the approximate equality (6), which itself introduces an error into the inference. In the following section we correct this shortcoming.
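The comparison of the two factors for γ = 0.95 can be checked numerically. This is a minimal sketch using the standard library's `statistics.NormalDist`; the function name `bound_factors` is an illustrative assumption.

```python
from statistics import NormalDist


def bound_factors(gamma):
    """Return the Chebyshev factor 1/(1 - gamma) and the CLT factor z**2."""
    chebyshev_factor = 1.0 / (1.0 - gamma)
    z = NormalDist().inv_cdf((1.0 + gamma) / 2.0)  # quantile z_{(1+gamma)/2}
    return chebyshev_factor, z ** 2


chebyshev_factor, clt_factor = bound_factors(0.95)
ratio = chebyshev_factor / clt_factor
print(f"Chebyshev: {chebyshev_factor:.2f}, CLT: {clt_factor:.2f}, ratio: {ratio:.2f}")
```

Both factors multiply the same quantity $\sigma^2/\varepsilon^2$, so their ratio is also the ratio of the two sample sizes.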

Solutions that take into account the accuracy of the normal approximation: The Berry-Esseen inequality is well known in probability theory as an estimate of the rate of convergence in the CLT. It holds for an arbitrary distribution with a finite third moment.

Assume that the random variable U has a finite third moment and denote $\beta_3 = E|U - I|^3$. Then, applying the Berry-Esseen inequality to estimate the accuracy of relation (6), we obtain estimate (8),

where $L_3 = \beta_3/\sigma^3$ and $C_0$ is an absolute constant with the upper bound $C_0 < 0.7655$ (Shiganov, 1986; Korolev and Shevtsova, 2006). Thus, a more accurate estimate of the sample size is given by (9).
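In a bound of Berry-Esseen type, n appears both inside Φ and in the correction term, so the corresponding sample size generally has to be found numerically. The sketch below assumes one common two-sided form of the guaranteed bound, $P(|I_n^* - I| \ge \varepsilon) \le 2[1 - \Phi(\varepsilon\sqrt{n}/\sigma)] + 2C_0L_3/\sqrt{n}$ with $L_3 = \beta_3/\sigma^3$. This form, the function names, and the example parameters are assumptions for illustration and may differ from the exact expressions (8) and (9).

```python
import math
from statistics import NormalDist

C0 = 0.7655  # upper bound on the Berry-Esseen constant quoted in the text


def guaranteed_error(n, sigma, eps, l3, c0=C0):
    """Assumed bound on P(|I*_n - I| >= eps): normal tail plus correction."""
    root_n = math.sqrt(n)
    normal_tail = 2.0 * (1.0 - NormalDist().cdf(eps * root_n / sigma))
    correction = 2.0 * c0 * l3 / root_n
    return normal_tail + correction


def berry_esseen_sample_size(sigma, eps, gamma, l3, c0=C0):
    """Smallest n for which the assumed bound is at most 1 - gamma."""
    target = 1.0 - gamma
    hi = 1
    # The bound decreases in n, so double until it is satisfied ...
    while guaranteed_error(hi, sigma, eps, l3, c0) > target:
        hi *= 2
    lo = hi // 2  # the bound fails at lo (or lo is 0)
    # ... then bisect on the integers between lo and hi.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if guaranteed_error(mid, sigma, eps, l3, c0) <= target:
            hi = mid
        else:
            lo = mid
    return hi


# Example (assumed values): unit variance, L3 = 1, accuracy 0.01, confidence 0.95.
n_be = berry_esseen_sample_size(sigma=1.0, eps=0.01, gamma=0.95, l3=1.0)
print(f"Berry-Esseen sample size: {n_be}")
```

For these example parameters the result lies between the CLT and Chebyshev sample sizes: the correction term costs extra observations compared with (7), but the bound remains guaranteed.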

Results analysis: Let $\sigma^2 = 1$. Then the required sample size can be easily computed from Eq. 5, 7, 9 and 11. The results of these computations are shown in Tables 1 and 2 (Appendix). Table 1 is constructed for ε = 0.001 and Table 2 for ε = 0.01. The top row contains the values of the confidence level γ; the second and third rows contain the sample sizes obtained from Chebyshev's inequality (Eq. 5) and the CLT (Eq. 7), respectively. The leftmost column contains the values of $L_3$ (from 0.7655 to 2.1655 in steps of 0.1). We take this lower bound for $L_3$ because it follows from Lyapunov's inequality that $L_3 \ge C_0$. The sample size is found at the intersection of the row with the appropriate value of $L_3$ and the column with the required confidence level γ.
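The Chebyshev and CLT rows of such tables can be regenerated with a few lines of Python. The confidence levels in `GAMMAS` are hypothetical choices, not necessarily the values used in Tables 1 and 2, and the function names are illustrative.

```python
import math
from statistics import NormalDist

GAMMAS = (0.90, 0.95, 0.99)  # hypothetical confidence levels
SIGMA2 = 1.0                 # variance assumed in the results analysis


def chebyshev_n(eps, gamma, sigma2=SIGMA2):
    """Sample size from the Chebyshev bound (Eq. 5)."""
    # round() guards against floating-point noise around exact integers
    return math.ceil(round(sigma2 / ((1.0 - gamma) * eps ** 2), 6))


def clt_n(eps, gamma, sigma2=SIGMA2):
    """Sample size from the CLT bound (Eq. 7)."""
    z = NormalDist().inv_cdf((1.0 + gamma) / 2.0)
    return math.ceil(z ** 2 * sigma2 / eps ** 2)


table = {}
for eps in (0.001, 0.01):
    print(f"eps = {eps}")
    print(f"{'gamma':>10} {'Chebyshev':>12} {'CLT':>12}")
    for gamma in GAMMAS:
        table[(eps, gamma)] = (chebyshev_n(eps, gamma), clt_n(eps, gamma))
        cheb, clt = table[(eps, gamma)]
        print(f"{gamma:>10} {cheb:>12} {clt:>12}")
```

Both bounds scale as $1/\varepsilon^2$, so reducing ε from 0.01 to 0.001 multiplies the required sample size by 100.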

APPENDIX

Table 1: Estimations for the sample’s value when ε = 0.001

Table 2: Estimations for the sample’s value when ε = 0.01

ACKNOWLEDGMENT

This research was supported by the Research Institute for Fundamental Sciences, Tabriz, Iran. The authors are grateful for this support.

REFERENCES
1:  Bauer, W.F., 1958. The Monte Carlo method. J. Soc. Ind. Applied Math., 6: 438-451.

2:  Bevrani, H., 2003. Generalization of a Monte-Carlo method for a solve of definite integrals. Proceedings of 11th All-Russia Conference on Mathematical Methods of Pattern Recognition, (MMPR'03), Moscow, pp: 208-210.

3:  Ermakov, S.M., 1971. The Monte-Carlo Method and Related Problems. 1st Edn., Nauka, Moscow.

4:  Gentle, J.E., 2004. Random Number Generation and Monte Carlo Methods. 1st Edn., George Mason University, USA.

5:  Korolev, V.Y. and I.G. Shevtsova, 2006. On the accuracy of normal approximation. Probab. Theo. Appl., 50: 298-310.

6:  Rubinstein, R.Y., 1981. Simulation and the Monte Carlo Method. 1st Edn., John Wiley and Sons, New York.

7:  Shiganov, I.S., 1986. Refinement of the upper bound of the constant in the central limit theorem. J. Soviet Math., 35: 2545-2550.