
Journal of Applied Sciences

Year: 2007 | Volume: 7 | Issue: 13 | Page No.: 1790-1794
DOI: 10.3923/jas.2007.1790.1794
On Efficient Confidence Intervals for the Log-Normal Mean
Peter Chami, Robin Antoine and Ashok Sahai

Abstract: Data obtained in biomedical research are often skewed. Examples include the incubation period of diseases like HIV/AIDS and the survival times of cancer patients. Such data, especially when they are positive and skewed, are often modeled by the log-normal distribution. If this model holds, then the log transformation produces a normal distribution. We consider the problem of constructing confidence intervals for the mean of the log-normal distribution. Several methods for doing this are known, including at least one estimator that performed better than Cox's method for small sample sizes. We also construct a modified version of Cox's method. Using simulation, we show that, when the sample size exceeds 30, it leads to confidence intervals that have good overall properties and are better than those from Cox's method. More precisely, the actual coverage probability of our method is closer to the nominal coverage probability than is the case with Cox's method. In addition, the new method is computationally much simpler than other well-known methods.


How to cite this article
Peter Chami, Robin Antoine and Ashok Sahai, 2007. On Efficient Confidence Intervals for the Log-Normal Mean. Journal of Applied Sciences, 7: 1790-1794.

Keywords: simulation study, Normal population mean and confidence interval

INTRODUCTION

Data obtained in biomedical research are often skewed. Examples include the incubation period of diseases like HIV and survival times of cancer patients. Since statistical inference based on the normal distribution is well known, an established way to deal with non-normal data is to apply a transformation that makes them normally distributed. The log transformation is one of those most commonly used for this purpose. It is especially recommended when the data come from a population that is positive and skewed. A variable X is said to have a log-normal distribution with parameters μ and σ² if Y = log X is normally distributed with mean μ and variance σ². In this case the mean θ of X is

$$\theta = \exp\left(\mu + \frac{\sigma^2}{2}\right).$$

We consider the problem of constructing confidence intervals for the mean θ of the log-normal distribution.
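As a quick numerical illustration of the standard relation θ = exp(μ + σ²/2), the following sketch compares the formula with the average of simulated log-normal draws. The function name `lognormal_mean`, the parameter values and the seed are illustrative assumptions, not taken from the paper.

```python
import math
import random
from statistics import fmean

def lognormal_mean(mu, sigma2):
    """Mean of X when log X is normal with mean mu and variance sigma2."""
    return math.exp(mu + sigma2 / 2)

random.seed(0)                      # illustrative seed
mu, sigma2 = 0.5, 1.0               # illustrative parameter values
draws = [random.lognormvariate(mu, math.sqrt(sigma2)) for _ in range(200_000)]

theta = lognormal_mean(mu, sigma2)
print(theta, fmean(draws))          # formula versus simulated average
print(math.exp(mu))                 # the median exp(mu) is smaller than theta
```

The last line is a reminder that exp(μ) is the median of X, not its mean, which matters when a confidence interval for μ is simply exponentiated.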

Zhou and Gao (1997) used simulations to construct and compare confidence intervals from the four methods accepted at that time. These are the naive method, Angus's conservative method, Angus's Parametric Bootstrap (PB) method and Cox's method. Their simulation study revealed that the naive method was wholly inappropriate and, contrary to what one would expect, produced an increase in coverage error with an increase in sample size. When the sample size was fairly small (n = 11), coverage error was smallest overall for the PB method, but this result was obtained only when the variance was small. It was also found that the PB method was negatively biased.

Cox's method yielded confidence intervals that had comparatively the smallest coverage error for moderate sample sizes (as small as 50). Moreover, unlike the PB method, Cox's method provided coverage errors that did not increase significantly as σ² increased.

Wu et al. (2003) derived a modified signed log-likelihood ratio method that, for small samples of size less than 30, outperformed both Cox's method and the PB method on all the comparative criteria used by Zhou and Gao (1997).

In the current study, we revisit this classical problem and derive a modified version of Cox's method to provide a more efficient estimator. Unlike all the previous papers mentioned, our approach deals exclusively with samples of size greater than 30. The proposed estimator is compared with the existing methods via the three criteria used by Zhou and Gao (1997): coverage error, interval width and relative bias. We show that its coverage error is smaller than that of any of the other methods, including the modified signed log-likelihood ratio method of Wu et al. (2003), for sample size n > 30.

THE FIVE MAIN APPROACHES

Here, we review the five existing methods for constructing two-sided 1-α level confidence intervals for a log-normal mean θ. Let X1, X2, ..., Xn be a random sample from a log-normal distribution with parameters μ and σ², let Yi = log Xi for i = 1, 2, ..., n and let

$$\theta = \exp\left(\mu + \frac{\sigma^2}{2}\right)$$

be the mean of the log-normal.

The Naïve method: This method constructs a confidence interval for μ, the mean of the log-transformed data, using the normal theory as

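Next an antilogarithm function is applied to transform the confidence limits back to the original scale to obtain a confidence interval for exp(μ).

Since exp(μ) is the median of the log-normal distribution rather than its mean θ = exp(μ + σ²/2), this method leads to biased estimators for large n.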
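The behaviour of the naive method can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the interval for μ uses a standard normal quantile (a t quantile is a common alternative), and the name `naive_ci`, the seed and the sample size are hypothetical choices made only for the example.

```python
import math
import random
from statistics import NormalDist, fmean, stdev

def naive_ci(x, alpha=0.10):
    """Back-transformed normal-theory interval for mu (a sketch).

    Assumption: the z quantile is used for the interval on the log scale.
    """
    y = [math.log(v) for v in x]
    n = len(y)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half = z * stdev(y) / math.sqrt(n)      # half-width on the log scale
    centre = fmean(y)
    return math.exp(centre - half), math.exp(centre + half)

random.seed(1)                               # illustrative seed
sigma2 = 1.0
mu = -sigma2 / 2                             # chosen so that theta = 1
sample = [random.lognormvariate(mu, math.sqrt(sigma2)) for _ in range(2000)]
low, high = naive_ci(sample)
print(low, high)   # interval sits near exp(mu), about 0.61, well below theta = 1
```

With a large sample the interval tightens around exp(μ), so it misses θ more reliably as n grows, which is consistent with the coverage behaviour reported for this method.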

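Cox’s method: One way to estimate θ is to estimate μ and σ² and then to make use of the relationship

$$\log\theta = \mu + \frac{\sigma^2}{2}.$$

If we estimate μ and σ² by the sample mean Ȳ and the sample variance S², respectively, of the log-transformed observations, then we get the point estimator of log θ:

$$\bar{Y} + \frac{S^2}{2}.$$

Since (Ȳ, S²) is a complete, sufficient statistic for (μ, σ²) and Ȳ + S²/2 is an unbiased estimator of μ + σ²/2, it follows from a well-known theorem (Lehmann, 1983) that Ȳ + S²/2 is the UMVUE of log θ. From the independence of Ȳ and S² we get that the variance of Ȳ + S²/2 is

$$\operatorname{Var}\left(\bar{Y} + \frac{S^2}{2}\right) = \frac{\sigma^2}{n} + \frac{\sigma^4}{2(n-1)}.$$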

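Here we note that, at this point in their discussion of the relevant point estimators, Zhou and Gao (1997) made an error in stating that

$$\frac{S^2}{n} + \frac{S^4}{2(n-1)}$$

is an unbiased estimator of

$$\frac{\sigma^2}{n} + \frac{\sigma^4}{2(n-1)}.$$

Although E(S²) = σ², S⁴ is not an unbiased estimator of σ⁴. In fact, because

$$\frac{(n-1)S^2}{\sigma^2} \sim \chi^2(n-1),$$

it can be easily shown that

$$E(S^4) = \frac{n+1}{n-1}\,\sigma^4.$$

Thus the correct unbiased estimator of Var(Ȳ + S²/2) of this form is:

$$\frac{S^2}{n} + \frac{S^4}{2(n+1)}.$$

This is also the UMVUE of Var(Ȳ + S²/2).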
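The moment calculation behind this correction can be written out explicitly. The following is a sketch that uses only the standard fact that (n-1)S²/σ² follows a χ² distribution with n-1 degrees of freedom (mean n-1, variance 2(n-1)); it is our expansion of the step and not necessarily the authors' own derivation.

```latex
\begin{align*}
\operatorname{Var}(S^2) &= \frac{\sigma^4}{(n-1)^2}\,\operatorname{Var}\!\left(\chi^2_{n-1}\right)
  = \frac{2\sigma^4}{n-1},\\[4pt]
E(S^4) &= \operatorname{Var}(S^2) + \{E(S^2)\}^2
  = \frac{2\sigma^4}{n-1} + \sigma^4
  = \frac{n+1}{n-1}\,\sigma^4,\\[4pt]
E\!\left\{\frac{S^4}{2(n+1)}\right\} &= \frac{\sigma^4}{2(n-1)},
\qquad
E\!\left\{\frac{S^4}{2(n-1)}\right\} = \frac{(n+1)\,\sigma^4}{2(n-1)^2} > \frac{\sigma^4}{2(n-1)}.
\end{align*}
```

The last line shows that dividing S⁴ by 2(n-1) overestimates the second variance term on average, whereas dividing by 2(n+1) is exactly unbiased.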

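Assuming approximate normality for Ȳ + S²/2, approximate confidence limits for θ may be obtained in the form exp(β + Zαζ), where β denotes the point estimate Ȳ + S²/2 of log θ, ζ the square root of its estimated variance and Zα the appropriate standard normal quantile.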
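A Cox-type interval is straightforward to compute. The sketch below is illustrative only: the function name `cox_ci`, the `corrected` flag, the seed and the parameter values are our assumptions. The flag switches between the S⁴/(2(n-1)) term attributed to Zhou and Gao (1997) and the unbiased S⁴/(2(n+1)) term; it shows only these two variance estimators and is not the improved estimator proposed later in the paper.

```python
import math
import random
from statistics import NormalDist, fmean, variance

def cox_ci(x, alpha=0.10, corrected=True):
    """Cox-type interval for the log-normal mean theta (a sketch).

    corrected=False uses S^4 / (2(n-1)); corrected=True uses S^4 / (2(n+1)).
    """
    y = [math.log(v) for v in x]
    n = len(y)
    ybar, s2 = fmean(y), variance(y)       # variance() uses the n-1 divisor
    beta = ybar + s2 / 2                   # point estimate of log(theta)
    denom = 2 * (n + 1) if corrected else 2 * (n - 1)
    zeta = math.sqrt(s2 / n + s2 ** 2 / denom)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return math.exp(beta - z * zeta), math.exp(beta + z * zeta)

random.seed(2)                              # illustrative seed
sigma2 = 1.5
mu = -sigma2 / 2                            # so that theta = 1
sample = [random.lognormvariate(mu, math.sqrt(sigma2)) for _ in range(101)]
print(cox_ci(sample, corrected=False))
print(cox_ci(sample, corrected=True))       # slightly narrower interval
```

Both intervals share the same centre on the log scale; the corrected version is always a little shorter because its standard error is smaller.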

Angus's conservative method: Angus proposed a conservative method for construction of a confidence interval for lnθ based on the following approximate pivotal statistic:

(1)

When the sample is finite, however, (1) has the same distribution as:

(2)

where N and χ²(n-1) are independent, N is a standard normal variate and χ²(n-1) has a χ² distribution with n-1 degrees of freedom. This leads (Zhou and Gao, 1997) to the lower and upper limits, respectively, of the (1-α) confidence interval for ln θ.

A parametric bootstrap method: The bootstrap interval described by Angus applies the parametric t-percentile bootstrap method to the approximate pivotal statistic η(θ) in Eq. 1. Let t0 and t1 be the α/2 percentile and the 1-α/2 percentile of η(θ), respectively. Then a theoretical 1-α level confidence interval for ln θ is

(3)

The unknown quantiles t0 and t1 can be estimated by a parametric bootstrap sample.
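The estimation of t0 and t1 by resampling can be sketched as follows. This is a hedged illustration, not Angus's exact procedure: we assume the pivot has the studentized form η(θ) = √n (Ȳ + S²/2 - ln θ) / √(S²(1 + S²/2)), and the function names `pivot` and `pb_ci`, the number of bootstrap replicates and the simple order-statistic quantile rule are all our own choices.

```python
import math
import random
from statistics import fmean, variance

def pivot(ybar, s2, n, log_theta):
    # Assumed studentized form of eta(theta); not confirmed by the source text.
    return math.sqrt(n) * (ybar + s2 / 2 - log_theta) / math.sqrt(s2 * (1 + s2 / 2))

def pb_ci(x, alpha=0.10, n_boot=1000, rng=None):
    """Parametric bootstrap interval for theta based on the assumed pivot."""
    rng = rng or random.Random(0)
    y = [math.log(v) for v in x]
    n = len(y)
    ybar, s2 = fmean(y), variance(y)
    beta = ybar + s2 / 2                     # fitted value of log(theta)
    sd = math.sqrt(s2)
    etas = []
    for _ in range(n_boot):
        # Resample log-data from the fitted normal model N(ybar, s2).
        yb = [rng.gauss(ybar, sd) for _ in range(n)]
        etas.append(pivot(fmean(yb), variance(yb), n, beta))
    etas.sort()
    t0 = etas[math.floor((alpha / 2) * (n_boot - 1))]       # lower percentile
    t1 = etas[math.ceil((1 - alpha / 2) * (n_boot - 1))]    # upper percentile
    se = math.sqrt(s2 * (1 + s2 / 2) / n)
    # Inverting t0 <= eta <= t1 gives limits for log(theta); exponentiate.
    return math.exp(beta - t1 * se), math.exp(beta - t0 * se)

random.seed(3)                                # illustrative seed
sample = [random.lognormvariate(-0.5, 1.0) for _ in range(51)]
print(pb_ci(sample))
```

Note that the upper bootstrap percentile t1 determines the lower confidence limit and t0 the upper one, because the pivot decreases in ln θ.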

Modified signed log-likelihood ratio method: Wu et al. (2003) asserted that Cox's method did not perform well in small-sample settings owing to the non-quadratic and asymmetric shape of the profile likelihood for small n. They instead considered the modified signed log-likelihood ratio introduced by Barndorff-Nielsen (1986, 1991), generally known as the r* formula. Writing r(Ψ) for the signed log-likelihood ratio statistic, it is:

$$r^*(\Psi) = r(\Psi) + r(\Psi)^{-1}\log\left\{\frac{u(\Psi)}{r(\Psi)}\right\}$$

(4)

where u(Ψ) is a quantity and the general form of r* is given in the Appendix. The statistic r* is asymptotically distributed as a standard normal variate with third-order accuracy. Therefore, an approximate 100(1-α)% confidence interval based on r* is

$$\left\{\Psi : |r^*(\Psi)| \le z_{\alpha/2}\right\}$$

(5)

Unlike Cox's interval, this r* interval calculates the confidence limits from the observed asymmetric likelihood-based function r*(Ψ), which theoretically should achieve a more accurate coverage probability than Cox's. The modified signed log-likelihood ratio method produced zero coverage errors and almost negligible average biases, and both the coverage probabilities and the average biases remained nearly constant as the variance increased.

THE IMPROVED COX’S CONFIDENCE INTERVAL (CI) ESTIMATORS

First, we note a simple fact regarding the concept of estimation. An estimator, say t, of a parameter, say θ, is examined for its efficiency on the following counts:


(6)

Since (Ȳ, s²) is a jointly complete sufficient statistic for (μ, σ²), it suffices to find an unbiased estimator of M(t*) = V(t*) + B(t*) as a function of (Ȳ, s²) alone. This estimator would be a UMVUE. From (5), we have

Where, Using the results that and that and s2 are independent, we can find unbiased estimators of A and B as follows.

where,

Hence  = -v is an unbiased estimator of A. Further,


Hence = ((n+1)/(n-1)v is an unbiased estimator of B. Let . Then is an unbiased estimator of R and hence an UMVU estimator of R. We can write as

(7)

Hence,

(8)

and

(9)

Therefore,

(10)

Now, we are ready to propose our improved CIs, as follows:

We refer to Cox’s CI, with lower limit CI low and upper limit CI high, as Estimator 1, i.e.,

Under our proposal, the lower (CI low) and upper (CI high) limits of the efficient CI for the log-normal mean θ, which we refer to as Estimator 2, are:

and

SIMULATION AND CONCLUSIONS

As mentioned at the beginning, we emulate the simulation framework of Zhou and Gao (1997), for the reasons explained in their paper. Adopting the same structure, we have carried out a simulation study, the results of which are reported in the tables in the Appendix.

Using 6000 samples (of illustrative sizes 51, 101, 151, 201 and 301) from the relevant log-normal distribution with illustrative values of σ²: 1.00, 1.25, 1.50, 1.75 and 2.00 (assuming, as in Land (1971), for simplicity of illustration and without loss of generality, that μ = -σ²/2, so that θ = 1), we have evaluated the following characteristics of the CIs: Coverage Probability (Cvg. Prob.); Coverage Error (Cvg. Error); Length of the CI (Length); the proportions of cases in which the CI fails to cover the true value of the population mean because it lies entirely to the left or to the right of that value (Left Bs. and Right Bs., respectively); and hence the Relative Bias (Rel. Bs.).
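The structure of such a simulation can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' code: it evaluates only a Cox-type interval (not Estimator 2), takes coverage error to be the absolute difference between achieved and nominal coverage, omits the relative bias, and uses fewer replications than the 6000 reported. The names `cox_ci`, `simulate` and `denom_offset` are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean, variance

def cox_ci(y, alpha, denom_offset):
    """Cox-type limits for theta from log-scale data y.

    denom_offset = -1 gives S^4/(2(n-1)); +1 gives the unbiased S^4/(2(n+1)).
    """
    n = len(y)
    ybar, s2 = fmean(y), variance(y)
    se = math.sqrt(s2 / n + s2 ** 2 / (2 * (n + denom_offset)))
    z = NormalDist().inv_cdf(1 - alpha / 2)
    centre = ybar + s2 / 2
    return math.exp(centre - z * se), math.exp(centre + z * se)

def simulate(n, sigma2, reps=2000, alpha=0.10, denom_offset=-1, seed=0):
    rng = random.Random(seed)
    mu, sd, theta = -sigma2 / 2, math.sqrt(sigma2), 1.0   # mu = -sigma2/2 gives theta = 1
    left = right = 0
    total_length = 0.0
    for _ in range(reps):
        y = [rng.gauss(mu, sd) for _ in range(n)]         # data on the log scale
        low, high = cox_ci(y, alpha, denom_offset)
        total_length += high - low
        if high < theta:
            left += 1        # whole interval lies to the left of theta
        elif low > theta:
            right += 1       # whole interval lies to the right of theta
    coverage = 1 - (left + right) / reps
    return {
        "coverage": coverage,
        "coverage_error": abs(coverage - (1 - alpha)),    # assumed definition
        "mean_length": total_length / reps,
        "left_miss": left / reps,
        "right_miss": right / reps,
    }

print(simulate(n=51, sigma2=1.0))
```

Looping `simulate` over the sample sizes and values of σ² listed above would produce one row per configuration, in the spirit of the appendix tables.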

The results of the simulation study are tabulated in the five tables given in the appendix, which deal with sample sizes of 51, 101, 151, 201 and 301.

Overall, Estimator 2 performs better than Cox’s method, in the sense that the achieved coverage probability is closer to the nominal probability of 0.90 (in other words, the coverage error is smaller).

Appendix:
For n = 51

REFERENCES

  • Land, C.E., 1971. Confidence intervals for linear functions of the normal mean and variance. Ann. Math. Stat., 42: 1187-1205.


  • Lehmann, E.L., 1983. Theory of Point Estimation. John Wiley and Sons, New York.


  • Wu, J., A.C.M. Wong and G.Y. Jiang, 2003. Likelihood-based confidence interval for log-normal mean. Stat. Med., 22: 1849-1860.


  • Zhou, X.H. and S. Gao, 1997. Confidence intervals for the log-normal mean. Stat. Med., 16: 783-790.
