Bootstrapping: A Nonparametric Approach to Identify the Effect of Sparsity of Data in the Binary Regression Models

Abstract: In this research, bootstrap methods are used to investigate the effects of data sparsity on binary regression models. The artificial data were created by bootstrapping the vector of cases. We used percentile confidence intervals as the tool for inference, because they combine point estimation and hypothesis testing in a single inferential statement of great intuitive appeal. We found that the bootstrap confidence intervals are shorter than classical confidence intervals with the same confidence coefficient. We also found that some parameters that are non-significant under the classical confidence interval become significant under the bootstrap sampling method and vice versa. Moreover, the bootstrap confidence intervals provided robust results for the sparse data. Finally, we found that sparsity of the data causes the tails of the bootstrap sampling distribution to behave badly, but that reducing the confidence coefficient yields a robust confidence interval.

How to cite this article

Mohammad Reza Zadkarami, 2008. Bootstrapping: A Nonparametric Approach to Identify the Effect of Sparsity of Data in the Binary Regression Models. Journal of Applied Sciences, 8: 2991-2997.

Keywords: confidence interval, bootstrapping vector, sparsity, percentile and Bootstrapping

INTRODUCTION

Binary regression is widely used in applied statistics (Agresti, 2002; Kleinbaum and Klein, 2002; Farrell and Rogers-Stewart, 2008). In the standard setting, subject to regularity conditions, the Maximum Likelihood Estimator (MLE) of the unknown parameters in a binary regression is known to be consistent, asymptotically normal and efficient (Lehmann and Casella, 1998). However, these properties may not hold when the data are sparse (Zadkarami, 2000). This problem is encountered in medical (Allardice et al., 2001; Ainsworth and Dean, 2006), social (King and Zeng, 2001) and geographical (Begueria, 2006; Griffith, 2006; Vanwalleghem et al., 2008) studies that model the outcomes of rare diseases or events, because such studies usually deal with small numbers of deaths or events. Sparse binary response data are therefore of great interest (Fleiss et al., 2003; Farrell and Sutradhar, 2006). The phenomenon was first pointed out by Hauck and Donner (1977). Let $Y_i \in \{0,1\}$ be the binary response variable. The probability of success (event) is $p_i = \Pr(Y_i = 1) = F(\beta X_i)$, where $F(.)$ is a cumulative distribution function. If some of the maximum likelihood estimates $\hat{\beta}_i$ are large, the curvature of the log-likelihood at $\hat{\beta}$ can be much less than near $\beta_i = 0$, so the Wald approximation underestimates the change in log-likelihood on setting $\beta_i = 0$. This happens in such a way that, as $|\hat{\beta}_i| \to \infty$, the t-statistic tends to zero. Thus coefficients that are highly significant according to the likelihood ratio test may have non-significant t ratios. The problem arises when the data are sparse and the fitted probabilities are extremely close to zero or one (Venables and Ripley, 2002). There has been fairly extensive discussion of this in the statistical literature, usually claiming the non-existence of maximum likelihood estimates (Santer and Duffy, 1989), but the phenomenon was discussed much earlier in the pattern recognition literature (Duda and Hart, 1973; Ripley, 1996).
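The behaviour of the Wald ratio described above can be illustrated with a small numerical sketch. It is not taken from the paper: it assumes a logistic (not probit) model with a single binary covariate, for which the slope MLE is the sample log odds ratio and both the Wald and likelihood ratio statistics have closed forms. The 2x2 counts and the function names `wald_z` and `lr_statistic` are hypothetical.

```python
import math

def wald_z(a, b, c, d):
    """Wald z for the log odds ratio of the 2x2 table [[a, b], [c, d]].

    Rows are the groups x = 0 and x = 1; columns are (event, no event).
    In a logistic model with one binary covariate the slope MLE is the
    sample log odds ratio, with the standard error used below.
    """
    beta_hat = math.log((c * b) / (d * a))      # (c/d) / (a/b)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return beta_hat / se

def lr_statistic(a, b, c, d):
    """Likelihood ratio statistic for H0: slope = 0 in the same table."""
    n = a + b + c + d
    observed = [a, b, c, d]
    expected = [(a + b) * (a + c) / n, (a + b) * (b + d) / n,
                (c + d) * (a + c) / n, (c + d) * (b + d) / n]
    return 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected))

# Hypothetical counts: group 0 is fixed at 25 events out of 50, while
# group 1 moves towards "almost all events" out of 50.
for events in (40, 45, 49):
    table = (25, 25, events, 50 - events)
    print(events, round(wald_z(*table), 2), round(lr_statistic(*table), 2))
```

In this toy table the likelihood ratio statistic keeps growing as group 1 approaches a fitted probability of one, whereas the Wald ratio is smaller at 49 events than at 45, which is the pattern the paragraph above describes.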

The sparsity of data depends on a range of factors and models which may, or may not, be under the control of the researchers. Sparsity of the data creates some problems in using standard asymptotic methods and it also creates some computational problems in the estimation of the unknown parameters in the models.

The bootstrap is a nonparametric technique that can be used to provide statistical inference about the parameter estimates, especially when the conditions required by the standard asymptotic methods are not satisfied (Hansen et al., 1999; Yousef et al., 2005; Modarres et al., 2006; Gerard and Schucany, 2007; Tang et al., 2007). This approach is based on using repeated samples from the data to generate an empirical sampling distribution for a statistic. In response to sparsity of the data, a completely nonparametric bootstrap approach is applied to resample cases. We can then determine the MLE of the parameters in the model for each resample and make inferences about them. Nonparametric simulation requires the generation of artificial data without assuming that the original data have some particular parametric distribution. We use confidence intervals as a tool for inference, because they combine point estimation and hypothesis testing in a single inferential statement of great intuitive appeal.

In almost every case, the accuracy of a confidence interval depends on the parametric assumptions made. If these assumptions are not satisfied, the bootstrap methods may be used to obtain nonparametric confidence intervals that are more robust to the assumptions commonly made about data (Moulton and Zeger, 1991; DiCiccio and Efron, 1996). The bootstrap confidence intervals are not only asymptotically more accurate than the standard confidence intervals; they are also more correct (DiCiccio and Efron, 1996). Over the past decade, substantial attention has been paid to the development of techniques using the bootstrap sampling distribution to build confidence intervals around various population parameters (Efron and Tibshirani, 1986; Stine, 1989; Moulton and Zeger, 1991; DiCiccio and Efron, 1996; Tu and Zhou, 2002; Henderson, 2005; Hsieh et al., 2007).

The bootstrap confidence intervals are divided into two groups, parametric and nonparametric, according to the assumptions used in their construction (Cojbasic and Tomovic, 2007; Andrews et al., 2006; Karlis and Patilea, 2008). The choice of bootstrap confidence interval method depends heavily on the particular research situation facing the analyst (Manichaikul et al., 2006). None of these techniques offers the best confidence intervals in every situation, because the criteria for judging the quality of their results vary widely (DiCiccio and Romano, 1988). In this research the percentile confidence interval is used because the data are sparse and the assumptions underlying the analytic formulae are not satisfied. The percentile approach provides a nonparametric method that does not require any distributional assumption, such as normality, in order to build a confidence interval (Mudelsee and Alkio, 2007).

From the point of view of demands on computational resources, bootstrap methods are particularly useful for small data sets because the procedure usually needs at least 1,000 repeated bootstrap samples in order to provide robust results for making confidence intervals (Efron and Tibshirani, 1993).

MATERIALS AND METHODS

Let $Y_i \in \{0,1\}$ be the binary response variable. The probability of success is given by

$$p_i = \Pr(Y_i = 1) = \Phi(\eta_i)$$

where $\eta_i = \beta X_i$ denotes the linear predictor, $X_i$ denotes the vector of explanatory variables that characterize individual $i$, $\beta$ is the parameter vector of interest and $\Phi(.)$ is the standard normal cumulative distribution function. The log-likelihood function is

$$\ell(\beta) = \sum_{i=1}^{n} \left[ Y_i \log \Phi(\eta_i) + (1 - Y_i) \log\{1 - \Phi(\eta_i)\} \right]$$

The probit link function is used because it produces better results than the other link functions for sparse data (Zadkarami, 2000).
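As a minimal sketch of how the probit log-likelihood is evaluated, the following Python function sums the Bernoulli log-probabilities with success probability given by the standard normal CDF of the linear predictor. The function name `probit_loglik` and the five toy observations are hypothetical and are not taken from the NCDS data; the paper itself used the GLIM package for fitting.

```python
import math
from statistics import NormalDist

PHI = NormalDist().cdf   # standard normal cumulative distribution function

def probit_loglik(beta, X, y):
    """Probit log-likelihood: sum of y*log(Phi(eta)) + (1 - y)*log(1 - Phi(eta)).

    No guard is included for fitted probabilities that round to exactly
    0 or 1, in which case math.log raises ValueError.
    """
    total = 0.0
    for x_i, y_i in zip(X, y):
        eta = sum(b * x for b, x in zip(beta, x_i))   # linear predictor
        p = PHI(eta)
        total += math.log(p) if y_i == 1 else math.log(1.0 - p)
    return total

# Hypothetical toy data: an intercept column plus one covariate.
X = [(1.0, 0.2), (1.0, -1.3), (1.0, 0.8), (1.0, 1.5), (1.0, -0.4)]
y = [0, 0, 1, 1, 0]

print(probit_loglik((0.0, 0.0), X, y))   # every p_i equals 0.5 when beta = 0
print(probit_loglik((0.0, 1.0), X, y))
```

At beta = 0 every fitted probability is 0.5, so the log-likelihood equals n log(0.5); this gives a quick check that the function is wired correctly before it is handed to an optimizer.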

Percentile confidence interval: There are two bootstrap methods for Generalized Linear Models (GLMs): bootstrapping the residuals (parametric bootstrap approach) and bootstrapping the vector of cases (nonparametric bootstrap approach) (Salibian-Barrera, 2005; Pardo-Fernaudez, 2007; Shen and Zhu, 2008). However, the definitions of the residuals for GLMs, and consequently the nonparametric residual resampling approach for GLMs, are not unique (Davison and Hinkley, 1997). The candidate residuals are the standardized Pearson residual, the standardized residual on the linear predictor and the standardized deviance residual. Consequently, three different approaches to bootstrap sampling for GLMs can be used, one for each of these residuals. These residuals are scaled implicitly or explicitly. However, Davison and Hinkley (1997) reported that none of these methods is perfect.

The disadvantage of the parametric bootstrap approaches for a GLM is that these methods involve the simulation of a new data set from the fitted parametric model. In fact, the generated data sets from a poorly fitting model may not have the statistical properties of the original data, particularly when count data are overdispersed relative to a Poisson or binomial model (Davison and Hinkley, 1997).

In response to these difficulties, a completely nonparametric approach is to resample cases, as follows. Nonparametric simulation requires the generation of artificial data without assuming that the original data have some particular parametric distribution. Let the vectors $z_i = (X_i, Y_i)$, $i = 1, 2, \ldots, n$, be a sample from some multivariate distribution $F(.)$ of $(X, Y)$. In this approach, the regression coefficients are viewed as a statistical function of $F(.)$, with no assumptions on the random errors of the model other than independence. The pairs $(X_i, Y_i)$ are bootstrapped to provide the bootstrap data set $z^*$ so that

$$z^* = \{z_{i_1}, z_{i_2}, \ldots, z_{i_n}\}$$

for $i_1, i_2, \ldots, i_n$ a random sample, drawn with replacement, of the integers 1 to $n$. The following four steps then allow us to make inferences about the parameters in the model.

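•	B replicated bootstrap samples of size n, $z^*_1, z^*_2, \ldots, z^*_B$, are selected.
•	The binary response model with probit link function is fitted to each bootstrap sample $z^*_b = (X^*, Y^*)$ and the vector of parameters, $\hat{\beta}^*_b$, is estimated with the GLIM package (Aitkin et al., 2005).
•	The 100(1-2α)% percentile confidence interval for the parameter β is obtained as

$$\left[\hat{\beta}^{*(\alpha)},\ \hat{\beta}^{*(1-\alpha)}\right]$$

where $\hat{\beta}^{*(\alpha)}$, the 100α-th percentile of the bootstrap distribution of $\hat{\beta}^*$, is calculated as

$$\hat{\beta}^{*(\alpha)} = \hat{G}^{-1}(\alpha), \qquad \hat{G}(t) = \frac{\#\{\hat{\beta}^*_b \le t\}}{B} \qquad (1)$$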

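•	The BC (bias-corrected) confidence interval is calculated as

$$\left[\hat{\beta}^{*(\alpha_1)},\ \hat{\beta}^{*(\alpha_2)}\right]$$

where the percentile $\hat{\beta}^{*(\cdot)}$ was introduced in (1) and $\alpha_1$ and $\alpha_2$ are functions of $z_\alpha$, the 100α-th percentile of the standard normal distribution. In fact, the percentile interval may be improved by this simple adjustment, the BC method (Efron and Tibshirani, 1986; Pituch et al., 2006; Cerin and Leslie, 2008; Timmerman and Ter Braak, 2008). In the BC method, $\alpha_1$ and $\alpha_2$ are calculated by

$$\alpha_1 = \Phi(2\hat{z}_0 + z_\alpha), \qquad \alpha_2 = \Phi(2\hat{z}_0 + z_{1-\alpha}) \qquad (2)$$

where

$$\hat{z}_0 = \Phi^{-1}\left(\frac{\#\{\hat{\beta}^*_b < \hat{\beta}\}}{B}\right)$$

When half of the bootstrap distribution of $\hat{\beta}^*$ is less than the observed value $\hat{\beta}$, then $\hat{z}_0 = 0$ and the BC interval coincides with the percentile interval.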
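The percentile and BC steps can be sketched in a few lines of Python once the bootstrap replicates of a coefficient are available. This is an illustrative sketch, not the author's GLIM code: the helper names (`quantile`, `percentile_interval`, `bc_interval`), the order-statistic convention used for the empirical quantile and the evenly spaced toy replicates are all assumptions.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def quantile(sorted_reps, prob):
    """Empirical quantile: smallest replicate t whose empirical CDF is >= prob."""
    B = len(sorted_reps)
    k = max(1, math.ceil(prob * B - 1e-9))   # 1-based order statistic, rounding-safe
    return sorted_reps[min(k, B) - 1]

def percentile_interval(reps, alpha):
    """100(1 - 2*alpha)% percentile interval from bootstrap replicates."""
    s = sorted(reps)
    return quantile(s, alpha), quantile(s, 1 - alpha)

def bc_interval(reps, beta_hat, alpha):
    """Bias-corrected (BC) interval; beta_hat is the original-sample estimate."""
    s = sorted(reps)
    frac_below = sum(r < beta_hat for r in s) / len(s)
    # inv_cdf raises StatisticsError if frac_below is exactly 0 or 1.
    z0 = STD_NORMAL.inv_cdf(frac_below)
    z_alpha = STD_NORMAL.inv_cdf(alpha)
    alpha1 = STD_NORMAL.cdf(2 * z0 + z_alpha)
    alpha2 = STD_NORMAL.cdf(2 * z0 - z_alpha)   # z_{1-alpha} equals -z_alpha
    return quantile(s, alpha1), quantile(s, alpha2)

# Hypothetical replicates: 1000 evenly spaced values, used only to check the logic.
reps = [i / 100 for i in range(1, 1001)]
print(percentile_interval(reps, 0.05))
print(bc_interval(reps, beta_hat=5.005, alpha=0.05))   # half the replicates lie below
print(bc_interval(reps, beta_hat=4.005, alpha=0.05))   # 40% of the replicates lie below
```

When exactly half of the replicates fall below the original estimate the two intervals agree; when fewer than half do, both BC endpoints move towards smaller values.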

As noted above, the bootstrap confidence intervals are divided into two groups according to the assumptions used in their construction: parametric (the normal approximation method and the bootstrap-t method) and nonparametric (the percentile method). Normal approximation confidence intervals rely on a strong parametric assumption, although the bootstrap procedure was designed to be a nonparametric technique. Confidence intervals developed in this way are no better than those developed using the traditional parametric approach when this particular assumption is violated (Mooney and Duval, 1993).

The bootstrap-t method raises two problems. First, we need to calculate $\widehat{se}^*_b$, the estimated standard error of $\hat{\beta}^*_b$ for the bootstrap sample $z^*_b$, which is difficult when $\hat{\beta}$ is a complicated statistic for which no simple standard error formula exists. Therefore, we need to compute a bootstrap estimate of the standard error for each bootstrap sample. This implies two nested levels of bootstrap sampling, which is costly in terms of computational resources. Second, the bootstrap-t confidence interval is not transformation-respecting: the scale on which the interval is constructed makes a difference and some scales are better than others (Efron and Tibshirani, 1993). By contrast, the percentile method has the following advantages:

•	This method is free from parametric assumptions and is simple to execute. No complex analytical formulae are needed to estimate the parameters of the assumed sampling distribution of $\hat{\beta}$ and no tables of critical values for the probabilities of the standardized sampling distribution are needed.
•	If a statistic is distributed asymmetrically, this does not in theory adversely affect the accuracy of the percentile method's confidence interval. The bootstrap percentile method allows the estimated sampling distribution of $\hat{\beta}$ to conform to any shape suggested by the data. This allows the confidence intervals to be asymmetrical around the expected value of $\hat{\beta}$ (Mooney and Duval, 1993; Heiermann et al., 2005).
•	The percentile interval for any monotone parameter transformation $m(\beta)$ is simply the percentile interval for $\beta$ mapped by $m(.)$, because this method is transformation-respecting (Efron and Tibshirani, 1993).
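The transformation-respecting property in the last item can be checked numerically. The sketch below is an assumption-laden illustration: the replicates are simulated from a normal distribution rather than obtained from any fitted model, the increasing map is taken to be the exponential function and the helper `percentile_interval` uses one particular order-statistic convention.

```python
import math
import random

def percentile_interval(reps, alpha):
    """Percentile interval from the order statistics of the replicates."""
    s = sorted(reps)
    B = len(s)
    lo = s[max(1, math.ceil(alpha * B - 1e-9)) - 1]
    hi = s[max(1, math.ceil((1 - alpha) * B - 1e-9)) - 1]
    return lo, hi

rng = random.Random(2008)
# Hypothetical bootstrap replicates of a coefficient estimate.
reps = [rng.gauss(0.5, 0.3) for _ in range(2000)]

m = math.exp   # any increasing map would do; exp is used purely for illustration

# Interval computed directly on the transformed replicates ...
direct = percentile_interval([m(r) for r in reps], 0.05)
# ... versus the endpoints of the original interval mapped by m.
mapped = tuple(m(e) for e in percentile_interval(reps, 0.05))

print(all(math.isclose(d, e, rel_tol=1e-12) for d, e in zip(direct, mapped)))
```

Because an increasing map preserves the ordering of the replicates, the same order statistics are selected on either scale, so the two intervals agree.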

However, the percentile method does have some problems. First, it may perform poorly with small samples, because the tails of the sampling distribution are important in this type of confidence interval calculation. Second, we must assume that the bootstrapped sampling distribution is an unbiased estimate of the sampling distribution of $\hat{\beta}$. This is certainly less restrictive than assuming that $\hat{\beta}$ has some standard distribution with known properties, and the percentile interval may be improved by the BC method (Efron and Tibshirani, 1986).

RESULTS AND DISCUSSION

The National Child Development Survey (NCDS) data set is used to investigate the effect of the sparsity of data on parameter estimates for the binary model. This data set was collected on babies born in one week (3-9 March 1958) in England, Wales and Scotland (Wildschut et al., 1997; Spencer, 2006). We selected a sample of 10,141 individuals for whom we have complete information on forty variables associated with perinatal mortality. The interval between 28 weeks (196 days) of gestation and week 4 after birth is divided into four subintervals, allowing for the possibility of death before delivery (antepartum stillbirth), during delivery (fresh stillbirth), in the first week after birth (early neonatal deaths), or between the first and fourth weeks after birth (late neonatal deaths). We call these stages 1, 2, 3 and 4, respectively. The survivors beyond stage one are further divided into two groups, those with assisted delivery and those with natural delivery, to allow for the differential effects of the type of delivery in the two groups (Zadkarami, 2000). The stage 3 (early neonatal) data for the assisted delivery cases are selected to illustrate the effect of sparsity of data on the parameter estimates for the binary models. There are 1132 assisted delivery cases, comprising 12 deaths (Y = 1) and 1120 survivals (Y = 0).

In the nonparametric approach, the bootstrapping vector is used to construct the percentile confidence interval for the parameters in the model. Although the bootstrapping of cases demands extensive computational resources, this method preserves the properties of the original data set and the results are not affected by a poorly fitting model.

Efron and Tibshirani (1993) suggested that at least 1,000 repeated bootstrap samples are needed in order to provide robust results for making confidence intervals. In order to protect the results from the effect of sparseness in the data, 2,000 bootstrap samples z* = (X*, Y*) were selected from the original data set (X, Y). The number of bootstrap samples was subsequently increased to 5,000 and 10,000 in order to obtain a better bootstrap confidence interval. However, increasing the number of bootstrap samples from 5,000 to 10,000 improved the results only slightly. The binary response model with probit link function was fitted to each bootstrap sample z* = (X*, Y*). Various starting values were used to assess the effect of starting values on the results. Finally, the percentile confidence interval for each parameter was obtained.
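The resample-and-refit loop described above can be sketched as follows. This is a simplified stand-in for the analysis in the paper, under several assumptions: the data are hypothetical (not the NCDS sample), the model has a single binary covariate so that the probit MLE has a closed form and no iterative fitting or starting values are needed, and resamples in which the MLE does not exist are simply counted and dropped. The names `fit_two_group_probit` and `bootstrap_cases` are illustrative.

```python
import math
import random
from statistics import NormalDist

PROBIT = NormalDist().inv_cdf   # inverse of the standard normal CDF

def fit_two_group_probit(cases):
    """Closed-form probit MLE (intercept, slope) for one binary covariate.

    Returns None when a group is empty or has an event rate of 0 or 1,
    because no finite MLE exists in that situation.
    """
    rates = []
    for group in (0, 1):
        ys = [y for x, y in cases if x == group]
        if not ys or sum(ys) in (0, len(ys)):
            return None
        rates.append(sum(ys) / len(ys))
    intercept = PROBIT(rates[0])
    return intercept, PROBIT(rates[1]) - intercept

def bootstrap_cases(cases, B, rng):
    """Resample (x, y) pairs with replacement B times and refit each time."""
    n = len(cases)
    slopes, failed = [], 0
    for _ in range(B):
        sample = rng.choices(cases, k=n)     # case resampling
        fit = fit_two_group_probit(sample)
        if fit is None:
            failed += 1
        else:
            slopes.append(fit[1])
    return slopes, failed

# Hypothetical sparse data: 400 cases as (x, y) pairs with few events.
cases = [(0, 1)] * 6 + [(0, 0)] * 194 + [(1, 1)] * 14 + [(1, 0)] * 186

slopes, failed = bootstrap_cases(cases, B=2000, rng=random.Random(1958))
slopes.sort()
lower = slopes[math.ceil(0.05 * len(slopes)) - 1]
upper = slopes[math.ceil(0.95 * len(slopes)) - 1]
print(f"failed fits: {failed}; 90% percentile interval: ({lower:.3f}, {upper:.3f})")
```

Dropping the failed resamples is a convenience of this sketch and may itself distort the tails; with a general-purpose fitting routine, such resamples would instead tend to produce very large or unstable estimates.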

As the empirical results demonstrated, bootstrap samples containing far fewer (early neonatal) deaths than the original data result in poorly estimated parameters. Consequently, the tails of the bootstrap sampling distribution are badly behaved at confidence levels of 95% and higher. However, the behaviour of the tails of the distribution at the 90% level is satisfactory enough to allow us to construct the percentile confidence interval. The bad behaviour of the tails also prevents the BC method from improving the endpoints of the percentile confidence interval. In summary, sparsity of the data makes the tails of the bootstrap sampling distribution unreliable at levels of 95% and higher, but reducing the confidence coefficient yields a robust confidence interval.
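How often a bootstrap sample contains far fewer deaths than the original data can be quantified directly, because the number of deaths in a case-resampled sample of size n follows a binomial distribution with n trials and success probability equal to the observed death rate. The sketch below uses the counts reported above (12 deaths among 1132 cases); the function names and the thresholds printed are illustrative choices.

```python
import math

N_CASES, N_DEATHS = 1132, 12        # counts reported for the stage 3 assisted-delivery data
P_DEATH = N_DEATHS / N_CASES

def binom_pmf(k, n=N_CASES, p=P_DEATH):
    """Binomial pmf computed on the log scale to avoid overflow for large n."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1 - p))
    return math.exp(log_pmf)

def prob_at_most(k):
    """Probability that a bootstrap sample contains at most k deaths."""
    return sum(binom_pmf(j) for j in range(k + 1))

for k in (0, 3, 5, 8):
    print(f"P(deaths <= {k}) = {prob_at_most(k):.2e}")
```

Resamples with only a handful of deaths are uncommon but not negligible among several thousand replicates, and they fall precisely in the tails that the 95% and higher intervals depend on.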

Table 1 displays the 90% percentile confidence interval and the classical confidence interval based on the results of the GLIM package (Aitkin et al., 2005). As we can see, the two sets of confidence intervals differ markedly and the percentile confidence intervals are shorter than the classical confidence intervals with the same confidence coefficient. Shorter length is one optimality criterion for confidence intervals (Casella and Berger, 2002).

In the percentile confidence interval, the variable “baby birth order (2nd or later baby)” is associated positively with early neonatal death. Uddian and Hossain (2008) reported similar results. A previous caesarean delivery does not increase the risk of neonatal death (Richter et al., 2007). We also find similar results as reported in Table 1.

We observed that “the week of 1st antenatal visit” is also negatively associated with neonatal death. Uddian and Hossain (2008) reported that the negative association between neonatal mortality and the timing of the first antenatal check was highly significant, such that neonatal mortality was highest among babies whose mothers had not received an antenatal check during pregnancy. Improvements in antenatal and neonatal care have resulted in a decline in perinatal mortality (Forssas et al., 1998; Cruz-Anguiano et al., 2004). Moreover, every year many women die due to pregnancy and delivery-related complications. Many of these deaths and the related morbidity can be avoided through effective and appropriate maternity care and through preventive, diagnostic and timely therapeutic interventions (Petrou et al., 2003).

Table 1:	The results of bootstrap and classical confidence intervals

APH²: Accidental antepartum haemorrhage

Whether the baby is delivered naturally or with assistance reflects both the effect of previous pregnancies and the current pregnancy. If the health of the mother or baby would be endangered by allowing the pregnancy to continue, labour is induced by artificial means. Sanchez-Ramos et al. (2003) reported that women whose labour was induced experienced a lower perinatal mortality rate. We found that the categories of the variable “whether labour induced” (oxytocin+surgical, O.B.E.+oestrogen and oxytocin in labour) are negatively associated with early neonatal deaths for assisted delivery babies.

We also found that the variables “abnormality during pregnancy” (placenta praevia) and “past complication of pregnancy” (other abnormality) are negatively associated with early neonatal deaths for assisted delivery babies. In fact, the association of “abnormality during pregnancy” and “past complication of pregnancy” with early neonatal deaths reflects the effect of the type of delivery on the survival of the baby during the first week after delivery (Zadkarami, 2000). Moreover, a previous caesarean delivery does not increase the risk of neonatal death (Richter et al., 2007). We obtained similar results, as reported in Table 1.

However, in the classical confidence interval, the terms birthweight and (birthweight)² are the only significant variables. Considering the results displayed in Table 1 and the discussion of the variables in the model, the percentile confidence interval presents better results for the sparse data.

CONCLUSION

Our results confirm that sparsity of the data can affect the results of fitting binary models. We found that some parameters that are non-significant under the classical confidence interval become significant under the bootstrap sampling method and vice versa. We also found that the bootstrap confidence intervals are shorter than the classical confidence intervals with the same confidence coefficient. Therefore, the bootstrap confidence intervals are more robust and the bootstrap methods can be useful to confirm results in the analysis of sparse data.

REFERENCES

Agresti, A., 2002. Categorical Data Analysis. 2nd Edn., John Wiley and Sons Inc., New Jersey, USA., ISBN: 9780471360933, Pages: 710.

Ainsworth, L.M. and C.B. Dean, 2006. Approximate inference for disease mapping. Comput. Stat. Data Anal., 50: 2552-2570.

Aitkin, M., D.A. Anderson, B. Francis and J.P. Hinde, 2005. Statistical Modeling in GLIM4. 2nd Edn., Oxford University Press, UK., ISBN: 0-19-852413-7.

Allardice, G.M., E.M. Wright, M. Peterson and J.M. Miller, 2001. A statistical approach to an outbreak of endophthalmitis following cataract surgery at a hospital in the west of Scotland. J. Hosp. Infect., 49: 23-29.

Andrews, D.W.K., O. Lieberman and V. Marmer, 2006. Higher-order improvements of the parametric bootstrap for long-memory Gaussian processes. J. Econ., 133: 673-702.

Begueria, S., 2006. Changes in land cover and shallow landslide activity: A case study in the Spanish Pyrenees. Geomorphology, 74: 196-206.

Casella, G. and R.L. Berger, 2002. Statistical Inference. 2nd Edn., Duxbury Press, California, USA., ISBN: 0-534-24312-6.

Cerin, E. and E. Leslie, 2008. How socio-economic status contributes to participation in leisure-time physical activity. Soc. Sci. Med., 66: 2596-2609.

Cojbasic, V. and A. Tomovic, 2007. Nonparametric confidence intervals for population variance of one sample and the difference of variances of two samples. Comput. Stat. Data Anal., 51: 5562-5578.

Cruz-Anguiano, V., J.O. Talavera, L. Vazquez, A. Antonio, A. Castellanos, M.A. Lezana and N.H. Wacher, 2004. The importance of quality of care in perinatal mortality: A case-control study in Chiapas, Mexico. Arch. Med. Res., 35: 554-562.

Davison, A.C. and D.V. Hinkley, 1997. Bootstrap Methods and Their Application. 1st Edn., Cambridge University Press, Cambridge, UK., ISBN: 0-521-57391-2.

DiCiccio, T.J. and J.P. Romano, 1988. A review of bootstrap confidence intervals. J. R. Statist. Soc. B., 50: 338-354.

DiCiccio, T.J. and B. Efron, 1996. Bootstrap confidence intervals. Stat. Sci., 11: 189-228.

Duda, R.O. and P.E. Hart, 1973. Pattern Classification and Scene Analysis. 2nd Edn., John Wiley and Sons, New York, ISBN: 0-471-22361-1, Pages: 482.

Efron, B. and R. Tibshirani, 1986. Bootstrap methods for standard errors, confidence intervals and other measures of statistical accuracy. Statist. Sci., 1: 54-75.

Efron, B. and R.J. Tibshirani, 1993. An Introduction to the Bootstrap. 1st Edn., Chapman and Hall Inc., New York, USA.

Farrell, P.J. and B.C. Sutradhar, 2006. A non-linear conditional probability model for generating correlated binary data. Stat. Prob. Lett., 76: 353-361.

Farrell, P.J. and K. Rogers-Stewart, 2008. Methods for generating longitudinally correlated binary data. Int. Stat. Rev., 76: 28-38.

Fleiss, J.L., B. Levin and M.C. Paik, 2003. Statistical Methods for Rates and Proportions (Chapter 15). 3rd Edn., John Wiley and Sons, Inc., New York, ISBN: 978-0471-52629-2.

Forssas, E., M. Gissler and E. Hemminki, 1998. Declining perinatal mortality in Finland between 1987 and 1994: Contribution of different subgroups. Eur. J. Obstet. Gynecol. Reprod. Biol., 80: 177-181.

Gerard, P.D. and W.R. Schucany, 2007. An enhanced sign test for dependent binary data with small numbers of clusters. Comput. Stat. Data Anal., 51: 4622-4632.

Griffith, D.A., 2006. Assessing spatial dependence in count data, winsorized and spatial filter specification alternatives to the auto-Poisson model. Geog. Anal., 38: 160-179.

Hansen, C.H., M.A. Evans and T.D. Shultz, 1999. Application of the bootstrap procedure provides an alternative to standard statistical procedures in the estimation of the vitamin B6 requirement. J. Nutr., 129: 1915-1919.

Hauck Jr, W.W. and A. Donner, 1977. Wald's test as applied to hypotheses in logit analysis. J. Am. Stat. Assoc., 72: 851-853.

Heiermann, K., H. Riesch-Oppermann and N. Huber, 2005. Reliability confidence intervals for ceramic components as obtained from bootstrap methods and neural networks. Comput. Mater. Sci., 34: 1-13.

Henderson, A.R., 2005. The bootstrap: A technique for data-driven statistics using computer-intensive analyses to explore experimental data. Clin. Chim. Acta, 359: 1-26.

Hsieh, K.L., Y.K. Chen and C.C. Shen, 2007. Bootstrap confidence interval estimates of the bullwhip effect. Simul. Model. Pract. Theory, 15: 908-917.

Karlis, D. and V. Patilea, 2008. Bootstrap confidence intervals in mixtures of discrete distributions. J. Stat. Planning Infer., 138: 2313-2329.

King, G. and L. Zeng, 2001. Logistic regression in rare events data. Polit. Anal., 9: 137-163.

Kleinbaum, D.G. and M. Klein, 2002. Logistic Regression. 2nd Edn., Springer-Verlag, Inc., New York, ISBN: 0-387-95397-3.

Lehmann, E.L. and G. Casella, 1998. Theory of Point Estimation. 2nd Edn., Springer-Verlag, New York, ISBN: 0-387-98502-6.

Manichaikul, A., J. Dupuis, S. Sen and K.W. Broman, 2006. Poor performance of bootstrap confidence intervals for the location of a quantitative trait locus. Genetics, 174: 481-489.

Modarres, R., T.P. Hui and G. Zheng, 2006. Resampling methods for ranked set samples. Comput. Stat. Data Anal., 51: 1039-1050.

Mooney, C.Z. and R.D. Duval, 1993. Bootstrapping: A Nonparametric Approach to Statistical Inference. SAGE Publications, Newbury Park, CA., ISBN: 9780803953819, Pages: 73.

Moulton, L.H. and S.L. Zeger, 1991. Bootstrapping generalized linear models. Comput. Stat. Data Anal., 11: 53-63.

Mudelsee, M. and M. Alkio, 2007. Quantifying effects in two-sample environmental experiments using bootstrap confidence intervals. Environ. Model. Softw., 22: 84-96.

Pardo-Fernaudez, J.C., 2007. Comparison of error distributions in nonparametric regression. Stat. Probab. Lett., 77: 350-356.

Petrou, S., E. Kupek, S. Vause and M. Maresh, 2003. Antenatal risk and adverse perinatal outcomes: Results from a British population-based study. Eur. J. Obstet. Gynecol. Reprod. Biol., 106: 40-49.

Pituch, K.A., L.M. Stapleton and J.Y. Kang, 2006. A comparison of single sample and bootstrap methods to assess mediation in cluster randomized trials. Multi. Behav. Res., 41: 367-400.

Richter, R., R.L. Bergmann and J.W. Dudenhausen, 2007. Previous caesarean or vaginal delivery: Which mode is a greater risk of perinatal death at the second delivery. Eur. J. Obstet. Gynecol. Reprod. Biol., 132: 51-57.

Ripley, B.D., 1996. Pattern Recognition and Neural Networks. 1st Edn., Cambridge University Press, Cambridge, ISBN: 0-521-46086-7.

Salibian-Barrera, M., 2005. Estimating the p-values of robust test for the linear model. J. Stat. Planning Infer., 128: 241-257.

Sanchez-Ramos, L.S., F. Olivier, I. Delke and A.M. Kaunitz, 2003. Labor induction versus expectant management for postterm pregnancies: A systematic review with meta-analysis. Obstet. Gynecol., 101: 1312-1318.

Santer, T.J. and D.E. Duffy, 1989. The Statistical Analysis of Discrete Data. 1st Edn., Springer-Verlag, New York, ISBN: 0-38797-018-5.

Shen, H. and Z. Zhu, 2008. Efficient mean estimation in log-normal linear models. J. Stat. Planning Infer., 138: 552-567.

Spencer, N., 2006. Explaining the social gradient in smoking in pregnancy: Early life course accumulation and cross-sectional clustering of social risk exposures in the 1958 British national cohort. Soc. Sci. Med., 62: 1250-1259.

Stine, R., 1989. An introduction to bootstrap methods. Soc. Methods Res., 18: 243-291.

Tang, M.L., K.W. Ng, G.L. Tian and M. Tan, 2007. On improved EM algorithm and confidence interval construction for incomplete tables. Comput. Stat. Data Anal., 51: 2919-2933.

Timmerman, M.E. and C.J.F. Ter Braak, 2008. Bootstrap confidence intervals for principal response curves. Comput. Stat. Data Anal., 52: 1837-1849.

Tu, W. and V.H. Zhou, 2002. A bootstrap confidence interval procedure for the treatment effect using propensity score subclassification. Health Ser. Outco. Res. Methods, 3: 135-147.

Uddian, M.J. and M.Z. Hossain, 2008. Predictors of infant mortality in a developing country. Asian J. Epidemiol., 1: 1-16.

Vanwalleghem, T., M.V.D. Eeckhaut, J. Poesen, G. Govers and J. Deckers, 2008. Spatial analysis of factors controlling the presence of closed depression and gullies under forest: Application of rare event logistic regression. Geomorphology, 95: 504-517.

Venables, W.N. and B.D. Ripley, 2002. Modern Applied Statistics with S. 4th Edn., Springer-Verlag Inc., New York, ISBN: 0-387-95457-0.

Wildschut, H.I.J., T. Nas and J. Golding, 1997. Are socio-demographic factors predictive of pre-term birth? A reappraisal of the 1958 British perinatal mortality survey. Br. J. Obstet. Gynecol., 104: 57-63.

Yousef, W.A., R.F. Wagner and M.H. Loew, 2005. Estimating the uncertainty in the estimated mean area under the ROC curve of a classifier. Pattern Recognit. Lett., 26: 2600-2610.

Zadkarami, M.R., 2000. Longitudinal data analysis: Some of the statistical issues arising in the analysis of perinatal mortality. Ph.D. Thesis, Lancaster University, UK.

Journal of Applied Sciences

Year: 2008 | Volume: 8 | Issue: 17 | Page No.: 2991-2997
DOI: 10.3923/jas.2008.2991.2997
