

Articles by Habshah Midi
Total Records (20) for Habshah Midi





Hassan S. Uraibi
,
Habshah Midi
,
Bashar A. Talib
and
Jabar H. Yousif


Problem statement: The bootstrap approach has introduced new advances in modeling and model evaluation. It is a computer-intensive method that replaces theoretical formulation with extensive use of the computer. The Ordinary Least Squares (OLS) method is often used to estimate the parameters of regression models in the bootstrap procedure. Unfortunately, many statistics practitioners are not aware that the OLS method can be adversely affected by the existence of outliers; as an alternative, robust methods have been put forward to overcome this problem. The existence of outliers in the original sample may create problems for the classical bootstrap estimates. Since bootstrap resampling is carried out with replacement, a bootstrap sample may contain a higher proportion of outliers than the original dataset. Consequently, the outliers will have an undue effect on the classical bootstrap mean and standard deviation. Approach: In this study, we propose a robust bootstrapping method which is less sensitive to outliers. In the robust bootstrap procedure, the classical bootstrap mean and standard deviation are replaced with robust location and robust scale estimates. A number of numerical examples were carried out to assess the performance of the proposed method. Results: The results suggest that the robust bootstrap method is more efficient than the classical bootstrap. Conclusion/Recommendations: In the presence of outliers in a dataset, we recommend using the robust bootstrap procedure, as its estimates are more reliable. 
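The core idea can be sketched in a few lines: resample with replacement, then summarize the bootstrap replicates with a robust location (the median) and a robust scale (the MAD) in place of the mean and standard deviation. This is a minimal illustration rather than the authors' exact estimator; the function names and the MAD consistency factor 1.4826 are conventional choices.

```python
import random
import statistics

def mad(xs):
    # Median absolute deviation, scaled by 1.4826 so it estimates the
    # standard deviation under normality.
    m = statistics.median(xs)
    return 1.4826 * statistics.median([abs(x - m) for x in xs])

def bootstrap_summary(data, b=1000, robust=True, seed=0):
    # Resample with replacement b times; summarise the replicate
    # locations with median/MAD (robust) or mean/stdev (classical).
    rng = random.Random(seed)
    locs = []
    for _ in range(b):
        resample = [rng.choice(data) for _ in data]
        locs.append(statistics.median(resample) if robust
                    else statistics.mean(resample))
    if robust:
        return statistics.median(locs), mad(locs)
    return statistics.mean(locs), statistics.stdev(locs)

# A clean sample around 10, contaminated by two gross outliers.
data = [9.8, 10.1, 10.0, 9.9, 10.2, 10.1, 9.7, 10.0, 95.0, 110.0]
loc_r, sc_r = bootstrap_summary(data, robust=True)
loc_c, sc_c = bootstrap_summary(data, robust=False)
```

On this contaminated sample the robust summary stays near the bulk of the data, while the classical bootstrap mean is dragged far upward and its standard deviation is inflated.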





Ehab A. Mahmood
,
Habshah Midi
,
Sohel Rana
and
Abdul Ghapor Hussin


Background and Objective: The existence of outliers in any type of data influences the efficiency of an estimator. A few methods for detecting outliers in a simple circular regression model have been proposed in the literature, but it is suspected that they are not very successful in the presence of multiple outliers in a data set. This study aimed to investigate a new statistic to identify multiple outliers in the response variable of a simple circular regression model. Materials and Methods: The proposed statistic is based on calculating a robust circular distance between the circular residuals and the circular location parameter. The performance of the proposed statistic is evaluated by the proportion of detected outliers and the rates of masking and swamping. The simulation study is applied for different sample sizes at 10 and 20% contamination ratios. Results: The results from simulated data showed that the proposed statistic has the highest proportion of detected outliers and the lowest rate of masking compared with some existing methods. Conclusion: The proposed statistic is very successful in detecting outliers with negligible masking and swamping rates. 
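A minimal sketch of the general idea: measure each circular residual's circular distance from a robust circular location and flag points whose distance is extreme relative to the median distance. The cutoff rule (k times the median distance), the crude circular median, and the function names below are illustrative assumptions, not the statistic proposed in the study.

```python
import math

def circ_dist(a, b):
    # Shortest arc length between two angles (radians), in [0, pi].
    return math.pi - abs(math.pi - abs(a - b) % (2 * math.pi))

def circ_median(angles):
    # Crude circular median: the observation minimising the summed
    # circular distance to all others (adequate for a sketch).
    return min(angles, key=lambda a: sum(circ_dist(a, b) for b in angles))

def flag_outliers(residuals, k=3.0):
    # Flag residuals whose circular distance from the circular median
    # exceeds k times the median of those distances (a MAD-style rule).
    med = circ_median(residuals)
    dists = [circ_dist(r, med) for r in residuals]
    cut = k * sorted(dists)[len(dists) // 2]
    return [i for i, d in enumerate(dists) if d > cut]

# Circular residuals clustered near 0, with one outlier near pi.
res = [0.05, -0.1, 0.02, 0.08, -0.04, 3.0]
```

Because the cutoff is built from the median distance rather than the mean, a single extreme residual cannot mask itself by inflating the threshold.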





S.K. Sarkar
and
Habshah Midi


Logistic regression is a sophisticated statistical tool for data analysis in both controlled experimentation and observational studies. The goal of logistic regression is to correctly predict the category of outcome for individual cases using the most parsimonious model. To accomplish this goal, a model is created that includes all predictor variables that are useful in predicting the response variable. The logistic regression model is being used at an increasing rate in various fields of data analysis. In spite of this increase, there has been no commensurate increase in the use of commonly available methods for assessing model adequacy. Failure to address model adequacy may lead to misleading or incorrect inferences. Therefore, the goal of this study is to present an overview of a few easily employed methods for assessing the fit of logistic regression models. The summary measures of goodness-of-fit, namely the likelihood ratio test, the Hosmer-Lemeshow goodness-of-fit test, the Osius-Rojek large sample approximation test, the Stukel test and the area under the Receiver Operating Characteristic curve, indicate that the logistic regression model fits the data quite well. However, recommendations are made for the use of methods for assessing model adequacy in different respects before proceeding to present the results from a fitted logistic regression model. 
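Of the listed summary measures, the Hosmer-Lemeshow test is the easiest to sketch: sort cases by fitted probability, split them into g groups, and accumulate a chi-square discrepancy between observed and expected event counts. This is a textbook version under stated simplifications (equal-size groups formed by sorting, no special tie handling); the resulting statistic is compared against a chi-square distribution with g - 2 degrees of freedom.

```python
def hosmer_lemeshow(y, p, g=10):
    # Hosmer-Lemeshow goodness-of-fit statistic: sort cases by fitted
    # probability, split into g groups, and sum the chi-square
    # discrepancy between observed and expected event counts.
    pairs = sorted(zip(p, y))
    n = len(pairs)
    stat = 0.0
    for k in range(g):
        grp = pairs[k * n // g:(k + 1) * n // g]
        m = len(grp)
        exp = sum(pp for pp, _ in grp)   # expected events in group
        obs = sum(yy for _, yy in grp)   # observed events in group
        pbar = exp / m
        if 0 < pbar < 1:
            stat += (obs - exp) ** 2 / (m * pbar * (1 - pbar))
    return stat
```

Well-calibrated fitted probabilities give a small statistic, while probabilities that contradict the outcomes give a large one.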




A. Bagheri
,
Habshah Midi
and
A.H.M.R. Imon


In this study, the effect of different patterns of high leverage points on the classical multicollinearity diagnostics and collinearity-influential measures is investigated. Specifically, the investigation focuses on the situations in which these points become collinearity-enhancing or collinearity-reducing observations. Both the empirical and the Monte Carlo simulation results for collinear data sets indicate that when high leverages exist in just one explanatory variable, or when the values of the high leverages are in different positions of the two explanatory variables, these points will be collinearity-reducing observations. On the other hand, these high leverages are collinearity-enhancing observations when their values and positions are the same for the two collinear explanatory variables. 
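The one-variable versus both-variables effect can be seen directly in the sample correlation, the basic ingredient of collinearity diagnostics. The toy data below are purely illustrative: a leverage point added to one predictor only destroys the near-perfect correlation (collinearity-reducing), while the same leverage value added to both predictors preserves it (collinearity-enhancing).

```python
import math

def corr(x, y):
    # Pearson correlation between two predictors, the basic ingredient
    # of collinearity diagnostics such as VIF and the condition number.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return sxy / (sx * sy)

# Two nearly collinear predictors.
x1 = list(range(1, 11))
x2 = [v + 0.1 * (-1) ** i for i, v in enumerate(x1)]
base = corr(x1, x2)

# High leverage in ONE variable only: collinearity-reducing.
one = corr(x1 + [100], x2 + [5.5])

# The SAME high leverage in BOTH variables: collinearity-enhancing.
both = corr(x1 + [100], x2 + [100])
```

Here `base` and `both` are close to 1, while `one` collapses toward 0, mirroring the study's distinction between the two leverage patterns.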




S.K. Sarkar
,
Habshah Midi
and
Sohel Rana


Logistic regression has been one of the most frequently used statistical methods for data analysis in many fields over the last decade. However, analysis of residuals and identification of influential outliers are not studied so frequently to check the adequacy of the fitted logistic regression model. Detection of outliers and influential cases, and their corresponding treatment, is a crucial task in any modeling exercise. A failure to detect influential cases can severely distort the validity of the inferences drawn from such modeling. The aim of this study is to evaluate different measures of standardized residuals and diagnostic statistics by graphical methods to identify potential outliers. Evaluation of the diagnostic statistics and their graphical display detected 25 cases as outliers, but they had no notable effect on the parameter estimates and summary measures of fit. It is recommended to use residual analysis and note outlying cases, which can frequently lead to valuable insights for strengthening the model. 




Ashkan Shabbak
,
Habshah Midi
and
Mohd Nooh Hassan


A separate univariate control chart for each characteristic is often used to detect changes in the inherent variability of a process because of its ease of computation. Nevertheless, separate individual charts may fail to detect an out-of-control condition when the characteristics of interest are correlated. The problem gets more complicated in the presence of a sustained shift in the process mean. In this study, we propose a robust multivariate control chart which is less sensitive to a sustained shift in the process mean. However, the theoretical cut-off points of the proposed charts are intractable. As an alternative, we propose two different procedures for obtaining empirical cut-off points. The performance of these two cut-off points in detecting a step shift in the mean vector is investigated extensively by a real example and Monte Carlo simulation. 




Bader Ahmad I. Aljawadi
,
Mohd Rizam A. Bakar
,
Noor Akma Ibrahim
and
Habshah Midi


In population-based cancer clinical trials, a proportion of patients will never experience the event of interest and are considered “cured” or “immune”. The majority of recent cancer studies focus on the estimation of the immune proportion. In this study, we investigated the estimation of the proportion of patients cured of cancer in the case of left-censored data, based on the Bounded Cumulative Hazard (BCH) model proposed by Chen in 1999. The analysis provided the Maximum Likelihood Estimation (MLE) of the parameters within the framework of the Expectation Maximization (EM) algorithm, where numerical solutions of the estimating equations for the cure rate parameter could be employed. 




Habshah Midi
and
Syaiba Balqish Ariffin


Detection of outliers based on standardized Pearson residuals has gained widespread use in the logistic regression model in the presence of a single outlier. Attempts in the same direction, but dealing with a group of outliers, have been made using the generalized standardized Pearson residual, which requires a graphical method or a robust estimator to find suspected outliers to form a deletion group. In this study, an alternative measure, namely the modified standardized Pearson residual, is derived from the robust logistic diagnostic. The weakness of standardized Pearson residuals and the usefulness of the generalized standardized Pearson residual and the modified standardized Pearson residual are examined through several real examples and a Monte Carlo simulation study. The results of this study signify that the generalized standardized Pearson residual and the modified standardized Pearson residual perform equally well in identifying a group of outliers. 





Habshah Midi
,
Ehab A. Mahmood
,
Abdul Ghapor Hussin
and
Jayanthi Arasan


The mean direction is a good measure of the circular location parameter in univariate circular data. However, it is biased and misleading when the circular data contain outliers, especially as the ratio of outliers increases. The trimmed mean is a robust method for estimating a location parameter. Therefore, this study focuses on finding a robust formula for trimming circular data. The proposed method is compared with the mean direction, the median direction and the M-estimator for clean and contaminated data. Results of a simulation study and real data show that the trimmed mean direction is very successful and the best among them. 
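The contrast between the plain and trimmed mean directions can be sketched as follows. The trimming rule below (drop a fixed fraction of observations farthest, in circular distance, from a crude circular median) is an illustrative stand-in for the robust trimming formula developed in the study.

```python
import math

def circ_dist(a, b):
    # Shortest arc between two angles (radians), in [0, pi].
    return math.pi - abs(math.pi - abs(a - b) % (2 * math.pi))

def mean_direction(angles):
    # Circular mean direction: atan2 of the average sine and cosine.
    s = sum(math.sin(a) for a in angles) / len(angles)
    c = sum(math.cos(a) for a in angles) / len(angles)
    return math.atan2(s, c)

def trimmed_mean_direction(angles, trim=0.2):
    # Drop the trim fraction of observations farthest (in circular
    # distance) from the circular median, then average the rest.
    med = min(angles, key=lambda a: sum(circ_dist(a, b) for b in angles))
    keep = sorted(angles, key=lambda a: circ_dist(a, med))
    keep = keep[:max(1, round(len(keep) * (1 - trim)))]
    return mean_direction(keep)

# Angles concentrated near 0.3 rad, contaminated by two outliers near pi.
data = [0.25, 0.3, 0.35, 0.28, 0.32, 0.27, 0.33, 0.3, 3.0, 3.1]
```

On this sample the plain mean direction is pulled noticeably toward the outliers, while the trimmed version stays with the concentrated bulk near 0.3 rad.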




Habshah Midi
and
Nasuhar Ab. Aziz


Quality engineering practitioners have great interest in using response surface methods in real situations. Recently, robust design has been used extensively for multiple responses in terms of the process location and process scale, based on the sample mean and sample variance, respectively. One method that can be used to simultaneously optimize multiple responses is the Augmented Approach to the Harrington's Desirability Function (AADF) technique, which assigns weights to the location and scale in order to reflect the relative importance of both effects. The AADF approach uses a dimensionality reduction that converts multiple predicted responses into a single-response problem. Furthermore, for fitting the second-order polynomial regression models, the Ordinary Least Squares (OLS) method is usually used to acquire adequate response functions for the process location and scale based on the mean and variance. Nevertheless, these existing procedures are easily influenced by outliers. As an alternative, we propose the use of the robust MM-location, MM-scale and MM-regression estimators to overcome these weaknesses and shortcomings. The numerical results signify that the proposed approach is more efficient than the existing methods. 




Habshah Midi
and
Shelan Saied Ismaeel


The detection of multicollinearity is very crucial so that proper remedial measures can be taken in its presence. The widely used diagnostic for detecting multicollinearity in multiple linear regression is the Classical Variance Inflation Factor (CVIF). It is now evident that the CVIF fails to correctly detect multicollinearity when high leverage points are present in a data set. The Robust Variance Inflation Factor (RVIF) has been introduced to remedy this problem. Nonetheless, the computation of the RVIF takes a long time because it is based on the robust GM (DRGP) estimator, which depends on the Minimum Volume Ellipsoid (MVE) estimator and involves considerable computing time. In this study, we propose a fast RVIF (FRVIF) which takes less computing time. The results of a simulation study and numerical examples indicate that our proposed FRVIF successfully detects the multicollinearity problem at a faster rate than other methods. 
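The quantity being made robust and fast here is the variance inflation factor, VIF_j = 1/(1 - R_j^2), where R_j^2 comes from regressing the j-th predictor on the others. As a baseline sketch, in the two-predictor case R_j^2 is just the squared correlation between the predictors; the robust variants replace the classical correlation structure with outlier-resistant counterparts.

```python
import math

def pearson(x, y):
    # Plain Pearson correlation between two predictors.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return sxy / (sx * sy)

def vif_two_predictors(x1, x2):
    # Classical VIF with two predictors: with only one other regressor,
    # R_j^2 is the squared correlation, so VIF = 1 / (1 - r^2).
    r2 = pearson(x1, x2) ** 2
    return 1.0 / (1.0 - r2)

x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2_collinear = [1.1, 2.0, 3.1, 4.0, 5.1, 6.0, 7.1, 8.0]   # nearly x1
x2_unrelated = [5, 1, 4, 2, 6, 3, 8, 7]
```

A common rule of thumb flags VIF values above 10 as evidence of serious multicollinearity; the nearly collinear pair far exceeds that, while the unrelated pair stays below 2.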




Habshah Midi
and
Sani Muhammad


In the presence of outlying observations in a panel data set, the traditional ordinary least squares estimator can be strongly biased, leading to erroneous estimates and misleading inferential statements. Weighted Least Squares (WLS) is usually used to remedy the effect of outliers. Visek used Least Weighted Squares (LWS) based on a mean-centering technique for data transformation; however, mean-centering was found to be very sensitive to outliers. A robust method of data transformation is therefore needed in order to downweight the effect of outliers. We employed a new method of transformation based on the MM-estimate of location, termed the MM-centering method. A simulation study was used to evaluate the performance of the proposed method. The Weighted Least Squares based on the proposed MM-centering method (WLS-MM) was found to be the best method for both high leverage points and vertical outliers. 





Md. Sohel Rana
,
Habshah Midi
and
A.H.M. Rahmatullah Imon


Problem statement: The problem of heteroscedasticity occurs in regression analysis for many practical reasons. It is now evident that the heteroscedastic problem affects both the estimation and the test procedures of regression analysis, so it is important to be able to detect this problem for possible remedy. The existence of a few extreme or unusual observations, which we often call outliers, is a very common feature in data analysis. In this study we show how the existence of outliers makes the detection of heteroscedasticity cumbersome. Outliers occurring in a homoscedastic model often make the model heteroscedastic; on the other hand, outliers may distort the diagnostic tools in such a way that we cannot correctly diagnose the heteroscedastic problem in their presence. Neither of these situations is desirable. Approach: This article introduces a robust test procedure to detect the problem of heteroscedasticity which is unaffected by the presence of outliers. We have modified one of the most popular and commonly used tests, the Goldfeld-Quandt test, by replacing its non-robust components with robust alternatives. Results: The performance of the newly proposed test is investigated extensively with real data sets and Monte Carlo simulations. The results suggest that the robust version of this test offers substantial improvements over the existing tests. Conclusion/Recommendations: The proposed robust Goldfeld-Quandt test should be employed instead of the existing tests in order to avoid misleading conclusions. 
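The classical test being modified works as follows: order the observations by a regressor, drop a central fraction, fit OLS separately to the low and high tails, and take the ratio of the residual sums of squares. The sketch below is the classical (non-robust) version; the study's robust variant swaps the OLS components for robust alternatives.

```python
def ols_rss(x, y):
    # Residual sum of squares from a simple-regression OLS fit.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((a - mx) * (c - my) for a, c in zip(x, y))
         / sum((a - mx) ** 2 for a in x))
    a0 = my - b * mx
    return sum((c - a0 - b * a) ** 2 for a, c in zip(x, y))

def goldfeld_quandt(x, y, drop=0.2):
    # Classical Goldfeld-Quandt ratio: order by x, drop the central
    # fraction, fit OLS to each tail, and return RSS(high)/RSS(low).
    # A ratio far above 1 suggests variance increasing with x.
    pairs = sorted(zip(x, y))
    n = len(pairs)
    m = (n - int(n * drop)) // 2
    lo, hi = pairs[:m], pairs[-m:]
    rss_lo = ols_rss([p[0] for p in lo], [p[1] for p in lo])
    rss_hi = ols_rss([p[0] for p in hi], [p[1] for p in hi])
    return rss_hi / rss_lo

x = list(range(1, 21))
# Error magnitude grows with x: heteroscedastic.
y_het = [xi + 0.1 * xi * (-1) ** xi for xi in x]
# Constant error magnitude: homoscedastic.
y_hom = [xi + 0.1 * (-1) ** xi for xi in x]
```

Under homoscedasticity the ratio stays near 1; when the error magnitude grows with the regressor, the upper-tail RSS dominates and the ratio becomes large.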




Md.Sohel Rana
,
Habshah Midi
and
A.H.M. Rahmatullah Imon


Problem statement: Most statistical procedures depend heavily on the normality assumption of the observations. In regression, we assume that the random disturbances are normally distributed. Since the disturbances are unobserved, normality tests are done on the regression residuals. It is now evident that normality tests on residuals suffer from superimposed normality and often possess very poor power. Approach: This study shows that normality tests suffer a huge setback in the presence of outliers. We propose a new robust omnibus test based on rescaled moments and the coefficients of skewness and kurtosis of the residuals, which we call the robust rescaled moment test. Results: Numerical examples and Monte Carlo simulations show that this proposed test performs better than the existing tests for normality in the presence of outliers. Conclusion/Recommendations: We recommend using our proposed omnibus test instead of the existing tests for checking the normality of regression residuals. 
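The shape of such an omnibus test can be illustrated with the classical (non-robust) moment-based statistic, which combines the coefficients of skewness and kurtosis of the residuals; the robust rescaled moment test keeps this structure but builds the moments from outlier-resistant rescaled quantities. The 5.99 value used below is the chi-square(2) critical value at the 5% level.

```python
def moment_normality_stat(res):
    # Classical moment-based omnibus statistic (Jarque-Bera form):
    #   n * (skewness^2 / 6 + (kurtosis - 3)^2 / 24),
    # referred to a chi-square distribution with 2 df.
    n = len(res)
    m = sum(res) / n
    m2 = sum((r - m) ** 2 for r in res) / n
    m3 = sum((r - m) ** 3 for r in res) / n
    m4 = sum((r - m) ** 4 for r in res) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    return n * (skew ** 2 / 6 + (kurt - 3) ** 2 / 24)

clean = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
contaminated = clean + [10.0]   # one gross outlier
```

A single outlier is enough to push the statistic past the critical value, which is exactly the sensitivity the robust version is designed to control.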




Arezoo Bagheri
,
Habshah Midi
and
A.H.M. Rahmatullah Imon


Problem statement: High leverage points are extreme outliers in the X-direction. In regression analysis, the detection of these leverage points is important because of their arbitrarily large effects on the estimates as well as on multicollinearity problems. The Mahalanobis Distance (MD) has been used as a diagnostic tool for the identification of outliers in multivariate analysis, where it measures the distance between the normal and abnormal groups of the data. Since the computation of the MD relies on non-robust classical estimates, the classical MD can hardly detect outliers accurately. As an alternative, Robust MD (RMD) methods such as the Minimum Covariance Determinant (MCD) and Minimum Volume Ellipsoid (MVE) estimators have been used to identify the existence of high leverage points in a data set. However, these methods tend to swamp some low leverage points even though they can identify high leverage points correctly. Since the detection of leverage points is one of the most important issues in regression analysis, it is imperative to introduce a novel detection method for high leverage points. Approach: In this study, we propose a relatively new two-step method for the detection of high leverage points, utilizing the RMD (MVE) and RMD (MCD) in the first step to identify the suspected outlier points. In the second step, the MD is computed based on the mean and covariance of the clean data set. We call this method the two-step Robust Diagnostic Mahalanobis Distance (RDMD^{TS}), which can identify high leverage points correctly and swamps fewer low leverage points. Results: The merit of the newly proposed method was investigated extensively by real data sets and a Monte Carlo simulation study. The results indicated that, for small sample sizes, the best detection method is (RDMD^{TS}) (MVE)_{mad}, while there was not much difference between (RDMD^{TS}) (MVE)_{mad} and (RDMD^{TS}) (MCD)_{mad} for large sample sizes. 
Conclusion/Recommendations: In order to swamp fewer low leverage points as high leverage points, the proposed robust diagnostic methods, (RDMD^{TS}) (MVE)_{mad} and (RDMD^{TS}) (MCD)_{mad}, are recommended. 
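The two-step structure can be sketched in two dimensions. In this illustration the first-step robust distance is a crude coordinate-wise-median trim standing in for the MVE/MCD estimators (which need dedicated algorithms); the second step is exactly as described, recomputing the classical MD for all points from the clean subset's mean and covariance.

```python
def mean_cov(points):
    # Classical mean vector and (unbiased) covariance of 2-D points.
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), ((sxx, sxy), (sxy, syy))

def mahalanobis_sq(points, center, cov):
    # Squared Mahalanobis distances, inverting the 2x2 covariance by hand.
    (a, b), (c, d) = cov
    det = a * d - b * c
    i00, i01, i10, i11 = d / det, -b / det, -c / det, a / det
    out = []
    for p in points:
        u, v = p[0] - center[0], p[1] - center[1]
        out.append(u * (i00 * u + i01 * v) + v * (i10 * u + i11 * v))
    return out

def two_step_md(points):
    # Step 1: flag suspects with a crude robust distance (squared
    # Euclidean distance from the coordinate-wise median) and keep the
    # closest 75% as the "clean" set.  Step 2: recompute the classical
    # MD for ALL points from the clean set's mean and covariance.
    n = len(points)
    med = (sorted(p[0] for p in points)[n // 2],
           sorted(p[1] for p in points)[n // 2])
    by_dist = sorted(points, key=lambda p: (p[0] - med[0]) ** 2
                     + (p[1] - med[1]) ** 2)
    center, cov = mean_cov(by_dist[:n * 3 // 4])
    return mahalanobis_sq(points, center, cov)

# Points along a line, plus two high leverage points far out in X.
pts = [(1, 1.1), (2, 1.9), (3, 3.1), (4, 3.9), (5, 5.1),
       (6, 5.9), (7, 7.1), (8, 7.9), (20, 5.0), (22, 4.0)]
d = two_step_md(pts)
```

Because the covariance is estimated from the clean subset, the two leverage points receive enormous distances while the low leverage points stay small, illustrating how the second step avoids swamping them.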




Arezoo Bagheri
and
Habshah Midi


Problem statement: The Least Squares (LS) method has been the most popular technique for estimating the parameters of a model due to its optimal properties and ease of computation. An LS estimated regression may be seriously disturbed by multicollinearity, which is a near-linear dependency between two or more explanatory variables in the regression model. Even though LS estimates are unbiased in the presence of multicollinearity, they will be imprecise, with inflated standard errors of the estimated regression coefficients. It is now evident that multiple high leverage points, which are the outliers in the X-direction, may be the prime source of collinearity-influential observations. Approach: In this study, we propose robust procedures for the estimation of regression parameters in the presence of multiple high leverage points which cause multicollinearity problems. The procedure mainly utilizes a one-step reweighted least squares, where the initial weight functions are determined by the Diagnostic-Robust Generalized Potentials (DRGP). Here, we incorporate the DRGP with different types of robust methods to downweight the multiple high leverage points, thereby reducing the effects of multicollinearity. The new proposed methods are called GM-DRGP-L_{1}, GM-DRGP-LTS, M-DRGP, MM-DRGP and DRGP-MM. Some indicators are defined to identify the best performing robust method among the existing and newly introduced methods. Results: The empirical study indicated that the DRGP-MM emerged as more efficient and more reliable than the other methods, followed by the GM-DRGP-LTS, as they were able to reduce the effects of multicollinearity the most. The results seem to suggest that the DRGP-MM and the GM-DRGP-LTS offer a substantial improvement over other methods for correcting the problems of high leverage points enhancing multicollinearity. 
Conclusion/Recommendations: In order to solve multicollinearity problems which are mainly due to multiple high leverage points, the two proposed robust methods, DRGP-MM and GM-DRGP-LTS, are recommended. 




S.K. Sarkar
and
Habshah Midi


Problem statement: The population problem is among the biggest problems in the world. In the global and regional context, the population of Bangladesh has drawn considerable attention from social scientists, policy makers and international organizations. Bangladesh is now the world's 10th most populous country, with about 140 million people. The recent experience of Bangladesh shows that fertility can sustain impressive declines even when women's lives remain severely constrained. Recent statistics also suggest that, despite a continuing increase in the contraceptive prevalence rate (56%), the expected fertility decline in Bangladesh has stalled. Approach: The purpose of this study was to explore the possibility of further fertility decline in Bangladesh, with special attention to identifying social and demographic factors that predict the desire for more children, using stepwise and best subsets logistic regression approaches. The study compared the two approaches to determine an optimum model for prediction of the outcome. Results: It was found that an excess desire for children is solely responsible for the stalled fertility. Conclusion: To overcome the situation, the policy makers of Bangladesh should pay attention to eliminating the regional variations in the desire for more children and introduce awareness programs among rural women about the positive impact of smaller families. 





Ehab A. Mahmood
,
Habshah Midi
and
Abdul Ghapor Hussin


Researchers are interested in developing robust estimation methods. These methods can be used when the data contain outliers or do not satisfy the conditions of classical methods. However, few researchers have suggested robust estimators for circular data. In this study, we propose robust estimators of the circular variance and the mean resultant length. The proposed robust estimation extends the trimming procedure by finding a robust formula for trimming. Simulation results and a practical example show that the proposed procedures for the circular variance and mean resultant length are better than the classical methods for different ratios of outliers. 
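The quantities involved can be sketched directly: the mean resultant length R-bar is the length of the average unit vector, and the circular variance is 1 - R-bar. The trimming rule used below (drop a fixed fraction farthest from a crude circular median) is an illustrative assumption, not the robust trimming formula derived in the study.

```python
import math

def resultant_length(angles):
    # Mean resultant length R-bar; circular variance is 1 - R-bar.
    s = sum(math.sin(a) for a in angles) / len(angles)
    c = sum(math.cos(a) for a in angles) / len(angles)
    return math.hypot(c, s)

def circ_dist(a, b):
    # Shortest arc between two angles (radians), in [0, pi].
    return math.pi - abs(math.pi - abs(a - b) % (2 * math.pi))

def trimmed_circular_variance(angles, trim=0.2):
    # Trim the fraction of observations farthest from the circular
    # median, then return 1 - R-bar of the remainder.
    med = min(angles, key=lambda a: sum(circ_dist(a, b) for b in angles))
    keep = sorted(angles, key=lambda a: circ_dist(a, med))
    keep = keep[:max(1, round(len(keep) * (1 - trim)))]
    return 1.0 - resultant_length(keep)

# Concentrated angles with two outliers on the far side of the circle.
data = [0.1, 0.15, 0.2, 0.12, 0.18, 0.14, 0.16, 0.13, 2.9, 3.1]
classical = 1.0 - resultant_length(data)
robust = trimmed_circular_variance(data)
```

Two outliers are enough to inflate the classical circular variance badly, while the trimmed version reflects the tight concentration of the bulk of the angles.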





Habshah Midi
,
S.K. Sarkar
and
Sohel Rana


The aim of this study was to fit a multinomial logit model and check whether any gain is achieved by this more complicated model over the binary logit model. In practice, it is quite common for the categorical response to have more than two levels. The multinomial logit model is a straightforward extension of the binary logit model. When the response variable is nominal with more than two levels and the explanatory variables are a mix of interval and nominal scales, multinomial logit analysis is more appropriate than the binary logit model. The maximum likelihood method of estimation is employed to obtain the estimates, and consequently the Wald test and the likelihood ratio test are used. The findings suggest that the parameter estimates under the two logits were similar, since neither Wald statistic was significant. Thus, it can be concluded that the more complicated multinomial logit model was no better than the simpler binary logit model. When the response variable has more than two levels in categorical data analysis, it is strongly recommended that the adequacy of the multinomial logit model over the binary logit model be justified in the fitting process. 




Ng Kooi Huat
and
Habshah Midi


A control chart for detecting shifts in the variance of a process is developed for the case where the nominal value of the variance is unknown. The Shewhart S control chart is one of the most extensively used statistical process control techniques for monitoring process variability, developed under the essential assumption that the underlying distribution of the quality characteristic is normal. However, this chart is very sensitive to the occurrence of occasional outliers. As an alternative, robust control charts are put forward when the underlying normality assumption is not met. In this study, a robust control chart for the process standard deviation σ by means of a robust scale estimator is proposed. The presented robust method yields a better performance than the Shewhart method and has good properties for contaminated and heavy-tailed distributions for moderate sample sizes. The proposed robust modified biweight A chart acts as an alternative for practitioners who are interested in the detection of permanent shifts in the process variance, whereby the presence of occasional outliers is usually associated with the occurrence of common causes. 
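The effect of the scale estimator on the chart can be sketched as follows. Two simplifications are assumed here: the MAD stands in for the modified biweight A estimator of the study, and the limits are placed at plus or minus k empirical standard errors of the subgroup scales rather than via the textbook B3/B4 constants.

```python
import statistics

def mad_scale(xs):
    # MAD scaled by 1.4826: consistent for sigma under normality but
    # resistant to occasional outliers (a stand-in for the modified
    # biweight A estimator of the study).
    m = statistics.median(xs)
    return 1.4826 * statistics.median([abs(x - m) for x in xs])

def s_chart_limits(subgroups, scale=statistics.stdev, k=3.0):
    # Shewhart-style S chart: estimate sigma within each subgroup using
    # the chosen scale estimator, centre the chart at the average, and
    # place limits k standard errors either side (lower limit clipped at 0).
    sigmas = [scale(g) for g in subgroups]
    centre = sum(sigmas) / len(sigmas)
    se = statistics.stdev(sigmas)
    return max(0.0, centre - k * se), centre, centre + k * se

base = [9.8, 10.1, 10.0, 9.9, 10.2]
groups = [base] * 4 + [[9.8, 10.1, 10.0, 9.9, 30.0]]   # one outlier
lcl_c, c_c, ucl_c = s_chart_limits(groups)
lcl_r, c_r, ucl_r = s_chart_limits(groups, scale=mad_scale)
```

A single occasional outlier in one subgroup inflates the classical centre line and upper limit, making genuine variance shifts harder to flag; the robust scale keeps both close to the in-control level.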





