Subscribe Now Subscribe Today
Research Article
 

Goodness-of-Fit Tests for GEE Models Using Kappa-like Statistic to Diabetes Mellitus Study



Md. Abdus Salam Akanda and Maksuda Khanam
 
Facebook Twitter Digg Reddit Linkedin StumbleUpon E-mail
ABSTRACT

A kappa-like classification statistic is used for assessing the fit of GEE regression models with a categorical response. The statistic is a summary measure depicting how well categorical responses are predicted from the fitted GEE model. The statistic takes on a value of 1 if prediction is perfect and a value of 0 if the fitted model fares no better than random chance, i.e., fitting the repeated categorical responses with an intercept-only model. To assess the performance of the classification statistic, we conducted analyses by using BIRDEM data and the concern is assessing the fit of the GEE categorical response models by determining how well the covariates predict the subject’s responses.

Services
Related Articles in ASCI
Similar Articles in this Journal
Search in Google Scholar
View Citation
Report Citation

 
  How to cite this article:

Md. Abdus Salam Akanda and Maksuda Khanam , 2005. Goodness-of-Fit Tests for GEE Models Using Kappa-like Statistic to Diabetes Mellitus Study. Journal of Applied Sciences, 5: 1597-1601.

DOI: 10.3923/jas.2005.1597.1601

URL: https://scialert.net/abstract/?doi=jas.2005.1597.1601

INTRODUCTION

Generalized Estimating Equations (GEE) are useful for analyzing correlated data with categorical or continuous responses [1,2]. Parameter estimation is conducted through estimating equations which converge to a sum of mean zero random variables if the mean structure is correctly specified. There is no need to specify a joint distribution for the responses. However, assessing model fit is further complicated with GEE than for models assuming independence because no likelihood is available and the residuals are correlated within a cluster. Some methods are available for assessing the fit of GEE regression models with binary responses. Horton[3] developed a goodness-of-fit test for assessing such model fit by extending Hosmer[4] goodness-of-fit statistic for ordinary logistic regression. Their proposed statistic has an approximate chi-squared distribution when the model is specified correctly. Barnhart[5] also propose a goodness-of-fit statistic for assessing the fit of GEE binary regression models. They extend Tsiatis’ method[6] for assessing the fit of ordinary logistic regression models. This approach involves partitioning the space of covariates into distinct regions and forming score statistics that are asymptotically distributed as chi-square random variables with the appropriate degrees of freedom. Barnhart’s[5] approach is best employed in the situation when there are only discrete covariates available because then there is no need to partition the covariates. Pan[7] has proposed goodness-of-fit tests for GEE with correlated binary data. Pan’s two tests result in the Pearson chi-square and an unweighted sum of residual squares, both of which are based on the residuals. These two tests can only be used when there is at least one continuous covariate available. If the possibility of influential observations is of concern to the data analyst, Preisser[8] have proposed deletion diagnostics for generalized estimating equations. The diagnostics consider leverage and residuals to measure the influence of a subset of observations on the fitted regression parameters. Preisser[9] also generalize the GEE procedure to produce parameter estimates and fitted values that are resistant to influential data. Here, the concern is assessing the fit of GEE categorical response models by determining how well the covariates predict the subject’s responses. We present the kappa-like classification statistic, which indicates how well the proposed model predicts the categorical response.

GENERALIZED ESTIMATING EQUATION (GEE)

We outline Lipsitz[10] method as follows. Assume the response of interest is a categorical outcome with K categories denoted zit = k if the tth subunit from the ith cluster falls in the kth category, for i = 1, 2,......,N; t = 1, 2,.....,Ti(T = max(Ti)∀i = 1, 2,.....,N) and k = 1, 2,.........,K. For simplicity, we will assume that the data are balanced, i.e., Ti = T for i = 1, 2,......,N. The following method will still be applicable in the case of unbalanced data. The T(K-1)x1 response vector Yt for cluster I consists of the binary random variables Yitk, where, Yitk = 1 if Zit = k.

Typically one models the marginal cumulative probabilities of response, vitk = Pr (Zit≤ k) for k = 1, 2,.......,K-1. The marginal probabilities are denoted by πitk = Pr (Zit = k) = Pr (Yitk = 1) = E (Yitk) = vitk - vit,k-1 and will comprise the T(K-1)x1 vector πi. The vectors Yi and πi require only T(K-1) elements instead of TK elements because Image for - Goodness-of-Fit Tests for GEE Models Using Kappa-like Statistic to Diabetes Mellitus Study for i = 1, 2,........,K-1 and t = 1,2,........,T. Let Xit be the p x 1 covariate vector for the tth subunit of the ith cluster.

The cumulative marginal response probabilities will be related to the covariates via ling function g, the kth intercept λk and the px 1 marginal parameter vector β, g (vitk) = λk + X’itβ.

The intercept are in increasing order: λ1≤λ2≤........≤λK-1. For an ordinal response, the function g may be any link function such as the logit function, probit function (φ-1), or the elementary log-log function. Lipsitz[10] suggest that one estimates β with the following set of generalized estimating equations:

Image for - Goodness-of-Fit Tests for GEE Models Using Kappa-like Statistic to Diabetes Mellitus Study

where, Di = Di (β) = dπt (β)|dβ, Vi = Vi (β, α) ≈ var (Yi) is a working covariance matrix of Yi[1,2] and α is a q x 1 vector of correlation parameters. The parameters α are associated with the correlation between the elements of the vectors Yis and Yit.

KAPPA-LIKE STATISTIC

We used a kappa-like statistic to assess model fit for GEE categorical response models. Historically, the kappa coefficient has been used to determine the agreement of binary[11] and categorical[12] outcomes between raters. Kappa corrects the percentage of agreement between raters by taking into account the proportion of agreement expected by chance. Kappa has been used as a measure of reproducibility in many epidemiologic settings, such as studies involving twin similarity [13] and control-informant agreement collected from case-control studies. The general expression for the kappa statistic is:

Image for - Goodness-of-Fit Tests for GEE Models Using Kappa-like Statistic to Diabetes Mellitus Study

where, P0 is the observed proportion of agreement and Pe is the proportion of agreement expected by chance alone[12]. A value of 0 for k indicates no agreement beyond chance and a value of 1 indicates perfect agreement, among many of k’s desirable properties[14]. Thus, larger values of k indicate greater agreement between the outcomes.

Here we use k as a measure of agreement between the predicted and observed categorical responses to assess the fit of the GEE model. We estimate k in a second set of estimating equations, similar to Lipsitz[15], Klar[13] and Williamson[16]. With Lipsitz’s[10] method, we estimate the probability of the response falling in each of the K categories. Denote this estimated probability for the kth category, tth subunit, of the ith cluster as Image for - Goodness-of-Fit Tests for GEE Models Using Kappa-like Statistic to Diabetes Mellitus Study . We do not have a straightforward predicted response as with linear regression. However, if we did have a predicted response for the tth subunit of the ith cluster, denoted Image for - Goodness-of-Fit Tests for GEE Models Using Kappa-like Statistic to Diabetes Mellitus Study , it is natural to assume that Image for - Goodness-of-Fit Tests for GEE Models Using Kappa-like Statistic to Diabetes Mellitus Study would equal k (k = 1,2,AAA,K) with probability Image for - Goodness-of-Fit Tests for GEE Models Using Kappa-like Statistic to Diabetes Mellitus Study . Let P0it denote the probability that the predicted response from the model is equal to the observed response, i.e., Image for - Goodness-of-Fit Tests for GEE Models Using Kappa-like Statistic to Diabetes Mellitus Study = Zit. A natural estimate of P0it is obtained by using Image for - Goodness-of-Fit Tests for GEE Models Using Kappa-like Statistic to Diabetes Mellitus StudyZit, the estimated probability from the fitted model that the response falls into the observed category for the tth subunit of the ith cluster. We define kit as the agreement between the predicted and observed responses for the tth subunit of the ith cluster as follows:

Image for - Goodness-of-Fit Tests for GEE Models Using Kappa-like Statistic to Diabetes Mellitus Study

where, P0it is defined above and Pe is the probability of correct prediction expected by chance alone.

As an estimate of Pe, we fit an intercept-only model. Cox[17] and Nagelkerke[18] proposed using an intercept-only model as a baseline model when generalizing the coefficient of determination for assessing the fit of a logistic regression model. Thus, we will fit a model with just the intercepts, the λk parameters for k = 1,2, AAA,K-1, and no covariates. This baseline model will provide a good starting point from which to compare the proposed model. The estimated category probabilities from the intercept-only model will be the same for all clusters and subunits and will be denoted:

Image for - Goodness-of-Fit Tests for GEE Models Using Kappa-like Statistic to Diabetes Mellitus Study

where, nk is the sum of observations in category k, i.e., Image for - Goodness-of-Fit Tests for GEE Models Using Kappa-like Statistic to Diabetes Mellitus Study for k = 1, . . .,K. All nk observations with response category k will each be correctly predicted with probability Image for - Goodness-of-Fit Tests for GEE Models Using Kappa-like Statistic to Diabetes Mellitus Study ; accordingly, the estimate of Pe is:

Image for - Goodness-of-Fit Tests for GEE Models Using Kappa-like Statistic to Diabetes Mellitus Study

The agreement between two raters for assessing a categorical outcome with K categories can be depicted in a K x K contingency table[14]. The row and column total probabilities for the kth outcome category are pk. and p.k, the marginal probabilities that the two raters assess the outcome in the kth category. The estimate of the probability expected by chance is calculated assuming independence between the rows and columns in the contingency table and is Image for - Goodness-of-Fit Tests for GEE Models Using Kappa-like Statistic to Diabetes Mellitus Study, which is similar to the estimate above. We will estimate an overall k to ascertain the fit of the model (k = kit for i = 1, . . .,N and t = 1, . . . , T). By noting that P0it = Pe+k(1.0-Pe), we use a second set of estimating equations as follows. Let P0i and Ui denote the Tx1 vectors Image for - Goodness-of-Fit Tests for GEE Models Using Kappa-like Statistic to Diabetes Mellitus Study and Image for - Goodness-of-Fit Tests for GEE Models Using Kappa-like Statistic to Diabetes Mellitus Study. The second set of estimating equations are, thus:

Image for - Goodness-of-Fit Tests for GEE Models Using Kappa-like Statistic to Diabetes Mellitus Study

where, Image for - Goodness-of-Fit Tests for GEE Models Using Kappa-like Statistic to Diabetes Mellitus Studyand W i≈ var (Ui) is the T xT working covariance matrix ofUi. To compute Image for - Goodness-of-Fit Tests for GEE Models Using Kappa-like Statistic to Diabetes Mellitus Study one can use a Fisher-scoring-type algorithm such as:

Image for - Goodness-of-Fit Tests for GEE Models Using Kappa-like Statistic to Diabetes Mellitus Study

and

Image for - Goodness-of-Fit Tests for GEE Models Using Kappa-like Statistic to Diabetes Mellitus Study

where, m denotes the iteration. We use Liang[1]’s empirically corrected variance estimate of Image for - Goodness-of-Fit Tests for GEE Models Using Kappa-like Statistic to Diabetes Mellitus Study and Prentice’s[19] empirically corrected variance estimate of Image for - Goodness-of-Fit Tests for GEE Models Using Kappa-like Statistic to Diabetes Mellitus Study. The second set of estimating equations can be solved non-iteratively if we choose the T x T identity matrix for Wi:

Image for - Goodness-of-Fit Tests for GEE Models Using Kappa-like Statistic to Diabetes Mellitus Study

The term Image for - Goodness-of-Fit Tests for GEE Models Using Kappa-like Statistic to Diabetes Mellitus Study is the average predicted probability corresponding to the observed responses. If the fitted model predicts the categorical response perfectly, i.e., Image for - Goodness-of-Fit Tests for GEE Models Using Kappa-like Statistic to Diabetes Mellitus StudyZit then Image for - Goodness-of-Fit Tests for GEE Models Using Kappa-like Statistic to Diabetes Mellitus Study = 1.0 . If the fitted model predicts the responses no better than an intercept-only model, thenThis kappa-like classification statistic should be interpreted as the average probability of predicting the observed responses above and beyond the prediction by the intercept-only model.

DATA AND VARIABLES

In present study the diabetes mellitus data was used to carry out the analysis. Here the follow up data on 528 patients registered at BIRDEM (Bangladesh Institute of Research and Rehabilitation in Diabetes, Endocrine and Metabolic disorders) in 1984-94 are used to identify the risk factors responsible for the transitions from controlled diabetic to confirmed diabetic state as well as confirm diabetic to controlled stage of diabetes. The response variable is defined in terms of the observed glucose level two hours of 75 g glucose load for each follow-up visit. The cut-off point for the blood glucose level is 11.1 mM L-1. If the observed response is less than 11.1, then the patient is defined as non diabetic (categorized as 0) if the response is greater than or equal to 11.1 then the patient is said to be diabetic (categorized as 1) according to WHO (1985) criteria. We include six independent variables, age, sex, education level, area of residence, family history of father and mother and time in our study. Out of these variables, age represents the age of the respondents at each visit. Time represents the length of time of the consecutive visits. These two variables are continuous variables and used directly in the analysis. Other variables are dichotomous variable.

RESULTS AND DISCUSSION

The kappa-like statistic takes on a value of 0.0 for the intercept-only model and a value of 1.0 for the saturated model. An advantage of the statistic is that no decisions need be made when calculating it, unlike methods based on covariate partitioning (where to partition, how many partitioned categories), Hosmer and Lemeshow’s approach, rank correlation methods and classification tables. Interpretation of the kappa statistic is not always straightforward; see Fleiss[12] and Landis[20] for details. Similar to Landis’s[20] labeling of kappa values, Williamson[21] suggest interpreting the values of kappa for this classification index as follows:

Image for - Goodness-of-Fit Tests for GEE Models Using Kappa-like Statistic to Diabetes Mellitus Study

First we fit a GEE logistic regression model with main effects terms only. We then examined various interaction and quadratic variables for entry into the regression model at the significance level of 0.05. The quadratic terms for age and time were significant and entered into the final model. The kappa-like classification index for this final model increased indicating a better fit.

Table 1: GEE logistic regression models without quadratic terms assuming the various correlation structures for diabetic mellitus study
Image for - Goodness-of-Fit Tests for GEE Models Using Kappa-like Statistic to Diabetes Mellitus Study

Table 2: GEE logistic regression models with quadratic terms assuming the various correlation structures for diabetic mellitus study
Image for - Goodness-of-Fit Tests for GEE Models Using Kappa-like Statistic to Diabetes Mellitus Study

With the inclusion of the two quadratic terms, the model fit the data quite well according to Barnhart’s[5] test.

From Table 1, it can be seen that the kappa-like statistic is a fair predictor for the GEE model of exchangeable correlation structure, a good predictor for the autoregressive correlation structure and poor predictor for pairwise correlation structure. Though the likelihood ratio test shows significant effect of covariates in the model but this test does not indicate how well the model predicts the observed responses.

Table 2 shows that when we include the two quadratic terms the kappa-like statistic is good for GEE model of exchangeable correlation structure, a excellent predictor for the autoregressive correlation structure and fair predictor for pairwise correlation structure.

CONCLUSIONS

The kappa-like classification statistic is a more appropriate indicator of how well the model predicts the observed responses at the cluster level (e.g., an individual) as opposed to how well the model fits the data at the group level (e.g., treatment category). Often a model can fit the data well in terms of predicting the proportion of positive responses for a group of individuals, but is not necessarily useful for predicting a particular individual’s response. The kappa-like classification index is an intuitive measure for assessing model fit in that it estimates the probability of an observation being correctly predicted by the fitted model. Then, this probability is corrected for chance by comparing it to the probability that an intercept-only model would have correctly predicted the observation. For the diabetes mellitus study, we would recommend that the kappa value for the GEE model with quadratic terms indicated better prediction than the GEE models without quadratic terms. That is, the GEE models with quadratic terms are fitter well.

ACKNOWLEDGMENTS

The authors thank Dr. Kalipada Sen, Professor, Department of Statistics, University of Dhaka, Bangladesh for helpful discussions and manuscript review.

REFERENCES
1:  Liang, K.Y. and S.L. Zeger, 1986. Longitudinal data analysis using generalized linear models. Biometrika, 73: 13-22.
Direct Link  |  

2:  Zeger, S.L. and K.Y. Liang, 1986. Longitudinal data analysis for discrete and continuous outcomes. Biometrics, 42: 121-130.

3:  Horton, N.J., J.D. Bebchuck, C.L. Jones, S.R. Lipsitz, P.J. Catalano, G.E.P. Zahner and G.M. Fitzmaurice, 1999. Goodness-of-fit for GEE: An example with mental health service utilization. Stat. Med., 18: 213-222.

4:  Barnhart, H.X. and J.M. Williamson, 1998. Goodness-of-fit tests for GEE modeling with binary responses. Biometrics, 54: 720-729.

5:  Tsiatis, A.A., 1980. A note on a goodness-of-fit test for the logistic regression model. Biometrika, 67: 250-251.

6:  Pan, W., 2002. Goodness-of-fit tests for GEE with correlated binary data. Scand. J. Stat., 29: 101-110.

7:  Preisser, J.S. and B.F. Qaqish, 1996. Deletion diagnostics for generalized estimating equations. Biometrika, 83: 551-562.

8:  Preisser, J.S. and B.F. Qaqish, 1999. Robust regression for clustered data with application to binary responses. Biometrics, 55: 574-579.

9:  Lipsitz, S.R., K. Kim and L. Zhao, 1994. Analysis of repeated categorical data using generalized estimating equations. Stat. Med., 13: 1149-1163.

10:  Cohen, J., 1960. A coefficient of agreement for nominal scales. Educat. Psychol. Measurement, 20: 37-46.

11:  Fleiss, J.L., 1971. Measuring nominal scale agreement among many raters. Psychol. Bull., 76: 378-382.
CrossRef  |  Direct Link  |  

12:  Klar, N., S.R. Lipsitz and J. Ibrahim, 2000. An estimating equations approach for modeling kappa. Biometrical J., 42: 45-58.

13:  Lipsitz, S.R., N.M. Laird and T.A. Brennan, 1994. Simple moment estimates of the co-efficient and its variance. Applied Stat., 43: 309-323.

14:  Williamson, J.M., A.K. Manatunga and S.R. Lipsitz, 2000. Modeling kappa for measuring dependent categorical agreement data. Biostatistics, 1: 191-202.

15:  Nagelkerke, N.J.D., 1991. A note on a general definition of the coefficient of determination. Biometrika, 78: 691-692.
Direct Link  |  

16:  Prentice, R.L., 1988. Correlated binary regression with covariates specific to each binary observation. Biometrics, 44: 1033-1048.
Direct Link  |  

17:  Landis, R.J. and G.G. Koch, 1977. The measurement of observer agreement for categorical data. Biometrics, 33: 159-174.
PubMed  |  Direct Link  |  

18:  Williamson, J.N., L. Hung-Mo and H.X. Barnhart, 2003. A classification statistic for GEE categorical response models. J. Data Sci., 1: 149-165.

19:  Cox, D.R. and E.J. Snell, 1989. The Analysis of Binary Data. 2nd Edn., Chapman and Hall, London.

20:  Hosmer, D.W. and S. Lemeshow, 1989. Applied Logistic Regression. Wiley-Interscience Publication, New York, USA.

21:  Fleiss, J.L., 1981. Statistical Methods for Rates and Proportions. 2nd Edn., John Wiley and Sons Inc., New York, USA., Pages: 336.

©  2021 Science Alert. All Rights Reserved