Research Article
 

An Analytical Approach on Non-parametric Estimation of Cure Rate Based on Uncensored Data



M. Taj Uddin, A. Sen, M.S. Noor, M.N. Islam and Z.I. Chowdhury
 
ABSTRACT

This study deals with non-parametric estimation of the cure rate parameter. Frailty models have been proposed for cure rate estimation. In this research we have tried to estimate the cure parameter of the frailty model using the non-parametric maximum likelihood estimation (NPMLE) method. To perform the analysis, we consider two cases: i) cured and non-cured groups simultaneously and ii) the non-cured group only. The analysis showed that the cure rate estimator cannot be obtained from an analytical solution of the estimating equation but may be obtained from a numerical solution of the equation.


 
  How to cite this article:

M. Taj Uddin, A. Sen, M.S. Noor, M.N. Islam and Z.I. Chowdhury, 2006. An Analytical Approach on Non-parametric Estimation of Cure Rate Based on Uncensored Data. Journal of Applied Sciences, 6: 1258-1264.

DOI: 10.3923/jas.2006.1258.1264

URL: https://scialert.net/abstract/?doi=jas.2006.1258.1264

INTRODUCTION

When analyzing survival data from clinical trials, it is sometimes clear that a non-zero proportion of patients can be considered cured. Cantor and Shuster (1992) give a constructive discussion of parametric and non-parametric methods for estimating the cure rate based on censored survival data. They use the Kaplan-Meier (1958) method for non-parametric estimation of the cure proportion. For parametric estimation of cure rates, on the other hand, they assume a survival function S(t) for which S(∞) > 0, i.e., in a proportion of patients the event never occurs; using MLE one can then estimate S(∞), which is considered the cure rate. Since in many cases a given distributional form clearly does not fit the survival data well, the use of non-parametric methods is attractive. However, the Kaplan-Meier method is not free of problems and limitations. Miller (1983) shows that the asymptotic efficiency of the Kaplan-Meier estimator tends to be low relative to the MLE for a given distribution.

Tsodikov (2001) studied the estimation of the survival function based on the proportional hazard model with cure. He proposed an algorithm to fit the proportional hazard model restricted by fixed survival rates at the end of the observation period, and used a parametric cure model to estimate the proportion of long-term survivors. To retain the stability of the parametric method, the survival function is estimated non-parametrically, conditional on the cure rates provided by the parametric analysis. Peng and Carriere (2002) proposed parametric and semi-parametric cure models, based on mixture models, for cure rate estimation. In their study, several parametric and semi-parametric models are compared and their estimation methods are discussed within the framework of the EM algorithm. They showed that semi-parametric cure models can achieve efficiency levels similar to those of parametric cure models, provided that the failure time distribution is well specified and non-cured patients have an increasing hazard rate. They also concluded that the semi-parametric model is a viable alternative to parametric cure models. The main objective of the present analysis is to develop a non-parametric estimating equation for the cure parameter of the model.

Cure model: Cure models were first proposed 50 years ago (Boag, 1949; Berkson et al., 1952; Haybittle, 1959) and have since received regular attention in the statistical literature (Haybittle, 1965; Mould et al., 1975; O’Neill, 1979; Farewell, 1982; Goldman, 1984; Sposto and Sather, 1985; Farewell, 1986; Halpern, 1987; Goldman, 1991; Sposto et al., 1992; Sposto, 2002; Kuk et al., 1992; Maller et al., 1992; Cantor et al., 1992; Ghitany et al., 1994; Yakovlev, 1994; Cantor et al., 1994; Ghitany et al., 1995; Lee et al., 1995; Zhou et al., 1995; Laska et al., 1992; Tsodikov et al., 1998; Peng et al., 1998; Gieser et al., 1998; Tsodikov, 1998; Hauck et al., 1997; De Angelis et al., 1999; Sy and Taylor, 2000). However, they have not attained wide use or acceptance in the medical research literature, perhaps in part because of their reliance on particular parametric forms. Nevertheless, parametric cure models provide a good empirical description of outcome in paediatric cancer data (Sposto and Sather, 1985; Sposto et al., 1992). Most importantly, they provide a single analytic method within which the effect of treatments and prognostic factors on the proportion cured can be assessed separately from their effect on the time to failure.

Frailty model: Chen et al. (1999) have proposed a different type of cure rate model, which is considered as a frailty model. The frailty model can be derived as follows:

Suppose that for an individual in the population, N denotes the total number of carcinogenic cells (often called clonogens) left active after the initial treatment, and assume that N has a Poisson distribution with mean θ. Let Zi denote the random time for the ith clonogenic cell to produce a detectable cancer mass; that is, Zi can be viewed as an incubation time for the ith clonogenic cell. The variables Zi, i = 1, 2, …, are assumed to be independent and identically distributed (i.i.d.) with a common distribution function F(t) = 1-S(t), where S(t) is the survival function, and are independent of N. The time to relapse of cancer can be defined by the random variable T = min{Zi, 0≤i≤N}, where P(Z0 = ∞) = 1 and N is independent of the sequence Z1, Z2, … The survival function for T, and hence the survival function for the population, is given by


$$S_P(t) = \exp(-\theta F(t)) \tag{1}$$

The model (1) is not a proper survival function, because SP(∞) = exp(-θ) ≠ 0. Note that F(0) = 0 and F(∞) = 1. We observe that model (1) shows explicitly the contribution to the failure time of two distinct characteristics of tumor growth: the initial number of carcinogenic cells and the rate of their progression. Thus the model incorporates parameters bearing clear biological meaning. The model (1) is suitable for any type of failure data that has a surviving fraction (cure rate). Thus the model can be useful for modeling various types of failure time data, including time to relapse, time to death, time to first infection and so forth.
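The construction of T above can be checked by simulation. The following is a minimal sketch, assuming exponential incubation times with a hypothetical rate parameter `rate`; the function names, the parameter values and the evaluation time `t0` are illustrative choices and are not taken from the article. It compares the simulated cured fraction with exp(-θ) and the simulated survival at `t0` with exp(-θF(t0)).

```python
import math
import random

def sample_poisson(theta, rng):
    """Draw from Poisson(theta) with Knuth's multiplication method."""
    limit = math.exp(-theta)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def sample_relapse_time(theta, rate, rng):
    """T = min of N i.i.d. incubation times; infinite when N = 0 (cured)."""
    n_cells = sample_poisson(theta, rng)
    if n_cells == 0:
        return math.inf
    # Assumption: exponential incubation times, F(t) = 1 - exp(-rate * t).
    return min(rng.expovariate(rate) for _ in range(n_cells))

rng = random.Random(0)
theta, rate, t0 = 1.2, 0.5, 2.0
times = [sample_relapse_time(theta, rate, rng) for _ in range(100_000)]

cured_fraction = sum(t == math.inf for t in times) / len(times)
surv_at_t0 = sum(t > t0 for t in times) / len(times)

model_cure = math.exp(-theta)                                # exp(-theta)
model_surv = math.exp(-theta * (1 - math.exp(-rate * t0)))   # exp(-theta * F(t0))

print("cured fraction:", cured_fraction, "model:", model_cure)
print("survival at t0:", surv_at_t0, "model:", model_surv)
```

With a large number of simulated patients the two pairs of values should agree up to Monte Carlo error.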

We also observe that the cure rate π is given by

$$\pi = S_P(\infty) = \exp(-\theta) \tag{2}$$

As θ → ∞, the cure rate tends to zero, whereas as θ → 0, the cure rate tends to 1; i.e., the cure rate lies between 0 and 1. Taking the first derivative of (1), we get

$$S'_P(t) = -\theta F'(t) \exp(-\theta F(t))$$

where S’P(t) and F’(t) denote the first derivatives of SP(t) and F(t), respectively, and F’(t) = f(t),

or, -S’P(t) = θf(t)exp(-θF(t)).

Since -S’P(t) = fP(t), the density function corresponding to model (1) is given by

$$f_P(t) = \theta f(t) \exp(-\theta F(t)) \tag{3}$$

We observe that SP(t) is not a proper survival function, because SP(∞) ≠ 0. Therefore fP(t) is not a proper probability density function. But f(t) in model (3) is a proper density function.
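That fP(t) is improper can be illustrated numerically: its total mass should equal 1 - exp(-θ), the non-cured probability, rather than 1. The sketch below assumes an exponential baseline F(t) = 1 - exp(-rate·t); the baseline, the parameter values and the helper names are illustrative assumptions, not part of the article.

```python
import math

def improper_density(t, theta, rate):
    """theta * f(t) * exp(-theta * F(t)) with an assumed exponential baseline."""
    big_f = 1.0 - math.exp(-rate * t)      # F(t)
    small_f = rate * math.exp(-rate * t)   # f(t)
    return theta * small_f * math.exp(-theta * big_f)

def trapezoid(func, a, b, steps):
    """Composite trapezoidal rule on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (func(a) + func(b))
    for k in range(1, steps):
        total += func(a + k * h)
    return total * h

theta, rate = 1.2, 0.5
# The upper limit 60 stands in for infinity; F(60) is numerically equal to 1.
mass = trapezoid(lambda t: improper_density(t, theta, rate), 0.0, 60.0, 60_000)

print("total mass of the density:", mass)
print("1 - exp(-theta):          ", 1.0 - math.exp(-theta))
```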

MATERIALS AND METHODS

We try to estimate the cure parameter by using Non Parametric Maximum Likelihood Estimation (NPMLE) Method. The method is described as follows:

Non-parametric maximum likelihood method: Suppose that X is a random variable with probability density function f(x;θ), where θ is the parameter to be estimated, and x1, x2, …, xn is a random sample of size n. The joint probability density function of the random variables comprising the sample is called the likelihood function of the sample and is given by

$$L(x_1, x_2, \ldots, x_n; \theta) = \prod_{i=1}^{n} f(x_i; \theta) \tag{4}$$

If there exists a value θ* such that L(x1,x2,…xn;θ*) ≥ L(x1,x2,…xn;θ) for all possible choices of θ, then θ* maximizes Eq. (4), and this value θ* is called the maximum likelihood estimate for the given set of sample values x1,x2,…xn. Choosing the value of θ that makes it most likely that the data would be as obtained is certainly a reasonable approach. Since the likelihood function is a function of the parameter θ given the sample information (x1,x2,…xn), the maximum likelihood estimate will be a function of the sample values. If the same functional relationship between the estimate and the sample information (x1,x2,…xn) holds for all possible choices of the xi, then that functional relationship can be taken as an estimation rule and the result will be an estimator of θ, this estimator being known as the maximum likelihood estimator or the maximum likelihood filter.
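As a small illustration of the definition, the maximum likelihood estimate can be found by scanning candidate values of the parameter and keeping the one with the largest likelihood. The sketch below assumes an exponential density f(x;θ) = θ exp(-θx) and a made-up sample; both are chosen only for illustration. For this density the maximiser is also known in closed form (n divided by the sample total), which gives a check on the grid search.

```python
import math

def exp_loglik(rate, sample):
    """Log of the likelihood (4) for an assumed exponential density."""
    return len(sample) * math.log(rate) - rate * sum(sample)

sample = [0.8, 1.5, 0.3, 2.2, 1.1]           # illustrative data
grid = [k / 1000 for k in range(1, 5001)]    # candidate values of the parameter

rate_hat_grid = max(grid, key=lambda r: exp_loglik(r, sample))
rate_hat_closed = len(sample) / sum(sample)  # known closed form for this density

print("grid maximiser:  ", rate_hat_grid)
print("closed-form MLE: ", rate_hat_closed)
```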

In the non-parametric maximum likelihood method, we write the non-parametric likelihood function as

$$L(F) = \prod_{i=1}^{n} \Delta F(x_i) \tag{5}$$

where F(.) is the common distribution function of Xi, 1≤i≤n, and ΔF(xi) = F(xi) - F(xi-) is the jump of F(.) at xi, 1≤i≤n. Putting ΔF(xi) = pi in (5), we can write

$$L = \prod_{i=1}^{n} p_i \tag{6}$$

Now we maximize (6) subject to the condition

$$\sum_{i=1}^{n} p_i = 1 \tag{7}$$

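The log-likelihood function becomes

$$\log L = \sum_{i=1}^{n} \log p_i \tag{8}$$

We can maximize (8) by the Lagrange multiplier method. Adding a Lagrange multiplier λ, (8) becomes

$$\log L^{*} = \sum_{i=1}^{n} \log p_i + \lambda \left(1 - \sum_{i=1}^{n} p_i\right) \tag{9}$$

Thus, the non-parametric maximum likelihood estimator of pi is obtained by the solution of the following equations

$$\frac{\partial \log L^{*}}{\partial p_i} = 0, \qquad \frac{\partial \log L^{*}}{\partial \lambda} = 0 \tag{10}$$

$$\frac{1}{p_i} - \lambda = 0, \quad i = 1, 2, \ldots, n \tag{11}$$

$$\sum_{i=1}^{n} p_i = 1 \tag{12}$$

From (11), we can write

$$p_i = \frac{1}{\lambda} \tag{13}$$

Using (12) in (13) we obtain

$$\lambda = n \tag{14}$$

Therefore, the non-parametric maximum likelihood estimator of pi is

$$\hat{p}_i = \frac{1}{n}, \quad i = 1, 2, \ldots, n$$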
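The standard solution of this constrained problem places equal mass 1/n on each observation. A quick numerical check is sketched below: the non-parametric log-likelihood at equal weights is compared with its value at many randomly drawn weight vectors that also sum to one. The sample size and the random candidates are illustrative choices.

```python
import math
import random

def np_loglik(p):
    """Non-parametric log-likelihood: the sum of log p_i."""
    return sum(math.log(pi) for pi in p)

n = 6
uniform = [1.0 / n] * n   # equal mass on every observation

rng = random.Random(1)
best_other = -math.inf
for _ in range(1000):
    raw = [rng.random() + 1e-9 for _ in range(n)]
    total = sum(raw)
    candidate = [r / total for r in raw]   # random weights summing to one
    best_other = max(best_other, np_loglik(candidate))

print("log-likelihood at equal weights:  ", np_loglik(uniform))
print("best among random weight vectors: ", best_other)
```

No random candidate should exceed the value at equal weights, which is -n log n.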

RESULTS AND DISCUSSION

Suppose that X is the lifetime of a patient. Then P(X=∞) = e^{-θ}, which is considered as the cure rate. On the other hand, P(X<∞) = 1-e^{-θ}, which is the probability of being non-cured, where the lifetime t ranges over 0≤t≤∞.

By using Non-Parametric Maximum Likelihood Estimation (NPMLE) method, we can estimate the cure parameter. For uncensored data we consider the following cases:

Case-(a): F0(.), f0(.) and θ are unknown. Here we observe both the cured and non-cured groups.

Suppose that we have the data in the form (xi, εi), i = 1, 2, …, n, where xi denotes the survival time for the ith patient and εi is the cured indicator, equal to 1 if the ith patient is not cured and 0 otherwise, i.e., εi = 1{xi<∞}.


(15)

Therefore, the non-parametric likelihood function is given by

(16)

Where, ΔF(xi) = jump of F(.) at xi

(17)

Where,

Therefore, the above likelihood function (16) can be written as

(18)

We want to maximize (18) subject to condition

(19)

The log-likelihood function becomes

(20)

By using Lagrange multiplier method we can maximize (20). Adding Lagrange multiplier λ, we can write (20) as follows

(21)

Therefore, the non-parametric maximum likelihood estimators of θ and Pi are obtained by the solution of the following equations

(22)

Setting the partial derivative of (21) with respect to θ equal to zero gives

(23)

Similarly, setting the partial derivative with respect to pi equal to zero gives, for i = 1, 2, …, n-1,

(24)

and setting the partial derivative with respect to pn equal to zero gives

(25)

Multiplying (24) by pi and summing over i from 1 to n, we obtain the following equation

(26)

Since the pi sum to one by condition (19), Eq. (26) becomes

(27)

From (23) and (27), we obtain

(28)

Now using the estimate of λ in (24), we get

(29)

Therefore,

(30)

Comment: Eq. (30) can be considered as an estimating equation for pi. It cannot be solved analytically but may be solved numerically, and the solution of this equation is our desired estimate of pi.

Again, using (28) in (25), we may obtain the estimate of pn.

Comment: The above equation also cannot be solved analytically but may be solved numerically.

Finally the estimate of θ may be obtained from the numerical solution of Eq. (23).
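One possible way to carry out such a numerical solution for Case-(a) is sketched here. The sketch rests on assumptions that are not taken from the article: the likelihood is taken to be the product over patients of [θ pi exp(-θF(xi))]^εi [exp(-θ)]^(1-εi), with the masses pi placed on the distinct, ordered non-cured times and summing to one, and there are no ties. The helper names `inner_weights`, `profile_loglik` and `golden_max` and the toy data are hypothetical. For a fixed θ the weights are found by bisection on the Lagrange condition; θ is then found by a golden-section search on the resulting profile log-likelihood.

```python
import math

def inner_weights(theta, ranks, iters=200):
    """Weights p_j maximising the assumed log-likelihood for a fixed theta.

    The Lagrange condition is p_j = 1 / (lam + theta * ranks[j]); we bisect
    on s = lam + theta * min(ranks) until the weights sum to one.
    """
    r_min = min(ranks)
    lo, hi = 1e-15, len(ranks) + 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        total = sum(1.0 / (mid + theta * (r - r_min)) for r in ranks)
        if total > 1.0:
            lo = mid   # weights sum to more than one, so s must grow
        else:
            hi = mid
    s = 0.5 * (lo + hi)
    return [1.0 / (s + theta * (r - r_min)) for r in ranks]

def profile_loglik(theta, ranks, n_cured):
    """Assumed log-likelihood in theta with the weights profiled out."""
    p = inner_weights(theta, ranks)
    return (len(ranks) * math.log(theta)
            + sum(math.log(pj) for pj in p)
            - theta * sum(pj * r for pj, r in zip(p, ranks))
            - n_cured * theta)

def golden_max(func, a, b, iters=100):
    """Golden-section search for the maximiser of a unimodal function."""
    g = (math.sqrt(5) - 1) / 2
    c, d = b - g * (b - a), a + g * (b - a)
    fc, fd = func(c), func(d)
    for _ in range(iters):
        if fc > fd:
            b, d, fd = d, c, fc
            c = b - g * (b - a)
            fc = func(c)
        else:
            a, c, fc = c, d, fd
            d = a + g * (b - a)
            fd = func(d)
    return 0.5 * (a + b)

# Toy data (hypothetical): three non-cured survival times and two cured patients.
noncured_times = sorted([2.1, 3.4, 5.0])
n_cured = 2
m = len(noncured_times)
# ranks[j] counts the non-cured times >= the j-th smallest one, so that
# the sum of F(x_i) over non-cured patients equals sum_j p_j * ranks[j].
ranks = [m - j for j in range(m)]

theta_hat = golden_max(lambda th: profile_loglik(th, ranks, n_cured), 1e-6, 20.0)
cure_rate_hat = math.exp(-theta_hat)
print("theta estimate:    ", theta_hat)
print("cure rate estimate:", cure_rate_hat)
```

Under this assumed likelihood only the ordering of the non-cured times enters the estimate; a different likelihood specification would require changing `profile_loglik` and the Lagrange condition in `inner_weights` accordingly.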

Case-(b): F0(.), f0(.) and θ are unknown and only the non-cured group is observed. Suppose that we have the data in the form (xi, εi), i = 1, 2, …, n, where xi denotes the survival time for the ith patient and εi is the cured indicator, equal to 1 if the ith patient is not cured and 0 otherwise, i.e., εi = 1{xi<∞}. The non-parametric likelihood function can be written as

The log-likelihood function is given by


(31)

We want to maximize (31) subject to the condition that the pi sum to one. We can maximize (31) by the Lagrange multiplier method. Adding a Lagrange multiplier λ to (31), we can write

(32)

Therefore, the non-parametric maximum likelihood estimators of θ and pi are obtained by the solution of following equations

(33)

Setting the partial derivative of (32) with respect to θ equal to zero gives

(34)

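and setting the partial derivatives with respect to pi, i = 1, 2, …, n-1, and with respect to pn equal to zero gives, respectively,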

(35)

(36)

Multiplying (35) by pi and summing over i from 1 to n, we obtain

(37)

Since the pi sum to one, the above equation becomes

(38)

Multiplying Eq. (34) by θ and subtracting from (38), we obtain

Therefore,

(39)

Using the estimate of λ in (35), we obtain


Thus,

(40)

Comment: This is an estimating equation for pi, i = 1, 2, …, n-1. It cannot be solved analytically but may be solved numerically, and the solution of this equation is the desired estimate of pi.

Again, using (39) in (36), we obtain

(41)

We observe that the estimate of pn may be obtained from the numerical solution of Eq. (41).

Finally the estimate of θ can be obtained from the numerical solution of the Eq. (34).

CONCLUSIONS

Considering both the non-cured and cured groups, with f0(.) and F0(.) assumed unknown, we obtained a non-parametric estimating equation for θ. We could not find an explicit solution for θ, but this estimating equation may be solved numerically by choosing an appropriate numerical method. We found the same result when only the non-cured group is considered. That is, in both cases we obtained a non-parametric estimating equation for θ.

REFERENCES
1:  Berkson, J. and R.P. Gage, 1952. Survival curves for cancer patients following treatments. J. Am. Stat. Assoc., 47: 501-515.

2:  Boag, J.W., 1949. Maximum likelihood estimates of the proportion of patients cured by cancer therapy. J. R. Stat. Soc., 11: 15-53.

3:  Cantor, A.B. and J.J. Shuster, 1992. Parametric versus non-parametric methods for estimating cure rates based on censored survival data. Stat. Med., 11: 931-937.

4:  Cantor, A.B. and J. Shuster, 1994. Parametric versus non-parametric methods for estimating cure rates based on censored survival data-reply. Stat. Med., 13: 983-986.

5:  Chen, M.H., J.G. Ibrahim and D. Sinha, 1999. A new Bayesian model for survival data with a surviving fraction. J. Am. Stat. Assoc., 94: 909-919.

6:  De Angelis, R., R. Capocaccia, T. Hakulinen, B. Soderman and A. Verdecchia, 1999. Mixture models for cancer survival analysis: Application to population based data with covariates. Stat. Med., 18: 441-454.

7:  Farewell, V.T., 1982. The use of mixture models for the analysis of survival data with long-term survivors. Biometrics, 38: 1041-1046.

8:  Farewell, V.T., 1986. Mixture model in survival analysis: Are they worth the risk? Can. J. Stat., 14: 257-262.

9:  Gieser, P.W., M.N. Chang, P.V. Rao, J.J. Shuster and J. Pullen, 1998. Modeling cure rates using the Gompertz model with covariate information. Stat. Med., 17: 831-839.

10:  Ghitany, M.E., R.A. Maller and S. Zhou, 1994. Exponential mixture models with long term survivors and covariates. J. Multivariate Anal., 49: 218-241.

11:  Ghitany, M.E., R. Maller and S. Zhou, 1995. Estimating the proportion of immunes in censored samples: A simulation study. Stat. Med., 14: 39-49.

12:  Goldman, A.I., 1984. Survivorship analysis when cure is a possibility: A Monte Carlo study. Stat. Med., 3: 153-163.

13:  Goldman, A.I., 1991. The cure model and time confounded risk in the analysis of survival and other timed events. J. Clin. Epidemiol., 44: 1327-1340.

14:  Halpern, J. and B.W. Brown, 1987. Cure rate models: Power of the log rank and generalized Wilcoxon tests. Stat. Med., 6: 483-489.

15:  Hauck, W.W., L.J. Mckee and B.J. Turner, 1997. Two-part survival models applied to administrative data for determining rate and predictors for maternal-child transmission of HIV. Stat. Med., 16: 1683-1694.

16:  Haybittle, J.L., 1965. A two parameters model for the survival curve of treated cancer patients. J. Am. Stat. Assoc., 60: 16-26.

17:  Haybittle, J.L., 1959. The estimation of the proportion of patients cured after treatment for cancer of the breast. Br. J. Radiol., 32: 725-733.

18:  Kaplan, E.L. and P. Meier, 1958. Non-parametric estimation from incomplete observations. J. Am. Stat. Assoc., 53: 457-481.

19:  Kuk, A.Y.C. and C.H. Chen, 1992. A mixture model combining logistic regression with proportional hazards regression. Biometrika, 79: 531-541.

20:  Laska, E.M. and M.J. Meisner, 1992. Non-parametric estimation and testing in a cure model. Biometrics, 48: 1223-1234.

21:  Lee, J.W. and H.N. Sather, 1995. Group sequential methods for comparison of cure rates in clinical trials. Biometrics, 51: 756-763.

22:  Maller, R.A. and S. Zhou, 1992. Estimating the proportion of immunes in a censored sample. Biometrika, 79: 731-739.

23:  Maller, R.A. and S. Zhou, 1995. Testing for the presence of immune or cured individuals in censored survival data. Biometrics, 51: 1197-1205.

24:  Miller, R.G., 1983. What price Kaplan-Meier? Biometrics, 39: 1077-1081.

25:  Mould, R.F. and J.W. Boag, 1975. A test of several parametric models for estimating success rate in the treatment of carcinoma cervix uteri. Br. J. Cancer, 32: 529-550.

26:  O'Neill, T.J., 1979. Distribution free of cure time. Biometrika, 66: 184-187.

27:  Peng, Y. and K.C. Carriere, 2002. An empirical comparison of parametric and semiparametric cure models. Biometrical. J., 44: 1002-1014.

28:  Peng, Y., K.B. Dear and J.W. Denham, 1998. A generalized F mixture model for cure rate estimation. Stat. Med., 17: 813-830.

29:  Sposto, R. and H.N. Sather, 1985. Determining the duration of comparative clinical trials while allowing for cure. J. Chronic Dis., 38: 683-690.

30:  Sposto, R., H.N. Sather and S.A. Baker, 1992. A comparison of tests of the difference in the proportion of patients who are cured. Biometrics, 48: 87-99.

31:  Sposto, R., 2002. Cure model analysis in cancer: An application to data from the children's cancer group. Stat. Med., 21: 293-312.

32:  Sy, J.P. and J.M.G. Taylor, 2000. Estimation in a Cox proportional hazard cure model. Biometrics, 56: 227-236.

33:  Tsodikov, A., 1998. A proportional hazard model taking account of long-term survivors. Biometrics, 54: 1508-1516.

34:  Tsodikov, A., 2001. Estimation of survival based on proportional hazards when cure is a possibility. Math. Comput. Modell., 33: 1227-1236.

35:  Tsodikov, A., M. Loeffler and A. Yakovlev, 1998. A cure model with time-changing risk factor: An application to the analysis of secondary leukemia. A report from the international database on Hodgkin's disease. Stat. Med., 17: 27-40.

36:  Yakovlev, A.Y., 1994. Letter to the editor. Parametric versus non-parametric methods for estimating cure rates based on censored survival data. Stat. Med., 13: 983-985.

37:  Zhou, S. and R.A. Maller, 1995. The likelihood ratio test for the presence of immunes in a censored sample. Stat. Med., 27: 181-201.
