**INTRODUCTION**

When analyzing survival data from clinical trials, it is sometimes clear that
a non-zero proportion of patients can be considered cured. Cantor and Shuster
(1992) give a constructive discussion of parametric and non-parametric methods
for estimating the cure rate from censored survival data. For non-parametric
estimation of the cure proportion they use the Kaplan-Meier (1958) method. For
parametric estimation of cure rates, they assume a survival function
S(t) for which S(∞) > 0, i.e.,
in a proportion of patients the event never occurs; using MLE one can then estimate
S(∞), which is taken as the cure rate. Since in many cases the assumed distributional
form clearly does not fit the survival data well, the use of non-parametric
methods is attractive. However, the Kaplan-Meier method is not free of problems
and limitations. Miller (1983) shows that the asymptotic efficiency of the Kaplan–Meier
estimator tends to be low relative to the MLE for a given distribution.

Tsodikov (2001) studied estimation of the survival function based on a proportional hazards model with cure. He proposed an algorithm to fit the proportional hazards model restricted by fixed survival rates at the end of the observation period, and used a parametric cure model to estimate the proportion of long-term survivors. To exploit the stability of the parametric method, the survival function is estimated non-parametrically conditional on the cure rates provided by the parametric analysis. Peng and Carriere (2002) proposed parametric and semi-parametric mixture cure models for cure rate estimation. In their study, several parametric and semi-parametric models are compared and their estimation methods are discussed within the framework of the EM algorithm. They showed that semi-parametric cure models can achieve efficiency levels similar to those of parametric cure models, provided that the failure time distribution is well specified and non-cured patients have an increasing hazard rate, and they recommended the semi-parametric model as a viable alternative to parametric cure models.

The main objective of this analysis is to develop a non-parametric estimating equation for the cure parameter of the model.

**Cure model:** Cure models were first proposed 50 years ago (Boag, 1949;
Berkson *et al*., 1952; Haybittle, 1959) and have since received regular
attention in the statistical literature (Haybittle, 1965; Mould *et al*.,
1975; O’Neill, 1979; Farewell, 1982; Goldman, 1984; Sposto and Sather,
1985; Farewell, 1986; Halpern, 1987; Goldman, 1991; Sposto *et al*., 1992;
Sposto, 2002; Kuk *et al*., 1992; Maller *et al*., 1992; Cantor *et
al*., 1992; Ghitany *et al*., 1994; Yakovlev, 1994; Cantor *et al*.,
1994; Ghitany *et al*., 1995; Lee *et al*., 1995; Zhou *et al*.,
1995; Laska *et al*., 1992; Tsodikov *et al*., 1998; Peng *et al*.,
1998; Gieser *et al*., 1998; Tsodikov, 1998; Hauck *et al*., 1997;
De Angelis *et al*., 1999; Sy and Taylor, 2000). However, they have not
attained wide use or acceptance in the medical research literature, perhaps
in part because of their reliance on particular parametric forms. Nevertheless, parametric
cure models provide a good empirical description of outcome in paediatric cancer
data (Sposto and Sather, 1985; Sposto *et al*., 1992). Most importantly,
they provide a single analytic method within which the effect of treatments
and prognostic factors on the proportion cured can be assessed separately from
their effect on the time to failure.

**Frailty model:** Chen *et al*. (1999) proposed a different type of cure rate model, which can be regarded as a frailty model. It can be derived as follows:

Suppose that for an individual in the population, N denotes the total number of carcinogenic cells (often called clonogens) left active after the initial treatment, and assume that N has a Poisson distribution with mean θ. Let Z_{i} denote the random time for the ith clonogenic cell to produce a detectable cancer mass; that is, Z_{i} can be viewed as an incubation time for the ith clonogenic cell. The variables Z_{i}, i = 1, 2, …, are assumed to be independent and identically distributed (i.i.d.) with a common distribution function F(t) = 1-S(t), where S(t) is the survival function, and are independent of N. The time to relapse of cancer can be defined by the random variable T = min{Z_{i}, 0≤i≤N}, where P(Z_{0} = ∞) = 1 and N is independent of the sequence Z_{1}, Z_{2}, … The survival function for T, and hence the survival function for the population, is given by

$$S_P(t) = P(N = 0) + \sum_{k=1}^{\infty} S(t)^{k}\,\frac{\theta^{k}}{k!}\,e^{-\theta} = \exp\bigl(-\theta + \theta S(t)\bigr) = \exp\bigl(-\theta F(t)\bigr) \qquad (1)$$

The model (1) is not a proper survival function, because S_{P}(∞) = exp(-θ) > 0. Note that F(0) = 0 and F(∞) = 1. Model (1) shows explicitly the contribution to the failure time of two distinct characteristics of tumor growth: the initial number of carcinogenic cells and the rate of their progression. Thus the model incorporates parameters bearing clear biological meaning. Model (1) is suitable for any type of failure data that has a surviving fraction (cure rate), so it can be useful for modeling various types of failure time data, including time to relapse, time to death, time to first infection and so forth.
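The construction of T as the minimum of a Poisson number of incubation times can be checked by simulation. The sketch below is only an illustration under stated assumptions: the baseline distribution F is taken to be exponential, and the values of `theta` and `rate` as well as all function names are hypothetical choices, not taken from the paper.

```python
import math
import random


def sample_poisson(theta, rng):
    """Draw N ~ Poisson(theta) by Knuth's multiplication method (adequate for small theta)."""
    limit = math.exp(-theta)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def sample_relapse_time(theta, rate, rng):
    """T = min(Z_1, ..., Z_N); T is infinite (cured) when N = 0.

    Assumption for illustration only: Z_i ~ Exponential(rate).
    """
    n_clonogens = sample_poisson(theta, rng)
    if n_clonogens == 0:
        return math.inf
    return min(rng.expovariate(rate) for _ in range(n_clonogens))


def population_survival(t, theta, rate):
    """Model (1): S_P(t) = exp(-theta * F(t)) with F(t) = 1 - exp(-rate * t)."""
    return math.exp(-theta * (1.0 - math.exp(-rate * t)))


def empirical_survival(times, t):
    """Fraction of simulated subjects still event-free at time t."""
    return sum(1 for x in times if x > t) / len(times)


def simulate(theta=1.5, rate=0.8, n=50_000, seed=1):
    """Simulate n relapse times from the clonogen construction."""
    rng = random.Random(seed)
    return [sample_relapse_time(theta, rate, rng) for _ in range(n)]
```

With a large simulated sample, the fraction of infinite relapse times should be close to exp(-θ), and `empirical_survival(times, t)` should be close to `population_survival(t, theta, rate)` for any fixed t.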

We also observe that the cure rate π is given by

$$\pi = S_P(\infty) = \exp(-\theta) \qquad (2)$$

As θ → ∞, the cure rate tends to zero, whereas as θ → 0, the cure rate tends to 1; i.e., the cure rate lies between 0 and 1. Note
that by taking the first derivative of (1), we get

$$S'_P(t) = -\theta F'(t)\exp\bigl(-\theta F(t)\bigr)$$

where S’_{P}(t) and F’(t) denote the first derivatives of
S_{P}(t) and F(t), respectively, and F’(t) = f(t),

or, - S’_{P}(t) = θf(t)exp(-θF(t)).

Since - S’_{P}(t) = f_{P}(t), the density function corresponding to model (1) is given by

$$f_P(t) = \theta f(t)\exp\bigl(-\theta F(t)\bigr) \qquad (3)$$

We observe that S_{P}(t) is not a proper survival function, because S_{P}(∞) ≠ 0. Therefore f_{P}(t) is not a proper probability density function. But f(t) in model (3) is a proper density function.
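That f_{P}(t) is improper can be seen numerically: its total mass is 1 - exp(-θ), the probability of not being cured, rather than 1. The following is a minimal sketch assuming an exponential baseline F; the baseline, the parameter values and the function names are illustrative assumptions, not part of the original analysis.

```python
import math


def baseline_cdf(t, rate):
    """Assumed baseline F(t) = 1 - exp(-rate * t)."""
    return 1.0 - math.exp(-rate * t)


def baseline_pdf(t, rate):
    """Assumed baseline density f(t) = rate * exp(-rate * t)."""
    return rate * math.exp(-rate * t)


def improper_density(t, theta, rate):
    """f_P(t) = theta * f(t) * exp(-theta * F(t))."""
    return theta * baseline_pdf(t, rate) * math.exp(-theta * baseline_cdf(t, rate))


def trapezoid(func, lower, upper, steps):
    """Composite trapezoidal rule on [lower, upper]."""
    width = (upper - lower) / steps
    total = 0.5 * (func(lower) + func(upper))
    for k in range(1, steps):
        total += func(lower + k * width)
    return total * width


def total_mass(theta, rate, upper=60.0, steps=200_000):
    """Approximate the integral of f_P over [0, upper]; upper stands in for infinity."""
    return trapezoid(lambda t: improper_density(t, theta, rate), 0.0, upper, steps)
```

For any θ > 0 the value returned by `total_mass` should agree with 1 - exp(-θ) up to integration error, and the missing mass exp(-θ) is exactly the cure rate.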

**MATERIALS AND METHODS**

We estimate the cure parameter using the Non-Parametric Maximum Likelihood Estimation (NPMLE) method, which is described as follows:

**Non-parametric maximum likelihood method:** Suppose that X is a random variable with probability density function f(x;θ), where θ is the parameter to be estimated, and x_{1}, x_{2}, …, x_{n} is a random sample of size n. The joint probability density function of the random variables comprising the sample is called the likelihood function of the sample and is given by

$$L(x_1, x_2, \dots, x_n; \theta) = \prod_{i=1}^{n} f(x_i; \theta) \qquad (4)$$

If there exists a value θ* such that L(x_{1},x_{2},…x_{n};θ*) ≥
L(x_{1},x_{2},…x_{n};θ) for all possible choices of θ, then θ* maximizes
Eq. (4) and is called the maximum likelihood estimate for the given set of
sample values x_{1},x_{2},…x_{n}. Choosing the value of θ that makes
it most likely that the data would be as obtained is certainly a reasonable
approach. Since the likelihood function is a function of the parameter θ
given the sample information (x_{1},x_{2},…x_{n}),
the maximum likelihood estimate will be a function of the sample values. If
the same functional relationship between the estimate and the sample information
(x_{1},x_{2},…x_{n}) holds for all possible choices
of the x_{i}, then that functional relationship can be taken as an estimation
rule and the result will be an estimator of
θ, known as the maximum likelihood estimator or the maximum
likelihood filter.

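In the non-parametric maximum likelihood method, we write the non-parametric likelihood function as

$$L(F) = \prod_{i=1}^{n} \Delta F(x_i) \qquad (5)$$

where F(.) is the common distribution function of X_{i}, 1≤i≤n, and ΔF(x_{i}) = F(x_{i}) - F(x_{i}-) is the jump of F(.) at x_{i}, 1≤i≤n. Putting ΔF(x_{i}) = p_{i} in (5), we can write

$$L(p_1, \dots, p_n) = \prod_{i=1}^{n} p_i \qquad (6)$$

Now we maximize (6) subject to the condition

$$\sum_{i=1}^{n} p_i = 1 \qquad (7)$$

The log-likelihood function becomes

$$\log L = \sum_{i=1}^{n} \log p_i \qquad (8)$$

We can maximize (8) by the Lagrange multiplier method.
Adding a Lagrange multiplier λ, (8) becomes

$$\ell^{*} = \sum_{i=1}^{n} \log p_i + \lambda\Bigl(1 - \sum_{i=1}^{n} p_i\Bigr) \qquad (9)$$

Thus, the non-parametric maximum likelihood estimator of p_{i} is obtained by solving the following equations

$$\frac{\partial \ell^{*}}{\partial \lambda} = 0 \qquad (10)$$

$$\frac{\partial \ell^{*}}{\partial p_i} = \frac{1}{p_i} - \lambda = 0, \quad i = 1, 2, \dots, n \qquad (11)$$

From (11), we can write

$$p_i = \frac{1}{\lambda} \qquad (12)$$

while (10) recovers the constraint

$$\sum_{i=1}^{n} p_i = 1 \qquad (13)$$

Using (12) in (13) we obtain

$$\frac{n}{\lambda} = 1, \quad \text{i.e.,} \quad \hat{\lambda} = n \qquad (14)$$

Therefore, the non-parametric maximum likelihood estimator of p_{i} is

$$\hat{p}_i = \frac{1}{n}, \quad i = 1, 2, \dots, n \qquad (15)$$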
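For uncensored data with distinct observations, the constrained maximization of Σ log p_{i} is solved by equal jumps p_{i} = 1/n. As a sanity check, not a proof, the sketch below compares the log-likelihood at the uniform weights against many random points of the probability simplex. The sample size, number of trials, seed and helper names are arbitrary illustrative choices.

```python
import math
import random


def log_likelihood(weights):
    """Non-parametric log-likelihood: sum of log jump sizes."""
    return sum(math.log(p) for p in weights)


def random_simplex_point(n, rng):
    """Random positive weights summing to 1 (normalized exponential draws)."""
    draws = [rng.expovariate(1.0) + 1e-12 for _ in range(n)]  # offset avoids log(0)
    total = sum(draws)
    return [d / total for d in draws]


def uniform_is_best(n=5, trials=10_000, seed=0):
    """Return True if no sampled weight vector beats p_i = 1/n."""
    rng = random.Random(seed)
    best = log_likelihood([1.0 / n] * n)
    return all(
        log_likelihood(random_simplex_point(n, rng)) <= best + 1e-12
        for _ in range(trials)
    )
```

A call such as `uniform_is_best(n=5)` is expected to return `True`, in line with the inequality between the arithmetic and geometric means.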

**RESULTS AND DISCUSSION**

Suppose that X is the lifetime of a patient. Then P(X=∞) = e^{-θ}, which is the cure rate. On the other hand, P(X<∞) = 1-e^{-θ}, which is the probability of not being cured.

By using Non-Parametric Maximum Likelihood Estimation (NPMLE) method, we can estimate the cure parameter. For uncensored data we consider the following cases:

**Case-(a):** F_{0}(.), f_{0}(.) and θ are unknown. Here both the cured and the non-cured groups are observed.

Suppose that we have the data in the form (x_{i},ε_{i}), i = 1,2,…,n, where x_{i} denotes the survival time of the ith patient and ε_{i} is the cure indicator, equal to 1 if the ith patient is not cured and 0 otherwise, i.e., ε_{i} = 1_{{xi<∞}}.

Therefore, the non-parametric likelihood function is given by

Where, ΔF(x_{i}) = jump of F(.) at x_{i}

Where,

Therefore, the above likelihood function (16) can be written as

We want to maximize (18) subject to condition

The log-likelihood function becomes

By using Lagrange multiplier method we can maximize (20).
Adding Lagrange multiplier λ, we can write (20) as follows

Therefore, the non-parametric maximum likelihood estimators of θ and P_{i} are obtained by the solution of the following equations

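Now, setting the derivative of (21) with respect to θ equal to zero gives

Similarly, setting the derivative with respect to p_{i} equal to zero gives, for i = 1,2,…,n-1,

and setting the derivative with respect to p_{n} equal to zero gives

Multiplying (24) by p_{i} and summing over i from
1 to n, we obtain the following equation

Since Σ_{i=1}^{n} p_{i} = 1, equation (26) becomes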

From (23) and (27), we obtain

Now using the estimate of λ in (24), we get

Therefore,

**Comment:** Eq. (30) can be regarded as an estimating equation for p_{i}. It cannot be solved analytically but may be solved numerically, and its solution is the desired estimate of p_{i}.

Again using (28) in (25), we may obtain
the estimate of P_{n}

**Comment:** The above equation also cannot be solved analytically but may be solved numerically.

Finally the estimate of θ may be obtained from the numerical solution
of Eq. (23).

**Case-(b):** F_{0}(.), f_{0}(.) and θ are unknown and only the non-cured group is observed. Suppose that we have the data in the form (x_{i},ε_{i}), i = 1,2,…,n, where x_{i} denotes the survival time of the ith patient and ε_{i} is the cure indicator, equal to 1 if the ith patient is not cured and 0 otherwise, i.e., ε_{i} = 1_{{xi<∞}}. The non-parametric likelihood function can be written as

The log-likelihood function is given by

We want to maximize (31) subject to the condition
Σ_{i=1}^{n} p_{i} = 1, which can be done by the Lagrange multiplier method.
Adding a Lagrange multiplier λ to (31), we can write

Therefore, the non-parametric maximum likelihood estimators of θ and p_{i} are obtained by the solution of following equations

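Now, setting the derivative of the Lagrangian with respect to θ equal to zero gives

and setting the derivative with respect to p_{i} equal to zero gives, for
i = 1,2,…,n-1,

Multiplying (35) by p_{i} and summing over i from 1
to n we obtain

Since Σ_{i=1}^{n} p_{i} = 1,
the above equation becomes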

Multiplying Eq. (34) by θ and subtracting from (38),
we obtain

Therefore,

Using the estimate of λ in (35), we obtain

Thus,

**Comment:** This is an estimating equation for p_{i}, i = 1,2,…,n-1. It cannot be solved analytically but may be solved numerically, and its solution is the desired estimate of p_{i}.

Again, using (39) in (36), we obtain

We observe that the estimate of P_{n} may be obtained from the numerical
solution of the Eq. (41)

Finally the estimate of θ can be obtained from the numerical solution
of the Eq. (34).

**CONCLUSIONS**

Considering both the non-cured and the cured groups, with f_{0}(.) and F_{0}(.) assumed unknown, we found a non-parametric estimating equation for θ. We could not find an explicit solution for θ, but the estimating equation may be solved numerically by choosing an appropriate numerical method. We found the same result when only the non-cured group is considered. That is, in both cases we have found a non-parametric estimating equation for θ.
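As an illustration of what solving an estimating equation numerically can look like, the sketch below applies bisection to a deliberately simplified stand-in problem. It is not the estimating equation derived above: it assumes that only the cure indicators are used, with m cured patients out of n and P(X=∞) = e^{-θ}, so that the score for θ is U(θ) = -m + (n-m)/(e^{θ}-1), whose root has the closed form θ = log(n/m) and can be used to check the solver. All function names, the search bracket and the tolerance are hypothetical choices.

```python
import math


def score(theta, n_cured, n_total):
    """Score of the stand-in likelihood exp(-theta*m) * (1 - exp(-theta))**(n - m)."""
    return -n_cured + (n_total - n_cured) / math.expm1(theta)


def bisect_root(func, lower, upper, tol=1e-10, max_iter=200):
    """Find a root of func in [lower, upper] by bisection; func must change sign."""
    f_lower = func(lower)
    if f_lower * func(upper) > 0:
        raise ValueError("func must have opposite signs at the bracket ends")
    for _ in range(max_iter):
        mid = 0.5 * (lower + upper)
        f_mid = func(mid)
        if f_mid == 0.0 or upper - lower < tol:
            return mid
        if f_lower * f_mid < 0:
            upper = mid
        else:
            lower, f_lower = mid, f_mid
    return 0.5 * (lower + upper)


def estimate_theta(n_cured, n_total, lower=1e-8, upper=50.0):
    """Numerical root of the stand-in score; requires 0 < n_cured < n_total."""
    if not 0 < n_cured < n_total:
        raise ValueError("need at least one cured and one non-cured patient")
    return bisect_root(lambda th: score(th, n_cured, n_total), lower, upper)
```

For example, `estimate_theta(30, 100)` should agree with `math.log(100 / 30)` to within the bisection tolerance. Bisection is used here because it only needs a sign change; a Newton-type iteration would be a natural alternative when derivatives of the estimating equation are available.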