**INTRODUCTION**

Survival models that incorporate a cure fraction in the analysis are called
cure rate models. Recently, cure models have been widely used in analyzing
data from cancer studies. They are used for analyzing survival data from various
types of cancer in which a proportion of patients become free of any signs
or symptoms of the disease (Amiri *et al*., 2008).
The first cure rate model is the mixture model, constructed by
Boag (1949) and later developed by Berkson
and Gage (1952). In this model, a certain proportion π of patients
are cured while the remaining 1-π are not.

In this model the survival function for the entire population can be written in terms of the ‘mixture’ of the cured part plus the uncured part such that:
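In its standard form, consistent with the notation that follows, Eq. 1 reads:

```latex
S(t) = \pi + (1 - \pi)\, S_{u}(t)
\tag{1}
```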

where, S (t) and S_{u} (t) are the survival functions for the entire
population and the uncured patients, respectively. The survival function of
uncured patients can be estimated parametrically or non-parametrically which
leads to parametric or semi-parametric survival function, respectively, where
in the parametric case, a particular distribution for the failure time distribution
of uncured patients could be employed such as exponential, Weibull, Gompertz,
negative binomial and Generalized F distribution (Savadi-Oskouei
*et al*., 2010).

The literature on the mixture cure model can be found in Gamel
*et al*. (1990), Kuk and Chen (1992), Taylor
(1995), Peng and Dear (2000), Sy
and Taylor (2000), Peng and Carriere (2002), Uddin
*et al*. (2006), Liu *et al*. (2006a),
Yu and Peng (2008) and Abu Bakar
*et al*. (2009).

In Eq. 1, S_{u} (t) = 1-F (t), where F (t) is the cumulative distribution function of the failure times of the uncured patients. Furthermore, F (0) = 0 and F (∞) = 1, so that S (0) = 1 and S (∞) = π, the plateau value. The hazard function concomitant to this model is:
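For the mixture model this is the population hazard, i.e., the population density divided by the population survival function, which with S_{u}(t) = 1-F(t) takes the form:

```latex
h(t) = \frac{(1 - \pi)\, f(t)}{\pi + (1 - \pi)\,[1 - F(t)]}
```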

where, f (t) is the probability density function (p.d.f) attendant to F (t).

Despite the wide use of the mixture model in survival analysis, it has some
limitations, as discussed by Chen *et al*. (1999);
some of these drawbacks are:

• The proportional hazard structure, which is a desirable property for any survival model, cannot be constructed in the presence of covariates

• When covariates are included through the parameter π via a standard regression model, the mixture model yields improper posterior distributions for many types of non-informative improper priors, including the uniform prior for the regression coefficients

• The mixture model does not appear to describe the underlying biological process generating the failure time, at least in the context of cancer relapse, where cure rate models are frequently used

Chen *et al*. (1999) proposed the Bounded Cumulative
Hazard (BCH) model, developed by Yakovlev *et al*. (1993),
as a viable alternative to the mixture model. This alternative model is quite
attractive in several respects:

• It is derived from a natural biological motivation

• It has a proportional hazard structure through the cure rate parameter

• It is computationally very attractive

• It has a mathematical relationship with the mixture cure rate model

The bounded cumulative hazard model assumes that an individual in the population
is left with N cancer cells after the initial treatment. These cancer cells (often
called clonogens) may grow rapidly and replace the normal tissue later on (cancer
relapse). N may follow a Poisson, Bernoulli or negative binomial distribution
(Rodrigues *et al*., 2009). However, in this study
we consider N to follow the Poisson distribution with mean θ.

Let Z_{i}, i = 1, 2, …, N, denote the time for the ith clonogen to produce a detectable cancer mass. The time it takes cancer to relapse can then be defined by the random variable T = min {Z_{i}, 0≤i≤N}, where P (Z_{0} = ∞) = 1, the Z_{i}'s are independent and identically distributed (i.i.d.) with common distribution function F (t) and N is independent of the sequence Z_{1}, Z_{2}, …, Z_{N}. Therefore, the survival function for T, and hence for the population, is given by S (t) = P (T>t), the probability of no detectable cancer by the time t.
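Conditioning on N ~ Poisson (θ), with F (t) denoting the common distribution function of the Z_{i}'s, gives the bounded cumulative hazard form (Eq. 2):

```latex
S(t) = P(T > t)
     = \sum_{k=0}^{\infty} P(N = k)\,[1 - F(t)]^{k}
     = e^{-\theta}\, e^{\theta[1 - F(t)]}
     = e^{-\theta F(t)}
\tag{2}
```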

Since S (∞) = exp (-θ) and F (∞) = 1, then Eq.
2 is an improper survival function. Therefore, the cure fraction π
can be defined as follows:
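That is:

```latex
\pi = \lim_{t \to \infty} S(t) = e^{-\theta}
```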

As θ→∞, π→0, whereas as θ→0, π→1 (i.e., 0≤π≤1).

It should be noted that the first derivative of S (t) with respect to t is:
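Differentiating S (t) = e^{-θF(t)}:

```latex
\frac{dS(t)}{dt} = -\theta\, f(t)\, e^{-\theta F(t)}
```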

Since 1-S (t) plays the role of the cumulative distribution function for the entire population and 1-S (∞) = 1-e^{-θ} < 1, it is an improper distribution function; consequently, -dS (t)/dt is an improper probability density function as well.

**MATERIALS AND METHODS**

Suppose that T is a random variable with probability density function f (t; θ), where θ is to be estimated, and that t_{1}, t_{2}, …, t_{n} is a random sample of size n. We are interested in the likelihood function for left censored data because it enables us to compute the Maximum Likelihood Estimates (MLE) needed to fit a model to censored data. To analyze such data, let α_{i} and c_{i} be indicators for censoring and cure status, respectively, where for the ith patient:
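Consistent with how α_{i} and c_{i} are used below, the indicators are:

```latex
\alpha_i =
\begin{cases}
1, & \text{if } t_i \text{ is an observed (uncensored) lifetime} \\
0, & \text{if } t_i \text{ is left-censored}
\end{cases}
\qquad
c_i =
\begin{cases}
1, & \text{if patient } i \text{ is uncured} \\
0, & \text{if patient } i \text{ is cured}
\end{cases}
```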

If α_{i} = 1, then c_{i} = 1 but if α_{i} = 0, then c_{i} is not observed and it can be either one or zero, assuming that censoring is independent of failure times.

In the parametric maximum likelihood method, the cumulative distribution function F (.) and the probability density function f (.) for the entire population are known. Thus, given α_{i} and c_{i} (i.e., the complete data are available), the joint probability density function can be written as:
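Writing π = e^{-θ} for the cure fraction, one consistent form of this joint density, in the style of Peng and Dear (2000), is:

```latex
L(\theta, \lambda) = \prod_{i=1}^{n} \pi^{\,1 - c_i}\,
\left[(1 - \pi)\, f_{u}(t_i)\right]^{c_i \alpha_i}
\left[(1 - \pi)\, S_{u}(t_i)\right]^{c_i (1 - \alpha_i)}
```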

Consequently, the complete log likelihood function is:
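With π = e^{-θ}, the complete log-likelihood (Eq. 6) can be written as:

```latex
\ell_c(\theta, \lambda) = \sum_{i=1}^{n}
\Big[(1 - c_i)\log \pi + c_i \log(1 - \pi)
+ c_i \alpha_i \log f_{u}(t_i)
+ c_i (1 - \alpha_i) \log S_{u}(t_i)\Big]
\tag{6}
```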

where, f_{u} (t_{i}) and S_{u} (t_{i}) are the p.d.f and the survival function for the uncured patients, respectively.

This study considers the exponential distribution for S_{u} (t_{i}) and f_{u} (t_{i}) such that:
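That is:

```latex
S_{u}(t) = e^{-\lambda t}, \qquad f_{u}(t) = \lambda\, e^{-\lambda t}, \qquad \lambda > 0
```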

A datum t_{i} is said to be left-censored if the event occurs before an observed time point but it is unknown exactly when it happened; for example, the starting date of a cancer clinical trial is fixed, but the exact time at which a patient died before that point is unknown. In the case of left censoring, the survival-function contribution of the uncured patients becomes S_{u} (t) = 1-e^{-λt}.

Therefore, the log-likelihood function becomes:
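Substituting π = e^{-θ}, the exponential f_{u} and the left-censored contribution 1-e^{-λt} into the complete log-likelihood gives one consistent form:

```latex
\ell(\theta, \lambda) = \sum_{i=1}^{n}
\Big[-\theta(1 - c_i) + c_i \log\!\left(1 - e^{-\theta}\right)
+ c_i \alpha_i \left(\log \lambda - \lambda t_i\right)
+ c_i (1 - \alpha_i) \log\!\left(1 - e^{-\lambda t_i}\right)\Big]
```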

The solutions of:
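the score equation for θ (assuming the left-censored log-likelihood above):

```latex
\frac{\partial \ell}{\partial \theta}
= \sum_{i=1}^{n}\left[-(1 - c_i) + \frac{c_i\, e^{-\theta}}{1 - e^{-\theta}}\right] = 0
\tag{7}
```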

and:
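the score equation for λ:

```latex
\frac{\partial \ell}{\partial \lambda}
= \sum_{i=1}^{n}\left[c_i \alpha_i \left(\frac{1}{\lambda} - t_i\right)
+ c_i (1 - \alpha_i)\, \frac{t_i\, e^{-\lambda t_i}}{1 - e^{-\lambda t_i}}\right] = 0
\tag{8}
```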

are the desired estimates of θ and λ, where,

Solving Eq. 7 implies:
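Assuming the θ-score equation above, the closed-form solution is:

```latex
\hat{\theta} = \log\!\left(\frac{n}{\sum_{i=1}^{n}(1 - c_i)}\right),
\qquad \text{equivalently} \qquad
e^{-\hat{\theta}} = \frac{1}{n}\sum_{i=1}^{n}(1 - c_i)
\tag{9}
```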

While Eq. 8 can be solved numerically since no explicit solution can be found.

As the cure status c_{i} is not fully observed, the Expectation Maximization (EM) algorithm will be employed.

Before implementing the EM algorithm, let us define g_{i} as the
conditional expectation that the ith patient is uncured, given the current estimates,
α_{i} and the survival function of uncured patients, S_{u}
(t_{i}) (Peng and Dear, 2000):
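Following Peng and Dear (2000), with π = e^{-θ}:

```latex
g_i = E\!\left(c_i \mid t_i, \alpha_i\right)
= \alpha_i + (1 - \alpha_i)\,
\frac{(1 - \pi)\, S_{u}(t_i)}{\pi + (1 - \pi)\, S_{u}(t_i)}
```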

For censored individuals α_{i} = 0 and hence the equation giving
g_{i} can be re-written as follows:
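That is:

```latex
g_i = \frac{(1 - \pi)\, S_{u}(t_i)}{\pi + (1 - \pi)\, S_{u}(t_i)}
```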

For simplicity, let p_{i} be the probability that the ith patient is cured, such that p_{i} = E (1-c_{i}) = 1-g_{i}:
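For a censored observation (α_{i} = 0) this gives:

```latex
p_i = 1 - g_i = \frac{\pi}{\pi + (1 - \pi)\, S_{u}(t_i)},
\qquad \pi = e^{-\theta}
```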

**THE EM ALGORITHM**

The EM algorithm is an iterative optimization method which alternates between performing
an Expectation (E) step which computes the expectation of the log-likelihood
function using the current estimate for the latent variables and Maximization
(M) step which computes parameters maximizing the expected log-likelihood found
on the E step. These parameter-estimates are then used to determine the distribution
of the latent variables in the next E step. The EM algorithm (and its faster
variant ordered subset expectation maximization) is widely used in many different
fields, especially in data clustering in machine learning, computer vision and
medicine (Liu *et al*., 2006b; Safarinejadian
*et al*., 2009).

Suppose that the data vector has the form (t_{i}, α_{i}, c_{i}), i = 1…n. The observed data are the lifetimes (t_{i}) and censoring statuses (α_{i}) for i = 1…n, together with the cure statuses (c_{i} = 1) for i = 1…m, while the unobserved data are the cure statuses (c_{i}) for i = (m+1)…n, m<n.

In the presence of unobserved data (c_{i}), only a function of the complete-data vector is observed. However, in the E-Step we find the expected value of the log likelihood function given by Eq. 6 as follows:

are the sufficient statistics for the parameters vector (λ, θ)^{T}.

It follows that the log-likelihood based on complete data is linear in complete data sufficient statistics and then the E-step requires the computation of :

and

Let:

For the M-step we can use the complete-data maximum likelihood estimates of
(λ, θ) given by Eq. 8 and 9, substituting the expectations derived in the E-step
for the complete-data sufficient statistics. On these grounds, the
maximum likelihood equation of θ implies:
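With the expectations g_{i} substituted for the unobserved c_{i}, the θ-update takes the same closed form as Eq. 9:

```latex
\hat{\theta} = \log\!\left(\frac{n}{\sum_{i=1}^{n}\left(1 - g_i\right)}\right)
\tag{14}
```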

While Eq. 8 could be re-rewritten as follows:
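Similarly, substituting g_{i} for c_{i} in the λ-score equation:

```latex
\sum_{i=1}^{n}\left[g_i \alpha_i \left(\frac{1}{\lambda} - t_i\right)
+ g_i (1 - \alpha_i)\, \frac{t_i\, e^{-\lambda t_i}}{1 - e^{-\lambda t_i}}\right] = 0
\tag{15}
```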

Thus, the E-step involves evaluating the sufficient statistics given by Eq.
11, 12 and 13, together with p_{i},
using some initial values (θ^{o}, λ^{o}) for the parameters;
the M-step then involves substituting these values into Eq.
14 and solving Eq. 15 numerically with respect to λ.
The estimates at the convergence iteration, say the (t+1)th, are the desired estimates of θ
and λ and the estimated cure fraction is exp (-θ^{t+1}).

**CONCLUSION**

We investigated maximum likelihood estimation of the cure rate based on the bounded cumulative hazard model when the exponential distribution is used to represent the survival function of the uncured patients. A novel development of the EM algorithm was used to obtain maximum likelihood estimates when the data set contains left-censored observations.