Abstract: This article considers the problem of parametric treatment comparison when partly interval-censored failure time data are present. Partly interval-censored failure time data consist of exact observations and interval-censored observations. Such data often arise in clinical trials and health studies that require periodic follow-up of patients. The authors construct a score test and a likelihood ratio test for this type of failure time data under Weibull distributions using a multiple imputation technique. A simulation study and a modified secondary data set from a breast cancer study are used to assess the proposed tests and to illustrate the differences between them. The results indicate that the presented procedure works well for both tests, but the likelihood ratio test is better than the score test in certain situations.
INTRODUCTION
Partly Interval-censored (PIC) data often occur in medical and health studies that involve periodic examinations. With PIC data, the failure times are observed exactly for a portion of the participants, whereas for the remaining participants the failure times are known only to lie within an interval. An example of this type of data is provided by the Framingham Heart Disease study. In this study, the event of interest was the time of first occurrence of the subcategory angina pectoris in coronary heart disease patients. For a number of patients, the event times were recorded exactly, but for the remaining patients the times were known only to lie between two clinical examinations (Odell et al., 1992). Research on PIC data is ongoing but limited to date. Huang (1999) derived the asymptotic properties of the nonparametric estimator of the distribution function with PIC data. Kim (2003) studied maximum likelihood estimation for regression analysis of PIC data using the proportional hazards model. Zhao et al. (2008) developed a nonparametric test approach for PIC data, based on the same idea as Sun et al. (2005). Multiple Imputation (MI) has proven useful in solving many statistical problems. However, to the best of the authors' knowledge, MI has not been applied to PIC data, and the authors therefore investigate its application to parametric tests when PIC data exist.
The objective of this article is to present parametric tests for PIC failure time data, based on an MI technique, that can compare the survival functions of two or more samples.
PARTLY INTERVAL-CENSORED DATA
Consider scheduled follow-ups in a medical study, where x1, x2, ···, xm are the inspection times. Suppose that after the initial follow-up inspection time, a patient may be absent from each subsequent follow-up with a given probability between 0 and 1. Assume that the exact failure time is observed for n1 participants and that an interval-censored failure time is observed for the remaining n2 = n-n1 participants. An exact failure time means that the patient has the event of interest at an inspection time, or that the patient's condition necessitates a hospital examination at which the event of interest is recorded exactly. An interval-censored failure time means that the event of interest occurs between two inspection times, (Li, Ri), where Li, Ri ∈ (x1, ···, xm) and Li<Ri with probability one. If the failure occurs before the first examination, a left-censored observation results, i.e., ti ∈ (0, Li). If the patient has not failed by the final examination, a right-censored observation results, i.e., ti ∈ (Ri, ∞). Additionally, censoring is presumed independent of the examination time. For the ith interval-censored patient, define
(1) |
PARAMETRIC TESTS FOR EXACT DATA
This section briefly reviews the score test and the likelihood ratio test for exact data. Suppose that p samples are generated from Weibull distributions, where the jth sample has parameter θj = (aj, bj) and sample size nj, with n = n1+ ··· +np. Assume that the survival function has the form:
(2) |
where a is the shape parameter and b is the scale parameter.
The ultimate objective is to test the hypothesis H0: S1(t) = ··· = Sp(t). Under the Weibull model, this amounts to testing H0: a1 = ··· = ap; b1 = ··· = bp against H1: a1 ≠ ··· ≠ ap; b1 ≠ ··· ≠ bp. The likelihood function under H0 is:
(3) |
and the likelihood function under H1 is:
(4) |
To apply the score test, define the score U(θ) as:
(5) |
The score test is based on the fact that, under the null hypothesis, the score is asymptotically normally distributed with mean zero and variance V(θ), where:
(6) |
is the observed information matrix. The score test statistic under the null hypothesis H0 is:
(7) |
which has an asymptotic chi-square distribution with v degrees of freedom, where v is the number of restrictions imposed by the null hypothesis.
For the likelihood ratio test, the test statistic is:
(8) |
which has an asymptotic χ2 distribution with two degrees of freedom under H0 (Kalbfleisch and Prentice, 2002).
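The likelihood ratio test for exact data can be illustrated with a short sketch for the two-sample case. Because the displayed form of Eq. 2 is not reproduced here, the code assumes the common parametrization S(t) = exp(-(t/b)^a); the function names `weibull_mle`, `weibull_loglik` and `lr_test` are hypothetical and are not taken from the article. The shape parameter is found by bisection on the profile score rather than by any method stated in the article.

```python
import math
import random


def weibull_mle(times):
    """Return (a, b) maximizing the Weibull likelihood.

    Assumption: S(t) = exp(-(t/b)**a), with a the shape and b the scale.
    """
    n = len(times)
    logs = [math.log(t) for t in times]
    mean_log = sum(logs) / n

    def profile_score(a):
        # Derivative of the profile log-likelihood in a, divided by n.
        w = [t ** a for t in times]
        weighted = sum(wi * li for wi, li in zip(w, logs)) / sum(w)
        return 1.0 / a + mean_log - weighted

    lo, hi = 1e-3, 1.0
    while profile_score(hi) > 0:      # expand until the root is bracketed
        hi *= 2.0
    for _ in range(100):              # plain bisection on the shape
        mid = 0.5 * (lo + hi)
        if profile_score(mid) > 0:
            lo = mid
        else:
            hi = mid
    a = 0.5 * (lo + hi)
    b = (sum(t ** a for t in times) / n) ** (1.0 / a)
    return a, b


def weibull_loglik(times, a, b):
    """Log-likelihood of exact observations under the assumed Weibull form."""
    return sum(
        math.log(a / b) + (a - 1.0) * math.log(t / b) - (t / b) ** a
        for t in times
    )


def lr_test(sample1, sample2):
    """Two-sample likelihood ratio statistic and its chi-square(2) p-value."""
    pooled = list(sample1) + list(sample2)
    l0 = weibull_loglik(pooled, *weibull_mle(pooled))            # under H0
    l1 = sum(weibull_loglik(s, *weibull_mle(s)) for s in (sample1, sample2))
    stat = max(2.0 * (l1 - l0), 0.0)
    p_value = math.exp(-stat / 2.0)   # survival function of chi-square(2)
    return stat, p_value


if __name__ == "__main__":
    random.seed(0)
    # random.weibullvariate takes (scale, shape) in that order.
    x = [random.weibullvariate(5.0, 1.5) for _ in range(60)]
    y = [random.weibullvariate(7.0, 1.5) for _ in range(60)]
    print(lr_test(x, y))
```

With two samples the null hypothesis imposes two restrictions, which is why the p-value uses the closed-form tail exp(-x/2) of the chi-square distribution with two degrees of freedom.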
PARAMETRIC TESTS FOR PARTLY INTERVAL-CENSORED DATA
Here, the authors present a score test and a likelihood ratio test for PIC data using the MI method.
Assumption: The authors consider a survival study that includes n independent participants from p different treatments. Let Ti>0 be a random variable denoting the failure time of interest for the ith participant and let nl be the number of participants from treatment l, with distribution function Fl(t) and survival function Sl(t) = 1-Fl(t); l = 1, ···, p and n1+ ··· + np = n. Presume that PIC data are available for the lth treatment, meaning that the exact failure time is observed for N1 participants and given by the form:
The interval-censored failure time is observed for the remaining N2 participants and given by the form {(Li, Ri], i = N1+1, ···, n; N2 = n-N1}. Here, (Li, Ri] denotes the interval in which Ti is observed and Li, Ri are positive random variables independent of Ti such that Li<Ri with probability one. When Li<Ri = ∞, the failure time Ti is regarded as a right-censored observation.
The ultimate aim is to test the hypothesis H0: S1(t) = ··· = Sp(t) to determine whether the p treatments could have resulted from an identical failure time distribution. Let S0(t) denote the common survival function under H0.
PMLE for partly interval-censored data: In this article, the authors suppose that Ti follows the Weibull distribution. Assume that the survival function has the form given in Eq. 2 and PIC data are available. Under these assumptions and from Eq. 1, the likelihood function is:
(9) |
Under the Weibull distribution, the maximum likelihood estimates of the parameters a and b are the solution of:
(10) |
that can be solved by the Newton-Raphson method.
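As an illustration of how such a likelihood can be evaluated and maximized, the sketch below assumes the usual form for PIC data: each exact observation contributes the density f(t) and each interval-censored observation contributes S(L) - S(R), with R = ∞ for right censoring. It also assumes the parametrization S(t) = exp(-(t/b)^a). The names `cum_hazard`, `pic_loglik` and `fit_pic` are hypothetical, and the maximizer is a crude compass search on (log a, log b) used as a stand-in for Newton-Raphson, not the authors' procedure.

```python
import math


def cum_hazard(t, a, b):
    """Cumulative hazard (t/b)**a under the assumed Weibull form."""
    if t <= 0.0:
        return 0.0
    if math.isinf(t):
        return math.inf
    z = a * math.log(t / b)
    return math.exp(z) if z < 700.0 else math.inf   # guard against overflow


def pic_loglik(exact, intervals, a, b):
    """Log-likelihood for exact times plus (left, right] censoring intervals."""
    ll = 0.0
    for t in exact:
        # log f(t) = log(a/b) + (a-1) log(t/b) - (t/b)^a
        ll += math.log(a / b) + (a - 1.0) * math.log(t / b) - cum_hazard(t, a, b)
    for left, right in intervals:
        # log[S(left) - S(right)]; right = inf gives a right-censored term
        prob = math.exp(-cum_hazard(left, a, b)) - math.exp(-cum_hazard(right, a, b))
        if prob <= 0.0:
            return -math.inf
        ll += math.log(prob)
    return ll


def fit_pic(exact, intervals, a=1.0, b=1.0, step=0.5, tol=1e-6):
    """Compass search on (log a, log b); returns (a_hat, b_hat, loglik)."""
    x = [math.log(a), math.log(b)]
    best = pic_loglik(exact, intervals, math.exp(x[0]), math.exp(x[1]))
    while step > tol:
        improved = False
        for j in (0, 1):
            for delta in (step, -step):
                y = list(x)
                y[j] += delta
                val = pic_loglik(exact, intervals, math.exp(y[0]), math.exp(y[1]))
                if val > best:
                    x, best, improved = y, val, True
        if not improved:
            step /= 2.0                # shrink the step when no move helps
    return math.exp(x[0]), math.exp(x[1]), best
```

Working on the log scale keeps both parameters positive without explicit constraints. A derivative-based method such as Newton-Raphson would normally converge in far fewer likelihood evaluations.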
A method based on the MI technique: Here, an imputation technique is used to handle the censored data. Exact failure times are imputed from the partly interval-censored data and the parametric tests for exact data are then applied. The procedure does not depend on whether the censoring intervals are finite or infinite. For right-censored data, an exact value can be imputed when the infinite interval contains part of the data set. When no data lie inside the infinite interval, the left endpoint of the interval is taken as the exact value. The imputation scheme used by the authors is Rubin's MI (Rubin, 1987) and the procedure is as follows. Let Y be a pre-determined positive integer and let y be an integer that satisfies 1≤y≤Y. For y = 1, ···, Y perform:
Step 1: Let Tiy be a random sample taken from the conditional probability function:
(11) |
for i = 1, ···, N2 and k = 1, ···, M. A set of exact data {tyi, i = 1, ···, N2} is thereby obtained
Step 2: Combine the exact data imputed from the conditional probability function in Step 1 with the exact data in the original data set. This step results in the exact data {ti, i = 1, ···, N1; tyi, i = N1+1, ···, n}
Step 3: Apply a parametric test for exact data, such as the score test or the likelihood ratio test
Step 4: Repeat Steps 1-3 for each y = 1, ···, Y
Step 5a: For the score test, the covariance matrix V is given by:
(12) |
The simulation results indicate that the resulting test statistic for comparing the p treatments has an approximate chi-square distribution with two degrees of freedom, χ2(2), under the null hypothesis
Step 5b: For the likelihood ratio test, the simulation results likewise indicate that the test statistic for comparing the p treatments has an approximate chi-square distribution with two degrees of freedom, χ2(2), under the null hypothesis
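Steps 1 and 2 can be sketched as follows. Since the conditional probability function of Eq. 11 is not reproduced here, this sketch swaps in a simpler parametric draw: each interval-censored time is sampled from a Weibull distribution, with assumed form S(t) = exp(-(t/b)^a), truncated to its interval by inverting the survival function. The function names and the parameters `a`, `b` and `seed` are hypothetical, and the rule for combining the Y test statistics in Step 5 is deliberately left out because its formula is not available.

```python
import math
import random


def impute_interval(left, right, a, b, rng):
    """Draw one time from the assumed Weibull law truncated to (left, right]."""
    s_left = math.exp(-((left / b) ** a))
    s_right = 0.0 if math.isinf(right) else math.exp(-((right / b) ** a))
    # A uniform draw between S(right) and S(left), mapped back through S^{-1}.
    s = s_left - rng.random() * (s_left - s_right)
    return b * (-math.log(s)) ** (1.0 / a)


def completed_datasets(exact, intervals, a, b, Y=10, seed=0):
    """Return Y completed data sets: original exact times plus imputed times."""
    rng = random.Random(seed)
    return [
        list(exact) + [impute_interval(l, r, a, b, rng) for l, r in intervals]
        for _ in range(Y)
    ]
```

Each completed data set can then be passed to a test for exact data (Step 3). In practice `a` and `b` would be estimates obtained from the PIC data rather than known values.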
SIMULATION STUDY
To examine the accuracy of both tests proposed in the previous section, the authors performed a simulation study comparing two treatments under a proportional hazards model. Failure times were generated from a Weibull distribution with shape parameter a and scale parameter b, with survival function S1(t) of the Weibull form in Eq. 2 for treatment 1 and S2(t) = [S1(t)]^exp(β) for treatment 2, where β is a parameter that measures the difference between the two survival functions. Each treatment has both exact and interval-censored observations. The total sample size of the two treatment groups is taken to be n = 50, 100, 200 and β = 0.9, 0.5, 0, -0.5, -0.9.
The following algorithm is used to generate interval-censored data:
• Generate failure times Ti, i = 1, ···, n, with n = n1+n2, from the Weibull distribution
• Construct the follow-up schedule by pre-specifying the examination times. Each participant is assumed to be observed at 11 follow-up times, τr = 1+1.5(r-1), r = 1, ···, 11
• Assume that after the initial follow-up time, a participant may be absent from each subsequent follow-up with a given probability between 0 and 1
• Randomly choose n1 participants whose Ti lies between the first and the last follow-up times and treat their Ti, i = 1, ···, n1, as exact failure times
• For the remaining n2 participants, the interval-censored time is defined as the shortest interval between two attended examination times that contains the participant's generated true failure time. The left endpoint is set to zero when the true failure time occurs before the first attended examination and the right endpoint is set to infinity when the true failure time is greater than the last attended examination time
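The generation algorithm above can be sketched for a single treatment group as follows. The sketch assumes S1(t) = exp(-(t/b)^a), under which S2(t) = [S1(t)]^exp(β) is again Weibull with the same shape and scale b·exp(-β/a). The function name `simulate_pic` and the parameter name `p_absent` (the absence probability, whose symbol is not available) are hypothetical, and capping the number of exact observations at the number of eligible participants is an added safeguard rather than part of the described algorithm.

```python
import math
import random


def simulate_pic(n, n_exact, a, b, beta=0.0, p_absent=0.2, seed=0):
    """Simulate PIC data for one group; returns (exact_times, intervals)."""
    rng = random.Random(seed)
    exams = [1.0 + 1.5 * (r - 1) for r in range(1, 12)]   # 1, 2.5, ..., 16
    scale = b * math.exp(-beta / a)                       # proportional hazards
    times = [rng.weibullvariate(scale, a) for _ in range(n)]

    # Exact observations are chosen among times inside the follow-up window.
    eligible = [i for i, t in enumerate(times) if exams[0] <= t <= exams[-1]]
    exact_idx = set(rng.sample(eligible, min(n_exact, len(eligible))))

    exact, intervals = [], []
    for i, t in enumerate(times):
        if i in exact_idx:
            exact.append(t)
            continue
        # The first examination is always attended; later ones may be missed.
        attended = [exams[0]] + [x for x in exams[1:] if rng.random() >= p_absent]
        left = max((x for x in attended if x < t), default=0.0)
        right = min((x for x in attended if x >= t), default=math.inf)
        intervals.append((left, right))
    return exact, intervals


if __name__ == "__main__":
    ex, iv = simulate_pic(n=50, n_exact=15, a=1.5, b=6.0, beta=0.5)
    print(len(ex), len(iv), iv[:3])
```

Calling the function once with `beta=0.0` and once with a nonzero `beta` gives the two treatment groups; a larger `p_absent` produces wider censoring intervals.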
RESULTS AND DISCUSSION
The results of the simulation study are based on 1000 replications and Y = 10. To distinguish between the two tests, let STI denote the score test with MI and let RTI denote the likelihood ratio test with MI. Table 1 displays the estimated power and size of both tests at the 0.05 significance level, based on simulated data with 30% exact observations, for different sample sizes and different values of β and of the absence probability.
Table 1: Estimated power and size of the tests with 30% exact observations, 10 multiple imputations and 1000 replications at the 0.05 significance level
As anticipated, the power of both tests increases as the sample size increases, indicating different effects of n and the absence probability.
AN APPLICATION
The authors made several modifications to the breast cancer data presented by Klein and Moeschberger (1997) to test the suitability of the new tests. Several exact observations were added by generating random failure times that lie within the range of the given data. The data set is shown in Table 2.
The data set consists of 124 participants, of whom 61 patients were treated by radiation therapy (R) and 63 patients were treated by radiation therapy and adjuvant chemotherapy (R+C). This study was executed to compare the cosmetic effects of R against R+C on women with early breast cancer and the event of interest was the time to the first occurrence of breast retraction. The patients were observed at clinic visits every 4 or 6 months and the actual dates of the event were recorded exactly if available. The interval of events was noted when actual dates were unavailable. The target of this study is to compare two treatments with respect to their cosmetic effects.
Both tests, STI and RTI, were applied to examine the difference in cosmetic effect between the two treatments. The obtained values of the test statistics are 7.655 and 7.262, with p-values of 0.022 and 0.026, respectively. These results indicate a significant difference in cosmetic effect between the treatments. In addition, the STI p-value is smaller than the RTI p-value, indicating that the STI is more sensitive in detecting the difference between the treatments. Notably, the sample size is large, n = 124, so this result coincides with the simulation results.
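The reported p-values can be checked directly, since the tail probability of a chi-square distribution with two degrees of freedom has the closed form P(X > x) = exp(-x/2). The short sketch below uses only the two statistics quoted above; the function name `chi2_2df_pvalue` is hypothetical.

```python
import math


def chi2_2df_pvalue(stat):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-stat / 2.0)


for name, stat in (("STI", 7.655), ("RTI", 7.262)):
    print(name, round(chi2_2df_pvalue(stat), 3))
# prints:
# STI 0.022
# RTI 0.026
```

Both values agree with the p-values reported for the breast cancer data, which is consistent with the statistics being referred to the χ2(2) distribution.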
Table 2: Time to cosmetic deterioration (in months) in breast cancer patients who received two treatments
Fig. 1: PMLE of the survival function of time to cosmetic deterioration
Furthermore, Fig. 1 shows the estimated survival functions for the two treatments, R and R+C, obtained under the Weibull distribution. This figure indicates that patients in the R+C group develop breast retraction earlier than patients in the R group.
CONCLUSION
In this article, the authors proposed parametric tests based on the MI technique, namely STI and RTI, for PIC failure time data under the Weibull distribution to compare the survival functions of p treatments. Simulation results indicate that the tests work well in the situations considered in the study. The comparison of the tests revealed that the power of each test is affected by the accuracy of its estimated size. Therefore, the performance of a test can be evaluated according to its estimated size. On this basis, RTI was found to be better than STI when the sample size is 50 and STI better than RTI when the sample size is greater than 50. Additionally, the proposed parametric tests were used successfully to detect the difference between the R and R+C treatments.