
Journal of Applied Sciences

Year: 2013 | Volume: 13 | Issue: 4 | Page No.: 621-626
DOI: 10.3923/jas.2013.621.626
Parametric Tests for Partly Interval-censored Failure Time Data under Weibull Distribution via Multiple Imputation
Azzah Mohammad Alharpy and Noor Akma Ibrahim

Abstract: The statistical problem considered in this article is the parametric comparison of treatments when partly interval-censored failure time data exist. Partly interval-censored failure time data are composed of exact observations and interval-censored observations. Such data often arise in clinical trials and health studies that require periodic follow-up of patients. The authors constructed a score test and a likelihood ratio test for this type of failure time data under the Weibull distribution using a multiple imputation technique. A simulation study and a modified secondary data set from a breast cancer study are used to assess the proposed tests and to illustrate the differences between them. The results indicate that the presented procedure works well for both tests, but the likelihood ratio test is better than the score test in certain situations.


How to cite this article
Azzah Mohammad Alharpy and Noor Akma Ibrahim, 2013. Parametric Tests for Partly Interval-censored Failure Time Data under Weibull Distribution via Multiple Imputation. Journal of Applied Sciences, 13: 621-626.

Keywords: partly interval-censored data, multiple imputation, likelihood ratio test, breast cancer study, score test, Weibull distribution

INTRODUCTION

Partly Interval-censored (PIC) data often occur in medical and health studies that include periodic examinations. With PIC data, the failure times are observed exactly for a portion of the participants, whereas for the remaining participants the failure times are known only to lie within an interval. An example of this type of data is provided by the Framingham Heart Disease study, in which the event of interest was the time of first occurrence of the subcategory angina pectoris in coronary heart disease patients. For a number of patients, the event times were recorded exactly, but for the remaining patients the times were known only to lie between two clinical examinations (Odell et al., 1992). Research on PIC data is ongoing but limited to date. Researchers who have addressed PIC data include Huang (1999), who derived the asymptotic properties of the nonparametric estimator of the distribution function with PIC data. Kim (2003) studied maximum likelihood estimation for regression analysis of PIC data using the proportional hazards model. Zhao et al. (2008) developed a nonparametric test for PIC data based on the idea of Sun et al. (2005). Multiple Imputation (MI) has proven useful in solving many statistical problems. However, to the best of the authors' knowledge, MI has not been applied to PIC data, and the authors therefore investigate its application to parametric tests when PIC data exist.

The objective of this article is to present parametric tests for PIC failure time data, based on an MI technique, that compare the survival functions of two or more samples.

PARTLY INTERVAL-CENSORED DATA

Consider scheduled follow-ups in a medical study, where x1, x2,…, xm are the inspection times. Suppose that after the initial follow-up inspection, the patient may be absent from each subsequent follow-up with a fixed probability between 0 and 1. Let Ti>0 be a random variable denoting the failure time of the ith subject. Additionally, let n be the number of participants, whose failure times follow a continuous distribution with density function f(t, θ), where θ is a parameter vector.

Assume that the exact failure time is observed for n1 participants and that an interval-censored failure time is observed for the remaining n2 = n-n1 participants. An exact failure time arises when the patient has the event of interest at an inspection time, or when the patient’s condition necessitates a hospital examination at which the event of interest is recorded exactly. An interval-censored failure time means that the event of interest occurs between two inspection times, (Li, Ri), where Li, Ri ∈ {x1,…, xm} and Li<Ri with probability one. If the failure occurs before the first examination, a left-censored observation is obtained, i.e., ti ∈ (0, Li). If the failure has not occurred by the final examination, a right-censored observation is obtained, i.e., ti ∈ (Ri, ∞). Additionally, censoring is presumed to be independent of the examination time. For the ith interval-censored patient, the contribution to the likelihood is the probability that the failure time falls in the observed interval. Then, the likelihood function for θ is:

(1)
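The display for Eq. 1 is not reproduced in this text. As a sketch only, the standard likelihood for partly interval-censored data can be written as below. It assumes that subjects 1 to n1 are the exact observations and that F(t, θ) is the distribution function belonging to f(t, θ); both the ordering and the symbol F are assumptions made here for illustration.

```latex
L(\theta) \;=\; \prod_{i=1}^{n_1} f(t_i, \theta)
\;\prod_{i=n_1+1}^{n} \bigl[ F(R_i, \theta) - F(L_i, \theta) \bigr]
```

Written with intervals of the form (L_i, R_i], a left-censored observation corresponds to L_i = 0, so the factor reduces to F(R_i, θ), and a right-censored observation corresponds to R_i = ∞, so the factor reduces to 1 - F(L_i, θ).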

PARAMETRIC TESTS FOR EXACT DATA

This section briefly reviews the score test and the likelihood ratio test for exact data. Suppose that p samples are generated from Weibull distributions with parameters θj = (aj, bj) and sample size nj for the jth sample, where n = n1+ ··· +np. Assume that the survival function has the form:

(2)

where a is the shape parameter and b is the scale parameter.
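The display for Eq. 2 is not reproduced in this text. One common parameterization that matches the description of a as the shape and b as the scale is sketched below. This choice is an assumption; the authors may have used a different but equivalent form, for example one in which b enters as a rate.

```latex
S(t) = \exp\left\{ -\left( \frac{t}{b} \right)^{a} \right\},
\qquad
f(t) = \frac{a}{b} \left( \frac{t}{b} \right)^{a-1}
       \exp\left\{ -\left( \frac{t}{b} \right)^{a} \right\},
\qquad t > 0,\; a > 0,\; b > 0
```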

The ultimate objective is to test the hypothesis H0: S1(t) = ··· = Sp(t). Under the Weibull assumption, this amounts to testing H0: a1 = ··· = ap; b1 = ··· = bp against H1: a1 ≠ ··· ≠ ap; b1 ≠ ··· ≠ bp. The likelihood function under H0 is:

(3)

and the likelihood function under H1 is:

(4)

To apply the score test, the score is U(θ) where:

(5)

The score test is based on the fact that, under the null hypothesis, the score is asymptotically normally distributed with mean zero and variance V(θ), where:

(6)

is the observed information matrix. The score test statistic under the null hypothesis H0 is:

(7)

which asymptotically follows a chi-square distribution with v degrees of freedom, where v is the number of restrictions imposed by the null hypothesis and the statistic is evaluated at the maximum likelihood estimate of θ under H0.

For the likelihood ratio test, the test statistic is:

(8)

which asymptotically follows a χ2 distribution with two degrees of freedom under H0 (Kalbfleisch and Prentice, 2002).
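As an illustration of the likelihood ratio test for exact data with two samples, the following minimal sketch fits the Weibull model by maximum likelihood and compares the pooled fit with the separate fits. It assumes the parameterization S(t) = exp{-(t/b)^a}, which is not confirmed by the text, and it solves the profile score equation for a by bisection, a simple stand-in for Newton-Raphson. The function names, seed and parameter values are hypothetical.

```python
import math
import random


def weibull_mle(times):
    """Return (a, b, loglik) for exact data under the assumed S(t) = exp(-(t/b)**a)."""
    n = len(times)
    logs = [math.log(t) for t in times]
    mean_log = sum(logs) / n

    def profile_score(a):
        # Derivative in a of the profile log-likelihood, divided by n.
        w = [t ** a for t in times]
        return 1.0 / a + mean_log - sum(wi * li for wi, li in zip(w, logs)) / sum(w)

    lo, hi = 1e-3, 1.0
    while profile_score(hi) > 0:      # expand until the root is bracketed
        hi *= 2.0
    for _ in range(200):              # bisection; the score is decreasing in a
        mid = 0.5 * (lo + hi)
        if profile_score(mid) > 0:
            lo = mid
        else:
            hi = mid
    a = 0.5 * (lo + hi)
    b = (sum(t ** a for t in times) / n) ** (1.0 / a)
    loglik = sum(math.log(a / b) + (a - 1) * math.log(t / b) - (t / b) ** a
                 for t in times)
    return a, b, loglik


def lr_test(sample1, sample2):
    """Likelihood ratio statistic and chi-square(2) p-value for H0: equal (a, b)."""
    _, _, ll0 = weibull_mle(sample1 + sample2)   # common parameters under H0
    _, _, ll1 = weibull_mle(sample1)             # separate parameters under H1
    _, _, ll2 = weibull_mle(sample2)
    stat = max(0.0, 2.0 * (ll1 + ll2 - ll0))
    p_value = math.exp(-stat / 2.0)              # chi-square survival function, 2 df
    return stat, p_value


# Hypothetical example: same shape, different scales.
rng = random.Random(1)
g1 = [rng.weibullvariate(2.0, 1.5) for _ in range(100)]  # (scale, shape)
g2 = [rng.weibullvariate(4.0, 1.5) for _ in range(100)]
stat, p = lr_test(g1, g2)
print(stat, p)
```

The closed form exp(-x/2) for the p-value is specific to two degrees of freedom; the sketch also requires that the observed times are not all identical.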

PARAMETRIC TESTS FOR PARTLY INTERVAL-CENSORED DATA

Here, the authors present a score test and a likelihood ratio test for PIC data using the MI method.

Assumption: The authors consider a survival study that includes n independent participants from p different treatments. Let Ti>0 be a random variable denoting the failure time of interest for the ith participant and let nl be the number of participants from treatment l, with distribution function Fl(t) and survival function Sl(t) = 1-Fl(t); l = 1,…, p and n1+ ··· + np = n. Presume that PIC data are available for the lth treatment, meaning that the exact failure time is observed for N1 participants and given by the form {ti, i = 1, ··· , N1}.

An interval-censored failure time is observed for the remaining N2 participants and given by the form {(Li, Ri], i = N1+1, ··· , n; N2 = n-N1}. In this case, (Li, Ri] denotes the interval in which Ti is observed and Li, Ri are positive random variables independent of Ti such that Li<Ri with probability one. When Li<Ri = ∞, the failure time Ti is regarded as a right-censored observation.

The ultimate aim is to test the hypothesis H0: S1(t) = ··· = Sp(t) to determine whether the p treatments could have resulted from an identical failure time distribution. Let S0(t) denote the common survival function under H0, which is estimated by its Parametric Maximum Likelihood Estimate (PMLE).

PMLE for partly interval-censored data: In this article, the authors suppose that Ti follows the Weibull distribution. Assume that the survival function has the form given in Eq. 2 and PIC data are available. Under these assumptions and from Eq. 1, the likelihood function is:

(9)

Under the Weibull distribution, the maximum likelihood estimates of the parameters a and b are the solution of:

(10)

which can be solved by the Newton-Raphson method.
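To make the structure of the PIC likelihood concrete, the sketch below evaluates the Weibull log-likelihood for a mix of exact times and (L, R] intervals. It assumes the parameterization S(t) = exp{-(t/b)^a} and the function names are hypothetical. The maximization itself, by Newton-Raphson or any general-purpose optimizer, is not shown.

```python
import math


def weibull_survival(t, a, b):
    """Assumed form S(t) = exp(-(t/b)**a); returns 0.0 at t = infinity."""
    if math.isinf(t):
        return 0.0
    return math.exp(-((t / b) ** a))


def pic_loglik(a, b, exact, intervals):
    """Log-likelihood for exact times plus (L, R] intervals (R may be math.inf)."""
    ll = 0.0
    for t in exact:
        # Log density of an exactly observed failure time.
        ll += math.log(a / b) + (a - 1) * math.log(t / b) - (t / b) ** a
    for left, right in intervals:
        # Log probability that the failure time falls in (left, right].
        ll += math.log(weibull_survival(left, a, b) - weibull_survival(right, a, b))
    return ll


# With a = b = 1 (unit exponential), one exact time at t = 1 contributes -1,
# and the uninformative interval (0, inf] contributes log(1) = 0.
print(pic_loglik(1.0, 1.0, [1.0], [(0.0, math.inf)]))
```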

A method based on the MI technique: Here, an imputation technique is used to handle the censored data. Exact failure times are imputed from the interval-censored observations in the PIC data, and the parametric tests for exact data are then applied. The procedure covers both finite and infinite intervals. For a right-censored observation, an exact value can be imputed when the infinite interval contains part of the data set. When no data lie inside the infinite interval, the left endpoint of the interval is taken as the exact value. The imputation scheme used by the authors is Rubin's MI (Rubin, 1987) and the procedure is as follows. Let Y be a pre-determined positive integer and let y be an integer that satisfies 1≤y≤Y. For y = 1, ··· , Y perform:

Step 1: Let Tiy be a random sample taken from the conditional probability function:

(11)

  for i = 1, ··· , N2 and k = 1, ··· , M, where the support points are the M unique ordered elements of {0, tj, Li, Ri; j = 1, ··· , N1, i = N1+1, ··· , n}
  Next, a set of exact data {tyi, i = 1, ··· , N2} is obtained
Step 2: Combine the exact data imputed from the conditional probability function in step 1 with the exact observations in the original data. This step results in the exact data {ti, i = 1, ··· N1; tyi, i =N1+1, ··· , n}
Step 3: Apply the parametric tests, such as the score test or likelihood ratio test, for the exact data
Step 4: Repeat steps 1-3 for each y = 1, ··· , Y
Step 5a: For the score test, let:


  The covariance matrix V of the pooled score is the sum of two components. The first is the average within-imputation covariance associated with U and the second is the between-imputation variance of U. That is:

(12)

  Hence, the suggested test statistic to compare P treatments is
  The simulation results indicate that the test statistic approximately follows a chi-square distribution with two degrees of freedom, χ2(2), under the null hypothesis
Step 5b: For the likelihood ratio test, let:


  be the test statistic to compare P treatments. The simulation results indicate that the test statistic approximately follows a chi-square distribution with two degrees of freedom, χ2(2), under the null hypothesis
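The imputation draw in step 1 and the pooling in step 5a can be sketched as follows. Because Eq. 11 and Eq. 12 are not reproduced in this text, both functions rest on assumptions: the draw uses a discrete distribution on the support points inside (L, R], weighted by the fitted Weibull mass between consecutive support points, and the pooling uses Rubin's standard combining rule with the factor (1 + 1/Y) on the between-imputation part. The parameterization S(t) = exp{-(t/b)^a} and all function names are hypothetical as well.

```python
import math
import random


def survival(t, a, b):
    """Assumed Weibull survival function; 0.0 at t = infinity."""
    return 0.0 if math.isinf(t) else math.exp(-((t / b) ** a))


def impute_once(interval, support, a, b, rng):
    """Draw one exact time for (L, R] from the support points inside it."""
    left, right = interval
    idx = [k for k, s in enumerate(support) if k > 0 and left < s <= right]
    if not idx:
        return left          # nothing inside the interval: use the left endpoint
    # Weight of each support point: fitted mass since the previous support point.
    weights = [survival(support[k - 1], a, b) - survival(support[k], a, b)
               for k in idx]
    return rng.choices([support[k] for k in idx], weights=weights, k=1)[0]


def pool_scores(scores, covs):
    """Pooled quadratic form for Y two-dimensional scores and 2x2 covariances."""
    Y = len(scores)
    u_bar = [sum(u[j] for u in scores) / Y for j in range(2)]
    within = [[sum(v[r][c] for v in covs) / Y for c in range(2)] for r in range(2)]
    between = [[sum((u[r] - u_bar[r]) * (u[c] - u_bar[c]) for u in scores) / (Y - 1)
                for c in range(2)] for r in range(2)]
    V = [[within[r][c] + (1 + 1 / Y) * between[r][c] for c in range(2)]
         for r in range(2)]
    det = V[0][0] * V[1][1] - V[0][1] * V[1][0]
    inv = [[V[1][1] / det, -V[0][1] / det], [-V[1][0] / det, V[0][0] / det]]
    return sum(u_bar[r] * inv[r][c] * u_bar[c] for r in range(2) for c in range(2))


# Hypothetical usage: sorted unique support points, starting at 0.
rng = random.Random(0)
support = [0.0, 1.0, 2.0, 3.0, 5.0]
draw = impute_once((1.0, 3.0), support, 1.5, 2.0, rng)        # 2.0 or 3.0
tail = impute_once((5.0, math.inf), support, 1.5, 2.0, rng)   # falls back to 5.0
identity = [[1.0, 0.0], [0.0, 1.0]]
pooled = pool_scores([[2.0, 0.0], [2.0, 0.0]], [identity, identity])
```

In a full run, each of the Y imputed data sets would yield one score vector and one covariance matrix, and `pool_scores` would then be compared with the χ2(2) reference distribution; at least two imputations are needed for the between-imputation term.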

SIMULATION STUDY

To examine the accuracy of both tests proposed in the previous section, the authors performed a simulation study that compared two treatments under a proportional hazards model. Failure times were generated from a Weibull distribution with shape parameter a and scale parameter b with survival function:

for treatment 1 and S2(t) = [S1(t)]^exp(β) for treatment 2, where β is a parameter that measures the difference between the two survival functions. Each treatment group contains both exact and interval-censored observations. The total sample size of the two treatment groups is taken to be n = 50, 100 or 200, and β = 0.9, 0.5, 0, -0.5 or -0.9.
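Under the proportional hazards relation, treatment 2 remains Weibull with the same shape. The short derivation below shows this, assuming the parameterization S1(t) = exp{-(t/b)^a}, which is not confirmed by the text.

```latex
S_2(t) = \bigl[ S_1(t) \bigr]^{e^{\beta}}
       = \exp\left\{ -e^{\beta} \left( \frac{t}{b} \right)^{a} \right\}
       = \exp\left\{ -\left( \frac{t}{b\, e^{-\beta/a}} \right)^{a} \right\}
```

Under this assumption, treatment 2 has shape a and scale b·exp(-β/a): β = 0 gives identical survival functions, β > 0 shortens survival in treatment 2 and β < 0 lengthens it.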

The following algorithm is used to generate interval-censored data:

Generate failure time Ti: i = 1,…, n; n = (n1+n2) from Weibull distribution
Construct the set of pre-specified examination times at which participants are examined. Each participant was assumed to be scheduled for 11 follow-up times, τr = 1+1.5 (r-1), r = 1,…, 11
Assume that after the initial follow-up time, a participant may be absent from each subsequent follow-up with a fixed probability between 0 and 1
Randomly choose Ti; i = 1,…, n1 to be exact failure time for n1 participants such that Ti lies between the first and the last follow-up times
The interval-censored time for the remaining n2 participants is defined as the shortest time interval between two successful examination times which also includes the participant's generated true failure time. The left side of the interval for a participant will be defined as zero when the true failure time occurs before the first successful examination time and the right side of the interval for a participant will be defined as infinity when the true failure time is greater than the terminal successful examination time
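The generation algorithm above can be sketched for one treatment group as follows. The Weibull parameters, the absence probability and the seed are hypothetical values chosen for illustration, and the rule that the first follow-up is always attended is an interpretation of "after the initial follow-up time".

```python
import math
import random


def generate_pic(n, frac_exact, a, b, p_absent, rng):
    """Return (exact_times, intervals) for one simulated treatment group."""
    visits = [1 + 1.5 * (r - 1) for r in range(1, 12)]     # tau_1, ..., tau_11
    times = [rng.weibullvariate(b, a) for _ in range(n)]   # (scale b, shape a)
    # Candidates for exact observation: failure between first and last follow-up.
    eligible = [i for i, t in enumerate(times) if visits[0] <= t <= visits[-1]]
    n_exact = min(len(eligible), round(frac_exact * n))
    exact_idx = set(rng.sample(eligible, n_exact))
    exact, intervals = [], []
    for i, t in enumerate(times):
        if i in exact_idx:
            exact.append(t)
            continue
        # First visit is attended; each later visit is missed with p_absent.
        attended = [visits[0]] + [v for v in visits[1:] if rng.random() >= p_absent]
        # Shortest interval between attended visits that contains t.
        left = max([v for v in attended if v < t], default=0.0)
        right = min([v for v in attended if v >= t], default=math.inf)
        intervals.append((left, right))
    return exact, intervals


# Hypothetical settings: 100 subjects, 30% exact, shape 1.5, scale 6.0.
rng = random.Random(42)
exact, intervals = generate_pic(100, 0.3, 1.5, 6.0, 0.2, rng)
print(len(exact), len(intervals))
```

Intervals with a left endpoint of 0 correspond to left-censored observations and intervals with a right endpoint of infinity correspond to right-censored observations.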

RESULTS AND DISCUSSION

The results obtained from the simulation study are based on 1000 replications and Y = 10. To distinguish between the two tests, let STI denote the score test with MI and let RTI denote the likelihood ratio test with MI. Table 1 displays the estimated power and size at the significance level of 0.05 for both tests, based on simulated data with 30% exact observations, for different sample sizes and different values of β and of the absence probability. In Table 1, values under β = 0 are the estimated sizes of the tests and values under β = ±0.5 and ±0.9 are the estimated powers. As presented in Table 1, the power of both tests decreases as the absence probability increases, except for β = 0.9 in the score test.

Table 1: Estimated power and size of tests with 30% exact observations with 10 multiple imputations and 1000 replications at 0.05 significance level

As anticipated, the power of both tests increases as the sample size increases, so n and the absence probability affect the power in opposite directions. The absence probability has a smaller effect than the sample size on the power of both tests. Table 1 additionally confirms that the power of both tests increases as the absolute value of β increases. Judged by how close the estimated size is to the nominal significance level of 0.05, Table 1 shows that the estimated size of the RTI is more accurate than that of the STI when n = 50, whereas the estimated size of the STI is more accurate than that of the RTI when n>50. The table reveals that both tests work well under the situations considered. Nevertheless, RTI is better than STI in certain cases. The authors also studied the estimated power and size at the significance level of 0.05 for both tests based on simulated data with 50% exact observations, for different sample sizes and different values of β and of the absence probability. The results are similar to those in Table 1. The estimated size of both tests improved in the majority of cases when more exact data were available.

AN APPLICATION

The authors made several modifications to the breast cancer data presented by Klein and Moeschberger (1997) to test the suitability of the new tests. Several exact observations were added by generating random failure times that lie within the range of the given data. The data set is shown in Table 2.

The data set consists of 124 participants, of whom 61 patients were treated by radiation therapy (R) and 63 patients were treated by radiation therapy and adjuvant chemotherapy (R+C). The study was conducted to compare the cosmetic effects of R against R+C on women with early breast cancer, and the event of interest was the time to the first occurrence of breast retraction. The patients were observed at clinic visits every 4 or 6 months and the actual date of the event was recorded exactly if available. When the actual date was unavailable, the interval containing the event was recorded. The aim of the present analysis is to compare the two treatments with respect to their cosmetic effects.

Both tests, STI and RTI, were applied to compare the cosmetic effects of the two treatments. The obtained values of the test statistics are 7.655 and 7.262, with p-values of 0.022 and 0.026, respectively. These results indicate a significant difference in cosmetic effect between the treatments. In addition, the STI p-value is smaller than the RTI p-value, suggesting that the STI is more sensitive in detecting the difference between the treatments. Notably, the sample size is large, with n = 124. This result is therefore consistent with the simulation results.
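The reported p-values can be checked against the χ2(2) reference distribution, whose survival function has the closed form exp(-x/2). The short sketch below does this; the function name is hypothetical.

```python
import math


def chi2_sf_2df(x):
    """Survival function of the chi-square distribution with 2 degrees of freedom."""
    return math.exp(-x / 2.0)


p_sti = chi2_sf_2df(7.655)   # score test with MI
p_rti = chi2_sf_2df(7.262)   # likelihood ratio test with MI
print(round(p_sti, 3), round(p_rti, 3))  # 0.022 0.026
```

Both values agree with the p-values reported for STI and RTI to three decimal places.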

Table 2: Time to cosmetic deterioration (in months) in breast cancer patients who received two treatments

Fig. 1: PMLE of the survival function of time to cosmetic deterioration

Furthermore, Fig. 1 shows the estimated survival functions for the two treatments, R and R+C, obtained under the Weibull distribution. This figure indicates that patients in the R+C group develop breast retraction earlier than patients in the R group.

CONCLUSION

In this article, the authors proposed parametric tests based on the MI technique, namely STI and RTI, for PIC failure time data under the Weibull distribution to compare the survival functions of p treatments. Simulation results indicate that the tests work well under the situations considered in the study. The comparison of the tests revealed that the power of a test is affected by the accuracy of its estimated size. Therefore, the performance of a test can be evaluated according to its estimated size. On this basis, RTI was found to be better than STI when the sample size is equal to 50 and STI was found to be better than RTI when the sample size is greater than 50. Additionally, the proposed parametric tests were used successfully to detect the difference between the R and R+C treatments.

REFERENCES

  • Huang, J., 1999. Asymptotic properties of nonparametric estimation based on partly interval-censored data. Stat. Sin., 9: 501-520.


  • Kalbfleisch, J.D. and R.C. Prentice, 2002. The Statistical Analysis of Failure Time Data. 2nd Edn., John Wiley and Sons, New Jersey


  • Kim, J.S., 2003. Maximum likelihood estimation for the proportional hazards model with partly interval-censored data. J. R. Stat. Soc.: Ser. B Stat. Methodol., 65: 489-502.


  • Klein, J.P. and M.L. Moeschberger, 1997. Survival Analysis: Techniques for Censored and Truncated Data. 2nd Edn., Springer-Verlag, New York.


  • Odell, P.M., K.M. Anderson and R.B. D'Agostino, 1992. Maximum likelihood estimation for interval-censored data using a Weibull-based accelerated failure time model. Biometrics, 48: 951-959.


  • Rubin, D.B., 1987. Multiple Imputation for Nonresponse in Surveys. John Wiley and Sons, New York.


  • Sun, J., Q. Zhao and X. Zhao, 2005. Generalized log-rank tests for interval-censored failure time data. Scand. J. Stat., 32: 49-57.


  • Zhao, X., Q. Zhao, J. Sun and J.S. Kim, 2008. Generalized log-rank tests for partly interval-censored failure time data. Biom. J., 50: 375-385.
