Research Article
Effect of Baseline Hazards and Functional Form of Continuous Covariates on Model Misspecification
Department of Statistics, University of Ilorin, Nigeria
B.A. Oyejola
Department of Statistics, University of Ilorin, Nigeria
Analysis of survival times data has gained a considerable attention, particularly in the field of medicine, from where the conventional denotation Survival analysis arises. The main concern of survival analysis is analyzing failure time data, where the response variable gives time to a particular event, known as a failure and some of the true response times are not directly observed (Clark et al., 2003).
Often, instead of considering time to a single event, interest may be to distinguish between various types or causes of events. The modelling technique for such data structure is called Competing risk model. This is because the smallest realized time, the cause specific (risk specific) failure time makes the failure times for other causes right censored. In other words, the analyst only observes the minimum of the failure times (Klein and Andersen, 2005).
In modelling survival data, the nature of the underlying baseline distribution plays a vital role and this needs to be taken into account during analysis. For example, some underlying baseline distributions may have proportional hazard model assumptions while others may satisfy accelerated model assumptions and these may determine the type of analysis required for such data (Hutton and Monaghan, 2002).
Model misspecification has been a major problem in statistical analysis that requires serious attention, as no result can be more unreliable in modelling than that from a misspecified model. One popular way of model misspecification in survival analysis is to ignore frailty information when data involve identified clusters such as measurements on the same individuals or clusters of subjects (e.g., family, litters or geographical areas). Ignoring frailty information implies that no attention is given to the between clusters heterogeneity and the within clusters dependence and this often presents unreliable results (Henderson and Oman, 1999).
Functional form of regression equation is often not known in advance and must be decided upon once the data have been collected and analyzed. Correct functional form, however, are essential in enhancing the power of the model. Linear and quadratic regression functions are often used as satisfactory first approximation to regression functions of unknown nature. They may be used even when theory provides the relevant functional form but the known form is highly complex.
Most often, the effects of continuous covariates on response variable is not linear (Adebayo and Fahrmeir, 2005; Kneib and Fahrmeir, 2006). There are different techniques known in literature for describing non linear effect of covariates. One simple way is to transform such covariate. This approach is limited by the fact that the transformation required for a given dataset may not be known. The Penalized splines (P-splines) have been an attractive approach for modelling nonlinear smooth effects of covariates within the additive and varying coefficient models framework (Adebayo and Fahrmeir, 2005; Huang and Liu, 2006; Thompson and Rosen, 2008).
The main objective of this study is not to estimate the cause-specific risk but rather to investigate comparatively, the behaviours of models with overall baseline hazards having the above cause-specific combinations under the framework of model misspecification. In the context of this study, model will be adjudged misspecified if it ignores frailty when it is actually present.
Competing risk model: Consider T>0 which is a continuous random variable representing survival time, then the formulation of competing risk problem (Cox, 1959; Gail, 1975), assumes that there are m potential failure times T1, ,Tm in which one observes T = min (T1, Tm) and the cause of failure R = {r: Tr≤Ts ∀s, s = 1,... m)}. The cause specific hazard function due to cause r can be given as:
(1) |
which can be defined (Chiang, 1968; Holt, 1978) as the conditional probability of experiencing event of failure from cause r in the small interval [t, t+Δt], given that none of the events has been experienced prior to time t.
The overall hazard function is given as:
(2) |
This is the conditional probability of experiencing event of failure from all causes in small interval [t, t+Δt], given that such event has not been experienced prior to t.
When the causes are assumed to be distinct, then the cause specific and overall hazards are related by:
(3) |
In modelling the effects of covariates on the cause-specific hazard functions, the model specified by Cox (1972) is often adapted. Suppose that two causes are involved (as in this study), then the cause-specific model for cause r can be given by:
(4) |
where, r, r = 1, 2 indexes the cause or type of failure (or risk type) and i, I = 1,..., n, indexes individuals, λr0(t) is the unspecified baseline hazard function given in (1) and Z is a vector of time-independent covariate. The model for the overall causes, from Eq. 2 and 3 can be given by:
(5) |
Where:
Survival model with frailties: Many authors have developed frailty models that explicitly formulate the nature of dependence (Klein, 1992; Nielsen et al., 1992). Shared frailty introduces a random effect term for each group which might reflect the variation in the baseline hazard from group to group. The random variable acts multiplicatively on the hazard rate of failure in which case a large value of frailty increases the susceptibility of failure for all subjects within the group.
Suppose there are G independent clusters and Xgi and Cgi are the event and censoring times, respectively for the i-th subject in cluster g (g = 1, ,G, i = 1, ,ng). Let Zgi be a vector of covariates, then the observed failure times for the gith subject is given by:
Tgi = min (Xgi, Cgi). If for each cluster, there is an associated frailty term w shared by members of the cluster, then Eq. 5 can be written as:
(6) |
where, λ0 (t) is an unspecified baseline hazard function. The frailties are usually assumed to come from a frailty distribution f (w), belonging to some parametric family of distributions with positive support. Gamma and lognormal distributions are commonly used.
One popular approach for analysis of frailty model (Nielsen et al., 1992; Andersen et al., 1993) is the counting process Ngi (t) defined by:
(7) |
where, τ denotes the maximum follow-up time and at risk indicator process Ygi(t) which is defined as:
(8) |
Then, the dynamics of Ngi(t) can be described by its intensity process given by:
(9) |
The partial likelihood is given by:
(10) |
Because frailties are not observed, then Eq. 9 cannot be computed from the observations and therefore the marginal partial likelihood is obtained by integrating over the frailty, so that:
(11) |
Generalized Additive Model (GAM): The GAM method permits smooth nonparametric modelling when parametric assumptions cannot be guaranteed, especially for the baseline hazards and continuous covariates. Within the context of generalized additive models (Hastie and Tibshirani, 1990), we replace the predictor in (6) with additive predictor which in a simpler and more specific way that suits this study and this can be written as:
(12) |
with:
where, fo(t) is a time-varying log-baseline effect. fi(xij) is a non-linear effect of time independent continuous covariates xij. γ is a linear fixed effect of categorical covariates Zi. bg is the cluster-specific random effect to account for the frailty, with bgi = bg if i-th individual is in cluster g, g = 1, ,G.
Bayesian P-splines approach for estimating the unknown functions (fj): A number of competing approaches is available for estimating nonlinear function fj of continuous covariates. These include smoothing splines (Hastie and Tibshirani, 1990), local polynomials (Fan and Gijbels, 1996), regression splines with adaptive knot selection (Friedman and Silverman, 1989; Stone et al., 1997) and P-splines (Eilers and Marx, 1996; Marx and Eilers, 1998). In this study, Bayesian version of P-splines has been used following (Lang and Brezger, 2004).
Bayesian analysis requires assignment of appropriate priors to all the effects. Therefore, for the fixed effect parameter γ, diffuse priors has been assumed as:
P(γ)∝const
For the baseline effect f0(t) and continuous covariate effect fj, Penalized splines (P-splines) (Lang and Brezger, 2004) have been assigned. For the random effect bg, independently and identically distributed Gaussian Prior, bg∼N(0, τ2b) has been assigned.
Criteria for model comparison: Choosing the most appropriate parametric models can be difficult. Graphical approaches can be used for evaluating the appropriateness of exponential, Weibull and lognormal models. Various model diagnostic criteria are known in literature. The Loglikelihood Ratio (LR) has been proved to be quite reliable. However, Akaikes Information Criterion (AIC) provides a widely accepted approach for comparing the fit of models with different underlying distributions. These criteria along with the Bayesian Information Criterion (BIC) have been used in this study. These criteria are described thus:
Likelihood Ratio (LR) is defined as:
LR= -2 (log {likelihoodCox} -log{likelihoodSpline})
where, log {likelihoodCox} and likelihoodSpline} denote the values of the partial likelihood for the fitted standard Cox model and the Cox model with spline estimated link, respectively.
Akaike Information Criterion (AIC) is defined as:
AIC = -2log (likelihood)+2 (nknot+d+p -2)
Bayesian Information Criterion (BIC) is defined as:
BIC = -2log(likelihood)+log(n)(nknot+d+p -2)
where, for AIC and BIC, nknot and d denote, respectively, the number of knots and degree of spline, p is the number of covariates and n is the number of observations.
In this study, 20 knots have been chosen and the degree of spline is 3.
This study has been motivated by a clinical study carried out on malignant cancer patients at the Department of Plastic Surgery, University Hospital of Odense, Denmark. Patients were observed during years 1962 to 1977 (after a malignant cancer operation), either up to the time of their death (event of failure) or the end of the study (1977) which is the right censoring time. The data had earlier been analyzed by Andersen et al. (1993), Fahrmeir and Klinger (1995) and Abiodun (2007). The survival time was the time period (in days) from the day of operation till time of death. There were two possible causes of death which are death due to cancer and death that are non cancer related. Patients who were still alive at the end of the study were right censored. Explanatory variables collected include sex of the patient, tumour width, ulcer which indicated the presence or absence of ulcers, age of the patient at the time the operation was performed and year in which operation was performed.
Figure 1a-c show the baseline hazard plots for the overall cause of death (combining both cancer related and non cancer related), the cancer related cause (referred to as cause 1) and the non cancer related cause (referred to as cause 2), respectively. As observed from the plots, the baseline hazard function for the overall cause of death looks like a trade-off between the two cause-specific hazard functions which can be interpreted to be equivalent to their sum under the assumption that the two causes are distinct.
Fig. 1(a-c): | Hazard plots of the malignant data. (a) Hazard function for overall cause, (b) Hazard function for cancer related cause and (c) Hazard function for non cancer related cause |
This study has therefore been predicated by an assertion premised on above scenario, that in real life situation involving competing risk problem, an overall cause of failure could have an underlying survival distribution which is a combination of two or more distinct cause-specific distributions. Against this background, we have, in this study considered overall causes as consisting of two specific causes which have been variously combined from Weibull and Lognormal distributions as follows: Weibull-Weibul, Lognormal-Lognormal and Weibull-Lognormal.
The order of arrangement of causes is immaterial, therefore Weibull-Lognormal could also have been combined as Lognormal-Weibull.
ANALYSIS WITH SIMULATED DATA
In many clinical studies where the aim is to identify prognostic factors for disease specific mortality, data requiring specific causes of death are often either not available or not reliable (Percy et al., 1981). This makes it difficult and sometimes impossible to differentiate between cases of death that are actually related to the disease of interest and those cases of death that are related to other causes that are independent of this disease. In this study therefore, we have resorted to simulating data that mimic the competing risk data that are usually found in real life situations.
Data were simulated from the frailty version of the competing risk model:
(13) |
Where:
fro(t) = log λro (t)
The functional forms of the continuous covariates used are similar to those by Lang and Brezger (2004).
Combinations of baseline hazards considered are:
• | Combination of two Weibull baseline hazards |
λ1o(t)∼Weibull (shape = α, scale = 1), r = 1, 2 | |
• | Combination of two Lognormal baseline hazards |
λro(t)∼ Lognormal (μlog = 0, σlog = 1), r = 1, 2 | |
• | Combination of Weibull and Lognormal baseline hazards |
λro(t)∼Weibull (shape = α, scale = 1) | |
and: | |
λ2o(t)∼ Lognormal (μlog = 0, σlog = 1) |
Independent censoring times were generated from exponential distribution as:
Ci∼ Exp (λ)
binary covariate was generated from Bernoulli distribution as:
Zi∼Bernoulli (1, 0.5)
The continuous covariate was generated from uniform distribution:
Xi∼ U (-3, 3)
the frailty term was generated from Normal distribution as:
bg∼Normal (0, 0.5)
The observed survival times is thus ti = min (T1i, T2i, Ci) and censoring indicator is set as:
This model is simulated for i =1,
.500 using 200 replications.
The data generation were done using R package version 2.6.1.
The following models were progressively analysed under this simulation setup:
Model 1 = f0(t)
Model 2 = f0(t)+bgi
Model 3 = f0(t)+fj (xij)+γ1Z
Model 4 = f0(t)+fj (xij)+γ1Zi+bgi
where, fo (t) is the log baseline hazard effect, (bg) is the cluster-specific random effect, fj(xij) is a non-linear effect of the continuous covariates xij which takes both the linear and the quadratic forms as earlier described.
Model 1 contains only the baseline, Model 2 contains baseline and frailty, Model 3 contains the baseline, the functional forms of the continuous covariates and the categorical covariates while Model 4 is Model 3 with the cluster-specific random effect (frailty). In the context of this study, models 1 and 3 are the misspecified models since frailty is ignored in the analysis, while models 2 and 4 are correctly specified. We thus compare models 1 and two when no covariate is controlled in the analysis and models 3 and 4 when covariates are controlled in the analysis.
Estimation in this study was done using Empirical Bayesian inference based on Restricted Maximum Likelihood ((REML) technique. The approach has also been incorporated into the BayesX (Belitz et al., 2009), a software used for all the analyses. It is software for Bayesian inference in structured additive regression models.
Table 1 presents the results of correctly specified and misspecified models under the three baseline hazards combinations. These include the values of the three criteria; Likelihood Ratio (LR), Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC), for the linear and quadratic functional forms of continuous covariates. Based on the three criteria, model with smaller value will be preferred.
Under the linear functional form of continuous covariates, it is observed that when the combination of baseline hazards involves two Weibulls (Weibull-Weibull) and no covariate is controlled in the analysis, the correctly specified model is substantially better than the misspecified model. This is so under LR and AIC criteria but no remarkable difference in performance improvement is seen under BIC. For example, LR values is 436.34 for correctly specified model (Model 2) compared to 450.56 for misspecified model (Model 1). The corresponding values of AIC are 449.61 and 454.71 for correctly specified and misspecified models, respectively. However, the values of BIC for correctly specified and misspecified models are 459.18 and 459.32, respectively.
The performances under the combination of two Lognormal baseline hazards (Lognormal_Lognormal) are generally in the same direction as for the combination of two Weibulls. However, the differences in performances in favour of correctly specified models here are not as substantial as for the combination of two Weibulls. For example, the values of LR are 524.12 for correctly specified model compared to 539.61 for misspecified model. The corresponding values of AIC are 542.76 and 547.13, respectively while for BIC, the values of correctly specified model is 553.23 and that of misspecified model is 554.45. When the baseline hazard involves combination of Weibull and Lognormal distributions (Weibull-Lognormal), the results also generally follow the pattern of two Weibulls and two Lognormals.
Table 1: | Values of the criteria for linear and quadratic functional forms under the three baseline hazard combinations |
However, correctly specified model outperformed the misspecified one under LR and AIC more substantially than for the two Lognormal baseline hazards but less substantially than for two Weibull baseline hazards.
When covariates are controlled (Models 3 and 4), the pattern of performances of correctly specified models compared to the misspecified models under the three criteria is similar to when no covariate is controlled. However, there are slight improvements in correctly specified models when covariates are controlled during analysis than when no covariate is controlled under LR and AIC which is not clearly captured by BIC.
The results under quadratic functional form of continuous covariates show that though the values of the three criteria are slightly higher, the pattern of the performances of the correctly specified models compared to the misspecified models are generally similar to those under the linear functional form. Also the behaviours of the three criteria in capturing the model performances under the various combinations of baseline hazards are similar to those of linear functional form of continuous covariates. For example, as observed from the table, under LR and AIC, best performances of correctly specified models over the misspecified models (whether or not covariates are controlled) are found in models with combination of two Weibull baseline hazards and worst performances are found in the models that involve two Lognormal baseline hazard combinations.
We carried out a simulation study involving various combinations of two baseline hazards which included combination of two Weibull baseline hazards (Weibull-Weibull), combination of two Lognormal baseline hazards (Lognormal-Lognormal) and combination of Weibull and Lognormal baseline hazards (Weibull-Lognormal) using two functional forms of continuous covariates, namely linear function and quadratic function, under the framework of model misspecification. Three model diagnostic tools were used for model comparison, namely, Log-likelihood Ratio (LR), Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC). It was observed that correctly specified models generally performed better than the misspecified models across all the baseline hazards combinations and under both linear and quadratic functional forms of continuous covariates. Although, slight improvements in performances were observed when covariates were controlled over when no covariates were controlled, the improvements were not remarkable.
In conclusion, the simulation studies have shown that models with combination of two Weibull baseline hazards was most responsive to model misspecification, followed by those involving combination of Weibull and Lognormal baseline hazards, while the models involving the combination of two Lognormal baseline hazards was least responsive to model misspecification.
A possible extension of this study is to investigate the performances under various sample sizes and different censoring percentages.