INTRODUCTION
India carries a heavy burden of type 2 diabetes, and finding a curative drug is emerging as one
of the major therapeutic problems. Seyyednozadi et al.
(2008) have shown the importance of physical activity in controlling diabetes.
Namazi et al. (2011) found in a controlled
clinical trial that a hydroalcoholic extract of nettle has a decreasing effect
on TAC and SOD in type 2 diabetes patients. Sayyad et
al. (2011) have demonstrated the use of computer simulation modeling in
healthcare practice for preventing type 2 diabetes in India.
However, to better investigate the effect of treatment, one is often interested in evaluating how a biomarker of interest changes over time. This study aims to provide clinical evidence in the search for an effective treatment for type 2 diabetes patients. The biomarker of interest is serum creatinine, followed through its changes over the drug trial period.
The UK Prospective Diabetes Study Group (1988) has confirmed
that metformin therapy alone carries a 32% lower risk of developing any diabetes-related
complication in obese patients. Purnell and Hirsch
(1997) have observed the effectiveness of metformin in non-obese diabetic
patients. Ertel (1998) has observed that the pioglitazone
group of drugs ameliorates the insulin resistance associated with Non-Insulin-Dependent
Diabetes Mellitus (NIDDM) without stimulating insulin release.
The presence of missing observations in the data may be one reason for inconsistent
results. The reliability of estimation declines when the data set contains a high number
of missing observations, and in some cases missing observations
can bias inference on the response. Uysal
(2007) has proposed a radial basis function algorithm to deal with the missing
value problem in time series analysis. Kayaalp and
Cevyk (2001) have applied estimation of missing observations in longitudinal
data analysis to obtain observations on water temperature. Estimation through MCMC
has been found to be very useful in this context. Suzuki
et al. (2009) have applied MCMC in linear mixed modeling
to estimate the optimal IBD vaccination timing at the DNA level. Jin
et al. (2006) have developed an algorithm to estimate continuous-time
speed data. It has also been concluded that these methods are useful
for time-series modeling in the presence of missing observations. Nath
and Bhattacharjee (2011) have applied Bayesian longitudinal
data analysis to type 2 diabetes patients in India.
It is not unusual in studies for a sequence of measurements to terminate
early for reasons outside the control of the investigator; the affected unit
is called a dropout. It may therefore be necessary to accommodate dropout
in the modeling process to obtain correct inference for the scientific question of interest.
The preliminary work on missing observations was done by Rubin
(1976, 1987), who presented an important distinction
between different dropout processes. To deal with missing observations
it is important to know the type of missingness present in the data,
although the type of dropout is difficult to establish. A dropout
is called Missing Completely at Random (MCAR) if it is independent
of both the observed and the unobserved data, and Missing at Random (MAR) if, conditional
on the observed data, it is independent of the unobserved measurements;
otherwise it is called Missing Not at Random (MNAR). The likelihood approach
is applicable if the dropout process is random, a situation known as ignorable
dropout (Little and Rubin, 1987). Little
(1993, 1994, 1995) has generalized
the different classes of models under the heading of Pattern-Mixture Modeling (PMM).
Statistical analysis with non-ignorable missingness is based on assumptions about
the missing data mechanism (Diggle and Kenward, 1994).
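The three dropout mechanisms can be illustrated with a small simulation. The sketch below is not taken from the trial: the two-visit data, the logistic dropout models and every numeric value are hypothetical and chosen only to show the contrast. It compares the complete-case mean of the second visit with the true mean when dropout ignores the data (MCAR), depends on the observed first visit (MAR), or depends on the unobserved second visit (MNAR).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Hypothetical two-visit measurements (values are illustrative only).
y1 = rng.normal(1.1, 0.2, size=n)
y2 = 0.3 + 0.7 * y1 + rng.normal(0.0, 0.1, size=n)

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

# Probability that the second visit is missing under each mechanism.
drop_prob = {
    "MCAR": np.full(n, 0.2),                          # ignores y1 and y2
    "MAR": logistic(-1.5 + 4.0 * (y1 - y1.mean())),   # depends on observed y1 only
    "MNAR": logistic(-1.5 + 4.0 * (y2 - y2.mean())),  # depends on the unobserved y2
}
dropped = {name: rng.random(n) < p for name, p in drop_prob.items()}

# Mean of y2 among subjects who did not drop out (complete cases).
cc_mean = {name: y2[~mask].mean() for name, mask in dropped.items()}

for name, value in cc_mean.items():
    print(f"{name}: complete-case mean {value:.3f}, true mean {y2.mean():.3f}")
```

In this setup the complete-case mean stays close to the true mean only under MCAR; under the simulated MAR and MNAR mechanisms, subjects with higher values drop out more often, so the complete-case mean is pulled downward.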
Little (1993, 1994) has considered
the PMM to stratify the incomplete data by the pattern of missing values and
to formulate distinct models within each stratum. In 1995 he also presented a broad
discussion of random effects in the PMM to deal with dropouts. Hogan
and Laird (1997) have extended the PMM by permitting censored dropout
times, as might arise when there are late entrants to a trial and interim analyses
are performed. Follmann and Wu (1995) and Wu
and Bailey (1988) have used the conditional linear model to permit generalized
linear models without any parametric assumption on the random effects model.
DeFronzo et al. (1981, 1985)
have found that skeletal muscle can be affected by a high amount of insulin
secretion. Skeletal muscle is one of the major target organs because of the presence
of metabolic creatinine in it (Zierath et al., 2000).
Martin (2003) found that skeletal muscle mass
problems are associated with type 2 diabetes. Currently there is insufficient
evidence to suggest that a combined drug therapy protects skeletal muscle
mass in type 2 diabetic patients. Harita et al. (2009)
have found an association of serum creatinine with type 2 diabetes. In a study
of non-obese middle-aged Japanese men, they identified low serum creatinine
as a reason for developing type 2 diabetes. Whether a combined drug therapy
can prevent the progression of serum creatinine secretion, and so protect
skeletal muscle, has also been assessed. The normal
range of serum creatinine is 0.44 to 0.83 mg dL^{-1} for women
and 0.56 to 1.23 mg dL^{-1} for men (Yamamoto
et al., 2009). To the best of our knowledge, no clinical trial of
type 2 diabetes drug therapies has handled missing observations with
the PMM.
The goal of this study is to compare drug treatment effects on the basis of lower mean serum creatinine values at different visits. Different procedures have been applied to deal with missing observations in the data. The performance of two combined drug therapies, metformin with pioglitazone and gliclazide with pioglitazone, has been compared in reducing the serum creatinine level. The missing observations in patients' serum creatinine at different visits have been handled by LOCF, the EM algorithm and pattern-mixture modeling (under MAR and MNAR).
MODEL SPECIFICATION
The four methods applied are discussed below:
LOCF (last observation carried forward): In a clinical trial, missing data can be classified as noninformative (essentially random; e.g., a study subject moves to another city) or informative (nonrandom; e.g., the study subject is removed from the trial by the investigators). However, it is very difficult to know whether data are noninformatively or informatively missing. Last Observation Carried Forward (LOCF) is a commonly used way of imputing data with dropouts. Recently, the Food and Drug Administration (FDA) has approved LOCF as the preferred method of analysis. It can be used as an imputation method when the data are longitudinal: the last observed (non-missing) value is used to fill in missing values at later points in the study.
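The LOCF rule can be sketched in a few lines. The function name `locf` and the example values are hypothetical and serve only to show the mechanics on a patients-by-visits array in which NaN marks a missed visit; the sketch assumes the baseline visit is always observed.

```python
import numpy as np

def locf(visits):
    """Carry the last observed value forward along each row (one row per patient)."""
    filled = np.array(visits, dtype=float)  # copy, so the input is left untouched
    for j in range(1, filled.shape[1]):
        missing = np.isnan(filled[:, j])
        filled[missing, j] = filled[missing, j - 1]
    return filled

# Hypothetical serum creatinine values (mg/dL) at three visits; NaN marks a dropout.
example = np.array([
    [1.20, 1.10, 1.05],
    [1.30, np.nan, np.nan],
    [1.15, 1.08, np.nan],
])
imputed = locf(example)
print(imputed)
```

Because each column is filled from the already-filled previous column, a patient who drops out after the first visit keeps the baseline value at every later visit.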
EM algorithm: In the EM algorithm (Dempster
et al., 1977), the imputed value Y can be obtained from the observed
value of X by using the estimated values $\hat{\alpha}$ and $\hat{\beta}$, i.e.:

$$\hat{Y} = \hat{\alpha} + \hat{\beta} X \qquad (1)$$

Eq. 1 contains no error term. The procedure is based on maximum likelihood estimates, which converge to the population parameters.
The model can be written as:

$$\hat{Y} = \hat{\alpha}_{0} + \hat{\beta}_{0} X \qquad (2)$$

where $\hat{\alpha}_{0}$ and $\hat{\beta}_{0}$ are the consistent estimates of α_{0}, β_{0}
obtained from the final iteration.
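A minimal sketch of an EM-based regression imputation is given below. It assumes a bivariate normal pair (X, Y) with X fully observed and some Y missing; the function name `em_bivariate_normal`, the simulated data and the number of iterations are illustrative assumptions and not the implementation used in the study. The E-step replaces the missing Y and Y² by their conditional expectations given X, and the M-step updates the means and (co)variances from the completed sufficient statistics.

```python
import numpy as np

def em_bivariate_normal(x, y, n_iter=500):
    """EM estimates of the regression of Y on X when some Y values are NaN."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    miss = np.isnan(y)

    # Start from complete-case moments.
    mu_x, mu_y = x[~miss].mean(), y[~miss].mean()
    s_xx, s_yy = x[~miss].var(), y[~miss].var()
    s_xy = np.mean((x[~miss] - mu_x) * (y[~miss] - mu_y))

    for _ in range(n_iter):
        # E-step: conditional mean and second moment of each missing Y given X.
        beta = s_xy / s_xx
        alpha = mu_y - beta * mu_x
        resid_var = s_yy - beta * s_xy
        ey = np.where(miss, alpha + beta * x, y)
        ey2 = np.where(miss, (alpha + beta * x) ** 2 + resid_var, y ** 2)

        # M-step: update the moments from the completed sufficient statistics.
        mu_x, mu_y = x.mean(), ey.mean()
        s_xx = np.mean(x ** 2) - mu_x ** 2
        s_xy = np.mean(x * ey) - mu_x * mu_y
        s_yy = np.mean(ey2) - mu_y ** 2

    beta = s_xy / s_xx
    alpha = mu_y - beta * mu_x
    return alpha, beta

# Hypothetical data: X is a fully observed visit, Y a later visit with dropouts.
rng = np.random.default_rng(1)
x = rng.normal(1.1, 0.2, size=200)
y_obs = 0.3 + 0.7 * x + rng.normal(0.0, 0.1, size=200)
y_obs[rng.random(200) < 0.3] = np.nan

alpha_hat, beta_hat = em_bivariate_normal(x, y_obs)
y_imputed = np.where(np.isnan(y_obs), alpha_hat + beta_hat * x, y_obs)
```

For this missingness pattern (X always observed, Y sometimes missing) the EM estimates of the intercept and slope coincide with the least-squares fit on the complete cases, which gives a convenient check on the iteration.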
Missing at random (MAR): In longitudinal studies, the most widely used assumption for dealing with missing
observations is MAR, under which, conditional on the observed responses and covariates in the model,
the missingness is independent of the missing observations Y_{mis}. Fitzmaurice
et al. (2004) have proposed using MAR as the default assumption for handling
missing observations, because the missingness then depends only on the observed responses and covariates.
In this context, Fitzmaurice et al. (2004) have
also considered a model for the joint distribution of (Y_{1}, Y_{2},
S), where (Y_{1}, Y_{2})^{T} is the bivariate response
and S is a binary indicator that Y_{2} is observed. Fitzmaurice
et al. (2004) have applied the PMM to bivariate normal data by:

$$(Y_{1}, Y_{2})^{T} \mid S = s \sim N\left(\mu^{(s)}, \Sigma^{(s)}\right) \qquad (3)$$

where, for pattern s ∈ {0, 1}, the mean vector μ^{(s)} and covariance matrix Σ^{(s)} have the parameters {μ_{1}^{(s)}, μ_{2}^{(s)}, σ_{11}^{(s)}, σ_{22}^{(s)}, σ_{12}^{(s)}}. We have reparameterized the model in Eq. 3 in terms of the marginal distribution of Y_{1} and the conditional distribution of Y_{2} given Y_{1}.
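As a sketch of what this reparameterization looks like, the standard properties of the bivariate normal distribution give the following marginal and conditional forms. They are written only in terms of the pattern-specific parameters listed above; the original article may have used separate symbols for the intercept, slope and residual variance.

```latex
\[
Y_{1} \mid S = s \sim N\!\left(\mu_{1}^{(s)},\ \sigma_{11}^{(s)}\right),
\]
\[
Y_{2} \mid Y_{1} = y_{1},\ S = s \sim
N\!\left(\mu_{2}^{(s)} + \frac{\sigma_{12}^{(s)}}{\sigma_{11}^{(s)}}\bigl(y_{1} - \mu_{1}^{(s)}\bigr),\
\sigma_{22}^{(s)} - \frac{\bigl(\sigma_{12}^{(s)}\bigr)^{2}}{\sigma_{11}^{(s)}}\right).
\]
```

The conditional mean is linear in y_{1}, so each pattern contributes its own intercept, slope and residual variance for the regression of Y_{2} on Y_{1}.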
Here, we have assumed that the observed data are normally distributed and that the missing observations are MAR. The case J = 3 applies because each patient has three observations. Let (Y_{1}, Y_{2}, Y_{3})^{T} denote the full data. In the model, dropout is expressed in terms of the number of observations, S = Σ_{j} s_{j}. The full data model is factored as P(Y_{1}, Y_{2}, Y_{3}, s) = P_{s}(Y_{1}, Y_{2}, Y_{3}) P(s), where S follows the multinomial distribution with φ_{s} = P(S = s).
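The observed data response distributions within each pattern are specified using normal distributions:

$$Y_{1} \mid S = 1 \sim N\left(\mu^{(1)}, \sigma^{(1)}\right)$$
$$(Y_{1}, Y_{2})^{T} \mid S = 2 \sim N\left(\mu^{(2)}, \Sigma^{(2)}\right)$$
$$(Y_{1}, Y_{2}, Y_{3})^{T} \mid S = 3 \sim N\left(\mu^{(3)}, \Sigma^{(3)}\right)$$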
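Now consider identifying P_{s}(y_{mis} | y_{obs}) under MAR, starting with S = 1. We need to identify P_{1}(y_{2}, y_{3} | y_{1}), which can be factored as:

$$P_{1}(y_{2}, y_{3} \mid y_{1}) = P_{1}(y_{2} \mid y_{1})\, P_{1}(y_{3} \mid y_{1}, y_{2})$$

The MAR constraints imply:

$$P_{1}(y_{2} \mid y_{1}) = P_{\geq 2}(y_{2} \mid y_{1}), \qquad P_{1}(y_{3} \mid y_{1}, y_{2}) = P_{3}(y_{3} \mid y_{1}, y_{2})$$

The distribution P_{≥2}(y_{2} | y_{1}) is a mixture (Roy and Daniels, 2008) of the normal
distributions from patterns S = 2 and S = 3:

$$P_{\geq 2}(y_{2} \mid y_{1}) = \frac{\phi_{2} P_{2}(y_{1}) P_{2}(y_{2} \mid y_{1}) + \phi_{3} P_{3}(y_{1}) P_{3}(y_{2} \mid y_{1})}{\phi_{2} P_{2}(y_{1}) + \phi_{3} P_{3}(y_{1})}$$

where φ_{s} = P(S = s), s = 1, 2, 3, are the multinomial parameters.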
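The mixture for y_{2} given y_{1} among subjects with at least two observations follows from Bayes' rule: each pattern's conditional density is weighted by the pattern probability times that pattern's density of y_{1}. The sketch below evaluates this density for normal components. The function names, the dictionary keys and all parameter values are hypothetical and serve only to make the weighting explicit.

```python
import math

def normal_pdf(y, mean, var):
    """Density of N(mean, var) at y."""
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def mixture_conditional(y2, y1, patterns):
    """Density of y2 given y1 pooled over the patterns with S >= 2.

    Each pattern is a dict with illustrative keys: phi (pattern probability),
    mu1 and var1 (marginal of Y1), a0, a1 and tau2 (regression of Y2 on Y1).
    """
    # Unnormalised weights: pattern probability times the density of y1.
    weights = [p["phi"] * normal_pdf(y1, p["mu1"], p["var1"]) for p in patterns]
    total = sum(weights)
    return sum(
        (w / total) * normal_pdf(y2, p["a0"] + p["a1"] * y1, p["tau2"])
        for w, p in zip(weights, patterns)
    )

# Hypothetical parameters for patterns S = 2 and S = 3.
pattern_2 = {"phi": 0.15, "mu1": 1.15, "var1": 0.04, "a0": 0.35, "a1": 0.65, "tau2": 0.010}
pattern_3 = {"phi": 0.70, "mu1": 1.05, "var1": 0.03, "a0": 0.30, "a1": 0.70, "tau2": 0.008}
density = mixture_conditional(1.0, 1.2, [pattern_2, pattern_3])
```

When the two patterns share the same parameters, the weights cancel and the mixture reduces to a single normal density, which is a quick sanity check on the weighting.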
The model can be specified as in Table 1.
In these models, the patterns are S ∈ {1, 2, 3}. The presence of a non-identified component is denoted by *. The model is as follows:
Table 1: The non-italic cells are fully observed data and the italic cells are partially observed data of serum creatinine at different visits

Here, β and τ are the regression parameters and variance components, respectively.
The model p(y_{mis} | y_{obs}, s) p(y_{obs} | s) p(s) can be separated into p(y_{mis} | y_{obs}, s), p(y_{obs} | s) and p(s).
To identify p_{1}(y_{2} | y_{1}), we have applied the constraint p_{1}(y_{2} | y_{1}) = p_{≥2}(y_{2} | y_{1})
for all possible values of y_{1}, which is achieved by setting α_{j}^{(1)} = α_{j}^{(≥2)} for j = 0, 1.
The distributions p_{1}(y_{3} | y_{1}, y_{2}) and p_{2}(y_{3} | y_{1}, y_{2}) have been obtained by equating the regression parameters and variance components, β_{i}^{(s)} = β_{i}^{(3)} for s = 1, 2 and i = 1, 2, 3 and τ_{3}^{(s)} = τ_{3}^{(3)}.
The full data model is reparameterized as ζ = (ζ_{S}, ζ_{M}), where ζ_{S} = {α_{0}^{(1)}, α_{1}^{(1)}, τ_{2}^{(1)}, β_{0}^{(s)}, β_{1}^{(s)}, β_{2}^{(s)}; s = 1, 2} and ζ_{M} = {μ^{(s)}, σ^{(s)}, α_{0}^{(≥2)}, α_{1}^{(≥2)}, τ_{2}^{(≥2)}, β_{0}^{(3)}, β_{1}^{(3)}, β_{2}^{(3)}, τ_{3}^{(3)}, φ_{1}, φ_{2}, φ_{3}; s = 1, 2, 3}.
Here, ζ_{S} collects the parameters of the non-identified (missing-data) components and ζ_{M} the parameters identified by the observed data. Departure from MAR is captured by expressing ζ_{S} in terms of ζ_{M} and a new parameter Δ through a function ζ_{S} = h(ζ_{M}, Δ).
The process is as follows:
Now, the reparametrization of β has been done by,
with s = 1, 2 and i = 1, 2, 3.
The reparametrization of τ is denoted by,
and:
The full set of Δ is represented by, Δ = (Δ_{α}, Δ_{β}, Δ_{τ}).
Where:
In this model, the Δ has been assumed as the hierarchical prior for the posterior distribution of ζ_{M}.
Missing not at random (MNAR): MNAR is the condition in which, even after conditioning
on the observed data, the dropout depends on the unobserved values of the dependent variable. MNAR can be expressed
with the help of an indicator variable. The mean value of serum creatinine obtained
through the MNAR approach is based on the Pattern Mixture Modeling (PMM) procedure.
The MNAR analysis has been performed within the MAR framework with the assumption Δ_{α}
= 0, Δ_{β} = 0 and Δ_{τ} = 1 (Daniels
and Hogan, 2007).
It should be noted that in the PMM, the prior value of Δ does not
influence the posterior mean value. Hogan and Daniels (2002)
have separated the prior specification of Δ into the three parts by, .
Where:
In this study, the ,
have been reformed to
to obtain the value of Δ by:
A subjective assumption is required to formulate the prior distribution of the parameters. In this analysis, Δ has been assumed to follow the uniform distribution. The variability of the different approaches has been compared through the changes in mean serum creatinine across visits.
Application: The secondary data set has been obtained from a clinical trial conducted at Menakshi Mission Hospital, Tamil Nadu, in 2008. The trial used a randomized, double-blind, parallel-group design. The part of the data set used here consists of serum creatinine samples from 100 patients: 50 under treatment 1 (metformin with pioglitazone) and the remaining 50 under treatment 2 (gliclazide with pioglitazone).
This study models the response of serum creatinine to the type of therapy over a period of one year for various patient profiles. Missing observations complicated the analysis of the response in both groups. Serum creatinine was measured three times for each patient during the 12-month study period.
Patients meeting the eligibility criteria were randomized 1:1 to (a) metformin with pioglitazone or (b) gliclazide with pioglitazone. Fig. 1 shows a plot of serum creatinine changes over time. It shows that under the MNAR procedure the reduction in serum creatinine among patients on gliclazide with pioglitazone is faster than among those on metformin with pioglitazone. The gap between the mean serum creatinine values of the two groups at the 1st month's visit had widened by the 12th month's visit. Subjects who received metformin with pioglitazone during the follow-up period had higher serum creatinine from study initiation to study end than subjects who received gliclazide with pioglitazone.

Fig. 1(a-d): Mean changes in serum creatinine obtained in different drug groups with different techniques
The trial was a randomized, parallel-group trial of treatment options for type 2 diabetes in adolescents and youth, consisting of a screening visit, a 2-6 month single-blind run-in period and a treatment period of up to 12 months. The metformin with pioglitazone group has 14% (7 of 50) and 16% (8 of 50) missing observations at the 2nd and 3rd visits, respectively. The gliclazide with pioglitazone group has 8% (4 of 50) and 20% (10 of 50) missing observations at the 2nd and 3rd visits, respectively. The observed data are displayed in Table 2. In the first stage, the Expectation Maximization (EM) algorithm and LOCF have been applied to deal with the missing observations. In the second stage, the MAR and MNAR assumptions have been applied under the PMM approach.
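The missing-data percentages quoted above follow directly from the counts in each arm of 50 patients. A short check, with variable names chosen only for illustration:

```python
randomized = 50  # patients per treatment arm

# Missing serum creatinine observations per arm and visit, as reported above.
missing = {
    "metformin with pioglitazone": {"visit 2": 7, "visit 3": 8},
    "gliclazide with pioglitazone": {"visit 2": 4, "visit 3": 10},
}

# Percentage missing and number observed at each visit.
percent_missing = {
    arm: {visit: 100 * count / randomized for visit, count in visits.items()}
    for arm, visits in missing.items()
}
observed = {
    arm: {visit: randomized - count for visit, count in visits.items()}
    for arm, visits in missing.items()
}

for arm in missing:
    print(arm, percent_missing[arm], observed[arm])
```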
Statistical analysis: The posterior means of serum creatinine in both
drug groups have been obtained with Markov Chain Monte Carlo (MCMC) runs of
20,000 iterations using the R2WinBUGS software. The first 5000 burn-in values have
been discarded. Inferences on the mean value for both therapy groups
are given in Table 3. Uncertainty about Δ is reflected
in the increased posterior standard deviation for parameters that rely on Δ
(means at the 2nd and 3rd visits) but, importantly, not for fully identified parameters
(means at baseline). In this analysis, an informative prior has been used for Δ and the
posteriors are summarized in Table 3. The mean changes
obtained in the different drug groups using the different procedures are shown
in Fig. 1.
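The role of the chain length and the burn-in can be illustrated with a toy sampler. The sketch below is not the WinBUGS model used in the study: it is a random-walk Metropolis sampler for the mean of normal data with a known standard deviation and a flat prior, and the function name `metropolis_mean`, the simulated data and the proposal scale are all assumptions made for illustration. It runs 20,000 iterations, discards the first 5,000 as burn-in, and summarizes the remaining draws by their mean and standard deviation.

```python
import numpy as np

def metropolis_mean(y, sigma, n_iter=20_000, burn_in=5_000, seed=0):
    """Random-walk Metropolis for a normal mean (known sigma, flat prior).

    Returns the draws kept after discarding the burn-in period.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    step = 2.4 * sigma / np.sqrt(y.size)  # proposal scale (illustrative choice)

    def log_post(mu):
        return -0.5 * np.sum((y - mu) ** 2) / sigma ** 2

    draws = np.empty(n_iter)
    current = y[0]  # arbitrary starting value
    current_lp = log_post(current)
    for t in range(n_iter):
        proposal = current + step * rng.normal()
        proposal_lp = log_post(proposal)
        # Accept with probability min(1, posterior ratio).
        if np.log(rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        draws[t] = current
    return draws[burn_in:]

# Hypothetical serum creatinine values for one arm at one visit.
data = np.random.default_rng(2).normal(1.05, 0.12, size=43)
kept = metropolis_mean(data, sigma=0.12)
posterior_mean, posterior_sd = kept.mean(), kept.std()
```

In this toy model the exact posterior is normal with mean equal to the sample mean and standard deviation sigma divided by the square root of the sample size, so the retained draws can be checked against those values.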
RESULTS AND DISCUSSION
The number of patients contributing to each visit is shown in Table 2. In both drug groups the mean serum creatinine decreased over the one-year observation period. In both groups there is a large drop in the observed mean serum level at the second visit. At the end of the study, the means in both groups are similar and remain rather constant and they are higher than at baseline. At the 2nd and 3rd visits, the mean (sd) values obtained for the metformin with pioglitazone group through LOCF are 1.33 (1.75) and 1.05 (0.12), compared with 1.28 (1.76) and 1.04 (0.13) by the EM algorithm, respectively. However, LOCF and the EM algorithm give the same result for the gliclazide with pioglitazone group (Table 2). The MNAR method produces mean values of 1.05 (0.33) and 1.05 (0.96) at the 2nd and 3rd visits for the metformin with pioglitazone group, and 1.05 (0.83) and 1.07 (0.62) at the 2nd and 3rd visits for the gliclazide with pioglitazone group. A small increase in the mean change is observed under the MAR and MNAR imputation schemes (Table 3). It can be concluded that MAR and MNAR produce the same means with different standard errors. The mean change in the metformin with pioglitazone group is fairly robust to the regression imputation strategies, indicating similarity between the observed serum creatinine levels of the metformin with pioglitazone group and the gliclazide with pioglitazone group. However, under all approaches the absolute differences in mean serum creatinine after 1 year are small, as shown in Table 3.
Table 2: Availability of serum creatinine data at different visits

Table 3: Posterior mean (s.d.) for serum creatinine level at each time point, stratified by treatment group

Values are Mean±SD. MAR: Missing at random, MNAR: Missing not at random, EM: Expectation maximization algorithm, LOCF: Last observation carried forward
Methods for handling statistical information in type 2 diabetes drug therapy
trials in which biochemical parameters are missing are a neglected area
of the literature. An obvious solution is to disregard a trial with missing
biochemical parameters. In this paper, our aim is to compare two treatment therapies
for type 2 diabetes patients in a South Indian population. This work provides
an applied device for solving this problem by effectively predicting the missing
observations that occur in clinical trials. The pattern mixture model approach
can be useful for comparing drug treatment performance across visits. In
this study, the MAR and MNAR approaches indicate that metformin with pioglitazone
is more effective in reducing the serum creatinine level than gliclazide
with pioglitazone. The presence of metformin is useful in reducing the level of
serum creatinine. Aleisa et al. (2008) have confirmed
that metformin can be useful in averting cardiac and hepatic toxicity in
cancerous diabetes patients. The pattern mixture approach has been studied as
a method of correcting for informative dropout. Analysis is complicated by the
presence of dropouts, since once a patient has discontinued treatment,
through loss to follow-up or because the clinical coordinator fails to observe the patient, no further longitudinal
measurement of the covariates of interest can be collected. Thus,
when analyzing such data, specific modeling is needed for appropriate and unbiased
interpretation. The PMM allows prior assumptions about the
dropout process to be incorporated into the drug treatment results. It does not differ much. First,
LOCF has conventionally been used to handle several types of dropout to
obtain estimates of treatment effect directly related to the cumulative
incidence of dropouts (Ambrogi et al., 2008;
Satagopan et al., 2004). Secondly, an EM algorithm
has been used for inference purposes (Williamson et al.,
2008). The Bayesian approach of Daniels et
al. (2007) and Chi and Ibrahim (2006) has motivated
our analysis of a Bayesian alternative that allows posterior inference for any
parameter of interest. Thus, we have applied a fully Bayesian approach, implemented
via MCMC methods using the R2WinBUGS software (Faucett and Thomas,
1996; Chi and Ibrahim, 2006; Guo
and Carlin, 2004; Ibrahim et al., 2007).
Such a Bayesian method for PMM modeling of longitudinal data has been reported
only very recently (Hu et al., 2009). We have illustrated in
this paper how the PMM approach may affect the results. We have compared the
results of the pattern mixture approach with those of the commonly used
LOCF and EM algorithm. LOCF and the EM algorithm produced results opposite
to those of the MAR and MNAR approaches. The present results suggest that
metformin with pioglitazone reduces the serum
creatinine level more effectively than gliclazide with pioglitazone.
CONCLUSIONS
In this study, we have presented a novel model for obtaining the serum creatinine level in type 2 diabetes patients and, thereby, for comparing drug effects. The present results confirm that MAR and MNAR are powerful tools for longitudinal data analysis and, consequently, for practical application to drug effect comparison in clinical trials. We have employed Markov Chain Monte Carlo methods to estimate serum creatinine values at different visits in type 2 diabetes patients. Missing observation techniques have been applied to the comparison of drug treatment effects in clinical trial data. We believe that more research is needed in this area. On the basis of the results obtained, we think the model is applicable to areas such as oncology, AIDS and other disease-specific clinical trials. The combination of metformin with pioglitazone is more effective in reducing serum creatinine and hence the risk of type 2 diabetes. The MAR and MNAR approaches help to show how the trial could change our opinion about the treatment effect. Statistical models with prior information about lost observations need to be considered in drug treatment effect comparison.