Journal of Applied Sciences

Year: 2007 | Volume: 7 | Issue: 7 | Page No.: 951-957
DOI: 10.3923/jas.2007.951.957
Meta Analysis of Panel Data Generated by a Set of Randomized Controlled Trials
Robin Antoine, Ashok Sahai and Peter Chami

Abstract: It is not uncommon to have access to panel data generated by a set of similarly randomized controlled trials. In this context, researchers often employ pooling methods to evaluate the efficacy of pharmaceutical regimens. One of the simplest techniques used to combine the results of individual studies is the fixed effects model, which assumes that the true effect is equal for all studies. An alternative and intuitively more appealing method is the random effects model. The purpose of this study is to address the estimation problem of the fixed effects model. Furthermore, it presents a simulation study of an efficient estimation of a mean true effect using panel data and a random effects model in order to establish appropriate confidence interval estimation for both models.

How to cite this article
Robin Antoine, Ashok Sahai and Peter Chami, 2007. Meta Analysis of Panel Data Generated by a Set of Randomized Controlled Trials. Journal of Applied Sciences, 7: 951-957.

Keywords: panel data, fixed effects model, confidence interval estimation, meta-analysis

INTRODUCTION

Meta-analysis is the statistical analysis of a collection of analytic results, motivated by the desire to integrate the findings in the context of a medical research investigation. Such analyses are gaining currency in medical research, where information on the efficacy of a treatment is available from a number of clinical studies with similar treatment protocols at different sites. It is quite often true that when a study is considered separately, the data it contains (as generated by a randomized controlled trial) are too few or too limited in scope to lead to unequivocal or generalizable conclusions concerning the effect of the treatment(s) under investigation. Consequently, combining the findings of similar studies across various places/hospitals is often an attractive alternative, which can be used to strengthen and support the evidence about treatment efficacy.

A number of methods are available to construct the confidence limits for the overall mean effect in the meta-analysis of panel data generated by randomized controlled trials, in the context of a random/fixed effects model. A popular and simple method is the one proposed by DerSimonian and Laird (1986). It is worth noting, in the context of panel data/meta-analysis, that the simplest statistical technique for combining the individual study results is based on a fixed effects model. In the fixed effects model, it is assumed that the true effect is the same for all the studies generating the panel data. On the other hand, a random effects model allows for variation in the true effect across these studies and is, therefore, a more realistic model.

Halvorsen, in a systematic search of the first ten issues published in 1982 of each of the four weekly journals (NEJM, JAMA, BMJ and Lancet), found only one article (out of 589) that considered combining results using formal statistical methods. The basic difficulty one faces in trying to combine the results from various studies arises from the diversity among these studies in terms of the methods they employ, as well as their design. Moreover, as a result of different patient populations and varying sample sizes, each study has a different level of sampling error. Hence, while integrating the results from such varied studies, one ought to assign varying weights to the information stemming from the respective studies, these weights reflecting the relative value of each piece of information (Halvorsen, 2006). In this context, Armitage (1984) highlighted the need for great care in developing methods for drawing inferences from such heterogeneous, though logically related, studies. DerSimonian and Laird (1983) observed that, in this setting, it would be more appropriate to use a regression analysis to characterise the differences in study outcomes.

In the context of a random effects model for meta/panel-data analysis, there are a number of methods available to construct the confidence limits for the overall mean effect. Sidik and Jonkman (2002) proposed a simple confidence interval for meta-analysis, based on the t-distribution. Their simulation study suggests that this approach is quite likely to improve on the coverage probability of DerSimonian and Laird (1986)’s approach. In the present paper we propose a more efficient construction of this confidence interval. A simulation study has been carried out to demonstrate that our methods not only improve the coverage probability of both of the aforesaid methods but are, most likely, better than those methods in terms of ‘Relative Bias’ (RB) as well.

THE PROBLEM FORMULATION

The statistical inference problem is concerned with using the information from k independent studies in the meta-analysis. Let the random variable $y_i$ stand for the effect-size estimate from the ith study. Some commonly used measures of effect size are the mean difference, standardized mean difference, risk difference, relative risk and odds ratio. As the Odds-Ratio (OR), which is of particular use in retrospective or case-control studies, is the most widely used, we confine ourselves to it for simplicity. Nevertheless, there is no loss of generality, since the details of this paper are analogously valid for the other measures of effect size.

Let $n_{ti}$ and $n_{ci}$ denote the sample sizes and let $p_{ti}$ and $p_{ci}$ denote the proportions dying in the treatment (t) and control (c) groups, respectively, where i indexes the study: i = 1, ..., k. Also, let $x_{ti}$ and $x_{ci}$ denote the observed numbers of deaths in the treatment and control groups, respectively, for study i. For the ith study, the observed log-odds ratio, $\log(OR_i)$, and the corresponding estimated variance are:

$$y_i = \log(OR_i) = \log\left[\frac{x_{ti}\,(n_{ci} - x_{ci})}{x_{ci}\,(n_{ti} - x_{ti})}\right], \qquad \hat\sigma_i^2 = \frac{1}{x_{ti}} + \frac{1}{n_{ti} - x_{ti}} + \frac{1}{x_{ci}} + \frac{1}{n_{ci} - x_{ci}}$$
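As a minimal sketch of the effect-size computation described above, the following Python function evaluates the log odds ratio and its usual large-sample variance estimate from the death counts and sample sizes of one study. The function name, argument names and the example counts are illustrative assumptions, not taken from the paper, and no continuity correction for zero cells is attempted.

```python
import math

def log_odds_ratio(x_t, n_t, x_c, n_c):
    """Return (log OR, estimated variance) for one study's 2x2 table.

    x_t, x_c: deaths in the treatment and control groups.
    n_t, n_c: sample sizes of the treatment and control groups.
    """
    a, b = x_t, n_t - x_t   # treatment group: deaths, survivors
    c, d = x_c, n_c - x_c   # control group: deaths, survivors
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive (no continuity correction)")
    log_or = math.log((a * d) / (b * c))
    variance = 1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d
    return log_or, variance

# Hypothetical study: 15/100 deaths under treatment, 25/100 under control.
y, v = log_odds_ratio(x_t=15, n_t=100, x_c=25, n_c=100)
print(f"log OR = {y:.4f}, variance = {v:.4f}")
```

A negative log odds ratio here indicates lower odds of death in the treatment group.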

The important point to be noted at this stage is that the estimated variance $\hat\sigma_i^2$ is a very close estimate of the respective population variance $\sigma_i^2$, and that analogous estimates are available for the population variances of the other measures of effect size. For example, if the effect size $y_i$ happens to be the difference in proportions, $p_{ti} - p_{ci}$, we estimate the population variance $\sigma_i^2$ by:

$$\hat\sigma_i^2 = \frac{p_{ti}(1 - p_{ti})}{n_{ti}} + \frac{p_{ci}(1 - p_{ci})}{n_{ci}}$$

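Now, we might note that the general model is specified as follows:

$$y_i = \theta_i + \varepsilon_i$$

Wherein,

$$\varepsilon_i \sim N(0, \sigma_i^2)$$

And, with $\theta_i$ denoting the true effect in the ith study,

$$\theta_i = \mu + \delta_i$$

Wherein,

$$\delta_i \sim N(0, \tau^2)$$

Hence, essentially the model comes to be:

$$y_i = \mu + \delta_i + \varepsilon_i, \quad i = 1, \ldots, k$$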

It is also important to note that whereas $\delta_i$ stands for the random error across the studies, $\varepsilon_i$ represents the random error within a study, and that $\delta_i$ and $\varepsilon_i$ are assumed to be independent. Also, the parameter $\tau^2$ is a measure of the heterogeneity between the k studies. We will refer to it in our paper as the heterogeneity variance, as it is often called.

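Perhaps the most crucial element in panel-data/meta-analysis is the challenge of developing an efficient estimator of this heterogeneity variance $\tau^2$. DerSimonian and Laird (1986) proposed and used the following estimate:

$$\hat\tau^2 = \max\left\{0,\ \frac{Q_w - (k - 1)}{\sum_{i=1}^{k} w_i - \sum_{i=1}^{k} w_i^2 \big/ \sum_{i=1}^{k} w_i}\right\} \quad (1)$$

Wherein,

$$Q_w = \sum_{i=1}^{k} w_i\,(y_i - \hat\mu_w)^2, \qquad w_i = \frac{1}{\sigma_i^2}$$

and the weighted estimate of the mean effect is given by:

$$\hat\mu_w = \frac{\sum_{i=1}^{k} w_i\, y_i}{\sum_{i=1}^{k} w_i} \quad (2)$$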

Also, herein the weight $w_i$ is assumed to be known. Earlier, we noted that the estimated variance $\hat\sigma_i^2$ is a very close estimate of the respective population variance $\sigma_i^2$. Therefore, the sample estimate $\hat\sigma_i^2$ is most usually used in place of $\sigma_i^2$, so that the estimated weight $\hat w_i = 1/\hat\sigma_i^2$ is used in (1) and (2).
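The moment estimate described above can be sketched in a few lines of Python. This is an illustrative implementation that assumes the standard DerSimonian-Laird form with inverse-variance weights and truncation at zero; the function and variable names, and the example numbers, are hypothetical.

```python
def dl_tau2(y, v):
    """DerSimonian-Laird moment estimate of tau^2, truncated at zero.

    y: effect-size estimates y_i (e.g. log odds ratios).
    v: estimated within-study variances, used in place of sigma_i^2.
    """
    k = len(y)
    w = [1.0 / vi for vi in v]                       # weights w_i = 1 / sigma_i^2
    sw = sum(w)
    mu_w = sum(wi * yi for wi, yi in zip(w, y)) / sw  # weighted mean effect
    q = sum(wi * (yi - mu_w) ** 2 for wi, yi in zip(w, y))
    denom = sw - sum(wi * wi for wi in w) / sw
    return max(0.0, (q - (k - 1)) / denom)

# Hypothetical effect sizes and within-study variances for five studies.
effects = [0.10, 0.45, 0.80, 0.30, 1.10]
variances = [0.04, 0.09, 0.02, 0.15, 0.06]
print(f"tau^2 estimate: {dl_tau2(effects, variances):.4f}")
```

When the observed spread of the effect sizes is no larger than the within-study variances alone would explain, the function returns 0.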

Recently, Sidik and Jonkman (2002) proposed a simple confidence interval for the meta-analysis.

This approach, consisting in the construction of the confidence interval based on the t-distribution, significantly improved the coverage probability compared to the existing most popular DerSimonian and Laird (1986)’s approach, as outlined above.

It is worth noting, in the above context, that recently Brockwell and Gordon (2001) presented a comprehensive summary of the existing methods of constructing the confidence interval for meta-analysis and carried out their comparisons in terms of their coverage probabilities.

While the most commonly used method, the DerSimonian and Laird (1986) random effects method, led to coverage probabilities below the nominal level, the profile likelihood interval of Hardy and Thompson (1996) led to higher coverage probabilities. However, the profile likelihood approach is computationally cumbersome and involves an iterative calculation, as does the simple likelihood method presented in Brockwell and Gordon (2001). On the other hand, Sidik and Jonkman (2002)’s simple approach for constructing a 100(1-α) percent confidence interval for the overall effect in the random effects model, which pursues pivotal inference based on the t-distribution, requires no iterative computation, just like the popular method of DerSimonian and Laird (1986).

Moreover, Sidik and Jonkman (2002)’s interval has a better coverage probability than that of DerSimonian and Laird (1986). Consequently, while DerSimonian and Laird (1986)’s confidence interval for meta-analysis used to be the most commonly used one, that of Sidik and Jonkman (2002) happens to be the best one on the most important count on which confidence intervals are compared and rated, namely the coverage probability.

Therefore, our motivation is to improve on these two methods for constructing an interval estimate of the overall mean effect across the k studies, using the panel/meta-data generated by these studies. The improvement was targeted mainly at the coverage probability, and the proposed confidence interval estimator did perform better than these two methods on that count. Very often it also performs better on another important count for comparing such confidence intervals, namely Relative Bias (RB), as revealed by the comparison in a simulation study.

THE PROPOSED CONFIDENCE INTERVAL ESTIMATE

As noted in the last section, the most crucial element in panel-data/meta-analysis is the challenge of developing an efficient estimator of the heterogeneity variance $\tau^2$.

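DerSimonian and Laird (1986)’s approximate 100(1-α) percent confidence interval for the general mean effect μ, using the random effects model, is given by:

$$\hat\mu \pm z_{\alpha/2} \left(\sum_{i=1}^{k} \hat w_i\right)^{-1/2} \quad (3)$$

Wherein, $\hat w_i = 1/(\hat\sigma_i^2 + \hat\tau^2)$ and $z_{\alpha/2}$ is the upper α/2 quantile of the standard normal distribution. Also, $\hat\mu$ is evaluated using:

$$\hat\mu = \frac{\sum_{i=1}^{k} \hat w_i\, y_i}{\sum_{i=1}^{k} \hat w_i}$$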
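A short Python sketch of this normal-based interval follows. It assumes the standard random effects weights, one over the sum of the within-study variance and an already computed estimate of the heterogeneity variance; the function name, argument names and example data are illustrative only.

```python
from statistics import NormalDist

def dl_interval(y, v, tau2, alpha=0.05):
    """Normal-based random effects interval for the overall mean effect.

    y: effect sizes; v: within-study variance estimates;
    tau2: an estimate of the heterogeneity variance (supplied by the caller).
    """
    w = [1.0 / (vi + tau2) for vi in v]               # random effects weights
    sw = sum(w)
    mu_hat = sum(wi * yi for wi, yi in zip(w, y)) / sw
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)       # upper alpha/2 normal quantile
    half_width = z / sw ** 0.5
    return mu_hat - half_width, mu_hat + half_width

# Hypothetical data; tau2 would normally come from a moment estimate.
lower, upper = dl_interval([0.10, 0.45, 0.80, 0.30], [0.04, 0.09, 0.02, 0.15], tau2=0.05)
print(f"95% interval: ({lower:.3f}, {upper:.3f})")
```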

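To construct an alternative simple confidence interval for the general mean effect μ using the random effects model, Sidik and Jonkman (2002) recently proposed an improvement. Subject to the assumption that the $\hat w_i$ are the correct weights (i.e., essentially that $\hat w_i = w_i = 1/(\sigma_i^2 + \tau^2)$, the $\hat\sigma_i^2$ and $\hat\tau^2$ being close estimates), and with $\hat\mu = \sum w_i y_i / \sum w_i$, they noted that:

$$Z_w = \frac{\hat\mu - \mu}{\left(\sum_{i=1}^{k} w_i\right)^{-1/2}} \sim N(0, 1), \qquad Q_w = \sum_{i=1}^{k} w_i\,(y_i - \hat\mu)^2 \sim \chi^2_{(k-1)} \quad (4)$$

They showed that $Z_w$ and $Q_w$ are independently distributed. Hence, it follows that:

$$\frac{Z_w}{\sqrt{Q_w/(k-1)}} = \frac{\hat\mu - \mu}{\sqrt{\sum_{i=1}^{k} w_i\,(y_i - \hat\mu)^2 \Big/ \left[(k-1)\sum_{i=1}^{k} w_i\right]}} \sim t_{(k-1)} \quad (5)$$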

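This led to Sidik and Jonkman (2002)’s approximate 100(1-α) percent confidence interval for the general mean effect μ, using the random effects model, given by:

$$\hat\mu \pm t_{(k-1),\,\alpha/2} \sqrt{\frac{\sum_{i=1}^{k} \hat w_i\,(y_i - \hat\mu)^2}{(k-1)\sum_{i=1}^{k} \hat w_i}} \quad (6)$$

where $\hat w_i = 1/(\hat\sigma_i^2 + \hat\tau^2)$, $\hat\mu = \sum \hat w_i y_i / \sum \hat w_i$ and $t_{(k-1),\,\alpha/2}$ is the upper α/2 quantile of the t-distribution on (k-1) degrees of freedom.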
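The t-based interval can be sketched in Python as below. The sketch assumes the pivotal form in which the weighted residual sum of squares, divided by (k-1) times the sum of the weights, estimates the variance of the weighted mean. Because the Python standard library has no t quantile function, the critical value is passed in by the caller; `sj_interval` and its argument names are illustrative, not from the paper.

```python
def sj_interval(y, v, tau2, t_crit):
    """t-based random effects interval for the overall mean effect.

    y: effect sizes; v: within-study variance estimates;
    tau2: an estimate of the heterogeneity variance;
    t_crit: upper alpha/2 quantile of t with len(y) - 1 degrees of freedom.
    """
    k = len(y)
    w = [1.0 / (vi + tau2) for vi in v]               # random effects weights
    sw = sum(w)
    mu_hat = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - mu_hat) ** 2 for wi, yi in zip(w, y))
    se = (q / ((k - 1) * sw)) ** 0.5                  # estimated standard error
    return mu_hat - t_crit * se, mu_hat + t_crit * se

# Hypothetical data for k = 4 studies; 3.182446 is the 0.975 quantile of t(3).
lower, upper = sj_interval([0.10, 0.45, 0.80, 0.30], [0.04, 0.09, 0.02, 0.15],
                           tau2=0.05, t_crit=3.182446)
print(f"95% interval: ({lower:.3f}, {upper:.3f})")
```

Unlike the normal-based interval, the width here depends on the observed scatter of the effect sizes around the weighted mean.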

Also, under the assumption of known weights,

(7)

It is a very significant fact at this stage that both DerSimonian and Laird (1986)’s and Sidik and Jonkman (2002)’s 100(1-α) percent confidence intervals for the general mean effect μ, using the random effects model, are approximate, inasmuch as their validity depends on the extent to which the underlying assumptions are true. Thus, essentially, it boils down to how efficient our estimate of the inter-study heterogeneity variance $\tau^2$ is. We might as well note here that:

If the estimate of $\tau^2$, i.e., $\hat\tau^2$, equals 0, the random effects model reduces to the fixed effects model.

Furthermore, we might mention here that more efficient estimation of the inter-study heterogeneity variance $\tau^2$ is the key motivating factor behind our proposed methods for improving the coverage probability.

In both papers, namely those of DerSimonian and Laird (1986) and Sidik and Jonkman (2002), the estimation of this inter-study heterogeneity variance $\tau^2$, as is nicely described in Brockwell and Gordon (2001), is as follows:

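The two-stage random effects model:

$$y_i = \theta_i + \varepsilon_i$$

Wherein, $\varepsilon_i \sim N(0, \sigma_i^2)$, and

$$\theta_i = \mu + \delta_i$$

Wherein, $\delta_i \sim N(0, \tau^2)$.

That could well be re-written equivalently as:

$$y_i = \mu + \delta_i + \varepsilon_i; \quad i = 1, \ldots, k;$$

Wherein, $\varepsilon_i \sim N(0, \sigma_i^2)$ and $\delta_i \sim N(0, \tau^2)$.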

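As noted earlier, under the assumptions that the $\hat w_i$ are the correct weights (i.e., essentially that $\hat w_i = w_i$, the estimates being close) and that $\delta_i$ and $\varepsilon_i$ are independent (both assumptions being well known to be quite reasonably tenable), we have, to the extent that these assumptions hold:

$$\hat\mu = \frac{\sum_{i=1}^{k} w_i\, y_i}{\sum_{i=1}^{k} w_i} \sim N\left(\mu,\ \frac{1}{\sum_{i=1}^{k} w_i}\right) \quad (8)$$

In the above,

$$w_i = \frac{1}{\sigma_i^2 + \tau^2} \quad (9)$$

Now, assuming that $\tau^2$ is known, we have the 100(1-α) percent confidence interval:

$$\hat\mu \pm z_{\alpha/2} \left(\sum_{i=1}^{k} w_i\right)^{-1/2} \quad (10)$$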

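It is interesting to note that the random effects model confidence intervals for μ are expected to be generally wider than those constructed under the fixed effects model, simply due to the fact that:

$$\frac{1}{\sigma_i^2 + \tau^2} \le \frac{1}{\sigma_i^2} \quad \text{for all } \tau^2 \ge 0,$$

so that the sum of the weights can only decrease, and the interval half-width can only increase, as $\tau^2$ grows.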
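The widening effect can be illustrated numerically. The sketch below assumes a normal-based interval whose half-width is the normal quantile divided by the square root of the sum of the weights, with weights equal to one over the within-study variance plus the heterogeneity variance; setting the heterogeneity variance to zero gives the fixed effects case. The function name and the example variances are hypothetical.

```python
from statistics import NormalDist

def half_width(v, tau2=0.0, alpha=0.05):
    """Half-width of the normal-based interval for given within-study variances.

    tau2 = 0 corresponds to the fixed effects model.
    """
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    total_weight = sum(1.0 / (vi + tau2) for vi in v)
    return z / total_weight ** 0.5

variances = [0.05, 0.20, 0.60]   # hypothetical within-study variances
for tau2 in (0.0, 0.05, 0.10):
    print(f"tau^2 = {tau2:.2f}: half-width = {half_width(variances, tau2):.4f}")
```

The printed half-widths increase with the assumed heterogeneity variance, which is the inequality above seen from the interval's side.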

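As $\tau^2$ is unknown in practice, we ought to estimate it. DerSimonian and Laird (1986) derived an estimate of $\tau^2$, using the method of moments, by equating an estimate of the expected value of $Q_w$ to its observed value, $q_w$; that is, by solving $q_w = (k-1) + t\left(\sum w_i - \sum w_i^2 / \sum w_i\right)$ for t, where, as in (1), the $w_i = 1/\sigma_i^2$ are the fixed effects weights.

Therefore, we note that if t is the solution of the above equation, we have:

$$t = \frac{q_w - (k-1)}{\sum_{i=1}^{k} w_i - \sum_{i=1}^{k} w_i^2 \big/ \sum_{i=1}^{k} w_i} \quad (11)$$

So as to ward off the possibility of a negative value of t (which would be an unacceptable value of $\tau^2$, as a variance cannot be negative), we define:

$$\hat\tau^2 = \max\{0,\ t\} \quad (12)$$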

Using (11) in (9), we get the estimated weights $\hat w_i$ (wherein i = 1, …, k) to be used in (8), while the estimated variance of $\hat\mu$ is:

$$\widehat{\mathrm{Var}}(\hat\mu) = \frac{1}{\sum_{i=1}^{k} \hat w_i} \quad (13)$$

Both DerSimonian and Laird (1986)’s and Sidik and Jonkman (2002)’s approximate 100(1-α) percent confidence intervals for the general mean effect μ using the random effects model (as in (3) and (6), respectively) use the estimate $\hat\mu$ generated as above, together with the value in (13).

Essentially, our proposal for improved Confidence Interval (CI) estimates of the general mean effect μ consists solely in a more efficient estimation of the quantity in (13). For this purpose, the following results are needed:

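Lemma: If an estimate $s^2$ (the usual unbiased sample variance estimator) of the population variance $\sigma^2$ is based on a random sample $X_1, X_2, \ldots, X_k$ from a normal population $N(\Theta, \sigma^2)$, we have:

$$\frac{(k-1)\,s^2}{\sigma^2} \sim \chi^2_{(k-1)}$$

(the chi-square distribution on (k-1) degrees of freedom).

Further, we have:

(i) the UMVUE of $1/\sigma^2$ is

$$\frac{k-3}{(k-1)\,s^2} \quad (14a)$$

(ii) the MMSEE of $1/\sigma^2$ is

$$\frac{k-5}{(k-1)\,s^2} \quad (14b)$$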

Proof: In the case of a random sample from a normal distribution, it is very well known that the sample variance $s^2$ is a complete sufficient statistic for the population variance $\sigma^2$. Therefore, the Uniformly Minimum Variance Unbiased Estimator (UMVUE) of $1/\sigma^2$ is simply its unbiased estimator based on $s^2$. Now, using the fact that $(k-1)s^2/\sigma^2 \sim \chi^2_{(k-1)}$, it could easily be shown that:

$$E\left[\frac{k-3}{(k-1)\,s^2}\right] = \frac{1}{\sigma^2}$$

That establishes the truth of (i) of the above Lemma.

For establishing the truth of part (ii) of the above Lemma, we note that we have to find the optimal value of the constant c in the class of estimators $1/(c\,s^2)$, so that the Mean Square Error (MSE) of the resulting estimator is minimal.

Now,

$$MSE\left(\frac{1}{c\,s^2}\right) = E\left[\frac{1}{c\,s^2} - \frac{1}{\sigma^2}\right]^2 = \frac{1}{c^2}\,E\left(\frac{1}{s^4}\right) - \frac{2}{c}\cdot\frac{1}{\sigma^2}\,E\left(\frac{1}{s^2}\right) + \frac{1}{\sigma^4}$$

Hence the optimal value is $c = E(1/s^4) \big/ \left[E(1/s^2)\cdot(1/\sigma^2)\right]$. Now, again using the fact that $(k-1)s^2/\sigma^2 \sim \chi^2_{(k-1)}$, it could easily be shown (for k > 5) that:

$$E\left(\frac{1}{s^2}\right) = \frac{k-1}{k-3}\cdot\frac{1}{\sigma^2} \quad \text{and} \quad E\left(\frac{1}{s^4}\right) = \frac{(k-1)^2}{(k-3)(k-5)}\cdot\frac{1}{\sigma^4}$$

Thence:

$$c = \frac{k-1}{k-5},$$

as in (14b). That shows the truth of (ii) of the above Lemma, and that the Minimum MSE Estimator (MMSEE) of $1/\sigma^2$ is $1/(c\,s^2) = (k-5)/[(k-1)\,s^2]$.

QED.

Hence, we propose the following modified, more efficient CIs, modifying the Ordinary DerSimonian-Laird (1986) Estimator (ODLE) and the Ordinary Sidik-Jonkman (2002) Estimator (OSJE) defined, respectively, in (3) and (6) above.

We call our estimators the Modified DerSimonian-Laird (1986) Estimator (MDLE) and the Modified Sidik-Jonkman (2002) Estimator (MSJE), respectively.

Essentially, the sole difference between ODLE and MDLE, as also between OSJE and MSJE, consists in replacing k in (3) and (6), respectively, by (k* + k)/2 for the modifications under the two approaches, consisting in the use of both the UMVUE and the MMSEE estimation of $1/\sigma^2$, where $\sigma^2$ herein stands for the heterogeneity variance $\tau^2$, which is essentially a measure of the heterogeneity between the k studies. The mean value (k* + k)/2 is used to take care of the bias (through the UMVUE) while also managing the MSE (through the MMSEE) at the optimal level: a sort of compromise.
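The moment calculations behind the Lemma can be checked with a small Python sketch. It assumes only the two chi-square moments quoted in the proof, and evaluates the bias and MSE of an estimator of the form a/s² for the reciprocal variance, expressed in units of 1/σ² and 1/σ⁴ respectively. The function name and the choice k = 10 are illustrative.

```python
def scaled_bias_and_mse(a, k):
    """Bias and MSE of a / s^2 as an estimator of 1 / sigma^2.

    Returned in units of 1/sigma^2 (bias) and 1/sigma^4 (MSE); needs k > 5
    so that both chi-square moments exist.
    """
    if k <= 5:
        raise ValueError("k must exceed 5")
    m1 = (k - 1) / (k - 3)                        # sigma^2 * E[1 / s^2]
    m2 = (k - 1) ** 2 / ((k - 3) * (k - 5))       # sigma^4 * E[1 / s^4]
    bias = a * m1 - 1.0
    mse = a * a * m2 - 2.0 * a * m1 + 1.0
    return bias, mse

k = 10
for label, a in (("unbiased", (k - 3) / (k - 1)), ("minimum MSE", (k - 5) / (k - 1))):
    bias, mse = scaled_bias_and_mse(a, k)
    print(f"{label:>12}: a = {a:.4f}, bias = {bias:+.4f}, MSE = {mse:.4f}")
```

The unbiased multiplier has zero bias but the larger MSE, while the minimum-MSE multiplier accepts a negative bias in exchange for a smaller MSE, which is the trade-off the averaging in the proposal is meant to balance.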

THE SIMULATION STUDY

The simulation study in our paper compares the Ordinary DerSimonian-Laird (1986) Estimator (ODLE) and the Ordinary Sidik-Jonkman (2002) Estimator (OSJE) with our estimators, the Modified DerSimonian-Laird (1986) Estimator (MDLE) and the Modified Sidik-Jonkman (2002) Estimator (MSJE), respectively. Its format is the same as that in Sidik and Jonkman (2002).

To compare the simple confidence interval based on the t-distribution with the DerSimonian and Laird interval in terms of coverage probability, we performed a simulation study of meta-analysis for the random effects model. Throughout the study, the overall mean effect μ is fixed at 0.5 and the error probability of the confidence interval, α, is set at 0.05. We only use one value for μ because the t-distribution interval based on the pivotal quantity in (5) and the DerSimonian and Laird interval are both invariant to a location shift. Three different values of $\tau^2$ are used: 0.05, 0.08 and 0.1. For each $\tau^2$, three different values of k (namely 10, 20 and 60, to keep the comparisons modest) are considered. The number of simulation runs for the meta-analysis of k studies is 11 000. The simulation data for each run are generated in terms of the most popular measure of effect size in meta-analysis, the log of the odds ratio. That is, the generated effect size $y_i$ is interpreted as a log odds ratio (it could alternatively be the mean effect of the ith study, as well).

For given k, the within-study variance $\sigma_i^2$ is generated using the method of Brockwell and Gordon (2001). Specifically, a value is generated from a chi-square distribution with one degree of freedom, which is then scaled by 1/4 and restricted to an interval between 0.009 and 0.6. This results in a bimodal distribution of $\sigma_i^2$, with the modes at each end of the distribution. As noted by Brockwell and Gordon, values generated in this way are consistent with a typical distribution of $\sigma_i^2$ for log odds ratios encountered in practice.

For binary outcomes, the within-study variance decreases with increasing sample size, so large values of $\sigma_i^2$ (close to 0.6) represent small trials included in the meta-analysis and small values of $\sigma_i^2$ represent large trials.

The effect size $y_i$, for i = 1, ..., k, is generated from a normal distribution with mean μ and variance $\sigma_i^2 + \tau^2$.
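The data-generating step described above might be sketched in Python as follows. One point is an assumption: "restricted to an interval" is implemented here by clamping values to the interval end points, which is one reading consistent with the stated modes at each end; rejecting and redrawing out-of-range values would be another. The function name and the seed are illustrative.

```python
import random

def simulate_meta_analysis(k, mu, tau2, rng):
    """Simulate one meta-analysis of k studies.

    Returns (y, v): effect sizes (log odds ratios) and within-study variances.
    """
    y, v = [], []
    for _ in range(k):
        chi2_1 = rng.gauss(0.0, 1.0) ** 2              # chi-square, 1 degree of freedom
        var_i = min(max(0.25 * chi2_1, 0.009), 0.6)    # scale by 1/4, clamp (assumption)
        v.append(var_i)
        y.append(rng.gauss(mu, (var_i + tau2) ** 0.5)) # variance sigma_i^2 + tau^2
    return y, v

rng = random.Random(2007)                              # arbitrary seed for repeatability
y, v = simulate_meta_analysis(k=10, mu=0.5, tau2=0.05, rng=rng)
print([round(vi, 3) for vi in v])
```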

For each simulation of the meta-analysis, the confidence intervals based on the t-distribution and on the DerSimonian and Laird method are calculated, along with those of our proposed estimators, the Modified DerSimonian-Laird (1986) Estimator (MDLE) and the Modified Sidik-Jonkman (2002) Estimator (MSJE). The numbers of intervals containing the true μ are recorded for all four methods. The proportion of intervals containing the true μ (out of the 60,000 runs) serves as the simulation estimate of the true coverage probability.
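A reduced version of such a coverage experiment can be sketched in Python for the two ordinary intervals only; the modified estimators are not reproduced here. The sketch assumes the standard DerSimonian-Laird moment estimate, clamped within-study variances, a much smaller number of runs than in the paper, and a t critical value supplied by the caller. All function names are hypothetical.

```python
import random
from statistics import NormalDist

def simulate(k, mu, tau2, rng):
    """Draw within-study variances and effect sizes (clamping is an assumption)."""
    v = [min(max(0.25 * rng.gauss(0.0, 1.0) ** 2, 0.009), 0.6) for _ in range(k)]
    y = [rng.gauss(mu, (vi + tau2) ** 0.5) for vi in v]
    return y, v

def dl_tau2(y, v):
    """DerSimonian-Laird moment estimate of tau^2, truncated at zero."""
    w = [1.0 / vi for vi in v]
    sw = sum(w)
    mu_w = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - mu_w) ** 2 for wi, yi in zip(w, y))
    return max(0.0, (q - (len(y) - 1)) / (sw - sum(wi * wi for wi in w) / sw))

def both_intervals(y, v, z_crit, t_crit):
    """Return the normal-based and the t-based interval for mu."""
    k = len(y)
    tau2 = dl_tau2(y, v)
    w = [1.0 / (vi + tau2) for vi in v]
    sw = sum(w)
    mu_hat = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - mu_hat) ** 2 for wi, yi in zip(w, y))
    dl_half = z_crit / sw ** 0.5
    sj_half = t_crit * (q / ((k - 1) * sw)) ** 0.5
    return (mu_hat - dl_half, mu_hat + dl_half), (mu_hat - sj_half, mu_hat + sj_half)

def coverage(k, mu, tau2, runs, t_crit, alpha=0.05, seed=0):
    """Proportion of runs in which each interval contains the true mu."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    hits = [0, 0]
    for _ in range(runs):
        y, v = simulate(k, mu, tau2, rng)
        for j, (lo, hi) in enumerate(both_intervals(y, v, z_crit, t_crit)):
            hits[j] += lo <= mu <= hi
    return hits[0] / runs, hits[1] / runs

# 2.262157 is the 0.975 quantile of t with k - 1 = 9 degrees of freedom.
dl_cov, sj_cov = coverage(k=10, mu=0.5, tau2=0.05, runs=2000, t_crit=2.262157)
print(f"normal-based coverage: {dl_cov:.3f}, t-based coverage: {sj_cov:.3f}")
```

With only 2000 runs the estimated proportions carry a Monte Carlo standard error of roughly half a percentage point, so small differences between methods should not be over-interpreted at this scale.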

The results of the simulation study are presented in the nine tables in the Appendix. From the tables, it can be seen that the coverage probabilities of the interval based on the t-distribution are larger than the coverage probabilities of the interval using the DerSimonian and Laird method for each $\tau^2$ and all values of k. Interestingly, our proposed Modified DerSimonian-Laird (1986) Estimator (MDLE) performs even better than that. Although the coverage probabilities of the confidence interval from the t-distribution, like those of the other methods, are below the nominal level of 95%, they are higher than those of the commonly applied interval based on the DerSimonian and Laird method, particularly when k is small. This suggests that the simple confidence interval based on the t-distribution is an improvement over the existing simple confidence interval based on DerSimonian and Laird’s method. Incidentally, MDLE is the best. The most remarkable fact is that our proposed Modified Sidik-Jonkman (2002) Estimator (MSJE) turns out to be the best in terms of the coverage probability.

APPENDIX-I

Number of studies seminal to panel/meta-data analysis: K = 10

Table 1: Performance parameters of CIs for $\tau^2$ = 0.05 and 1-α = 0.95

Table 2: Performance parameters of CIs for $\tau^2$ = 0.08 and 1-α = 0.95

Table 3: Performance parameters of CIs for $\tau^2$ = 0.10 and 1-α = 0.95

Number of studies seminal to panel/meta-data analysis: K = 20

Table 4: Performance parameters of CIs for $\tau^2$ = 0.05 and 1-α = 0.95

Table 5: Performance parameters of CIs for $\tau^2$ = 0.08 and 1-α = 0.95

Table 6: Performance parameters of CIs for $\tau^2$ = 0.10 and 1-α = 0.95

Number of studies seminal to panel/meta-data analysis: K = 60

Table 7: Performance parameters of CIs for $\tau^2$ = 0.05 and 1-α = 0.95

Table 8: Performance parameters of CIs for $\tau^2$ = 0.08 and 1-α = 0.95

Table 9: Performance parameters of CIs for $\tau^2$ = 0.10 and 1-α = 0.95

REFERENCES

  • Armitage, P., 1984. Controversies and achievements in clinical trials. Controlled Clin. Trials, 5: 67-72.


  • Brockwell, S.E. and I.R. Gordon, 2001. A comparison of statistical methods for meta-analysis. Stat. Med., 20: 825-840.


  • DerSimonian, R. and N. Laird, 1983. Evaluating the effect of coaching on SAT scores: A meta-analysis. Harvard Educ. Rev., 53: 1-15.


  • DerSimonian, R. and N. Laird, 1986. Meta-analysis in clinical trials. Controlled Clin. Trials, 7: 177-188.


  • Halvorsen, K., 2006. Combining Results from Independent Investigations: Meta-analysis in Medical Research. In: Medical Uses of Statistics, Bailar, J.C. and F. Mosteller (Eds.), New England J. Med., Boston.


  • Hardy, R.J. and S.G. Thompson, 1996. A likelihood approach to meta-analysis with random effects. Stat. Med., 15: 619-629.


  • Olkin, I., 1994. Invited commentary: Re: A critical look at some popular meta-analytic methods. Am. J. Epidemiol., 140: 297-299.


  • Sidik, K. and J.N. Jonkman, 2002. A simple confidence interval for meta-analysis. Stat. Med., 21: 3153-3159.


  • Villar, J., M.E. Mackey, G. Carroli and A. Donner, 2001. Meta-analyses in systematic reviews of randomized controlled trials in perinatal medicine: Comparison of fixed and random effects models. Stat. Med., 20: 3635-3647.