July 21, 2011

Discrepancies between Meta-Analyses and Subsequent Large Randomized, Controlled Trials

[Image: a funnel plot used to visualize the file-drawer problem. Via Wikipedia.]
Large randomized, controlled trials are generally considered the gold standard in evaluations of the efficacy of clinical interventions. However, since such trials are not always available, clinicians increasingly rely on meta-analysis to support their choice of clinical strategies. Critics have emphasized the intrinsic weaknesses of meta-analysis.1-5 Pooled results incorporate the biases of individual studies and embody new sources of bias, mostly because of the selection of studies and the inevitable heterogeneity among them.
Although much has been said about the strengths and weaknesses of meta-analysis, there are limited data systematically comparing the results of meta-analyses of several small trials with those of large randomized, controlled trials. Villar et al.6 reviewed 30 meta-analyses of various interventions in perinatal medicine from the Cochrane database. They recalculated the results of each meta-analysis after removing the largest trial from the analysis and then compared the results with those of the large trial that had been removed. They found a kappa of 0.46 to 0.53 and a positive predictive value of 50 to 67 percent. We compared the results of a series of systematically compiled large randomized, controlled trials with those of the relevant meta-analyses that had been published previously.

METHODS

Database

We searched the New England Journal of Medicine, the Lancet, the Annals of Internal Medicine, and the Journal of the American Medical Association and retrieved all large randomized, controlled trials (those in which 1000 patients or more were studied) that were published between January 1, 1991, and December 31, 1994. All the trials had to have adequate statistical power to detect the desired benefit specified by the authors. Adequate power was defined on the basis of the a priori calculations of power reported by the authors in the Methods sections of their articles. We then searched for meta-analyses of similar topics that had been published before the large randomized, controlled trial. Our search included the references listed in the randomized trials and computerized searches of Medline without language restrictions. We then compared each trial with the set of meta-analyses corresponding to it and selected only those meta-analyses that coincided with the trial with regard to the similarity of the populations studied, the therapeutic intervention, and at least one outcome. We studied the principal and secondary outcomes.
For each outcome that was studied in both the large randomized, controlled trial and the meta-analysis, we determined whether the results were positive (indicating that treatment resulted in a better outcome) or negative (indicating that treatment resulted in an equal or worse outcome) at the conventional level of statistical significance (P<0.05). Two investigators working independently of each other reviewed each trial and its corresponding meta-analyses. Discrepancies were resolved by consensus, with the help of a third investigator. To quantify the effect of interobserver variation, we performed a sensitivity analysis; the statistical calculations were performed with the data obtained by consensus and were repeated with the data that corresponded to the opinion of the dissenting investigator.

Statistical Analysis

Two-by-two tables were used to calculate the degree of agreement between the large randomized, controlled trial and its associated meta-analysis as expressed by the kappa statistic and its 95 percent confidence interval, as well as the sensitivity, specificity, positive predictive value, and negative predictive value. The point estimates in each pair were compared by using a test statistic constructed as the difference in the proportions or means divided by the square root of the sum of the variances.
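As a concrete illustration, the sketch below computes these agreement statistics in Python from a single two-by-two table. The counts are illustrative choices of ours, not taken from the study's table: they happen to reproduce the summary figures quoted later in the Results (a kappa of 0.35 and predictive values near 65 to 70 percent), but should be read as an assumption. The helper `z_stat` implements the test statistic described above.

```python
# Sketch of the agreement statistics described above. The two-by-two
# counts are ILLUSTRATIVE, not taken from the study's actual table.
import math

a, b = 13, 6    # meta-analysis positive: trial positive / trial negative
c, d = 7, 14    # meta-analysis negative: trial positive / trial negative
n = a + b + c + d

sensitivity = a / (a + c)   # positive trials that the meta-analysis predicted
specificity = d / (b + d)   # negative trials that the meta-analysis predicted
ppv = a / (a + b)           # positive meta-analyses confirmed by the trial
npv = d / (c + d)           # negative meta-analyses confirmed by the trial

po = (a + d) / n                                      # observed agreement
pe = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2   # chance-expected agreement
kappa = (po - pe) / (1 - pe)

# Large-sample standard error of kappa (a common approximation)
se = math.sqrt(po * (1 - po) / (n * (1 - pe) ** 2))
ci = (kappa - 1.96 * se, kappa + 1.96 * se)

# Test statistic described above for comparing two point estimates:
# their difference divided by the square root of the sum of the variances.
def z_stat(est1, var1, est2, var2):
    return (est1 - est2) / math.sqrt(var1 + var2)

print(f"sens={sensitivity:.2f} spec={specificity:.2f} ppv={ppv:.2f} npv={npv:.2f}")
print(f"kappa={kappa:.2f} (95% CI {ci[0]:.2f} to {ci[1]:.2f})")
```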
The odds ratios of the randomized, controlled trial and the meta-analysis were represented graphically. When the result of the meta-analysis was not presented as an odds ratio for a dichotomous outcome, we computed the odds ratio and its 95 percent confidence interval by the fixed-effects Mantel–Haenszel method.7 When no odds ratio could be computed for a meta-analysis that represented the size of the treatment effect, we transformed the odds ratio in the corresponding randomized, controlled trial into an effect size by treating the proportion for each group as the mean of a distribution of 0's and 1's.8 These transformations were made only to permit graphic representation and did not affect the P values reported in the corresponding papers. Figure 1 shows the odds ratios computed by the fixed-effects method, and Figure 2 shows the effect sizes obtained by transformation of the odds ratios. P values of less than 0.05 were considered to indicate statistical significance. All the calculations and statistical tests were done with the SAS statistical package (SAS Institute, Cary, N.C.).

[Figure 1. Odds Ratios and 95 Percent Confidence Intervals for Clusters of Studies in Which the Findings of Large Randomized, Controlled Trials Were Compared with the Results of One or More Meta-Analyses on the Same Subject, in Which at Least One Common Outcome Was Studied.]

[Figure 2. Treatment Effects and 95 Percent Confidence Intervals after Transformation of the Odds Ratios in Clusters of Studies in Which the Findings of Large Randomized, Controlled Trials Were Compared with the Results of One or More Meta-Analyses on the Same Subject, in Which at Least One Common Outcome Was Studied.]
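For readers who want to see the mechanics, here is a minimal Python sketch of the fixed-effects Mantel–Haenszel pooled odds ratio, together with one plausible reading of the odds-ratio-to-effect-size transformation described above (treating each group's event proportion as the mean of a distribution of 0's and 1's and standardizing the difference in means). The trial counts are hypothetical, chosen only to illustrate the arithmetic; this is not the study's code.

```python
# Sketch of the two computations described above. Counts are HYPOTHETICAL.
import math

# Each study: (events_treated, n_treated, events_control, n_control)
studies = [(15, 100, 25, 100), (8, 60, 12, 60), (30, 200, 41, 200)]

# Fixed-effects Mantel-Haenszel pooled odds ratio:
# sum(a*d/n) / sum(b*c/n) over the studies, with n the total per study.
num = sum(e_t * (n_c - e_c) / (n_t + n_c) for e_t, n_t, e_c, n_c in studies)
den = sum(e_c * (n_t - e_t) / (n_t + n_c) for e_t, n_t, e_c, n_c in studies)
print(f"Mantel-Haenszel OR = {num / den:.2f}")

# One reading of the transformation: treat each group's event proportion
# as the mean of a 0/1 (Bernoulli) distribution, whose variance is p(1-p),
# and compute a standardized mean difference.
def effect_size(p1, p2):
    pooled_sd = math.sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / 2)
    return (p1 - p2) / pooled_sd

print(f"effect size for p1=0.15, p2=0.25: {effect_size(0.15, 0.25):.2f}")
```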

RESULTS

We identified 12 large randomized, controlled trials to which 19 meta-analyses corresponded in terms of the populations studied, the therapeutic interventions, and at least one outcome. Since both the primary and the secondary outcomes were considered, a total of 40 outcomes coincided and were included in the analysis.
Table 1 shows the data on which we based our evaluation of the performance of meta-analysis as a predictor of the results of subsequent large randomized, controlled trials. The meta-analysis occupied the role usually assigned to a diagnostic test being assessed, whereas the trial was considered the gold standard. Table 2 shows the results in terms of sensitivity, specificity, and negative and positive predictive values. The results for the consensus opinion are all in a range of values (65 to 70 percent) that corresponds to the values usually obtained with average diagnostic tests. The kappa statistic, which measures agreement beyond that due to chance alone, was 0.35 (95 percent confidence interval, 0.06 to 0.64). Kappa values at or below 0.40 are considered to represent slight-to-fair agreement. Table 2 also shows the results of the sensitivity analysis, which compares the results obtained when the calculations were made on the basis of the consensus between investigators with those obtained when the calculations were based on the dissenting investigator's opinion.

[Table 1. Agreement or Disagreement between Randomized, Controlled Trials and Meta-Analyses in 40 Cases in Which the Two Were Compatible.]

[Table 2. Variables Measuring the Ability of Meta-Analyses to Predict the Results of Large Randomized, Controlled Trials.]
Figure 1 and Figure 2 show the results graphically and include the most pertinent information about each cluster of comparisons. They show that independently of their statistical significance, the point estimates were on the same side of 1.0 in Figure 1 and on the same side of 0 in Figure 2 in 32 of the 40 comparisons (80 percent). No situation was found in which the point estimates were both statistically significant and on opposite sides of 1.0 or 0. All the disagreements thus occurred because one result showed a statistically significant treatment effect, whereas the other indicated that such an effect was lacking. There was a statistically significant difference between the randomized clinical trial and the meta-analysis in 5 of the 40 comparisons (12 percent).
Five positive outcomes from four meta-analyses10,28,31,37 that used fixed-effects models were followed by negative randomized clinical trials. We had the information needed to redo the statistical analyses with random-effects models for four of these outcomes,10,28,31 and the results in all four remained statistically significant.
We found very good agreement between the meta-analysis and the randomized clinical trial with regard to the following six clinical matters: the effect of magnesium on overall mortality in patients with myocardial infarction,12,13 the effect of treatment for hypercholesterolemia on coronary events and mortality from cardiovascular causes among patients with coronary heart disease,14-16 the effect of vitamin A supplementation on mortality from all causes and mortality from diarrhea among children in developing countries,18-20 the effect of angiotensin-converting–enzyme inhibitors on the mortality of patients with congestive heart failure,21,22 the effect of adjuvant therapy on disease-free survival in patients with breast cancer,32,33 and the value of multiple interventions as compared with single interventions in smoking cessation.38,39
Considerable divergence was evident in several other cases. With regard to the effects of late thrombolysis (thrombolysis performed at least six hours after the first symptoms of myocardial infarction)9-11 and nitroglycerin on mortality in patients with myocardial infarction, the meta-analyses were positive, whereas the results of the subsequent large randomized, controlled trials were on the positive side of 1.0 but were not statistically significant. In these instances statistical power could not have been the issue, because the randomized, controlled trials included more patients than the meta-analyses. With regard to the question of preventing intrauterine growth retardation with low-dose aspirin in women at risk of preeclampsia, a clearly positive meta-analysis28 with only 394 patients was followed by a very large randomized, controlled trial with 9364 patients that had negative results.27 Despite a negative meta-analysis,35 a large randomized, controlled trial34 showed that sodium reduction decreases diastolic blood pressure, whereas in the case of calcium supplementation the reverse occurred.34,37
Since the decision to conduct a large randomized, controlled trial could have been made when clinicians and researchers saw a meta-analysis as inconclusive, we examined whether the meta-analysis had already been published at the time the first patient was randomized in the corresponding clinical trial. Four of the 12 trials9,21,30,38 had evidently been started and most probably designed after the publication of the corresponding meta-analysis. Of these four trials, two9,30 (evaluating the merits of thrombolysis and treatment with nitroglycerin) had results that diverged from those of the meta-analysis — that is, a negative randomized, controlled trial did not confirm the findings of a positive meta-analysis.

DISCUSSION

Few will disagree with the use of the large randomized, controlled trial as the gold standard in the evaluation of the efficacy of therapeutic interventions. All but one of the meta-analyses found by our systematic search had been published in major peer-reviewed journals, where they were in a position to influence clinical practice.
The strategy we used to decide whether a given meta-analysis corresponded to a specific randomized, controlled trial raises certain methodologic issues. For the studies to qualify, the population studied, the therapeutic intervention, and at least one outcome had to be similar. In some cases, such similarity could involve judgment and thus be subject to variation between observers. By having two investigators decide independently on the appropriateness of each match, we could quantify the variation and adjust for it. The sensitivity analysis (Table 2) shows that our findings were essentially the same both when the calculations were based on consensus and when they were based on the opinion of the dissenting investigator. Another methodologic issue is raised by the dichotomous classification of the results as positive or negative. The reason for choosing this approach was that the outcome of interest was whether the results of the meta-analysis should be applied to clinical practice. Clinical decisions tend to be dichotomous in that a treatment is said either to work and be recommended or not to work and not to be recommended.
According to our analysis, if there had been no subsequent randomized, controlled trial, the meta-analysis would have led to the adoption of an ineffective treatment in 32 percent of cases (100 percent minus the positive predictive value) and to the rejection of a useful treatment in 33 percent of cases (100 percent minus the negative predictive value). It is important to recognize that these measures of disagreement, which are constructed from the perspective of medical decision making, tend to overstate the degree of statistical discrepancy. This is evident from the fact that in no case was there a divergence in which the randomized clinical trial and the meta-analysis gave statistically significant and opposite answers. Furthermore, wherever the point estimates were located in relation to the “no difference” line, the difference in results between the meta-analysis and the randomized, controlled trial was statistically significant for only 5 of the 40 comparisons (12 percent); this does not appear to be a large percentage, since a divergence in 5 percent of cases would be expected on the basis of chance alone.
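The chance-alone comparison can be made concrete with the short sketch below, which treats the 40 comparisons as independent, each with a 5 percent probability of a spurious significant difference. The independence assumption is ours, for illustration; it is not a calculation reported in the paper.

```python
# Under the simplifying assumption that the 40 comparisons are independent,
# each with a 5 percent chance of a spurious significant difference, how do
# the 5 observed divergences compare with chance expectation?
from math import comb

n, p, observed = 40, 0.05, 5
expected = n * p  # = 2 divergences expected by chance
tail = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(observed, n + 1))
print(f"expected by chance: {expected:.0f}; P(at least {observed}) = {tail:.3f}")
```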
In our study, 46 percent of the divergences in results involved a positive meta-analysis followed by a negative randomized, controlled trial. There are several reasons why a meta-analysis might have positive results that would not be confirmed by a subsequent trial. Publication bias refers to the tendency of investigators to preferentially submit studies with positive results for publication, and the tendency of editors to accept them. A meta-analysis that excluded unpublished studies or did not locate and include them would thus be more likely to have a false positive result. The systematic exclusion of papers written in languages other than English (the “Tower of Babel” bias40) can add to the publication bias. In our sample, the use of the fixed-effects model, which narrows the confidence interval, does not appear to account for the statistically positive meta-analyses whose findings were not subsequently confirmed by a randomized trial, since the four studies that could be reanalyzed by the random-effects model remained positive and continued to have statistically significant results when that reanalysis was done.
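The fixed-versus-random-effects distinction can also be made concrete. The sketch below contrasts inverse-variance fixed-effects pooling with a DerSimonian–Laird random-effects reanalysis; the per-study log odds ratios and variances are hypothetical values of our own choosing, intended only to show how allowing for between-study variance widens the pooled confidence interval.

```python
# Fixed-effects vs. DerSimonian-Laird random-effects pooling.
# Per-study log odds ratios and variances are HYPOTHETICAL.
import math

log_or = [-0.40, -0.10, -0.55, 0.05]  # hypothetical per-study log odds ratios
var = [0.04, 0.03, 0.06, 0.05]        # hypothetical within-study variances

def pooled(log_or, var, tau2=0.0):
    """Inverse-variance pooled estimate; tau2 > 0 gives random effects."""
    w = [1 / (v + tau2) for v in var]
    est = sum(wi * yi for wi, yi in zip(w, log_or)) / sum(w)
    return est, math.sqrt(1 / sum(w))

# Fixed effects: between-study variance assumed to be zero
fe_est, fe_se = pooled(log_or, var)

# DerSimonian-Laird estimate of the between-study variance tau^2
w = [1 / v for v in var]
q = sum(wi * (yi - fe_est) ** 2 for wi, yi in zip(w, log_or))  # Cochran's Q
c = sum(w) - sum(wi**2 for wi in w) / sum(w)
tau2 = max(0.0, (q - (len(var) - 1)) / c)

re_est, re_se = pooled(log_or, var, tau2)

for name, est, se in [("fixed ", fe_est, fe_se), ("random", re_est, re_se)]:
    lo, hi = est - 1.96 * se, est + 1.96 * se
    print(f"{name}: OR={math.exp(est):.2f} "
          f"(95% CI {math.exp(lo):.2f} to {math.exp(hi):.2f})")
```

With these illustrative inputs the fixed-effects interval excludes 1.0 while the random-effects interval does not, which is exactly the kind of reversal the reanalysis described above was checking for.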
The remaining 54 percent of identified divergences involved a negative meta-analysis followed by a positive randomized, controlled trial. The heterogeneity of the trials included in the meta-analysis may partially account for divergence of this type, since meta-analysis assumes that such variation is mostly caused by random error, rather than by differences in the characteristics of the selected studies. A properly done meta-analysis involves the a priori determination of strict standards to ensure that the criteria used for the inclusion of patients, the administration of the principal treatment, and the ascertainment of outcome events are similar in all the trials selected. Although according to these strict criteria the protocols of the selected trials look very similar, their application usually yields very different products. The patients enrolled in comparable trials may belong to the same basic population, but even small differences in the criteria for diagnosis, coexisting conditions, severity of disease, and age will produce very different groups of patients. Differences in doses, time to onset, and duration of therapies can also produce substantial disparity among trials that are included in meta-analyses with the intention of evaluating a therapeutic intervention. The choice of concomitant therapies and the degree of leeway in their administration can also affect the results. Changes in medical practice over time may also account for important differences in concomitant therapies, since the trials included in a given meta-analysis are often conducted over a period of a decade or more.
How should clinicians use meta-analyses, given that systematic comparison with randomized clinical trials shows that they have poor predictive ability? Most will agree that if a large, well-done randomized trial has been conducted, practice guidelines should be strongly influenced by its results. The question arises when the only available evidence is from a series of small randomized, controlled trials. The simplest solution, and currently the most popular one, has been to rely on the results of a meta-analysis. Our findings seem to indicate that summarizing all the information contained in a set of trials into a single odds ratio may greatly oversimplify an extremely complex issue. The popularity of meta-analysis may at least partly come from the fact that it makes life simpler and easier for reviewers as well as readers. However, oversimplification may lead to inappropriate conclusions.
The results of this study would appear to encourage readers to go beyond the point estimates and confidence intervals that represent the aggregate findings of a meta-analysis and, as Cook et al. have suggested,41 look carefully at the studies that were included and evaluate the consistency of their results. When the results are mostly on the same side of the no-difference line, the meta-analysis merits more confidence. Others may consider following the advice of Horwitz42 and appraising each trial separately. Although such an approach is admittedly more laborious, it has the advantage of allowing pragmatic clinicians to benefit from the diversity of studies by distinguishing the effects of treatment among them.
We are indebted to Dr. Jean-François Boivin for helpful criticisms and to Ms. Hélène Harnois and Ms. Anita Massicotte for clerical assistance.

SOURCE INFORMATION

From the Research Center, Hôtel-Dieu de Montréal Hospital, and the Department of Medicine, Faculty of Medicine, University of Montreal — both in Montreal.
Address reprint requests to Dr. LeLorier at the Research Center, Hôtel-Dieu de Montréal, 3850 St. Urbain St., Pavilion Marie de la Ferre, 2nd Fl., Montreal, QC H2W 1T8, Canada.

