Data analysis
The analysis was undertaken on an intention-to-treat basis: patients were analyzed according to their allocated treatment, irrespective of whether they received that treatment. The outcomes used were 1) OS, defined as the time from random assignment to death from any cause, with patients who had not died censored at the date they were last known to be alive; and 2) PFS, defined as the time from random assignment to first documented progression.
The overall hazard ratios (HRs) for OS and PFS were calculated using Comprehensive Meta-Analysis software, Version 2 (Biostat, Englewood, NJ, USA). A P-value less than 0.05 was considered statistically significant. An HR >1 indicated more deaths or progression events in the AI-containing regimen group, and an HR <1 indicated fewer. Between-study heterogeneity was estimated using the χ²-based Q statistic.21 The I² statistic was also calculated to evaluate the extent of variability attributable to statistical heterogeneity between trials. Heterogeneity was considered statistically significant when P for heterogeneity was <0.05 or I² was >50%. If heterogeneity existed, data were analyzed using a random-effects model. In the absence of heterogeneity, a fixed-effects model was used. The presence of publication bias was evaluated using the Begg and Egger tests.22 All P-values were two sided. All confidence intervals (CIs) had a two-sided probability coverage of 95%.
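The model-selection rule described above can be illustrated with a minimal sketch. It is not the Comprehensive Meta-Analysis implementation; it assumes inverse-variance pooling of log HRs, standard errors recovered from the reported 95% CIs, and a DerSimonian-Laird estimate of between-study variance for the random-effects case. The function names (`chi2_sf`, `pool_hazard_ratios`) and the input values are hypothetical and chosen only for illustration.

```python
import math

Z_95 = 1.959963984540054  # normal quantile for a two-sided 95% CI


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    # Start from the closed form for df = 1 (odd) or df = 2 (even), then
    # apply the recurrence Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1).
    a = 0.5 if df % 2 else 1.0
    q = math.erfc(math.sqrt(h)) if df % 2 else math.exp(-h)
    while a < df / 2.0 - 1e-9:
        q += h ** a * math.exp(-h) / math.gamma(a + 1.0)
        a += 1.0
    return q


def pool_hazard_ratios(hrs, ci_lowers, ci_uppers):
    """Pool trial-level HRs; choose fixed or random effects from Q and I2."""
    y = [math.log(hr) for hr in hrs]
    # Assumption: each CI is symmetric on the log scale, so SE = width / (2 * z).
    se = [(math.log(u) - math.log(l)) / (2.0 * Z_95)
          for l, u in zip(ci_lowers, ci_uppers)]
    w = [1.0 / s ** 2 for s in se]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw

    # Cochran's Q and the I2 statistic (as a percentage).
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    df = len(y) - 1
    p_het = chi2_sf(q, df) if df > 0 else 1.0
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Rule from the text: heterogeneity if P < 0.05 or I2 > 50%.
    heterogeneous = p_het < 0.05 or i2 > 50.0
    tau2 = 0.0
    if heterogeneous:
        # DerSimonian-Laird between-study variance (an assumed estimator).
        c = sw - sum(wi ** 2 for wi in w) / sw
        tau2 = max(0.0, (q - df) / c)
        w = [1.0 / (s ** 2 + tau2) for s in se]
        sw = sum(w)

    est = sum(wi * yi for wi, yi in zip(w, y)) / sw
    se_pooled = math.sqrt(1.0 / sw)
    return {
        "model": "random" if heterogeneous else "fixed",
        "hr": math.exp(est),
        "ci": (math.exp(est - Z_95 * se_pooled),
               math.exp(est + Z_95 * se_pooled)),
        "q": q,
        "p_heterogeneity": p_het,
        "i2": i2,
        "tau2": tau2,
    }


# Hypothetical trial-level inputs, not data from the included trials.
result = pool_hazard_ratios([0.80, 0.95, 1.10],
                            [0.65, 0.80, 0.85],
                            [0.98, 1.13, 1.42])
print(result["model"], round(result["hr"], 3))
```

When the trial estimates agree, Q is small, I² is 0%, and the sketch returns the fixed-effects estimate; when they diverge, the added between-study variance widens the pooled CI.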
RESULTS
Search results
A total of 320 potentially relevant studies were retrieved electronically, 307 of which were excluded for the reasons shown in Figure 1. Thirteen published RCTs with subgroup analyses assessing the efficacy of AIs in NSCLC according to histology were included in the meta-analysis.15,23–34 The baseline characteristics of each trial are listed in Table 1. A total of 10,035 patients were available for analysis. Six trials were performed in the first-line setting and seven in the second-line setting. According to the inclusion criteria of each trial, patients were required to have adequate renal, hepatic, and hematologic function. The quality of each study was assessed using the Jadad scale. Ten trials had a Jadad score of 5,15,24,25,27–32,34 and three trials had a Jadad score of 3.23,26,33