Abstract
Keywords
Introduction
Individual participant data (IPD) meta-analysis of randomised trials is a key method to identify and investigate differential treatment effects (effect modification) in medical research.1–4 Single trials may lack the statistical power to detect detailed subgroup differences in treatment effects.4–6 For this reason, IPD meta-analyses provide an important opportunity to increase power to detect genuine effect modification.2,7 A two-stage IPD approach for estimating treatment effect modification mitigates aggregation bias by first evaluating the effect modification within each individual study and then combining the results in the second stage. 8 However, missing data in individual studies are common and pose challenges when performing an IPD meta-analysis. While sporadically missing data can be managed within a single study, systematically missing data might lead to the exclusion of entire studies.
Systematically missing data is present when one or more variables are not available in one or more studies.9–11 Variables might be systematically missing for different reasons (e.g., different survey instruments or measurement devices, or lack of information) and often pose practical as well as methodological challenges such as the risk of decreasing the ability to evaluate a broader range of effect modifications. Multiple imputation (MI) is a popular method that can be used to retain studies with systematically missing values in IPD meta-analyses.10,12–14 In brief, MI ensures that missing data are replaced by their corresponding imputation samples, resulting in
While some studies have looked at systematically missing covariates including confounders,9 few have assessed scenarios in which a key effect modifier (EM) is missing. Therefore, in this analysis we consider and evaluate a two-stage imputation approach to impute systematically missing values of an EM in an IPD meta-analysis with a small number of trials. Identifying, validating, and successfully analysing treatment effect modifications in clinical practice is challenging. It is therefore crucial to consider all available studies in an IPD meta-analysis to avoid losing vital information.19 The rationale for using MI in this context is primarily to retain all studies in the analysis, thereby assessing treatment effect heterogeneity across the entire population.
The remainder of this paper is structured as follows. First, we describe the method employed for imputing systematically missing values. Subsequently, we outline the structure of the simulation study, encompassing the data generating mechanism (DGM), analytical model, and performance criteria.20,21 The results of the study are then presented, followed by an application of the imputation method to an IPD meta-analysis of 10 randomised controlled trials assessing the efficacy of postoperative radiotherapy (PORT) in patients with completely resected non-small cell lung cancer. We conclude with a discussion of the strengths and limitations of the proposed approach.
Conditional quantile imputation
In this section, conditional quantile imputation (CQI) is introduced for discrete systematically missing data in IPD meta-analysis. The approach extends the standard approach to imputing categorical data in single studies. An overview of CQI for continuous data can be found elsewhere.22,23 The notation used throughout the paper is as follows. Let the index
The imputation model
An imputation model is specified in studies
For all studies in
For all studies in
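As a rough illustration of how a categorical value can be assigned from conditional probabilities, the sketch below draws a category by inverting the cumulative distribution at a uniform random number. It is a minimal sketch under stated assumptions: the function names `draw_category` and `impute_em` are hypothetical, and the per-participant category probabilities are assumed to have been predicted already from an imputation model fitted in the studies with an observed EM. It does not reproduce the exact CQI formulas.

```python
import random
from bisect import bisect_left
from itertools import accumulate


def draw_category(probs, u):
    """Return the index of the first category whose cumulative
    probability reaches u (inverse-CDF draw for a discrete variable)."""
    cumulative = list(accumulate(probs))
    # Clamp to the last category to guard against floating-point rounding
    # when the cumulative probabilities sum to slightly less than 1.
    return min(bisect_left(cumulative, u), len(probs) - 1)


def impute_em(prob_rows, rng):
    """Impute one EM level per participant.

    prob_rows: one list of category probabilities per participant
               (hypothetical input, assumed to come from a fitted model).
    rng:       a random.Random instance, so that draws are reproducible.
    """
    return [draw_category(probs, rng.random()) for probs in prob_rows]


if __name__ == "__main__":
    rng = random.Random(2024)
    # Three participants, three EM levels; probabilities are illustrative.
    predicted = [[0.2, 0.5, 0.3], [0.6, 0.3, 0.1], [0.1, 0.1, 0.8]]
    print(impute_em(predicted, rng))
```

Repeating the draw with fresh random numbers yields the multiple imputed datasets that are analysed separately and then combined.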
Data generating mechanism
We first describe the DGM for the simulation of a single trial followed by a description of the DGM of multiple trials forming the IPD meta-analysis. We then outline the different scenarios considered for the simulation study.
Single trial
We considered a randomised controlled trial (
The outcome random variable,
A common and heterogeneous DGM for the generation of multiple trials was considered. We now describe the parameters underlying the Weibull survival model for a common and heterogeneous treatment effect modification.
The hypothesised treatment effect modification underlying the individual time-to-event outcomes was a beneficial
The EM,
Next, the regression coefficients of the two product terms between the treatment,
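To make the structure of such a data generating mechanism concrete, the following sketch simulates one trial from a Weibull proportional hazards model by inverse transform sampling. All parameter values (`lam`, `shape`, the coefficients, and `censor_time`), the equal allocation to EM levels, and the function name `simulate_trial` are hypothetical choices for illustration and are not the values used in the simulation study.

```python
import math
import random


def simulate_trial(n, rng, lam=0.1, shape=1.2, beta_trt=0.0,
                   beta_em=(0.0, 0.0), beta_int=(-0.4, 0.4),
                   censor_time=10.0):
    """Simulate one two-arm trial with a three-level EM.

    Survival function: S(t | x) = exp(-lam * exp(lp) * t**shape),
    so T = (-log(U) / (lam * exp(lp))) ** (1 / shape) with U ~ Uniform(0, 1].
    All default values are illustrative assumptions.
    """
    rows = []
    for _ in range(n):
        trt = rng.randint(0, 1)   # 1:1 randomisation
        em = rng.randrange(3)     # EM level 0, 1 or 2 (equal probability assumed)
        lp = beta_trt * trt
        if em > 0:
            # Main effect of the EM level plus treatment-by-EM product term.
            lp += beta_em[em - 1] + beta_int[em - 1] * trt
        u = 1.0 - rng.random()    # lies in (0, 1], avoids log(0)
        t = (-math.log(u) / (lam * math.exp(lp))) ** (1.0 / shape)
        rows.append({
            "trt": trt,
            "em": em,
            "time": min(t, censor_time),     # administrative censoring
            "event": int(t <= censor_time),  # 1 = death observed
        })
    return rows


if __name__ == "__main__":
    trial = simulate_trial(200, random.Random(7))
    print(sum(r["event"] for r in trial), "events out of", len(trial))
```

A systematically missing EM can then be mimicked by discarding the `em` values for every participant of selected trials.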
Scenarios
We generated IPD meta-analyses under a common and heterogeneous treatment effect modification with
Imputation models
In all scenarios, we imputed the systematically missing values for
The following imputation model differs from the initial one in that it is incomplete. We specified the imputation model omitting all product terms,
Based on the observed and imputed data of the trials included in the IPD meta-analysis, a two-stage approach was used to obtain an estimate of the regression coefficients of the multivariable Weibull survival model. In a two-stage approach, study-specific regression coefficients are first estimated within each study based on imputed data and then combined across studies with a multivariate meta-regression model according to a common or heterogeneous treatment effect modification.27 The inverse variance method and restricted maximum likelihood were used for the common and heterogeneous (random) treatment effect modification, respectively.24
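The second stage under a common effect can be illustrated with inverse variance weighting of a single coefficient. This is a simplified univariate sketch: the analysis described above pools several coefficients jointly with a multivariate model, and the function name `pool_inverse_variance` is hypothetical.

```python
import math


def pool_inverse_variance(estimates, std_errors):
    """Common-effect pooling of study-specific estimates.

    Each study is weighted by the inverse of its squared standard error.
    Returns the pooled estimate and its standard error.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


if __name__ == "__main__":
    # Illustrative study-specific log hazard ratios and standard errors.
    log_hr = [0.10, 0.30, 0.25]
    se = [0.10, 0.15, 0.20]
    print(pool_inverse_variance(log_hr, se))
```

Studies with smaller standard errors dominate the pooled estimate, which is why retaining additional trials through imputation can reduce the pooled standard error.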
Rubin’s rules were used to aggregate the estimates of the regression coefficients across imputations. Following standard notation,25 we briefly outline Rubin’s rules for a single parameter. Starting from the
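For a single parameter, Rubin's rules can be sketched as follows. The sketch assumes that each imputed dataset has already produced one point estimate and one variance estimate; the function name `rubins_rules` is hypothetical.

```python
from statistics import mean, variance


def rubins_rules(estimates, variances):
    """Combine one parameter across m imputed datasets.

    estimates: point estimate from each imputed dataset
    variances: squared standard error from each imputed dataset
    Returns the pooled estimate and the total variance.
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (divisor m - 1)
    total = within + (1.0 + 1.0 / m) * between
    return pooled, total


if __name__ == "__main__":
    # Illustrative values from three imputations.
    print(rubins_rules([1.0, 2.0, 3.0], [0.5, 0.5, 0.5]))
```

The factor (1 + 1/m) inflates the between-imputation variance to account for using a finite number of imputations.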
Furthermore, a multivariate Wald type test, conducted with a Type I error of 5%, for the hypothesis of no effect modification translated into testing that the two regression coefficients
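A joint Wald test of two coefficients can be sketched with a closed-form p-value, because the survival function of a chi-square distribution with 2 degrees of freedom is exp(-x/2). The sketch assumes a large-sample chi-square reference distribution, which may differ from the reference distribution used after multiple imputation in the analysis; the function name and the input values are hypothetical.

```python
import math


def wald_test_two_coefficients(b, cov):
    """Test H0: b[0] = b[1] = 0 using W = b' V^{-1} b.

    b:   the two estimated product-term coefficients
    cov: their 2 x 2 covariance matrix as nested sequences
    Returns the Wald statistic and a chi-square (2 df) p-value.
    """
    (v11, v12), (v21, v22) = cov
    det = v11 * v22 - v12 * v21
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    # Quadratic form with the explicit inverse of a 2 x 2 matrix.
    stat = (b[0] ** 2 * v22 - b[0] * b[1] * (v12 + v21) + b[1] ** 2 * v11) / det
    p_value = math.exp(-stat / 2.0)  # exact chi-square survival function for 2 df
    return stat, p_value


if __name__ == "__main__":
    stat, p = wald_test_two_coefficients(
        (0.2, -0.1), ((0.01, 0.0), (0.0, 0.01)))
    print(round(stat, 3), round(p, 4), p < 0.05)
```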
The principal measures to assess the performance of the two-stage imputation method were bias and coverage.20 Bias was computed as the distance between the average estimated regression coefficients, (
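The performance measures reported in the results (bias, empirical standard error (ESE), mean squared error (MSE), and coverage) can be computed from the simulation replicates as in the sketch below. The function name `performance`, the Wald-type 95% interval, and the input values are assumptions made for illustration.

```python
from statistics import mean, stdev


def performance(estimates, std_errors, true_value, z=1.959964):
    """Summarise simulation replicates for one parameter.

    estimates:  one estimate per simulated meta-analysis
    std_errors: the matching model-based standard errors
    true_value: the value used in the data generating mechanism
    """
    bias = mean(estimates) - true_value
    ese = stdev(estimates)  # empirical standard error across replicates
    mse = mean([(e - true_value) ** 2 for e in estimates])
    # A replicate is covered when the true value lies inside its 95% interval.
    covered = [abs(e - true_value) <= z * se
               for e, se in zip(estimates, std_errors)]
    coverage = sum(covered) / len(covered)
    return {"bias": bias, "ese": ese, "mse": mse, "coverage": coverage}


if __name__ == "__main__":
    print(performance([0.1, 0.2, 0.3], [0.1, 0.1, 0.1], true_value=0.2))
```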
Results
The results of the performance of CQI in all scenarios for the common and heterogeneous treatment effect modification are shown in Tables 1 and 2, respectively. The average estimates of the regression coefficients describing the effect of the treatment,

Sampling distribution of the estimated mortality hazard ratios conferred by the treatment at different levels of the low (

Sampling distribution of the estimated mortality hazard ratios conferred by the treatment at different levels of the low (
Log hazard ratios (average and performance measure) for the treatment effect at different levels of the EM by the complexity of the imputation models and by the size of the studies with systematically missing data on the EM.
Results are presented for 1/6 and 1/3 of the studies with systematically missing data. All simulated data was generated under a
Log hazard ratios (average and performance measure) for the treatment effect at different levels of the EM by the complexity of the imputation models and by the size of the studies with systematically missing data on the EM.
Results are presented for 1/6 and 1/3 of the studies with systematically missing data. All simulated data was generated under a
The absolute value of the bias after using CQI to impute the systematically missing values for the EM was less than 0.016 for all effect estimates for the specification of the imputation model with two product terms. Bias was comparably low for scenarios with 6 and 12 studies, regardless of the number of studies with systematically missing data on the EM. The largest bias with the congenial imputation model across all scenarios was found for the scenario with 12 studies when 1/3 of the studies had systematically missing data on the EM with 0.006,
The coverage of the treatment effect estimates was close to 95% in all scenarios. Coverage was 95.7%, 93.7%, and 95.7% for the beneficial, null, and harmful treatment effect of the EM, respectively, for 12 studies when a sixth of studies had systematically missing values on the EM. With a third of the studies having systematically missing data, the coverage for the same scenario remained very similar at 95.4%, 93.6%, and 94.0% for the beneficial, null, and harmful treatment effect of the EM, respectively. The small differences in coverage are explained by random variability in the estimates. When the imputation model included no product terms, coverage was 95.0%, 97.8%, and 95.2% for the three levels of the EM, respectively.
Notably, the precision in all effect estimates increased after using MI to retain trials with systematically missing values on the EM. The largest gains in precision were made when a higher proportion of studies with missing data was imputed. This is evident in the consistently smaller ESE and corresponding MSE between the complete case analysis and MI with the congenial specification of the imputation model. For example, the ESE for the complete case analysis in the scenario of 12 trials with 4 trials having systematically missing data was 0.110, 0.098, and 0.123 for the low, null, and harmful treatment effect, respectively. After retaining the four trials in the analysis, the ESE was reduced to 0.102, 0.089, and 0.115, respectively. In addition, the MSE was lower (about 50%) for the scenarios with 12 studies compared to the scenario with only six studies, regardless of the number of studies with systematically missing data on the EM. Table 1 contains all values for the scenarios of the common treatment effect modification.
Heterogeneous treatment effect modification
Overall, bias was lower compared to the scenarios of common treatment effect modification. In all scenarios simulated under a heterogeneous treatment EM, the absolute value of the bias remained less than 0.009 for all effect estimates under the congenial specification of the imputation model. There were no stark differences in the magnitude of bias between the scenarios. With 12 studies, when 1/3 of the studies were imputed, the bias for all levels of the EM remained below 0.002 under the specification of the imputation model with two product terms. Under the incomplete (zero product terms) specification of the imputation model, large bias was introduced in all parameters in all scenarios. The absolute value of the bias increased up to 0.069 for the estimates of the EM. Here, with 1/3 of the studies having systematically missing data on the EM, bias was highest for the additional effect of the harmful EM (
Coverage remained close to 95% in most scenarios of the heterogeneous treatment effect modification. For 6 studies with 1/3 of the studies being imputed, the coverage was 94.9%, 91.5%, and 94.5% for the beneficial, null, and harmful treatment effect of the EM. For the same scenario with 12 studies, coverage levels were 94.5%, 93.1%, and 93.2%, respectively. Here, the coverage of the interaction terms themselves was marginally closer to the nominal level at 94.5% and 95.1%.
Similar to the common treatment effect modification, precision improved when using MI to retain trials with systematically missing values on the EM, most notably in the smaller ESE and MSE values compared to the complete case analysis. Additionally, the MSE was 50% lower in the scenarios with 12 studies compared to those with 6.
In summary, in all scenarios under the specification of the congenial imputation model, no substantial bias was found in estimates at all levels of the EM. The magnitude of the bias was not affected by an increase in the number of studies with systematically missing values on the EM. The specification of the imputation model with zero product terms always resulted in an increase in bias of the estimates, regardless of the number of studies with systematically missing data on the EM. Using a two-stage imputation approach to impute systematically missing EM values, we observed an improvement in the precision of effect estimates in all scenarios under the congenial specification of the imputation model compared to the complete case analysis.
The effect of PORT on survival at different stages of the disease
In this section, we illustrate the use of CQI for a systematically missing EM in an IPD meta-analysis of trials. The main question was about the effectiveness of PORT in patients with completely resected non-small cell lung cancer at different stages of the disease. A multivariate IPD meta-analysis of PORT versus surgery alone on survival in patients with resected non-small cell lung cancer at different stages of the disease was performed.28 The outcome in all trials was time from randomisation to death from all causes or censoring, whichever came first. The disease stage of the patients was measured in three different levels ranging from stage I to III. The 10 trials included in this study had a varying sample size ranging from 69 to 316 participants per trial with a total of 1642 participants and 1082 deaths across all 10 trials. A common treatment effect modification underlying the trials was estimated with a two-stage multivariable Cox proportional hazards regression model with
The data used in this paper are a publicly available, real-data based simulated example with similar characteristics to the PORT data and are used only to illustrate the presented imputation method.29
In this sample of data, only using the seven studies with complete information on disease stage, the statistical test did not provide a strong indication against the hypothesis of a homogeneous effect of PORT on mortality across stages of the disease (
Similar to the complete case analysis, the statistical test when using all 10 studies after using CQI to impute the systematically missing values for disease stage in three trials did not provide a clear indication against the hypothesis of a homogeneous effect of PORT on mortality across stages of the disease (
Mortality HRs, 95% CIs, and SEs conferred by postoperative radiotherapy at different stages of the disease in complete case and MI datasets.
The individual participant data meta-analysis included 10 trials, 7 with complete information on the disease stage, 3 with systematically missing data on disease stage. Conditional quantile imputation was used to impute the systematically missing data based on the trials with complete information. Data were analysed with a two-stage multivariable Cox-regression model. MI: multiple imputation; HR: hazard ratio; CI: confidence interval; SE: standard error.
The described example of imputing the three trials with systematically missing values on disease stage suggests a beneficial use of MI in this scenario. While the empirical results of the analysis did not change substantially compared to the complete case analysis, all studies were retained in the analysis without introducing substantial bias, and the results point towards a harmful effect of PORT on survival in the later stage of the disease. By using MI to retain the three trials with systematically missing values on disease stage, we were able to include an additional 455 participants including 297 mortality cases in the analysis. While no substantial differences in the mortality HRs between the complete case analysis and the analysis with MI were estimated, the precision in the effect estimates at all levels of disease stage increased, without distorting the association that was estimated in the complete case analysis. This is in line with the results presented from the simulation study. Overall, in the example of the effect of PORT on survival at different stages of the disease, using MI to retain three trials with systematically missing values on disease stage appears worthwhile for increasing the generalisability and precision of clinical effect measures without introducing bias.
Discussion
This simulation study evaluated a two-stage imputation method based on conditional quantiles to assess its performance in retaining studies with systematically missing EMs in IPD meta-analyses. We evaluated the feasibility of imputing systematically missing EMs in IPD meta-analyses with a limited number of trials (6 and 12) under a common and heterogeneous treatment effect modification. The results demonstrated that the bias for all levels of the EM was low for the common and heterogeneous treatment effect modification. Compared to the complete case analysis, using the two-stage imputation approach to retain all trials with missing data in the analysis improved the precision of effect estimates in all scenarios under a congenial specification of the imputation model. The bias increased for all scenarios when the imputation model was incompletely specified, i.e., missing important product terms.
Performance of CQI
First, the proposed approach, CQI, showed no substantial difference in bias between smaller and larger IPD meta-analyses, nor substantial differences when the number of studies with systematically missing data on the EM doubled (i.e., from one sixth to one third). In fact, the performance showed a negligible bias for both common and heterogeneous treatment effect modification IPD meta-analyses that is comparable with biases reported for other MI methods.14 The average effect estimates after using CQI were comparable with those of the complete case analysis. Even though a complete case analysis can be justified when assuming missing completely at random (MCAR), it is suboptimal when the fraction of studies with missing EM is large. As we demonstrated in this simulation study, this can lead to a reduction in the precision of effect estimates and substantial loss of data. Bias increased substantially when the imputation model was incompletely specified with zero product terms (i.e., not congenial with the substantive outcome model). This is in line with previous work that has demonstrated the importance of correctly specifying the imputation model, whereby failing to do so leads to biased inference.16 Sensitivity analyses are often used in practice to explore the implications of different imputation models.30–32
Second, coverage was close to the nominal level for all scenarios that were assessed. Some small deviations from the desired 95% were observed in scenarios with a larger number of studies and when a third of the studies had systematically missing data on the EM. In the same scenarios, similar deviations from 95% were seen for the complete case analysis. This is in line with simulation results presented by Resche-Rigon and White 11 indicating slight undercoverage in the presence of only systematically missing data.
Third, we showed that using a two-stage imputation approach in the context of systematically missing data in IPD meta-analysis improved the precision of effect estimates compared to the complete case analysis. That is, lower ESE and MSE were estimated across all scenarios, reflecting the better use made of the data. In addition, by including all trials in the analysis, the generalisability of study findings potentially increases, although it is difficult to quantify this numerically. In summary, CQI resulted in an analysis with (1) low bias, (2) coverage close to the nominal 95% level, and (3) small variance.
Assumptions
Despite the satisfactory performance of the proposed method, certain assumptions need to be discussed. First, we assumed that the underlying treatment effect at different levels of the EM is consistent across all trials. This assumption supports the interpretation of observed variations in treatment effects across levels of the EM as true interactions, as opposed to artefacts from trial-level heterogeneity, such as differences in study design, populations, or implementation. Second, we assumed that all trials collected similar information on other covariates used to inform the imputation model. This assumption is more likely to be satisfied in prospective IPD meta-analyses, where common data collection, analysis plans, and harmonisation strategies are implemented, thereby reducing the risk of systematically missing values.33 In retrospective IPD meta-analyses, however, data collection may have occurred at different time points and protocols may vary across studies, increasing the risk of systematic missingness.34 Related to this, we also assumed that the EM has the same number of categories (or levels) across trials and was measured consistently across studies. In IPD meta-analyses, harmonisation of covariates is a crucial step prior to implementing any MI approach. Therefore, EMs with different distributions across trials must be harmonised before applying the two-stage imputation approach. Last, we addressed systematically missing data under an MCAR assumption. As previously mentioned, a complete case analysis is theoretically justifiable, albeit suboptimal due to the significant loss of data. Under a missing at random (MAR) assumption, the approach to impute systematically missing values of the EM would not differ. While the results may become more sensitive to the choice of predictors included in the imputation model assuming MAR, simulation studies have shown small performance differences between MCAR and MAR in multilevel data settings for sporadically and systematically missing data.11,35
Limitations
Despite demonstrating an approach for imputing a systematically missing EM in IPD meta-analyses with a limited number of trials, this study is subject to a number of limitations. First, we aimed at simulating simplified, yet sufficiently complex, scenarios that are related to realistic challenges in IPD meta-analysis, whilst remaining accessible for researchers who are faced with such problems. While there is an abundance of scenarios that were not considered, we are confident that the chosen scenarios give an intuition about the performance of the CQI method and indicate a general trend of when MI is worth considering in IPD meta-analysis to impute a systematically missing EM. Second, we solely focussed on the impact of systematically missing data and did not consider scenarios with sporadically and systematically missing data at the same time. Third, in all our simulations we used 30 imputations. To keep computation time reasonable, we did not test whether the performance, in particular the coverage, changes with an increasing number of imputations. Last, we did not consider larger degrees of heterogeneity across studies in our simulations due to the focus on IPD meta-analyses with a limited number of studies.
Phases of methodological development
Based on the phases of methodological research in biostatistics according to Heinze et al.,36 this work can be categorised as part of phase I to II developments of a two-stage imputation procedure for systematically missing covariates in IPD meta-analysis in line with previous work.11 Facilitating software is available in Stata and provided in the resources linked to this study. Further analysis is needed to test the approach in a variety of settings such as different outcomes, varying degrees of heterogeneity and DGMs. In particular, future refinement of this approach can be directed towards:
Extending the approach to multivariate imputation as proposed in Resche-Rigon and White:11 Further software implementation of CQI into MI with chained equations is a logical step to make it more widely applicable in scenarios with sporadically and systematically missing values. An extension of CQI to continuous systematically missing covariates using quantile regression has been recently presented.37
Integrating random effects into the two-stage imputation procedure to increase heterogeneity between imputed datasets: Given the small number of studies in this analysis, a random-effects model would provide a poor estimate of the variability between studies. However, using a common-effects model to derive average regression coefficients for assigning values to the missing EM can result in more homogeneous imputed datasets.
This simulation study presented and evaluated the use of a two-stage imputation procedure - CQI - to impute systematically missing EMs in IPD meta-analyses with a limited number of trials. The absolute bias for common and heterogeneous effect IPD meta-analyses was less than 0.016 and 0.007, respectively, with coverage close to its nominal value across all levels of the EM. In addition, CQI improved the precision of pooled effect estimates compared to a complete case analysis excluding trials with systematically missing values on the EM. An incomplete specification of the imputation model resulted in biased inference even if the proportion of studies with systematically missing data was small.
