Introduction
Clinical trials are considered the gold standard research design for generating evidence that can transform clinical care.1 Monitoring of clinical trial data is an essential component of high-quality trials, supporting data integrity and the protection of trial participants.2 The global guideline outlining best practice for clinical trial conduct, the International Council for Harmonisation Good Clinical Practice (ICH-GCP) E6 guideline, released an addendum in 2016 recommending that clinical trial sponsors develop a systematic, prioritised, risk-based approach to monitoring clinical trials, termed risk-based monitoring.3 Risk-based monitoring promotes flexible monitoring plans that are proportionate to the risk to participant safety and data quality.4 This approach advocates a shift from frequent on-site visits and 100% source data verification (SDV) towards a combination of monitoring activities, including centralised and remote monitoring and targeted SDV of data critical to trial results, and has been demonstrated to reduce monitoring costs by up to 35%.5
Despite the recommendation of risk-based monitoring and its associated advantages, studies of monitoring practices in clinical trial settings have found varied uptake of risk-based monitoring, ranging from 21% to 100%,6–10 with a lack of knowledge and expertise in undertaking risk-based monitoring reported as major barriers to its uptake.6,7 Contributing to the lack of knowledge is the suboptimal reporting of monitoring approaches used in clinical trials: only half of protocol papers for clinical trials have been found to include information on monitoring.11 There is also limited evidence on the effectiveness of different monitoring activities for ensuring data quality. A recent Cochrane review12 of the effectiveness of different monitoring strategies found that the evidence base is limited in both quality and quantity. While there was moderate-certainty evidence that a risk-based approach to monitoring was as effective as extensive on-site monitoring for detecting major or critical monitoring findings in the four common error domains of informed consent, patient eligibility, endpoint assessment, and serious adverse event reporting, there was low-certainty evidence for comparisons of most other monitoring strategies.
Although there is currently a lack of knowledge regarding the effectiveness of different risk-based monitoring strategies, targeted SDV of critical data is still commonly implemented as part of the data monitoring plan for clinical trials.6,13 While incorporation of targeted SDV into the monitoring plan is consistent with the recommendation of a prioritised, risk-based approach to monitoring, there is limited guidance on how to optimise its impact on data quality. A review of SDV methods used for quality assurance in clinical trials found heterogeneity in SDV methods and a lack of evidence to support a best practice approach.14 Given that SDV is a resource-intensive method of assuring data quality, estimated to consume up to 25% of the total clinical trial budget,15 there is an urgent need to provide greater evidence to guide its implementation in trials.
Using data from a multicentre randomised controlled trial in paediatric patients undergoing cardiac bypass surgery, this study sought to (1) assess the efficiency and impact of targeted SDV implemented as part of the risk-based monitoring approach and (2) using simulated trial datasets, determine the impact on trial outcomes of using a reduced degree of SDV compared to the level used in the trial or a reduced cohort size.
Methods
This is a secondary analysis of data collected as part of the NITric oxide during cardiopulmonary bypass to improve Recovery in Infants with Congenital heart defects (NITRIC) trial, which recruited 1371 patients (1364 included in the intention-to-treat analysis) across six paediatric cardiac centres in Australia, New Zealand, and the Netherlands. Briefly, the project aim was to determine the effect of nitric oxide applied in the cardiopulmonary bypass oxygenator versus standard care on ventilator-free days in children undergoing cardiac surgery.16 The trial found that the number of ventilator-free days, as well as the secondary outcomes measured, did not differ significantly between the nitric oxide and standard care groups. During the trial, data relating to patient baseline characteristics, surgical procedure and intervention characteristics, primary outcome measures, secondary outcome measures and safety outcomes17 were collected in an online REDCap study database hosted by The University of Queensland.18,19 The data for one study site (82 patients) were excluded from our analyses as a separate monitoring process was performed due to local regulations and processes.
A risk-based data monitoring plan was developed for the NITRIC trial.16 Prior to the development of the monitoring plan, a risk assessment was conducted to determine the extent of targeted SDV for critical data items identified by the Study Management Committee. On-site and remote monitoring was conducted routinely to verify that data recorded in the REDCap study database were accurate, complete, and verifiable from source documentation. SDV of all critical data items relating to eligibility, randomisation stratification and the calculation of primary and secondary outcomes was undertaken for every enrolled patient. SDV on data items relating to cohort descriptors was completed for a random sample of 10% of enrolled patients (eTable 1 in Supplement). The original REDCap study database was enhanced to facilitate the SDV of the critical data items, maintaining a record of the original data value and the data value verified from source data. Study data were updated based on the findings during SDV. For our assessment, the updated study data are referred to as the monitored dataset, and the original study data are referred to as the unmonitored dataset.
Outcomes
To assess the efficiency of targeted SDV implemented as part of the monitoring approach for the NITRIC trial, time spent undertaking SDV and source to database error rates were calculated. Time spent undertaking SDV was estimated using the time from the start to finish of SDV for each patient as recorded in the logging data available in the REDCap study database. Source to database error rates were calculated as the number of errors divided by the number of data points monitored, expressed as a percentage.
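The error-rate calculation described above can be sketched directly; the counts below are hypothetical and purely illustrative, not trial figures:

```python
def source_to_database_error_rate(n_errors: int, n_monitored: int) -> float:
    """Return the source to database error rate as a percentage:
    errors divided by data points monitored, times 100."""
    if n_monitored == 0:
        raise ValueError("no data points monitored")
    return 100 * n_errors / n_monitored

# e.g. 33 errors found among 1064 monitored data points (hypothetical counts)
rate = source_to_database_error_rate(33, 1064)
print(f"{rate:.1f}%")  # → 3.1%
```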
Statistical analysis
Time associated with monitoring was calculated overall and per trial site. Source to database error rates were calculated by treatment group for data monitored across all critical data items and categories of critical data items, and compared using a two-sample test of proportions. Results are presented descriptively, and the percentage difference and 95% confidence interval of source to database error between treatment groups is reported for all categories of critical data.
To assess the impact of targeted SDV on NITRIC trial outcomes, the adjusted estimates of difference between treatment groups were calculated for the unmonitored dataset and monitored dataset across all outcomes undergoing 100% SDV. The adjusted estimate of difference between treatment groups for primary and secondary outcomes was calculated using the same models generated for the NITRIC trial, for the unmonitored dataset and the monitored dataset, which were adjusted for trial site, age at randomisation and cardiac lesion type.17 Models included logistic regression for binary outcomes and linear or quantile regression for continuous outcomes, dependent on the distribution of the outcome.
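As a minimal sketch of this modelling approach (not the trial's actual analysis code, which was written in Stata): the synthetic dataset, column names and effect sizes below are invented for illustration, with a quantile (median) regression for a continuous outcome and a logistic regression for a binary outcome, each adjusted for site, age and lesion type.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 500

# Synthetic trial-like data; all columns and values are hypothetical
df = pd.DataFrame({
    "treatment": rng.integers(0, 2, n),
    "age_days": rng.integers(1, 365, n),
    "site": rng.integers(1, 6, n),
    "lesion": rng.integers(0, 2, n),
})
df["vfd"] = rng.normal(24, 3, n) - 0.01 * df["treatment"]  # continuous outcome
df["aki"] = rng.binomial(1, 0.2 + 0.01 * df["treatment"])  # binary outcome

# Continuous outcome: quantile regression at the median (q=0.5),
# adjusted for age, site and lesion type
med = smf.quantreg("vfd ~ treatment + age_days + C(site) + C(lesion)", df).fit(q=0.5)
print("adjusted median difference:", med.params["treatment"])

# Binary outcome: logistic regression (coefficient is a log-odds ratio)
logit = smf.logit("aki ~ treatment + age_days + C(site) + C(lesion)", df).fit(disp=0)
print("adjusted log-odds ratio:", logit.params["treatment"])
```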
Simulation study
A simulation study was then conducted to determine the effects of reduced levels of SDV and reduced cohort sizes on the adjusted estimate of difference between treatment groups for the primary NITRIC trial outcome and any other trial outcomes with >10% source to database error. To assess the impact of reduced levels of SDV, a proportion of records were sampled from the monitored dataset, which then replaced the corresponding records in the unmonitored dataset. The proportions of monitored data that were sampled ranged from 10% to 90% in 10% increments. One thousand bootstrapped samples were executed for each increment. The adjusted estimate of the difference between treatment groups for each outcome was then calculated for each generated dataset. The distribution of the adjusted estimates across the 1000 replicates for each increment is displayed.
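The replacement procedure above can be sketched as follows. The datasets and the unadjusted mean-difference estimator are toy stand-ins (the trial used adjusted regression models, and its real column names differ), assuming one row per patient aligned by index between the two datasets:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Toy stand-ins for the trial datasets: one row per patient.
# Column names and values are illustrative only.
n = 200
unmonitored = pd.DataFrame({"group": rng.integers(0, 2, n),
                            "outcome": rng.normal(10, 3, n)})
monitored = unmonitored.copy()
corrected = rng.choice(n, size=6, replace=False)  # pretend SDV corrected ~3% of values
monitored.loc[corrected, "outcome"] += rng.normal(0, 1, 6)

def estimate(df):
    """Unadjusted treatment-group difference (stand-in for the adjusted model)."""
    means = df.groupby("group")["outcome"].mean()
    return means.loc[1] - means.loc[0]

def partial_sdv_estimates(frac, n_boot=1000):
    """Replace a random `frac` of records in the unmonitored dataset with
    their monitored versions, then recompute the estimate, n_boot times."""
    k = int(frac * n)
    results = []
    for _ in range(n_boot):
        sampled = rng.choice(n, size=k, replace=False)
        mixed = unmonitored.copy()
        mixed.loc[sampled] = monitored.loc[sampled]
        results.append(estimate(mixed))
    return np.array(results)

for frac in (0.1, 0.5, 0.9):
    est = partial_sdv_estimates(frac, n_boot=200)
    print(f"{int(frac * 100)}% SDV: median estimate {np.median(est):.3f}")
```

In the study itself this was run at every 10% increment from 10% to 90%, with 1000 replicates per increment.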
To assess the impact of SDV on reduced cohort sizes, 1000 bootstrapped samples of size 100 to 1000 (increasing by 100 increments) were drawn from the monitored dataset and the matching records were drawn from the unmonitored dataset. The adjusted estimates of the difference between treatment groups were then calculated using the sampled monitored and unmonitored datasets. The distribution of the adjusted effects for the monitored and unmonitored datasets across the 1000 replicates for each sample size are reported. Analyses were undertaken in StataSE version 16.0 (StataCorp Pty Ltd, College Station, Texas).
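The cohort-size simulation can be sketched in the same spirit: draw paired bootstrap cohorts of a given size, then compute the estimate on the matched monitored and unmonitored records. Again, the datasets and the simple mean-difference estimator are hypothetical stand-ins for the trial's adjusted models.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Toy monitored/unmonitored datasets (column names are illustrative)
n = 1200
unmonitored = pd.DataFrame({"group": rng.integers(0, 2, n),
                            "outcome": rng.normal(10, 3, n)})
monitored = unmonitored.copy()
idx = rng.choice(n, size=40, replace=False)        # ~3% of values corrected by SDV
monitored.loc[idx, "outcome"] += rng.normal(0, 1, 40)

def diff(df):
    """Unadjusted treatment-group difference (stand-in for the adjusted model)."""
    m = df.groupby("group")["outcome"].mean()
    return m.loc[1] - m.loc[0]

def cohort_size_effects(size, n_boot=1000):
    """Draw paired bootstrap cohorts of `size` patients and return the
    (monitored, unmonitored) estimate for each replicate."""
    pairs = []
    for _ in range(n_boot):
        rows = rng.choice(n, size=size, replace=True)  # same patients in both datasets
        pairs.append((diff(monitored.iloc[rows]), diff(unmonitored.iloc[rows])))
    return np.array(pairs)

for size in (100, 500, 1000):
    mon, unmon = cohort_size_effects(size, n_boot=200).T
    print(f"n={size}: monitored median {np.median(mon):.3f}, "
          f"unmonitored median {np.median(unmon):.3f}")
```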
Results
In total, 106,749 critical data items across 1282 participants were verified from source data, either remotely or on-site, during the NITRIC trial (Table 1). The total time spent monitoring was estimated at 365 hours, with a median (interquartile range [IQR]) of 10 (7, 16) minutes per participant. Across the five trial sites, the median (IQR) time spent monitoring ranged from 5 (4, 8) minutes to 14 (9, 23) minutes per participant, with an increased duration of monitoring at trial site 2 (eFigure 1 in Supplement).
Source to database error rate by treatment group across critical data categories.
CI: confidence interval.
Source to database error rate
Across all data points monitored, the source to database error rate for treatment A and treatment B groups, respectively, was 3.1% and 3.2% (Table 1). Data items for patient eligibility had the lowest rate of error (0% and 0.1% error rate for treatment A and treatment B, respectively), while other data items (i.e. those exclusive of eligibility, randomisation strata and primary and secondary outcomes) had the highest rate of error (6.8% and 5.8% error rate for treatment A and treatment B, respectively). Across all data categories, minimal differences were found in the rate of error between treatment groups.
All outcomes undergoing 100% SDV had a low rate of error (<10%), with the exception of two secondary outcomes: creatinine at 24 hours post-intensive care unit (ICU) admission, with an overall error rate of 17.6%, and acute kidney injury at 48 hours post-ICU admission, with an overall error rate of 10.3%. Error rates for each outcome were distributed evenly across treatment groups (eTable 2 in Supplement). Across the five trial sites, the overall source to database error rate was <5%, ranging from 1.6% to 4.4% (eFigure 2 in Supplement).
Impact on study outcomes
There was minimal variation in outcome distributions between the unmonitored and monitored datasets (Table 2 and eTable 3 in Supplement). Accordingly, there was minimal variation in the adjusted estimate of the difference between treatment groups for each outcome between the monitored and unmonitored datasets. For the primary outcome of ventilator-free days, the adjusted estimate of difference and 95% confidence interval (CI) was −0.009 (−0.25, 0.23) for the unmonitored dataset compared to −0.012 (−0.24, 0.22) for the monitored dataset. The secondary outcome with the highest error rate – creatinine at 24 hours post-ICU admission – had the largest difference in the adjusted estimate of difference between datasets; however, the difference remained minimal (unmonitored: 2.05 (95% CI: −0.46, 4.57); monitored: 1.50 (95% CI: −0.87, 3.88)) (Table 2).
Treatment effect for unmonitored and monitored datasets for key outcomes.
CI, confidence interval; ICU, intensive care unit.
Adjusted for age at randomisation, lesion type and site.
Monitored for 100% of trial participants.
Monitored for 10% of trial participants.
Simulation study: degree of SDV effect and sample size effect
Reduced degrees of SDV and reduced cohort sizes were assessed for the primary outcome of ventilator-free days and for the secondary outcomes with >10% source to database error: creatinine at 24 hours post-ICU admission and acute kidney injury at 48 hours post-ICU admission. For ventilator-free days, the median adjusted estimate of difference between treatment groups for datasets with varying levels of SDV demonstrated minimal variation from the original adjusted estimate of difference for the monitored dataset of −0.012 [95% CI: −0.24, 0.22], ranging from −0.009 [IQR: −0.009, −0.007] for 10% SDV to −0.012 [IQR: −0.012, −0.010] for 90% SDV (eTable 4 in Supplement). For both creatinine at 24 hours post-ICU admission and acute kidney injury at 48 hours post-ICU admission, datasets with varying levels of SDV also demonstrated minimal variation from the original adjusted estimate of difference (Figure 1).
Figure 1. Adjusted estimate of difference between treatment groups at different degrees of source data verification for trial outcomes; creatinine at 24 hours post-admission to the intensive care unit (17.6% error rate) (a), acute kidney injury at 48 hours post-admission to the intensive care unit (10.3% error rate) (b). Box-whisker plots indicate the median and IQR. The dashed horizontal line indicates the adjusted estimate of difference between treatment groups of the outcome for the monitored dataset.
For ventilator-free days, only minor differences in the median adjusted estimate of difference were seen between the unmonitored and monitored datasets for reduced cohort sizes varying in 100-participant increments between 100 and 1000 participants (eTable 5 in Supplement). For a cohort size of 100 participants, the median adjusted estimate of difference for the unmonitored dataset was −0.041 [IQR: −0.185, 0.150], compared to −0.035 [IQR: −0.184, 0.160] for the monitored dataset. For 1000 participants, the median adjusted estimate of difference for the unmonitored dataset was −0.022 [IQR: −0.037, −0.004], compared to −0.015 [IQR: −0.035, 0.005] for the monitored dataset. For both creatinine at 24 hours post-ICU admission and acute kidney injury at 48 hours post-ICU admission, datasets across the range of cohort sizes were also found to have minimal variation in the adjusted estimate of difference between the unmonitored and monitored datasets (Figure 2).
Figure 2. Adjusted estimate of difference between treatment groups for monitored and unmonitored datasets at different cohort sizes for trial outcomes; creatinine at 24 hours post-admission to the intensive care unit (a), acute kidney injury at 48 hours post-admission to the intensive care unit (b). Box-whisker plots indicate the median and IQR.
Discussion
Risk-based monitoring for clinical trials is advocated by regulatory bodies;20–23 however, there is a lack of evidence-based guidelines for the effective implementation of the monitoring activities encompassed in this approach. Our study sought to explore the effectiveness of targeted SDV, which is often the focus of on-site and remote monitoring for clinical trials. Despite a significant time investment, targeted SDV implemented as part of the risk-based monitoring approach for the NITRIC trial was found to have minimal impact on trial outcomes.
Given that monitoring of clinical trials is estimated to consume a large portion of the study budget,15,24 selecting the appropriate monitoring strategy should be subject to an evaluation of cost versus benefit to data quality. A substantial time investment of 365 hours was associated with targeted SDV, with some variation in duration across trial sites. The key factor contributing to the increased duration of monitoring at trial site 2 was the monitor's unfamiliarity with the electronic medical record system. The overall estimated time spent monitoring is likely to be a conservative estimate of the total time investment, as it does not account for the additional time required for site personnel to prepare trial documents for monitoring or to facilitate remote monitoring via screen-share technology. This is particularly relevant for trial site 5, where substantial time was spent by the site requesting paper-based medical records from archive and locating relevant data points in each record before the monitor recorded verification results in REDCap. However, despite this significant time investment, the extent and type of errors identified using targeted SDV resulted in little variation in outcome estimates between the unmonitored and monitored datasets, indicating that SDV had minimal impact on both primary and secondary outcomes for the NITRIC trial.
Even for the outcome with the greatest extent of error, creatinine at 24 hours post-ICU admission (close to 18%, due to a change in the data definition during the trial), the impact on the outcome estimate was minimal. As creatinine values are used to define acute kidney injury, an error rate of >10% was also seen for this outcome, but the impact of this extent of error on the outcome estimate was similarly minimal. The change in the definition of the creatinine value during the trial warranted additional targeted SDV due to the increased risk of error. However, when prior knowledge or experience can be used to assess the risk associated with critical data points as low, and this risk remains low throughout the trial, a more cost-effective approach to monitoring low-risk data would be justified.
Centralised monitoring used in combination with targeted SDV has been promoted in the literature as an effective strategy for monitoring data in clinical trials.5,25 Centralised monitoring is characterised by the central review of accumulating trial data to identify inconsistent data or unusual data trends within or across trial sites. The anomalies identified by centralised monitoring can be used to select sites or trial data for targeted SDV for further assessment of potential issues with data quality.3 This approach lends itself to the detection of systematic errors, which are likely to have an impact on the integrity of trial data, as opposed to the random errors that are often detected when SDV is used in isolation.26 Despite the benefit of reducing the extent and frequency of SDV, surveys of monitoring practices have shown that centralised monitoring is not being utilised extensively in some clinical trial settings, with reported barriers to implementation being a lack of expertise in centralised monitoring procedures and greater statistical and technological requirements.6,7 In addition, there is a broad variety of reported centralised monitoring methods,27–29 but a lack of certainty regarding their associated implementation and maintenance costs, which may be a further barrier to adoption.27 While the COVID-19 pandemic has prompted the increased adoption of centralised monitoring for new trials,25 greater guidance based on more robust measures of the cost-effectiveness of centralised monitoring methods is needed to ensure this trend continues and extends to all clinical trial settings.
Our assessment of the impact of targeted SDV was limited to findings for one paediatric clinical trial, which recruited more participants than 94% of clinical trials conducted in the paediatric critical care setting.30 To support generalisability of our findings to clinical trials with smaller sample sizes, we undertook a simulation study with bootstrapped datasets of varying sizes. While our evaluation of the effect of targeted SDV for reduced cohort sizes demonstrated limited impact on trial outcomes, studies have demonstrated that SDV efficiency is inversely related to sample size.31,32 The lack of a sample size effect in our simulation could be due to the nature or extent of error seen for the outcomes assessed. For the NITRIC trial, data collection was almost always performed by clinical staff with extensive experience in intensive care datasets, which will have contributed to the low rate of low-impact errors seen across the outcome variables. In smaller studies where rates of error for outcome variables are larger or more systematic in nature, due to factors such as staff inexperience or poorly defined outcomes, a greater impact on outcomes may be seen. Stronger evidence is required to determine the value of targeted SDV for trials with smaller sample sizes. Our assessment was also limited to findings for one method of targeted SDV. Different methods of targeted SDV have been described in the literature,14,33,34 and while the method employed for the NITRIC trial was found to have limited impact, more robust evidence is needed regarding the different contexts and manners in which targeted SDV is likely to have a more significant impact on data quality.
Targeted SDV implemented as part of the NITRIC trial was found to be time consuming and ultimately had minimal impact on the interpretation of trial results. For trials of this magnitude of sample size, an investment in the expertise and technology required to facilitate centralised monitoring will reduce the reliance on SDV and likely improve the cost-effectiveness of data monitoring. However, more comprehensive guidance on the practical implementation of different data monitoring strategies, and their suitability based on trial characteristics and trial-specific risks, is required to enhance the cost-effectiveness of the risk-based monitoring approach applied to each trial. Such guidance can only be achieved by further efforts to promote the reporting of monitoring approaches used in clinical trials, such as improving reporting guidelines for monitoring in clinical trial protocols. There is also a need to strengthen the evidence base for monitoring evaluation in terms of quantity and consistency of evaluation measures. We support the call by Cragg et al.27 to standardise monitoring evaluation studies to enable the synthesis of cross-study evidence, as well as promote the availability of evidence about monitoring practices. This will provide researchers with greater direction in the appropriate selection and implementation of effective, evidence-based monitoring strategies for their trials.
Supplemental Material
Supplemental material, sj-docx-1-ctj-10.1177_17407745231222019 for Assessing the impact of risk-based data monitoring on outcomes for a paediatric multicentre randomised controlled trial by Renate Le Marsney, Kerry Johnson, Jenipher Chumbes Flores, Shelley Coetzer, Jennifer Darvas, Carmel Delzoppo, Arielle Jolly, Kate Masterson, Claire Sherring, Hannah Thomson, Endrias Ergetu, Patricia Gilholm and Kristen S Gibbons in Clinical Trials
