Numerous clinical studies have been conducted to compare the outcomes of anterior cruciate ligament (ACL) reconstruction (ACLR) using various graft options in pediatric patients. 24 ACLR in the pediatric population differs from that in the adult population since surgical techniques must be utilized to minimize the risk of growth disturbance while still providing tibiofemoral stability during pivoting sports activities. 28
Comparative pediatric ACLR autograft studies have provided valuable evidence regarding the efficacy and safety of patellar tendon, quadriceps tendon, and hamstring tendon autografts in the pediatric population.5,7,22,24,34 However, it is essential to critically assess the stability of the conclusions drawn from these studies, as the significance of the study results may be influenced by a small number of outcome events. Critically assessing the stability of study conclusions is particularly important given the current controversy regarding pediatric ACLR graft selection and the equivocal nature of outcomes among different autografts.15,21,37 The equivocal outcomes of pediatric ACLR graft selection can be quantified using a statistical measure called the fragility index (FI).
Since its introduction 11 in 1990, the concept of the FI has gained recognition as a valuable tool for evaluating the fragility of research studies within various medical disciplines. By quantifying the number of events or outcomes required to nullify the statistical significance of a result, the FI indicates the robustness of the study’s results.11,24,30 The
Although several studies have investigated the outcomes of ACLR using different autograft types in pediatric patients, the FI of these studies has not been explored.8,22,33 Exploring the statistical validity of this body of pediatric ACLR literature is critical, as multiple novel surgical techniques are rapidly being developed, all of which have implications with regard to potential growth disturbance and rerupture rate.
This study aimed to assess the vulnerability and reliability of current research on pediatric ACLR graft choices by evaluating the FI of comparative clinical trials. We hypothesized that the FIs would reveal significant fragility, underscoring the need for careful consideration of the robustness of these research conclusions.
Methods
Primary research published between 2010 and 2023 that investigated comparative outcomes of different autograft types for ACLR in pediatric patients was queried for this study. The initial search strategy involved a well-established methodological querying of the PubMed and Embase online databases for studies related to the ACL or ACLR in pediatric patients.9,12,13,29,38 The titles and abstracts of the retrieved studies were screened by 3 authors (G.S., S.A., and P.H.) for relevance to pediatric ACLR utilizing autografts. Studies were excluded if they met any of the following criteria: (1) no dichotomous outcomes generated, or no
Fragility Metrics
To assess the stability and reliability of the reported outcomes in these studies, the mean FI,
Outcomes Assessed
The outcomes were grouped into objective outcomes—including graft failure and postoperative complications such as arthrofibrosis—and clinical or patient-reported outcomes such as return to play. In addition, the reported
The mean FI, FQ, RFI, and RFQ for all included outcome events were calculated along with their interquartile ranges. Three subgroups were analyzed for significant differences using independent
The data analysis was conducted utilizing Excel Version 16.80 (Microsoft) and R programming language Version 4.3.2 (R Core Team). Descriptive statistics were employed to summarize the fragility data and generate subgroup comparisons.
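As an illustration of how an FI can be derived for a single dichotomous outcome, the following Python sketch uses only the standard library rather than the Excel and R workflow described above. It assumes a two-sided Fisher exact test at α = .05 and converts nonevents to events one at a time in the group with the lower event proportion; the function names and the example counts are hypothetical and are not drawn from the included studies.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in row 1 with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def fragility_index(events_1, n_1, events_2, n_2, alpha=0.05):
    """Nonevent-to-event conversions needed before P is no longer < alpha."""
    # Assumption: conversions are made in the group with the lower event proportion.
    if events_1 * n_2 > events_2 * n_1:
        events_1, n_1, events_2, n_2 = events_2, n_2, events_1, n_1
    flips = 0
    while (
        fisher_exact_two_sided(events_1, n_1 - events_1, events_2, n_2 - events_2) < alpha
        and events_1 < n_1
    ):
        events_1 += 1
        flips += 1
    return flips


# Hypothetical counts: 0 of 10 reruptures with one graft vs 5 of 10 with another.
print(fragility_index(0, 10, 5, 10))  # 1
```

A result that is already nonsignificant under the Fisher exact test returns an FI of 0 in this sketch.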
Results
A total of 1139 studies were initially screened, resulting in 50 studies meeting the initial search criteria. From this final pool, 6 studies were selected for the final analysis. The flow chart of study inclusion is depicted in Figure 1. In these studies, bone-patellar tendon-bone (BPTB), quadriceps tendon, hamstring tendon, and iliotibial band grafts were compared (Table 1).
Figure 1. Identification of studies for inclusion via databases and registers.
Table 1. Characteristics of the Included Studies (N = 6) a
BPTB, bone-patellar tendon-bone; FI, fragility index; FQ, fragility quotient; HT, hamstring tendon; ITB, iliotibial band; LOE, level of evidence; NA, not available; QT, quadriceps tendon; RFI, reverse fragility index; RFQ, reverse fragility quotient.
Data are presented as means. The number of values used to calculate the mean FI and RFI are included in parentheses.
The mean FIs of the included studies ranged from 0 to 3, with an overall mean of 1.5, indicating that on average, <2 events would annul the statistical significance of the reported outcomes if changed to nonevents. The mean FQ ranged from 0 to 0.01, with an overall mean of 0.006, suggesting that, on average, around 0.6% of the sample size would need to be altered to nullify the statistical significance of the outcome (Table 1). The mean RFI and RFQ were calculated to measure the fragility of nonsignificant results with reported
No difference was observed in the magnitude of fragility between graft complication or clinical versus functional or patient-reported outcomes (
Table 2. Overall Fragility Data and Analysis of Subgroups a
Data are presented as mean (IQR). FI, fragility index; FQ, fragility quotient; IQR, interquartile range; LTFU, lost to follow-up.
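The FQ and RFI summarized in this section can be sketched for a single outcome as follows. The sketch assumes that the FQ is the FI divided by the total sample size, and that the RFI is found by moving one event at a time from the group with the lower event proportion to the other group until a two-sided Fisher exact test reaches P < .05. This is one common convention and may differ from the exact procedure used in this study; all function names and counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def fragility_quotient(index, total_sample_size):
    """FI (or RFI) expressed as a fraction of the total sample size."""
    return index / total_sample_size


def reverse_fragility_index(events_1, n_1, events_2, n_2, alpha=0.05):
    """Event moves needed to make a nonsignificant result significant.

    Returns None when significance cannot be reached under this convention.
    """
    # Assumption: events are removed from the group with the lower event
    # proportion and added to the other group, keeping total events constant.
    if events_1 * n_2 > events_2 * n_1:
        events_1, n_1, events_2, n_2 = events_2, n_2, events_1, n_1
    moves = 0
    while fisher_exact_two_sided(events_1, n_1 - events_1, events_2, n_2 - events_2) >= alpha:
        if events_1 == 0 or events_2 == n_2:
            return None
        events_1 -= 1
        events_2 += 1
        moves += 1
    return moves


# Hypothetical nonsignificant comparison: 3 of 10 vs 5 of 10 events.
rfi = reverse_fragility_index(3, 10, 5, 10)
print(rfi, fragility_quotient(rfi, 20))
```

Dividing by the sample size is what allows fragility to be compared across studies of different sizes: an FQ of 0.006 corresponds to 0.6% of the enrolled patients.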
Discussion
The mean magnitude of fragility indices for all comparative outcomes was 2.875, indicating that a mean of <3 events would need to be reversed to alter the statistical significance of most findings within these studies of pediatric autograft ACLR. An FI of approximately 3 suggests that the conclusions of pediatric ACLR studies are about as vulnerable as those in the previous orthopaedic literature, where similar FI values have been reported in sports medicine studies 17 and in studies focusing on surgical techniques and rehabilitation in pediatric ACL tears.9,23,33 The American Academy of Orthopaedic Surgeons guidelines indicate that an FI ≥2 is desirable. 6 Although the fragility of the negative findings in these studies met the desired standard, the positive findings, which achieved statistical significance (
Notably, none of the included studies conducted an a priori power analysis, and only 1 study 3 conducted a post hoc analysis. Britt et al 3 describe a power analysis indicating that their study was underpowered relative to a target power (1 − β) of 0.8. Power analyses are a crucial component of strong comparative clinical studies: they help determine minimal sample sizes 31 and can guide researchers to reduce fragility, ensure adequate sensitivity, estimate effect size, and assess the risk of type 2 errors in their final analysis.2,4,32 Therefore, we suggest that orthopaedic researchers perform a priori power analyses during the study design phase and conduct post hoc analyses to ensure the validity of their findings.
When considering the type of outcome, our study revealed no significant difference in fragility between groups of outcomes measuring concrete events such as graft rerupture and patient-reported outcomes such as return to play and functional recovery (Table 2). Patient-reported outcomes have previously faced criticism for their perceived lack of precision, unsubstantiated correlations with overall outcomes, increased susceptibility to recall bias, and inherent challenges with interpretation.10,14,19,20,27 However, through a fragility analysis, patient-reported outcomes can be compared with objective outcomes to help orthopaedic surgeons assess their congruence, evaluate the robustness and quality of patient-reported outcomes, and inform patient-centered clinical decision-making.
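The a priori power analysis recommended earlier in this discussion can be approximated with a short calculation. The Python sketch below assumes a two-sided comparison of 2 independent proportions with equal group sizes and uses the standard normal-approximation formula; it is not the method of any included study, and the function name and the rerupture rates in the example are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Approximate patients needed per group to detect p1 vs p2.

    power is 1 - beta, so power=0.80 corresponds to beta=0.20.
    """
    if p1 == p2:
        raise ValueError("p1 and p2 must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p1 - p2) ** 2)


# Hypothetical rerupture rates of 10% and 25% in the two graft groups.
print(sample_size_two_proportions(0.10, 0.25))
```

Because the required sample size grows roughly with the inverse square of the difference in proportions, studies comparing grafts with similar rerupture rates need far larger cohorts than those typically enrolled.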
The accuracy of patient-reported outcomes is also supported after clinical rehabilitation of ACL tears within the nonoperative setting, 16 suggesting that the inclusion of both concrete and patient-reported outcomes can provide an accurate assessment of treatment outcomes and contribute to the overall validity and clinical applicability of research findings. The characterization of FI and FQ by previous studies demonstrates the moderate vulnerability of the patient-reported outcomes in pediatric ACLR relative to other areas of orthopaedic research.18,25 The most statistically robust conclusions that demonstrate significance drawn from this body of literature are from Maheshwer et al, 21 where an FI of 3 was generated from their analysis comparing the higher rate of retear in hamstring autograft ACLR to BPTB autograft at >2 years of follow-up. This finding suggests that only 3 event reversals would be needed to change the outcome’s statistical significance, indicating moderate fragility. An FI of 0 was generated in analyzing retear rates in 13- to 15-year-old patients who received either hamstring or BPTB autografts, 18 signifying that even a single event change would affect the study’s conclusions, demonstrating extreme fragility. The context provided by these results is critical for our study, as it underscores the variability in statistical robustness across different studies. For patient management, these findings highlight the necessity for clinicians to critically evaluate the robustness of the evidence when making decisions about autograft selection for pediatric ACLR. The fragility of some studies suggests that clinical decisions should not rely solely on statistically significant findings but also consider the FI and other qualitative factors to ensure more reliable outcomes.
Among the nonsignificant results, notable findings emerged from studies such as Morgan et al 25 and Kilkenny et al. 18 Morgan et al reported comparable rerupture rates between BPTB and hamstring autografts, yielding an FI of 7. Similarly, Kilkenny et al observed no disparity in outcomes among 13- to 15-year-old patients who underwent BPTB or hamstring autograft reconstruction, resulting in an FI of 7. The lowest FI in our analysis, 0, came from the 15-year follow-up by Morgan et al comparing contralateral ACL rupture rates after BPTB versus hamstring graft reconstruction.
Limitations
This study has several limitations. First, the FI cannot be calculated for nondichotomous data. Several studies and outcomes reporting nondichotomous data in the setting of pediatric autograft ACLR were therefore excluded, as they could not be examined with fragility methodology. The outcomes were grouped into graft rupture or arthrofibrosis findings, or clinical and patient-reported outcomes; this grouping was made post hoc, after the conclusion of the literature search. This review provides a critical outlook on the strength of the studies examining autograft choice in pediatric ACLR, but as autograft choices exhibit individualized indications, the randomization of graft choice was not considered here. We primarily focused on population-level analysis and did not account for patient-specific factors such as age, skeletal maturity, and activity level, which play a crucial role in determining tailored treatment approaches.22,28,37 Additionally, the lack of long-term follow-up studies limited our understanding of the durability and functional outcomes associated with different graft options.
Conclusion
The findings of comparative studies investigating outcomes of pediatric ACLR with different autografts were found to be vulnerable when evaluated using fragility metrics. There was a lack of statistically robust data adequately describing the similarities and differences in outcomes between various pediatric ACLR autograft choices. Many outcomes in the literature may be statistically fragile and may require further investigation. Future analyses should consider applying fragility metrics to pediatric ACLR studies with long-term follow-up to ensure more reliable conclusions.
